A real-time used-car community that connects dealers and consumers with a live team of expert appraisers to create transparency and efficiency in the trade-in process.
"They are easy to work with; it's clear they have a great organizational culture, being professional, result-oriented, and fun to work with, all at the same time."
This included information about:
Any customer could later run business analytics on a given US region for a given kind of car to learn specific properties of that market. Using this information, they could improve their marketing efforts, detect trends, and, most importantly, improve the accuracy of their appraisals (bringing them closer to the real market price).
The main challenge was to build a massively parallel scraper that could handle 10 million vehicles each day in a cost-effective way. Performance was a priority from the ground up, since we wanted to scrape as much as possible with as few AWS EC2 instances as possible.
We also needed to store, process, and make accessible (ETL) all of that daily data for later use (billions of car data points). The data had to be queryable in real time, which was another important challenge to tackle.
All the work we did on Wolfy gave us the experience and know-how to tackle this big technical challenge.
Every process was containerized and managed with Kubernetes, all running on AWS's EKS. The solution was deployed using multiple AWS services: EKS, ECR, EC2, Elasticsearch Service, Redis, RDS, and S3, among others.
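To make the daily cycle concrete, the containerized master process can be scheduled on EKS with a standard Kubernetes CronJob. The sketch below is illustrative only: the job name, schedule, ECR image path, and resource figures are assumptions, not the actual production manifest.

```yaml
# Hypothetical sketch: a daily CronJob that launches the master
# orchestrator container on EKS (names and image path are illustrative).
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scraper-master
spec:
  schedule: "0 3 * * *"          # run once a day, at 03:00 UTC
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: master
              # Image pulled from ECR (account and repo are placeholders)
              image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/scraper-master:latest
              resources:
                requests:
                  cpu: "500m"
                  memory: "512Mi"
          restartPolicy: OnFailure
```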
A master-slave architecture was implemented: every day the master process would spin up several EC2 machines, each running several slave spider processes. Each spider scraped several dealerships, always respecting each site's robots.txt file. When the spiders finished, the master process would stop the idle instances to avoid extra costs.
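The robots.txt politeness check can be done entirely with the Python standard library. This is a minimal sketch, not the production spider code; the robots.txt content and URLs are illustrative.

```python
# Minimal sketch of a per-spider robots.txt check using only the
# standard library. The rules and URLs below are illustrative.
from urllib.robotparser import RobotFileParser

robots_txt = """
User-agent: *
Disallow: /admin/
Allow: /inventory/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

def may_fetch(url, user_agent="*"):
    """Return True if robots.txt permits fetching this URL."""
    return parser.can_fetch(user_agent, url)

print(may_fetch("https://example-dealer.com/inventory/used-cars"))  # True
print(may_fetch("https://example-dealer.com/admin/settings"))       # False
```

In production the spider would fetch each dealership's live robots.txt (e.g. via `RobotFileParser.set_url` and `read`) before crawling, and skip any disallowed paths.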
Spiders saved all the scraped data to an RDS database, which we later dumped to S3 files as a backup.
Finally, we used the S3 dumps in an ETL process that loaded the scraped data into an Elasticsearch search engine. Using the processed data from Elasticsearch, we exposed all the important information through an API that customers could easily query.
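The "transform" step of such an ETL pipeline normalizes messy scraped fields into clean, index-ready documents. The sketch below shows the idea; the field names and cleaning rules are assumptions for illustration, not the actual schema.

```python
# Hypothetical sketch of the ETL transform step: raw scraped records
# are normalized into documents ready for bulk indexing into
# Elasticsearch. Field names are illustrative, not the real schema.

def transform(raw):
    """Normalize one raw scraped record into an index-ready document."""
    return {
        "vin": raw["vin"].strip().upper(),
        "make": raw["make"].strip().title(),
        "model": raw["model"].strip().title(),
        "year": int(raw["year"]),
        # Strip currency formatting so the price indexes as a number
        "price_usd": float(raw["price"].replace("$", "").replace(",", "")),
        "dealer_state": raw["state"].strip().upper(),
    }

raw_record = {
    "vin": " 1hgcm82633a004352 ",
    "make": "honda",
    "model": "accord",
    "year": "2019",
    "price": "$18,750.00",
    "state": "tx",
}
doc = transform(raw_record)
print(doc["price_usd"])  # 18750.0
```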
The Appraisal Lane currently has billions of vehicle data points generated by the scraping process, and they can query them to gain real-time insight into the market.
TAL was our first client for which we built a team of five people that augmented and integrated with their existing team and processes.
The ETL process we created integrated with some of the client's existing data, and we were able to process and store more than 10 million data points per day (that's more than 3.5 billion per year!).
Users can now create custom API calls that query billions of car data points (with sub-second response times) to get real-time information about the market and improve their sales process.
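Sub-second answers to market questions over billions of documents are what Elasticsearch aggregations are built for. Below is an illustrative sketch of the kind of query body an API call might translate into: the average listed price for one model in one state. The index fields (`model`, `dealer_state`, `year`, `price_usd`) are assumptions, not the actual schema.

```python
# Illustrative Elasticsearch query body: average listed price for
# recent Honda Accords at Texas dealerships. Field names are assumed.
query = {
    "size": 0,  # we only want the aggregation, not the matching docs
    "query": {
        "bool": {
            "filter": [
                {"term": {"model": "Accord"}},
                {"term": {"dealer_state": "TX"}},
                {"range": {"year": {"gte": 2015}}},
            ]
        }
    },
    "aggs": {
        "avg_price": {"avg": {"field": "price_usd"}},
    },
}
```

A customer-facing API endpoint would build a body like this from request parameters and send it to the cluster's `_search` endpoint; because filters are cacheable and the aggregation runs inside Elasticsearch, the response stays fast even at billions of documents.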
The Appraisal Lane had a successful exit and was acquired by Reynolds and Reynolds for a multi-million-dollar sum, after which R&R integrated TAL's innovative product into their existing systems.