About Audiohook
Audiohook is an adtech company focused on audio channels, allowing performance marketers to use digital audio as a new channel. Audiohook allows businesses to reach their customers across all digital and traditional audio channels, providing DSP services to enable those experiences.
For Audiohook, giving performance marketers insight into the reach and effectiveness of their campaigns is vital as they open up these new channels. Audiohook provides analytics as well as data exports for further downstream analysis.
As a startup, dynamic data models and speed of analysis are paramount. Audiohook had hit a wall trying to get their prior cloud data warehouse to perform the way they needed and went looking for a new solution that would meet their demands.
The use case
Audiohook provides deep analytics that allow customers to understand the reach and effectiveness of their marketing campaigns. Audiohook has traditionally provided pre-aggregated results in the user-facing application and pushed the complex analysis to an ELT data pipeline.
Audiohook’s data model is built around three large event-based datasets that need to be joined to answer customer questions:
- Bid request data - All the information about a specific advertising opportunity
- Clickstream data - Onsite clickstream data from customers’ websites
- Ad event data - Events about bidding (e.g. winning a bid), bid responses, ad play events
Any downstream analysis on these events requires complex joins, window functions and aggregations across these datasets to track clickstream data (i.e. conversions) back to ad events and then back to the original bid request. The main metrics that Audiohook shows in the app are impressions and conversions.
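The join chain described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`bid_request_id`, `user_id`, `campaign`, `ts`), the last-touch rule and the look-back window are hypothetical and are not taken from Audiohook's actual data model or ELT queries.

```python
from collections import defaultdict

# Hypothetical, simplified records; the real schemas are not given in the source.
bid_requests = [
    {"bid_request_id": "br1", "campaign": "c1"},
    {"bid_request_id": "br2", "campaign": "c1"},
    {"bid_request_id": "br3", "campaign": "c2"},
]
ad_events = [
    {"bid_request_id": "br1", "user_id": "u1", "type": "impression", "ts": 100},
    {"bid_request_id": "br2", "user_id": "u2", "type": "impression", "ts": 200},
    {"bid_request_id": "br3", "user_id": "u1", "type": "impression", "ts": 300},
]
clickstream = [
    {"user_id": "u1", "type": "conversion", "ts": 350},
    {"user_id": "u2", "type": "conversion", "ts": 5000},  # outside the window
    {"user_id": "u3", "type": "conversion", "ts": 400},   # never saw an ad
]

ATTRIBUTION_WINDOW = 1000  # assumed look-back window, arbitrary time units


def report(bid_requests, ad_events, clickstream, window=ATTRIBUTION_WINDOW):
    """Count impressions and attributed conversions per campaign."""
    # Join 1: ad event -> bid request (to recover the campaign).
    campaign_of = {b["bid_request_id"]: b["campaign"] for b in bid_requests}
    impressions_by_user = defaultdict(list)
    totals = defaultdict(lambda: {"impressions": 0, "conversions": 0})

    for event in ad_events:
        if event["type"] != "impression":
            continue
        campaign = campaign_of[event["bid_request_id"]]
        totals[campaign]["impressions"] += 1
        impressions_by_user[event["user_id"]].append((event["ts"], campaign))

    # Join 2: conversion -> most recent earlier impression inside the window.
    for click in clickstream:
        if click["type"] != "conversion":
            continue
        candidates = [
            (ts, campaign)
            for ts, campaign in impressions_by_user.get(click["user_id"], [])
            if 0 <= click["ts"] - ts <= window
        ]
        if candidates:
            _, campaign = max(candidates)  # last touch wins
            totals[campaign]["conversions"] += 1

    return {campaign: dict(counts) for campaign, counts in totals.items()}


print(report(bid_requests, ad_events, clickstream))
```

In a warehouse the same logic would typically be written as SQL joins with a window function over each user's events. The cost comes from matching every conversion against a large volume of ad events, which is why pruning unneeded rows matters so much for this workload.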
Reporting ELT Data Model flow
Firebolt to the rescue
While most Firebolt customers are looking to improve query performance in user-facing applications, Firebolt's speed can also help on ELT workloads, and that is what drew Audiohook. Audiohook was struggling to compute impressions and conversions for their customers from their core data model. On another leading cloud data warehouse this took on the order of an hour, but on Firebolt they brought it down to ~2 minutes.
Audiohook Reporting ELT Runtime (Minutes)
Firebolt, running on AWS S3, sped up the ELT query by 30x.
Firebolt’s indexing capabilities drew Audiohook as they looked to improve the performance on their ELT. They needed to design the data model to more effectively prune the unnecessary records at ELT job time to significantly cut down on run time.
Firebolt’s primary index - powerful sparse indexing - was the answer. Leveraging the Firebolt primary index, Audiohook is able to quickly find the exact rows in a table, where other technologies would be unable to prune and would have to scan a lot of data. Because less data is scanned per job, Audiohook gets more out of its compute spend.
Sparse Index Zoom in
Sparse indexes point to small data ranges within files. They are much smaller than partitions, and can be scanned and moved to local cache on their own. This frees Firebolt from going through big partitions, and accelerates queries dramatically.
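To make the pruning idea concrete, here is a minimal, generic sketch of a sparse index in Python. It is not Firebolt's implementation: the range size, the function names and the choice to store only the first key of each range are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

RANGE_SIZE = 4  # rows per indexed range; assumed and kept tiny for illustration


def build_sparse_index(sorted_keys, range_size=RANGE_SIZE):
    """Keep only the first key of every fixed-size range of a sorted column."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), range_size)]


def ranges_to_scan(index, lo, hi):
    """Return the numbers of the ranges that may hold keys in [lo, hi]."""
    # The last range that starts strictly below lo may still contain lo.
    # This is conservative: it also covers duplicates straddling a boundary.
    first = max(bisect_left(index, lo) - 1, 0)
    # Ranges whose first key is greater than hi cannot contain a match.
    last = bisect_right(index, hi) - 1
    return list(range(first, last + 1))


# Twelve sorted keys (for example timestamps) stored in three ranges of four.
keys = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23]
index = build_sparse_index(keys)

print(index)                          # one entry per range instead of per row
print(ranges_to_scan(index, 10, 12))  # only the range holding 9..15 is read
```

The index holds one entry per range rather than one per row, so it stays small, and a filter on the indexed column touches only the ranges that can contain matches. With a composite index the keys would be tuples such as `(customer_id, timestamp)`, which Python compares in the same sorted order.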
Audiohook configured most of their primary indexes on a timestamp or record ID; however, on some of their most heavily queried tables they use a composite primary index. The ability to include multiple fields in a primary index can be very powerful for the right workloads.
Aggregating indexes have also been used to speed up downstream queries. They make it easy to maintain additional aggregated views of the data, which simplifies and speeds up downstream analysis.
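The general idea behind a pre-maintained aggregate can be sketched as follows. This is a conceptual illustration only, not Firebolt's API or internals: the class name `AggregatedEvents` and its fields are hypothetical.

```python
from collections import defaultdict


class AggregatedEvents:
    """Raw event rows plus a count per (campaign, day, event_type),
    updated on every insert so that reads do not rescan the rows."""

    def __init__(self):
        self.rows = []
        self.counts = defaultdict(int)

    def insert(self, campaign, day, event_type):
        self.rows.append((campaign, day, event_type))
        # The aggregate is maintained at write time.
        self.counts[(campaign, day, event_type)] += 1

    def count(self, campaign, day, event_type):
        # Answered from the aggregate: a single dictionary lookup.
        return self.counts.get((campaign, day, event_type), 0)

    def count_by_scan(self, campaign, day, event_type):
        # The equivalent query without the aggregate: a full scan.
        return sum(1 for row in self.rows if row == (campaign, day, event_type))


events = AggregatedEvents()
events.insert("c1", "2024-01-01", "impression")
events.insert("c1", "2024-01-01", "impression")
events.insert("c1", "2024-01-01", "conversion")
print(events.count("c1", "2024-01-01", "impression"))
```

The trade-off is a little extra work on each write in exchange for reads that no longer depend on the size of the raw table.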
Firebolt’s model lets a customer match the compute instance attributes to the workload. Audiohook chose an engine with more RAM, which avoided spilling intermediate results to disk and further sped up query performance.
The future
Now that Audiohook has completed a lift-and-shift migration to Firebolt, there are more opportunities to change the way that data is represented in the application and the types of dynamic queries that are served to the user.
Audiohook had previously taken the traditional roll-up approach to building their data app, but now, with a data warehouse backend that can serve complex queries with sub-second latency, they have the opportunity to reconsider the views they provide to their customers.
Audiohook is also looking into other analysis that will be backed by Firebolt and give their customers more insight, including bid request and response analysis that will help their customers dial in their bidding.
Firebolt demonstrated great cost-efficiency, delivering the lowest TCO compared to the prior solution and the alternatives considered. Firebolt was therefore selected, and within a few weeks it was running in production, with the needed transformations orchestrated programmatically through the Firebolt Python SDK.