November 12, 2024
November 13, 2024

Mosha Pasumansky talks query subresult reuse at CMU

What is query subresult caching and reuse? Production analytics workloads often contain many queries that are not identical but share a common pattern. The same tables may be joined over and over again, with different filters or functions applied to the result of that join. Subresult caching and reuse exploits this repetition by caching the intermediate results of queries, so that later queries can read them from the cache instead of making the engine recompute them.
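To make the idea concrete, here is a minimal sketch in Python. It is a toy illustration, not Firebolt's implementation: the `SubresultCache` class, the tuple used as a plan fingerprint, and the sample `orders` and `customers` tables are all assumptions made for this example. Two different queries share the same join, and the second one reads the join result from the cache rather than recomputing it.

```python
from typing import Callable, Dict, Hashable, List

Row = Dict[str, object]


class SubresultCache:
    """Toy cache for intermediate results, keyed by a plan fingerprint (hypothetical)."""

    def __init__(self) -> None:
        self._store: Dict[Hashable, List[Row]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, fingerprint: Hashable,
                       compute: Callable[[], List[Row]]) -> List[Row]:
        # Reuse the stored subresult if an identical subplan was already run.
        if fingerprint in self._store:
            self.hits += 1
            return self._store[fingerprint]
        # Otherwise run the subplan once and remember its output.
        self.misses += 1
        result = compute()
        self._store[fingerprint] = result
        return result


def hash_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Inner equi-join: build a hash index on `right`, then probe it with `left`."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Sample data, invented for illustration.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 50},
    {"order_id": 2, "customer_id": 11, "amount": 200},
    {"order_id": 3, "customer_id": 10, "amount": 120},
]
customers = [
    {"customer_id": 10, "country": "US"},
    {"customer_id": 11, "country": "DE"},
]

cache = SubresultCache()


def joined_orders() -> List[Row]:
    # The fingerprint identifies the shared subplan: the same join on the same tables.
    fingerprint = ("join", "orders", "customers", "customer_id")
    return cache.get_or_compute(
        fingerprint, lambda: hash_join(orders, customers, "customer_id")
    )


def revenue_for_country(country: str) -> int:
    # Query 1: filter the join by country, then aggregate.
    return sum(r["amount"] for r in joined_orders() if r["country"] == country)


def large_orders(min_amount: int) -> List[int]:
    # Query 2: a different filter and projection over the same join.
    return [r["order_id"] for r in joined_orders() if r["amount"] >= min_amount]
```

Calling `revenue_for_country("US")` and then `large_orders(100)` computes the join only once; the second query is served from the cache. The sketch deliberately ignores the harder parts of the problem, such as invalidating cached subresults when the underlying tables change and bounding the cache's memory use.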

In his talk at CMU, which is based on the work of Alex Hall at Firebolt, Mosha starts with an overview of common caching techniques in database management systems, such as the buffer pool and full query result caching. He then explores two further techniques: caching intermediate subresults and caching operator artifacts. He gives examples of exactly what a subresult is and explains how and why Firebolt wants to cache and reuse those subresults. To spoil the surprise, it’s a powerful optimization that can save significant compute resources and accelerate queries, helping Firebolt run faster and cheaper. Watch the talk above to learn more, check out Alex Hall’s blog about subresult reuse for an even deeper dive, and if you want to see it in action for yourself, try Firebolt out for free with $200 in credits.
