I often ask the question: "Is Firebolt in tune with the problems that are endemic to analytics platforms today, and is Firebolt changing how the analytics community operates at large?" When I see David Jayatillake and Tristan Handy comment on Firebolt's approach, it is clear that Firebolt is on track. These are folks who understand the industry, and they call things out as they see them. In Mr. Jayatillake's case, it was a balanced review of Firebolt, and he is right: Firebolt has built the right foundations and is evolving to address broader use cases.
Firebolt wasn’t the only thing on Mr. Handy’s mind in that post. He also mentioned Erica Louie’s article explaining how we are working too hard, and another article about why some cloud data warehouse platforms are getting expensive.
I started subconsciously making connections between these articles. I can’t help it. I have spent over 20 years deploying and managing data systems, and I see the world in relationships.
Data warehouses are a special type of big data workload. They are meant to be time-variant and non-volatile. “Non-volatile” implies immutability, which is an asset, not a barrier. If my data warehouse maintains a non-volatile, time-variant set of data, I can know things I couldn’t otherwise know. If, for instance, I ledger all changes to every order_detail record rather than updating the record in place, I can see every influence on that record over its lifetime. Updates and deletes are the “forget me stick” of the data world. They’re like your data warehouse getting a concussion.
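To make the ledger idea concrete, here is a minimal sketch in plain Python. It is not Firebolt code, and every name in it (`LedgerEntry`, `OrderDetailLedger`, `record`, `history`, `as_of`) is hypothetical, chosen only for illustration. It assumes timestamps are ISO-8601 strings in one consistent format, so they sort correctly as text. Changes are only ever appended, and the state of an order_detail record at any point in time is rebuilt by replaying its entries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class LedgerEntry:
    """One immutable change to one field of an order_detail record."""
    order_detail_id: int
    changed_at: str  # assumed ISO-8601, same format throughout
    field: str
    value: Any


class OrderDetailLedger:
    """Hypothetical append-only ledger: no updates, no deletes."""

    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []

    def record(self, order_detail_id: int, changed_at: str,
               field: str, value: Any) -> None:
        # Every change becomes a new entry; nothing is overwritten.
        self._entries.append(
            LedgerEntry(order_detail_id, changed_at, field, value)
        )

    def history(self, order_detail_id: int) -> List[LedgerEntry]:
        # Full lifetime of one record, oldest change first.
        matching = [e for e in self._entries
                    if e.order_detail_id == order_detail_id]
        return sorted(matching, key=lambda e: e.changed_at)

    def as_of(self, order_detail_id: int, timestamp: str) -> Dict[str, Any]:
        # Replay changes up to the timestamp to rebuild the state then.
        state: Dict[str, Any] = {}
        for entry in self.history(order_detail_id):
            if entry.changed_at <= timestamp:
                state[entry.field] = entry.value
        return state


ledger = OrderDetailLedger()
ledger.record(1, "2023-01-01T09:00:00", "quantity", 2)
ledger.record(1, "2023-01-01T09:00:00", "unit_price", 10.0)
ledger.record(1, "2023-01-03T14:30:00", "quantity", 5)

before_change = ledger.as_of(1, "2023-01-02T00:00:00")
after_change = ledger.as_of(1, "2023-01-04T00:00:00")
```

Here `before_change` still shows the original quantity of 2 while `after_change` shows 5, and `ledger.history(1)` keeps all three entries. An in-place update would have left only the final quantity and erased the fact that it was ever different.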
When building out a new cloud data warehouse (CDW), it made good sense to cover these requirements first. The team at Firebolt did an amazing job. Primary, join, and aggregating indexes made data warehouses perform at amazing speed, and many man-hours were removed from the process because summary tables were no longer necessary.
Mr. Jayatillake’s concern over Firebolt’s lack of mutability has merit for several big data workloads. Our engineering team is working to deliver mutations at scale so we can expand beyond existing workloads seamlessly. Row-level mutations are already in alpha and are coming very soon. Like most things the Firebolt engineering team does, this is going to be fast and efficient.
Mr. Handy was correct when he wrote “decisions about how to shape your schemas will depend very significantly on use case and platform”. If your platform doesn’t perform, you will have to make design concessions. I’ve been working with Firebolt for a year now, and I’m still unlearning many of the design patterns I used on other platforms. I’ve found myself reverting to design patterns I used 20 years ago: highly relational, low transform, and immutable. And we can do that because Firebolt’s indexing is there to ensure things go fast without unnecessary schema.
And to Ms. Louie’s point, data engineers are working too hard because platforms don’t perform. This is Firebolt’s competitive advantage: not just reduced cloud costs, but massively reduced people costs. Building the next-generation platform, designed for data apps, requires engineering rigor with a customer-centric mindset. There is plenty to come on this as Firebolt continues to build, I can assure you of that. Keep your eyes peeled…