We’ve introduced a new option for you to choose from when you’re provisioning your Firebolt engines: compute family. This blog looks at why we chose to do this, what’s going on inside Firebolt when you choose a compute family, and which compute family is right for you as a Firebolt user.
One big caveat for this blog: the exact compute families and sizes are intentionally left unspecified. This is for a number of reasons, the simplest being that we expect them to change soon and somewhat regularly, and there’s little value in publishing details we’d then have to constantly update and correct. Now, onto the rest of the blog.
The backend of Firebolt
When you start up a Firebolt engine, what you’re getting is one or more AWS EC2 instances imaged with Firebolt and set up to run your queries with security and access control in mind. Firebolt users have always had some say in the amount of resources they consume: by increasing the node size in your engine, you provision a larger EC2 instance. By adding more nodes or clusters, you add more of those EC2 instances to your engine.
But there’s a separate choice we’re now giving users: a different compute family. When we say “compute family,” we’re describing the EC2 compute family.
If you’re unfamiliar with AWS EC2, it has a lot of different instance types configured in many different ways: there’s general purpose, compute-optimized, memory-optimized, storage-optimized, accelerated computing, and HPC-optimized. Among those categories, there are many different possible compute families, too, whether that’s Hpc7g, C6i, or M5zn. These selections come with different processor manufacturers and different generations of those processors, various ratios of vCPUs to RAM to SSD, and a number of other nuances and details. The best way of framing it is that if you have a task that requires a predictable amount of hardware, you can find an EC2 instance right-sized for your task.
Getting back to Firebolt, we have a task for our EC2 instances: running Firebolt queries. And unfortunately, because our users have very different usage patterns, this is less predictable than you’d hope. The size and shape and cardinality of a dataset can have a major impact on the resources necessary to query it; the size and shape and frequency of queries is even more significant. A Firebolt engine may have radically different tasks asked of it, and this leads to different requirements on the hardware end of things.
Historically, Firebolt has used a storage-optimized EC2 family to handle everything. Firebolt uses tiered caching to minimize reads over the network and ensure that data is queried as efficiently as possible. If a Firebolt user has a lot of data, Firebolt thrives on RAM and SSD capacity. As you repeatedly access data in your queries, Firebolt’s caching mechanisms minimize expensive network reads, saving on cost and delivering the low-latency performance that Firebolt is known for. The storage-optimized compute family makes sense as a default and a place to get started because it provides more resources for caching, and we expect it will remain the best choice for a large number of Firebolt users.
Why add a new compute family?
Firebolt is not one-size-fits-all, and the variance in user workloads necessitates variance in compute families, too. There are two reasons we made the choice to add this option to Firebolt.
The first is the more important one: we have customers who are bottlenecked by CPU. These tend to be the customers with relatively predictable, highly selective queries accessing the same subset of their data. Though they may have a lot of data overall, the data they are routinely querying and processing is relatively small, limiting the need for much memory or disk. With memory and disk utilization substantially lower than CPU utilization, that opens the door to finding a hardware configuration that works better for them. We could save them money by provisioning fewer of the resources they don’t need. Or, we could improve their performance by using those savings to provide them with more vCPUs at the same cumulative cost.
The other reason to add a new compute family is closely related: in the real world, a huge number of small to mid-sized companies don’t have an immense amount of data. The conversation around data engineering and data infrastructure tends to be dominated by the companies working with petabytes or exabytes, but it’s important to remember that the majority of data teams are working with a dataset that can fit on a single SSD. Even if they are routinely accessing all of their data, those teams don’t need storage-optimized EC2 instances, so it was time to look into providing them a more efficient offering.
Resource utilization is a bin-packing problem
In an ideal scenario, the best way to maximize efficiency and stretch your money as far as it can possibly go is to find a compute family that uses equal proportions of CPU, memory, and disk for your workload. With no clear bottleneck and every resource being leveraged to its full potential, there’s no wasted hardware, which in turn means no wasted money. If every bin is nearly full without overflowing, you’re getting the most of what you’re paying for. You get better performance for the best possible price, and everyone is happy.
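The bin-packing framing can be made concrete with a small sketch. All the capacities and usage numbers below are invented for illustration, not Firebolt measurements; the point is just that the most-full bin caps your useful work, and everything below it is headroom you paid for but can’t use at this workload shape:

```python
# Sketch of the bin-packing framing: each resource is a "bin", and the
# fullest bin is the bottleneck. Numbers are invented for illustration.

def bin_fullness(used: dict, capacity: dict) -> dict:
    """Return the utilization fraction of each resource bin."""
    return {r: used[r] / capacity[r] for r in capacity}

def unused_headroom(fullness: dict) -> dict:
    """How far each bin sits below the bottleneck bin; this gap is
    hardware you are paying for but not using at this workload shape."""
    bottleneck = max(fullness.values())
    return {r: bottleneck - f for r, f in fullness.items()}

# Invented example: a CPU-bound workload on a storage-heavy instance.
capacity = {"vcpu": 16, "ram_gib": 128, "ssd_gib": 1900}
used = {"vcpu": 15, "ram_gib": 40, "ssd_gib": 300}

fullness = bin_fullness(used, capacity)
headroom = unused_headroom(fullness)
print(fullness)  # vCPU ~0.94 full; RAM ~0.31; SSD ~0.16
print(headroom)  # RAM and SSD are mostly paid-for headroom
```

With numbers like these, the vCPU bin is nearly full while the RAM and SSD bins are mostly empty, which is exactly the mismatch a differently shaped compute family can fix.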
So, let’s look at a couple examples. How efficiently are we using our bins of CPU, memory, and disk? When you kick off a query, you can look at the resource utilization for your engine in the Firebolt UI:

If your utilization looks like that, you’ll ask the same questions we did regarding efficient resource allocation. This is, of course, a contrived example: I started up a cold engine with a single small node and ran:
SELECT * FROM <terabyte-sized-table>;
to guarantee minimal memory utilization and no caching whatsoever. It’s the worst-case scenario and not going to happen very often: a follow-up query that scans the entire table again would be able to use the cache, even if it’s still not especially memory-intensive.
Cancelling that query and running a few dozen standard, highly-selective data app queries immediately makes the resource allocation look more reasonable:

If I were running more queries concurrently and could fully saturate the engine, memory and CPU would cap out at about the same time. Disk space might turn into the first bottleneck here depending on how much data I have in my database, and the historical default of a storage-optimized compute family makes sense for these queries.
We repeated this same experiment under a number of different scenarios, looked at utilization graphs for a number of customer engines and engines running our own benchmark, and it became clear that we could benefit from offering an alternative with differently-shaped bins.
Introducing the compute-optimized family
The compute-optimized family for engines lets you right-size your engine for your workload and ensures you’re spending your money on the compute resources you need. For smaller amounts of data, or query patterns that access the same subset of a dataset the majority of the time, having more memory and disk isn’t necessarily the best option for a Firebolt user.
To be more specific about it, a compute-optimized engine provides the exact same number of vCPUs as the storage-optimized family at the same node size, with less memory and less disk space. Because this is fewer resources overall, a compute-optimized engine uses half the FBUs and thus comes at half the price of an equivalently sized storage-optimized engine.
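To make that math concrete, here’s the trade-off in miniature. The FBU rates below are placeholders I chose for illustration (the post deliberately doesn’t publish exact sizes or rates); only the 2:1 ratio comes from the text:

```python
# Sketch of the price/performance trade described above.
# FBU rates are invented placeholders, not published Firebolt pricing;
# the only real fact used is that compute-optimized costs half the FBUs.

STORAGE_OPT_FBU = 8   # hypothetical FBUs/hour for a storage-optimized node
COMPUTE_OPT_FBU = 4   # same node size, same vCPUs, half the FBUs

# Option 1: keep the same vCPUs and pay half.
relative_cost = COMPUTE_OPT_FBU / STORAGE_OPT_FBU
print(f"cost vs. storage-optimized, same vCPUs: {relative_cost:.0%}")

# Option 2: spend the same FBUs on compute-optimized nodes and double your vCPUs.
nodes_for_same_spend = STORAGE_OPT_FBU // COMPUTE_OPT_FBU
print(f"compute-optimized nodes for the same spend: {nodes_for_same_spend}")
```

In other words, a workload that doesn’t need the extra memory and disk can take the savings as a lower bill or as twice the vCPUs.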
How we chose the new compute family
As discussed earlier, EC2 gives you a lot of possible instance types. We knew from analyzing Firebolt engine utilization for users and our benchmark that we needed comparatively more CPU, but even in the EC2 compute-optimized category, there are… a lot of options.

That doesn’t even encompass the whole field, because many of these instance types have a variation with a “d” tacked onto the end to indicate that they also have a local disk. One requirement was clear: we did, in fact, want a local disk. Tiered caching depends on having a disk, and losing the cache is a big disadvantage as soon as you need to cache more than you can store in memory. Even worse, if you run out of memory when processing a query (a greater risk on a compute family with comparatively less memory), you spill to disk. Spilling to disk gets very ugly when you don’t have a disk. While these EC2 instances could be connected to EBS and spill over the network, that runs the risk of incurring EBS costs that either Firebolt or Firebolt users would have to pay. When the entire goal of adding the new compute family is to improve price-performance, potentially high EBS costs would be counterproductive.
That requirement narrowed the field because not every instance type comes with a disk, but a wide range of candidates remained. What’s the best way to narrow them down further? Benchmark all of them. So we did. Out of curiosity, we also benchmarked a number of memory-optimized instances and compute-optimized instances without disks.
The results of that onslaught of testing unfortunately can’t be made publicly available, but rest assured there exists a spreadsheet with a mind-boggling number of data points. All the data steered us towards adding a new compute-optimized family, which nearly doubled the price-performance ratio compared to our storage-optimized instance for certain workloads. Double is good.
As a perhaps noteworthy aside, we also didn’t end up going with the fastest possible compute family for now, as the higher cost led to a worse price-performance ratio. But you should know that somewhere out there, there’s a compute instance we could run Firebolt on that would make it even faster.
Selecting your compute family as a Firebolt user
So when should you use this new compute family? The simple answer is, of course, it depends. It’s hard to generalize much with advice here because, depending on how much data you have, how much of that data you’re querying, and how large the engine you’re using in Firebolt is, the math can get complicated.
We can start simple: if you have a very small amount of data (i.e., 10 GB or less), you should almost always be using a compute-optimized engine.
As your data size grows, the answer depends more on what kind of queries you’re trying to run. For highly selective queries that are accessing the same tables over and over again, a compute-optimized engine has a good chance of making sense. For queries with complex joins or data transformations, storage-optimized will often make more sense. This isn’t absolute because there are so many factors at play that being too prescriptive runs the risk of being misleading.
The practical answer is that you need to run some queries in Firebolt to get a better idea of your hardware needs. We recommend starting with a storage-optimized engine, as having more memory makes it less likely you’ll spill to disk and experience steep slowdowns. Firebolt is very fast, and we want you to experience it when it’s firing on all cylinders, not when it’s capping out on memory.
Once you’ve had an engine running for an hour, handling production queries on production data, you can analyze the resource utilization and evaluate whether you’re encountering a bottleneck on CPU, memory, or storage. If you see CPU maxed out and memory and disk utilization below half, you can comfortably move to a compute-optimized engine that consumes the same FBUs to increase your vCPUs and address that bottleneck. If CPU utilization isn’t maxed out, but memory and disk utilization are even lower, you may be able to move to a compute-optimized engine that uses fewer FBUs, with no degradation in performance but major savings on cost.
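That rule of thumb can be sketched as a small decision function. The thresholds here (“maxed out” as roughly 90%, “below half” as under 50%) are my own illustrative choices, not official Firebolt guidance; plug in the peak utilization fractions you observe in the engine’s utilization view:

```python
# Sketch of the engine-selection heuristic described above.
# Thresholds are illustrative assumptions, not official Firebolt guidance.

def suggest_engine(cpu: float, memory: float, disk: float) -> str:
    """Given peak utilization fractions (0.0-1.0) from a representative
    hour of production queries, suggest a direction to move in."""
    maxed, below_half = 0.90, 0.50
    if cpu >= maxed and memory < below_half and disk < below_half:
        # Same FBU spend, more vCPUs: trade unused memory/disk for speed.
        return "compute-optimized, same FBUs (more vCPUs)"
    if cpu < maxed and memory < cpu and disk < cpu:
        # CPU isn't saturated and memory/disk are even lower: downsize spend.
        return "compute-optimized, fewer FBUs (save cost)"
    # Memory or disk is doing real work: keep the cache-friendly default.
    return "storage-optimized (keep the default)"

print(suggest_engine(cpu=0.95, memory=0.35, disk=0.20))  # more vCPUs
print(suggest_engine(cpu=0.60, memory=0.30, disk=0.15))  # save cost
print(suggest_engine(cpu=0.70, memory=0.85, disk=0.90))  # keep the default
```

The real decision has more factors (data growth, query mix, concurrency), so treat this as a starting point rather than a rule.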
And that goes back to why we did this in the first place: with the option to select your compute family, you can save money, improve performance, or both.
Try it out today
If you’re not already signed up for Firebolt, go sign up today. On our smallest compute-optimized engine, the $200 in free credits gives you 215 hours of engine uptime. With auto-start and auto-stop, that’s going to last you for a long while, so get experimenting with our sample datasets or even your own data.