Firebolt is a modern cloud data warehouse for data-intensive applications. Its compute infrastructure, called Engines, is tuned for today’s large-scale, low-latency workloads. In Firebolt, an engine is the compute resource used to ingest data into Firebolt tables and to run queries on that data.

Modern analytic workloads use data collected from multiple sources, in varied formats, and at a scale that continues to grow rapidly. They are also highly dynamic, so their infrastructure needs can vary widely and unpredictably. To keep such workloads running smoothly, customers need infrastructure that is flexible and cost-effective, so that they pay only for the resources they need.

In this post, we take a closer look at how Firebolt provides Granular Scaling: the ability to add compute nodes incrementally when customers need to scale their workloads. Granular control over how much compute is added to an engine helps Firebolt customers control their infrastructure costs, because they are not forced to double the number of nodes (and their costs) every time they need to scale. Since customers pay only for the resources they need and actually consume, granular scaling helps optimize the price-performance of their workloads.
Firebolt Engines
Before we look at how granular scaling works in Firebolt, let us take a quick look at some key engine concepts. A cluster is a collection of nodes of a single type. Nodes come in four types: Small (S), Medium (M), Large (L) and XLarge (XL), each providing a certain amount of CPU, RAM and storage. An engine is a collection of one or more clusters.
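To make these concepts concrete, an engine's shape (node type, nodes per cluster, and cluster count) is specified when the engine is created. The following is a minimal sketch based on Firebolt's `CREATE ENGINE` DDL; the engine name is purely illustrative:

```sql
-- Sketch: create an engine whose single cluster has three "M" nodes.
-- The engine name is hypothetical; TYPE, NODES and CLUSTERS are the
-- engine properties described above.
CREATE ENGINE IF NOT EXISTS AnalyticsEngine WITH
  TYPE = M        -- node type: S, M, L or XL
  NODES = 3       -- number of nodes in each cluster
  CLUSTERS = 1;   -- number of clusters in the engine
```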
Scaling Engines
Customers can scale their engines vertically (Scale Up/Down), changing the node type used in their engines based on the needs of their workloads. For example, a customer can scale up their engine to an “M” type node from an “S” type node or scale down from an “XL” node to an “L” node as the demands of their workload changes. They can also scale out an engine horizontally, adding more nodes, to deal with a workload that can benefit from distributing queries across multiple nodes. When no longer needed, these additional nodes can be removed from the engine, saving costs. To deal with an increase in the number of queries and/or users, Firebolt offers concurrency scaling, allowing customers to add more clusters to their engine.
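Each of these three scaling dimensions corresponds to an engine property that can be changed on a running engine. A sketch using Firebolt's `ALTER ENGINE` DDL (the engine name is illustrative):

```sql
-- Scale up: change the node type (vertical scaling).
ALTER ENGINE MyEngine SET TYPE = M;

-- Scale out: add nodes to each cluster (horizontal scaling).
ALTER ENGINE MyEngine SET NODES = 4;

-- Concurrency scaling: add another cluster to the engine.
ALTER ENGINE MyEngine SET CLUSTERS = 2;
```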
Note: All scaling operations in Firebolt are dynamic, meaning customers do not need to stop their engines to perform them, and hence incur no service interruption.
Granular Scaling
To illustrate how granular scaling works in Firebolt, consider an engine named MyEngine that is serving an analytics workload. The engine comprises a single cluster with two “S” nodes, created as shown below.
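A sketch of the creation statement for this engine, using Firebolt's `CREATE ENGINE` DDL (the engine name comes from the scenario above):

```sql
-- Sketch: a single cluster of two "S" nodes.
CREATE ENGINE MyEngine WITH
  TYPE = S
  NODES = 2;
```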
Now, let us assume that the amount of data being processed to serve the incoming queries continues to increase, and that you must maintain the performance of this workload to meet your SLA. You would like to add more compute to the engine, but do not necessarily want to double the engine size. With Firebolt, you can add nodes in increments of one, as shown below:
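Growing the cluster by a single node can be sketched as follows, using `ALTER ENGINE` on the running engine (no restart required):

```sql
-- Sketch: grow the cluster from two "S" nodes to three.
ALTER ENGINE MyEngine SET NODES = 3;
```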
Let us look at another example to show how customers can scale along multiple dimensions, also illustrating how granular scaling helps save costs. For this example, let us say that you are running an engine with a single cluster that comprises one node of type “XL”, as shown below:
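A sketch of the creation statement for this second scenario, again using Firebolt's `CREATE ENGINE` DDL:

```sql
-- Sketch: a single cluster with one "XL" node.
CREATE ENGINE MyEngine WITH
  TYPE = XL
  NODES = 1;
```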
As in the earlier example, the amount of data being processed by the workload continues to grow, and you would like to add more compute power to deal with this growth. You could add another node of type “XL”, but that may be overkill and would double your costs. An “XL” node provides the same amount of CPU, RAM and SSD as four “M” nodes. So, instead of adding another “XL” node, you can scale the engine down to “M” type nodes and, for the incremental compute, scale out by one more node, resulting in an engine with five “M” nodes instead of two “XL” nodes. Both scaling down to an “M” type and scaling out to five nodes can be done with a single SQL statement, as shown below:
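A sketch of that combined statement; it assumes `ALTER ENGINE` accepts both properties in one `SET` clause, as the text's "single SQL statement" suggests:

```sql
-- Sketch: scale down from "XL" to "M" and out to five nodes at once.
ALTER ENGINE MyEngine SET TYPE = M NODES = 5;
```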
To understand the cost savings from incrementally scaling your engines, we can use Firebolt’s consumption metric for engines, the Firebolt Unit (FBU). An FBU corresponds to a certain amount of compute, and the FBU rate scales linearly with the node type, as shown below:
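The rates implied by the arithmetic in this example (an “M” node at 16 FBU and an “XL” node at 64 FBU), extended linearly across the node types, look like this; check Firebolt's pricing documentation for the authoritative figures:

    Node type    FBU rate
    S            8
    M            16
    L            32
    XL           64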
For the example scenario discussed above, instead of being forced to use two XL nodes and pay for 128 FBUs (2 * 64), you are only paying for 80 FBUs by using five M type nodes (5 * 16). This results in 37.5% cost savings, while helping you meet the performance needs of your workload.
Note that right-sizing engines depends on your workload and your price-performance requirements. While the above examples shed some light on how granular scaling works and how easily you can apply it in Firebolt, we strongly recommend that you leverage the observability metrics provided by engines to size your engines optimally. These metrics provide insight into how the hardware resources (CPU, RAM and SSD) are being utilized. In addition, Firebolt provides metrics to help you monitor how long your queries have been running and how many of them are getting queued. For more information on how to use engine observability to size your engines, see the engine observability guide in the Firebolt documentation.