Compute on Firebolt is delivered in the form of on-demand, stateless engines. Each engine is composed of 1 to 10 clusters, and each cluster is composed of 1 to 128 compute nodes with a choice of node types. This multidimensional elasticity helps meet various performance requirements, supporting scale-up, scale-out, and concurrency scaling. Workloads can run on a single engine or across multiple read-write engines accessing the same shared data, optimizing both cost and performance while ensuring workload isolation. Integration and operation are streamlined with an easy-to-use SQL API that supports engine management and online scaling.
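The three scaling dimensions described above can be sketched as a small data model. This is only an illustration, not Firebolt's API: the `EngineSpec` class, its field names, and the `"M"` node type label are hypothetical, and only the numeric limits (1 to 10 clusters, 1 to 128 nodes per cluster) come from the text.

```python
from dataclasses import dataclass

# Limits taken from the description above. All names here are hypothetical.
MIN_CLUSTERS, MAX_CLUSTERS = 1, 10
MIN_NODES, MAX_NODES = 1, 128


@dataclass(frozen=True)
class EngineSpec:
    """Hypothetical description of one engine's shape."""
    node_type: str          # scale-up dimension (placeholder label, not a real type name)
    nodes_per_cluster: int  # scale-out dimension
    clusters: int           # concurrency-scaling dimension

    def __post_init__(self) -> None:
        # Reject shapes outside the documented ranges.
        if not MIN_CLUSTERS <= self.clusters <= MAX_CLUSTERS:
            raise ValueError(f"clusters must be in [{MIN_CLUSTERS}, {MAX_CLUSTERS}]")
        if not MIN_NODES <= self.nodes_per_cluster <= MAX_NODES:
            raise ValueError(f"nodes_per_cluster must be in [{MIN_NODES}, {MAX_NODES}]")

    @property
    def total_nodes(self) -> int:
        # Every cluster in an engine has the same number of nodes in this sketch.
        return self.nodes_per_cluster * self.clusters


spec = EngineSpec(node_type="M", nodes_per_cluster=4, clusters=2)
print(spec.total_nodes)  # 8
```

Under these assumptions, changing `node_type` corresponds to scaling up, changing `nodes_per_cluster` to scaling out, and changing `clusters` to scaling for concurrency.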
The metadata service is central to Firebolt, maintaining consistency across its distributed architecture, regardless of the number of nodes, clusters, or engines. It ensures transactional integrity and strong consistency for distributed writes within the cluster. Furthermore, isolated reads and writes can be performed from any provisioned cluster against any data managed by Firebolt. This service also underpins security and observability, promoting a secure, transparent operational environment. Metadata access is simplified via information_schema objects for easier management and ecosystem integration.
Firebolt's managed storage service merges the speed of block storage with the scalability of object storage, enhanced by features like tiered storage and adaptive prefetching. It uses a columnar format for efficient data compression, organizing data in tablets with sparse indexing to deliver effective data pruning and rapid response times. This fully managed storage provides both performance and capacity efficiency. Alternatively, for those requiring integration with a data lake, a breadth of native file formats is supported for efficient data exploration and ingestion. External tables allow direct querying of various open file formats (including PARQUET, JSON, CSV, TSV, AVRO, and ORC) stored on Amazon S3.
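To make the idea of pruning with a sparse index concrete, here is a minimal, generic sketch. It does not reflect Firebolt's internal format: the `Tablet` class, the per-tablet minimum and maximum key, and the `prune` function are assumptions chosen for illustration. The point is only that keeping a small amount of range metadata per tablet lets a query skip tablets that cannot contain matching rows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tablet:
    """Hypothetical tablet summary: one sparse index entry per tablet."""
    name: str
    min_key: int  # smallest index key stored in the tablet
    max_key: int  # largest index key stored in the tablet


def prune(tablets: List[Tablet], lo: int, hi: int) -> List[Tablet]:
    """Keep only tablets whose key range overlaps the query range [lo, hi]."""
    return [t for t in tablets if t.max_key >= lo and t.min_key <= hi]


tablets = [
    Tablet("t0", 0, 99),
    Tablet("t1", 100, 199),
    Tablet("t2", 200, 299),
    Tablet("t3", 300, 399),
]

# A filter such as "key BETWEEN 150 AND 220" only needs to read two tablets.
survivors = prune(tablets, 150, 220)
print([t.name for t in survivors])  # ['t1', 't2']
```

In this sketch the remaining tablets are never read, which is where the reduction in scanned data comes from.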