FAQ

What's the best practice for setting up a Firebolt engine to support high concurrency?

For high concurrency, use multiple clusters within your engine. Clusters help handle more simultaneous queries by distributing the load. Keep in mind that cache is shared across nodes in a cluster, but not between clusters, so the right balance depends on your workload. You can also consider using auto-scaling to dynamically adjust resources based on demand.

https://firebolt013marketing.webflow.io/faqs-v2-knowledge-center/whats-the-best-practice-for-setting-up-a-firebolt-engine-to-support-high-concurrency

SQL

What steps are involved in switching production workloads to Firebolt once migration validation is complete?

What are Firebolt’s best practices for handling complex dashboard queries with varying granularity (e.g., daily, weekly, monthly)?

How can performance risks be mitigated when dealing with tenants of significantly different data sizes?

What are Firebolt’s best practices regarding the use of views versus pre-joined tables for aggregations?

Are primary indexes critical to Firebolt's query performance, and how should they be managed during migration?

How do I invite more users to join my account?

What are the best practices for setting up Superset to connect to Firebolt for dashboarding?

How should primary indexes be selected in Firebolt to optimize query performance?

What factors significantly impact query performance when joining high-cardinality tables in Firebolt?

Are semi-joins (WHERE IN clauses) generally more performant in Firebolt than explicit joins for filtering datasets?

How do I query my S3 buckets using IAM roles?

What is the difference between CPU time and thread time in Firebolt's query profile analysis, and why is it important?

What's the best practice for setting up a Firebolt engine to support high concurrency?

How can I learn if there is an issue causing interruption to Firebolt services or applications?

What's the easiest way to label a query when running it from the Python SDK?

Explore All FAQs

What's the best practice for setting up a Firebolt engine to support high concurrency?

How do I query my S3 buckets using IAM roles?

What's the easiest way to label a query when running it from the Python SDK?

Are semi-joins (WHERE IN clauses) generally more performant in Firebolt than explicit joins for filtering datasets?

What is the difference between CPU time and thread time in Firebolt's query profile analysis, and why is it important?

How can I learn if there is an issue causing interruption to Firebolt services or applications?

What factors significantly impact query performance when joining high-cardinality tables in Firebolt?

Is there a way to Auto-Format or Beautify my query?

What are the considerations for splitting into separate Databases and Database best practices?

How do I invite more users to join my account?

Are primary indexes critical to Firebolt's query performance, and how should they be managed during migration?

Does Firebolt provide tools or capabilities to monitor database performance and scaling activities?

What steps are involved in switching production workloads to Firebolt once migration validation is complete?

What are Firebolt’s best practices for handling complex dashboard queries with varying granularity (e.g., daily, weekly, monthly)?

What is the recommended approach for incremental data ingestion from PostgreSQL to Firebolt via AWS S3?

Does Firebolt recommend separating ingestion and query engines, and why?

How can performance risks be mitigated when dealing with tenants of significantly different data sizes?

How should primary indexes be selected in Firebolt to optimize query performance?

When should aggregating indexes be used in Firebolt, and what are their limitations?

What are Firebolt’s best practices regarding the use of views versus pre-joined tables for aggregations?

What are the best practices for setting up Superset to connect to Firebolt for dashboarding?

Can Firebolt support a unified table for multiple reporting use cases (e.g., unique counts, injected data, and regular event data) instead of using multiple tables?

How can query performance be optimized when querying event data with minute-level granularity in Firebolt?

How can we access Firebolt 2.0 engine cost data? How can we programmatically retrieve and export this data?

How does Firebolt support handle customer access, and can we restrict it?

Is it possible to rename the organization URL (e.g., from shopware.firebolt.io to velo.firebolt.io)?

Does Firebolt have Generative AI features or an AI roadmap relevant to analytics use cases?

How should we handle user management across different Firebolt accounts?

How does connecting to the AWS Marketplace for billing work?

How often do we need to run vacuum if we do small, frequent updates—and does auto vacuum solve this?

Should we create separate Firebolt accounts for development, staging, and production, or use a single account with multiple databases?

If we use Change Data Capture (CDC) with very incremental updates, what concerns should we have about concurrency and vacuum tasks in Firebolt?

Is there a limit to how much data a single Firebolt engine can handle if I see a reference to a 1.8 TB size?

If we change the tenant ID, will the sub-result cache still be used?

How does Firebolt handle query performance as data within a single tenant expands?

How should we structure the primary indexes?

Is continuous 24/7 ingestion engine usage required, or can the engine be started and stopped as needed?

What ingestion throughput can organizations expect, and how does Firebolt handle large batch loads or full refreshes?

How can organizations estimate the appropriate Firebolt engine size and associated costs for a given query load?

Why do some queries run sub-second in Firebolt while others might take multiple seconds?

In the absence of true streaming, how can a Node.js application handle large Firebolt query results without running out of memory?

Does Firebolt still require manual vacuuming, or is there an automatic process to reclaim storage space and optimize performance? Should vacuuming be done on a dedicated engine?

How can organizations optimize queries that filter on high-cardinality timestamp columns (such as a ‘closed_at’ column)?

What is the recommended approach for handling multiple workloads (read vs. write) in Firebolt? Should separate engines be used?

Does Firebolt offer ongoing query-optimization assistance, and is there an extra cost associated with this service?

Is it better to insert data into Firebolt one row at a time or in batches for real-time workloads?

What is the recommended approach for migrating an existing Firebolt environment to a new organization or domain (for example, if a team is switching from one AWS org to another)?

Does Firebolt always require creating additional tables, or can large joins be handled directly in SQL as with other warehouses (e.g., Snowflake)?

How should large table joins be handled in Firebolt to optimize query performance?

How can Firebolt be integrated with Apache Superset for visualization?

Does Firebolt provide an execution plan for queries, similar to Athena?

How do aggregating indexes work in Firebolt, and what are their trade-offs?

What are the best practices for structuring queries in Firebolt for performance optimization?

How can users extract and save queries in Firebolt for collaboration?

How can Firebolt users optimize query performance by leveraging primary indexes?

How can users ensure that their Firebolt service account setup is working correctly?

How does Firebolt ensure low-latency performance for AI applications?

How does Firebolt’s vector search compare to dedicated vector databases?

Does Firebolt support real-time AI workloads?

How does Firebolt’s AI-powered data warehouse differ from competitors like Snowflake, Amazon Redshift, and Google BigQuery?

Do you charge extra for AI features like vector search?

Do we need to create new service account for new firebolt version or exist version will work fine?

Could you please give explanation, when it makes sense to scale engines with more clusters, when with higher amount of nodes and when with bigger engines?