FAQ

Find quick answers to common questions about Firebolt.

How does Firebolt’s vector search compare to dedicated vector databases?

Firebolt supports vector search but does not generate embeddings. Unlike dedicated vector databases, which specialize in unstructured data, Firebolt integrates vector search within a high-performance analytical data warehouse. This allows you to run hybrid queries, which combine structured filters with similarity search over embeddings, without managing separate systems.

If you already have embeddings generated by models from OpenAI or Hugging Face, or by your own ML pipeline, Firebolt can store and query them with low latency, enabling AI-powered search and recommendations within your existing analytics environment.
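To make the idea of a hybrid query concrete, here is a minimal in-memory sketch in Python. It is not Firebolt's API or SQL syntax; the `PRODUCTS` rows, the `category` column, and the `hybrid_search` helper are hypothetical names used only for illustration. It shows the two halves of a hybrid query: a structured filter followed by a ranking on cosine similarity against precomputed embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(rows, query_embedding, category, top_k=2):
    """Filter on a structured column, then rank the survivors by similarity."""
    # Structured part: an ordinary equality filter.
    candidates = [r for r in rows if r["category"] == category]
    # Unstructured part: order by similarity to the query embedding.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["embedding"], query_embedding),
        reverse=True,
    )
    return [r["id"] for r in ranked[:top_k]]


# Hypothetical rows; the embeddings are assumed to be generated elsewhere.
PRODUCTS = [
    {"id": 1, "category": "shoes", "embedding": [1.0, 0.0]},
    {"id": 2, "category": "shoes", "embedding": [0.6, 0.8]},
    {"id": 3, "category": "hats", "embedding": [1.0, 0.0]},
    {"id": 4, "category": "shoes", "embedding": [0.0, 1.0]},
]

print(hybrid_search(PRODUCTS, [1.0, 0.0], "shoes"))
```

In a warehouse, both steps would typically be expressed in a single SQL statement, so the filter and the similarity ranking run in the same engine rather than in two separate systems.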

What steps are involved in switching production workloads to Firebolt once migration validation is complete?

Switching production workloads to Firebolt typically involves updating connection configuration so that applications and tools point to Firebolt endpoints. Once all validation is complete and the data is already loaded, this process is straightforward.
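One way to keep such a cutover small and reversible is to drive the connection target from configuration. The sketch below is an assumption about how an application might be organized, not a Firebolt feature: the `WAREHOUSE_TARGET` variable, the `ENDPOINTS` table, and both host names are hypothetical placeholders.

```python
import os

# Hypothetical connection settings. Real values would come from your own
# deployment configuration; these hosts are placeholders, not real endpoints.
ENDPOINTS = {
    "legacy": {"host": "legacy-warehouse.example.invalid", "database": "analytics"},
    "firebolt": {"host": "firebolt-endpoint.example.invalid", "database": "analytics"},
}


def resolve_endpoint(env=None):
    """Return the connection settings selected by WAREHOUSE_TARGET.

    Defaults to the legacy system so that nothing switches by accident.
    """
    env = os.environ if env is None else env
    target = env.get("WAREHOUSE_TARGET", "legacy")
    if target not in ENDPOINTS:
        raise ValueError(f"Unknown WAREHOUSE_TARGET: {target!r}")
    return ENDPOINTS[target]


# Cutting over (or rolling back) is then a one-line configuration change.
print(resolve_endpoint({"WAREHOUSE_TARGET": "firebolt"})["host"])
```

Keeping the previous settings in place until the new target has run in production for a while makes a rollback equally simple.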

Explore All FAQs

Is there a way to set the tmp directory in the Python SDK when running in AWS Lambda, where the default directory is read-only?

Is it possible to see query history from further back than the last engine restart?

Can I manipulate or filter S3 data when using COPY FROM to create or insert data into a table?

How do I connect my Firebolt database to Tableau?

Why can't I connect to my database with upper-case letters?

What's the best practice for setting up a Firebolt engine to support high concurrency?

How do I query my S3 buckets using IAM roles?

What's the easiest way to label a query when running it from the Python SDK?

Are semi-joins (WHERE IN clauses) generally more performant in Firebolt than explicit joins for filtering datasets?

What is the difference between CPU time and thread time in Firebolt's query profile analysis, and why is it important?

How can I learn if there is an issue causing interruption to Firebolt services or applications?

What factors significantly impact query performance when joining high-cardinality tables in Firebolt?

Is there a way to auto-format or beautify my query?

What are the considerations for splitting data into separate databases, and what are the best practices for databases?

How do I invite more users to join my account?

Are primary indexes critical to Firebolt's query performance, and how should they be managed during migration?

Does Firebolt provide tools or capabilities to monitor database performance and scaling activities?

What steps are involved in switching production workloads to Firebolt once migration validation is complete?

What are Firebolt’s best practices for handling complex dashboard queries with varying granularity (e.g., daily, weekly, monthly)?

What is the recommended approach for incremental data ingestion from PostgreSQL to Firebolt via AWS S3?

Does Firebolt recommend separating ingestion and query engines, and why?

How can performance risks be mitigated when dealing with tenants of significantly different data sizes?

How should primary indexes be selected in Firebolt to optimize query performance?

When should aggregating indexes be used in Firebolt, and what are their limitations?

What are Firebolt’s best practices regarding the use of views versus pre-joined tables for aggregations?

What are the best practices for setting up Superset to connect to Firebolt for dashboarding?

Can Firebolt support a unified table for multiple reporting use cases (e.g., unique counts, injected data, and regular event data) instead of using multiple tables?

How can query performance be optimized when querying event data with minute-level granularity in Firebolt?

How can we access Firebolt 2.0 engine cost data? How can we programmatically retrieve and export this data?

How does Firebolt support handle customer access, and can we restrict it?

Is it possible to rename the organization URL (e.g., from shopware.firebolt.io to velo.firebolt.io)?

Does Firebolt have Generative AI features or an AI roadmap relevant to analytics use cases?

How should we handle user management across different Firebolt accounts?

How does connecting to the AWS Marketplace for billing work?

How often do we need to run vacuum if we do small, frequent updates—and does auto vacuum solve this?

Should we create separate Firebolt accounts for development, staging, and production, or use a single account with multiple databases?

If we use Change Data Capture (CDC) with very incremental updates, what concerns should we have about concurrency and vacuum tasks in Firebolt?

Is there a limit to how much data a single Firebolt engine can handle if I see a reference to a 1.8 TB size?

If we change the tenant ID, will the sub-result cache still be used?

How does Firebolt handle query performance as data within a single tenant expands?

How should we structure the primary indexes?

Is continuous 24/7 ingestion engine usage required, or can the engine be started and stopped as needed?

What ingestion throughput can organizations expect, and how does Firebolt handle large batch loads or full refreshes?

How can organizations estimate the appropriate Firebolt engine size and associated costs for a given query load?

Why do some queries run sub-second in Firebolt while others might take multiple seconds?

In the absence of true streaming, how can a Node.js application handle large Firebolt query results without running out of memory?

Does Firebolt still require manual vacuuming, or is there an automatic process to reclaim storage space and optimize performance? Should vacuuming be done on a dedicated engine?

How can organizations optimize queries that filter on high-cardinality timestamp columns (such as a ‘closed_at’ column)?

What is the recommended approach for handling multiple workloads (read vs. write) in Firebolt? Should separate engines be used?

Does Firebolt offer ongoing query-optimization assistance, and is there an extra cost associated with this service?

Is it better to insert data into Firebolt one row at a time or in batches for real-time workloads?

What is the recommended approach for migrating an existing Firebolt environment to a new organization or domain (for example, if a team is switching from one AWS org to another)?

Does Firebolt always require creating additional tables, or can large joins be handled directly in SQL as with other warehouses (e.g., Snowflake)?

How should large table joins be handled in Firebolt to optimize query performance?

How can Firebolt be integrated with Apache Superset for visualization?

Does Firebolt provide an execution plan for queries, similar to Athena?

How do aggregating indexes work in Firebolt, and what are their trade-offs?

What are the best practices for structuring queries in Firebolt for performance optimization?