FAQ

Find quick answers to common questions about Firebolt.

How does Firebolt’s vector search compare to dedicated vector databases?

Firebolt supports vector search but does not generate embeddings. Unlike dedicated vector databases, which specialize in unstructured data, Firebolt integrates vector search within a high-performance analytical data warehouse. This allows you to run hybrid queries (structured + unstructured) efficiently without managing separate systems.

If you already have embeddings generated from models like OpenAI, Hugging Face, or your own ML pipeline, Firebolt can store and query them at high speed and low latency, enabling AI-powered search and recommendations within your existing analytics environment.

What steps are involved in switching production workloads to Firebolt once migration validation is complete?

Switching production workloads to Firebolt typically involves updating configuration to point to Firebolt endpoints. If all validation is complete and data is already present, this process is straightforward.

Deployment & Architecture

https://firebolt.io/faqs-v2-knowledge-center/what-steps-are-involved-in-switching-production-workloads-to-firebolt-once-migration-validation-is-complete

What are Firebolt’s best practices for handling complex dashboard queries with varying granularity (e.g., daily, weekly, monthly)?

Firebolt recommends using aggregating indexes where possible for regularly queried granularities (e.g., daily or weekly), and employing pre-joined or pre-aggregated tables to simplify and speed up dashboard queries. Ensure indexes align closely with filter criteria to optimize query performance across various granularities.
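As a rough illustration of the pre-aggregation idea, the sketch below uses Python's built-in sqlite3 module as a stand-in engine (it is not Firebolt), with hypothetical table and column names. It builds a daily rollup once and then answers a monthly question from that rollup instead of from the raw events. In Firebolt, an aggregating index on the base table would typically take the place of the hand-built rollup table.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_date TEXT, tenant_id INTEGER, amount REAL);
    INSERT INTO events VALUES
        ('2024-01-01', 1, 10.0),
        ('2024-01-01', 1, 5.0),
        ('2024-01-15', 1, 20.0),
        ('2024-02-01', 1, 7.0);

    -- Pre-aggregate once at the finest granularity the dashboards need.
    CREATE TABLE daily_totals AS
    SELECT event_date, tenant_id, SUM(amount) AS total, COUNT(*) AS n
    FROM events
    GROUP BY event_date, tenant_id;
""")

# Coarser granularities roll up from the daily table, not from raw events.
monthly = conn.execute("""
    SELECT substr(event_date, 1, 7) AS month, SUM(total), SUM(n)
    FROM daily_totals
    WHERE tenant_id = 1
    GROUP BY month
    ORDER BY month
""").fetchall()

print(monthly)
```

Sums and counts roll up cleanly from daily to weekly or monthly; measures such as distinct counts or medians generally do not, so they need separate handling.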

SQL

https://firebolt.io/faqs-v2-knowledge-center/what-are-firebolts-best-practices-for-handling-complex-dashboard-queries-with-varying-granularity-e-g-daily-weekly-monthly

How can performance risks be mitigated when dealing with tenants of significantly different data sizes?

When a single tenant accounts for a large share of the data (e.g., 20-25% of all data), avoid subqueries or joins that initially select large volumes of data and subsequently discard most rows. Instead, optimize queries and table structures to filter data as early and as narrowly as possible, potentially using aggregated or pre-joined tables.

SQL

https://firebolt.io/faqs-v2-knowledge-center/how-can-performance-risks-be-mitigated-when-dealing-with-tenants-of-significantly-different-data-sizes

What are Firebolt’s best practices regarding the use of views versus pre-joined tables for aggregations?

Firebolt supports both views and pre-joined tables. However, if most of the query execution time is spent on joins rather than aggregations, pre-joining tables (i.e., creating wider, denormalized tables during data ingestion) is often more performant. Views are effective for reusable SQL but may become slower with complex joins at scale. Aggregating indexes, which can pre-materialize aggregation results for fast query responses, work best on single tables without cross-table joins.

SQL

https://firebolt.io/faqs-v2-knowledge-center/what-are-firebolts-best-practices-regarding-the-use-of-views-versus-pre-joined-tables-for-aggregations

Are primary indexes critical to Firebolt's query performance, and how should they be managed during migration?

Yes, primary indexes significantly impact query performance in Firebolt. Ensuring correct and optimized indexes is crucial, especially during migration. Indexes should be carefully reviewed and implemented based on query patterns and use cases.

SQL

https://firebolt.io/faqs-v2-knowledge-center/are-primary-indexes-critical-to-firebolts-query-performance-and-how-should-they-be-managed-during-migration

How do I invite more users to join my account?

You can add more users to your Firebolt account either through the web application or with SQL commands. First, create a login, using the email address of your invitee as the login_id. Next, associate the login with a user and assign the appropriate permissions. Your invitee will automatically receive an email invitation to join your account. For more information, visit our documentation.

Miscellaneous

https://firebolt.io/faqs-v2-knowledge-center/how-do-i-invite-more-users-to-join-my-account

What are the best practices for setting up Superset to connect to Firebolt for dashboarding?

Setting up Apache Superset with Firebolt involves:
- Installing Superset locally or on a server.
- Configuring the Firebolt connector with appropriate credentials and connection parameters.
- Testing queries in Superset to ensure Firebolt’s indexing structure is leveraged efficiently.
- Optimizing queries for dashboard performance by using Firebolt’s indexing features to minimize latency.

In this case, there were some challenges with reinstalling Superset, but Firebolt’s team is available to assist with setup and troubleshooting.

Integrations

https://firebolt.io/faqs-v2-knowledge-center/what-are-the-best-practices-for-setting-up-superset-to-connect-to-firebolt-for-dashboarding

How should primary indexes be selected in Firebolt to optimize query performance?

Primary indexes should include the most frequently used filters, such as tenant_id and date/time columns if queries consistently filter data by tenant and date ranges. A well-chosen primary index ensures queries access only relevant data partitions, maintaining fast performance even as data volumes scale significantly.
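A minimal sketch of what this could look like, assuming a hypothetical events table whose queries always filter on tenant_id and a date range. The table, columns, and literal values are illustrative only; the SQL is held in Python strings so it can be passed to cursor.execute from the Python SDK.

```python
# Hypothetical table and column names; adapt them to your own schema.
CREATE_EVENTS_SQL = """
CREATE TABLE events (
    tenant_id  INT,
    event_date DATE,
    event_type TEXT,
    amount     DOUBLE PRECISION
)
PRIMARY INDEX tenant_id, event_date;
"""

# A query whose filters line up with the leading primary index columns.
TENANT_RANGE_SQL = """
SELECT event_type, SUM(amount) AS total
FROM events
WHERE tenant_id = 42
  AND event_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY event_type;
"""

print(CREATE_EVENTS_SQL.strip())
```

The order of the index columns is a design choice: placing tenant_id first assumes that nearly every query filters on a tenant, with the date column narrowing the range further.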

SQL

https://firebolt.io/faqs-v2-knowledge-center/how-should-primary-indexes-be-selected-in-firebolt-to-optimize-query-performance

What factors significantly impact query performance when joining high-cardinality tables in Firebolt?

Query performance in high-cardinality joins is significantly impacted by data cardinality, joins that produce large intermediate row counts, and data shuffles across nodes. Firebolt users should use EXPLAIN ANALYZE to identify expensive operations such as table scans, joins, and shuffles. Reducing data volume before joins through effective indexing, semi-joins, or aggregating indexes can mitigate these impacts.

SQL

https://firebolt.io/faqs-v2-knowledge-center/what-factors-significantly-impact-query-performance-when-joining-high-cardinality-tables-in-firebolt

Are semi-joins (WHERE IN clauses) generally more performant in Firebolt than explicit joins for filtering datasets?

Yes, semi-joins (implemented via WHERE IN clauses) can be more performant than explicit joins, as Firebolt has built-in optimizations that leverage semi-joins for better data pruning. Using semi-joins helps reduce intermediate row counts earlier in query execution, especially beneficial for high-cardinality datasets.
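To make the difference in intermediate row counts concrete, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in engine, with hypothetical tables. It shows the semantics only, not Firebolt's optimizer behavior: the WHERE IN form returns each matching order once, while the explicit join multiplies rows whenever the filtering side contains duplicates.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE vip_events (customer_id INTEGER, event TEXT);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, 10), (4, 30);
    -- Customer 10 appears twice on the filtering side.
    INSERT INTO vip_events VALUES (10, 'signup'), (10, 'upgrade'), (30, 'signup');
""")

# Semi-join: each order is kept at most once.
semi_join = conn.execute("""
    SELECT order_id FROM orders
    WHERE customer_id IN (SELECT customer_id FROM vip_events)
    ORDER BY order_id
""").fetchall()

# Explicit join: one output row per matching pair.
explicit_join = conn.execute("""
    SELECT o.order_id FROM orders AS o
    JOIN vip_events AS v ON v.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

print(len(semi_join), len(explicit_join))
```

With the explicit join, a DISTINCT or GROUP BY would be needed afterwards to get back to one row per order, which is extra work that the semi-join avoids.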

SQL

https://firebolt.io/faqs-v2-knowledge-center/are-semi-joins-where-in-clauses-generally-more-performant-in-firebolt-than-explicit-joins-for-filtering-datasets

How do I query my S3 buckets using IAM roles?

First, in your AWS account, configure the permission policy found in the help center article, https://docs.firebolt.io/Guides/loading-data/configuring-aws-role-to-access-amazon-s3.html#use-aws-iam-roles-to-access-amazon-s3. Keeping your AWS Identity and Access Management (IAM) console open, start the process to upload data through the plus sign icon in the Develop space. After selecting an ingestion engine, you can select 'IAM Role' as your authentication method and create an IAM role in the application. Copy the trust policy shown there and follow the rest of the instructions in the article to apply it to your AWS account. Note that you don't actually have to upload anything to create the IAM role.

SQL

https://firebolt.io/faqs-v2-knowledge-center/how-do-i-query-my-s3-buckets-using-iam-roles

What is the difference between CPU time and thread time in Firebolt's query profile analysis, and why is it important?

In Firebolt's query profiling, CPU time refers to the actual processing time on CPU cores, while thread time represents the total wall-clock time across all threads and nodes. When thread time is significantly higher than CPU time, it typically indicates waits due to data loading from storage (like S3) or node concurrency constraints. This distinction helps diagnose bottlenecks related to IO-bound or compute-bound workloads.
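One way to read the two numbers together is to compute the share of thread time that was not spent on the CPU. The helper below is a hypothetical sketch, not part of Firebolt or its SDK: the function names and the 0.5 threshold are assumptions chosen for illustration, and the inputs are values you would copy from a query profile by hand.

```python
def wait_fraction(cpu_time_ms: float, thread_time_ms: float) -> float:
    """Share of thread time not spent on CPU (0.0 if thread time is zero)."""
    if thread_time_ms <= 0:
        return 0.0
    return max(0.0, 1.0 - cpu_time_ms / thread_time_ms)


def classify(cpu_time_ms: float, thread_time_ms: float,
             threshold: float = 0.5) -> str:
    """Rough label; the threshold is an arbitrary illustrative cut-off."""
    if wait_fraction(cpu_time_ms, thread_time_ms) >= threshold:
        return "mostly waiting (IO or concurrency)"
    return "mostly computing"


# Example: 200 ms of CPU time inside 1000 ms of thread time.
print(classify(200, 1000))
```

A high wait fraction points toward storage reads or contention, while a low one suggests the operator is compute-bound and may benefit more from query or index changes.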

Low Latency

https://firebolt.io/faqs-v2-knowledge-center/what-is-the-difference-between-cpu-time-and-thread-time-in-firebolts-query-profile-analysis-and-why-is-it-important

What's the best practice for setting up a Firebolt engine to support high concurrency?

For high concurrency, use multiple clusters within your engine. Clusters help handle more simultaneous queries by distributing the load. Keep in mind that cache is shared across nodes in a cluster, but not between clusters, so the right balance depends on your workload. You can also consider using auto-scaling to dynamically adjust resources based on demand.

SQL

https://firebolt.io/faqs-v2-knowledge-center/whats-the-best-practice-for-setting-up-a-firebolt-engine-to-support-high-concurrency

How can I learn if there is an issue causing interruption to Firebolt services or applications?

Firebolt proactively maintains a status page at https://firebolt.statuspage.io/ where we post any active incidents that may interrupt your access or services. From this page, you can also click the 'subscribe' button to stay informed by phone, RSS, email, or Slack.

Miscellaneous

https://firebolt.io/faqs-v2-knowledge-center/how-can-i-learn-if-there-is-an-issue-causing-interruption-to-firebolt-services-or-applications

What's the easiest way to label a query when running it from the Python SDK?

You can label a query by setting the query_label system setting before running it:

cursor.execute("set query_label = '<label>';")
cursor.execute("your_query_here")

Here’s a full example using the Firebolt Python SDK:

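from firebolt.client.auth import ClientCredentials
from firebolt.db import connect

# Service account credentials (placeholders)
client_id = '****'
client_secret = '****'

connection = connect(
    database="<database_name>",
    account_name="<account_name>",
    auth=ClientCredentials(client_id, client_secret)
)

cursor = connection.cursor()
cursor.execute("start engine <engine_name>")
cursor.execute("use engine <engine_name>")
cursor.execute("use database <database_name>")
# Set the label before running the query you want to tag
cursor.execute("set query_label = '123';")
cursor.execute("select 1;")

print(cursor.fetchone())
connection.close()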
SQL

https://firebolt.io/faqs-v2-knowledge-center/whats-the-easiest-way-to-label-a-query-when-running-it-from-the-python-sdk

Why can't I connect to my database with upper-case letters?

If you created your database with upper-case letters in its name but without quotation marks, the saved name of your database will be in all lowercase letters. Confirm the name of your database from the explorer, information_schema.catalogs, or show catalogs. From other systems, such as an SDK, always use the 'official' name of your database. Within the application, you can still access your all-lowercase database name using upper-case letters, without quotation marks, since the name is converted to lowercase behind the scenes. If you want your object names to be case sensitive, always wrap definitions in double quotes. Please note that definitions in information_schema are constructed and will not match exactly what was executed on creation, including use of quotes.
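The folding rule described above can be sketched in a few lines. The function below is a hypothetical illustration of the rule, not Firebolt code: unquoted names are lowercased and double-quoted names keep their case. It ignores edge cases such as escaped quotes inside a name.

```python
def resolved_name(identifier: str) -> str:
    """Return the name that would be stored for an identifier as typed."""
    quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if quoted:
        # Double-quoted identifiers keep their exact case.
        return identifier[1:-1]
    # Unquoted identifiers are folded to lowercase.
    return identifier.lower()


print(resolved_name('MyDatabase'))    # typed without quotes
print(resolved_name('"MyDatabase"'))  # typed with double quotes
```

When connecting from an SDK, the value to pass as the database name corresponds to what this function returns for the identifier used at creation time.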

Integrations

https://firebolt.io/faqs-v2-knowledge-center/why-cant-i-connect-to-my-database-with-upper-case-letters

How do I connect my Firebolt database to Tableau?

Firebolt is available as a connector directly from within Tableau. At this time, selecting the Firebolt connector from within Tableau installs a Firebolt V1 integration. To connect to V2, you will need to download the new connector and place it in the appropriate directories locally or on your server. You will also need a version of JDBC compatible with Tableau. Full instructions can be found at https://docs.firebolt.io/Guides/integrations/tableau.html#integrate-with-tableau.

Integrations

https://firebolt.io/faqs-v2-knowledge-center/how-do-i-connect-my-firebolt-database-to-tableau

Can I manipulate or filter S3 data when using COPY FROM to create or insert data into a table?

At this time, COPY FROM does not support direct manipulation of S3 bucket data. However, starting with 4.18, you can filter and alter data at read time by using the READ table-valued functions, which support full glob patterns (https://en.wikipedia.org/wiki/Glob_(programming)):
- Insert into an existing table using INSERT INTO + READ_PARQUET or READ_CSV
- Create a new table with CREATE TABLE AS + READ_PARQUET or READ_CSV

SQL

https://firebolt.io/faqs-v2-knowledge-center/can-i-manipulate-or-filter-s3-data-when-using-copy-from-to-create-or-insert-data-into-a-table

Are you all able to see query history further back than the last engine restart?

Yes, we can view query history prior to the last engine restart. The support team is able to retrieve the query history for the customer if they are able to provide the type of query it was (e.g., SELECT, INSERT, etc.), the approximate time it was executed, and which engine they used to execute it.

Engines

https://firebolt.io/faqs-v2-knowledge-center/are-you-all-able-to-see-query-history-further-back-than-the-last-engine-restart

Is there a way to set the tmp directory in the python sdk when the user is running in lambda and the default directory is read-only?

Disable use_token_cache so that the SDK does not try to write to home/sbx_user.
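A minimal sketch of how this might look with the Firebolt Python SDK, assuming the ClientCredentials class accepts a use_token_cache keyword argument. The helper name lambda_safe_auth is hypothetical, and the import is done inside the function only so that the snippet loads in environments where firebolt-sdk is not installed.

```python
def lambda_safe_auth(client_id: str, client_secret: str):
    """Build an auth object that does not cache tokens on disk."""
    # Imported lazily; requires the firebolt-sdk package at call time.
    from firebolt.client.auth import ClientCredentials

    # use_token_cache=False keeps the SDK from writing a token cache file.
    return ClientCredentials(client_id, client_secret, use_token_cache=False)
```

The returned object can then be passed as the auth argument to connect, in the same way as in the query label example.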

Integrations

https://firebolt.io/faqs-v2-knowledge-center/is-there-a-way-to-set-the-tmp-directory-in-the-python-sdk-when-the-user-is-running-in-lambda-and-the-default-directory-is-read-only
