This is a special episode of The Data Engineering Show revisiting the best bits from three different fascinating episodes
Explore Firebolt's cost efficiency with real-world data benchmarks highlighting low latency and high concurrency.
Andy Pavlo, Associate Professor at Carnegie Mellon University, delves into database internals and optimization.
Learn what the data management lifecycle looks like in Firebolt
Amplitude's cutting-edge data stack and how it processes 5 Trillion real-time events while dealing with mutable data
In this blog, we focus on distributed query execution as an integral part of Firebolt.
What does a tech stack that always needs to be at the forefront of technology look like?
Scaling a data platform to support 1.5T events per day requires complicated technical migrations
How does Substack's data platform support 500K paying subscribers?
AppsFlyer deals not only with 120 billion events per day, but does so while growing quickly as a company
How does Vimeo handle Data Ops to deal with massive scale?
Steven Moy thoroughly explains Yelp’s data architecture under the hood and how it evolved over the past ten years.
Canva is one of the hottest, if not the hottest, graphic design platforms out there.
Learn the top 10 tips for improving your cloud data warehouse performance.
Learn how to upgrade from Tableau extracts to a Tableau live connection to deliver sub-second performance every time.
Gong manages hundreds of thousands of videoconferences and millions of emails PER DAY, which add up to hundreds of TBs.
It’s the mother of all development projects. You use it daily. And so do 65M developers around the world.
Ananth Packkildurai is Principal Software Engineer at Zendesk and runs one of the strongest newsletters in data
Bolt engineers are in the midst of designing a new next-gen data platform
Upstart cloud data warehouse sees rapid growth in 2021, plans to double its workforce
How to accelerate Looker performance on Redshift, Snowflake and BigQuery: short-term fixes and long-term solutions.
Many programming languages are imperative – tell the compiler how to operate by providing the instructions in order.
The data warehousing market has gone absolutely mad over performance. Why is this the case?
Why would you create ugly data? According to Jens Larsson, don’t even go near raw data.
Everything you needed to know about cloud data warehouses but were afraid to ask...
Amazon Athena engine version 2 - what’s new and big enough to call this a 2.0 release?
Explore the significant differences between ELT and ETL data integration processes and find the best option for you.
A detailed comparison of Snowflake vs. Redshift, by architecture, scalability, performance, use cases and cost.
Why even simple queries can be slow in cloud data warehouses, and how Firebolt uses indexing to prune data and stay fast.
There are so many data warehouses out there, who the hell needs another one? Three main things that make Firebolt unique.
Working with semi-structured data can be more like a Jason (horror movie) Sequel than JSON SQL.
The funding included participation from Zeev Ventures, TLV Partners, Bessemer Venture Partners and Angular Ventures.
"In the beginning, there was a data mess". Don’t Panic, just read our data hitchhiker’s guide to cloud analytics.
Indexes are the primary way for users to accelerate query performance in Firebolt. Learn about them here.
How to choose the best analytics engine for each type of analytics.
Demand from engineering teams has skyrocketed since Firebolt emerged from stealth last year
Learn when to use Postgres, MySQL, in-memory databases, HTAP, or data warehouses to meet the 1 sec SLA in analytics.
How companies should avoid creating a slow, many-headed federated Gorgon out of Athena.
Learn some simple rules of thumb you can use to choose the best federated query engine for your company's needs.
More and more, people are asking me “how do you compare Snowflake and Databricks?” We did our best to answer.
Let us guide you through the process of identifying the performance bottlenecks in your query in just 5 simple steps.
We often get asked “what’s the difference between Firebolt and Snowflake?” and it reminds me of Frozen.
When do you need to shift from Redshift, and what are the alternatives? Learn here.
Making sense of data lakes, delta lake, lakehouse, data warehouse and more.
If you’re using Amazon Athena, you may have seen these errors. About AWS Athena errors and how to deal with them.
Should data engineering AND BI be handled by the same people?
Choosing the right data warehouse and analytics infrastructure on Amazon Web Services (AWS) can be confusing.
The technological concepts that make Snowflake so unique, and why it has proven to be so disruptive for the data space.
A checklist of criteria to help you determine which factors are most important for the success of your organization
How to Set Up Your Data Analytics Stack with Kafka, Hevo, and Firebolt.
There has been a lot of talk recently about Data Apps. Here is what Firebolt thinks about data apps.
Klarna is one of the leading fintech companies in the world, valued at $45B.
How to enable sub-second analysis across billions of rows of customer behavior data: Part I - Setting up the load
An episode about Eventbrite’s data stack modernization process, and how you get engineers to adopt new technologies
How the data platform evolved as Slack grew from a startup to an IPOed and then acquired company.
One of the ways Firebolt is able to support data-driven applications is by leveraging aggregating indexes on the tables.
Are you spending more than you planned on your Data Warehouse? Analyze more. Use less compute resources.
Firebolt provides an alternative to Druid, delivering fast response times, high concurrency and the convenience of a Saa
Sudeep Kumar, Principal Engineer at Salesforce, considers the shift to ClickHouse as one of his biggest accomplishments
According to Yoav Shmaria, VP R&D Platform at Similarweb, the best way to manage data warehouse costs is tagging
Max walks the Bros through his recipe for a smart data-driven company, and the genesis of Airflow, Superset & Presto.
In this post, we look at factors to consider when building a data warehouse.
80% of the code that you write doesn’t work on the first try. But knowing which 80% is not working is the real challenge
Is Postgres truly the right engine for analytics?
How to ingest, store and query JSON data, for example, is a consistent question on the minds of customers.
How to support ad hoc analysis - Part 2: The right ad hoc analytics architecture
In our recent ‘Big Data Analytics for Gaming Workshop’ we let the audience do the talking. Here’s a summary of the talk.
"When I see David Jayatillake and Tristan Handy comment on Firebolt's approach it is clear that Firebolt is on track."
Data Mesh is hot stuff. But from a technology perspective it’s still not very well defined.
How to support ad hoc analysis: Part 1 - The 4 requirements for an ad hoc analytics architecture
Event streams have always been problematic to analyze in SQL. This is how we do it.
Data apps are applications that rely heavily on data and are easy to use.
In this blog we will explore the data using Streamlit, Jupyter and the Firebolt Python SDK.
Writing a data app, using Streamlit and Jupyter and the Firebolt Python SDK. A multi-series blog.
Writing a small data app using the Firebolt JDBC driver.
Looking at GithubArchive dataset of public events - leveraging Apache Airflow workflows for keeping our data up-to-date.
AWS re:Invent 2022 was all about building anticipation and delivering on the expectations of us technologists.
Barr Moses explains how to make sure your data is accurate in a world where so many different teams are accessing it
At Firebolt, we found out that a duet of dbt and Paradime works for our needs.
In a recent workshop, 25 data pros working in the Ad Tech industry discussed querying large data sets efficiently
How ZipRecruiter and Yotpo build resilient self-service products that keep customers happy and engineers calm
How good you are at Spark or Flink ≠ how good you are at data engineering. Zach Wilson explains.
This guide will provide you with the fundamental knowledge necessary to handle semi-structured data effectively.
dbt data quality - Implementing data quality tests and using dbt extensions for enhanced data quality checks.
As people in the data industry go, Bill Inmon is among the top, often seen as the godfather of the data warehouse.
When it comes to data management, have we come a long way since the early 2000s?
Meenal Iyer, VP Data at Momentive.ai, talks about enforcing collaboration in large organizations
IQVIA deep dive into maximizing impact of BI solutions for faster and more informed decision-making in healthcare.
Joe Reis and Matt Housley joined the bros for some much-needed ranting, priceless data advice, and good laughs.
"If you cannot constrain a thing, you cannot ingest that thing."
"There's no point in measuring anything, if the data team can't measure itself."
This has nothing to do with the DW itself. But if you miss it, you'll fail with your warehouse project.
Vin Vashishta, the guy we all love to follow, has never seen a dashboard with positive ROI.
Rob says: delete nothing, update only metadata.
I'm not a fan of dimensional modeling. It exists to solve physical problems, not logical problems.
"Do data architects exist anymore?" Wow, as a recovering data architect that's a loaded question.
An issue many people coming into the data warehouse world have difficulty with is managing time variance efficiently and at scale.
One of the more common and costly mistakes in many data implementations is confusion about keys.
Every data team should have at least one data engineer with a software engineering background.
Megan Lieu on her approach to data advocacy, as well as the power of notebooks