Databricks: the lakehouse that became an AI platform
A neutral, evidence-first reading of the data + AI company now at a $5.4B revenue run-rate and a $134B private valuation — assembled from company disclosures, primary filings and independent analysts so you can reach your own conclusion.
In a little over a decade, seven Berkeley researchers who built Apache Spark turned an open-source engine into the company that coined the term 'lakehouse' and then, on the back of generative AI, into the most valuable private enterprise-software company in the world at $134 billion.
The genuinely open question is not whether Databricks is impressive: a $5.4B run-rate growing >65% with positive cash flow speaks for itself[4]. It is whether a consumption business with compressing margins can sustain a software-multiple valuation while open formats commoditize its moat, hyperscalers bundle against it, and the AI cycle, which its own CEO calls a bubble, cools. The evidence cuts both ways on every question below. This study lays out both cases; the verdict is yours.
The decisive questions
Each links to the section that lays out the evidence on both sides.
The bull case rests on a rare combination: growth accelerating from ~50% to >65% past a $5.4B run-rate, >140% net retention, and positive cash flow. The bear case: that headline is an annualized run-rate, not audited revenue; margins are compressing; and the $134B valuation mark is over 2× Snowflake's.
Databricks built its lead on Spark and Delta Lake. But critics argue that Apache Iceberg has won new adoptions, that Delta is 'really a Databricks product,' and that the battle has moved up to the catalog layer. As a hedge, Databricks bought the company founded by Iceberg's creators.
Databricks competes head-to-head with Snowflake while running on AWS, Azure and Google — its suppliers and rivals at once. Microsoft Fabric (21,000+ customers) bundles a competing stack through the same Azure that resells Databricks.
AI products are at a $1.4B run-rate and account for most of the growth story. Yet CEO Ali Ghodsi himself warns of a broad AI bubble and 'circular' financing, and AI inference is the very thing pressuring Databricks' own gross margin.
The climb that frames the debate
Company-disclosed revenue run-rates (US$B; annualized, not audited GAAP). The recent slope — $1.6B for FY2024 to a $5.4B run-rate two years later — is what both the bull and bear cases argue over.
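For readers who want to check the slope themselves, here is a rough back-of-envelope sketch using only the figures quoted on this page ($1.6B for FY2024, a $5.4B run-rate two years later, a $134B valuation). It assumes smooth compounding over exactly two years and treats a full-year revenue figure and an annualized run-rate as comparable, which they are not strictly, so the results are illustrative only. The function and constant names are hypothetical.

```python
def implied_annual_growth(start: float, end: float, years: float) -> float:
    """Compound annual growth rate implied by two endpoints."""
    return (end / start) ** (1.0 / years) - 1.0


def valuation_multiple(valuation: float, run_rate: float) -> float:
    """Valuation expressed as a multiple of annualized run-rate."""
    return valuation / run_rate


# Figures as quoted on this page (US$B). The first is full-year revenue,
# the second an annualized run-rate, so the comparison is approximate.
FY2024_REVENUE_B = 1.6
CURRENT_RUN_RATE_B = 5.4
VALUATION_B = 134.0
YEARS_BETWEEN = 2.0  # assumption: "two years later" taken as exactly 2.0

growth = implied_annual_growth(FY2024_REVENUE_B, CURRENT_RUN_RATE_B, YEARS_BETWEEN)
multiple = valuation_multiple(VALUATION_B, CURRENT_RUN_RATE_B)

print(f"Implied annual growth: {growth:.1%}")    # Implied annual growth: 83.7%
print(f"Valuation / run-rate:  {multiple:.1f}x")  # Valuation / run-rate:  24.8x
```

The implied rate sits above the >65% year-over-year figure quoted elsewhere on this page, partly because an annualized run-rate runs ahead of trailing full-year revenue in a growing business. That gap is one reason the run-rate versus audited-revenue distinction matters to the bear case.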
How to read this
Eight sections, each built the same way: a neutral synthesis, framework visuals, a two-sided case-for / case-against ledger, dated quotes, and the sources used. Start with the question that interests you, or read in order from Overview & Timeline.