Databricks 2026 — Lakeflow, Genie, Mosaic

Databricks has been on a feature-shipping streak that’s hard to keep up with. Three recent additions — Lakeflow (declarative pipelines), AI/BI Genie (natural-language analytics), and Mosaic AI (model training and serving) — change the platform’s shape meaningfully. They’re worth understanding even if you don’t deploy Databricks, because they reshape the competitive landscape across data engineering, BI, and MLOps.

We’ve shipped Databricks for client work — banking ML pipelines, hospital data platforms with multi-modal needs. Here’s an honest take on what’s real, what’s marketing, and what to adopt.

Lakeflow: declarative pipelines that mostly replace DLT

What it is: Lakeflow is Databricks’ next-generation pipeline framework, evolving from Delta Live Tables (DLT). You define data flows as declarative SQL or Python; Databricks handles orchestration, retry, lineage, and incremental processing.

What’s actually new vs DLT: Lakeflow extends DLT’s declarative model with:

Cleaner ingestion connectors (CDC from Postgres, MySQL, Salesforce, etc. as first-class sources)
Better support for streaming + batch in the same pipeline
Tighter integration with Unity Catalog for lineage and governance
Improved orchestration UI
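
To make the declarative idea concrete, here is a minimal pure-Python sketch. It is not the Lakeflow API: the `table` decorator, `run_pipeline`, and the table names are hypothetical stand-ins we made up for illustration. Each function declares a table and names its upstream tables through its parameters; a tiny runner works out the execution order, which is roughly the part the platform takes over in the real product.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry: table name -> function that builds it.
_TABLES = {}

def table(fn):
    """Register a function as a table definition (toy stand-in, not a Lakeflow API)."""
    _TABLES[fn.__name__] = fn
    return fn

@table
def orders_bronze():
    # Raw ingested rows, including one record with a missing amount.
    return [
        {"id": 1, "region": "EMEA", "amount": "120.50"},
        {"id": 2, "region": "APAC", "amount": "80.00"},
        {"id": 3, "region": "EMEA", "amount": None},
    ]

@table
def orders_silver(orders_bronze):
    # Clean and type the raw rows; drop records with no amount.
    return [
        {**row, "amount": float(row["amount"])}
        for row in orders_bronze
        if row["amount"] is not None
    ]

@table
def revenue_by_region(orders_silver):
    # Aggregate cleaned rows into a published table.
    totals = {}
    for row in orders_silver:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

def run_pipeline():
    """Derive dependencies from parameter names and build tables in dependency order."""
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in _TABLES.items()
    }
    results = {}
    for name in TopologicalSorter(deps).static_order():
        kwargs = {dep: results[dep] for dep in deps[name]}
        results[name] = _TABLES[name](**kwargs)
    return results

results = run_pipeline()
```

Nothing in the table definitions says when or in what order to run. That is the property that makes the declarative shape attractive for “ingest → transform → publish” work; retries, lineage, and incremental processing are left out of this toy entirely.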

What’s worth adopting: For teams already on Databricks doing serious data engineering, Lakeflow is the right path forward. The declarative shape is cleaner than equivalent Airflow/Dagster setups for the specific job of “ingest → transform → publish.”

Where it competes with dbt: dbt remains the SQL transformation standard (see our dbt advanced patterns piece). Lakeflow’s transformation capabilities overlap with dbt’s, but dbt-on-Databricks is also well supported, so the choice isn’t strictly either/or. Many teams use dbt for warehouse transformations and Lakeflow (or Airflow/Dagster; see our orchestrator comparison) for ingestion and broader orchestration.

Where it competes with Airflow/Dagster: Lakeflow handles pipelines that live inside Databricks well. For pipelines that span multiple systems (Databricks + external APIs + non-Databricks compute), a generic orchestrator still wins.

Verdict: Worth adopting for net-new Databricks pipelines. Existing DLT pipelines should migrate on Databricks’ suggested timeline. Don’t migrate existing Airflow/Dagster setups to Lakeflow for its own sake; do it only if the workload is genuinely Databricks-centric.

AI/BI Genie: natural-language analytics that’s almost ready

What it is: AI/BI Genie lets business users ask questions in natural language and get charts and tables generated from curated Databricks data. Ask “What was last month’s revenue by region?” and Genie generates the SQL, runs it, and returns the visualization.

What’s actually new: The interesting part isn’t natural-language-to-SQL (everyone’s doing that). It’s the curation layer — Genie spaces let data teams pre-define which tables, joins, and metrics are exposed, with descriptions and example queries. The LLM operates inside those guardrails rather than trying to figure out the whole warehouse from scratch.
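
The guardrail idea can be sketched in a few lines. This is a toy illustration, not Genie’s implementation or API: `CuratedSpace`, `tables_referenced`, and the table names are hypothetical, and the regex-based table extraction is a deliberate simplification of real SQL parsing. The point is only that generated SQL gets checked against an allow-list the data team curated.

```python
import re
from dataclasses import dataclass, field

def tables_referenced(sql: str) -> set:
    """Naive extraction of table names after FROM/JOIN (simplified; not a SQL parser)."""
    pattern = r"\b(?:from|join)\s+([a-zA-Z_][\w.]*)"
    return {name.lower() for name in re.findall(pattern, sql, flags=re.IGNORECASE)}

@dataclass
class CuratedSpace:
    """Hypothetical stand-in for a curated space: allowed tables plus descriptions."""
    tables: dict = field(default_factory=dict)  # table name -> description

    def allows(self, sql: str) -> bool:
        # Accept the query only if it references at least one table
        # and every referenced table was curated by the data team.
        referenced = tables_referenced(sql)
        return bool(referenced) and referenced <= set(self.tables)

space = CuratedSpace(tables={
    "sales.orders": "One row per order, with region and amount",
    "sales.regions": "Region lookup",
})

ok_sql = (
    "SELECT r.name, SUM(o.amount) FROM sales.orders o "
    "JOIN sales.regions r ON o.region_id = r.id GROUP BY r.name"
)
bad_sql = "SELECT * FROM hr.salaries"

ok_allowed = space.allows(ok_sql)
bad_allowed = space.allows(bad_sql)
```

Here the first query stays inside the curated tables and passes, while the second reaches for a table nobody exposed and is rejected. The descriptions in the `tables` mapping stand in for the context a data team would write for the model.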

Where it works: Well-defined business domains (sales pipeline, marketing attribution, financial reporting) where the data team has invested in semantic modeling. In those domains Genie answers non-technical users’ questions with reasonable accuracy.

Where it disappoints: Anywhere the underlying data model is messy. Genie inherits the cleanliness of your warehouse. Garbage data + Genie = garbage answers, presented confidently.

Where it competes: With Looker (the closest enterprise BI equivalent), with Mode/Hex (analyst-focused BI with notebooks), and with the broader emerging category of “LLM analytics” tools (Sturdy, Continuum, etc.).

Verdict: Adopt for well-modeled domains with non-technical users who currently ask data teams for one-off questions. Don’t adopt as a “let users query the warehouse directly” tool — that’s where it embarrasses itself and you.

Mosaic AI: serving + training that’s a real platform now

What it is: Mosaic AI is Databricks’ integrated ML platform — model serving, agent framework, vector search, model registry, evaluation, RAG infrastructure. Built on the foundation from the MosaicML acquisition plus Databricks’ own ML investments.

What’s notable:

Model serving competes with SageMaker, Vertex AI, and dedicated serving stacks (KServe, BentoML, Seldon; see our model serving comparison). Mosaic’s pitch is unified billing and tight integration with Unity Catalog and Lakeflow.
Vector search is built into the platform — no separate Pinecone/Weaviate integration needed for RAG.
Agent framework + evaluation gives you the LangChain/LangGraph equivalent native to Databricks.
Foundation model APIs (Llama, DBRX, Mixtral) hosted inside Databricks.

Where it works: Teams already deep in Databricks who want one platform for ML. Reduces integration surface. Unity Catalog gives you data governance across training, serving, and feature stores.

Where it disappoints: For teams not already on Databricks, adopting Mosaic AI just for the ML platform is a big commitment. The integrated story only pays off if you’re using Databricks’ data side already.

Where it competes: With the cloud ML platforms (SageMaker, Vertex AI), with Anyscale Ray, and with the modular serving stacks (KServe, BentoML, Seldon).

Verdict: Adopt if you’re already on Databricks for data. Don’t migrate to Databricks just for Mosaic AI — pick the right ML stack for your existing infrastructure.

What we deploy by default on Databricks

For client work where Databricks is the data platform of choice:

Lakeflow for ingest pipelines + Bronze/Silver layer transformations
dbt-on-Databricks for Gold-layer business transformations (we like dbt’s SQL discipline)
Unity Catalog for governance from day one — non-negotiable
AI/BI Genie for executive/business-user analytics on well-modeled domains
Mosaic AI Model Serving for ML model deployment when the data is already in Databricks
Mosaic Vector Search for RAG over Databricks-resident knowledge bases

We don’t usually run Databricks as the analytics serving layer for user-facing dashboards (we prefer ClickHouse for that). Databricks is the heavy-lifting platform; ClickHouse is the latency-critical serving layer.

When Databricks is the right call

For the broader Databricks-vs-Snowflake-vs-BigQuery question, see our warehouse comparison. In short:

Databricks wins when ML is a meaningful part of the workload, when you need notebooks and SQL in one platform, when Spark/Photon performance at PB scale matters, or when you want a lakehouse with Unity Catalog governance
Snowflake wins when the workload is pure SQL analytics with high concurrency
BigQuery wins when you’re already on GCP

The 2026 Databricks additions (Lakeflow, Genie, Mosaic) widen the gap for the workloads Databricks already won — ML-heavy, multi-modal, governance-focused. They don’t fundamentally change Snowflake’s or BigQuery’s competitive position for the workloads those win.

What Databricks still doesn’t solve

A few things even with Mosaic + Lakeflow + Genie:

Sub-second user-facing analytics: Databricks SQL has gotten faster, but for “page loads with live data” workloads, ClickHouse / SingleStore still win.
Truly heterogeneous workflows: Pipelines that span Databricks + AWS Lambda + an external API + a third-party service are still better orchestrated by Airflow/Dagster than by Lakeflow.
Cost predictability for variable workloads: Databricks compute is billed per DBU on top of the underlying cloud bill. Genuinely spiky workloads can produce surprising monthly variance. Reserved capacity helps but doesn’t eliminate it.
Small-team simplicity: Databricks is the right platform for teams that have data engineering capacity. Tiny teams (2-3 engineers) are usually better served by simpler stacks (Postgres + dbt + Looker, or Snowflake + dbt + a BI tool).
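
The variance in the cost-predictability point above is easy to see with back-of-envelope arithmetic. The figures below are hypothetical placeholders, not Databricks or cloud list prices; substitute your own contract rates. The sketch assumes the bill has two parts, a DBU charge and the underlying cloud VM charge, both scaling with node hours.

```python
from dataclasses import dataclass

@dataclass
class Rates:
    """All values are hypothetical placeholders, not real list prices."""
    dbu_per_node_hour: float = 2.0      # DBUs consumed per node per hour
    usd_per_dbu: float = 0.50           # price per DBU
    vm_usd_per_node_hour: float = 1.00  # cloud VM price per node per hour

def monthly_cost(node_hours: float, rates: Rates) -> float:
    """Total bill = DBU charge + cloud VM charge for the same node hours."""
    dbu_charge = node_hours * rates.dbu_per_node_hour * rates.usd_per_dbu
    vm_charge = node_hours * rates.vm_usd_per_node_hour
    return dbu_charge + vm_charge

rates = Rates()
steady_month = monthly_cost(node_hours=4_000, rates=rates)
spiky_month = monthly_cost(node_hours=10_000, rates=rates)
swing = spiky_month / steady_month
```

With these placeholder rates, a month with 2.5 times the node hours produces 2.5 times the bill on both halves of the invoice at once. A commitment sized to the steady month would cover only part of the spiky one.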

The pattern of patterns

Databricks’ 2026 additions consolidate it into the lakehouse-plus-ML-plus-BI platform it’s been positioning toward for years. For organizations that fit the lakehouse model and want one platform for data engineering + ML + governance, the platform is genuinely strong.

For organizations whose needs are narrower (pure analytics, pure ML, simple data warehousing), a more focused tool is usually still better. Databricks rewards commitment and punishes half-measures.

The teams getting the most from Databricks aren’t the ones who adopted every new feature. They’re the ones who picked the parts that fit their workload (Lakeflow for pipelines, dbt for SQL transforms, Mosaic for ML serving, Genie for business analytics) and ignored the rest.

Databricks is a platform commitment, not a feature checklist. If you’re evaluating whether the 2026 additions justify deeper investment, our data engineering team has shipped Databricks in production. Tell us about the workload.
