Spending Doubles, Shipping Stalls: The 2026 Enterprise AI Execution Gap
Enterprises plan to double AI spend in 2026 while execution lags. The bottleneck isn't models — it's the data engineering nobody budgeted for.
Two numbers define enterprise AI in 2026, and they don’t fit together cleanly.
BCG’s AI Radar 2026 (bcg.com) finds the average share of organizational revenue going to AI has more than doubled — from roughly 0.8% in 2025 to a projected 1.7% in 2026 — and that 94% of CEOs intend to keep investing even if returns take time to land. Deloitte’s State of AI 2026 (reported by AIwire) finds the opposite mood on the ground: execution is falling behind adoption. Budgets are climbing a steep curve; shipped systems are not climbing with them.
The gap between those two numbers is the most important thing happening in enterprise AI right now. And the reason for it is unglamorous.
The bottleneck is not the model
Here is the house thesis, stated plainly: AI implementation is mostly data engineering with a model on top.
The doubling of spend is going somewhere. Some of it goes to compute — US data center power demand could reach 35 to 45 GW by 2030, roughly double 2024 levels (Data Center Knowledge), which tells you how much raw capacity is being poured into this. But compute and model access were never the binding constraint for most enterprises. You can rent a frontier model in an afternoon. What you cannot rent is clean, governed, queryable access to your own operational data.
That is where the budget quietly disappears and the timeline quietly slips. The demo works because someone hand-fed it a tidy slice of data. The production system stalls because the rest of the data is scattered across a legacy ERP, three spreadsheets, a vendor API with a 2 a.m. batch window, and a column whose meaning changed in 2023 without anyone documenting it.

The 94% of CEOs holding the line through a slow payoff are right to — but the payoff is gated by plumbing, not by model selection. The organizations falling behind on execution are not falling behind because they picked the wrong model. They are behind because nobody budgeted for the eighteen months of data engineering that has to happen before the model has anything reliable to stand on.
What the unglamorous work actually is
When we say “data engineering,” we mean a specific, boring, indispensable set of things:
Pipelines that move and shape the data
The data has to land somewhere fast and queryable, on a schedule, with backfills and idempotency, and the pipeline has to survive the day the upstream schema changes without warning. Our default operational stack is ClickHouse for the analytical store, Airflow for orchestration, and dbt for transformations. It is not the only valid stack, but it is fast, observable, and it does not trap your data the way a monolithic legacy ERP does. The point is that the model reads from a layer you control and can reason about.
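To make "backfills and idempotency" concrete, here is a minimal sketch in plain Python. It is not ClickHouse, Airflow, or dbt code: the in-memory `store` dict stands in for the analytical store, and `EXPECTED_COLUMNS`, `load_partition`, and `backfill` are hypothetical names chosen for illustration. The idea it shows is that each load overwrites a whole partition rather than appending to it, and refuses to load rows whose columns have drifted.

```python
from datetime import date

# Hypothetical contract for an upstream feed; the column names are illustrative.
EXPECTED_COLUMNS = {"order_id", "amount", "updated_at"}


class SchemaDriftError(Exception):
    """Raised when upstream rows no longer match the expected columns."""


def check_schema(rows):
    """Fail loudly if any row has missing or unexpected columns."""
    for row in rows:
        columns = set(row)
        if columns != EXPECTED_COLUMNS:
            missing = sorted(EXPECTED_COLUMNS - columns)
            extra = sorted(columns - EXPECTED_COLUMNS)
            raise SchemaDriftError(f"missing={missing} extra={extra}")


def load_partition(store, day, rows):
    """Replace one day's partition wholesale, so a rerun cannot duplicate rows."""
    rows = list(rows)
    check_schema(rows)
    store[day] = rows  # overwrite, never append


def backfill(store, days, fetch):
    """Reload a range of days; safe to repeat because each load overwrites."""
    for day in days:
        load_partition(store, day, fetch(day))


# Usage: loading the same day twice leaves one copy of the data, not two.
store = {}
rows = [{"order_id": 1, "amount": 10.0, "updated_at": "2026-01-05"}]
load_partition(store, date(2026, 1, 5), rows)
load_partition(store, date(2026, 1, 5), rows)
```

In a real warehouse the same effect usually comes from replacing a partition or merging on a key. The property that matters is that running a job twice gives the same result as running it once.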
Evals, because “it looked good in the demo” is not a status
A model in production needs a standing test harness, not a launch-day vibe check. Evals catch the regression the day a prompt change degrades extraction accuracy, or the day a fine-tune that helped one case quietly broke five others. Without evals you don’t have a shipped system; you have a system you hope still works.
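A standing harness can start very small. The sketch below assumes a hypothetical golden set of inputs with known-correct fields and scores an extractor against it, raising if accuracy drops below a recorded baseline. `toy_extract` is a regex stand-in for a real model call, and the field names, sample texts, and tolerance are illustrative assumptions.

```python
import re

# Hypothetical golden set: inputs paired with the fields a correct extraction returns.
GOLDEN_SET = [
    {"text": "Referral for Jane Doe, DOB 1980-02-01",
     "expected": {"name": "Jane Doe", "dob": "1980-02-01"}},
    {"text": "Referral for Sam Lee, DOB 1975-11-30",
     "expected": {"name": "Sam Lee", "dob": "1975-11-30"}},
]


class EvalRegression(Exception):
    """Raised when the extractor scores below the recorded baseline."""


def toy_extract(text):
    """Stand-in for a model call: pulls a name and an ISO date with regexes."""
    name = re.search(r"for ([A-Z][a-z]+ [A-Z][a-z]+)", text)
    dob = re.search(r"DOB (\d{4}-\d{2}-\d{2})", text)
    return {
        "name": name.group(1) if name else None,
        "dob": dob.group(1) if dob else None,
    }


def field_accuracy(extract, golden_set):
    """Fraction of expected fields the extractor reproduces exactly."""
    correct = 0
    total = 0
    for case in golden_set:
        predicted = extract(case["text"])
        for field_name, expected in case["expected"].items():
            total += 1
            if predicted.get(field_name) == expected:
                correct += 1
    return correct / total if total else 0.0


def check_regression(extract, golden_set, baseline, tolerance=0.01):
    """Return the score, or raise if it falls more than `tolerance` below baseline."""
    score = field_accuracy(extract, golden_set)
    if score < baseline - tolerance:
        raise EvalRegression(f"accuracy {score:.3f} is below baseline {baseline:.3f}")
    return score
```

Run on every prompt or model change, a check like this turns "we think it still works" into a pass or a fail. It is only as good as the golden set, which needs to grow each time production surfaces a new failure.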
Observability and cost tracking, wired in from day one
You need to see what the system is doing (latency, drift, failure modes) and what it is costing, per request and per workflow. AI spend has a way of detonating quietly. Cost tracking is the difference between a system with known unit economics and a surprise invoice. Both are non-negotiable, on the same footing as the pipelines themselves.
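Per-request and per-workflow cost tracking does not need heavy tooling to start. The sketch below is one way to do it, under stated assumptions: the prices in `PRICE_PER_1K` are made-up placeholders rather than any provider's real rates, and `CostLedger` is a hypothetical in-memory ledger that would be a metrics table in production.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical prices in USD per 1,000 tokens; real rates depend on the provider.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}


def request_cost(input_tokens, output_tokens, prices=PRICE_PER_1K):
    """Cost of one model call, given token counts and per-1K-token prices."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1000


@dataclass
class CostLedger:
    """Accumulates spend and request counts per workflow."""
    totals: dict = field(default_factory=lambda: defaultdict(float))
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def record(self, workflow, input_tokens, output_tokens):
        """Log one request against a workflow and return its cost."""
        cost = request_cost(input_tokens, output_tokens)
        self.totals[workflow] += cost
        self.counts[workflow] += 1
        return cost

    def cost_per_request(self, workflow):
        """Average cost per request for a workflow; 0.0 if none recorded."""
        n = self.counts[workflow]
        return self.totals[workflow] / n if n else 0.0


# Usage: record each call where it happens, then read unit economics per workflow.
ledger = CostLedger()
ledger.record("referral_intake", input_tokens=1000, output_tokens=1000)
ledger.record("referral_intake", input_tokens=3000, output_tokens=1000)
```

With numbers like these recorded at the call site, cost per workflow becomes a query rather than a surprise on the invoice.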

None of this shows up in the keynote. All of it determines whether the keynote’s promise survives contact with production.
A concrete shape: where the 80% goes
Take an Operational Automation engagement we recognize from the inside — automating intake and document handling for a Hospital Management System. The “AI” everyone pictures is the part that reads a referral document and extracts the structured fields. That part is real, and it is maybe 20% of the work.
The other 80% is data engineering. The referral arrives in six formats. Patient identity has to be resolved against an existing record without creating duplicates. The extracted fields have to validate against the schema the downstream clinical system expects, and fail loudly when they don’t. Every automated decision has to be logged so a human can reconstruct it later. The pipeline has to handle the malformed PDF, the missing field, the duplicate submission, the 2 a.m. batch from the lab — without a person watching.
The model was the easy 20%. The 80% that made it a shipped system rather than a demo was pipelines, validation, identity resolution, evals, and audit logging. Swap in a School ERP automating enrolment and the ratio holds: the model reads the form; the data engineering makes it trustworthy enough to run unattended.
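A few of those steps can be sketched in a short amount of Python: validating extracted fields against what the downstream system expects, skipping duplicate submissions, and logging every decision. Everything here is an illustrative assumption, including the field names in `REQUIRED_FIELDS` and the `process` function itself. Duplicate detection is done by hashing the raw document bytes, which only catches byte-identical resubmissions. Resolving patient identity against existing records is a harder problem and is not shown.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical required fields for the downstream system; illustrative only.
REQUIRED_FIELDS = {"patient_name": str, "date_of_birth": str, "referring_clinic": str}


class ValidationError(Exception):
    """Raised when extracted fields do not match the expected schema."""


def validate(fields):
    """Fail loudly on a missing field or a field of the wrong type."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in fields:
            raise ValidationError(f"missing field: {name}")
        if not isinstance(fields[name], expected_type):
            raise ValidationError(f"wrong type for field: {name}")


def fingerprint(document_bytes):
    """Content hash used to spot byte-identical resubmissions."""
    return hashlib.sha256(document_bytes).hexdigest()


def process(document_bytes, extract, seen, audit_log):
    """Extract, validate, and log one document; returns fields or None for a duplicate."""
    doc_id = fingerprint(document_bytes)
    entry = {"doc_id": doc_id, "at": datetime.now(timezone.utc).isoformat()}
    if doc_id in seen:
        entry["outcome"] = "duplicate"
        audit_log.append(json.dumps(entry))
        return None
    try:
        fields = extract(document_bytes)
        validate(fields)
    except ValidationError as exc:
        entry["outcome"] = f"rejected: {exc}"
        audit_log.append(json.dumps(entry))
        raise  # the rejection is logged and still surfaces to the caller
    seen.add(doc_id)
    entry["outcome"] = "accepted"
    audit_log.append(json.dumps(entry))
    return fields
```

Here `extract` is whatever callable wraps the model, `seen` is a set of fingerprints, and `audit_log` is a list standing in for an append-only log. The model call is a single line. Everything around it exists so the system can run unattended and be reconstructed afterwards.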
This is exactly why doubling the budget does not double the output. The money flows toward the visible 20% — model access, a pilot, a slick interface — while the invisible 80% stays under-resourced. The execution gap is the predictable result of funding the tip of the iceberg and ignoring the mass under the waterline.
Closing the gap
The fix is not a better model. It is a reallocation of attention toward the part of the system that actually determines whether it ships.
- Budget for the data layer first. The model is the cheap part; the governed data feeding it is the expensive part.
- Stand up evals and observability before the launch, not after the first incident.
- Treat cost tracking as a feature, not an afterthought.
- Refuse to let a legacy ERP be the source of truth for your AI — get the data into a layer you own and can query.
The enterprises pulling ahead in 2026 are not the ones with the biggest model budget. They are the ones who understood that AI implementation is mostly data engineering with a model on top, and who funded the plumbing accordingly.
Built to ship means the system runs without us and we measure what we built. The model is the part everyone sees. The data engineering is the part that decides whether anyone gets to.
If your AI budget doubled but nothing shipped, the gap is almost certainly in the data layer. Let’s find it. Talk to us.