
The AI Factory Budget Is About to Triple: What 515 Enterprise Decision-Makers Told Deloitte About the 2028 Infrastructure Cliff

See also (wiki): wiki/inference-economics.md, wiki/ai-budget-cfo-decisions.md, wiki/ai-talent-workforce-planning.md, wiki/ai-vendor-contracts.md


Executive Summary

  • Deloitte’s Enterprise AI Infrastructure Survey: A 2028 Outlook (March 30, 2026; n=515 US director-level-and-above decision-makers at organizations with $500M+ annual revenue; fielded November–December 2025) is the first large-sample primary survey to quantify the 2026-to-2028 capital-expenditure trajectory for enterprise AI infrastructure. The headline: respondents expect their infrastructure budgets to more than triple over three years, with large enterprises projecting nearly four times current spend.
  • Three trajectory lines anchor the survey: AI factory deployment rises from 64% of respondents today (limited or at-scale) to 88% by 2028, with 73% expecting at-scale operations; monthly token consumption of 10+ billion rises from 30% to 61%; scaled edge-AI deployment doubles from 36% to 72%. The 100+ billion token cohort is expected to triple. This is not a pilot-to-production story; it is a production-to-industrial-scale story.
  • The binding constraint is not ambition. 97% of respondents say they are confident they can scale AI workloads in the next three years. The binding constraints Deloitte names are (1) economic uncertainty (cited by 51% as the leading risk factor), (2) a 16-point talent-capability gap (81% confidence in IT teams vs. 65% in business and product teams), and (3) power/cooling/grid-interconnection strategy — 47–48% of leaders plan mixed power sourcing across grid, self-built on/near-site generation, and third-party.
  • The survey’s audience is above the 200–2,000-person workshop band: every respondent sits at a $500M+ revenue company. The data should not be read as a mid-market benchmark. It should be read as the vendor-contract and token-pricing environment a mid-market CIO will be negotiating inside of — the cohort signaling where GPU supply, memory pricing, and power contracts are headed.
  • The Monday-morning action for a 200–2,000-person American company is not to build an AI factory. It is to write token economics and total-cost-of-ownership into the next AI vendor contract — because the buyers above you in the market just told Deloitte they expect the per-token cost environment to compound through 2028.

Why This Study Matters: The Capex Dimension Was the Missing One

The 2026 institutional corpus has characterized almost every dimension of enterprise AI except the one CFOs will care most about by this time next year: what does the hardware, power, and token bill actually look like at scale? McKinsey’s Nov 2025 State of AI (n=1,993) named 6% of companies as high performers capturing >5% EBIT impact. BCG’s AI at Work 2025 (n=10,635) named 5% with substantial financial gains. IBM IBV’s Dynamic Finance (n=600 CFOs, Mar 2026) quantified a 12% “advanced” cohort making funding decisions 19% faster. MIT CISR’s Enterprise AI Maturity (n=721) mapped the four-stage maturity curve.

None of these studies carries the capex trajectory. Deloitte’s infrastructure survey is the first to quantify, at $500M+ revenue scale, how much AI spend is about to move through the system over the next three years and what the binding resource constraints look like on the other side of that move.

That trajectory matters to mid-market buyers for a specific, asymmetric reason. Token prices, GPU lead times, data-center power contracts, and AI-specialist compensation are all set in the same market where the $500M+ cohort is about to triple its capex. A 300-person American company negotiating a two-year AI contract in 2027 will be pricing against the supply-and-demand conditions the cohort in this survey creates. Deloitte’s data is the best available read on those conditions.

The Three Trajectories That Actually Matter

AI factories: 64% to 88% deployed, 73% at scale by 2028. Deloitte defines an AI factory as “purpose-built, high-performance infrastructure paired with AI-optimized software, scaling the end-to-end AI lifecycle across modalities, with output measured by token throughput.” That 73% expect at-scale operations by 2028, up from roughly a third today, is the clearest signal in the survey that large enterprises are treating AI as a sustained infrastructure commitment, not a program budget. The expected applications skew toward revenue-adjacent work: innovation support (71%), risk management (64%), token optimization (59%).

Token consumption: 30% to 61% at 10+ billion per month by 2028. The top of the distribution moves even faster — the 100+ billion monthly token cohort triples. The pattern is consistent with what practitioner sources in the corpus report: enterprise AI loads are moving from point assistance (thousands of tokens per interaction) to agentic workflows (hundreds of thousands to millions per multi-step task). Deloitte’s authors flag the counter-reading directly: “High token growth can indicate inefficient patterns (oversized prompts, weak context management) rather than successful adoption.” Not every billion tokens is value-creating; some are the exhaust of poorly designed agent loops.

Edge AI: 36% to 72% at scale by 2028. 68% of respondents favor cloud-managed edge: cloud orchestration paired with local compute. For mid-market operators with physical operations (manufacturing, logistics, retail), the edge trajectory is the single most transferable finding. It signals that the vendor ecosystem is about to standardize around cloud-orchestrated local inference, which is where the BCG physical-AI framework (Pass 470) and the IBM IBV five-trends sovereignty thread (Pass 467) converge.

The Budget Picture: Triple, With a 4x Ceiling

86% of respondents expect AI-infrastructure budgets to increase over three years. Average budgets are expected to triple; large enterprises project close to four times. The cost drivers Deloitte names are structural, not transient: high-bandwidth memory shortages, wafer prices up 20%, GPU and CPU pricing, power generation, and cooling infrastructure. These are hardware-supply-chain dynamics, not software-licensing dynamics — meaning they do not compress when new models release.
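
To make the multiples concrete, the short sketch below converts a three-year budget multiple into the constant annual growth rate it implies. The constant-rate assumption is ours, not Deloitte's: the survey reports the endpoint multiple, not the year-by-year path, and "more than triple" makes 3x a floor rather than a point estimate.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Constant annual growth rate that compounds to `multiple` over `years`.

    Assumes smooth compounding; actual spend may be front- or back-loaded.
    """
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    return multiple ** (1 / years) - 1


# Multiples taken from the survey; the 3-year horizon is 2026 to 2028.
for label, multiple in [("average budget (3x)", 3.0), ("large enterprise (~4x)", 4.0)]:
    rate = implied_annual_growth(multiple, years=3)
    print(f"{label}: about {rate:.1%} per year")
# average budget (3x): about 44.2% per year
# large enterprise (~4x): about 58.7% per year
```

Read this way, a tripling is roughly 44% growth in infrastructure spend every year for three years, which is the pace a mid-market buyer's vendors will be planning capacity against.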

This is the number mid-market buyers should treat with the most care. It is self-reported by the buying side of the market, fielded during a period (Nov–Dec 2025) when AI capex narratives were actively shaping procurement plans. The direction of the forecast is credible; the precise multiple is buyers’ stated plans, not measured outcomes. Vendor caveat applies — Deloitte sells AI-infrastructure transformation engagements and has an interest in the magnitude of the forecast.

Even with the caveat applied, the direction is unambiguous enough to drive vendor-contract structure today. A 200–2,000-person company signing a three-year commit to a specific per-token price in 2026 is pricing against a 2028 environment where the cohort above it has materially more inference demand, materially more power cost, and materially tighter memory supply. Two concrete contract implications follow.

First: price-step-down clauses tied to published per-token benchmarks are now table stakes. Per-token costs have fallen faster than Moore’s Law through 2024–2026, and the survey implies that trend continues — but the savings will only reach buyers who negotiated for them.
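
One way to picture a step-down clause is as a ratchet: the contract price can fall when the benchmark falls, but never rises above what was already agreed. The sketch below is illustrative only. The clause mechanics, the 10% premium over benchmark, and all prices (in USD per million tokens) are hypothetical, not terms from the Deloitte survey or any real contract.

```python
def step_down_schedule(start_price, benchmark_prices, premium=0.10):
    """Price per period under a hypothetical step-down clause.

    Each period the price resets to benchmark * (1 + premium) if that is
    lower than the current price; otherwise it stays where it is.
    """
    prices = []
    current = start_price
    for benchmark in benchmark_prices:
        current = min(current, benchmark * (1 + premium))
        prices.append(current)
    return prices


# Hypothetical quarterly benchmark readings, USD per million tokens.
benchmarks = [9.50, 8.00, 8.40, 6.00]
for quarter, price in enumerate(step_down_schedule(10.00, benchmarks), start=1):
    print(f"Q{quarter}: {price:.2f}")
# Q1: 10.00
# Q2: 8.80
# Q3: 8.80
# Q4: 6.60
```

Note that the Q3 benchmark uptick does not raise the buyer's price; that one-way behavior is the point of negotiating the clause.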

Second: token-consumption audits are a CFO obligation, not a technical detail. Deloitte’s own guidance is specific: “Track AI consumption with financial rigor — audit token usage with same discipline as financial systems.” The reason is visible in the adjacent Deloitte + Docusign study (Pass 474), where 61% of surveyed organizations still extract post-signature data (agreements, logs, model outputs) manually, and token-spend attribution is not meaningfully better in most mid-market environments. A CFO who cannot answer “what did we spend on tokens by business unit this quarter” is about to manage a line item that will triple without attribution.
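
A minimal sketch of what that attribution could look like, assuming usage logs can be exported as records tagged with business unit and model tier. The record layout, unit names, tiers, and prices are all hypothetical placeholders; real provider billing exports will differ.

```python
from collections import defaultdict

# Hypothetical usage records:
# (business_unit, model_tier, tokens, usd_per_million_tokens)
USAGE = [
    ("sales", "large", 120_000_000, 10.0),
    ("sales", "large", 80_000_000, 10.0),
    ("sales", "small", 400_000_000, 0.5),
    ("support", "large", 50_000_000, 10.0),
    ("ops", "small", 30_000_000, 0.5),
]


def spend_by_unit_and_tier(records):
    """Sum tokens and dollar spend for each (business_unit, model_tier) pair."""
    totals = defaultdict(lambda: {"tokens": 0, "usd": 0.0})
    for unit, tier, tokens, price_per_million in records:
        bucket = totals[(unit, tier)]
        bucket["tokens"] += tokens
        bucket["usd"] += tokens / 1_000_000 * price_per_million
    return dict(totals)


for (unit, tier), bucket in sorted(spend_by_unit_and_tier(USAGE).items()):
    print(f"{unit:8s} {tier:6s} {bucket['tokens']:>13,d} tokens  ${bucket['usd']:,.2f}")
```

The design choice worth noting is the grouping key: attributing by both unit and tier is what lets a finance team see whether growth is coming from more work or from routing routine work to an expensive model.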

The Talent-Capability Gap Is the Real 2028 Constraint

81% of respondents believe IT teams have the technical and financial acumen to scale AI; only 65% believe the same of business and product teams. That 16-point gap is the single most actionable diagnostic in the survey.

The gap matters because the question for 2028 is not “can IT deploy the AI factory.” IT will deploy the AI factory — the vendor ecosystem is lined up to make that possible. The question is whether the business owns the workflow redesign, the value attribution, and the governance practice that converts infrastructure into EBIT. At a 16-point confidence gap, Deloitte’s own conclusion is direct: organizations succeeding by 2028 will “scale with economic discipline, operational resilience, and clear accountability” — not simply scaling faster.

The skill shortages Deloitte names are the specific roles this gap runs through: security and compliance specialists, AI/ML and agent operations engineers, change management experts. Emerging needs — robotics systems engineering, carbon and energy monitoring — point to where the gap widens next as physical AI and sustainability obligations enter the workload mix.

This triangulates cleanly with two existing corpus findings: BCG’s 10-20-70 framework (70% of AI value comes from people and process) and Deloitte’s own Pass 462 Global Human Capital Trends finding (organizations using human-centric AI strategies are 1.6x more likely to exceed investment returns than those taking tech-first approaches; 59% still take tech-first approaches). The infrastructure survey adds the capex cost of getting the people dimension wrong: you spend triple on infrastructure that the non-IT half of your organization cannot operate at the confidence level IT can.

Power, Cooling, and the Physical-Layer Problem

47–48% of leaders plan mixed power sourcing: grid, self-built on/near-site generation, third-party. This is not a data-center detail. It is the boardroom recognition that AI workloads at 2028 scale cannot be sourced from grid power alone in most US markets. Cooling strategies span traditional air to liquid. Grid interconnection is now a strategic concern, not an operations detail.

For a 200–2,000-person American company, the power question is rarely in the direct critical path: most mid-market AI loads will continue to run in commercial cloud, which absorbs the power and cooling constraint on behalf of tenants. But the power constraint shapes the cloud contract. Cloud providers who cannot source new power will price older inference capacity aggressively while metering new-capacity access. The mid-market posture that survives this is (1) multi-cloud enough that any one provider’s power constraint is a price lever, not a capacity lever, and (2) workload-portable enough that inference can route to the provider with the best per-token price that quarter.

Source Credibility

Credibility: MEDIUM — vendor caveat applies. Deloitte has a direct commercial interest in AI-infrastructure transformation consulting engagements; the survey’s executive recommendations align closely with services Deloitte sells (financial-operations principles grounded in token economics, value-based governance, infrastructure-and-execution alignment). Self-reported projections are stated intentions at the time of fielding (Nov–Dec 2025), not measured outcomes — the 2028 numbers will reflect what buyers planned to do, modulated by three years of market conditions.

Methodologically: n=515 is a credible sample. Director-level-and-above at organizations with $500M+ annual revenue is a rigorous filter. Five industries (consumer; energy, resources, industrials; financial services; life sciences and health care; technology, media, telecom) provide reasonable breadth. The survey does not report detailed comparative breakouts across industries, which limits what can be said about sector-specific trajectories. The $500M+ revenue threshold is above the 200–2,000-person workshop audience’s typical revenue band; all prescriptions for mid-market readers should be framed as “what the cohort above you is signaling” rather than direct benchmarks.

Freshness: TIER 1 (March 2026 publication, late-2025 fieldwork, current-generation models). Cite directly.

Cross-reference against: IBM IBV Enterprise 2030 (capex lens), MIT CISR Enterprise AI Maturity (maturity-curve lens), McKinsey State of AI Nov 2025 (high-performer share), BCG AI at Work 2025 (workforce lens). These surveys are vendor-published and represent selected wins or stated intentions with no control group and no independent verification. Also cross-reference against: METR RCT (experienced developers 19% slower), CMU study (40.7% code complexity increase), Atlan 200-deployment analysis (median +159.8% ROI requires workflow redesign first).

Key Data Points

| Metric | Value | Source / Date |
| --- | --- | --- |
| Sample size | 515 US director-level-and-above decision-makers | Deloitte Insights, Mar 30 2026 |
| Revenue threshold | $500M+ annual revenue | Deloitte Insights, 2026 |
| Fielding period | November–December 2025 | Deloitte Insights, 2026 |
| AI factories, current (limited or at-scale deployment) | 64% | Deloitte Insights, 2026 |
| AI factories, expected at scale by 2028 | 73% | Deloitte Insights, 2026 |
| AI factories, expected any deployment by 2028 | 88% | Deloitte Insights, 2026 |
| Token consumption 10+ billion/month, current | 30% | Deloitte Insights, 2026 |
| Token consumption 10+ billion/month, 2028 | 61% | Deloitte Insights, 2026 |
| 100+ billion token cohort trajectory | Triples by 2028 | Deloitte Insights, 2026 |
| AI at the edge, current scaled | 36% | Deloitte Insights, 2026 |
| AI at the edge, 2028 scaled | 72% | Deloitte Insights, 2026 |
| Cloud-managed edge preference | 68% | Deloitte Insights, 2026 |
| Budget increase expectation (3 years) | 86% | Deloitte Insights, 2026 |
| Average budget multiplier | >3x | Deloitte Insights, 2026 |
| Large-enterprise budget multiplier | ~4x | Deloitte Insights, 2026 |
| Confidence in ability to scale AI workloads | 97% | Deloitte Insights, 2026 |
| IT-team capability confidence | 81% | Deloitte Insights, 2026 |
| Business/product-team capability confidence | 65% | Deloitte Insights, 2026 |
| Capability gap | 16 points | Deloitte Insights, 2026 |
| IT leadership (CIO/CTO) owns infrastructure decisions | 51% | Deloitte Insights, 2026 |
| Economic uncertainty as leading risk | 51% | Deloitte Insights, 2026 |
| Regulatory pressure as risk factor | 48% | Deloitte Insights, 2026 |
| Talent gap as risk factor | 40% | Deloitte Insights, 2026 |
| Mixed power sourcing plan | 47–48% | Deloitte Insights, 2026 |
| Wafer price increase | +20% | Deloitte Insights, 2026 |

What This Means for Your Organization

The survey describes a cohort of companies five to ten times larger than the 200–2,000-person workshop audience preparing to triple AI-infrastructure spend through 2028. Three questions decide whether the finding is directly relevant or contextually relevant to your organization.

  1. What is your three-year token exposure, and is it in the next contract? The 30%-to-61% shift in the 10+ billion-tokens-per-month cohort is the single most important data point for vendor-contract negotiation. A CFO who cannot currently answer “how many tokens did we consume last quarter, by business unit, by model tier” is about to manage a line item that the cohort above has told Deloitte will triple. Before the next AI vendor commit, require token economics and total-cost-of-ownership reporting as contract terms — not as a nice-to-have.

  2. What is your IT-to-business capability gap, and where does it show up first? The 16-point gap (81% vs. 65%) is the operational version of BCG’s 10-20-70. Infrastructure alone does not produce EBIT; business-side ownership of workflow redesign, value attribution, and governance practice does. The most testable mid-market diagnostic is to ask each business-unit leader: “Name two decisions an AI agent could make inside your operation this quarter that you would be confident signing off on.” If the answer is unclear for more than half the business-unit leaders, the capability gap is the binding constraint regardless of how much infrastructure budget is available.

  3. Is your architecture workload-portable enough to route inference to the cheapest provider? Power, cooling, and grid-interconnection constraints in the $500M+ cohort mean cloud providers will price older inference capacity aggressively while metering new-capacity access. A mid-market operator whose AI workloads are hardwired to a single provider loses the per-token price lever that the survey implies will be the most valuable piece of the 2027–2028 contract environment. The architectural posture that captures this is inference abstraction — a gateway that lets you swap providers by configuration, not re-platforming.
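
The inference-abstraction idea in the third question can be sketched in a few lines: provider choice lives in a configuration list, and the routing rule picks the cheapest provider that currently has capacity. This is a toy illustration under stated assumptions, not a reference design. The provider names, prices, and capacity flags are hypothetical, and a real gateway would also have to handle model equivalence, latency, data residency, and authentication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    usd_per_million_tokens: float
    has_capacity: bool = True


# Hypothetical configuration; in practice this would be loaded from a file
# so that swapping providers is a config change, not a re-platforming.
PROVIDERS = [
    Provider("provider_a", 9.0),
    Provider("provider_b", 7.5),
    Provider("provider_c", 6.0, has_capacity=False),  # cheapest, but metered out
]


def pick_provider(providers, allowed=None):
    """Return the cheapest provider with capacity, optionally limited to `allowed` names."""
    candidates = [
        p for p in providers
        if p.has_capacity and (allowed is None or p.name in allowed)
    ]
    if not candidates:
        raise RuntimeError("no eligible provider has capacity")
    return min(candidates, key=lambda p: p.usd_per_million_tokens)


print(pick_provider(PROVIDERS).name)  # provider_b
```

The example also shows why capacity matters as much as price: the nominally cheapest provider is skipped because it cannot serve the request, which is the metered-access scenario the power constraint implies.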

If this raised questions specific to your organization — about token economics, vendor-contract structure for the next three-year commit, or how to close the 16-point IT-to-business capability gap before the infrastructure bill triples — I would welcome the conversation. brandon@brandonsneider.com.

Sources

  • Deloitte Insights. Enterprise AI Infrastructure Survey: A 2028 Outlook. Kavitha Prabhakar, Chris Thomas, Nicholas Merizzi, Diana Kearns-Manolatos, Iram Parveen. March 30, 2026. URL: https://www.deloitte.com/us/en/insights/topics/technology-management/ai-infrastructure-survey.html. Sample: n=515 US director-level-and-above decision-makers at organizations with $500M+ annual revenue, fielded November–December 2025, five industries (consumer; energy, resources & industrials; financial services; life sciences & health care; technology, media & telecommunications). Credibility: MEDIUM — vendor caveat (Deloitte sells AI-infrastructure transformation engagements); self-reported planned spend and capacity, not measured outcomes; rigorous sample and methodology.

  • McKinsey. The State of AI (November 2025). n=1,993 global respondents. Used as cross-reference for the high-performer share and EBIT-impact benchmark. URL: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai.

  • IBM Institute for Business Value. The Enterprise in 2030. Used as cross-reference for the capex and business-model trajectory. Internal research file: research/04-consulting-firms/ibm-ibv-enterprise-2030.md.

  • IBM Institute for Business Value + Oracle. Dynamic Finance at Work. March 18, 2026; n=600 CFOs. Used as cross-reference for the “advanced cohort” 12% finding and funding-decision speed. Internal research file: research/04-consulting-firms/ibm-ibv-dynamic-finance-2026.md.

  • BCG. AI at Work 2025. n=10,635. Used as cross-reference for the 5% substantial-financial-gain cohort. Internal research file: research/01-ai-native-landscape/bcg-ai-at-work-2025.md.

  • Deloitte Insights. 2026 Global Human Capital Trends. Published March 4, 2026; n=9,000+. Used as cross-reference for the human-centric-vs-tech-first 1.6x outcome finding. Internal research file: research/07-adoption-challenges/deloitte-global-human-capital-trends-2026.md.


Brandon Sneider | brandon@brandonsneider.com April 2026