Executive Summary
- The single most expensive pre-deployment mistake in enterprise AI is not choosing the wrong model; it is deploying AI against a data architecture that cannot support it. The remediation cost arrives six months into production, not before launch. That pattern is consistent with Gartner’s prediction that, through 2026, organizations will abandon 60% of AI projects that are not supported by AI-ready data.
- The decision tree below classifies any planned AI workflow into one of three paths: Proceed (deploy on existing architecture), Prepare (lightweight data cleanup, 2–4 weeks), or Reset (full ontology or schema rebuild, 3–24 months). Domain count is the primary variable; data structure type and decision-point density are secondary.
- Rewired’s most actionable data principle — “No data architecture, no AI advantage” (Lamarre et al., 2024, Ch. 25 p.391) — describes the structural consequence: without governed, reusable data, AI investments produce recurring bespoke-integration costs and bounded outcomes that cap each use case rather than compounding across them.
- The workflow-archetype table at the end maps four common workflow types to their expected data paths. Transactional workflows almost always Proceed or Prepare. Agentic workflows almost always require Reset first — because agentic AI surfaces every data quality problem simultaneously and at scale.
How to Use This Decision Tree
Work through the branches in order, one question at a time. Stop at the first terminal branch (Path A, B, or C) that applies. Do not skip ahead.
The tree is calibrated for a workflow-level decision, not an enterprise-level one. The same company may simultaneously have one workflow that Proceeds, one that Prepares, and one that Resets.
Branch 1: Domain Count
How many distinct business domains does this workflow cross? A “domain” is a category of data with its own source system, its own ownership function, and its own quality standards.
Q: Does this workflow operate within a single data domain?
   (One source system, one ownership function, one data type)
├── YES → proceed to Branch 2
│
└── NO → How many domains does it cross?
    │
    ├── 2 domains → proceed to Branch 3
    │
    └── 3+ domains → PATH C: Full Reset Required
                     (See Path C description below)
Why domain count is the primary variable:
Nebraska Medicine’s revenue cycle automation (Palantir AIPCon 8, September 2025) crossed four domains — patient demographics, procedure codes, payer contracts, and physician credentials — each with a separate source system and a separate ownership function. The 10-hour build time that made headlines required a 6-month Ontology investment first. The Ontology is what resolved the four-domain problem. Without it, the AI had no coherent semantic layer to query.
SlickDeals’s deal-scoring deployment (AWS re:Invent 2025) operated within a single domain — user-generated deal content and community engagement signals in one system. No semantic integration was required. Infrastructure modernization (SQL Server → Databricks) was the investment. Result: 360x latency reduction, 7% merchant revenue increase.
Same AI stack. Same deployment era. Radically different data preparation requirements. The variable was domain count, not AI capability.
Source: research/07-adoption-challenges/ai-data-reset-decision-framework.md; research/01-ai-native-landscape/palantir-aipcon-enterprise-agentic-deployment-2026.md
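Branch 1 reduces to a small routing function. The sketch below is illustrative only: the `Domain` record, its field names, and the `route_by_domain_count` helper are assumptions introduced here, not part of the framework's source material. The domain names are taken from the two cases above; the source-system and owner values are placeholders, because the cases do not name them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical record for one data domain a workflow touches."""
    name: str
    source_system: str
    owner: str


def route_by_domain_count(domains: list[Domain]) -> str:
    """Apply Branch 1: route on the number of distinct domains crossed."""
    count = len({d.name for d in domains})
    if count == 0:
        raise ValueError("a workflow must touch at least one data domain")
    if count == 1:
        return "Branch 2"
    if count == 2:
        return "Branch 3"
    return "PATH C"


# Placeholder source systems and owners; only the domain names are from the text.
nebraska_medicine = [
    Domain("patient demographics", "system-a", "owner-a"),
    Domain("procedure codes", "system-b", "owner-b"),
    Domain("payer contracts", "system-c", "owner-c"),
    Domain("physician credentials", "system-d", "owner-d"),
]
slickdeals = [Domain("deal content and engagement signals", "system-a", "owner-a")]

print(route_by_domain_count(nebraska_medicine))  # PATH C
print(route_by_domain_count(slickdeals))         # Branch 2
```

Counting distinct domain names rather than list length keeps the result stable if the same domain is listed twice during an inventory.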
Branch 2: Single-Domain Data Structure (for workflows passing Branch 1)
Q: Is the single-domain data accessible via API or direct query without manual extraction,
   with a consistent schema for the last 12+ months?
├── YES → PATH A: Proceed on Existing Architecture
│         (See Path A description below)
│
└── NO → What is the primary data structure problem?
    │
    ├── Inconsistent formats / partial schema documentation
    │   → PATH B: Lightweight Cleanup First (2–4 weeks)
    │
    ├── Manual export required / no API access
    │   → PATH B: Lightweight Cleanup First (2–4 weeks)
    │   [Infrastructure modernization is the primary work]
    │
    └── Missing 12+ months of history / data gaps >5%
        → PATH B: Lightweight Cleanup First (2–4 weeks)
        [May extend to 6–8 weeks if historical backfill required]
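The Branch 2 question and its three problem types can be sketched as a single check. This is a minimal illustration under stated assumptions: the `SingleDomainCheck` fields and the `branch_2` helper are hypothetical names, "missing 12+ months of history" is read as twelve or more absent months, and when several problems coexist the sketch reports the longest timeline (the tree itself asks only for the primary problem).

```python
from dataclasses import dataclass


@dataclass
class SingleDomainCheck:
    """Hypothetical answers to the Branch 2 question for one workflow."""
    api_or_direct_query: bool        # reachable without manual extraction
    consistent_schema_months: int    # months the schema has been stable
    missing_history_months: int = 0  # months of history that are absent
    gap_fraction: float = 0.0        # share of missing records, 0.0 to 1.0


def branch_2(check: SingleDomainCheck) -> tuple[str, str]:
    """Return (path, timeline note) for a single-domain workflow."""
    history_problem = (
        check.missing_history_months >= 12 or check.gap_fraction > 0.05
    )
    access_problem = not check.api_or_direct_query
    schema_problem = check.consistent_schema_months < 12

    if not (history_problem or access_problem or schema_problem):
        return ("PATH A", "proceed on existing architecture")
    # Assumption: with several problems present, report the longest timeline.
    if history_problem:
        return ("PATH B", "2-4 weeks, may extend to 6-8 weeks for backfill")
    if access_problem:
        return ("PATH B", "2-4 weeks, infrastructure modernization is the primary work")
    return ("PATH B", "2-4 weeks")


print(branch_2(SingleDomainCheck(api_or_direct_query=True, consistent_schema_months=18)))
print(branch_2(SingleDomainCheck(api_or_direct_query=False, consistent_schema_months=18)))
```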
Branch 3: Two-Domain Workflows
Q: For the two domains this workflow crosses, does a documented
   entity resolution mapping already exist between them?
   (e.g., a customer ID that resolves across CRM and ERP)
├── YES → PATH B: Lightweight Cleanup First (2–4 weeks)
│         [Verify the mapping is current and audited; do not assume]
│
└── NO → Q: Is the entity resolution achievable in under 4 weeks
         with available internal resources?
    │
    ├── YES → PATH B: Lightweight Cleanup First (2–4 weeks)
    │         [4-week estimate is optimistic; plan for 6]
    │
    └── NO → PATH C: Full Reset Required if this workflow is the priority.
             OR: Scope workflow to one domain only and Proceed.
             [The scope-reduction alternative is often underused:
             many multi-domain workflows deliver 80%+ of their value
             within one domain. Evaluate this option before committing to Reset.]
Source: research/07-adoption-challenges/ai-data-reset-decision-framework.md
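Branch 3 differs from the other branches in that its last leaf offers two options rather than one. The sketch below makes that explicit by returning a list. The function name, parameter names, and option strings are assumptions chosen for illustration.

```python
def branch_3(
    mapping_documented: bool,
    resolvable_in_under_4_weeks: bool = False,
) -> list[str]:
    """Return the options Branch 3 leaves open for a two-domain workflow."""
    if mapping_documented:
        return ["PATH B (verify the mapping is current and audited)"]
    if resolvable_in_under_4_weeks:
        return ["PATH B (plan for 6 weeks rather than 4)"]
    # No mapping and no quick fix: Reset, or shrink the scope instead.
    return [
        "PATH C (if this workflow is the priority)",
        "Scope to one domain and proceed",
    ]


for option in branch_3(mapping_documented=False, resolvable_in_under_4_weeks=False):
    print(option)
```

Returning both options keeps the scope-reduction alternative visible at the point where a Reset would otherwise be the only recorded outcome.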
The Three Paths
Path A — Proceed on Existing Architecture
When: Single domain, structurally coherent, accessible via API, schema documented and consistent.
What it means: The data architecture does not require remediation before AI deployment. Infrastructure modernization (real-time streaming, query optimization, observability tooling) may still be required, but the data model itself is sound.
Corpus anchors:
- SlickDeals (single domain, real-time deal content): 360x latency reduction + 7% revenue gain with infrastructure modernization, no semantic layer (research/07-adoption-challenges/ai-data-reset-decision-framework.md)
- TTEC call center (structured call logs, single system): 15–20% handle time reduction (research/04-consulting-firms/mckinsey-rewired-2nd-edition-synthesis.md, E.ON Next case)
- High-volume transaction scoring workflows: when input is structured and output is a score, Path A is the default starting point
Timeline: 6–16 weeks to production (infrastructure + deployment + testing)
Budget: $70K–$400K for a mid-market single-workflow initiative
Caveats:
- Re-run Domain Count check before deployment if scope expands. A workflow that starts as single-domain but grows to include a second source system has moved to Branch 3.
- Path A does not eliminate all data risk. Only 7% of enterprises have completely AI-ready data (Cloudera/HBR, March 2026). “Structurally coherent” is not the same as “clean.”
Path B — Lightweight Data Cleanup First (2–4 Weeks)
When: Single domain with schema inconsistencies, manual export requirements, or gaps in history; or two domains where an entity resolution mapping already exists or can be built in under 4 weeks.
What it means: Targeted remediation of the specific data problems blocking this workflow, without rebuilding the broader data architecture. The investment is bounded.
Common Path B work:
- API creation or direct query access to replace manual export
- Schema documentation and normalization for one source system
- Entity resolution mapping between two systems (customer ID alignment across CRM + ERP)
- Historical backfill for 12–24 months of missing data
- dbt data model for the specific tables this workflow queries
Timeline: 2–4 weeks for schema and API work; 6–8 weeks if historical backfill is required
Budget: $80K–$400K in additional preparation cost (on top of Path A deployment budget)
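Because the Path B budget is stated as an addition to the Path A deployment budget ($70K–$400K), the combined range is worth computing explicitly. The sketch below simply sums the two ranges bound by bound. That is an assumption: it treats the low ends and the high ends as occurring together, which a real estimate may not. The constant and function names are illustrative.

```python
Range = tuple[int, int]

PATH_A_BUDGET_USD: Range = (70_000, 400_000)  # Path A deployment budget
PATH_B_PREP_USD: Range = (80_000, 400_000)    # additional Path B preparation


def add_ranges(a: Range, b: Range) -> Range:
    """Sum two (low, high) ranges bound by bound."""
    return (a[0] + b[0], a[1] + b[1])


total = add_ranges(PATH_A_BUDGET_USD, PATH_B_PREP_USD)
print(f"${total[0]:,} to ${total[1]:,}")  # $150,000 to $800,000
```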
Corpus anchors:
- Caylent/Teamfront (2025): 4 SQL clusters, 2,500+ stored procedures cleaned in 10 weeks (compressed from 40-week estimate) with AI-assisted work — 70% automated, 20% AI-assisted, 10% manual (research/07-adoption-challenges/data-cleaning-real-timelines-case-studies.md)
- Realistic mid-market Silver-layer timeline (Medallion architecture): 3–6 months for first domain, though Path B is a subset of this
Caution: The Path B estimate assumes that entity resolution between the two domains is achievable. Vendor “48-hour medallion” claims (Nexla and similar) cover pipeline infrastructure, not entity resolution or business-rule definition, both of which require functional stakeholder time. Budget for 0.5 FTE of business-side involvement, not just technical resources.
Path C — Full Ontology or Schema Reset Before Deployment
When: Workflow crosses 3+ data domains; OR DRI (Data Readiness Index) score is below 0.50; OR entity resolution between even two domains would require more than 4 weeks.
What it means: A semantic integration layer — Ontology, canonical data model, data mesh implementation, or equivalent — must be built before the AI can operate reliably at production scale. This is not a technology choice; it is the structural prerequisite for multi-domain AI.
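The three Path C triggers combine with a logical OR, so any one of them is sufficient. The sketch below returns the list of triggers that fired, which is more useful in a review than a bare yes or no. It treats the DRI score as an input and does not compute it; the function and parameter names are assumptions.

```python
from typing import Optional


def path_c_triggers(
    domain_count: int,
    dri_score: Optional[float] = None,
    entity_resolution_weeks: Optional[float] = None,
) -> list[str]:
    """List every Path C condition the workflow meets (empty means none)."""
    reasons = []
    if domain_count >= 3:
        reasons.append("crosses 3+ data domains")
    if dri_score is not None and dri_score < 0.50:
        reasons.append("DRI score below 0.50")
    if (
        domain_count >= 2
        and entity_resolution_weeks is not None
        and entity_resolution_weeks > 4
    ):
        reasons.append("entity resolution needs more than 4 weeks")
    return reasons


# Hypothetical two-domain workflow with a low readiness score.
print(path_c_triggers(domain_count=2, dri_score=0.42, entity_resolution_weeks=3))
```

A non-empty list means Path C applies; unknown inputs are passed as `None` and are skipped rather than guessed.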
Why this is Rewired’s central data argument:
Lamarre et al., 2024, Ch. 25 p.391: “No data architecture, no AI advantage.” The book’s framing is that without a governed, reusable data architecture, AI investments produce “recurring bespoke-integration costs and bounded outcomes.” Each use case requires its own integration work. The compounding effect of reusable data products (Ch. 26, p.401) — the mechanism behind Palantir’s 139% net dollar retention — only activates when the semantic layer is shared.
The mid-market adaptation: for organizations below the Fortune 500 band, Path C usually means purchasing a platform (Snowflake + Cortex, Databricks, or a domain-specific data product tool) rather than building a proprietary Ontology. The investment logic is identical; the build-vs-buy decision differs by scale.
Corpus anchors:
- Nebraska Medicine: 6-month Palantir Ontology → 10-hour second use case (research/01-ai-native-landscape/palantir-aipcon-enterprise-agentic-deployment-2026.md)
- Guardian Life: consolidated enterprise data + microservices architecture → RFP/quoting compressed from 5–7 days to 24 hours (research/01-ai-native-landscape/mit-cisr-scaling-ai-maturity-bottom-line-2026.md)
- Italgas: IoT + data platform investment since 2017 → WorkOnSite 40% faster project completion (same source)
Timeline: 9–24 months for mid-market; 18–36 months for Fortune 500 with complex legacy estates
Budget: $450K–$2.3M total (mid-market, single-domain Reset with two workflows in scope); $2M–$15M+ for enterprise scope
Before committing to Path C, ask: Can the workflow be scoped to one domain and deliver 80% of the intended value? BCG’s concentration finding (focusing on 1–3 domains generates results; spreading effort across 100 use cases fails) applies here. A scoped single-domain Path A deployment now, funded by measurable returns, may be a better first step than a 24-month Reset that delays all value.
Four Workflow Archetypes — Recommended Path by Type
| Archetype | Description | Typical Domain Count | Expected Data Path | Notes |
|---|---|---|---|---|
| Transactional | High-volume, structured input → score or classification. Examples: fraud scoring, deal ranking, approval routing | 1–2 | Path A or B | If data is in a single system, this is the fastest path to production ROI. SlickDeals and TTEC call center are the anchors. |
| Analytical | Cross-domain aggregation for decision support. Examples: demand forecasting, customer 360, supply chain risk | 2–4 | Path B or C | Two-domain analytical AI (CRM + ERP) often resolves in Path B. Three or more domains — customer + product + supplier + logistics — typically require Path C. |
| Generative | Document creation, synthesis, or summarization using enterprise knowledge. Examples: contract drafting, RFP response, clinical note generation | 1–3 | Path A, B, or C | Generative AI is unusually sensitive to unstructured data quality. Only 26% of organizations can use unstructured data in a way that delivers business value (IBM IBV CDO Survey, 2025). A single-domain generative use case (legal document review, for example) may be Path A; a multi-domain knowledge synthesis use case (patient history + clinical protocols + billing rules) requires Path C. |
| Agentic | Multi-step autonomous execution across systems. Examples: revenue cycle automation, multi-system order management, cross-platform incident resolution | 3–5+ | Path C (almost always) | Agentic AI surfaces every data quality problem simultaneously and at scale. Bain’s agentic governance framework (Foundation phase) requires data quality and observability infrastructure before orchestration begins. Attempting Path A for an agentic workflow produces the Data Mirage failure pattern at production scale. The Nebraska Medicine case is the anchor: Path C first, then agentic deployment. |
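
The archetype table can also be held as a lookup, which makes it easy to flag a workflow whose actual domain count falls outside the typical range for its type. This is a sketch only: the dictionary layout and the `expected_paths` helper are assumptions, and the open-ended agentic range (3–5+) is represented with `None` as the upper bound.

```python
ARCHETYPES = {
    "transactional": {"domains": (1, 2), "paths": ("A", "B")},
    "analytical": {"domains": (2, 4), "paths": ("B", "C")},
    "generative": {"domains": (1, 3), "paths": ("A", "B", "C")},
    "agentic": {"domains": (3, None), "paths": ("C",)},
}


def expected_paths(archetype: str, domain_count: int) -> tuple[tuple[str, ...], bool]:
    """Return (expected paths, whether domain_count is typical for the type)."""
    row = ARCHETYPES[archetype.lower()]
    low, high = row["domains"]
    typical = domain_count >= low and (high is None or domain_count <= high)
    return row["paths"], typical


print(expected_paths("agentic", 4))        # (('C',), True)
print(expected_paths("transactional", 3))  # (('A', 'B'), False)
```

An atypical result is a prompt to re-run the domain count check, not a classification in itself.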
The “Scope Reduction” Alternative
Before committing to Path C, evaluate this option seriously: can the workflow’s scope be reduced to a single domain while delivering 80%+ of the intended value?
This is not a consolation prize. It is a deliberate sequencing strategy — the same one BCG recommends for funding AI transformation programs (bank early wins in a proven-deployment workflow, then fund Year Two end-to-end redesign with those returns).
Worked example: A healthcare system wants to automate prior authorization (patient + procedure + payer + physician = Path C, 12–18 months). Scope reduction: automate only the structured prior-auth lookup for a single payer contract (single domain, Path A, 8 weeks). That deployment generates measurable savings and builds the organizational confidence — and partial data infrastructure — that makes the full four-domain Path C faster and more fundable.
This compounding logic is the practical expression of Rewired’s Ch. 31 p.469 (“the best use case is the reuse case”): design the first deployment so its infrastructure serves as the foundation for the second.
What This Means for Your Organization
The decision tree above gives you a 15-minute workflow classification. The harder question is whether your organization has the discipline to stop at a Path C classification and do the foundational work, rather than proceeding on a Path A assumption and encountering the data problem six months into production.
The corpus answer is sobering: only 7% of enterprises say their data is completely ready for AI (Cloudera/HBR, March 2026). Most organizations are running Path A deployments against Path C data realities. That mismatch is where the Gartner 60%-abandonment figure lives.
The data reset decision is a business decision, not a technical one. It determines whether the AI budget produces compounding returns or recurring remediation costs. If you’d like to work through the classification for a specific workflow in your environment, the 30-minute assessment at research/09-ai-adoption-cycle/ai-workflow-readiness-30-minute-assessment.md runs the full diagnostic — or I’d welcome a direct conversation: brandon@brandonsneider.com.
Related Wiki Articles
- wiki/data-readiness.md — full data readiness framework; Medallion architecture timelines; DRI scoring
- wiki/workflow-redesign.md — workflow-redesign prerequisites that parallel data reset in the deployment sequence
- wiki/ai-maturity-models.md — organizational maturity context; Stage 2→3 transition as the data-backbone prerequisite
- research/09-ai-adoption-cycle/ai-workflow-readiness-30-minute-assessment.md — Section 1 (Data Foundation) operationalizes the branch questions above
- research/09-ai-adoption-cycle/ai-deployment-failure-mode-red-flag-checklist.md — flags 9, 10, 11, 12 map to the data branches in this decision tree
- research/04-consulting-firms/mckinsey-rewired-2nd-edition-synthesis.md — Capability 4 (data backbone) and Tension 3 (build vs. buy for mid-market)
Brandon Sneider | brandon@brandonsneider.com | April 2026