AI Adoption Cycle

When to Reset Your Data Architecture Before Deploying AI


Executive Summary

  • The single most expensive pre-deployment mistake in enterprise AI is not choosing the wrong model; it is deploying AI against a data architecture that cannot support it. The remediation cost arrives six months into production, not before launch. Gartner predicts that, through 2026, organizations will abandon 60% of AI projects that are not supported by AI-ready data.
  • The decision tree below classifies any planned AI workflow into one of three paths: Proceed (deploy on existing architecture), Prepare (lightweight data cleanup, 2–4 weeks), or Reset (full ontology or schema rebuild, 3–24 months). Domain count is the primary variable; data structure type and decision-point density are secondary.
  • Rewired’s most actionable data principle — “No data architecture, no AI advantage” (Lamarre et al., 2024, Ch. 25 p.391) — describes the structural consequence: without governed, reusable data, AI investments produce recurring bespoke-integration costs and bounded outcomes that cap each use case rather than compounding across them.
  • The workflow-archetype table later in this article maps four common workflow types to their expected data paths. Transactional workflows almost always Proceed or Prepare. Agentic workflows almost always require Reset first, because agentic AI surfaces every data quality problem simultaneously and at scale.

How to Use This Decision Tree

Work through the branches in order, one question at a time. Stop at the first terminal branch (Path A, B, or C) that applies. Do not skip ahead.

The tree is calibrated for a workflow-level decision, not an enterprise-level one. The same company may simultaneously have one workflow that Proceeds, one that Prepares, and one that Resets.


Branch 1: Domain Count

How many distinct business domains does this workflow cross? A “domain” is a category of data with its own source system, its own ownership function, and its own quality standards.

Q: Does this workflow operate within a single data domain?
(One source system, one ownership function, one data type)

├── YES → proceed to Branch 2
│
└── NO → How many domains does it cross?
    │
    ├── 2 domains → proceed to Branch 3
    │
    └── 3+ domains → PATH C: Full Reset Required
                     (See Path C description below)
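
The Branch 1 routing can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the source framework: the function name `route_by_domain_count` and the string labels it returns are hypothetical.

```python
def route_by_domain_count(domain_count: int) -> str:
    """Return the next step for a workflow, given how many data domains it crosses."""
    if domain_count < 1:
        raise ValueError("a workflow must touch at least one data domain")
    if domain_count == 1:
        return "Branch 2"  # single domain: check data structure next
    if domain_count == 2:
        return "Branch 3"  # two domains: check entity resolution next
    return "Path C"        # three or more domains: full reset required


# A four-domain workflow, such as the revenue cycle case discussed in this section
print(route_by_domain_count(4))
```

Counting domains is the judgment call; the routing itself is mechanical once the count is agreed.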

Why domain count is the primary variable:

Nebraska Medicine’s revenue cycle automation (Palantir AIPCon 8, September 2025) crossed four domains — patient demographics, procedure codes, payer contracts, and physician credentials — each with a separate source system and a separate ownership function. The 10-hour build time that made headlines required a 6-month Ontology investment first. The Ontology is what resolved the four-domain problem. Without it, the AI had no coherent semantic layer to query.

SlickDeals’s deal-scoring deployment (AWS re:Invent 2025) operated within a single domain — user-generated deal content and community engagement signals in one system. No semantic integration was required. Infrastructure modernization (SQL Server → Databricks) was the investment. Result: 360x latency reduction, 7% merchant revenue increase.

Comparable AI capability. Same deployment era. Radically different data preparation requirements. The variable was domain count, not AI capability.

Source: research/07-adoption-challenges/ai-data-reset-decision-framework.md; research/01-ai-native-landscape/palantir-aipcon-enterprise-agentic-deployment-2026.md


Branch 2: Single-Domain Data Structure (for workflows passing Branch 1)

Q: Is the single-domain data accessible via API or direct query without manual extraction,
   with a consistent schema for the last 12+ months?

├── YES → PATH A: Proceed on Existing Architecture
│         (See Path A description below)
│
└── NO → What is the primary data structure problem?
    │
    ├── Inconsistent formats / partial schema documentation
    │   → PATH B: Lightweight Cleanup First (2–4 weeks)
    │
    ├── Manual export required / no API access
    │   → PATH B: Lightweight Cleanup First (2–4 weeks)
    │           [Infrastructure modernization is the primary work]
    │
    └── Missing 12+ months of history / data gaps >5%
        → PATH B: Lightweight Cleanup First (2–4 weeks)
                  [May extend to 6–8 weeks if historical backfill required]
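
Branch 2 can be expressed the same way. The sketch below is illustrative only: `StructureProblem`, `branch_2`, and the parameter names are hypothetical, and the returned strings simply restate the annotations in the tree above.

```python
from enum import Enum
from typing import Optional


class StructureProblem(Enum):
    INCONSISTENT_FORMATS = "inconsistent formats or partial schema documentation"
    MANUAL_EXPORT = "manual export required or no API access"
    HISTORY_GAPS = "missing history or data gaps above 5%"


def branch_2(queryable_without_manual_extraction: bool,
             schema_consistent_12_months: bool,
             problem: Optional[StructureProblem] = None) -> str:
    """Classify a single-domain workflow as Path A or Path B."""
    if queryable_without_manual_extraction and schema_consistent_12_months:
        return "Path A"
    if problem is None:
        raise ValueError("name the primary data structure problem")
    if problem is StructureProblem.HISTORY_GAPS:
        return "Path B (2-4 weeks; 6-8 if historical backfill is required)"
    if problem is StructureProblem.MANUAL_EXPORT:
        return "Path B (2-4 weeks; infrastructure modernization is the primary work)"
    return "Path B (2-4 weeks)"


print(branch_2(False, True, StructureProblem.MANUAL_EXPORT))
```

Every NO branch ends in Path B; the problem type changes only the expected duration and the kind of work involved.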

Branch 3: Two-Domain Workflows

Q: For the two domains this workflow crosses — does a documented
   entity resolution mapping already exist between them?
   (e.g., a customer ID that resolves across CRM and ERP)

├── YES → PATH B: Lightweight Cleanup First (2–4 weeks)
│         [Verify the mapping is current and audited; do not assume]
│
└── NO → Q: Is the entity resolution achievable in under 4 weeks
           with available internal resources?
    │
    ├── YES → PATH B: Lightweight Cleanup First (2–4 weeks)
    │         [4-week estimate is optimistic; plan for 6]
    │
    └── NO → PATH C: Full Reset Required if this workflow is the priority.
              OR: Scope workflow to one domain only and Proceed.
              [The scope-reduction alternative is often underused —
               many multi-domain workflows deliver 80%+ of their value
               within one domain. Evaluate this option before committing to Reset.]
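
A minimal Python sketch of Branch 3 follows. The function and parameter names are hypothetical, and the `prefer_scope_reduction` flag is an assumption: the tree leaves the choice between Reset and scope reduction to the reader, so it is modeled here as an explicit input.

```python
def branch_3(mapping_documented: bool,
             resolvable_in_under_4_weeks: bool,
             prefer_scope_reduction: bool = False) -> str:
    """Classify a two-domain workflow."""
    if mapping_documented:
        return "Path B (verify the mapping is current and audited)"
    if resolvable_in_under_4_weeks:
        return "Path B (4-week estimate is optimistic; plan for 6)"
    if prefer_scope_reduction:
        return "Scope to one domain and proceed"
    return "Path C"


# No mapping exists and it cannot be built quickly, but one domain carries most of the value
print(branch_3(False, False, prefer_scope_reduction=True))
```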

Source: research/07-adoption-challenges/ai-data-reset-decision-framework.md


The Three Paths

Path A — Proceed on Existing Architecture

When: Single domain, structurally coherent, accessible via API, schema documented and consistent.

What it means: The data architecture does not require remediation before AI deployment. Infrastructure modernization (real-time streaming, query optimization, observability tooling) may still be required, but the data model itself is sound.

Corpus anchors:

  • SlickDeals (single domain, real-time deal content): 360x latency reduction + 7% revenue gain with infrastructure modernization, no semantic layer (research/07-adoption-challenges/ai-data-reset-decision-framework.md)
  • TTEC call center (structured call logs, single system): 15–20% handle time reduction (research/04-consulting-firms/mckinsey-rewired-2nd-edition-synthesis.md, E.ON Next case)
  • High-volume transaction scoring workflows: when input is structured and output is a score, Path A is the default starting point

Timeline: 6–16 weeks to production (infrastructure + deployment + testing)

Budget: $70K–$400K for mid-market single-workflow initiative

Caveats:

  • Re-run Domain Count check before deployment if scope expands. A workflow that starts as single-domain but grows to include a second source system has moved to Branch 3.
  • Path A does not eliminate all data risk. Only 7% of enterprises have completely AI-ready data (Cloudera/HBR, March 2026). “Structurally coherent” is not the same as “clean.”

Path B — Lightweight Data Cleanup First (2–4 Weeks)

When: Single domain with schema inconsistencies, manual export requirements, or history gaps; or two domains where an entity resolution mapping already exists or can be built in under 4 weeks.

What it means: Targeted remediation of the specific data problems blocking this workflow, without rebuilding the broader data architecture. The investment is bounded.

Common Path B work:

  • API creation or direct query access to replace manual export
  • Schema documentation and normalization for one source system
  • Entity resolution mapping between two systems (customer ID alignment across CRM + ERP)
  • Historical backfill for 12–24 months of missing data
  • dbt data model for the specific tables this workflow queries

Timeline: 2–4 weeks for schema and API work; 6–8 weeks if historical backfill is required

Budget: $80K–$400K in additional preparation cost (on top of Path A deployment budget)
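
Because the Path B figure is quoted as additional to the Path A deployment budget, the all-in range is worth working out explicitly. The sketch below assumes the two ranges simply add, low end to low end and high end to high end. That additive treatment is an assumption for illustration; the constant and function names are hypothetical.

```python
PATH_A_DEPLOYMENT = (70_000, 400_000)   # Path A budget range, in USD
PATH_B_PREPARATION = (80_000, 400_000)  # Path B additional preparation cost, in USD


def combined_range(base, extra):
    """Add two (low, high) ranges end to end."""
    return (base[0] + extra[0], base[1] + extra[1])


low, high = combined_range(PATH_A_DEPLOYMENT, PATH_B_PREPARATION)
print(f"Path B all-in: ${low:,} to ${high:,}")  # Path B all-in: $150,000 to $800,000
```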

Corpus anchors:

  • Caylent/Teamfront (2025): 4 SQL clusters, 2,500+ stored procedures cleaned in 10 weeks (compressed from 40-week estimate) with AI-assisted work — 70% automated, 20% AI-assisted, 10% manual (research/07-adoption-challenges/data-cleaning-real-timelines-case-studies.md)
  • Realistic mid-market Silver-layer timeline (Medallion architecture): 3–6 months for the first domain. Path B covers only a subset of that work, which is why its timeline is shorter.

Caution: Path B estimate assumes the entity resolution between two domains is achievable. Vendor “48-hour medallion” claims (Nexla and similar) cover pipeline infrastructure, not entity resolution or business-rule definition — those require functional stakeholder time. Budget for 0.5 FTE of business-side involvement, not just technical resources.


Path C — Full Ontology or Schema Reset Before Deployment

When: Workflow crosses 3+ data domains; OR DRI (Data Readiness Index) score is below 0.50; OR entity resolution between even two domains would require more than 4 weeks.

What it means: A semantic integration layer — Ontology, canonical data model, data mesh implementation, or equivalent — must be built before the AI can operate reliably at production scale. This is not a technology choice; it is the structural prerequisite for multi-domain AI.
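
The three Path C triggers listed under "When" can be written as a single predicate. This is a sketch under stated assumptions: the function and constant names are hypothetical, and the DRI score is taken as an input because how it is computed is not covered here.

```python
from typing import Optional

DRI_THRESHOLD = 0.50             # Path C applies below this score
MAX_ENTITY_RESOLUTION_WEEKS = 4  # Path C applies above this estimate


def requires_reset(domain_count: int,
                   dri_score: Optional[float] = None,
                   entity_resolution_weeks: Optional[float] = None) -> bool:
    """Return True if any one of the three Path C triggers fires."""
    if domain_count >= 3:
        return True
    if dri_score is not None and dri_score < DRI_THRESHOLD:
        return True
    if (domain_count == 2
            and entity_resolution_weeks is not None
            and entity_resolution_weeks > MAX_ENTITY_RESOLUTION_WEEKS):
        return True
    return False


# Two domains, acceptable DRI, but entity resolution estimated at 10 weeks
print(requires_reset(2, dri_score=0.70, entity_resolution_weeks=10))  # True
```

The conditions are joined by OR, so a single trigger is enough to classify the workflow as Path C.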

Why this is Rewired’s central data argument:

Lamarre et al., 2024, Ch. 25 p.391: “No data architecture, no AI advantage.” The book’s framing is that without a governed, reusable data architecture, AI investments produce “recurring bespoke-integration costs and bounded outcomes.” Each use case requires its own integration work. The compounding effect of reusable data products (Ch. 26, p.401) — the mechanism behind Palantir’s 139% net dollar retention — only activates when the semantic layer is shared.

The mid-market adaptation: for organizations below the Fortune 500 band, Path C usually means purchasing a platform (Snowflake + Cortex, Databricks, or a domain-specific data product tool) rather than building a proprietary Ontology. The investment logic is identical; the build-vs-buy decision differs by scale.

Corpus anchors:

  • Nebraska Medicine: 6-month Palantir Ontology → 10-hour second use case (research/01-ai-native-landscape/palantir-aipcon-enterprise-agentic-deployment-2026.md)
  • Guardian Life: consolidated enterprise data + microservices architecture → RFP/quoting compressed from 5–7 days to 24 hours (research/01-ai-native-landscape/mit-cisr-scaling-ai-maturity-bottom-line-2026.md)
  • Italgas: IoT + data platform investment since 2017 → WorkOnSite 40% faster project completion (same source)

Timeline: 9–24 months for mid-market; 18–36 months for Fortune 500 with complex legacy estates

Budget: $450K–$2.3M total (mid-market, single-domain Reset with two workflows in scope); $2M–$15M+ for enterprise scope

Before committing to Path C, ask: Can the workflow be scoped to one domain and deliver 80% of the intended value? BCG’s concentration finding (focusing on 1–3 domains generates results; spreading effort across 100 use cases fails) applies here. A scoped single-domain Path A deployment now, funded by measurable returns, may be a better first step than a 24-month Reset that delays all value.


| Archetype | Description | Typical Domain Count | Expected Data Path | Notes |
| --- | --- | --- | --- | --- |
| Transactional | High-volume, structured input → score or classification. Examples: fraud scoring, deal ranking, approval routing | 1–2 | Path A or B | If data is in a single system, this is the fastest path to production ROI. SlickDeals and TTEC call center are the anchors. |
| Analytical | Cross-domain aggregation for decision support. Examples: demand forecasting, customer 360, supply chain risk | 2–4 | Path B or C | Two-domain analytical AI (CRM + ERP) often resolves in Path B. Three or more domains (customer + product + supplier + logistics) typically require Path C. |
| Generative | Document creation, synthesis, or summarization using enterprise knowledge. Examples: contract drafting, RFP response, clinical note generation | 1–3 | Path A, B, or C | Generative AI is unusually sensitive to unstructured data quality. Only 26% of organizations can use unstructured data in a way that delivers business value (IBM IBV CDO Survey, 2025). A single-domain generative use case (legal document review, for example) may be Path A; a multi-domain knowledge synthesis use case (patient history + clinical protocols + billing rules) requires Path C. |
| Agentic | Multi-step autonomous execution across systems. Examples: revenue cycle automation, multi-system order management, cross-platform incident resolution | 3–5+ | Path C (almost always) | Agentic AI surfaces every data quality problem simultaneously and at scale. Bain’s agentic governance framework (Foundation phase) requires data quality and observability infrastructure before orchestration begins. Attempting Path A for an agentic workflow produces the Data Mirage failure pattern at production scale. The Nebraska Medicine case is the anchor: Path C first, then agentic deployment. |
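
One practical use of the archetype mapping is as a sanity check on a decision-tree result. The lookup below is a hypothetical sketch: the dictionary restates the Expected Data Path column, and `is_expected` is an illustrative helper, not part of the framework.

```python
ARCHETYPE_PATHS = {
    "transactional": ("A", "B"),
    "analytical": ("B", "C"),
    "generative": ("A", "B", "C"),
    "agentic": ("C",),  # "almost always" Path C
}


def is_expected(archetype: str, classified_path: str) -> bool:
    """Check whether a classified path matches what the archetype usually needs."""
    key = archetype.strip().lower()
    if key not in ARCHETYPE_PATHS:
        raise KeyError(f"unknown archetype: {archetype!r}")
    return classified_path.strip().upper() in ARCHETYPE_PATHS[key]


# An agentic workflow classified as Path A deserves a second look
print(is_expected("Agentic", "A"))  # False
```

A mismatch does not prove the classification is wrong, but it is a prompt to re-run the domain count before committing budget.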

The “Scope Reduction” Alternative

Before committing to Path C, evaluate this option seriously: can the workflow’s scope be reduced to a single domain while delivering 80%+ of the intended value?

This is not a consolation prize. It is a deliberate sequencing strategy — the same one BCG recommends for funding AI transformation programs (bank early wins in a proven-deployment workflow, then fund Year Two end-to-end redesign with those returns).

Worked example: A healthcare system wants to automate prior authorization (patient + procedure + payer + physician = Path C, 12–18 months). Scope reduction: automate only the structured prior-auth lookup for a single payer contract (single domain, Path A, 8 weeks). That deployment generates measurable savings and builds the organizational confidence — and partial data infrastructure — that makes the full four-domain Path C faster and more fundable.

This compounding logic is the practical expression of Rewired’s Ch. 31 p.469 (“the best use case is the reuse case”): design the first deployment so its infrastructure serves as the foundation for the second.


What This Means for Your Organization

The decision tree above gives you a 15-minute workflow classification. The harder question is whether your organization has the discipline to stop at a Path C classification and do the foundational work, rather than proceeding on a Path A assumption and encountering the data problem six months into production.

The corpus answer is sobering: only 7% of enterprises say their data is completely ready for AI (Cloudera/HBR, March 2026). Most organizations are running Path A deployments against Path C data realities. That mismatch is where the Gartner 60%-abandonment figure lives.

The data reset decision is a business decision, not a technical one. It determines whether the AI budget produces compounding returns or recurring remediation costs. If you’d like to work through the classification for a specific workflow in your environment, the 30-minute assessment at research/09-ai-adoption-cycle/ai-workflow-readiness-30-minute-assessment.md runs the full diagnostic — or I’d welcome a direct conversation: brandon@brandonsneider.com.



Brandon Sneider | brandon@brandonsneider.com | April 2026