Executive Summary
- Only 7% of enterprises report their data is completely ready for AI adoption. Another 27% say theirs is “not very or not at all ready” (Cloudera/Harvard Business Review Analytic Services, n=230+ data decision-makers, October 2025). Gartner predicts 60% of AI projects will be abandoned through 2026 because the data underneath them cannot support the use case (Gartner, Q3 2024 survey of n=248 data management leaders, February 2025).
- Data readiness is the most predictable pilot failure. Every AI vendor demo runs on clean, curated sample data. The company’s actual data — spread across an average of 96 SaaS applications at a mid-market firm — looks nothing like it. The gap between the demo and reality is where pilots die.
- The self-assessment below takes 10 minutes, requires no technical expertise, and surfaces the five data problems that most often stall AI projects before they produce a single result. A CIO or COO who runs through these five questions with the IT lead before the first vendor meeting can avoid the 3-6 months of delay that data problems otherwise add when they surface mid-implementation.
- The companies that capture value from AI — the 5% that scale past pilot — do not have perfect data. They know exactly where their data gaps are before they start, and they scope their first use case to data that already works.
The Five Questions
Print this page. Sit with the IT lead and the business owner of the proposed AI use case. Answer each question for the specific process the pilot will target — not for the organization overall. A company-wide data audit is a six-figure project. This is a 10-minute diagnostic for one use case.
Question 1: Where Does the Data for This Use Case Live?
What to ask: For the task AI will perform, which systems hold the data it needs? Is it in one system or spread across several? Can the person who manages that system point to the exact tables, fields, or exports?
What the answer reveals: If nobody can identify which systems contain the relevant data within five minutes, the data is either undocumented, scattered across personal spreadsheets and email threads, or trapped in a system nobody fully understands. Any of these conditions adds 80-160 hours of data preparation before the AI tool touches a single record (IBM/Gartner, 2025).
| Answer | What It Means |
|---|---|
| Data lives in one system with a clear owner | Pilot-ready. Proceed to Question 2. |
| Data lives in 2-3 systems that already share data | Viable, but confirm the integration is reliable and current. |
| Data lives in 4+ systems with no existing integration | The pilot scope is too broad. Narrow to the subset in a single system, or budget 4-8 weeks for data consolidation. |
| Nobody can identify which systems hold the data | Stop. This is a data inventory problem, not an AI problem. Start with the shadow AI discovery worksheet and a systems audit. |
Why this matters: The typical mid-market company runs 96 SaaS applications (BetterCloud, 2024). Only 29% of enterprise applications are integrated (MuleSoft 2025 Connectivity Benchmark). The AI vendor will not tell you this in the demo. They will show you the tool working on integrated data and let you assume yours looks the same.
Question 2: Is the Data Complete Enough to Act On?
What to ask: Pull a sample of 50-100 records from the dataset the AI will use. What percentage of fields are populated? Are there obvious gaps — missing dates, blank customer names, incomplete transaction records? Are the fields that matter most for the AI use case consistently filled?
What the answer reveals: AI models trained or prompted on incomplete data produce incomplete outputs. A customer service AI that cannot see purchase history for 30% of customers will hallucinate or fail silently on those interactions. The Atlan State of Enterprise Data and AI report (n=561 data professionals, 2025) finds 97% of organizations face “context gap” issues — misaligned definitions, poor data reliability, or insufficient business context. The average organization reports 2.69 distinct context gaps.
| Answer | What It Means |
|---|---|
| 90%+ of critical fields are populated consistently | Pilot-ready on completeness. Proceed to Question 3. |
| 70-89% populated, with known gaps in specific areas | Viable if the AI use case can tolerate those gaps. Identify which missing fields would cause failures and fill them first. |
| Below 70%, or the team cannot easily pull a sample | The data needs cleanup before AI adds value. Budget 80-160 hours for the data preparation sprint. |
Why this matters: Organizations lose $12.9 million per year on average from poor data quality (Gartner, 2025). AI amplifies this cost because it processes bad data faster and at higher volume; without high-quality data, it multiplies mistakes rather than insight (World Economic Forum, January 2026).
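For teams that want to make the sample check repeatable, the following is a minimal Python sketch of the completeness calculation. It is an illustration, not a prescribed method. The field names and the four-record sample are hypothetical, and it assumes one reading of the table: the weakest critical field decides the tier. A real check would load the 50-100 exported records instead of the inline sample.

```python
import csv
import io


def field_completeness(rows, critical_fields):
    """Return the share of populated values (0.0 to 1.0) for each critical field."""
    if not rows:
        raise ValueError("sample is empty")
    rates = {}
    for field in critical_fields:
        # A value counts as populated only if it is non-empty after trimming whitespace.
        filled = sum(1 for row in rows if (row.get(field) or "").strip())
        rates[field] = filled / len(rows)
    return rates


def completeness_tier(rates):
    """Classify the sample with the 90% and 70% thresholds from the table.

    Assumption: the least-populated critical field decides the tier.
    """
    worst = min(rates.values())
    if worst >= 0.90:
        return "pilot-ready"
    if worst >= 0.70:
        return "viable with known gaps"
    return "needs cleanup"


# Hypothetical sample; field names are illustrative only.
SAMPLE_CSV = """customer_name,order_date,purchase_total
Acme Corp,2025-01-14,1200
,2025-02-02,450
Birch LLC,,300
Cedar Inc,2025-03-09,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = field_completeness(rows, ["customer_name", "order_date", "purchase_total"])
print(rates)
print(completeness_tier(rates))
```

Reporting the rate per field, rather than one overall percentage, shows which specific gaps would cause failures for the use case.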
Question 3: Who Owns Data Quality?
What to ask: When a customer record is wrong, who fixes it? When two systems disagree on the same data point, who decides which is correct? Is there a documented process, or does it depend on whoever notices first?
What the answer reveals: Data quality ownership is the single best predictor of AI readiness. Cloudera/HBR found that 44% of organizations cite “lack of clear data strategy” as a top obstacle to AI data preparation — second only to data silos. Organizations where data quality is everyone’s responsibility effectively have no one responsible.
| Answer | What It Means |
|---|---|
| A named person or team owns data quality with documented standards | Strong foundation. This is rare — only 15% of organizations report mature data governance (DATAVERSITY, 2025). |
| IT fixes data issues when they surface, but there is no formal owner | Common at mid-market companies. Assign a data quality owner for the pilot scope before launch. This does not require a new hire; it requires naming the person already doing the work informally. |
| Nobody owns it, or “everyone” owns it | High risk. Data quality will degrade faster than AI processes it. Fix this before spending on AI tools. |
Why this matters: Organizations with mature data governance reduce AI implementation costs 20-35% and accelerate time-to-value by 40-60% (Atlan, 2025). The governance does not need to be enterprise-wide before the first pilot. It needs to exist for the data the pilot touches.
Question 4: What Is the Cleanup Cost Before AI Adds Value?
What to ask: Based on the completeness check in Question 2, estimate the work required to bring this dataset to pilot-ready condition. Can IT staff do it in existing hours, or does it require a dedicated sprint? How many hours? Is there a backlog of known data problems that have been deferred?
What the answer reveals: Data preparation costs routinely exceed the AI implementation itself. Hidden data preparation, infrastructure, and integration expenses add 35-50% to initial AI budgets (Azumo/AI Smart Ventures, 2026). A standard enterprise AI deployment takes 16-28 weeks, and data infrastructure readiness is the biggest timeline risk — adding 3-6 months when organizations discover quality problems mid-implementation.
| Answer | What It Means |
|---|---|
| Data is clean enough for the pilot scope; cleanup is under 40 hours | Proceed. Factor the hours into the pilot timeline. |
| Cleanup requires 40-160 hours of dedicated work | Viable, but this is a pre-pilot project, not part of the pilot. Budget it separately. Timeline impact: 2-6 weeks before AI work begins. |
| Cleanup exceeds 160 hours, or the team cannot estimate the scope | The data problem is larger than the AI opportunity. Consider whether a data foundation project should precede the AI pilot entirely. |
Why this matters: Data teams spend an estimated 60% of their time on cleaning and validation rather than building models or features (DataStackHub, 2025). At a mid-market company with 3-10 IT staff, those hours come directly from other priorities. The IT capacity card addresses the trade-off; this question surfaces the cost before the pilot starts.
Question 5: Is the Data You Need Exportable from the System That Holds It?
What to ask: Can you export the pilot’s data in a standard format (CSV or JSON) or pull it through an API without contacting the vendor? Are there restrictions on bulk exports, API rate limits, or contractual limitations on data portability? Has anyone tried?
What the answer reveals: Many SaaS platforms provide limited export functionality: restrictions on historical data, missing metadata, or incomplete relationship data between entities. The EU Data Act (effective September 2025) targets vendor lock-in for European data, but American mid-market companies often discover export limitations only when they need the data for a new purpose. Constellation Research notes that enterprise data tolls and API economics are an emerging cost center, with vendors increasingly restricting or pricing access to data their customers created.
| Answer | What It Means |
|---|---|
| Full API access or standard export available; tested and confirmed | Pilot-ready on portability. |
| Export exists but has not been tested, or covers only partial data | Test it now. Export the pilot dataset before signing the AI contract. Discovering limitations after the pilot launches wastes the most expensive resource: time. |
| No export capability, or the vendor contract restricts data portability | This is a procurement problem, not an AI problem. Review the contract (reference the AI vendor contract red lines card) and resolve portability before proceeding. |
Why this matters: Companies with strong data integration achieve 10.3x ROI from AI initiatives versus 3.7x for those with poor connectivity (MuleSoft, 2025). The integration starts with the ability to move data between systems — and that ability is not guaranteed.
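One way to run the export test from the table above is to check the exported file against what the pilot needs. The sketch below is a hedged illustration in Python. The column names, the sample export, and the `check_export` helper are all hypothetical, and it assumes the export arrives as CSV text. It checks only for missing columns and a minimum row count, so it does not detect missing history or broken relationships between entities.

```python
import csv
import io


def check_export(csv_text, required_columns, expected_min_rows):
    """Report missing columns and whether the export meets the expected row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    missing = sorted(set(required_columns) - present)
    row_count = sum(1 for _ in reader)  # counts data rows, excluding the header
    return {
        "missing_columns": missing,
        "row_count": row_count,
        "complete": not missing and row_count >= expected_min_rows,
    }


# Hypothetical export from a ticketing system; names are illustrative only.
EXPORT_CSV = """ticket_id,customer_id,created_at
1001,C-17,2025-04-02
1002,C-42,2025-04-03
"""

result = check_export(
    EXPORT_CSV,
    required_columns=["ticket_id", "customer_id", "created_at", "resolution_notes"],
    expected_min_rows=2,
)
print(result)
```

In this example the export has enough rows but lacks a field the pilot needs, which corresponds to the "covers only partial data" row of the table.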
Scoring: Where Do You Stand?
Count the questions where your answer falls in the top row of its table (the most favorable answer, labeled “pilot-ready” or its equivalent):
| Score | Assessment |
|---|---|
| 5 of 5 pilot-ready | Rare. The data foundation supports a well-scoped pilot. Move to the pilot structure card and success metrics card. |
| 3-4 of 5 pilot-ready | Typical for companies that succeed. Address the 1-2 gaps as a pre-pilot sprint (2-6 weeks). Do not let the gaps delay the decision to proceed — let them define the preparation timeline. |
| 1-2 of 5 pilot-ready | The data foundation needs work before AI tools add value. This is not a reason to abandon AI — it is a reason to invest in data infrastructure first. The companies in the 5% start here, not with a vendor demo. |
| 0 of 5 pilot-ready | Invest in data governance before AI. The data governance prerequisite research document provides the framework for minimum viable data governance. |
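The scoring rule can also be recorded in a few lines of Python so that results from several candidate use cases are comparable. This is a minimal sketch: the question keys and the `assess_readiness` helper are hypothetical names, and the guidance strings are abbreviated paraphrases of the table rows.

```python
def assess_readiness(answers):
    """Return (score, guidance) for a dict of question -> bool.

    True means the answer fell in the top row of that question's table.
    """
    if len(answers) != 5:
        raise ValueError("expected answers for exactly five questions")
    score = sum(1 for ready in answers.values() if ready)
    if score == 5:
        guidance = "Data foundation supports a well-scoped pilot."
    elif score >= 3:
        guidance = "Address the gaps in a pre-pilot sprint (2-6 weeks)."
    elif score >= 1:
        guidance = "Invest in data infrastructure before AI tools."
    else:
        guidance = "Invest in data governance before AI."
    return score, guidance


# Hypothetical answers for one proposed use case.
answers = {
    "location": True,
    "completeness": True,
    "ownership": False,
    "cleanup_cost": True,
    "exportability": False,
}
score, guidance = assess_readiness(answers)
print(f"{score} of 5 pilot-ready: {guidance}")
```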
Key Data Points
| Statistic | Source | Credibility |
|---|---|---|
| 7% of enterprises report data completely ready for AI | Cloudera/HBR Analytic Services, n=230+, October 2025 | High — independent survey of data decision-makers by HBR |
| 60% of AI projects will be abandoned through 2026 due to data issues | Gartner, n=248 data management leaders, Q3 2024 survey, February 2025 prediction | High — Gartner analyst prediction based on primary survey data |
| 97% of organizations face data context gap issues | Atlan State of Enterprise Data & AI, n=561, 2025 | Medium-high — large sample, industry-specific respondent base |
| $12.9M average annual cost of poor data quality | Gartner, 2025 | High — widely cited, consistent across multiple Gartner analyses |
| 96 SaaS apps average at companies with 200-749 employees | BetterCloud, 2024 | Medium-high — vendor survey but consistent with independent estimates |
| 29% of enterprise applications integrated | MuleSoft 2025 Connectivity Benchmark | Medium — vendor-funded, but largest dataset on this metric |
| 35-50% hidden cost added by data preparation to AI budgets | Multiple implementation firms, 2025-2026 | Medium — practitioner consensus, not a single study |
| Data infrastructure problems add 3-6 months to AI deployment | Azumo/AI Smart Ventures implementation data, 2026 | Medium — practitioner estimate, consistent with Gartner timelines |
| 15% of organizations report mature data governance | DATAVERSITY, 2025 | Medium-high — independent survey, governance-focused |
What This Means for Your Organization
The executives who succeed with AI are not the ones with perfect data. They are the ones who know exactly what their data can and cannot support before they commit budget, timeline, and organizational attention to a pilot.
This self-assessment takes 10 minutes. The alternative — discovering data problems 8-12 weeks into a pilot — costs 3-6 months, burns IT capacity, and creates the organizational skepticism that kills the second pilot before it starts. Gartner’s prediction that 60% of AI projects will be abandoned for data reasons is not about companies with bad data. It is about companies that did not check before they started.
Run through these five questions with the IT lead and the business owner of the proposed use case. If the score is 3 or higher, scope the pilot to the data that is ready and build the preparation sprint for the data that is not. If the score is below 3, the highest-return investment is not an AI tool — it is the data foundation that makes every future AI investment viable.
If this raised questions specific to your data environment — particularly around scoping the first use case to what the data can actually support — I’d welcome the conversation: brandon@brandonsneider.com
Sources
- Cloudera and Harvard Business Review Analytic Services, “Only 7% of Enterprises Say Their Data Is Completely Ready for AI,” n=230+ data decision-makers, October 2025. https://www.cloudera.com/about/news-and-blogs/press-releases/2026-03-05-only-7-percent-of-enterprises-say-their-data-is-completely-ready-for-ai-according-to-new-report-from-cloudera-and-harvard-business-review-analytic-services-reveals.html (Independent survey by HBR Analytic Services; high credibility.)
- Gartner, “Lack of AI-Ready Data Puts AI Projects at Risk,” Q3 2024 survey of n=248 data management leaders, February 2025. https://www.gartner.com/en/newsroom/press-releases/2025-02-26-lack-of-ai-ready-data-puts-ai-projects-at-risk (Gartner analyst prediction based on primary survey; high credibility.)
- Atlan, “The State of Enterprise Data and AI 2025,” n=561 data professionals across 50+ industries. https://atlan.com/know/state-of-enterprise-data-ai-2025/ (Large independent survey; medium-high credibility.)
- World Economic Forum, “Why Data Readiness Is a Strategic Imperative for Businesses,” January 2026. https://www.weforum.org/stories/2026/01/why-data-readiness-is-now-a-strategic-imperative-for-businesses/ (WEF research synthesis; high credibility for framing.)
- BetterCloud, “The Big List of 2026 SaaS Statistics.” https://www.bettercloud.com/monitor/saas-statistics/ (Vendor research but consistent with independent benchmarks; medium-high credibility.)
- MuleSoft, “2025 Connectivity Benchmark.” Referenced via CMSWire analysis. https://www.cmswire.com/digital-experience/want-real-ai-impact-in-digital-experience-fix-your-data-silos/ (Vendor-funded benchmark; medium credibility but largest dataset on integration metrics.)
- Gartner, “Poor Data Quality Costs Organizations $12.9 Million per Year,” 2025. Referenced via LinkedIn and IBM analysis. https://www.ibm.com/think/insights/cost-of-poor-data-quality (Gartner primary research; high credibility.)
- DATAVERSITY, “Data Governance Survey,” 2025. Referenced via mid-market data governance analysis. (Independent practitioner survey; medium-high credibility.)
- Azumo/AI Smart Ventures, “AI Implementation Cost Guides,” 2026. https://azumo.com/artificial-intelligence/ai-insights/ai-development-cost (Practitioner estimates from implementation firms; medium credibility.)
- Constellation Research, “Enterprise Technology 2026: 15 Trends to Watch.” https://www.constellationr.com/blog-news/insights/enterprise-technology-2026-15-ai-saas-data-business-trends-watch (Independent analyst firm; high credibility.)
Brandon Sneider | brandon@brandonsneider.com | March 2026