Anthropic's Economic Index: What 1.1 Million Claude Conversations Say About AI Productivity

See also (wiki): productivity-rcts · workflow-redesign · assistive-to-agentic-shift


Executive Summary

  • Anthropic’s November 2025 analysis of 100,000 Claude conversations estimates an average ~80% task-time reduction (median 84%). Projected out, that would add ~1.8% annually to US labor productivity — doubling the post-2019 trend — if universally adopted over a decade.
  • The March 2026 “Learning Curves” follow-up (1 million conversations, Feb 2026) shows usage diversifying: top 10 tasks fell from 24% to 19% of conversations, personal use rose to 42%, and high-tenure users show 3–4 percentage points higher task success after controls.
  • The March 2026 “Labor Market Impacts” paper finds no systematic unemployment increase for exposed workers since late 2022 — but a 14% drop in job-finding rates for 22–25-year-olds entering exposed occupations post-ChatGPT.
  • These are vendor-published findings with real methodological caveats: conversations that failed or were abandoned don’t show up, Claude self-reports the time estimates, and organizational friction (review, rework, coordination) is not modeled. Cross-reference against METR’s RCT (19% slower, n=16 experienced developers, July 2025), CMU’s 40.7% code complexity increase, and Atlan’s 200-deployment analysis (median +159.8% ROI, conditional on workflow redesign coming first).
  • The actionable takeaway is not the headline number. It is the shape of the curve: experience compounds, automation is migrating from chat UIs to API/agent architectures, and the bottleneck shifts from task speed to the tasks AI doesn’t accelerate.

The November 2025 Productivity Paper

Alex Tamkin and Peter McCrory analyzed 100,000 conversations from Claude.ai Free, Pro, and Max tiers using Anthropic’s privacy-preserving CLIO system, classifying tasks via the O*NET occupational taxonomy.

Headline findings:

  • Average time reduction: ~80%. Median: 84%. The distribution concentrates between 50% and 95%, peaking at 80–90%.
  • Average task length without AI: 1.4 hours (84 minutes). Median labor value per conversation: $54.
  • Occupation-weighted aggregate projection: 1.8% annual US labor productivity increase over the next decade under universal adoption — approximately doubling the 2019–present trend.

The top five occupations accounting for that aggregate gain: software developers (19%), general/operations managers (6%), market research analysts (5%), customer service representatives (4%), and secondary school teachers (3%).
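To get a feel for the scale of the projection, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes the 1.8% figure behaves as a constant additional growth rate that compounds each year, which is a simplification rather than something the paper states, and the function name is hypothetical. The occupation shares are the five quoted above.

```python
def cumulative_gain(annual_rate: float, years: int) -> float:
    """Total fractional gain after compounding annual_rate for `years` years."""
    return (1 + annual_rate) ** years - 1


# Assumption: +1.8% per year, held constant for a full decade.
decade_uplift = cumulative_gain(0.018, 10)

# Shares of the aggregate gain for the five occupations quoted in the text.
top_five_shares = {
    "software developers": 0.19,
    "general/operations managers": 0.06,
    "market research analysts": 0.05,
    "customer service representatives": 0.04,
    "secondary school teachers": 0.03,
}
top_five_total = sum(top_five_shares.values())

print(f"Compounded uplift over 10 years: {decade_uplift:.1%}")
print(f"Top five occupations' combined share: {top_five_total:.0%}")
```

Under these assumptions the uplift compounds to a little under 20% over the decade, and the five named occupations account for roughly 37% of it, so most of the projected gain sits in the long tail of other occupations.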

What to trust, what to discount. Anthropic publishes credibility notes the corpus should take seriously. Validation against JIRA ticket estimates shows Claude’s Spearman correlation with actual developer time is ρ=0.44, close to the human developer baseline of ρ=0.50 — but Claude compresses estimates (overestimates short tasks, underestimates long ones). The paper acknowledges six limitations: the model has no view into post-conversation refinement, O*NET doesn’t capture tacit knowledge, quality of AI vs. human output is not compared, organizational restructuring is unmodeled, innovation effects are unmodeled, and the sample is Claude.ai users — who self-select into tasks they expect AI to help with.
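One point worth making concrete: Spearman's ρ scores rank order, not calibration, so an estimator that compresses its estimates can still correlate well. The sketch below is a plain-Python implementation of the no-ties formula. The task durations are hypothetical and are not taken from the paper's JIRA validation set.

```python
def ranks(values):
    """Rank each value, 1 = smallest. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)), no ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical task durations in hours (illustrative only).
actual = [0.5, 1.0, 2.0, 4.0, 8.0]
# Compressed estimates: short tasks overestimated, long tasks underestimated,
# but the ordering of tasks is preserved.
compressed = [1.0, 1.5, 2.0, 3.0, 4.0]
# Estimates with two adjacent pairs swapped out of order.
misordered = [1.5, 1.0, 2.0, 4.0, 3.0]

print(spearman(actual, compressed))
print(spearman(actual, misordered))
```

The compressed estimates score a perfect 1.0 despite being off by a factor of two at both ends, while the misordered ones score 0.8. A ρ near the human baseline therefore says Claude orders tasks by duration about as well as developers do; it says nothing about whether the absolute time estimates are right.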

Translation: the 80% figure is the upper bound of what the tool saw, not the lower bound of what the organization captured.

The March 2026 Learning Curves Paper

Massenkoff, Lyubich, McCrory, Appel, and Heller analyzed one million conversations from February 5–12, 2026 across Claude.ai and the first-party API. The three-month comparison (Nov 2025 → Feb 2026) shows a platform in motion.

| Shift | Nov 2025 | Feb 2026 |
| --- | --- | --- |
| Top 10 tasks share | 24% | 19% |
| Personal use share (Claude.ai) | 35% | 42% |
| Coursework share | 19% | 12% |
| Top 20 countries per-capita share | 45% | 48% |
| Top 5 US states’ share | 30% | 24% |

Two findings matter for organizations planning AI rollouts:

Experience compounds. Users with 6+ months of tenure show 10% higher raw success rates, and a 3–4 percentage point advantage remains after controlling for task type. The education level of prompts rises approximately one grade-year per year of Claude usage. The implication: early-access cohorts will outperform late cohorts even on identical tools, which reinforces the BCG finding that the 5-hour training threshold separates regular users (79%) from occasional ones (67%).

Automation has migrated to the API. Coding moved off Claude.ai and into API-based agent workflows (Claude Code is the canonical example). Business sales outreach automation and automated trading both doubled in frequency. Augmentation grew on the chat UI; automation grew on the API. Any organization still measuring AI adoption by seat counts on a chat interface is measuring the wrong surface.

Model selection also tracks task value: for every $10 increase in hourly task wage, Opus usage rises 1.5 percentage points on Claude.ai and 2.8 percentage points on the API. Opus is selected 55% of the time for Computer/Math tasks versus 45% for educational ones. Procurement teams assuming a single-model license will underprice agentic workloads.
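A rough way to turn those slopes into a planning number is sketched below. It assumes the relationship is linear across the wage range of interest, which the summary figures alone do not guarantee; the function and dictionary names are hypothetical.

```python
# Percentage-point rise in Opus share per $10 of hourly task wage (from the text).
PP_PER_10_DOLLARS = {"claude_ai": 1.5, "api": 2.8}


def opus_share_shift(wage_delta_dollars: float, surface: str) -> float:
    """Expected change in Opus share, in percentage points, assuming linearity."""
    return wage_delta_dollars / 10 * PP_PER_10_DOLLARS[surface]


for surface in PP_PER_10_DOLLARS:
    shift = opus_share_shift(40, surface)
    print(f"{surface}: +{shift:.1f} pp Opus share for a $40/hour higher task wage")
```

On this reading, a workload whose tasks are worth $40 per hour more would shift about 6 points toward Opus on Claude.ai and about 11 points on the API, which is the gap a single-model pricing assumption would miss.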

The March 2026 Labor Market Paper

Massenkoff and McCrory introduce an “Observed Exposure” metric combining O*NET task data, Claude usage, and Eloundou et al.'s (2023) theoretical capability assessments. It asks: of the tasks LLMs could theoretically accelerate, which ones are actually seeing automated usage?

  • Computer & Math occupations: 33% observed coverage versus 94% theoretical capability — a 61-point gap between what AI can do and what workers are using it for.
  • Most exposed occupations: Computer Programmers (75%), Customer Service Representatives, Data Entry Keyers (67%).
  • 30% of workers have zero exposure.
  • Top-quartile exposed workers earn 47% more than unexposed workers and are 16 percentage points more likely to be female; 17.4% of them hold graduate degrees (vs. 4.5% of unexposed workers).

Two findings executives should not miss:

No aggregate unemployment effect. Differential unemployment changes between exposed and unexposed workers since late 2022 are “indistinguishable from zero.” The authors explicitly reference the offshoring forecasts of the 2000s that predicted “a quarter of US jobs” displaced, which never materialized in the employment data.

A clear cohort effect at the entry level. Job-finding rates for ages 22–25 entering exposed occupations dropped 14% after ChatGPT, with entry rates down roughly 0.5 percentage points in high-exposure roles. Senior workers are augmented; juniors are crowded out of the roles that used to train them. This is the first credible early-career labor market signal in the corpus.

Key Data Points

| Metric | Value | Source | Date | Tier |
| --- | --- | --- | --- | --- |
| Average task time reduction | ~80% | Anthropic 100K conversation study | Nov 2025 | Tier 1 |
| Median task time reduction | 84% | Anthropic 100K conversation study | Nov 2025 | Tier 1 |
| Projected annual US productivity gain | 1.8% over 10 years | Anthropic projection | Nov 2025 | Tier 1 |
| Share of aggregate gain from software devs | 19% | Anthropic | Nov 2025 | Tier 1 |
| Top 10 tasks share change | 24% → 19% | Anthropic Learning Curves, n=1M | Mar 2026 | Tier 1 |
| High-tenure user success advantage | +3–4pp after controls | Anthropic Learning Curves | Mar 2026 | Tier 1 |
| Observed vs. theoretical exposure, Computer/Math | 33% vs. 94% | Anthropic Labor Market | Mar 2026 | Tier 1 |
| Drop in job-finding rate, 22–25 in exposed roles | –14% post-ChatGPT | Anthropic Labor Market | Mar 2026 | Tier 1 |
| METR RCT: experienced developer speed impact | –19% (slower) | METR, n=16, 246 tasks | Jul 2025 | Tier 1 |
| Atlan 200-deployment median ROI | +159.8% (after workflow redesign) | Atlan | 2025 | Tier 2 |

What This Means for Your Organization

Three decisions follow from this evidence.

First, treat the 80% time-savings figure as the ceiling, not the floor. Anthropic measures what happens inside a successful conversation. Your P&L measures what happens after the conversation — review cycles, rework, coordination, the tasks AI does not accelerate. The gap between the two is the work. Organizations that capture real productivity gains redesign the downstream steps (review, approval, handoff) before they scale the upstream tool. Those that skip this land in the Faros pattern: 98% more output, same throughput.
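The ceiling-versus-floor point is essentially Amdahl's law applied to a workflow. The sketch below assumes a hypothetical split in which only part of the end-to-end work is the kind of task Claude accelerates. The 30% share is an illustrative input, not a figure from any of the papers, and the function names are made up for this example.

```python
def end_to_end_time_saved(accelerated_share: float, task_time_reduction: float) -> float:
    """Fraction of total workflow time saved when only part of it is sped up."""
    return accelerated_share * task_time_reduction


def throughput_multiplier(accelerated_share: float, task_time_reduction: float) -> float:
    """Amdahl-style speedup: original time divided by the time that remains."""
    remaining = 1 - end_to_end_time_saved(accelerated_share, task_time_reduction)
    return 1 / remaining


TASK_TIME_REDUCTION = 0.80  # the headline ~80% figure from the Nov 2025 paper

# Hypothetical: 30% of the workflow is AI-accelerated; the rest is review,
# rework, and coordination that runs at the old speed.
partial = throughput_multiplier(0.30, TASK_TIME_REDUCTION)

# Upper bound: every minute of the workflow is accelerated.
ceiling = throughput_multiplier(1.00, TASK_TIME_REDUCTION)

print(f"30% of workflow accelerated: {partial:.2f}x throughput")
print(f"100% of workflow accelerated: {ceiling:.2f}x throughput")
```

With those inputs, an 80% task-level reduction translates to 24% of end-to-end time saved, or roughly 1.3x throughput, against a 5x ceiling if everything were accelerated. Raising the accelerated share through workflow redesign moves the result far more than squeezing the task-level number further.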

Second, sequence your rollout around the learning curve. The 3–4 percentage point experience advantage compounds over months. Fast-cycle a pilot cohort through real work before the broader rollout, and budget for the 5-hour training threshold that separates regulars from occasional users. The enterprises capturing aggregate gains are not the ones with the most seats — they are the ones with the most users past month six.

Third, plan for the hiring dynamic, not just the productivity one. The 14% drop in entry-level job-finding in exposed occupations is the most important finding in the March 2026 labor paper. If you hire junior analysts, paralegals, support reps, or early-career engineers, the roles that used to develop them into senior contributors are the ones AI is reshaping first. The question is not whether to hire fewer juniors. It is how to redesign the early-career path so your senior bench exists in five years.

If this raised questions specific to your organization — particularly around sequencing the rollout or rethinking the junior hiring pipeline — I’d welcome the conversation. brandon@brandonsneider.com.

Sources

Primary (Tier 1 — Q4 2025 and later, cite directly):

  • Tamkin, A. & McCrory, P. “Estimating AI Productivity Gains from Claude Conversations.” Anthropic Economic Research, November 25, 2025. https://www.anthropic.com/research/estimating-productivity-gains. n=100,000 Claude.ai conversations analyzed via CLIO. Credibility: MEDIUM-HIGH for methodology disclosure; LOW for selection bias (vendor-published, successful conversations only, Claude.ai users self-select into favorable tasks). Apply the standard vendor-case-study caveat.
  • Massenkoff, M., Lyubich, E., McCrory, P., Appel, R., Heller, R. “Anthropic Economic Index: Learning Curves.” March 24, 2026. https://www.anthropic.com/research/economic-index-march-2026-report. n=1,000,000 conversations, sample period Feb 5–12, 2026. Credibility: MEDIUM-HIGH; same vendor caveats apply, but larger sample and longitudinal comparison add weight.
  • Massenkoff, M. & McCrory, P. “Labor Market Impacts of AI.” Anthropic, March 5, 2026. https://www.anthropic.com/research/labor-market-impacts. Credibility: HIGH for macro analysis (uses BLS and public labor data, not Claude-specific), MEDIUM for “Observed Exposure” construct (depends on Claude usage as a proxy for AI usage overall).

Cross-reference sources (not re-cited here, in corpus):

  • METR. “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.” July 2025. n=16 developers, 246 tasks. Finds 19% slower.
  • Atlan. 200-deployment analysis. 2025. Median +159.8% ROI conditional on workflow redesign.
  • BCG. “AI at Work 2025.” June 2025. n=10,635. The 5-hour training threshold.
  • Eloundou et al. “GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models.” 2023. Theoretical capability baseline used by Anthropic’s Labor Market paper.

Brandon Sneider | brandon@brandonsneider.com April 2026