Executive Summary
- Human-in-the-loop (HITL) design — the worker approves AI outputs before they take effect — is typically sold as a safety control. The evidence says it is also the most reliable adoption mechanism available, especially for the workforce segment most at risk of rejecting AI outright.
- The control-based acceptance finding is the single most-replicated result in the algorithm-decision literature. Dietvorst, Simmons & Massey (Management Science, 2016) show across four studies that people will use an imperfect algorithm when given even trivial modification rights (for example, adjustments capped at 10 percentage points), and that they will pay in accuracy for that control.
- Aversion is highest exactly where enterprises most want to deploy AI: in serious-consequence white-collar judgment work. Filiz et al. (2023, n=143): 49% aversion in high-stakes scenarios vs. 29% in trivial ones. Full-automation deployment in these workflows meets the stiffest resistance. HITL inverts that.
- The corpus already contains the analog evidence: +144% trust from hands-on training (Deloitte TrustID Q3 2025, ~60k US employees); 2.6x usage consistency when workers trust the AI program (BCG AI at Work 2025, n=10,600); BCG 5-hour training floor; Colgate training-before-access. HITL operationalizes all four mechanisms inside the deployed workflow itself.
- For the Endangered archetype (25% of the workforce, per the corpus framework), HITL is not a safety nicety. It is the continued-practice mechanism that preserves the skill Microsoft’s Future of Work 2026 report shows eroding within three months of routine AI use, and the retained-authority mechanism that prevents the Parasuraman & Riley disuse failure mode.
The Mechanism: Control Is the Active Ingredient
Dietvorst et al.'s “Overcoming Algorithm Aversion” (Management Science, 2016) ran four studies of the same intervention: give participants the ability to modify an algorithmic forecast, even by a small, capped amount. In every study, participants with modification rights were more willing to use the algorithm than participants offered an unmodifiable algorithm of equal or better accuracy. They also reported higher satisfaction with the process, stronger belief in the algorithm’s superiority, and higher willingness to use algorithms on future tasks.
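The capped-modification idea can be sketched in a few lines. This is a minimal illustration, not the study's materials: the function name, the forecast scale, and the default cap of 10 points are assumptions chosen to mirror the "adjust within a cap" design described above.

```python
def apply_capped_adjustment(
    model_forecast: float,
    worker_forecast: float,
    cap: float = 10.0,  # assumed cap, in the same units as the forecast
) -> float:
    """Return the worker's forecast, limited to within `cap` of the model's forecast.

    Hypothetical helper: the worker may move the number, but only inside
    the band [model_forecast - cap, model_forecast + cap].
    """
    lower = model_forecast - cap
    upper = model_forecast + cap
    return max(lower, min(upper, worker_forecast))


# The worker wants 90, the model said 62, so the result is held at 72.
print(apply_capped_adjustment(62.0, 90.0))
# A small change inside the band passes through untouched.
print(apply_capped_adjustment(62.0, 58.0))
```

The design point is that the cap limits how much accuracy can be given up while the worker still holds a real lever.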
The mechanism is not accuracy. It is not explanation. It is not transparency. It is retained agency. HITL deployment design operationalizes retained agency at the workflow level: the worker sees the output, decides whether it takes effect, and owns the outcome.
Van der Waa et al.'s “ABC of algorithmic aversion” (AI & SOCIETY, 2023) extends the finding: acceptance of automated decision-making is determined primarily by (a) the tangible benefits to the user and (b) the control the user retains. The identity of the agent — human or algorithm — matters far less than these two design variables. MIS Quarterly’s 2024 integrative review reaches the same conclusion across the broader literature: aversion and appreciation are not dispositions. They are moderated by task stakes, perceived agency, and user control.
Aversion Is Highest Exactly Where Full Automation Is Most Tempting
Filiz et al. (PLOS ONE / PMC, 2023, n=143) found 39% overall algorithm aversion, rising to 49% in high-consequence scenarios (autonomous driving, MRI evaluation, criminal case assessment) and falling to 29% in trivial ones (dating, recipes, weather). Effect sizes for the serious-versus-trivial comparisons ranged from Cohen’s d = 0.64 to 1.95, which is medium to very large by Cohen’s conventional benchmarks (0.5 medium, 0.8 large).
This has a direct enterprise implication. The work executives most want to automate — legal review, clinical documentation, underwriting, financial analysis, compliance — sits in the high-aversion band. Deploying full automation into those workflows puts the highest-stakes work through the adoption pattern with the lowest tolerance for error. HITL design redistributes the decision to the worker, which is what the literature predicts will resolve the aversion.
What the Corpus Already Shows
The following findings — already in this research repo — all point in the same direction as the lab literature:
- Deloitte TrustID Q3 2025 (~60k US employees, surfaced in HBR Nov 2025): hands-on AI training correlates with +144% trust; interactive practice with +72% trust. Hands-on is structurally closer to HITL than to passive observation of a working system.
- BCG AI at Work 2025 (n=10,600, 11 countries): 2.6x usage consistency when workers trust the AI program. Trust is the gating variable, not capability.
- BCG’s 5-hour training floor (same study): 79% of workers with 5+ hours of training are regular AI users, vs. 67% of those with less. Practice is the precondition.
- Colgate case: training-before-access. Workers rehearsed the review behavior before the tool went live. This is a HITL-design pattern adapted to onboarding.
- Microsoft Future of Work 2026: clinicians using AI polyp detection showed significant skill degradation after three months of use. Full automation erodes the skill that makes the worker valuable when the AI is wrong. HITL keeps the worker practicing judgment.
The Parasuraman & Riley Frame
Parasuraman and Riley’s 1997 taxonomy (Human Factors; 6,000+ citations) names four patterns of human interaction with automation: use, misuse, disuse, and abuse. The last three are failure modes. Enterprise AI risk conversations focus almost entirely on misuse: hallucinations, rubber-stamping, liability. Disuse is the equal and opposite problem: workers reject automation they should be using, and the ROI evaporates. Lee & See (2004) show why disuse happens: when process is opaque and purpose is suspect (the worker believes the tool exists to replace them), trust never calibrates even when performance is high.
Full-automation deployment maximizes disuse risk. HITL deployment minimizes it by making process visible at every step and by making purpose legible: the tool is there to assist the worker whose judgment still gates the outcome.
The Endangered Archetype
The corpus segments the workforce by AI exposure. The Endangered archetype — roughly 25% of the workforce whose core task can be substantially automated by current models — is the population most at risk of adoption failure for two linked reasons: skill erosion (they stop practicing the judgment that made them valuable) and dignity loss (they experience the system as a replacement, not an assistant). HITL addresses both. Retained decision authority preserves the dignity. Continued review practice preserves the skill. The deployment design is the intervention.
The Named Evidence Gap
No published enterprise RCT directly compares (a) HITL deployment where the worker approves AI outputs before they take effect, against (b) full-automation deployment of the same workflow, with adoption resistance as the primary outcome. The closest analogs — Dietvorst 2016 (forecasting tasks), HBS Cybernetic Teammate 2025 (n=776 P&G, measuring AI-enhanced individual vs. full human team), Stanford Enterprise AI Playbook 2026 (51 enterprise deployments) — do not isolate the variable.
For a COO or CHRO deciding in 2026, the operational call must be made on mechanism evidence (Dietvorst, Parasuraman & Riley, Lee & See, van der Waa) plus analog enterprise data (BCG, Deloitte, Microsoft FoW 2026). The direction is clear; the precise effect size is not.
The Rubber Stamp Caveat
HITL as adoption architecture fails when the review step is cosmetic. Thomson Reuters monitors reviewer behavior specifically to catch this — reviews completed in under two seconds are flagged as rubber-stamping. If workers are incentivized to approve quickly and have no incentive to reject, the approval gate becomes misuse in the Parasuraman & Riley sense. The design fix is to make rejection a low-friction, low-reputation-cost action and to track reject rates as a leading indicator of healthy HITL operation, not as a sign of AI failure.
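The two signals named above, review speed and reject rate, are straightforward to compute from a review log. The sketch below is illustrative only: the `Review` record, its field names, and `summarize_reviews` are hypothetical, and the two-second threshold is taken from the Thomson Reuters example rather than from any published standard.

```python
from dataclasses import dataclass

RUBBER_STAMP_SECONDS = 2.0  # assumed threshold, borrowed from the example above


@dataclass
class Review:
    reviewer: str
    seconds_spent: float
    approved: bool


def summarize_reviews(reviews: list[Review]) -> dict[str, float]:
    """Return the reject rate and the share of reviews finished under the threshold."""
    if not reviews:
        return {"reject_rate": 0.0, "fast_review_rate": 0.0}
    total = len(reviews)
    rejects = sum(1 for r in reviews if not r.approved)
    fast = sum(1 for r in reviews if r.seconds_spent < RUBBER_STAMP_SECONDS)
    return {"reject_rate": rejects / total, "fast_review_rate": fast / total}


log = [
    Review("a", 1.2, True),    # under two seconds: flagged as a likely rubber stamp
    Review("a", 45.0, True),
    Review("b", 30.0, False),  # a considered rejection
    Review("b", 0.8, True),    # flagged
]
print(summarize_reviews(log))  # {'reject_rate': 0.25, 'fast_review_rate': 0.5}
```

A reject rate near zero combined with a high fast-review rate is the pattern to watch for; either number alone is harder to interpret.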
Key Data Points
| Stat | Source | Year | Credibility |
|---|---|---|---|
| Modification rights (even trivial) increased willingness to use an algorithmic forecaster across 4 studies; participants sacrificed accuracy for control | Dietvorst, Simmons & Massey, Management Science | 2016 | TIER 1 foundational mechanism; TIER 4 by date |
| 39% overall algorithm aversion; 49% in high-consequence vs. 29% in trivial (n=143) | Filiz et al., PLOS ONE / PMC | 2023 | MEDIUM (small sample); TIER 3 |
| +144% trust from hands-on AI training; +72% from interactive practice (~60k US employees) | Deloitte TrustID, surfaced in HBR | Nov 2025 | HIGH; TIER 1 |
| 2.6x usage consistency when workers trust the AI program; 5-hour training lifts regular use 67%→79% (n=10,600) | BCG AI at Work | 2025 | HIGH; TIER 2 |
| Clinicians using AI polyp detection showed significant skill degradation within 3 months | Microsoft Future of Work Report | 2026 | HIGH; TIER 1 |
| Use / misuse / disuse / abuse taxonomy — disuse is as costly to ROI as misuse | Parasuraman & Riley, Human Factors | 1997 | Foundational (6,000+ citations); TIER 5 by date |
| Control and benefits — not agent identity — determine acceptance of automated decisions | van der Waa et al., AI & SOCIETY | 2023 | HIGH; TIER 3 |
| Rubber-stamp risk: Thomson Reuters flags reviews under 2 seconds | CIO.com / Thomson Reuters CTO Joel Hron | 2025 | MEDIUM (practitioner); TIER 1 |
Vendor-funded or practitioner-reported data points (Deloitte, BCG, Microsoft, Thomson Reuters) should be read as directional. The independent academic literature (Dietvorst, Filiz, van der Waa, Parasuraman & Riley, Lee & See) provides the mechanism evidence.
What This Means for Your Organization
If the rollout plan is full automation of a high-stakes workflow, the adoption resistance problem is structurally baked in before a single seat is licensed. The literature is consistent on this: the workforce segments most exposed to AI — the same segments where ROI is most sensitive to adoption — are the ones where full automation triggers the strongest aversion response. The mitigation is not better change management on top of the same deployment design. It is a different deployment design.
HITL as adoption architecture means three things in practice. First, the default deployment is approve-before-effect, not post-hoc audit. Second, the review step is designed to be consequential and given real time rather than treated as a rubber stamp, which means tracking reject rates as a leading indicator of healthy operation. Third, the workflow is redesigned so that the worker’s judgment is the deliverable and the AI is the drafter, which preserves both the skill (Microsoft FoW 2026 shows this matters inside three months) and the dignity (Parasuraman & Riley’s disuse failure mode). For the Endangered archetype specifically, this is the design difference between a rollout that clears BCG’s 5-hour training floor and one that does not.
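For teams translating approve-before-effect into a workflow, the following sketch shows the shape of the gate. It is a minimal illustration under stated assumptions, not a reference implementation: `Decision`, `Outcome`, and `review_gate` are hypothetical names, and `apply` stands in for whatever makes the output take effect (sending, filing, posting).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    APPROVE = "approve"
    EDIT = "edit"
    REJECT = "reject"


@dataclass
class Outcome:
    decision: Decision
    final_text: Optional[str]  # None when nothing took effect
    decided_by: str            # the worker who owns the outcome


def review_gate(
    draft: str,
    reviewer: str,
    decision: Decision,
    apply: Callable[[str], None],
    edited_text: Optional[str] = None,
) -> Outcome:
    """Let an AI draft take effect only after an explicit human decision."""
    if decision is Decision.REJECT:
        # Rejection is a first-class path: nothing is applied.
        return Outcome(decision, None, reviewer)
    final_text = edited_text if decision is Decision.EDIT else draft
    if final_text is None:
        raise ValueError("An EDIT decision requires edited_text")
    apply(final_text)
    return Outcome(decision, final_text, reviewer)


sent: list[str] = []
review_gate("AI draft", "pat", Decision.REJECT, sent.append)
review_gate("AI draft", "pat", Decision.EDIT, sent.append, edited_text="Revised draft")
print(sent)  # ['Revised draft']
```

Nothing reaches `apply` without a recorded decision and a named reviewer, which is the difference between this pattern and a post-hoc audit.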
If this raised questions specific to your organization (which workflows should go HITL, which can safely go full-automation, and how to design the review step so it does not collapse into a rubber stamp), I would welcome the conversation. Reach me at brandon@brandonsneider.com.
Sources
- Dietvorst, B., Simmons, J., Massey, C. “Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them.” Management Science, 2016. https://pubsonline.informs.org/doi/10.1287/mnsc.2016.2643 — HIGH credibility; foundational mechanism.
- Dietvorst, B., Simmons, J., Massey, C. “Algorithm Aversion: People Erroneously Avoid Algorithms After Seeing Them Err.” Journal of Experimental Psychology: General, 2015.
- Filiz, I., Judek, J., Lorenz, M., Spiwoks, M. “The extent of algorithm aversion in decision-making situations with varying gravity.” PLOS ONE / PMC, 2023. https://pmc.ncbi.nlm.nih.gov/articles/PMC9942970/ — MEDIUM; small single-site sample.
- van der Waa, J. et al. “The ABC of algorithmic aversion: not agent, but benefits and control determine the acceptance of automated decision-making.” AI & SOCIETY (Springer), 2023. https://link.springer.com/article/10.1007/s00146-023-01649-6 — HIGH; peer-reviewed integrative framework.
- MIS Quarterly. “An Integrative Perspective on Algorithm Aversion and Appreciation in Decision-Making.” MISQ 48(4), 2024. https://misq.umn.edu/misq/article/48/4/1575/2300/An-Integrative-Perspective-on-Algorithm-Aversion — HIGH; top IS journal review.
- Parasuraman, R., Riley, V. “Humans and Automation: Use, Misuse, Disuse, Abuse.” Human Factors 39(2), 1997. https://journals.sagepub.com/doi/10.1518/001872097778543886 — Foundational (6,000+ citations).
- Lee, J. D., See, K. A. “Trust in Automation: Designing for Appropriate Reliance.” Human Factors 46(1), 2004. https://journals.sagepub.com/doi/10.1518/hfes.46.1.50_30392 — Foundational.
- Deloitte TrustID Q3 2025, surfaced in Harvard Business Review, Nov 2025 — HIGH; ~60k US employees.
- BCG, “AI at Work 2025,” with MIT Sloan. n=10,600, 11 countries. https://www.bcg.com/publications/2025/ai-at-work-momentum-builds-but-gaps-remain — HIGH; vendor-coauthored but independently replicated.
- Microsoft, “Future of Work Report 2026.” Jan 2026. Coverage: https://allwork.space/2026/01/workers-gain-hours-with-ai-but-risk-losing-skills-according-to-microsofts-new-future-of-work-report/ — HIGH.
- SHRM, “Keep Humans in the Loop for Successful AI Adoption.” June 30, 2025. https://www.shrm.org/topics-tools/news/keep-humans-in-the-loop-for-successful-ai-adoption — MEDIUM; practitioner.
- CIO.com, “Keeping humans in the AI loop.” 2025. https://www.cio.com/article/4042910/keeping-humans-in-the-ai-loop.html — MEDIUM; trade press with named senior executives.
- MDPI Entropy 28(4), 377. “Human-in-the-Loop Artificial Intelligence: A Systematic Review.” 2026. https://www.mdpi.com/1099-4300/28/4/377 — MEDIUM; systematic review.
Brandon Sneider | brandon@brandonsneider.com | April 2026