Trust, Scrutiny, or Collaboration? A Performance-Based Framework for Human-AI Interaction in Medicine

Laura Zwaan, Ph.D., Adam Rodman, M.D., and Taro Shimizu, M.D., Ph.D.
Abstract
Artificial intelligence (AI) is increasingly integrated into clinical decision-making, but strategies for human-AI collaboration remain underdeveloped. A recent randomized trial by Qazi and colleagues found that physicians exposed to incorrect large language model recommendations scored 14 percentage points lower on diagnostic reasoning than those receiving error-free suggestions, attributing this to automation bias. We argue that this framing is incomplete: The observed effect more closely resembles the well-documented impact of inaccurate information on reasoning, a phenomenon equally present in human-to-human consultation. More importantly, the appropriate response to AI errors is not uniform skepticism. Whether a physician should defer to, scrutinize, or collaboratively engage with AI depends on two key dimensions: the relative accuracy of humans and AI on a given task, and the degree to which their errors are complementary. We propose a framework mapping these dimensions onto four interaction zones: human-dominant, AI-dominant, hybrid review, and disagreement resolution. Each requires a distinct clinical workflow strategy. This framework is intentionally dynamic: As AI performance evolves and physician expertise develops, optimal interaction strategies must be recalibrated accordingly. Trust calibration, not blanket skepticism, is the appropriate goal.
