Rayan Ebnali Harari, Abdullah Altaweel, Tareq Ahram, Madeleine Keehner, Hamid Shokoohi
Int J Med Inform. 2024 Nov 29:195:105701. doi: 10.1016/j.ijmedinf.2024.105701. Online ahead of print.
Background: The integration of generative artificial intelligence (AI) as clinical decision support systems (CDSS) into telemedicine presents a significant opportunity to enhance clinical outcomes, yet its application remains underexplored.
Objective: This study investigates the efficacy of one of the most common generative AI tools, ChatGPT, for providing clinical guidance during cardiac arrest scenarios.
Methods: We compared performance, cognitive load, and trust across three forms of guidance: a traditional paper guide, autonomous ChatGPT, and clinician-supervised ChatGPT, in which a clinician supervised the AI recommendations. Fifty-four participants without medical backgrounds took part in a randomized controlled trial, each assigned to one of the three intervention groups. Participants completed a standardized CPR scenario using an Augmented Reality (AR) headset, and performance, physiological, and self-reported metrics were recorded.
Main findings: The Supervised-ChatGPT group showed significantly higher decision accuracy than the paper guide and ChatGPT groups, although its scenario completion time was longer. Physiological data showed a reduced LF/HF ratio (the ratio of low-frequency to high-frequency power in heart rate variability) in the Supervised-ChatGPT group, suggesting potentially lower cognitive load. Trust in AI was also highest in the supervised condition. In one instance, ChatGPT suggested a risky option, highlighting the need for clinician supervision.
Conclusion: Our findings highlight the potential of supervised generative AI to enhance decision-making accuracy and user trust in emergency healthcare settings, despite trade-offs with response time. The study underscores the importance of clinician oversight and the need for further refinement of AI systems to improve safety. Future research should explore strategies to optimize AI supervision and assess the implementation of these systems in real-world clinical settings.