Trial Emulation, Simulation, and Augmentation Using Electronic Health Records and Generative AI
Issa J. Dahabreh, M.D., Sc.D., Robert W. Yeh, M.D., M.Sc., M.B.A., and Piersilvio De Bartolomeis, M.Sc.
Abstract
Computational tools, such as TRIALSCOPE, can support target trial emulations by leveraging biomedical language models to automate information extraction from unstructured data, conduct data validation, and streamline statistical analysis and model assessment. These advances promise to improve the quality of research using electronic health record data for causal inference. Nevertheless, automation cannot, on its own, address biases inherent in the process of generating observational data, such as unmeasured confounding. When these biases affect emulation results and a new randomized trial is deemed necessary, emulations aided by automation may still be valuable for informing trial design via realistic simulations, and for improving the efficiency of randomized trials via trial augmentation methods that combine the power of randomization with large-scale observational data.