• Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Record

    Mosbah Aouad, Anirudh Choudhary, Awais Farooq, Steven Nevers, Lusine Demirkhanyan, Bhrandon Harris, Suguna Pappu, Christopher Gondi, Ravishankar Iyer

    Abstract

    Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest cancers, and early detection remains a major clinical challenge due to the absence of specific symptoms and reliable biomarkers. In this work, we propose a new multimodal approach that integrates longitudinal diagnosis code histories and routinely collected laboratory measurements from electronic health records to detect PDAC up to one year prior to clinical diagnosis. Our method combines neural controlled differential equations to model irregular lab time series, pretrained language models and recurrent networks to learn diagnosis code trajectory representations, and cross-attention mechanisms to capture interactions between the two modalities. We develop and evaluate our approach on a real-world dataset of nearly 4,700 patients and achieve significant improvements in AUC ranging from 6.5% to 15.5% over state-of-the-art methods. Furthermore, our model identifies diagnosis codes and laboratory panels associated with elevated PDAC risk, including both established and new biomarkers. Our code is available at https://github.com/MosbahAouad/EarlyPDAC-MML.
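
To make the cross-attention step concrete, the following is a minimal sketch of generic single-head scaled dot-product cross-attention, in which diagnosis-code representations attend over lab time-series representations. It is an illustration only, not the authors' implementation (which is in the linked repository). The function names, the embedding size, the sequence lengths, the random inputs, and the choice of which modality supplies the queries are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    queries:      (n_q, d_model)  states of one modality
    keys_values:  (n_kv, d_model) states of the other modality
    Returns the fused states (n_q, d_model) and the attention weights (n_q, n_kv).
    """
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights

# Hypothetical inputs: random stand-ins for learned representations.
rng = np.random.default_rng(0)
d_model = 8                                   # assumed embedding size
code_states = rng.normal(size=(5, d_model))   # e.g. 5 diagnosis-code steps
lab_states = rng.normal(size=(3, d_model))    # e.g. 3 lab time-series states
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

fused, attn = cross_attention(code_states, lab_states, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

Each row of `attn` indicates how strongly one diagnosis-code step draws on each lab state. In a trained model the projection matrices would be learned rather than sampled, and the roles of the two modalities could equally be swapped or applied in both directions.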