• Pancreatic cancer risk prediction using deep sequential modeling of longitudinal diagnostic and medication records

    Chunlei Zheng, Asif Khan, Daniel Ritter, Debora S Marks, Nhan V Do, Nathanael R Fillmore, Chris Sander
    Cell Rep Med. 2025 Sep 16;6(9):102359. doi: 10.1016/j.xcrm.2025.102359.

    Abstract

    Pancreatic ductal adenocarcinoma (PDAC) is a rare, aggressive cancer that is often diagnosed late and has low survival rates, owing to the lack of population-wide screening programs and the high cost of early detection methods. To enable early detection of high-risk individuals, we develop a transformer-based model trained on longitudinal Veterans Affairs electronic health record (EHR) data with 19,426 PDAC cases and ∼15.9 million controls. Our model combines diagnostic and medication trajectories to predict PDAC risk within 6-, 12-, and 36-month assessment windows. Incorporating medication significantly improved performance; among the top 1,000 to 5,000 highest-risk patients in a cohort of 1 million patients, 3-year PDAC incidence is 115 to 70 times higher, respectively, than a reference estimate based on age and sex alone. Furthermore, analysis of the most predictive features highlights the role of events such as chronic inflammatory conditions and specific medications in overall PDAC risk. Our work provides AI-driven identification of high-risk individuals, with the potential to improve early detection, enhance patient care, and reduce healthcare costs.
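
    The "times higher than a reference estimate" figures in the abstract are relative enrichments: the incidence among the k highest-risk patients divided by a reference incidence. A minimal sketch of that arithmetic follows. It is not the authors' evaluation code; the function names and the toy data are hypothetical, and the reference incidence is simply passed in as a number, whereas the paper derives its reference from age and sex.

```python
from typing import Sequence


def top_k_incidence(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of true cases among the k patients with the highest risk scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the cohort size")
    # Rank patient indices from highest to lowest predicted risk.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k


def relative_enrichment(
    scores: Sequence[float],
    labels: Sequence[int],
    k: int,
    reference_incidence: float,
) -> float:
    """Incidence in the top-k group divided by a reference incidence.

    Assumption: reference_incidence is supplied by the caller; how it is
    estimated (e.g. from age and sex) is outside this sketch.
    """
    if reference_incidence <= 0:
        raise ValueError("reference_incidence must be positive")
    return top_k_incidence(scores, labels, k) / reference_incidence


# Toy cohort of 10 patients (synthetic values, not data from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.4, 0.05, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

# Here the reference is the overall cohort prevalence, chosen only for illustration.
toy_reference = sum(toy_labels) / len(toy_labels)
toy_enrichment = relative_enrichment(toy_scores, toy_labels, k=2, reference_incidence=toy_reference)
```

    In the toy cohort, one of the two highest-scoring patients is a case, so the top-2 incidence is 0.5 against a reference of 0.2. Enrichment usually falls as k grows, because lower-ranked patients dilute the group, which matches the abstract's drop from 115 to 70 as the group widens from 1,000 to 5,000 patients.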