• An Electrocardiogram Foundation Model Built on over 10 Million Recordings

    Jun Li, B.S., Aaron D. Aguirre, M.D., Ph.D., Valdery Moura Junior, Ph.D., Jiarui Jin, B.S., Che Liu, M.S., Lanhai Zhong, B.S., Chenxi Sun, Ph.D., Gari Clifford, Ph.D., M. Brandon Westover, M.D., Ph.D., and Shenda Hong, Ph.D. 

    Abstract

    Background: Artificial intelligence (AI) has demonstrated significant potential in electrocardiogram (ECG) analysis and cardiovascular disease assessment. Recently, foundation models have played a remarkable role in advancing medical AI, bringing benefits such as efficient disease diagnosis and cross-domain knowledge transfer. The development of an ECG foundation model holds the promise of elevating AI-ECG research to new heights. However, building such a model poses several challenges, including insufficient database sample sizes and inadequate generalization across multiple domains. In addition, there is a notable performance gap between single-lead and multilead ECG analysis. 

    Methods: We propose a general-purpose ECG foundation model (ECGFounder), which leverages real-world ECG annotations from cardiologists to broaden the diagnostic capabilities of ECG analysis. ECGFounder was built on 10,771,552 ECGs from 1,818,247 unique subjects with 150 label categories from the Harvard-Emory ECG Database, enabling comprehensive cardiovascular disease diagnosis. The model is designed to be both an effective out-of-the-box solution and easily fine-tunable for downstream tasks, maximizing usability. Importantly, we extended its application to reduced-lead ECGs, particularly single-lead ECGs. ECGFounder is therefore applicable to various downstream tasks in mobile and remote monitoring scenarios. 

    Results: Experimental results demonstrate that ECGFounder achieves expert-level performance on internal validation sets, with area under the receiver operating characteristic curve (AUROC) exceeding 0.95 for 80 diagnoses. It also shows strong classification performance and generalization across various diagnoses on external validation sets. When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis, surpassing baseline methods by 3 to 5 points in AUROC. 

    Conclusions: The ECG foundation model offers an effective solution that generalizes across a wide range of tasks. By enhancing existing cardiovascular diagnostics and facilitating integration with cloud-based systems that analyze ECG data uploaded from wearable devices, it significantly contributes to the advancement of the cardiovascular AI community and enables management of cardiac conditions. (Funded by the National Science Foundation and others.)