Wei Dai, M.Sc., Ehsan Adeli, Ph.D., Zelun Luo, Ph.D., Dev Dash, M.D., M.P.H., Shrinidhi Lakshmikanth, M.Sc., Zane Durante, B.S., Paul Tang, M.D., Amit Kaushal, M.D., Ph.D., Arnold Milstein, M.D., M.P.H., Li Fei-Fei, Ph.D., and Kevin Schulman, M.D.
Background: While computer vision has gained traction in medical applications, models specifically engineered for intensive care unit (ICU) activities are limited.
Methods: We present Clinical Behavioral Atlas (CBA), a computer vision system that can identify 40 clinically relevant activity categories and 55 object categories solely through RGB video data. The system was developed using a dataset comprising over 140,000 hours of continuous video and over 350,000 densely annotated frames, collected from 16 sensors in 8 ICU rooms at an academic medical center.
Results: The model demonstrated strong performance in entity and activity detection, with sensitivities of 0.75–0.81 and average precisions of 0.64–0.73, respectively. Permutation tests yielded P values of less than 0.05 for most activity categories. Entity detection performance correlated positively with both the number and the size of entities. The model excelled at identifying common and large objects, even with limited samples, but struggled with small items such as oral swabs. Activity detection performance correlated linearly with video duration. The model showed robust performance (>0.85 average precision) for most clinical activities, but activities of daily living exhibited greater variation and lower average precision (0.23–0.95), indicating potential for further refinement given their complexity and relative scarcity in the dataset. In experiments against other popular activity recognition models, our method substantially outperformed all baselines, with improvements of 0.30 and 0.45 in average precision over the next best method.
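As an illustration of how a permutation test on average precision can be carried out for a single activity category, the following is a minimal sketch. It is not the procedure used in this study, whose details are not given here: the function names, the label-shuffling null hypothesis, the number of permutations, and the add-one correction are all assumptions made for the example.

```python
import random


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each positive sample,
    with samples ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def permutation_p_value(labels, scores, n_permutations=2000, seed=0):
    """One-sided permutation test (hypothetical setup): shuffle the labels to
    break any link between scores and ground truth, then count how often the
    shuffled average precision reaches the observed value."""
    rng = random.Random(seed)
    observed = average_precision(labels, scores)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if average_precision(shuffled, scores) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated P value strictly above zero.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


# Toy data (fabricated for illustration only): 10 positives ranked above 10 negatives.
toy_labels = [1] * 10 + [0] * 10
toy_scores = [1.0 - 0.01 * i for i in range(20)]
toy_ap, toy_p = permutation_p_value(toy_labels, toy_scores)
```

In this sketch, a category would be reported as significant when its estimated P value falls below 0.05; the smallest attainable P value is bounded by the number of permutations.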
Conclusions: CBA expands automated identification of clinically important bedside actions, such as ICU preventive bundle elements. While we have demonstrated the feasibility of computer vision as a tool to assist in clinical care in high-intensity settings such as the ICU, developing a CBA model with full clinical-level performance will require larger datasets, ideally from multiple locations. (Funded by Schmidt Futures and others.)