A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients - 24/02/22
, Gina Barnes, MPH, Jana Hoffman, PhD, Jacob Calvert, MSc, Qingqing Mao, PhD, Ritankar Das, MSc
Highlights
• Clostridioides difficile is a leading cause of infectious diarrhea in hospitalized patients.
• Machine learning algorithms can predict Clostridioides difficile infection with excellent discrimination.
• XGBoost maintained predictive performance across a hold-out test set and an external dataset.
Abstract
Background
Interventions to better prevent or manage Clostridioides difficile infection (CDI) may significantly reduce morbidity, mortality, and healthcare spending.
Methods
We present a retrospective study using electronic health record data from over 700 United States hospitals. A subset of hospitals was used to develop machine learning algorithms (MLAs); the remaining hospitals served as an external test set. Three MLAs were evaluated: gradient-boosted decision trees (XGBoost), Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network. MLA performance was evaluated with area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, diagnostic odds ratios and likelihood ratios.
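The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy labels and scores, and the confusion-matrix counts are all hypothetical, and the AUROC is computed with the standard pairwise (Mann-Whitney) definition rather than any particular library routine.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg) and is
    meant for illustration only, not for millions of encounters.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(tp, fp, tn, fn):
    """Threshold-dependent metrics from confusion-matrix counts.

    Assumes every count is non-zero; otherwise the ratios are undefined.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Hypothetical toy data, not taken from the study.
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
```

AUROC summarizes ranking quality across all thresholds, while the remaining metrics depend on the single operating threshold that produces the confusion-matrix counts.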
Results
The development dataset contained 13,664,840 inpatient encounters with 80,046 CDI encounters; the external dataset contained 1,149,088 inpatient encounters with 7,107 CDI encounters. The highest AUROCs for XGBoost, the Deep Long Short Term Memory neural network, and the one-dimensional convolutional neural network were achieved, respectively, by using no specialized training techniques, by resampling alone, and by combining resampling with output bias. Of the three MLAs, XGBoost achieved the highest AUROC.
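The encounter counts above imply a strong class imbalance, which is the usual motivation for techniques such as resampling and output bias. The sketch below works through that arithmetic. It rests on assumptions: the abstract does not say how resampling or output bias were implemented, so random oversampling of the minority class and a log-odds initial bias are shown only as common choices, and all function names are hypothetical.

```python
import math
import random


def prevalence(positives, total):
    """Fraction of encounters labelled positive."""
    return positives / total


def initial_output_bias(positives, total):
    """Log-odds of the positive class.

    Assumption: a final sigmoid unit initialised with this bias starts out
    predicting the base rate. The abstract does not state the authors' scheme.
    """
    negatives = total - positives
    return math.log(positives / negatives)


def oversample_minority(labels, seed=0):
    """Return indices in which the minority class is randomly repeated
    until both classes are the same size (assumes both classes are present)."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((pos_idx, neg_idx), key=len)
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority + minority + extra


# Counts reported in the Results paragraph.
dev_prevalence = prevalence(80_046, 13_664_840)
ext_prevalence = prevalence(7_107, 1_149_088)
dev_bias = initial_output_bias(80_046, 13_664_840)

# Hypothetical toy labels: nine negatives and one positive.
toy_labels = [0] * 9 + [1]
balanced_idx = oversample_minority(toy_labels)
```

Both datasets have a CDI prevalence of roughly 0.6%, so a model that always predicts "no CDI" would be correct for more than 99% of encounters while identifying no cases.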
Conclusions
MLAs can predict future CDI in hospitalized patients using just 6 hours of data. In clinical practice, a machine learning-based tool may support prophylactic measures, earlier diagnosis, and more timely implementation of infection control measures.
Key Words: Machine learning, Algorithm, Prediction, Clostridioides difficile, CDI, Electronic health record, XGBoost
Funding/support: No external financial or material support was received to support this research.
Conflicts of interest: All authors who have affiliations listed with Dascena (Houston, Texas, U.S.A.) are employees or contractors of Dascena.
Availability of data and materials: The data analyzed in this study were obtained from a proprietary longitudinal electronic health record (EHR) repository that includes over 700 hospitals located in the U.S. Requests to access the processed data and statistical information should be directed to Qingqing Mao, PhD, at qmao@dascena.com.
Vol 50 - N° 3
P. 250-257 - March 2022
