Yale School of Medicine Section of Digestive Diseases, P.O. Box 208019, New Haven, CT, 06520-8019, USA. email@example.com. VA Connecticut Healthcare System, West Haven, CT, USA.
Risk stratification of patients with gastrointestinal bleeding (GIB) is recommended, but current risk assessment tools have variable performance. Machine learning (ML) has the potential to improve risk assessment. We performed a systematic review to evaluate studies utilizing ML techniques for GIB. Bibliographic databases and conference abstracts were searched for studies with a population of overt GIB that used an ML algorithm with outcomes of mortality, rebleeding, hemostatic intervention, and/or hospital stay. Two independent reviewers screened titles and abstracts, reviewed full-text studies, and extracted data from included studies. Risk of bias was assessed with an adapted Quality in Prognosis Studies tool. Areas under the receiver operating characteristic curve (AUCs) were the primary measure of performance, with AUC ≥ 0.80 predefined as the threshold for good performance. Fourteen studies with 30 assessments of ML models met inclusion criteria. No study had low risk of bias. Median AUC reported in validation datasets for predefined outcomes of mortality, intervention, or rebleeding was 0.84 (range 0.40-0.98). AUCs were higher with artificial neural networks (median 0.93, range 0.78-0.98) than with other ML models (median 0.81, range 0.40-0.92). ML performed better than clinical risk scores (Glasgow-Blatchford, Rockall, Child-Pugh, MELD) for mortality in upper GIB. Limitations include heterogeneity of ML models, inconsistent comparisons of ML models with clinical risk scores, and high risk of bias. ML generally provided good-to-excellent prognostic performance in patients with GIB, and artificial neural networks tended to outperform other ML models. ML was better than clinical risk scores for mortality in upper GIB.
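
For readers less familiar with the performance measure used here, the following is a minimal sketch of how an AUC can be computed and compared against the predefined 0.80 threshold. It is an illustration only, not the method used by any included study: the function name `auc`, the constant `GOOD_AUC`, and the toy labels and scores are all hypothetical and are not data from the review. The sketch uses the rank-based (Mann-Whitney) interpretation of AUC, that is, the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it.

```python
def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Threshold for good performance as predefined in the abstract.
GOOD_AUC = 0.80

# Hypothetical toy data (1 = outcome occurred, 0 = it did not);
# these values are made up for illustration only.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.20, 0.90]

value = auc(labels, scores)
print(f"AUC = {value:.2f}, meets threshold: {value >= GOOD_AUC}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; a sort-based implementation would be preferable for large validation datasets.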