TY - JOUR
T1 - Ensemble learning for poor prognosis predictions
T2 - A case study on SARS-CoV-2
AU - Wu, Honghan
AU - Zhang, Huayu
AU - Karwath, Andreas
AU - Ibrahim, Zina
AU - Shi, Ting
AU - Zhang, Xin
AU - Wang, Kun
AU - Sun, Jiaxing
AU - Dhaliwal, Kevin
AU - Bean, Daniel
AU - Cardoso, Victor Roth
AU - Li, Kezhi
AU - Teo, James T.
AU - Banerjee, Amitava
AU - Gao-Smith, Fang
AU - Whitehouse, Tony
AU - Veenith, Tonny
AU - Gkoutos, Georgios V.
AU - Wu, Xiaodong
AU - Dobson, Richard
AU - Guthrie, Bruce
N1 - Publisher Copyright: © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association.
PY - 2021/3/18
Y1 - 2021/3/18
N2 - OBJECTIVE: Risk prediction models are widely used to inform evidence-based clinical decision making. However, few models developed from single cohorts can perform consistently well at population level where diverse prognoses exist (such as the SARS-CoV-2 [severe acute respiratory syndrome coronavirus 2] pandemic). This study aims at tackling this challenge by synergizing prediction models from the literature using ensemble learning. MATERIALS AND METHODS: In this study, we selected and reimplemented 7 prediction models for COVID-19 (coronavirus disease 2019) that were derived from diverse cohorts and used different implementation techniques. A novel ensemble learning framework was proposed to synergize them for realizing personalized predictions for individual patients. Four diverse international cohorts (2 from the United Kingdom and 2 from China; N = 5394) were used to validate all 8 models on discrimination, calibration, and clinical usefulness. RESULTS: Results showed that individual prediction models could perform well on some cohorts while poorly on others. Conversely, the ensemble model achieved the best performances consistently on all metrics quantifying discrimination, calibration, and clinical usefulness. Performance disparities were observed in cohorts from the 2 countries: all models achieved better performances on the China cohorts. DISCUSSION: When individual models were learned from complementary cohorts, the synergized model had the potential to achieve better performances than any individual model. Results indicate that blood parameters and physiological measurements might have better predictive powers when collected early, which remains to be confirmed by further studies. CONCLUSIONS: Combining a diverse set of individual prediction models, the ensemble method can synergize a robust and well-performing model by choosing the most competent ones for individual patients.
KW - COVID-19
KW - decision support
KW - ensemble learning
KW - model synergy
KW - risk prediction
UR - http://www.scopus.com/inward/record.url?scp=85103227572&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocaa295
DO - 10.1093/jamia/ocaa295
M3 - Article
C2 - 33185672
AN - SCOPUS:85103227572
SN - 1067-5027
VL - 28
SP - 791
EP - 800
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 4
ER -