Developing and externally validating a machine learning risk prediction model for 30-day mortality after stroke using national stroke registers in the UK and Sweden

Vasa Curcin, Wenjuan Wang, Benjamin Bray, Charles Wolfe, Marie Eriksson, Josline Adhiambo Otieno

Research output: Contribution to journal › Article › peer-review


Objectives
We aimed to develop and externally validate a generalisable risk prediction model for 30-day mortality after stroke, suitable for supporting quality improvement analytics in stroke care, using large nationwide stroke registers in the United Kingdom and Sweden.
Design
Registry-based cohort study.
Setting
Stroke registries including the Sentinel Stroke National Audit Programme (SSNAP) in England, Wales, and Northern Ireland (2013-2019) and the national Swedish stroke register (Riksstroke, 2015-2020).
Participants and Methods
Data from SSNAP were used to develop and temporally validate the model, and data from Riksstroke were used for external validation. Models were developed from the variables available in both registries using logistic regression (LR), LR with elastic net and interaction terms, and XGBoost. Performance was evaluated in terms of discrimination, calibration, and decision curves.
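The discrimination and decision-curve measures named above can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `net_benefit`, the toy outcomes, and the toy predicted risks are all hypothetical, and the sketch uses only the Python standard library rather than any particular modelling package.

```python
def auc(y_true, y_prob):
    """Area under the ROC curve via the rank-based (Mann-Whitney) formulation:
    the share of (death, survivor) pairs in which the death has the higher
    predicted risk, with ties counted as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at a single risk threshold, the quantity a decision curve
    plots across a range of thresholds."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
y = [0, 0, 1, 0, 1, 1, 0, 0]
p = [0.05, 0.10, 0.80, 0.30, 0.60, 0.20, 0.15, 0.40]

print(round(auc(y, p), 3))                # 0.867
print(round(net_benefit(y, p, 0.25), 3))  # 0.167
```

On registry-scale data the pairwise loop in `auc` would be too slow; a sort-based implementation from an established library would be the practical choice.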
Outcome measures
The primary outcome was all-cause 30-day in-hospital mortality after stroke.
Results
In total, 488,497 stroke patients with 12.4% 30-day in-hospital mortality were used for developing and temporally validating the model in the UK. A total of 128,360 stroke patients with 10.8% 30-day in-hospital mortality and 13.1% overall mortality were used for external validation in Sweden. In the SSNAP temporal validation set, the final XGBoost model achieved the highest area under the ROC curve (AUC), 0.852 (95% CI: 0.848-0.855), and was well calibrated. Performance on external validation in Riksstroke was as good, with an AUC of 0.861 (95% CI: 0.858-0.865) for in-hospital mortality. For Riksstroke, the models slightly overestimated the risk of in-hospital mortality, whereas they were better calibrated for overall mortality.
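To make the notion of a model that "overestimates" risk concrete, the sketch below computes two simple calibration summaries: the observed/expected (O/E) ratio and a binned comparison of mean predicted risk against the observed event rate. It is an illustration under stated assumptions, not the calibration method used in the study; the function names and the toy data are hypothetical.

```python
def observed_expected_ratio(y_true, y_prob):
    """Observed deaths divided by expected deaths (the sum of predicted risks).
    A value below 1 means the model overestimates risk on average."""
    return sum(y_true) / sum(y_prob)


def calibration_bins(y_true, y_prob, n_bins=2):
    """Sort patients by predicted risk, split them into n_bins groups, and
    return (mean predicted risk, observed event rate) for each group."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) // n_bins
    bins = []
    for i in range(n_bins):
        # The last bin absorbs any remainder.
        chunk = pairs[i * size:] if i == n_bins - 1 else pairs[i * size:(i + 1) * size]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        bins.append((mean_pred, obs_rate))
    return bins


# Hypothetical toy data in which predicted risks run slightly high.
y = [0, 0, 0, 1, 0, 0, 1, 1]
p = [0.25, 0.25, 0.25, 0.25, 0.5, 0.5, 0.75, 0.75]

print(round(observed_expected_ratio(y, p), 3))  # 0.857
print(calibration_bins(y, p))                   # [(0.25, 0.25), (0.625, 0.5)]
```

In this toy example the higher-risk group has a mean predicted risk of 0.625 against an observed rate of 0.5, which is the pattern described as overestimation.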
Conclusions
The risk prediction model was accurate and was externally validated using high-quality registry data. It is potentially suitable for deployment as part of quality improvement analytics in stroke care, enabling fair comparison of stroke mortality outcomes across hospitals and across health systems in different countries.
Original language: English
Journal: BMJ Open
Publication status: Accepted/In press - 22 Aug 2023