Risk prediction of 30-day mortality after stroke using machine learning: a nationwide registry-based cohort study

Research output: Contribution to journalArticlepeer-review

7 Citations (Scopus)
79 Downloads (Pure)


We aimed to develop and validate machine learning (ML) models for 30-day stroke mortality for mortality risk stratification and as benchmarking models for quality improvement in stroke care.

Data from the UK Sentinel Stroke National Audit Program between 2013 to 2019 were used. Models were developed using XGBoost, Logistic Regression (LR), LR with elastic net with/without interaction terms using 80% randomly selected admissions from 2013 to 2018, validated on the 20% remaining admissions, and temporally validated on 2019 admissions. The models were developed with 30 variables. A reference model was developed using LR and 4 variables. Performances of all models was evaluated in terms of discrimination, calibration, reclassification, Brier scores and Decision-curves.

In total, 488,497 stroke patients with a 12.3% 30-day mortality rate were included in the analysis. In 2019 temporal validation set, XGBoost model obtained the lowest Brier score (0.069 (95% CI: 0.068–0.071)) and the highest area under the ROC curve (AUC) (0.895 (95% CI: 0.891–0.900)) which outperformed LR reference model by 0.04 AUC (p < 0.001) and LR with elastic net and interaction term model by 0.003 AUC (p < 0.001). All models were perfectly calibrated for low (< 5%) and moderate risk groups (5–15%) and ≈1% underestimation for high-risk groups (> 15%). The XGBoost model reclassified 1648 (8.1%) low-risk cases by the LR reference model as being moderate or high-risk and gained the most net benefit in decision curve analysis.

All models with 30 variables are potentially useful as benchmarking models in stroke-care quality improvement with ML slightly outperforming others.
Original languageEnglish
Article number195
JournalBMC Neurology
Issue number1
Publication statusPublished - 27 May 2022


Dive into the research topics of 'Risk prediction of 30-day mortality after stroke using machine learning: a nationwide registry-based cohort study'. Together they form a unique fingerprint.

Cite this