Multi-objective Symbolic Regression for Clinician-in-the-loop Machine Learning algorithms

Research output: Contribution to journal, Article, peer-review

Abstract

Background
The power of machine learning (ML) in medicine is further enhanced by input from clinical experts with detailed knowledge of the phenomena being explored and of the organization in which the ML models will be deployed. This input is needed not only in problem conception and evaluation, but also in the training itself, through a clinician-in-the-loop approach. We explore the applicability of such an approach to Multi-objective Symbolic Regression (MOSR) and compare it with state-of-the-art methodologies.

Methods
This is a retrospective study based on the admission data of 22,576 patients, routinely collected by the intensive care units (ICUs) of Guy’s and St. Thomas’ NHS Foundation Trust in central London between April 1st, 2008, and December 31st, 2022. Patient-level data were used, including blood tests, urine tests, medical history, signs, symptoms, and common clinical scores. The outcome was modelled as a binary classification; four experiments were run, predicting mortality at 7 days (4.5% of patients), 30 days (10.2%), 6 months (17.4%) and 12 months (21.2%). A random 80/20 train-test split was performed. The clinician-in-the-loop approach used a Multi-objective Symbolic Regression algorithm custom-designed to also optimize for the F1-score.
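To illustrate why the F1-score is a natural objective at these outcome rates, the following is a minimal sketch only, not the study's pipeline. It assumes synthetic labels (45 deaths among 1,000 hypothetical patients, mirroring the 4.5% 7-day rate) and hypothetical helper names (`split_80_20`, `accuracy`, `f1_score`). It shows that a trivial model predicting survival for everyone scores high accuracy on a random 80/20 split but an F1-score of zero.

```python
import random

def split_80_20(items, seed=0):
    """Randomly shuffle the items and split them 80% train / 20% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Synthetic labels (assumption): 1 = died within 7 days, 0 = survived.
labels = [1] * 45 + [0] * 955
train, test = split_80_20(labels, seed=42)

# A trivial baseline that predicts survival for every test patient.
always_survive = [0] * len(test)
print("accuracy:", accuracy(test, always_survive))
print("F1-score:", f1_score(test, always_survive))
```

Because accuracy rewards the majority class, it hides the fact that this baseline never identifies a single death; the F1-score exposes it.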

Results
In all four experiments, MOSR outperformed all the other ML algorithms it was compared against, achieving significantly higher F1-scores and AUC. This makes MOSR the most competitive algorithm, particularly for imbalanced datasets such as typical ICU admission data.
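For readers less familiar with AUC, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc_pairwise` and the toy scores are illustrative assumptions, not data or code from the study.

```python
def auc_pairwise(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example (assumption): two survivors (0) and two deaths (1) with model scores.
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(auc_pairwise(y, s))  # 3 of the 4 positive-negative pairs are ordered correctly
```

This quadratic-time form is meant for clarity; rank-based implementations give the same value more efficiently on large cohorts.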

Conclusions
We have demonstrated that ML algorithms designed for a clinician-in-the-loop approach outperform standard off-the-shelf alternatives, and we note that bringing knowledge of the application environment and clinical knowledge into the training itself enhances model performance. We therefore champion the adoption of approaches that, like MOSR, also enable clinical input to shape the training behaviour.

Key messages
• The input of clinical knowledge and understanding of the application domain dynamics are crucial for ML development in healthcare.
• In Health Data Science, we need ML algorithms that, like MOSR, can also take clinical input to shape the training behaviour.

Original language: English
Article number: ckae144.1202
Journal: European Journal of Public Health
Volume: 34
Issue number: Supplement_3
Publication status: Published - 28 Oct 2024