TY - JOUR
T1 - Unsupervised machine learning identifies distinct ALS molecular subtypes in post-mortem motor cortex and blood expression data
AU - Marriott, Heather
AU - Kabiljo, Renata
AU - Hunt, Guy
AU - Al Khleifat, Ahmad
AU - Jones, Ashley
AU - Troakes, Claire
AU - Dobson, Richard
AU - Al-Chalabi, Ammar
AU - Iacoangeli, Alfredo
N1 - Funding Information:
H.M is supported by GlaxoSmithKline and the KCL funded centre for Doctoral Training (CDT) in Data-Driven Health. R.K receives funding from MND Scotland. G.P.H is supported by the Perron Institute for Neurological and Translational Science and the KCL funded centre for Doctoral Training (CDT) in Data-Driven Health. A.A.K is funded by ALS Association Milton Safenowitz Research Fellowship (grant number22-PDF-609.DOI : https://doi.org/10.52546/pc.gr.150909 .), The Motor Neurone Disease Association (MNDA) Fellowship (Al Khleifat/Oct21/975 − 799) and The Darby Rimmer Foundation. J.Q is funded by The Darby Rimmer Foundation and the Motor Neurone Disease Association. S.K and A.L.P are funded by MSWA and the Perron Institute for Neurological and Translational Science. P.S is an employee and shareholder of GlaxoSmithKline plc. A.I is funded by the Motor Neurone Disease Association, The Darby Rimmer Foundation, Rosetrees Trust, Alzheimer’s Research UK, Spastic Paraplegia Foundationand, MND Scotland and The NIHR Maudsley Biomedical Research Centre. A.A-C is an NIHR Senior Investigator (NIHR202421) and has received support from an EU Joint Programme - Neurodegenerative Disease Research (JPND) project. The work is supported through the following funding organisations under the aegis of JPND - www.jpnd.eu (United Kingdom, Medical Research Council (MR/L501529/1; MR/R024804/1) and Economic and Social Research Council (ES/L008238/1)) and through the Motor Neurone Disease Association, My Name’5 Doddie Foundation, and Alan Davidson Foundation. The London Neurodegenerative Diseases Brain Bank at KCL has received funding from the MRC and through the Brains for Dementia Research project (jointly funded by Alzheimer’s Society and Alzheimer’s Research UK. We would like to acknowledge the Target ALS Human Postmortem Tissue Core, New York Genome Centre for Genomics of Neurodegenerative Disease, Amyotrophic Lateral Sclerosis Association and TOW Foundation for the RNA-sequencing data used in this publication. This study represents independent research part funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The authors acknowledge use of the King’s Computational Research, Engineering and Technology Environment (CREATE) https://doi.org/10.18742/rnvf-m076 . This work was supported by resources provided by the Pawsey Supercomputing Research Centre with funding from the Australian Government and the Government of Western Australia. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.
Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12/21
Y1 - 2023/12/21
N2 - Amyotrophic lateral sclerosis (ALS) displays considerable clinical and genetic heterogeneity. Machine learning approaches have previously been utilised for patient stratification in ALS as they can disentangle complex disease landscapes. However, lack of independent validation in different populations and tissue samples have greatly limited their use in clinical and research settings. We overcame these issues by performing hierarchical clustering on the 5000 most variably expressed autosomal genes from motor cortex expression data of people with sporadic ALS from the KCL BrainBank (N = 112). Three molecular phenotypes linked to ALS pathogenesis were identified: synaptic and neuropeptide signalling, oxidative stress and apoptosis, and neuroinflammation. Cluster validation was achieved by applying linear discriminant analysis models to cases from TargetALS US motor cortex (N = 93), as well as Italian (N = 15) and Dutch (N = 397) blood expression datasets, for which there was a high assignment probability (80–90%) for each molecular subtype. The ALS and motor cortex specificity of the expression signatures were tested by mapping KCL BrainBank controls (N = 59), and occipital cortex (N = 45) and cerebellum (N = 123) samples from TargetALS to each cluster, before constructing case-control and motor cortex-region logistic regression classifiers. We found that the signatures were not only able to distinguish people with ALS from controls (AUC 0.88 ± 0.10), but also reflect the motor cortex-based disease process, as there was perfect discrimination between motor cortex and the other brain regions. Cell types known to be involved in the biological processes of each molecular phenotype were found in higher proportions, reinforcing their biological interpretation. Phenotype analysis revealed distinct cluster-related outcomes in both motor cortex datasets, relating to disease onset and progression-related measures. Our results support the hypothesis that different mechanisms underpin ALS pathogenesis in subgroups of patients and demonstrate potential for the development of personalised treatment approaches. Our method is available for the scientific and clinical community at https://alsgeclustering.er.kcl.ac.uk.
AB - Amyotrophic lateral sclerosis (ALS) displays considerable clinical and genetic heterogeneity. Machine learning approaches have previously been utilised for patient stratification in ALS as they can disentangle complex disease landscapes. However, lack of independent validation in different populations and tissue samples have greatly limited their use in clinical and research settings. We overcame these issues by performing hierarchical clustering on the 5000 most variably expressed autosomal genes from motor cortex expression data of people with sporadic ALS from the KCL BrainBank (N = 112). Three molecular phenotypes linked to ALS pathogenesis were identified: synaptic and neuropeptide signalling, oxidative stress and apoptosis, and neuroinflammation. Cluster validation was achieved by applying linear discriminant analysis models to cases from TargetALS US motor cortex (N = 93), as well as Italian (N = 15) and Dutch (N = 397) blood expression datasets, for which there was a high assignment probability (80–90%) for each molecular subtype. The ALS and motor cortex specificity of the expression signatures were tested by mapping KCL BrainBank controls (N = 59), and occipital cortex (N = 45) and cerebellum (N = 123) samples from TargetALS to each cluster, before constructing case-control and motor cortex-region logistic regression classifiers. We found that the signatures were not only able to distinguish people with ALS from controls (AUC 0.88 ± 0.10), but also reflect the motor cortex-based disease process, as there was perfect discrimination between motor cortex and the other brain regions. Cell types known to be involved in the biological processes of each molecular phenotype were found in higher proportions, reinforcing their biological interpretation. Phenotype analysis revealed distinct cluster-related outcomes in both motor cortex datasets, relating to disease onset and progression-related measures. Our results support the hypothesis that different mechanisms underpin ALS pathogenesis in subgroups of patients and demonstrate potential for the development of personalised treatment approaches. Our method is available for the scientific and clinical community at https://alsgeclustering.er.kcl.ac.uk.
KW - amyotrophic lateral sclerosis
KW - Unsupervised and supervised machine learning
KW - Precision medicine
KW - Transcriptomics
KW - Patient stratification
KW - Biomarkers
UR - http://www.scopus.com/inward/record.url?scp=85180173144&partnerID=8YFLogxK
U2 - 10.1186/s40478-023-01686-8
DO - 10.1186/s40478-023-01686-8
M3 - Article
SN - 2051-5960
VL - 11
JO - Acta Neuropathologica Communications
JF - Acta Neuropathologica Communications
IS - 1
M1 - 208
ER -