TY - CHAP
T1 - Multi-omic Data Integration and Feature Selection for Survival-Based Patient Stratification via Supervised Concrete Autoencoders
AU - da Costa Avelar, Pedro Henrique
AU - Laddach, Roman
AU - Karagiannis, Sophia N.
AU - Wu, Min
AU - Tsoka, Sophia
N1 - Funding Information:
Acknowledgements. We would like to thank Dr Jonathan Cardoso-Silva for fruitful conversations, and João Nuno Beleza Oliveira Vidal Lourenço for designing the diagrams. P.H.C.A. acknowledges that during his stay at KCL and A*STAR he was partly funded by King’s College London and the A*STAR Research Attachment Programme (ARAP). The research was also supported by the National Institute for Health Research Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust and King’s College London (IS-BRC-1215-20006). The authors are solely responsible for study design, data collection, analysis, decision to publish, and preparation of the manuscript. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. This work used King’s CREATE compute cluster for its experiments [18]. The results shown here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.
Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
AB - Cancer is a complex disease with significant social and economic impact. Advancements in high-throughput molecular assays and the reduced cost of performing high-quality multi-omic measurements have fuelled insights through machine learning. Previous studies have shown promise in using multiple omic layers to predict survival and stratify cancer patients. In this paper, we develop and report a Supervised Autoencoder (SAE) model for survival-based multi-omic integration, which improves upon previous work, as well as a Concrete Supervised Autoencoder model (CSAE), which uses feature selection to jointly reconstruct the input features and predict survival. Our results show that our models either outperform or are on par with some of the most commonly used baselines, while either providing a better survival separation (SAE) or being more interpretable (CSAE). Feature selection stability analysis on our models shows a power-law relationship with features commonly associated with survival. The code for this project is available at: https://github.com/phcavelar/coxae.
KW - Concrete autoencoders
KW - Multi-omic integration
KW - Supervised autoencoders
KW - Survival prediction
KW - Survival stratification
UR - http://www.scopus.com/inward/record.url?scp=85151057038&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-25891-6_5
DO - 10.1007/978-3-031-25891-6_5
M3 - Conference paper
AN - SCOPUS:85151057038
SN - 9783031258909
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 47
EP - 61
BT - Machine Learning, Optimization, and Data Science - 8th International Workshop, LOD 2022, Revised Selected Papers
A2 - Nicosia, Giuseppe
A2 - Giuffrida, Giovanni
A2 - Ojha, Varun
A2 - La Malfa, Emanuele
A2 - La Malfa, Gabriele
A2 - Pardalos, Panos
A2 - Di Fatta, Giuseppe
A2 - Umeton, Renato
PB - Springer Science and Business Media Deutschland GmbH
T2 - 8th International Conference on Machine Learning, Optimization, and Data Science, LOD 2022, held in conjunction with the 2nd Advanced Course and Symposium on Artificial Intelligence and Neuroscience, ACAIN 2022
Y2 - 18 September 2022 through 22 September 2022
ER -