Abstract

Large-scale analyses of omics data are crucial for advancing precision medicine and personalised treatments. Current methods for unravelling cancer development and prognosis face a trade-off between interpretability and representational power: most either apply post-hoc interpretability techniques to black-box models or use linear models that may miss complex biological interactions. We propose PAAE, a novel, configurable, prior-knowledge-based deep auto-encoding framework, and its generative variant PAVAE, for analysing cancer RNA-seq data. Our method constrains its learned internal representation with biological pathways, providing interpretability without sacrificing predictive power. We test our models on three downstream tasks: cancer subtype classification, survival analysis, and unsupervised clustering of the learned features. Our models outperform baselines while having orders of magnitude fewer parameters than naive models. Extensive interpretability analyses, including task-relevant feature identification, demonstrate our models' effectiveness at identifying underlying biological signals in an unsupervised fashion. Visualisations are used to highlight the intrinsic interpretability of our models. The source code of this study is available at github.com/phcavelar/pathwayae.
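To make the idea of a pathway-constrained representation concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). It assumes one common way of imposing such a constraint, a binary gene-to-pathway membership mask applied to a linear layer, and all gene names, pathway names, and function names here are hypothetical.

```python
import random

# Hypothetical toy data: 6 genes and 2 pathways (names are illustrative only).
GENES = ["g1", "g2", "g3", "g4", "g5", "g6"]
PATHWAYS = {
    "pathway_A": ["g1", "g2", "g3"],
    "pathway_B": ["g3", "g4", "g5", "g6"],
}


def build_mask(genes, pathways):
    """Binary membership mask: mask[p][g] is 1 if gene g belongs to pathway p."""
    return [[1 if g in members else 0 for g in genes]
            for members in pathways.values()]


def init_weights(mask, seed=0):
    """Small random weights, zeroed wherever the mask forbids a connection."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) * m for m in row] for row in mask]


def pathway_activity(expression, weights, mask):
    """Masked linear map from gene expression to one score per pathway."""
    return [sum(w * m * x for w, m, x in zip(w_row, m_row, expression))
            for w_row, m_row in zip(weights, mask)]


mask = build_mask(GENES, PATHWAYS)
weights = init_weights(mask)
sample = [1.0, 0.5, 2.0, 0.0, 1.5, 0.3]  # made-up expression values
scores = pathway_activity(sample, weights, mask)

# Only gene-pathway pairs allowed by the mask carry a trainable weight.
n_masked = sum(sum(row) for row in mask)
n_dense = len(GENES) * len(PATHWAYS)
print(f"pathway scores: {scores}")
print(f"masked weights: {n_masked}, dense weights: {n_dense}")
```

In this toy setting each output unit corresponds to a named pathway and depends only on that pathway's member genes, which is what makes the representation readable. The mask also leaves fewer weights than an unconstrained layer of the same shape, which illustrates, under the stated assumption, how a pathway constraint can reduce parameter count.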
Original language: English
Title of host publication: IEEE International Conference on Bioinformatics and Biomedicine
Publication status: Accepted/In press - 2024

Publication series

Name: IEEE International Conference on Bioinformatics and Biomedicine

Title: Pathway Activity Autoencoders for Enhanced Omics Analysis and Clinical Interpretability