Statistical Dissection of Pharmacogenetics with Machine Learning

Student thesis: Doctoral Thesis, Doctor of Philosophy

Abstract

A major challenge for personalized medicine is to identify biomarkers that predict response to therapeutics. Current experience indicates that few pharmacogenetic biomarkers are individually predictive, since such biomarkers usually lack the sensitivity and specificity to achieve a clinically meaningful prediction on their own. The future lies in combining multiple biomarkers with clinical and other characteristics, with the aim of producing multivariable prediction algorithms that can serve as decision support tools for personalizing medicine. This PhD project approaches pharmacogenetics from two different angles. First, we investigate specific hurdles to the adoption of genetic tests to guide pharmaceutical treatment. We study the characteristics of a pharmacogenetic test that predicts the development of a serious adverse drug reaction caused by the antipsychotic clozapine, and we model the requirements for the test to be clinically useful. In addition, we assess the cost-effectiveness of pharmacogenetic testing by means of a literature review and estimate how this would change in a future where genetic information is available at no additional cost at the time of prescribing, for example via an electronic health record. Second, we apply machine learning methods to predict pharmacogenetic responses in Phase 2 clinical trials. Genetic studies are high-dimensional, and genetic variants are traditionally investigated independently in univariate analyses. Alternatively, machine learning algorithms can be used to model large numbers of variables simultaneously, even if these variables are correlated, as is the case for genetic variants. These methods optimize predictive ability, and a wide range of linear and non-linear algorithms exists.
Two clinical trials and one gene expression case-control study in different disease areas were analysed using machine learning (elastic net, random forest, support vector machine) and deep learning (neural network) methods with the aim of predicting efficacy and safety measures using genetic and clinical baseline variables. We compare the predictive ability of the algorithms used and evaluate the strengths and weaknesses of their application to pharmacogenetic problems.
Date of Award: 2017
Original language: English
Awarding Institution
  • King's College London
Supervisors: Cathryn Lewis, Mike Weale & Raquel Iniesta
