Machine Learning based image analysis approaches for cancer biomarker discovery from tissue sections

Student thesis: Doctoral Thesis, Doctor of Philosophy


Histological analysis of tissue biopsy samples by a pathologist is currently the gold standard for the diagnosis and prognosis of a host of diseases, including cancer.

The emergence of automated whole slide imaging and advances in image analysis software are profoundly changing histology. Computational pathology tools increase throughput, add a quantitative dimension, and reduce the time burden on pathologists. Importantly, they also open new opportunities to address critical diagnostic, prognostic, and biological questions about how structural change in tissue relates to a patient’s disease. There are currently no clinically approved tools that combine automated digital evaluation of tissue morphology with immunohistochemical biomarkers for disease on a glass slide.

This thesis describes several applications within the advancing field of high-throughput whole slide image analysis and explores innovative directions for pathological analysis. A fundamental notion pursued here was that computational methods could be developed to supplement chromogen staining measurements with information on histopathological tissue type. Specifically, by applying computational approaches to chromogen-stained histological sections, it would be possible to define biomarkers for prostate cancer diagnosis more accurately.

Here I have focused on developing, improving, and validating the digital imaging, analysis, and interpretation of pathological tissue samples. The thesis is divided into three sections. The first section details quantitative image analysis of whole slides and tissue arrays to gain biological insights into protein expression changes in cancer, and describes software I developed for this purpose. The second section explores the application of machine learning semantic image segmentation models to identify areas of histopathological significance in prostate tissue, in order to answer biological questions about the localisation of protein immunostaining patterns. The third section explores the failure points that may occur when a deep learning model is deployed across a range of locations and devices, applying adversarial attacks to expose weaknesses in the trained convolutional neural networks.

An overarching theme explored in my research is how whole slide imaging and analysis can be combined with measurements of immunohistochemical staining to inform disease outcome and biological expression changes in prostate cancer. I provide evidence of how tissue array analysis can be applied in conjunction with machine learning to screen for protein biomarkers in a range of diseases. The aim of these protein expression screens is to identify dysregulated expression of a target protein in the presence of disease. Applying this analysis pipeline to 15 Wingless-related integration site (Wnt) signalling-related proteins reveals a putative prostate cancer (PCa) suppressor protein (SOX-14), overexpression of c-Myc in stromal cells, and differential expression of several other protein targets in cancer. Furthermore, a pipeline has been developed to apply deep learning segmentation to histopathology images. This pipeline has been tested with adversarial attacks, a method of finding input images that cause the convolutional neural network to misclassify.

Date of Award: 1 Oct 2023
Original language: English
Awarding Institution
  • King's College London
Supervisors: Magnus Lynch & Aamir Ahmed
