Objective: Routine haematoxylin and eosin (H&E) photomicrographs from human papillomavirus-associated oropharyngeal squamous cell carcinomas (HPV + OpSCC) contain a wealth of prognostic information. In this study, we developed a high content image analysis (HCIA) workflow to quantify features of H&E images from HPV + OpSCC patients, identify prognostic features and predict patient outcomes. Methods: First, we developed an open-source HCIA tool for single-cell segmentation and classification of H&E images. We then used this tool to analyse a set of 889 images from diagnostic H&E slides in a retrospective cohort of HPV + OpSCC patients with favourable (FO, n = 60) or unfavourable (UO, n = 30) outcomes. We identified 31 prognostic features, quantified them in each sample and used them to train a neural network (NN) model to predict patient outcomes. Results: Univariate and multivariate statistical analyses revealed significant differences between FO and UO patients in 31 and 17 variables, respectively (P < 0.05). At the single-image level, the NN model had an overall accuracy of 72.5% and 71.2% in recognising FO and UO patients when applied to the test and validation sets, respectively. When considering 10 images per patient, the accuracy of the NN model increased to 86.7% in the test set. Conclusion: Our open-source H&E analysis workflow and predictive models confirm previously reported prognostic features and identify novel factors which predict HPV + OpSCC outcomes with promising accuracy. Our work supports the use of machine learning in digital pathology to exploit clinically relevant features in routine diagnostic pathology without additional biomarkers.
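The abstract does not state how the 10 images per patient were combined into a single prediction. As a minimal sketch of one common approach, and purely as an assumption, the Python below averages per-image probabilities of an unfavourable outcome for each patient and applies a threshold. The function name `aggregate_patient_predictions`, the input layout and the 0.5 cut-off are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def aggregate_patient_predictions(image_probs, threshold=0.5):
    """Combine per-image predictions into one label per patient.

    image_probs: iterable of (patient_id, p_uo) pairs, where p_uo is a
    model's predicted probability of an unfavourable outcome (UO) for a
    single image. Mean pooling and the threshold are illustrative
    assumptions, not the published method.
    """
    by_patient = defaultdict(list)
    for patient_id, p_uo in image_probs:
        by_patient[patient_id].append(p_uo)

    # Average the image-level probabilities, then threshold per patient.
    return {
        patient_id: "UO" if mean(probs) >= threshold else "FO"
        for patient_id, probs in by_patient.items()
    }


# Hypothetical example values, for illustration only.
example = [("A", 0.2), ("A", 0.7), ("A", 0.3), ("B", 0.9), ("B", 0.6)]
labels = aggregate_patient_predictions(example)
```

Averaging over several images tends to smooth out noisy single-image predictions, which is consistent with, though not confirmed as the reason for, the higher patient-level accuracy reported above.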
|Early online date||23 Apr 2023|
|Publication status||Published - Jun 2023|