Long-range regulatory interactions in acute promyelocytic leukemia

Student thesis: Doctoral Thesis, Doctor of Philosophy


The PML-RARA fusion protein is the hallmark driver of acute promyelocytic leukemia (APL). The fusion disrupts retinoic acid signalling, leading to large-scale gene expression changes, with uncontrolled proliferation and a differentiation block. In contemporary clinical trials, over 90% of APL patients have an excellent clinical outcome; however, this does not apply to all patients in the real world. Aspects of the disease still pose challenges in the clinic, such as the high incidence of early death due to bleeding complications and the increased risk of relapse in a subset of patients not captured in trial data. The global regulatory mechanisms employed by PML-RARA remain incompletely understood. This thesis therefore aimed to interrogate in detail the global orchestration driven by PML-RARA, both in a cell line model and in primary patient samples. Using a PML-RARA-inducible cell line model (U937-PR9), multi-omic datasets were generated and integrated to create a genome-wide map of PML-RARA-driven transcriptional misregulation. Transcriptional changes were analysed with RNA-seq, PML-RARA binding sites were mapped using CUT&RUN, chromatin accessibility changes were profiled with ATAC-seq, and changes in regulatory interactions were identified with promoter capture Hi-C. Integration of these datasets revealed that PML-RARA employs multiple distinct mechanisms to regulate gene expression, including the activation of key proliferation- and coagulation-associated genes through chromatin remodelling; however, the exact context dictating the direction of transcriptional change was not immediately apparent. Machine learning was applied to test whether the combinatorial effect of other transcription factors (TFs) near PML-RARA binding sites was responsible for the distinct regulatory outcomes.
This analysis identified several complex TF-motif combinations that predicted outcomes with high accuracy, suggesting that PML-RARA may cooperate with multiple TFs in a context-dependent way to exert varying transcriptional control. The findings in the U937-PR9 cell line were then investigated in primary patient samples. RNA-seq and capture Hi-C experiments were carried out in samples from two patients with the t(15;17) translocation. These experiments corroborated that genes upregulated by PML-RARA-dependent chromatin remodelling remain engaged and highly expressed in patients. Additional experiments were carried out in high-risk APL samples and integrated with existing transcriptomic and epigenetic datasets. In-depth comparisons of high-risk and low-risk regulatory profiles highlighted fundamentally mis-regulated pathways and gene sets that may be core drivers of high-risk disease. The final part of this thesis aimed to employ machine learning to deconvolute the complex rules that dictate transcriptional outcome. The datasets generated throughout this thesis were used to train machine learning models to predict the relative expression level of a gene based on its chromatin interaction, accessibility and binding site repertoire. The preliminary predictive models performed with very high accuracy, and interrogation of what the models had learned highlighted interpretable patterns of transcriptional regulation that drive specific gene sets in the cell type or disease of interest.
Date of Award: 1 Nov 2021
Original language: English
Awarding Institution
  • King's College London
Supervisors: Cameron Osborne & Richard Dillon
