Protein-Protein Interaction Networks and Impact of Disease-related Mutations.

Student thesis: Doctoral Thesis, Doctor of Philosophy


Numerous studies have suggested a correlation between the stability of protein complexes, the importance of Protein-Protein Interactions (PPIs), and the resulting molecular mechanisms underlying protein function. In particular, the three-dimensional (3D) properties of protein binding interfaces are thought to play key roles in mediating biological activities and regulating cellular functions. Alteration of binding interfaces can disrupt the biological system of the cell and consequently result in different phenotypic traits or diseases. The rapid growth of biologically relevant information contributed by large-scale sequencing projects has paved the way to insights into the relationship between genotype and phenotype. It is therefore important to combine all the relevant available information effectively, in order to extract principal rules that guide our understanding of the underlying molecular and atomistic mechanisms. It is thus timely to implement large-scale studies on the role of gene variants in Protein-Protein Interaction Networks (PPINs) and, more specifically, in protein complexes. The objective of this project is a proteome-wide analysis of protein interaction data by mapping human genetic variation data onto structurally determined binary protein complexes. 3D PPINs were used as a tool to extract associations between human proteins and to gain insight into the molecular features of human genetic variation. A comprehensive literature review on the topic of PPINs is given in Chapter 1 ("Introduction to Protein-Protein Interactions and Networks"). Non-synonymous Single Nucleotide Polymorphisms (nsSNPs) were the main focus of this study, since they can directly cause conformational changes in proteins or failures in forming protein complexes. Two disease-related nsSNP datasets were investigated: a) germ-line disease nsSNPs and b) somatic cancer nsSNPs.
A set of nsSNPs known not to be related to disease was used as the background against which the chosen disease nsSNPs were compared, in order to highlight the characteristic properties of disease nsSNPs. A survey of human genetic variation and the current state of related studies is presented in Chapter 2 ("Human Gene Variants"). The study of inter-domain disordered regions was also included in this project, as recent studies have suggested their importance in regulating biological functions. An introduction to protein "Intrinsic Disorder" is also presented in Chapter 2. An automated pipeline was developed in this study to generate structure-integrated PPINs at the protein domain level, map nsSNPs onto these structures, and classify the nsSNPs. The collected nsSNP datasets are classified by their occurrence in different protein regions: surface, interface, core and disordered. Details of the pipeline development are given in Chapter 4 ("Pipeline to Generate 3D Protein-Protein Interaction Networks"). The interface regions showed a previously documented enrichment in disease-related nsSNPs. In addition, our results showed that germ-line disease nsSNPs and somatic cancer nsSNPs exhibit distinctive features in terms of their physico-chemical preferences and functional specificity. This may suggest that these two types of disease-related nsSNPs affect cellular functions through different mechanisms. Moreover, the functions of the affected proteins were found to be highly related to the types of disease that the germ-line nsSNPs lead to. These results are presented together in Chapter 3 ("Computational Analyses of Disease-related Variants").
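
The region classification and the interface enrichment described above can be illustrated with a minimal sketch. This is not the pipeline developed in the thesis: the accessibility cut-offs, the function names, and the toy variants below are all assumptions made purely for illustration, and the abstract does not state which criteria or statistics were actually used. The sketch assumes each variant position comes with a relative solvent accessibility (RSA) in the free chain and in the complex, plus a disorder flag.

```python
from collections import Counter

# Hypothetical cut-offs; the abstract does not state the values used in the thesis.
EXPOSED_RSA = 0.15   # RSA at or above which a residue counts as solvent exposed
BURIAL_DELTA = 0.05  # minimum RSA loss on complex formation to call a residue "interface"


def classify_residue(rsa_free, rsa_bound, disordered=False):
    """Assign a residue to one of: disordered, interface, surface, core."""
    if disordered:
        return "disordered"
    if rsa_free - rsa_bound >= BURIAL_DELTA:
        return "interface"
    if rsa_free >= EXPOSED_RSA:
        return "surface"
    return "core"


def region_counts(variants):
    """Count variants per region; each variant is (rsa_free, rsa_bound, disordered)."""
    return Counter(classify_residue(*v) for v in variants)


def odds_ratio(disease, neutral, region):
    """Odds of a disease variant lying in `region` relative to a neutral variant.

    A 0.5 continuity correction is added to every cell to avoid division by zero.
    """
    a = disease[region] + 0.5
    b = sum(disease.values()) - disease[region] + 0.5
    c = neutral[region] + 0.5
    d = sum(neutral.values()) - neutral[region] + 0.5
    return (a * d) / (b * c)


# Made-up toy variants, not data from the thesis.
disease_toy = [(0.40, 0.10, False), (0.35, 0.05, False),
               (0.05, 0.05, False), (0.50, 0.50, False)]
neutral_toy = [(0.60, 0.60, False), (0.45, 0.45, False),
               (0.30, 0.02, False), (0.70, 0.70, True)]

disease_counts = region_counts(disease_toy)
neutral_counts = region_counts(neutral_toy)
interface_or = odds_ratio(disease_counts, neutral_counts, "interface")
```

An odds ratio above 1 for the interface region would correspond to the kind of enrichment reported above; a real analysis would also need a significance test and a much larger background set.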
Date of Award: 2014
Original language: English
Awarding Institution
  • King's College London
Supervisor: Sophia Tsoka (Supervisor), Franca Fraternali (Supervisor) & Frederic Festy (Supervisor)
