A computational biology approach to studying the composition, function, and impact of the human mycobiome

Student thesis: Doctoral Thesis, Doctor of Philosophy


The human microbiome plays a crucial role in health and disease. The mycobiome is its least explored component, although current research has associated it with gastrointestinal, neurological, and immune disorders. Despite huge strides in high-throughput sequencing technologies and the associated computational and bioinformatics pipelines for investigating the microbiome, analysis of the mycobiome is still in its infancy owing to the absence of optimised extraction methods, a lack of fungal genome catalogues, and bottlenecks in bioinformatic processes. Fungi are complex organisms, and their diverse nature (e.g., morphology and cell wall structure) makes it difficult to find a universal extraction protocol for the mycobiome. Moreover, current taxonomic classifications of fungi rely on operational taxonomic units and amplicon-based marker regions for sequencing. 18S ribosomal DNA (18S) and internal transcribed spacer (ITS) regions are the current gold standards for fungal identification in mycobiome research. However, amplicon sequencing for mycobiome analysis suffers from primer bias, is limited to genus-level annotation, and is prone to false-positive detections. In contrast, shotgun sequencing delivers better downstream analysis of the mycobiome by offering higher taxonomic resolution, functional information, and the use of a customisable database tailored towards human-associated fungi. However, because of the lack of a universal fungal extraction protocol and the reliance on PCR, shotgun sequencing analysis is not currently the norm. Current reference databases cover the phylogenetic diversity of fungal species less completely than that of bacteria, and species-level annotation is very limited. Together, these limitations point to the need for an efficient metagenomic pipeline to analyse the mycobiome.

Systems biology provides a powerful platform to integrate multi-omics data and uncover the intricacies of genotype-phenotype relationships. Bringing together data from (meta)genomics, (meta)transcriptomics and metabolomics delivers essential resources for fungi that can be used to highlight critical biological pathways. Doing so enables the identification of mechanistic changes associated with virulence, which in turn can uncover diagnostic biomarkers and novel therapeutic targets. Notably, the effects of fungal metabolism on human health remain relatively unexplored. To enable an in-depth analysis of these processes, fungal-specific databases, pipelines, and analysis tools are needed to assist mycobiome research. In this study, the BioFung database was created to provide fungal-specific annotation of protein-encoding sequences, addressing the need for a more precise functional annotation database to understand genome variation and its pathogenic effects on human health and disease. The application of BioFung to Candida species, prominent members of the mycobiome, was demonstrated. This led to the discovery of differences in metabolic pathways in invasive species. Integration of metagenomics and metabolomics data sets confirmed the choline, polyamine and fatty acid biosynthesis pathways as potential targets for diagnostic markers in invasive Candida species.

To achieve optimised results for shotgun metagenomic sequencing of the mycobiome, it was established that an adapted PowerSoil Microbiome kit (Qiagen) provided sufficient yield and quality for large-scale mycobiome studies. Furthermore, the in-house fungal catalogue provided species-level identification of fungi more efficiently than currently available tools. The application of metagenomics to healthy oral and gut samples was further demonstrated, revealing the presence of new fungal genera in the oral and gut mycobiomes. Applying this in-house catalogue pipeline to a liver disease cohort showed compositional changes in the oral and gut mycobiomes of patients with increasing stages of liver failure. The application of machine learning techniques determined that the oral mycobiome is a potential predictor of liver disease. The severity of liver disease was indicated by Fusarium in the oral mycobiome, whereas Candida was a better indicator of liver disease in the gut mycobiome. These findings reveal the potential role the mycobiome plays in human health and disease.

This thesis provides novel methodologies, including a standardised extraction protocol for PCR-free fungal metagenomics and a bioinformatics pipeline with accurate species-level annotation using an in-house fungal catalogue. This in-house fungal catalogue was applied to a liver disease cohort, demonstrating fungal compositional changes in the progression of chronic liver disease. Finally, the development of a database (BioFung) allows fungal-specific annotation to investigate fungal biology and interactions with other species. This work applies that methodology to explore functional diversity in Candida species metabolism, highlighting potential pathways for biomarkers and therapeutic targets in invasive species. In summary, the BioFung database, the optimised metagenomic pipeline, and the bioinformatic analysis methods in this work pave the way for a better understanding of the crucial role of the mycobiome in human health and disease.

Date of Award: 1 Oct 2022
Original language: English
Awarding Institution
  • King's College London
Supervisor: Dave Moyes (Supervisor) & Saeed Shoaie (Supervisor)
