King's College London


Algorithms for the analysis of molecular sequences

Student thesis: Doctoral Thesis, Doctor of Philosophy

DNA sequencing is the translation of molecular structure into a human- and machine-readable format: a sequence, or string, of letters. The exponential growth of data (from DNA, RNA, and proteins) produced by biotechnology has raised two major scientific questions. First, what conclusions can we draw from all of the data that we have? Second, how can we draw them efficiently? The answers to these questions lie where computer science and biological science meet: in the research field of bioinformatics. Part of the appeal of these fields is their close resemblance to stringology, the analysis of strings.
The research presented in this thesis lies within the intersection of computational molecular biology and stringology. Specifically, the aim was to design string-processing algorithms to analyse molecular sequences, in order to aid and enhance biological research. This thesis is an exploration of three important concepts in molecular biology: circular molecules, sequence motifs, and pan-genomes. In Chapter 2, we study the problem of accuracy when aligning two linear sequences obtained from circular molecular structures. Chapter 3 focuses on common, and thus biologically important, patterns found in molecular sequences. Lastly, in Chapter 4, we consider the complexities of handling pan-genomic data.
Original language: English
Awarding Institution
Award date: 1 Dec 2019


