Abstract
Novel high-throughput (Deep) sequencing technologies have redefined the way genome sequencing is performed. They can produce millions of short sequences in a single experiment at a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping millions of short sequences to a reference genome and classifying them according to whether they occur exactly once in the genome or not, while taking probability scores into consideration. In particular, we design algorithms for Massive Exact and Approximate Pattern Matching of short degenerate and weighted sequences, derived from Deep sequencing technologies, to a reference genome.
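To make the classification task concrete, the following is a minimal sketch of the exact-matching case only. It is not the algorithm designed in the paper: it assumes a naive hash index over all substrings of a given length, and the names `build_index` and `classify_reads` are hypothetical. Degenerate symbols, probability scores, and approximate matching are deliberately left out.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map every length-k substring of the genome to its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def classify_reads(genome, reads):
    """Label each read as 'unique', 'repeated', or 'absent' in the genome.

    Illustrative only: one index is built per distinct read length,
    which is simple but memory-hungry for a real genome.
    """
    indexes = {}
    result = {}
    for read in reads:
        k = len(read)
        if k not in indexes:
            indexes[k] = build_index(genome, k)
        positions = indexes[k].get(read, [])
        if len(positions) == 1:
            label = "unique"      # occurs exactly once
        elif positions:
            label = "repeated"    # occurs more than once
        else:
            label = "absent"      # no exact occurrence
        result[read] = (label, positions)
    return result


if __name__ == "__main__":
    genome = "ACGTACGTTA"
    for read, (label, positions) in classify_reads(
        genome, ["CGTT", "ACGT", "GGGG"]
    ).items():
        print(read, label, positions)
```

A hash index keeps the sketch short. At the scale of millions of reads against a full genome, a more compact index would be needed, which this sketch does not attempt.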
Original language | Undefined/Unknown |
---|---|
Pages (from-to) | 385 - 397 |
Number of pages | 13 |
Journal | International Journal of Computational Biology and Drug Design |
Volume | 2 |
Issue number | 4 |
Publication status | Published - 2010 |