Abstract
A broad variety of short-read alignment programmes has been released recently to address the task of mapping tens of millions of short reads to a reference genome, placing emphasis on various aspects of the problem. Although all programmes allow for a small number of alignment mismatches, some of them either perform poorly when allowing gap insertions or do not allow for gap insertions at all. The seed-and-extend strategy is applied in most of these programmes: after a fast alignment between a fragment of the reference sequence and a high-quality fragment of a short read (the seed), an important problem is to extend the alignment between a relatively short succeeding fragment of the reference sequence and the remaining low-quality fragment of the read, allowing a number of mismatches and the insertion of gaps in the alignment. However, the length of the short reads, in combination with the gap occurrence frequency observed in various applications, suggests that the single-gap alignment of (parts of) those reads is desirable.
In this article, we present libgapmis, an ultrafast library for pairwise short-read single-gap alignment including accelerated SSE-based and GPU-based versions. It implements an algorithm that computes a modified version of the traditional dynamic programming matrix for sequence alignment to solve the above alignment problem. We show that the library functions of the CPU-based version are up to 20x faster than competing programmes, while the SSE-based and GPU-based versions are up to 6x and 11x faster than our CPU-based implementation, respectively. The functions made available via our library can be seamlessly integrated into any short-read alignment pipeline.
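To make the single-gap alignment problem concrete, the following minimal Python sketch finds the cheapest alignment of a read fragment against the start of a reference fragment when at most one gap is allowed. It is a brute-force illustration only and is not the libgapmis algorithm, which computes a modified dynamic programming matrix. The function names (`hamming`, `single_gap_align`), the unit mismatch cost, and the affine gap penalty values are all assumptions chosen for the example.

```python
def hamming(a, b):
    """Count mismatching positions over the common length of a and b."""
    return sum(1 for c, d in zip(a, b) if c != d)


def single_gap_align(text, pattern, max_gap, gap_open=2, gap_extend=1):
    """Align `pattern` against the start of `text` with at most one gap.

    Hypothetical helper for illustration. Cost = mismatches + gap penalty,
    where the penalty for a gap of length g is gap_open + gap_extend * g
    (assumed values, not taken from the paper).

    Returns (cost, gap_in, gap_pos, gap_len), where gap_in is "none",
    "pattern" (gap characters inserted into the pattern) or "text".
    """
    m = len(pattern)
    if len(text) < m:
        raise ValueError("text must be at least as long as pattern")

    # Baseline: no gap at all, plain mismatch count.
    best = (hamming(text[:m], pattern), "none", 0, 0)

    for g in range(1, max_gap + 1):
        penalty = gap_open + gap_extend * g
        for p in range(1, m):  # the gap starts strictly inside the pattern
            prefix_cost = hamming(pattern[:p], text[:p])

            # Gap in the pattern: text[p:p+g] is left unmatched.
            if m + g <= len(text):
                cost = prefix_cost + hamming(pattern[p:], text[p + g:]) + penalty
                if cost < best[0]:
                    best = (cost, "pattern", p, g)

            # Gap in the text: pattern[p:p+g] is left unmatched.
            if p + g < m:
                cost = prefix_cost + hamming(pattern[p + g:], text[p:]) + penalty
                if cost < best[0]:
                    best = (cost, "text", p, g)

    return best


if __name__ == "__main__":
    # The reference fragment carries two extra bases ("AA") after position 4.
    print(single_gap_align("ACGTAAGCATCA", "ACGTGCATCA", max_gap=3))
```

The sketch enumerates every gap position and length, so it runs in roughly O(max_gap · m²) time for a pattern of length m. A dynamic programming formulation, as the abstract describes, avoids recomputing the prefix and suffix mismatch counts.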
Original language | English |
---|---|
Title of host publication | 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW) |
Editors | J Gao, W Dubitzky, C Wu, M Liebman, R Alhaij, L Ungar, A Christianson, Hu |
Place of Publication | Piscataway, N.J. |
Publisher | IEEE |
Pages | 688-695 |
Number of pages | 8 |
Volume | N/A |
Edition | N/A |
ISBN (Print) | 9781467327466 |
DOIs | |
Publication status | Published - 2012 |
Event | IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW) - Philadelphia, United States. Duration: 4 Oct 2012 → 7 Oct 2012 |
Publication series
Name | IEEE International Conference on Bioinformatics and Biomedicine-BIBM |
---|---|
Publisher | IEEE |
ISSN (Print) | 2156-1125 |
Conference
Conference | IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW) |
---|---|
Country/Territory | United States |
City | Philadelphia |
Period | 4/10/2012 → 7/10/2012 |
Keywords
- short reads
- alignment
- gap
- vector intrinsics
- SSE
- GPGPU
- SEQUENCE