TY - JOUR
T1 - Extending alignments with k-mismatches and ℓ-gaps
AU - Barton, Carl
AU - Iliopoulos, Costas
AU - Lee, Inbok
AU - Mouchard, Laurent
AU - Park, Kunsoo
AU - Pissis, Solon
PY - 2014/3/13
Y1 - 2014/3/13
N2 - Recently, the problem of extending an alignment with k-mismatches and a single gap for pairwise sequence alignment was introduced (Flouri et al., 2011). The authors considered the problem of extending an alignment under the Hamming distance model while also allowing the insertion of a single gap, and presented a Θ(mβ)-time algorithm to solve it, where m is the length of the shorter sequence to be extended and β is the maximum allowed length of the single gap. It was subsequently shown (Flouri et al., 2012) that this problem is directly motivated by the next-generation re-sequencing application: aligning tens of millions of short DNA sequences against a reference genome. In this article, we consider an extension of this problem, namely extending an alignment with k-mismatches and two gaps, and present a Θ(mβ)-time algorithm to solve it. This extension has been shown to be fundamental in the next-generation re-sequencing application (Alachiotis et al., 2012). In addition, we generalise our solution to the problem of extending an alignment with k-mismatches and ℓ-gaps, which it solves in Θ(mβℓ) time. The presented solutions require that all gaps in the alignment occur in only one of the two sequences.
AB - Recently, the problem of extending an alignment with k-mismatches and a single gap for pairwise sequence alignment was introduced (Flouri et al., 2011). The authors considered the problem of extending an alignment under the Hamming distance model while also allowing the insertion of a single gap, and presented a Θ(mβ)-time algorithm to solve it, where m is the length of the shorter sequence to be extended and β is the maximum allowed length of the single gap. It was subsequently shown (Flouri et al., 2012) that this problem is directly motivated by the next-generation re-sequencing application: aligning tens of millions of short DNA sequences against a reference genome. In this article, we consider an extension of this problem, namely extending an alignment with k-mismatches and two gaps, and present a Θ(mβ)-time algorithm to solve it. This extension has been shown to be fundamental in the next-generation re-sequencing application (Alachiotis et al., 2012). In addition, we generalise our solution to the problem of extending an alignment with k-mismatches and ℓ-gaps, which it solves in Θ(mβℓ) time. The presented solutions require that all gaps in the alignment occur in only one of the two sequences.
U2 - 10.1016/j.tcs.2013.06.012
DO - 10.1016/j.tcs.2013.06.012
M3 - Article
VL - 525
JO - Theoretical Computer Science
JF - Theoretical Computer Science
ER -