TY - CHAP
T1 - A Parallel Algorithm for Fixed-Length Approximate String-Matching with k-mismatches
AU - Crochemore, Maxime
AU - Iliopoulos, Costas S.
AU - Pissis, Solon
PY - 2010
Y1 - 2010
N2 - This paper deals with the approximate string-matching problem with Hamming distance. The approximate string-matching with k-mismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. The approximate string-matching algorithms have both pleasing theoretical features, as well as direct applications, especially in computational biology. We consider a generalisation of this problem, the fixed-length approximate string-matching with k-mismatches problem: given a text t, a pattern x and an integer E, search for all the occurrences in t of all factors of x of length l with k or fewer mismatches with a factor of t. We present a practical parallel algorithm of comparable simplicity that requires only O(nm inverted right perpendicularl/winverted left perpendicular/p)) time, where w is the word size of the machine (e.g. 32 or 64 in practice) and p the number of processors. Thus the algorithm's performance is independent of k and the alphabet size vertical bar Sigma vertical bar. The proposed parallel algorithm makes use of message-passing parallelism model, and word-level parallelism for efficient approximate string-matching.
AB - This paper deals with the approximate string-matching problem with Hamming distance. The approximate string-matching with k-mismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. The approximate string-matching algorithms have both pleasing theoretical features, as well as direct applications, especially in computational biology. We consider a generalisation of this problem, the fixed-length approximate string-matching with k-mismatches problem: given a text t, a pattern x and an integer E, search for all the occurrences in t of all factors of x of length l with k or fewer mismatches with a factor of t. We present a practical parallel algorithm of comparable simplicity that requires only O(nm inverted right perpendicularl/winverted left perpendicular/p)) time, where w is the word size of the machine (e.g. 32 or 64 in practice) and p the number of processors. Thus the algorithm's performance is independent of k and the alphabet size vertical bar Sigma vertical bar. The proposed parallel algorithm makes use of message-passing parallelism model, and word-level parallelism for efficient approximate string-matching.
UR - http://www.scopus.com/inward/record.url?scp=78650438144&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-12476-1_6
DO - 10.1007/978-3-642-12476-1_6
M3 - Conference paper
SN - 9783642124754
VL - N/A
T3 - Lecture Notes in Computer Science
SP - 92
EP - 101
BT - Algorithms and Applications
A2 - Elomaa, Tapio
A2 - Mannila, Heikki
A2 - Orponen, Pekka
PB - Springer
CY - Berlin
T2 - Workshop on Algorithms and Applications Dedicated to Esko Ukkonen for his 60th Birthday
Y2 - 28 May 2010
ER -