On-line weighted pattern matching

Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis, Jakub Radoszewski*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

5 Citations (Scopus)

Abstract

A weighted sequence is a sequence of probability distributions over an alphabet of size σ. Weighted sequences arise naturally in many applications. We study the problem of weighted pattern matching in which we are given a string pattern P of length m, a weight threshold [Formula presented], and a weighted text X arriving on-line. We say that P occurs in X at position i if the product of probabilities of the letters of P at positions i−m+1,…,i in X is at least [Formula presented]. We first discuss how to apply a known general scheme that transforms off-line pattern matching algorithms to on-line algorithms to obtain an on-line algorithm that requires O((σ+log⁡z)log⁡m) or O(σlog2⁡m) time per arriving position; with the space requirement however being O(mmin⁡(σ,z)). Our main result is a new algorithm that processes each arriving position of X in O(z+σ) time using O(m+z) extra space.

Original languageEnglish
Pages (from-to)49-59
Number of pages11
JournalINFORMATION AND COMPUTATION
Volume266
Early online date3 Jan 2019
DOIs
Publication statusPublished - 1 Jun 2019

Keywords

  • On-line pattern matching
  • Position weight matrix (PWM)
  • String matching automaton
  • Uncertain sequence
  • Weighted sequence

Fingerprint

Dive into the research topics of 'On-line weighted pattern matching'. Together they form a unique fingerprint.

Cite this