Skip to main content
Log in

Scoring Unusual Words with Varying Mismatch Errors

  • Published:
Mathematics in Computer Science Aims and scope Submit manuscript

Abstract.

Patterns consisting of strings with a bounded number of mismatches are central to coding theory and find multiple applications in text processing and computational biology. In this latter field, the presence of over-represented patterns of this kind has been linked, for instance, to modeling regulatory regions in biosequences. The study and computation of expected number of occurrences and related scores for these patterns is made difficult by the sheer explosion of the roster of candidates that need to be evaluated. In recent work, properties of pattern saturation and score monotonicity have proved capable to mitigate this problem. In such a context, expectation and score monotonicity has been established within the i.i.d. model for all cases of interest except that of a fixed word length with a varying number of mismatches. The present paper completes this investigation by showing that the expected number of occurrences in a textstring for such a word is bi-tonic, that is, behaves as a unimodal function of the number of errors. This extends to this case the time and space savings brought about by discovery algorithms based on pattern saturation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Alberto Apostolico.

Additional information

Work Supported in part by the Italian Ministry of University and Research under the Bi-National Project FIRB RBIN04BYZ7, and by the Research Program of Georgia Tech. An extended abstract related to this work was presented at the Dagstuhl Seminar Dagstuhl on “Combinatorial and Algorithmic Foundations of Pattern and Association Discovery”, May 14-19, 2006 [3].

Rights and permissions

Reprints and permissions

About this article

Cite this article

Apostolico, A., Pizzi, C. Scoring Unusual Words with Varying Mismatch Errors. Math.comput.sci. 1, 639–653 (2008). https://doi.org/10.1007/s11786-007-0032-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11786-007-0032-4

Mathematics Subject Classification (2000).

Keywords.

Navigation