Some string matching problems from Bioinformatics which still need better solutions

https://doi.org/10.1016/S1570-8667(03)00062-5

Abstract

Bioinformatics, the discipline that studies the computational problems arising from molecular biology, poses many interesting problems to the string searching community. We describe two such problems, their preliminary solutions, and the more general problem that they pose. The first is searching for α-helices in protein sequences; the instance considered here is based on matching patterns of hydrophobicity and hydrophilicity. We give an algorithm that is linear in the sequence length n for a fixed helix length and O(n log n) for any helix length. The second problem is matching probabilistic sequences against sequences or against other probabilistic sequences. In both cases we derive efficient formulas to compute scores according to a Markovian model of evolution.
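
To make the first problem concrete, the following is a minimal sketch of hydrophobicity-based helix matching by a sliding window. It is not the algorithm of the paper, whose details the abstract does not give. The two-class residue set `HYDROPHOBIC`, the idealized template, and the function names `helix_template` and `helix_scores` are all assumptions made for illustration; the only structural fact used is that an α-helix has about 3.6 residues per turn, that is, roughly 100 degrees of rotation per residue.

```python
import math

# Assumption: a coarse two-class alphabet; real hydrophobicity scales are graded.
HYDROPHOBIC = set("AILMFVWC")


def helix_template(length):
    """Idealized amphipathic pattern: +1 on the hydrophobic face, -1 elsewhere.

    Assumes 100 degrees of rotation per residue (3.6 residues per turn).
    """
    return [1 if math.cos(math.radians(100 * i)) >= 0 else -1
            for i in range(length)]


def helix_scores(seq, length):
    """Score every window of `length` residues against the template.

    Runs in O(n * length) time, i.e. linear in n when `length` is fixed.
    """
    signal = [1 if c in HYDROPHOBIC else -1 for c in seq.upper()]
    template = helix_template(length)
    return [
        sum(s * t for s, t in zip(signal[start:start + length], template))
        for start in range(len(signal) - length + 1)
    ]


print(helix_scores("ALKKLA", 4))  # prints [0, 4, 0]
```

In this toy run the window `LKKL` agrees with the template at all four positions, so it receives the maximum score of 4. The sketch only illustrates the fixed-length, linear-time case; the O(n log n) bound for arbitrary helix length is not reproduced here.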

Keywords

String matching
Bioinformatics
Alpha-helix prediction
Pattern matching
Homology detection
Profile searching
Probabilistic sequence searching
Probabilistic sequence matching
