Abstract
Comparing pairs of sequences is a problem that arises in several application areas (ranging from molecular biology to signal processing, text retrieval, and intrusion detection, to cite just a few), and important results have been achieved over the years. However, most of the algorithms in the literature rely on the assumption that matching symbols (or at least a substitution schema among them) are known in advance. This paper opens the way to a more involved mechanism for sequence comparison, where determining the best substitution schema is also part of the matching problem. The basic idea is that any symbol of one sequence can be correlated with many symbols of the other sequence, provided each correlation occurs frequently over the various positions. The approach fits a variety of problems that are difficult to handle with classical techniques, particularly where the strings to be matched are defined over different alphabets.
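The basic idea can be illustrated with a minimal sketch. It is not the authors' algorithm, whose formal definition is not available in this preview. It assumes two sequences of equal length compared position by position, with no gaps or alignment search, and treats a symbol pair as a correlation when it co-occurs at no fewer than `min_count` positions. The names `frequent_correlations`, `similarity`, and `min_count` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def frequent_correlations(s, t, min_count):
    """Return the symbol pairs (a, b) aligned at >= min_count positions.

    Assumption of this sketch: s and t have equal length and are
    compared position by position (no gaps, no alignment search).
    """
    if len(s) != len(t):
        raise ValueError("this sketch assumes equal-length sequences")
    counts = Counter(zip(s, t))
    return {pair for pair, c in counts.items() if c >= min_count}


def similarity(s, t, min_count):
    """Fraction of positions whose aligned pair is a frequent correlation."""
    frequent = frequent_correlations(s, t, min_count)
    if not s:
        return 0.0
    matched = sum(1 for pair in zip(s, t) if pair in frequent)
    return matched / len(s)


# The two strings use different alphabets; 'A' is correlated with both
# 'x' and 'y' because each pairing occurs at two positions.
pairs = frequent_correlations("AAAABB", "xyxyzz", min_count=2)
score = similarity("AAAABB", "xyxyzz", min_count=2)
```

In this toy input, no substitution schema is supplied in advance: the correlations are derived from the data, and one symbol may map to several symbols of the other alphabet. Choosing the threshold, and handling sequences of different lengths, are left open here.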
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Greco, G., Terracina, G. (2008). Measuring Sequence Similarity Trough Many-to-Many Frequent Correlations. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_62
DOI: https://doi.org/10.1007/978-3-540-85563-7_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85562-0
Online ISBN: 978-3-540-85563-7
eBook Packages: Computer Science, Computer Science (R0)