Measuring Sequence Similarity Trough Many-to-Many Frequent Correlations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5177)

Abstract

Comparing pairs of sequences is a problem that arises in several application areas (including molecular biology, signal processing, text retrieval, and intrusion detection, to name a few), and important results have been achieved over the years. However, most of the algorithms in the literature rely on the assumption that matching symbols (or at least a substitution schema among them) are known in advance. This paper opens the way to a more involved mechanism for sequence comparison, in which determining the best substitution schema is itself part of the matching problem. The basic idea is that any symbol of one sequence can be correlated with many symbols of the other sequence, provided that each correlation occurs frequently across the various positions. The approach suits a variety of problems that are difficult to handle with classical techniques, particularly those in which the strings to be matched are defined over different alphabets.
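
The basic idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm, which is not available in this preview. It assumes two equal-length sequences compared position by position, and the names `frequent_correlations`, `similarity`, and `min_support` are hypothetical, introduced only for this illustration. A symbol pair counts as a correlation when it co-occurs at no fewer than `min_support` positions, so one symbol may be correlated with several symbols of the other alphabet.

```python
from collections import Counter


def frequent_correlations(s1, s2, min_support):
    """Return the symbol pairs (a, b) that co-occur at >= min_support positions.

    Assumption of this sketch: the sequences are aligned position by position
    and have equal length. The paper itself may not impose this restriction.
    """
    if len(s1) != len(s2):
        raise ValueError("this sketch assumes equal-length sequences")
    counts = Counter(zip(s1, s2))
    return {pair for pair, n in counts.items() if n >= min_support}


def similarity(s1, s2, min_support):
    """Fraction of positions whose symbol pair is a frequent correlation."""
    frequent = frequent_correlations(s1, s2, min_support)
    if not s1:
        return 0.0
    matched = sum(1 for pair in zip(s1, s2) if pair in frequent)
    return matched / len(s1)


# 'A' is correlated with both 'x' and 'y'; each pair occurs twice.
print(frequent_correlations("AAAABB", "xyxyzz", min_support=2))
print(similarity("AAAABB", "xyxyzz", min_support=2))
```

With `min_support=2` every position in this example is covered by a frequent pair, while raising the threshold to 3 leaves no frequent pair at all. The threshold therefore controls how much evidence a substitution needs before it is accepted, and no substitution schema has to be supplied in advance.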

Editor information

Ignac Lovrek, Robert J. Howlett, Lakhmi C. Jain

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Greco, G., Terracina, G. (2008). Measuring Sequence Similarity Trough Many-to-Many Frequent Correlations. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_62

  • DOI: https://doi.org/10.1007/978-3-540-85563-7_62

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85562-0

  • Online ISBN: 978-3-540-85563-7

  • eBook Packages: Computer Science, Computer Science (R0)
