Similarity Analysis of DNA Sequences Based on the Relative Entropy

  • Conference paper
Advances in Natural Computation (ICNC 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3610)

Abstract

This paper investigates the similarity of two sequences, one of the main issues in fragment clustering and classification when sequencing the genomes of microbial communities sampled directly from natural environments. We use the relative entropy as a criterion for the similarity of two sequences and discuss its characteristics in DNA sequences. A method for evaluating the relative entropy is presented and applied to the comparison of two sequences. By combining the relative entropy with the length of variables defined in this paper, the similarity of sequences is easily obtained. Self-organizing maps (SOM) and principal component analysis (PCA) are applied to cluster subsequences from different genomes. Computer simulations verify that the method works well.
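
As a rough illustration of relative entropy used as a sequence dissimilarity measure, the sketch below compares the k-mer frequency distributions of two DNA strings. It is not the evaluation method of the paper, which this abstract does not specify: the choice of k-mers as the variables, the add-one pseudocount, and the function names `kmer_distribution` and `relative_entropy` are all assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import log2


def kmer_distribution(seq, k=2, pseudocount=1.0):
    """Smoothed k-mer frequencies over the alphabet ACGT (illustrative helper)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # The pseudocount keeps every probability positive, so the logarithm
    # below is always defined. Windows containing other symbols are ignored.
    total = sum(counts[w] for w in kmers) + pseudocount * len(kmers)
    return {w: (counts[w] + pseudocount) / total for w in kmers}


def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits; p and q must share the same keys."""
    return sum(p[w] * log2(p[w] / q[w]) for w in p)


p = kmer_distribution("ACGTACGTACGTACGT")
q = kmer_distribution("AAAAAAAACCCCCCCC")
print(relative_entropy(p, q))  # larger values indicate less similar compositions
print(relative_entropy(p, p))  # identical distributions give zero
```

Relative entropy is not symmetric, so D(p || q) and D(q || p) generally differ; a symmetric score such as their sum is one common workaround, again an assumption here rather than something stated in the abstract.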

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, W., Pi, X., Zhang, L. (2005). Similarity Analysis of DNA Sequences Based on the Relative Entropy. In: Wang, L., Chen, K., Ong, Y.S. (eds) Advances in Natural Computation. ICNC 2005. Lecture Notes in Computer Science, vol 3610. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11539087_137

  • DOI: https://doi.org/10.1007/11539087_137

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28323-2

  • Online ISBN: 978-3-540-31853-8

  • eBook Packages: Computer Science, Computer Science (R0)
