Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 294)

Abstract

Tools that effectively analyze and compare sequences are of great importance in various areas of applied computational research, especially in molecular biology. In the present paper, we introduce simple geometric criteria based on the notion of string linearity and use them to compare DNA sequences of various organisms, as well as to distinguish them from random sequences. Our experiments reveal a significant difference between biosequences and random sequences, the former having much higher deviation from linearity than the latter, as well as a general trend of increasing deviation from linearity from primitive to biologically complex organisms. The proposed approach is potentially applicable to the construction of dendrograms representing the evolutionary relationships among species.

Author information

Corresponding author

Correspondence to Boris Brimkov.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Brimkov, B., Brimkov, V.E. (2014). Geometric Approach to Biosequence Analysis. In: Saez-Rodriguez, J., Rocha, M., Fdez-Riverola, F., De Paz Santana, J. (eds) 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014). Advances in Intelligent Systems and Computing, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-319-07581-5_12

  • DOI: https://doi.org/10.1007/978-3-319-07581-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07580-8

  • Online ISBN: 978-3-319-07581-5

  • eBook Packages: Engineering, Engineering (R0)
