A New Approach to Sequence Representation of Proteins in Bioinformatics

  • Conference paper
MICAI 2005: Advances in Artificial Intelligence (MICAI 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3789)

Abstract

A method to represent arbitrary sequences (strings) is discussed. We emphasize its application to analyzing the similarity of sets of proteins expressed as sequences of amino acids. We define a pattern of arbitrary structure called a metasymbol. An implementation of a detailed representation is discussed. We show that a protein may be expressed as a collection of metasymbols in such a way that the underlying structural similarities are easier to identify.
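
The abstract does not spell out how a metasymbol is structured, so the following is only an illustrative sketch under a stated assumption: a metasymbol is modeled here as a gapped pattern, that is, a set of symbols placed at fixed relative offsets. The names `Metasymbol`, `occurrences`, and `shared_metasymbols` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metasymbol:
    """Hypothetical gapped pattern (an assumption, not the paper's definition)."""
    offsets: tuple  # relative positions, starting at 0, strictly increasing
    symbols: str    # one symbol per offset


def occurrences(seq, ms):
    """Return every start index at which the metasymbol matches the sequence."""
    span = ms.offsets[-1] + 1  # distance covered from first to last symbol
    return [
        start
        for start in range(len(seq) - span + 1)
        if all(seq[start + off] == sym
               for off, sym in zip(ms.offsets, ms.symbols))
    ]


def shared_metasymbols(proteins, candidates):
    """Keep the candidate metasymbols that occur in every protein."""
    return [ms for ms in candidates
            if all(occurrences(p, ms) for p in proteins)]


# Toy sequences (not real proteins): the pattern A?CD, with one free position.
pattern = Metasymbol(offsets=(0, 2, 3), symbols="ACD")
hits = occurrences("AXCDAYCD", pattern)
common = shared_metasymbols(["AXCDAYCD", "MMAQCD"],
                            [pattern, Metasymbol(offsets=(0, 1), symbols="XC")])
```

Under this assumption, two sequences that differ at the free positions still share the same metasymbol, which is one way that expressing proteins as collections of patterns could make structural similarities easier to spot.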

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kuri-Morales, A.F., Ortiz-Posadas, M.R. (2005). A New Approach to Sequence Representation of Proteins in Bioinformatics. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_90

  • DOI: https://doi.org/10.1007/11579427_90

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29896-0

  • Online ISBN: 978-3-540-31653-4

  • eBook Packages: Computer Science, Computer Science (R0)
