Mining Quantitative Association Rules in Protein Sequences

Data Mining

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3755)

Abstract

Although a great deal of research has gone into understanding the composition and nature of proteins, many things remain to be understood satisfactorily. It is now generally believed that the amino acid sequences of proteins are not random, and thus the patterns of amino acids that we observe in protein sequences are also non-random. In this study, we attempt to decipher the nature of the associations between the different amino acids present in a protein. This very basic analysis provides insights into the co-occurrence of certain amino acids in a protein. Such association rules are desirable for enhancing our understanding of protein composition and hold the potential to give clues regarding the global interactions amongst particular sets of amino acids occurring in proteins. The presence of strong non-trivial associations provides further evidence for the non-randomness of protein sequences. Knowledge of these rules or constraints is highly desirable for the in vitro synthesis of artificial proteins.
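As a rough illustration of what a quantitative association rule over amino acid content could look like, the sketch below scores a rule of the form "if the fraction of residue X lies in one interval, then the fraction of residue Y lies in another" by its support and confidence. This is not the chapter's procedure, which this preview does not show. The interval bounds, the helper names (`composition`, `in_range`, `rule_stats`) and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

def composition(seq):
    """Return the fraction of each residue in a non-empty one-letter sequence."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}

def in_range(comp, aa, lo, hi):
    """True if the fraction of residue `aa` lies in the half-open interval [lo, hi)."""
    return lo <= comp.get(aa, 0.0) < hi

def rule_stats(sequences, antecedent, consequent):
    """Support and confidence of the rule antecedent => consequent.

    Each side is a tuple (residue, lo, hi) describing a fraction interval.
    """
    comps = [composition(s) for s in sequences]
    n_ante = sum(in_range(c, *antecedent) for c in comps)
    n_both = sum(
        in_range(c, *antecedent) and in_range(c, *consequent) for c in comps
    )
    support = n_both / len(comps)                       # fraction matching both sides
    confidence = n_both / n_ante if n_ante else 0.0     # fraction of antecedent matches
    return support, confidence

# Toy sequences invented for illustration; they are not real proteins.
toy = ["AAAALLLLGG", "AAALLLGGGG", "AGGGGGGGGL", "AAAAALGGGK"]

# Hypothetical rule: Ala fraction in [0.25, 0.6) => Leu fraction in [0.25, 0.6)
support, confidence = rule_stats(toy, ("A", 0.25, 0.6), ("L", 0.25, 0.6))
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

On this toy input, three of the four sequences satisfy the antecedent and two of those also satisfy the consequent, so the rule has support 0.50 and confidence of about 0.67. A real analysis would need principled interval selection and a minimum-support threshold, neither of which is attempted here.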

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Gupta, N., Mangal, N., Tiwari, K., Mitra, P. (2006). Mining Quantitative Association Rules in Protein Sequences. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_21

  • DOI: https://doi.org/10.1007/11677437_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32547-5

  • Online ISBN: 978-3-540-32548-2

  • eBook Packages: Computer Science (R0)