Mining Quantitative Association Rules in Protein Sequences

Gupta, Nitin; Mangal, Nitin; Tiwari, Kamal; Mitra, Pabitra

doi:10.1007/11677437_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3755)

Abstract

Although much research has gone into understanding the composition and nature of proteins, many aspects remain poorly understood. It is now generally believed that the amino acid sequences of proteins are not random, and thus that the patterns of amino acids observed in protein sequences are also non-random. In this study, we attempt to decipher the nature of the associations between the different amino acids present in a protein. This basic analysis provides insights into the co-occurrence of certain amino acids in a protein. Such association rules are desirable for enhancing our understanding of protein composition and hold the potential to give clues regarding the global interactions amongst particular sets of amino acids occurring in proteins. The presence of strong non-trivial associations provides further evidence for the non-randomness of protein sequences. Knowledge of these rules or constraints is highly desirable for the in vitro synthesis of artificial proteins.
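To make the notion of a quantitative association rule between amino acids concrete, the following is a minimal sketch and not the authors' method, which the abstract does not describe. It assumes that each protein is summarised by its fractional amino acid composition and that a rule has the form "fraction of X in [lo, hi] implies fraction of Y in [lo, hi]". The function names, the interval bounds, and the four toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def composition(seq):
    """Return the fraction of each amino acid in one sequence."""
    n = len(seq)
    if n == 0:
        return {}
    return {aa: count / n for aa, count in Counter(seq).items()}

def in_range(comp, aa, lo, hi):
    """True if the fraction of amino acid `aa` lies in [lo, hi]."""
    return lo <= comp.get(aa, 0.0) <= hi

def rule_stats(seqs, antecedent, consequent):
    """Support and confidence of the rule antecedent => consequent.

    Each side is a hypothetical (amino_acid, lo, hi) triple over the
    fractional composition of a sequence.
    """
    comps = [composition(s) for s in seqs]
    n_ante = sum(in_range(c, *antecedent) for c in comps)
    n_both = sum(
        in_range(c, *antecedent) and in_range(c, *consequent) for c in comps
    )
    support = n_both / len(comps) if comps else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

# Toy data (invented, not real proteins).
toy_seqs = ["AAGG", "AAKK", "KKKL", "AGGL"]

# Rule: at least 25% alanine (A) implies at least 25% glycine (G).
support, confidence = rule_stats(toy_seqs, ("A", 0.25, 1.0), ("G", 0.25, 1.0))
print(support, confidence)
```

On this toy data, three of the four sequences satisfy the antecedent and two of those also satisfy the consequent, so the support is 2/4 and the confidence is 2/3. A real analysis would additionally need a principled way of choosing the intervals and a check that a rule is non-trivial rather than a consequence of overall amino acid frequencies.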

Author information

Authors and Affiliations

Bioinformatics Group, Dept. of Computer Science, University of California, San Diego, 3859 Miramar Street #D, La Jolla, CA, 92037, USA
Nitin Gupta & Kamal Tiwari
Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur, 208016, India
Nitin Mangal & Pabitra Mitra

Editor information

Editors and Affiliations

The Australian Taxation Office,
Graham J. Williams
School of Computing and Mathematics, University of Western Sydney, Sydney, NSW, Australia
Simeon J. Simoff

About this chapter

Cite this chapter

Gupta, N., Mangal, N., Tiwari, K., Mitra, P. (2006). Mining Quantitative Association Rules in Protein Sequences. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_21

DOI: https://doi.org/10.1007/11677437_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer Science, Computer Science (R0)
