Abstract
Although much research has gone into understanding the composition and nature of proteins, many things remain to be understood satisfactorily. It is now generally believed that the amino acid sequences of proteins are not random, and thus the patterns of amino acids that we observe in protein sequences are also non-random. In this study, we attempt to decipher the nature of the associations between the different amino acids present in a protein. This very basic analysis provides insights into the co-occurrence of certain amino acids in a protein. Such association rules are desirable for enhancing our understanding of protein composition and hold the potential to give clues regarding the global interactions among particular sets of amino acids occurring in proteins. The presence of strong non-trivial associations provides further evidence for the non-randomness of protein sequences. Knowledge of these rules or constraints is highly desirable for the in vitro synthesis of artificial proteins.
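To make the idea of an association rule between amino acids concrete, the following is a minimal sketch and not the authors' procedure. It assumes a rule of the form "if amino acid X makes up at least some fraction of a sequence, then so does amino acid Y" and computes the two standard measures for it: support (the share of sequences where both conditions hold) and confidence (the share of X-enriched sequences that are also Y-enriched). The function names, the threshold, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def composition(seq):
    """Return the fraction of each amino acid (one-letter code) in a sequence."""
    counts = Counter(seq)
    total = len(seq)
    return {aa: n / total for aa, n in counts.items()}


def is_enriched(seq, aa, threshold):
    """True if amino acid `aa` makes up at least `threshold` of `seq`."""
    return composition(seq).get(aa, 0.0) >= threshold


def rule_stats(seqs, antecedent, consequent, threshold=0.2):
    """Support and confidence of 'antecedent enriched -> consequent enriched'.

    The threshold is an illustrative assumption, not a value from the chapter.
    """
    n = len(seqs)
    if n == 0:
        return 0.0, 0.0
    # Sequences satisfying the left-hand side of the rule.
    has_a = [s for s in seqs if is_enriched(s, antecedent, threshold)]
    # Of those, sequences that also satisfy the right-hand side.
    has_both = [s for s in has_a if is_enriched(s, consequent, threshold)]
    support = len(has_both) / n
    confidence = len(has_both) / len(has_a) if has_a else 0.0
    return support, confidence


# Toy, made-up sequences (not real proteins).
toy = ["AAGGKL", "AAGGGL", "AAKKLL", "GGKKLL"]
support, confidence = rule_stats(toy, "A", "G", threshold=0.3)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy set, three of the four sequences are enriched in A and two of those are also enriched in G. A rule with high confidence on real data would only be interesting if it also exceeded what the background frequencies of the two amino acids predict by chance.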
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Gupta, N., Mangal, N., Tiwari, K., Mitra, P. (2006). Mining Quantitative Association Rules in Protein Sequences. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_21
DOI: https://doi.org/10.1007/11677437_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer Science (R0)