Abstract
Amino acid sequences are usually described using categorical variables, which are difficult to convert to numerical form. We compare two numerical coding methods for predicting polyproline type II secondary structure: the frequently used binary vector coding method and our new real value coding method, which is based on the PAM250 substitution table of amino acid mutation information. The real value coding method has useful properties: it saves space and has an illustrative form. Our first results are almost comparable to those of the traditional binary vector coding method.
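To make the space argument concrete, the following minimal Python sketch contrasts the two coding styles on a short sequence window. The binary vector coding is the standard one-of-20 scheme. The abstract does not give the PAM250-derived values, so `PLACEHOLDER_SCALE` is a hypothetical stand-in (evenly spaced numbers in an arbitrary residue order), and the function names and the example window are assumptions made only for illustration.

```python
# The 20 standard amino acids as one-letter codes (order is arbitrary here).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def binary_vector_coding(sequence):
    """Code each residue as a 20-element 0/1 vector and concatenate them."""
    coded = []
    for residue in sequence:
        vector = [0] * len(AMINO_ACIDS)
        vector[AMINO_ACIDS.index(residue)] = 1
        coded.extend(vector)
    return coded


# HYPOTHETICAL scale: evenly spaced values in [0, 1]. These numbers are NOT
# derived from PAM250; they only stand in for the paper's real value coding.
PLACEHOLDER_SCALE = {
    aa: i / (len(AMINO_ACIDS) - 1) for i, aa in enumerate(AMINO_ACIDS)
}


def real_value_coding(sequence, scale=PLACEHOLDER_SCALE):
    """Code each residue as a single real number taken from `scale`."""
    return [scale[residue] for residue in sequence]


window = "PPGPP"  # illustrative 5-residue window, not taken from the paper
binary = binary_vector_coding(window)
real = real_value_coding(window)
print(len(binary), len(real))  # 100 inputs versus 5 inputs
```

For a window of w residues, binary vector coding needs 20·w inputs while a one-number-per-residue coding needs only w, which is the space saving the abstract refers to.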
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Siermala, M., Juhola, M., Vihinen, M. (2001). Binary Vector or Real Value Coding for Secondary Structure Prediction? A Case Study of Polyproline Type II Prediction. In: Crespo, J., Maojo, V., Martin, F. (eds) Medical Data Analysis. ISMDA 2001. Lecture Notes in Computer Science, vol 2199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45497-7_40
DOI: https://doi.org/10.1007/3-540-45497-7_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42734-6
Online ISBN: 978-3-540-45497-7
eBook Packages: Springer Book Archive