Abstract
To understand the structure-to-function relationship, life sciences researchers and biologists need to retrieve similar structures from protein databases and classify them into the same protein fold. With advances in technology, the number of known protein structures grows every day, so retrieving structurally similar proteins with current structural alignment algorithms may take hours or even days. Improving the efficiency of protein structure retrieval and classification has therefore become an important research issue. In this paper we propose a novel approach that classifies protein structures faster, in minutes. We build a separate Hidden Markov Model (HMM) for each class. In our approach we align the tertiary structures of proteins. We also compared our approach against an existing approach named 3D HMM. The results show that our approach is more accurate than 3D HMM.
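The idea of one HMM per class can be illustrated with a minimal sketch: a query is scored against every class model with the forward algorithm and assigned to the class whose model gives the highest likelihood. The abstract does not say how tertiary structures are encoded as observation sequences, so the sketch assumes they have already been mapped to sequences of discrete symbols. The model parameters, the class names `fold_A` and `fold_B`, and the helper names are hypothetical toy values, not those of the paper.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a discrete-emission HMM.

    obs   : list of observation symbol indices (assumed encoding)
    start : start[s] is the initial probability of state s
    trans : trans[p][s] is the probability of moving from state p to s
    emit  : emit[s][o] is the probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][symbol])
            for s in range(n)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the name of the class model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models: two states, two symbols, one model per class.
MODELS = {
    "fold_A": ([0.9, 0.1], [[0.8, 0.2], [0.2, 0.8]], [[0.9, 0.1], [0.6, 0.4]]),
    "fold_B": ([0.1, 0.9], [[0.8, 0.2], [0.2, 0.8]], [[0.4, 0.6], [0.1, 0.9]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 1, 0, 0], MODELS))
```

Scoring a query this way costs time linear in its length for each class model, which may be one reason a model-based classifier can be faster than pairwise structural alignment against every database entry. In practice the parameters would be estimated from training structures of each class rather than set by hand.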
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mirceva, G., Davcev, D. (2009). HMM Approach for Classifying Protein Structures. In: Lee, Yh., Kim, Th., Fang, Wc., Ślęzak, D. (eds) Future Generation Information Technology. FGIT 2009. Lecture Notes in Computer Science, vol 5899. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10509-8_5
DOI: https://doi.org/10.1007/978-3-642-10509-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10508-1
Online ISBN: 978-3-642-10509-8
eBook Packages: Computer Science, Computer Science (R0)