Abstract
Predicting the subcellular multi-locations of proteins with machine learning techniques is a challenging problem in the computational biology community. Treating the protein multi-location problem as a multi-label pattern classification problem, we propose a new prediction method for protein subcellular localization. The method rests on two key points: it divides a seriously unbalanced multi-location problem into a number of more balanced two-class subproblems by using the part-versus-part task decomposition approach, and it learns all of the subproblems with the min-max modular support vector machine (M3-SVM). To evaluate the effectiveness of the proposed method, we perform experiments on a yeast protein data set using two kinds of task decomposition strategies and three kinds of feature extraction methods. The experimental results demonstrate that our method achieves the highest prediction accuracy in these experiments, considerably better than that obtained by the existing approach based on the traditional support vector machine.
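The decomposition and combination steps described above can be illustrated with a minimal sketch for a single two-class subproblem (for example, one location versus the rest, which is itself an assumption about how the multi-label task is split). The sketch assumes the usual min-max modular rule: the positive and negative classes are each split into parts, one module is trained per (positive part, negative part) pair, the modules sharing a positive part are combined with MIN, and the results are combined with MAX. All function names are hypothetical, the interleaved split is only a placeholder for a real decomposition strategy, and the base learner is a nearest-centroid scorer swapped in for the SVM so that the example needs no external library.

```python
def split_parts(samples, n_parts):
    """Split a list into n_parts interleaved chunks (placeholder decomposition)."""
    return [samples[i::n_parts] for i in range(n_parts)]


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def train_module(pos_part, neg_part):
    """Stand-in base learner (nearest centroid, NOT an SVM).

    Returns a scoring function; a score above zero means the input is
    closer to the positive centroid than to the negative one.
    """
    cp, cn = centroid(pos_part), centroid(neg_part)

    def score(x):
        dist_pos = sum((a - b) ** 2 for a, b in zip(x, cp))
        dist_neg = sum((a - b) ** 2 for a, b in zip(x, cn))
        return dist_neg - dist_pos

    return score


def train_m3(pos, neg, n_pos_parts, n_neg_parts):
    """Train one module per (positive part, negative part) pair.

    The result is a grid: one row per positive part, one column per
    negative part.
    """
    pos_parts = split_parts(pos, n_pos_parts)
    neg_parts = split_parts(neg, n_neg_parts)
    return [[train_module(p, n) for n in neg_parts] for p in pos_parts]


def m3_score(modules, x):
    """MIN across each row (same positive part), then MAX across rows."""
    return max(min(module(x) for module in row) for row in modules)


def m3_predict(modules, x):
    """Return 1 for the positive class, 0 for the negative class."""
    return 1 if m3_score(modules, x) > 0 else 0


# Toy, unbalanced data: 4 positives near the origin, 6 negatives in two clusters.
pos = [(0.0, 0.0), (0.5, 0.5), (0.0, 0.5), (0.5, 0.0)]
neg = [(5.0, 5.0), (-5.0, 5.0), (5.5, 5.0), (-5.5, 5.0), (5.0, 5.5), (-5.0, 5.5)]
modules = train_m3(pos, neg, n_pos_parts=1, n_neg_parts=2)
print(m3_predict(modules, (0.1, 0.1)), m3_predict(modules, (5.0, 5.0)))
```

Because each module sees only one part of each class, the individual subproblems are smaller and more balanced than the original one, and the modules can be trained independently of each other.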
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Y., Lu, BL. (2006). Prediction of Protein Subcellular Multi-locations with a Min-Max Modular Support Vector Machine. In: Wang, J., Yi, Z., Zurada, J.M., Lu, BL., Yin, H. (eds) Advances in Neural Networks - ISNN 2006. ISNN 2006. Lecture Notes in Computer Science, vol 3973. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11760191_98
DOI: https://doi.org/10.1007/11760191_98
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34482-7
Online ISBN: 978-3-540-34483-4
eBook Packages: Computer Science, Computer Science (R0)