Abstract
Predicting the function of protein sequences is one of the problems arising from recent progress in bioinformatics. Traditional methods have their limits. We present a novel method for protein sequence function prediction based on sequential pattern mining. First, we use sequential pattern mining algorithms of our own design to mine a dataset of sequences with known function. Then, we build a classifier from the generated patterns to predict the function of protein sequences. Experiments confirm the effectiveness of our method.
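The two-step idea in the abstract (mine patterns from sequences of known function, then classify with those patterns) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes patterns are short contiguous substrings, that a pattern is "frequent" when it occurs in at least a `min_support` fraction of a class's sequences, and that a query is assigned to the class whose class-specific patterns it matches most heavily. The function names, the toy sequences, and the labels "A" and "B" are all hypothetical.

```python
from collections import Counter, defaultdict


def frequent_patterns(sequences, min_support, max_len=4):
    """Contiguous substrings (length 2..max_len) found in at least
    min_support * len(sequences) of the given sequences."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        for k in range(2, max_len + 1):
            for i in range(len(seq) - k + 1):
                seen.add(seq[i:i + k])
        counts.update(seen)  # each pattern counted once per sequence
    threshold = min_support * len(sequences)
    return {p for p, c in counts.items() if c >= threshold}


def train(labelled, min_support=0.6, max_len=4):
    """Mine frequent patterns per class and keep only those that are
    not frequent in any other class (a simple discriminative filter)."""
    by_class = defaultdict(list)
    for seq, label in labelled:
        by_class[label].append(seq)
    mined = {
        label: frequent_patterns(seqs, min_support, max_len)
        for label, seqs in by_class.items()
    }
    model = {}
    for label, pats in mined.items():
        others = set().union(*(p for l, p in mined.items() if l != label))
        model[label] = pats - others
    return model


def predict(model, seq):
    """Score each class by the total length of its patterns found in seq;
    return the best class, or None if no pattern matches."""
    scores = {
        label: sum(len(p) for p in pats if p in seq)
        for label, pats in model.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy training data: (sequence, function label). Purely illustrative.
TRAINING = [
    ("MKVLAAGG", "A"), ("TTKVLAGP", "A"), ("KVLASSQ", "A"),
    ("GGHHWWPT", "B"), ("AHHWWQ", "B"), ("HHWWKK", "B"),
]

model = train(TRAINING)
print(predict(model, "QQKVLATT"))
print(predict(model, "PPHHWWA"))
```

Sequential patterns in the general sense may allow gaps between elements; restricting the sketch to contiguous substrings keeps it short, and weighting matches by pattern length is just one simple scoring choice among many.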
Supported by the Graduated Innovation Lab of Northwestern Polytechnical University (Grant Nos. 06044 and 07042).
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, M., Shang, Xq., Li, Zh. (2008). Sequential Pattern Mining for Protein Function Prediction. In: Tang, C., Ling, C.X., Zhou, X., Cercone, N.J., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2008. Lecture Notes in Computer Science, vol 5139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88192-6_68
DOI: https://doi.org/10.1007/978-3-540-88192-6_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88191-9
Online ISBN: 978-3-540-88192-6
eBook Packages: Computer Science (R0)