Abstract
One of the main challenges in building an accurate classifier from emerging patterns is extracting a minimal set of strong emerging patterns from a high-dimensional dataset. The problem becomes harder when features are generated dynamically, so that the entire feature space is never available at once. In this streaming setting, features arrive one at a time rather than all being available before learning starts.
In this paper, we propose mining Jumping Emerging Patterns by Streaming Feature selection (JEPSF for short), using a dynamic border-differential algorithm that builds a new border of the jumping emerging pattern space from the old border as each new feature arrives. This framework avoids returning to the initial step to rebuild the border of the jumping emerging pattern space. The algorithm also reduces the number of irrelevant emerging patterns, which in turn benefits the proposed method: fewer jumping emerging patterns are kept, and strong emerging patterns are extracted. We experimentally demonstrate the effectiveness of the proposed approach against other state-of-the-art methods in terms of predictive accuracy.
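The idea of updating a pattern set as features arrive, instead of re-mining from scratch, can be illustrated with a small sketch. This is not the JEPSF border-differential algorithm, which the abstract does not spell out. It is a brute-force stand-in that assumes the standard definition of a jumping emerging pattern (an itemset with non-zero support in one class and zero support in the other) and keeps only the minimal ones. The class `StreamingJEPMiner`, the helper `support`, the `max_len` cap, and the toy data are all hypothetical names and values chosen for illustration.

```python
from itertools import combinations


def support(itemset, rows):
    """Count the rows that contain every (feature, value) item of itemset."""
    return sum(all(row.get(f) == v for f, v in itemset) for row in rows)


class StreamingJEPMiner:
    """Hypothetical helper: maintains minimal JEPs of the positive class."""

    def __init__(self, n_pos, n_neg, max_len=3):
        self.pos = [{} for _ in range(n_pos)]  # positive-class instances
        self.neg = [{} for _ in range(n_neg)]  # negative-class instances
        self.features = []                     # features seen so far
        self.max_len = max_len                 # cap on pattern length
        self.minimal_jeps = set()              # frozensets of (feature, value)

    def add_feature(self, name, pos_values, neg_values):
        """Append one feature column and update the minimal JEPs in place."""
        for row, value in zip(self.pos, pos_values):
            row[name] = value
        for row, value in zip(self.neg, neg_values):
            row[name] = value
        # Supports of itemsets over the old features are unchanged, so the
        # old minimal JEPs stay valid. Only itemsets that contain an item of
        # the new feature are examined, smallest first.
        for extra in range(self.max_len):
            for row in self.pos:  # candidates taken from positive rows
                for combo in combinations(self.features, extra):
                    items = [(name, row[name])] + [(f, row[f]) for f in combo]
                    cand = frozenset(items)
                    # Skip candidates that contain a known minimal JEP.
                    if any(m <= cand for m in self.minimal_jeps):
                        continue
                    # Zero support in the negative class makes it a JEP.
                    if support(cand, self.neg) == 0:
                        self.minimal_jeps.add(cand)
        self.features.append(name)


# Toy stream: two positive and two negative instances, three features.
miner = StreamingJEPMiner(n_pos=2, n_neg=2)
miner.add_feature("a", pos_values=[1, 1], neg_values=[1, 0])
miner.add_feature("b", pos_values=[0, 1], neg_values=[0, 0])
miner.add_feature("c", pos_values=[1, 0], neg_values=[0, 1])
for pattern in sorted(sorted(p) for p in miner.minimal_jeps):
    print(pattern)
```

Because candidates are drawn from positive instances, each one has non-zero positive support by construction, so only the negative class needs to be scanned. The enumeration is exponential in `max_len`, which is exactly the cost a border-based representation is meant to avoid.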
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Alavi, F., Hashemi, S. (2014). Mining Jumping Emerging Patterns by Streaming Feature Selection. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 245. Springer, Cham. https://doi.org/10.1007/978-3-319-02821-7_30
DOI: https://doi.org/10.1007/978-3-319-02821-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02820-0
Online ISBN: 978-3-319-02821-7
eBook Packages: Engineering, Engineering (R0)