Abstract
Classifying rare cases is a challenging problem in many real-life applications. Because rare cases are scarce, traditional classifiers have difficulty classifying them correctly. In this paper, we propose EPRC, a new approach that uses emerging patterns (EPs) [3] for rare-class classification. Traditional EP-based classifiers [2] fail to achieve acceptable results when dealing with rare cases. EPRC overcomes this problem by applying three improvement stages: generating new, previously undiscovered EPs for the rare class, pruning low-support EPs, and increasing the supports of the rare-class EPs. An experimental evaluation on a number of rare-class databases shows that EPRC outperforms EP-based classifiers as well as other classification methods such as PNrule [1], MetaCost [6], and C4.5 [7].
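As background for the abstract, an emerging pattern in the sense of [3] is an itemset whose support grows sharply from one class to another. The sketch below illustrates only that general notion of support and growth rate; it is not the EPRC method, whose three stages are not detailed in the abstract. The function names, the toy transactions, and the threshold value are all assumptions made for illustration.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(pattern: FrozenSet[str], dataset: List[Transaction]) -> float:
    """Fraction of transactions in `dataset` that contain every item of `pattern`."""
    if not dataset:
        return 0.0
    hits = sum(1 for t in dataset if pattern <= t)
    return hits / len(dataset)


def growth_rate(pattern: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction]) -> float:
    """Ratio of the pattern's support in `target` to its support in `background`."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0.0:
        # Present only in the target class: infinite growth rate.
        return float("inf") if s_tg > 0.0 else 0.0
    return s_tg / s_bg


def is_emerging(pattern: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction],
                min_growth: float = 2.0) -> bool:
    """True if the growth rate reaches `min_growth` (hypothetical threshold)."""
    return growth_rate(pattern, background, target) >= min_growth


# Toy data (hypothetical): a larger majority class and a small rare class.
major = [
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c", "d"}),
    frozenset({"a", "b", "c"}),
]
rare = [
    frozenset({"a", "d"}),
    frozenset({"c", "d"}),
]

d_growth = growth_rate(frozenset({"d"}), major, rare)
ad_growth = growth_rate(frozenset({"a", "d"}), major, rare)
```

With only two rare transactions, every rare-class support is a multiple of 0.5, which gives a sense of why scarce data makes rare-class patterns hard to mine and weigh reliably.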
References
[1] Joshi, M.V., Agarwal, R., Kumar, V.: Mining Needles in a Haystack: Classifying Rare Classes via Two-Phase Rule Induction. In: Proceedings of ACM (SIGMOD 2001), Santa Barbara, California, USA (2001)
[2] Bailey, J., Manoukian, T., Ramamohanarao, K.: Classification Using Constrained Emerging Patterns. In: Dong, G., Tang, C., Wang, W. (eds.) WAIM 2003. LNCS, vol. 2762, pp. 226–237. Springer, Heidelberg (2003)
[3] Dong, G., Li, J.: Efficient Mining of Emerging Patterns: Discovering Trends and Differences. In: Proceedings of 1999 International Conference on Knowledge Discovery and Data Mining (KDD 1999), San Diego, CA, USA (1999)
[4] Blake, C., Keogh, E., Merz, C.J.: UCI repository of machine learning databases. Department of Information and Computer Science, University of California at Irvine, CA (1999), http://www.ics.uci.edu/~mlearn/MLRepository.html
[5] van der Putten, P., de Ruiter, M., van Someren, M.: The CoIL Challenge 2000 report (June 2000), http://www.liacs.nl/~putten/library/cc
[6] Domingos, P.: MetaCost: A General Method for Making Classifiers Cost-Sensitive. In: Proceedings of 1999 International Conference on Knowledge Discovery and Data Mining (KDD 1999), San Diego, CA, USA (1999)
[7] Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Mateo (1999)
[8] Cheng, J., Hatzis, C., Hayashi, H., Krogel, M., Morishita, S., Page, D., Sese, J.: KDD Cup 2001 report. ACM SIGKDD Explorations (January 2002)
[9] van Rijsbergen, C.J.: Information Retrieval. Butterworths, London (1979)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alhammady, H., Ramamohanarao, K. (2004). The Application of Emerging Patterns for Improving the Quality of Rare-Class Classification. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_27
DOI: https://doi.org/10.1007/978-3-540-24775-3_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive