Practical Application of Associative Classifier for Document Classification

Yoon, Yongwook; Lee, Gary Geunbae

doi:10.1007/11562382_36

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3689)

Abstract

In practical text classification tasks, the ability to interpret the classification result is as important as the ability to classify accurately. The associative classifier has several favorable characteristics: rapid training, good classification accuracy, and excellent interpretability. However, it faces some obstacles when applied to text classification. First, the training process produces a huge number of classification rules, which makes prediction for a new document inefficient. We resolve this by pruning the rules according to their contribution to correct classifications. In addition, since the target text collection generally has high dimensionality, the training process might take a very long time. We propose mutual information between the word and class variables as a feature selection measure to reduce the dimensionality of the feature space. Experimental classification results on the 20-newsgroups dataset show many benefits of associative classification in both training and prediction.
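
As an illustration of the feature selection measure named in the abstract, the following is a minimal sketch of scoring words by the mutual information between a word variable and the class variable. It assumes the word variable is binary (present or absent in a document) and that probabilities are estimated from raw document counts; the abstract specifies neither, and the function names `mutual_information` and `select_top_k` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Return {word: I(W; C)} in bits.

    Assumption (not from the paper): W is the binary presence of a word
    in a document, C is the class label, and probabilities are plain
    relative frequencies over documents.
    """
    n = len(docs)
    class_count = Counter(labels)
    word_count = Counter()   # word -> number of documents containing it
    joint_count = Counter()  # (word, class) -> documents of that class containing it
    for tokens, label in zip(docs, labels):
        for word in set(tokens):
            word_count[word] += 1
            joint_count[(word, label)] += 1

    scores = {}
    for word, n_w in word_count.items():
        mi = 0.0
        for c, n_c in class_count.items():
            n_wc = joint_count[(word, c)]
            # Two cells per class: word present, word absent.
            for n_joint, n_word in ((n_wc, n_w), (n_c - n_wc, n - n_w)):
                if n_joint > 0:  # empty cells contribute nothing
                    mi += (n_joint / n) * math.log2(n * n_joint / (n_word * n_c))
        scores[word] = mi
    return scores


def select_top_k(scores, k):
    """Keep the k words with the highest mutual information."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]


# Toy corpus (made up for illustration only).
docs = [["ball", "game"], ["ball", "team"], ["cpu", "disk"], ["cpu", "game"]]
labels = ["sport", "sport", "comp", "comp"]
scores = mutual_information(docs, labels)
print(select_top_k(scores, 2))
```

In this toy corpus, a word that appears in exactly one class scores highest, while a word spread evenly across both classes scores zero. Keeping only the top-scoring words shrinks the vocabulary before association rules are mined.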

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Pohang University of Science & Technology, Pohang, 790-784, South Korea
Yongwook Yoon & Gary Geunbae Lee

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hyoja-dong, Nam-gu, 790-784, Pohang, Korea
Gary Geunbae Lee
Computer and Communication Media Research, NEC Corp., Miyazaki 4-1-1, Miyamae-ku, 216-8555, Kawasaki, Japan
Akio Yamada
Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong
Helen Meng
School of Engineering, Information and Communications University, 119, Munjiro, Yuseong-gu, 305-732, Daejeon, Korea
Sung Hyon Myaeng

About this paper

Cite this paper

Yoon, Y., Lee, G.G. (2005). Practical Application of Associative Classifier for Document Classification. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_36

DOI: https://doi.org/10.1007/11562382_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29186-2
Online ISBN: 978-3-540-32001-2
eBook Packages: Computer Science, Computer Science (R0)