Abstract
Current research on association-rule-based text classification neglects several key problems. First, the weights of elements in profile vectors may strongly affect the classification rules that are generated. Second, traditional association rules lack semantics; enriching the semantics of association rules may help improve classification accuracy. To address these problems, we propose a new classification approach that includes: (1) mining frequent item-sets on item-weighted transactions; and (2) generating enhanced association rules that have richer semantics than traditional association rules. Experiments show that the new approach outperforms the CMAR, S-EM, and NB algorithms in classification accuracy.
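To make step (1) concrete, the following is a minimal sketch of frequent item-set mining over item-weighted transactions. It is not the paper's algorithm: the abstract does not define weighted support, so the definition used here (the minimum item weight in each supporting document, averaged over all documents) is an assumption chosen for illustration. The toy `docs` data, the function names, and the brute-force enumeration are likewise hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each document is a profile vector (term -> weight).
docs = [
    {"ball": 0.9, "team": 0.6, "goal": 0.3},
    {"ball": 0.4, "team": 0.8},
    {"stock": 0.7, "market": 0.9},
    {"ball": 0.5, "goal": 0.7, "market": 0.1},
]


def weighted_support(itemset, transactions):
    """Assumed definition (not taken from the paper): for each transaction
    that contains every item, take the smallest of those items' weights;
    sum these values and divide by the total number of transactions."""
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += min(t[item] for item in itemset)
    return total / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force enumeration of item-sets up to max_size whose weighted
    support reaches min_support. A real miner would prune candidates
    (Apriori- or FP-growth-style) instead of enumerating all of them."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            support = weighted_support(candidate, transactions)
            if support >= min_support:
                result[candidate] = support
    return result


if __name__ == "__main__":
    for itemset, support in frequent_itemsets(docs, min_support=0.3).items():
        print(itemset, round(support, 3))
```

Under this assumed definition, a term that appears in many documents with low weight contributes less support than one that appears with high weight, which is the kind of effect the abstract attributes to element weights in profile vectors.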
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Qiu, J. et al. (2007). A Novel Text Classification Approach Based on Enhanced Association Rule. In: Alhajj, R., Gao, H., Li, J., Li, X., Zaïane, O.R. (eds) Advanced Data Mining and Applications. ADMA 2007. Lecture Notes in Computer Science, vol 4632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73871-8_24
DOI: https://doi.org/10.1007/978-3-540-73871-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73870-1
Online ISBN: 978-3-540-73871-8
eBook Packages: Computer Science (R0)