Abstract
This paper proposes C3, a new learning scheme that improves the classification of rare-category emails in the early stage of incremental learning. C3 consists of three components: the chief-learner, the co-learners and the combiner. The chief-learner is an ordinary learning model with an incremental learning capability. It performs well on categories trained with sufficient samples but poorly on rare categories trained with insufficient samples. The co-learners, which focus on the rare categories, compensate for this weakness when classifying new samples of the rare categories. The combiner combines the outputs of the chief-learner and the co-learners to make a final classification. The chief-learner is updated incrementally with all new samples over time, whereas the co-learners are updated only with new samples from the rare categories. Once the chief-learner has gained sufficient knowledge about the rare categories, the co-learners become unnecessary and are removed. Experiments on customer emails from an e-commerce company show that the C3 model outperformed the Naive Bayes model in classifying emails of rare categories in the early stage of incremental learning.
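The three-component structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the co-learners score an email, how the combiner merges outputs, or when a category stops being rare, so everything beyond the overall structure is an assumption made for illustration: the chief-learner is taken to be an incremental multinomial Naive Bayes model, each co-learner is a hypothetical token-overlap scorer, the combiner lets a co-learner override the chief-learner when its score reaches a hypothetical `override` threshold, and a co-learner is removed once its category has `rare_threshold` samples.

```python
import math
from collections import Counter, defaultdict


class ChiefLearner:
    """Incremental multinomial Naive Bayes with Laplace smoothing (assumed model)."""

    def __init__(self):
        self.class_counts = Counter()             # samples seen per category
        self.token_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()

    def update(self, tokens, label):
        self.class_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for t in tokens:
                score += math.log((self.token_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class CoLearner:
    """Hypothetical rare-category specialist: overlap with tokens seen so far."""

    def __init__(self):
        self.seen = set()

    def update(self, tokens):
        self.seen.update(tokens)

    def score(self, tokens):
        tokens = set(tokens)
        return len(tokens & self.seen) / len(tokens) if tokens else 0.0


class C3:
    """Chief-learner, co-learners and a simple override combiner (all assumed)."""

    def __init__(self, rare_threshold=5, override=0.6):
        self.chief = ChiefLearner()
        self.co = {}                          # one co-learner per rare category
        self.rare_threshold = rare_threshold  # assumed cut-off for "rare"
        self.override = override              # assumed combiner threshold

    def update(self, tokens, label):
        self.chief.update(tokens, label)      # chief sees every new sample
        if self.chief.class_counts[label] < self.rare_threshold:
            self.co.setdefault(label, CoLearner()).update(tokens)
        else:
            self.co.pop(label, None)          # category no longer rare: retire

    def predict(self, tokens):
        if self.co:
            label, s = max(((l, c.score(tokens)) for l, c in self.co.items()),
                           key=lambda pair: pair[1])
            if s >= self.override:
                return label
        return self.chief.predict(tokens)


model = C3()
for _ in range(6):
    model.update(["where", "is", "my", "order"], "order")
model.update(["please", "refund", "my", "payment"], "refund")
print(model.chief.predict(["my", "refund", "order"]), model.predict(["my", "refund", "order"]))
```

In this toy run the chief-learner alone favors the well-trained category, while the combined model picks the rare one. The overlap score is deliberately crude; any classifier focused on the rare categories could stand in for `CoLearner`.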
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, J., Huang, J.Z., Zhang, N., Xu, Z. (2003). C3: A New Learning Scheme to Improve Classification of Rare Category Emails. In: Gedeon, T.D., Fung, L.C.C. (eds) AI 2003: Advances in Artificial Intelligence. AI 2003. Lecture Notes in Computer Science, vol 2903. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24581-0_64
DOI: https://doi.org/10.1007/978-3-540-24581-0_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20646-0
Online ISBN: 978-3-540-24581-0
eBook Packages: Springer Book Archive