Abstract
This article proposes tournament methods for email categorization, in which the multi-class categorization process is broken down into a set of binary classification tasks. Two methods, the elimination tournament and the Round Robin tournament, are implemented and applied to classify emails into 15 folders. Extensive experiments compare the effectiveness and robustness of the tournament methods against the n-way classification method. The experimental results show that the tournament methods outperform the n-way method by 11.7% in precision, and that the Round Robin tournament performs slightly better than the elimination tournament on average.
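The two tournament schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the binary classifier, the pairing order, or the tie-breaking rule, so the `duel` callback, the adjacent-pair bracket, the bye for an unpaired folder, and the toy `KEYWORDS` table are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Sequence

# Hypothetical interface: a binary classifier that, given two candidate
# folders and a message, returns whichever of the two folders it prefers.
Duel = Callable[[str, str, str], str]


def elimination_tournament(folders: Sequence[str], message: str, duel: Duel) -> str:
    """Knock-out scheme: the winner of each pairing advances until one folder is left."""
    remaining = list(folders)
    if not remaining:
        raise ValueError("at least one folder is required")
    while len(remaining) > 1:
        next_round = []
        # Pair adjacent folders (assumed bracket order).
        for i in range(0, len(remaining) - 1, 2):
            next_round.append(duel(remaining[i], remaining[i + 1], message))
        # An unpaired last folder advances without a duel (assumed bye rule).
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])
        remaining = next_round
    return remaining[0]


def round_robin_tournament(folders: Sequence[str], message: str, duel: Duel) -> str:
    """Round Robin scheme: every pair of folders duels once; most wins is chosen."""
    if not folders:
        raise ValueError("at least one folder is required")
    wins = Counter()
    for a, b in combinations(folders, 2):
        wins[duel(a, b, message)] += 1
    # Ties go to the folder listed first (arbitrary choice for this sketch).
    return max(folders, key=lambda f: wins[f])


# Toy stand-in for a trained binary classifier: counts keyword overlap.
KEYWORDS = {
    "work": {"meeting", "report"},
    "family": {"dinner", "mum"},
    "travel": {"flight", "hotel"},
}


def keyword_duel(a: str, b: str, message: str) -> str:
    words = set(message.lower().split())
    score_a = len(words & KEYWORDS[a])
    score_b = len(words & KEYWORDS[b])
    return a if score_a >= score_b else b


folders = ["work", "family", "travel"]
message = "flight and hotel booking"
print(elimination_tournament(folders, message, keyword_duel))
print(round_robin_tournament(folders, message, keyword_duel))
```

With n folders, a knock-out bracket of this kind runs n − 1 binary decisions per message, whereas the Round Robin scheme runs n(n − 1)/2; for 15 folders that is 14 versus 105 duels.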
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xia, Y., Liu, W., Guthrie, L. (2005). Email Categorization with Tournament Methods. In: Montoyo, A., Muñoz, R., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2005. Lecture Notes in Computer Science, vol 3513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11428817_14
DOI: https://doi.org/10.1007/11428817_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26031-8
Online ISBN: 978-3-540-32110-1
eBook Packages: Computer Science, Computer Science (R0)