Abstract
In this research, we apply the NTC (Neural Text Categorizer) to text categorization without decomposing the task into binary classifications. Because a single classifier is not very robust when applied to the entire text categorization task, the task is usually decomposed into as many binary classifications as there are categories. However, this decomposition requires rearranging and relabeling the given training examples with positive or negative labels. The goal of this research is to apply the NTC to text categorization without the decomposition and to validate its feasibility. To that end, we compare the NTC with other text categorization approaches in a setting where the task is not decomposed, and we validate that the NTC is a practical tool for implementing a lightweight text categorization system.
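To make the relabeling cost concrete, the following is a minimal sketch of the decomposition the abstract describes, in which one binary training set is built per category. It does not implement the NTC itself. The function name `decompose_to_binary`, the toy corpus, and the category names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (document text, category label)


def decompose_to_binary(examples: List[Example]) -> Dict[str, List[Tuple[str, int]]]:
    """Build one binary training set per category (one-vs-rest relabeling).

    Each example is copied into every binary task, labeled +1 if it belongs
    to that task's category and -1 otherwise.
    """
    categories = sorted({label for _, label in examples})
    return {
        category: [(text, 1 if label == category else -1) for text, label in examples]
        for category in categories
    }


# Hypothetical toy corpus; the texts and category names are made up.
training = [
    ("stocks rally as markets open", "business"),
    ("team wins the final match", "sports"),
    ("new phone chip announced", "technology"),
    ("central bank raises rates", "business"),
]

binary_tasks = decompose_to_binary(training)
for category, relabeled in binary_tasks.items():
    positives = sum(1 for _, y in relabeled if y == 1)
    print(category, "positives:", positives, "negatives:", len(relabeled) - positives)
```

With K categories and N training examples, the decomposition produces K relabeled copies of the training set, and each binary task tends to have far more negative than positive examples. Without decomposition, a classifier is trained once on the original `training` list with its category labels unchanged.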
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jo, T. (2009). Categorizing News Articles Using NTC without Decomposition. In: Ślęzak, D., Kim, Th., Zhang, Y., Ma, J., Chung, Ki. (eds) Database Theory and Application. DTA 2009. Communications in Computer and Information Science, vol 64. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10583-8_5
DOI: https://doi.org/10.1007/978-3-642-10583-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10582-1
Online ISBN: 978-3-642-10583-8
eBook Packages: Computer Science (R0)