Categorizing News Articles Using NTC without Decomposition

Jo, Taeho

doi:10.1007/978-3-642-10583-8_5

Part of the book series: Communications in Computer and Information Science (CCIS, volume 64)

Included in the following conference series:

International Conference on Database Theory and Application

Abstract

In this research, we apply the NTC (Neural Text Categorizer) to text categorization without decomposing the task into binary classifications. Because a single classifier is usually not robust enough for the whole categorization task, the task is typically decomposed into as many binary classifications as there are categories. However, this decomposition requires rearranging and relabeling the given training examples with positive or negative labels. The goal of this research is to apply the NTC to text categorization without the decomposition and to validate its feasibility. To that end, we compare the NTC with other approaches to text categorization in a setting where the task is not decomposed, and we validate that the NTC is a practical tool for implementing a light version of a text categorization system.
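
The relabeling step that decomposition requires can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `decompose_one_vs_rest`, the label strings, and the toy corpus are assumptions made for illustration only. It shows that decomposing a task with N categories means building N copies of the training set, each relabeled as positive or negative for one category, which is the overhead the non-decomposed setting avoids.

```python
from typing import Dict, List, Tuple

# An example is a (document text, category label) pair.
Example = Tuple[str, str]


def decompose_one_vs_rest(
    examples: List[Example], categories: List[str]
) -> Dict[str, List[Example]]:
    """Build one binary training set per category (hypothetical helper).

    In the set for a given category, a document is labeled "positive"
    if it belongs to that category and "negative" otherwise.
    """
    binary_sets: Dict[str, List[Example]] = {}
    for category in categories:
        binary_sets[category] = [
            (text, "positive" if label == category else "negative")
            for text, label in examples
        ]
    return binary_sets


# Toy corpus, invented for illustration; not data from the paper.
examples = [
    ("stocks fall on rate fears", "business"),
    ("team wins the cup final", "sports"),
    ("new phone chip announced", "technology"),
]
categories = ["business", "sports", "technology"]

binary_sets = decompose_one_vs_rest(examples, categories)
for category, relabeled in binary_sets.items():
    print(category, [label for _, label in relabeled])
```

Without decomposition, a single classifier is trained once on `examples` with the original category labels, so none of these relabeled copies need to be built.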

Author information

Authors and Affiliations

School of Computer and Information Engineering, Inha University
Taeho Jo

Editor information

Editors and Affiliations

University of Warsaw and Infobright Inc., Poland
Dominik Ślęzak
Hannam University, 306-791, Daejeon, South Korea
Tai-hoon Kim
Utrecht University, The Netherlands
Yanchun Zhang
Hosei University, Tokyo, Japan
Jianhua Ma
ETRI, South Korea
Kyo-il Chung

About this paper

Cite this paper

Jo, T. (2009). Categorizing News Articles Using NTC without Decomposition. In: Ślęzak, D., Kim, Th., Zhang, Y., Ma, J., Chung, Ki. (eds) Database Theory and Application. DTA 2009. Communications in Computer and Information Science, vol 64. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10583-8_5

DOI: https://doi.org/10.1007/978-3-642-10583-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10582-1
Online ISBN: 978-3-642-10583-8
eBook Packages: Computer Science, Computer Science (R0)
