Abstract
Feature selection is an important part of automatic text classification. In this paper, we use HowNet, a Chinese semantic dictionary, to extract concepts from words and use them as the feature set, because concepts better reflect the meaning of a text. We construct a combined feature set consisting of both sememes and Chinese words, and propose a CHI-MCOR weighting method based on weighting theory and classification precision. We then examine the effectiveness of a competitive network and a Radial Basis Function (RBF) network for text classification. The experimental results show that when concepts are extracted from the words properly, the feature dimension is smaller and the classification precision is higher. The RBF network outperforms the competitive network for automatic text classification because it uses supervised learning. Its training time is much shorter than that of the backpropagation (BP) network, while its precision and recall rates are almost at the same level as the BP network's.
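As an illustration of the CHI half of the weighting scheme named above, the following is a minimal sketch of the standard chi-square statistic for a feature and a class, computed from a 2x2 document-count table. It is not the authors' CHI-MCOR method: the abstract does not define the MCOR component or how the two scores are combined, so neither is attempted here. The function names `chi_square` and `chi_scores` and the toy documents are assumptions made for the example.

```python
from collections import Counter


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for one (feature, class) pair.

    a: documents in the class that contain the feature
    b: documents outside the class that contain the feature
    c: documents in the class that lack the feature
    d: documents outside the class that lack the feature
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table carries no evidence of dependence.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def chi_scores(docs, labels, target):
    """Score every feature against the class `target`.

    docs: list of feature sets (words or sememes), one per document
    labels: list of class labels, parallel to docs
    """
    in_class, out_class = Counter(), Counter()
    n_in = sum(1 for y in labels if y == target)
    n_out = len(labels) - n_in
    for feats, y in zip(docs, labels):
        # Count document frequency, so each feature counts once per document.
        (in_class if y == target else out_class).update(set(feats))
    scores = {}
    for f in set(in_class) | set(out_class):
        a, b = in_class[f], out_class[f]
        scores[f] = chi_square(a, b, n_in - a, n_out - b)
    return scores


# Hypothetical toy corpus: two "sports" and two "finance" documents.
docs = [
    {"ball", "team"},
    {"ball", "score"},
    {"stock", "market"},
    {"stock", "team"},
]
labels = ["sports", "sports", "finance", "finance"]
scores = chi_scores(docs, labels, "sports")
```

In this toy corpus, a feature that appears in every document of one class and in none of the other (such as "ball") receives the highest score, while a feature spread evenly across both classes (such as "team") scores zero. Keeping only the top-scoring features is one way to reduce the feature dimension.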
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, M., Wang, L., Lu, Y., Liao, S. (2006). A RBF Network for Chinese Text Classification Based on Concept Feature Extraction. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893295_33
DOI: https://doi.org/10.1007/11893295_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46484-6
Online ISBN: 978-3-540-46485-3
eBook Packages: Computer Science, Computer Science (R0)