Chinese Text Feature Dimension Reduction Based on Semantics

  • Conference paper
Chinese Lexical Semantics (CLSW 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8229)


Abstract

Feature dimension reduction is an important step in text categorization, but traditional feature dimension reduction methods ignore the semantic information of features. To address this problem, this paper proposes a new feature dimension reduction method based on a semantic dictionary. A word-semantic knowledge base is constructed on the basis of HowNet and the Semantic Knowledge-base of Contemporary Chinese. Using this knowledge base together with a feature extraction method, text features are mapped to semantic features, which reduces the dimension of the feature space. A Naïve Bayes classifier is used to verify categorization performance. The experimental results indicate that the proposed approach achieves both a high degree of dimension reduction and good categorization performance.
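The mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word-to-class table `SEMANTIC_CLASS`, the class names, the toy documents, and the helper functions are all hypothetical stand-ins, and a real system would derive the table from HowNet and the Semantic Knowledge-base of Contemporary Chinese. The sketch assumes documents are already segmented into words, maps each word to a semantic class so that several words share one feature, and then trains a small multinomial Naïve Bayes classifier with add-one smoothing on the reduced features.

```python
from collections import Counter, defaultdict
import math

# Hypothetical word -> semantic class table (assumption for illustration only;
# the paper builds its knowledge base from HowNet and the Semantic
# Knowledge-base of Contemporary Chinese).
SEMANTIC_CLASS = {
    "足球": "sport", "篮球": "sport", "比赛": "sport",
    "股票": "finance", "基金": "finance", "银行": "finance",
}

def to_semantic_features(tokens):
    """Replace each token by its semantic class; unknown tokens are dropped."""
    return [SEMANTIC_CLASS[t] for t in tokens if t in SEMANTIC_CLASS]

def train_naive_bayes(docs, labels):
    """Collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)
    vocab = set()
    for feats, label in zip(docs, labels):
        feature_counts[label].update(feats)
        vocab.update(feats)
    return class_counts, feature_counts, vocab

def predict(model, feats):
    """Return the label with the highest add-one-smoothed log posterior."""
    class_counts, feature_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total = sum(feature_counts[label].values())
        score = math.log(n_docs / total_docs)
        for f in feats:
            if f in vocab:  # features never seen in training are ignored
                count = feature_counts[label][f]
                score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, pre-segmented training documents (hypothetical data).
train_tokens = [
    ["足球", "比赛"], ["篮球", "比赛"], ["股票", "基金"], ["银行", "股票"],
]
train_labels = ["sports", "sports", "economy", "economy"]

word_vocab = {t for doc in train_tokens for t in doc}
semantic_docs = [to_semantic_features(doc) for doc in train_tokens]
semantic_vocab = {f for doc in semantic_docs for f in doc}

model = train_naive_bayes(semantic_docs, train_labels)
prediction = predict(model, to_semantic_features(["篮球", "足球"]))

print(len(word_vocab), "word features ->", len(semantic_vocab), "semantic features")
print("predicted category:", prediction)
```

In this toy setting the six distinct words collapse into two semantic features, and the classifier can still label a document whose word combination never appeared in training, because both of its words fall into a class seen during training. Dropping words that are missing from the table is a simplification chosen for the sketch; the abstract does not state how the paper handles such words.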

This paper is funded by the Natural Science Foundation of China (NSFC, Grant No. 61070119), the Project of Construction of Innovative Teams and Teacher Career Development for Universities and Colleges Under Beijing Municipality (Grant No. IDHT20130519), and the Beijing Municipal Education Commission Special Fund (Grant No. PXM2012-014224-000020).

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Du, Z., Zhang, Y., Zheng, R., Jiang, L. (2013). Chinese Text Feature Dimension Reduction Based on Semantics. In: Liu, P., Su, Q. (eds) Chinese Lexical Semantics. CLSW 2013. Lecture Notes in Computer Science, vol 8229. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45185-0_42

  • DOI: https://doi.org/10.1007/978-3-642-45185-0_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45184-3

  • Online ISBN: 978-3-642-45185-0

  • eBook Packages: Computer Science, Computer Science (R0)
