An Improved Class-Center Method for Text Classification Using Dependencies and WordNet

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Automatic text classification is a research focus and a core technology in natural language processing and information retrieval. The class-center vector method is an important text classification method with the advantages of low computational cost and high efficiency. However, the traditional class-center vector method suffers from class vectors that are large and sparse, limited classification accuracy, and a lack of semantic information. To overcome these problems, this paper proposes an improved class-center method for text classification that uses dependency relations and the WordNet dictionary. Experiments show that, compared with traditional text classification algorithms, the improved class-center vector method achieves lower time complexity and higher accuracy on a large corpus.
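As a point of reference for the baseline the paper improves upon, the sketch below shows a plain class-center (centroid) classifier over TF-IDF vectors. It is a minimal sketch under stated assumptions: scikit-learn, the 20 Newsgroups split, and cosine-similarity assignment are choices made here for illustration, and the paper's dependency- and WordNet-based weighting is not reproduced.

```python
# Minimal sketch of a plain class-center (centroid) text classifier.
# Assumptions: scikit-learn is available and the 20 Newsgroups corpus is used;
# this illustrates only the baseline idea, not the paper's improved method.
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))

vectorizer = TfidfVectorizer(max_features=20000, stop_words="english")
X_train = vectorizer.fit_transform(train.data)
X_test = vectorizer.transform(test.data)

# One class-center vector per category: the mean of that category's document vectors.
n_classes = len(train.target_names)
centers = np.vstack([
    np.asarray(X_train[train.target == c].mean(axis=0))
    for c in range(n_classes)
])

# Assign each test document to the class whose center it is most similar to.
pred = cosine_similarity(X_test, centers).argmax(axis=1)
accuracy = (pred == test.target).mean()
print(f"centroid baseline accuracy: {accuracy:.3f}")
```

Each class center is simply the mean of that class's document vectors, which is why the method is computationally cheap: a test document is compared against one vector per class rather than against every training document.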


Notes

  1. https://nlp.stanford.edu/software/lex-parser.html.

  2. http://qwone.com/~jason/20Newsgroups.

  3. http://www.keenage.com/html/e_index.html.


Acknowledgements

This work has been supported by the Natural Science Foundation of Guangxi of China under the contract number 2018GXNSFAA138087, the National Natural Science Foundation of China under the contract numbers 61462010 and 61363036, and Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing.

Author information

Correspondence to Yishan Chen.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhu, X., Xu, Q., Chen, Y., Wu, T. (2019). An Improved Class-Center Method for Text Classification Using Dependencies and WordNet. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science (LNAI), vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_1

  • DOI: https://doi.org/10.1007/978-3-030-32236-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science (R0)
