DOI: 10.1145/3650215.3650220
research-article

Instance Discrimination for Improving Chinese Text Classification

Published: 16 April 2024

ABSTRACT

With the rapid development of the information technology industry, an abundance of Chinese text data has emerged on the Internet, and extracting valuable information from it through Chinese text classification has become a highly significant topic. Although pre-trained text representation models achieve state-of-the-art results on many Chinese text classification tasks, they often misclassify highly similar texts that require different labels, which degrades performance. This issue is particularly pronounced in Chinese because of the substantial differences between Chinese and English: changing even a single character can give a Chinese sentence the opposite meaning. To address this issue, we propose a straightforward multi-task learning model that incorporates instance discrimination, and we additionally perform supplementary training on BERT using instance discrimination as an intermediate task. In essence, our objective is to help the model learn stronger and more diverse text representation vectors. Experiments on three Chinese datasets show that the proposed method outperforms the corresponding single-task classifiers and other advanced Chinese text classification techniques.
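The abstract does not spell out the architecture, but the core idea, a BERT encoder shared between the main classification head and an auxiliary instance-discrimination head trained with a joint loss, can be sketched as below. This is a minimal illustration in PyTorch with the HuggingFace transformers library, not the authors' implementation: the name MultiTaskBert, the projection dimension, the in-batch contrastive formulation, and the loss weight lam are all assumptions made for the sketch.

    # Minimal sketch: a BERT encoder shared between a classification head
    # and an instance-discrimination head. Names, hyperparameters, and the
    # two-view construction are assumptions, not the paper's exact design.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from transformers import BertModel

    class MultiTaskBert(nn.Module):
        def __init__(self, num_labels, proj_dim=128, model_name="bert-base-chinese"):
            super().__init__()
            self.encoder = BertModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size
            self.classifier = nn.Linear(hidden, num_labels)   # main task head
            self.projector = nn.Linear(hidden, proj_dim)      # instance-discrimination head

        def forward(self, input_ids, attention_mask):
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
            cls = out.last_hidden_state[:, 0]                 # [CLS] representation
            logits = self.classifier(cls)                     # label logits
            z = F.normalize(self.projector(cls), dim=-1)      # unit-norm instance embedding
            return logits, z

    def instance_discrimination_loss(z1, z2, temperature=0.07):
        # Treat each instance as its own class: the two views of instance i
        # are positives for each other, and all other instances in the batch
        # serve as negatives (a common in-batch contrastive formulation).
        sim = z1 @ z2.t() / temperature                       # (B, B) similarity matrix
        targets = torch.arange(z1.size(0), device=z1.device)  # diagonal entries are positives
        return F.cross_entropy(sim, targets)

    def multitask_loss(logits, labels, z1, z2, lam=0.5):
        # Joint objective: classification cross-entropy plus a weighted
        # instance-discrimination term (lam is an assumed hyperparameter).
        return F.cross_entropy(logits, labels) + lam * instance_discrimination_loss(z1, z2)

During training, the two views z1 and z2 of a batch could be obtained, for example, from two forward passes with dropout active, so that the same sentence yields two slightly different embeddings; for the supplementary intermediate-task training described in the abstract, the instance-discrimination loss alone would be applied before fine-tuning on the labeled classification data. The paper may construct views or weight the losses differently.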


Published in

ICMLCA '23: Proceedings of the 2023 4th International Conference on Machine Learning and Computer Application
October 2023, 1065 pages
ISBN: 9798400709449
DOI: 10.1145/3650215

      Copyright © 2023 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States
