ABSTRACT
With the rapid development of the information technology industry, an abundance of Chinese text data has emerged on the Internet, and extracting valuable information from this data through Chinese text classification has become a highly significant topic. Although pre-trained text representation models have achieved state-of-the-art results on a number of Chinese text classification problems, they often misclassify highly similar texts that require different labels, which degrades performance. This issue is particularly pronounced in Chinese because of the substantial differences between Chinese and English: changing even a single Chinese character can reverse the meaning of a sentence. To address this issue, we propose a straightforward multi-task learning model that incorporates instance discrimination, and we additionally perform supplementary training on BERT using instance discrimination as an intermediate task. In essence, our objective is to enable the model to learn stronger and more diverse text representation vectors. We conducted experiments on three Chinese datasets, and the results demonstrate that the proposed method outperforms the corresponding single-task classifiers as well as other advanced Chinese text classification techniques.
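As a rough illustration of the approach described in the abstract, the following PyTorch sketch pairs a standard classification head with an instance-discrimination head on a shared Chinese BERT encoder, treating every training sentence as its own class in the spirit of non-parametric instance discrimination (Wu et al., 2018). This is not the authors' implementation; the class name `MultiTaskBert`, the projection dimension, the temperature, and the loss weight `alpha` are illustrative assumptions.

```python
# Minimal sketch (not the authors' released code): multi-task fine-tuning of a
# Chinese BERT encoder with (a) a classification head for the main task and
# (b) an instance-discrimination head scored against a memory bank that holds
# one embedding per training instance.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel

class MultiTaskBert(nn.Module):
    def __init__(self, num_labels, num_instances, hidden=768, proj_dim=128, temperature=0.07):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-chinese")
        self.classifier = nn.Linear(hidden, num_labels)   # main classification head
        self.projector = nn.Linear(hidden, proj_dim)       # instance-discrimination head
        # Memory bank: one L2-normalized embedding per training instance.
        self.register_buffer("memory", F.normalize(torch.randn(num_instances, proj_dim), dim=1))
        self.temperature = temperature

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                  # [CLS] representation
        return self.classifier(cls), F.normalize(self.projector(cls), dim=1)

def multitask_loss(model, logits, proj, labels, instance_ids, alpha=0.5):
    # Main task: ordinary cross-entropy over the label set.
    ce = F.cross_entropy(logits, labels)
    # Auxiliary task: score each sample against all memory-bank entries and
    # require it to be assigned to its own instance index.
    scores = proj @ model.memory.t() / model.temperature
    inst = F.cross_entropy(scores, instance_ids)
    # Refresh the memory bank with the current embeddings (no gradient flows here).
    with torch.no_grad():
        model.memory[instance_ids] = proj.detach()
    return ce + alpha * inst
```

In the supplementary-training variant described above, the same instance-discrimination objective would plausibly be optimized on its own as an intermediate stage before classification fine-tuning; the weight `alpha` controls the trade-off when both losses are trained jointly.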