
Chinese Text Classification Using BERT and Flat-Lattice Transformer

  • Conference paper

Artificial Intelligence and Mobile Services – AIMS 2022 (AIMS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13729)

Abstract

Recently, large-scale pre-trained language models such as BERT, as well as lattice-structured models that combine character-level and word-level information, have achieved state-of-the-art performance on most downstream natural language processing (NLP) tasks, including named entity recognition (NER), English text classification and sentiment analysis. Existing methods for Chinese text classification have also tried such models. However, they cannot obtain the desired results because these pre-trained models operate on characters alone and therefore fail to capture the word-level information on which Chinese relies. To address this problem, we propose BFLAT, a simple yet efficient model for Chinese text classification. Specifically, BFLAT utilizes BERT and word2vec to learn character-level and word-level vector representations, respectively, and then adopts the flat-lattice transformer to integrate the two. Experimental results on two Chinese text classification benchmarks demonstrate that our proposed method outperforms the baseline methods by 1.38–21.82% and 3.42–20.7% in relative F1-measure, respectively.
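To make the pipeline described in the abstract concrete, the sketch below shows one way the flat-lattice integration could look in PyTorch. It is a minimal illustration, not the authors' implementation: the embedding tables are random stand-ins for the pre-trained BERT character vectors and word2vec word vectors, the lexicon matching that produces word spans is assumed to have been done beforehand, and absolute head/tail position embeddings stand in for FLAT's relative span-position encoding. Dimensions, vocabulary sizes and the class count are illustrative.

```python
import torch
import torch.nn as nn

class FlatLatticeClassifier(nn.Module):
    """Characters and matched lexicon words share one flat sequence;
    each token carries its head/tail span before a transformer mixes them."""

    def __init__(self, num_chars, num_words, d_model=128, num_classes=2,
                 max_len=64, nhead=4, num_layers=2):
        super().__init__()
        # Stand-ins for pre-trained embeddings (BERT characters, word2vec words).
        self.char_emb = nn.Embedding(num_chars, d_model)
        self.word_emb = nn.Embedding(num_words, d_model)
        # Every lattice token carries a head and a tail index into the sentence.
        self.head_pos = nn.Embedding(max_len, d_model)
        self.tail_pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, char_ids, word_ids, heads, tails, num_char_tokens):
        # char_ids: (B, Lc), word_ids: (B, Lw), heads/tails: (B, Lc + Lw)
        tokens = torch.cat([self.char_emb(char_ids), self.word_emb(word_ids)], dim=1)
        tokens = tokens + self.head_pos(heads) + self.tail_pos(tails)
        hidden = self.encoder(tokens)
        # Pool only the character positions and classify the sentence.
        pooled = hidden[:, :num_char_tokens, :].mean(dim=1)
        return self.classifier(pooled)

# Toy usage: a 4-character sentence with one matched lexicon word covering the
# third and fourth characters. Characters have head == tail; the word spans 2..3.
char_ids = torch.tensor([[3, 7, 11, 5]])      # (1, 4) character ids
word_ids = torch.tensor([[42]])               # (1, 1) matched word id
heads    = torch.tensor([[0, 1, 2, 3, 2]])    # head index of each lattice token
tails    = torch.tensor([[0, 1, 2, 3, 3]])    # tail index of each lattice token

model = FlatLatticeClassifier(num_chars=100, num_words=200)
logits = model(char_ids, word_ids, heads, tails, num_char_tokens=4)
print(logits.shape)  # torch.Size([1, 2])
```

In the paper's setting, the character vectors would come from BERT's hidden states and the word vectors from word2vec lookups, and the span information would enter through relative-position attention rather than the additive embeddings used in this sketch.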



Acknowledgements

This work is supported in part by a grant from the Guangxi Key Laboratory of Machine Vision and Intelligent Control and by the Provincial College Students' Innovation and Entrepreneurship Training Program (S202211354104). It is also supported by the Shenzhen Development and Reform Commission project (XMHT20200105010).

Author information


Corresponding author

Correspondence to Yishuang Ning.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Lv, H., Ning, Y., Ning, K., Ji, X., He, S. (2022). Chinese Text Classification Using BERT and Flat-Lattice Transformer. In: Pan, X., Jin, T., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2022. AIMS 2022. Lecture Notes in Computer Science, vol 13729. Springer, Cham. https://doi.org/10.1007/978-3-031-23504-7_5


  • DOI: https://doi.org/10.1007/978-3-031-23504-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23503-0

  • Online ISBN: 978-3-031-23504-7

  • eBook Packages: Computer Science, Computer Science (R0)
