
RAC-BERT: Character Radical Enhanced BERT for Ancient Chinese

  • Conference paper
  • First Online:
Natural Language Processing and Chinese Computing (NLPCC 2023)

Abstract

In recent years, Chinese pre-trained language models have achieved significant improvements in fields such as natural language understanding (NLU) and text generation. However, most existing pre-trained language models focus on modern Chinese and ignore the rich semantic information embedded in Chinese characters, especially the radical information. To this end, we present RAC-BERT, a language-specific BERT model for ancient Chinese. Specifically, we propose two new radical-based pre-training tasks: (1) replacing masked tokens with random characters that share the same radical, which mitigates the gap between the pre-training and fine-tuning stages; and (2) predicting the radical of the masked token rather than the original word, which reduces the computational effort. Extensive experiments were conducted on two ancient Chinese NLP datasets. The results show that our model significantly outperforms state-of-the-art models on most tasks, and ablation experiments demonstrate the effectiveness of our approach. The pre-trained model is publicly available at https://github.com/CubeHan/RAC-BERT.
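The two pre-training tasks described above can be made concrete with a short sketch. The following Python snippet is a minimal illustration, not the released implementation: it assumes a small, hypothetical character-to-radical table (CHAR_TO_RADICAL) and a toy corruption routine (radical_replace), whereas the actual model would rely on a full radical dictionary and the standard BERT masking pipeline.

```python
# Minimal sketch (not the authors' code) of the radical-based corruption step:
# a selected token is replaced by a random character sharing its radical (task 1),
# and the training label is the radical itself rather than the original word (task 2).
import random
from collections import defaultdict

# Hypothetical character -> radical mapping; a real run would load a full dictionary.
CHAR_TO_RADICAL = {"河": "氵", "海": "氵", "江": "氵", "湖": "氵",
                   "林": "木", "树": "木", "松": "木"}

# Invert the mapping so replacements with the same radical can be sampled.
RADICAL_TO_CHARS = defaultdict(list)
for char, radical in CHAR_TO_RADICAL.items():
    RADICAL_TO_CHARS[radical].append(char)

def radical_replace(tokens, mask_prob=0.15, seed=None):
    """Return (corrupted_tokens, labels), where labels[i] is the radical to be
    predicted at each corrupted position and None elsewhere."""
    rng = random.Random(seed)
    corrupted, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok in CHAR_TO_RADICAL and rng.random() < mask_prob:
            radical = CHAR_TO_RADICAL[tok]
            candidates = [c for c in RADICAL_TO_CHARS[radical] if c != tok]
            if candidates:
                corrupted[i] = rng.choice(candidates)  # same-radical replacement
                labels[i] = radical                    # radical-prediction target
    return corrupted, labels

# Toy usage: corrupt every position of a short character sequence.
print(radical_replace(list("河海江林"), mask_prob=1.0, seed=0))
```

Under this reading, the prediction head's output space would be the much smaller set of radicals rather than the full vocabulary, which is consistent with the reduced computational effort claimed above.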


Notes

  1. https://github.com/ethan-yt/guwenbert.

  2. https://github.com/garychowcmu/daizhigev20.

  3. https://github.com/Ethan-yt/CCLUE.


Acknowledgement

This work is supported by the CAAI-Huawei MindSpore Open Fund (2022037A).

Author information


Corresponding author

Correspondence to Xin Wang.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Han, L. et al. (2023). RAC-BERT: Character Radical Enhanced BERT for Ancient Chinese. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol. 14303. Springer, Cham. https://doi.org/10.1007/978-3-031-44696-2_59


  • DOI: https://doi.org/10.1007/978-3-031-44696-2_59

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44695-5

  • Online ISBN: 978-3-031-44696-2

  • eBook Packages: Computer Science (R0)
