rT5: A Retrieval-Augmented Pre-trained Model for Ancient Chinese Entity Description Generation

  • Conference paper
  • First Online:
Natural Language Processing and Chinese Computing (NLPCC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14302)

Abstract

Ancient Chinese, the natural language of ancient China, serves as the key to understanding and propagating China's rich history and civilization. However, to facilitate comprehension and education, human experts have traditionally had to write modern-language descriptions for special entities, such as persons and locations, appearing in ancient Chinese texts. This process requires specialized knowledge and can be time-consuming. To address these challenges, we propose a new task called Ancient Chinese Entity Description Generation (ACEDG), which aims to automatically generate modern-language descriptions for ancient entities. For this task, we construct two expert-annotated datasets, XunZi and MengZi, each containing ancient Chinese texts, part of which has been annotated with entities and their descriptions by human experts. To leverage both the labeled and unlabeled texts, we propose a retrieval-augmented pre-trained model called rT5. Specifically, a pseudo-parallel corpus is constructed using retrieval techniques to augment the pre-training stage. Subsequently, the pre-trained model is fine-tuned on our high-quality human-annotated entity-description corpus. Experimental results, evaluated with various metrics, demonstrate the effectiveness of our method. By combining retrieval techniques and pre-training, our approach significantly advances the state of the art on the ACEDG task compared with strong pre-trained baselines.

This research is supported by the youth program of the National Science Fund of Tianjin, China (Grant No. 22JCQNJC01340) and the Fundamental Research Funds for the Central Universities, Nankai University (Grant Nos. 63221028 and 63232114).
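
The abstract outlines a two-stage recipe: pair unlabeled ancient and modern texts with a retriever to build a pseudo-parallel corpus for pre-training, then fine-tune on the expert-annotated entity-description pairs. The sketch below only illustrates that flow and is not the authors' implementation: the character-level TF-IDF retriever, the "google/mt5-small" backbone, the similarity threshold, the sample sentences, and the expert_annotated_pairs list are all stand-in assumptions for illustration.

```python
# Minimal sketch of a retrieval-augmented pre-train-then-fine-tune pipeline.
# NOT the paper's rT5 implementation; every component below is an assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Placeholder unlabeled ancient-Chinese and modern-Chinese sentences,
# standing in for the unannotated portions of the XunZi / MengZi texts.
ancient_sentences = ["青取之於藍而青於藍", "不積跬步無以至千里"]
modern_sentences = ["靛青是从蓝草中提取的，却比蓝草更青", "不积累小步，就无法到达千里之外"]

# --- Stage 1: build a pseudo-parallel corpus with retrieval -----------------
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 2))
modern_vecs = vectorizer.fit_transform(modern_sentences)
ancient_vecs = vectorizer.transform(ancient_sentences)
sims = cosine_similarity(ancient_vecs, modern_vecs)

pseudo_pairs = [
    (ancient_sentences[i], modern_sentences[sims[i].argmax()])
    for i in range(len(ancient_sentences))
    if sims[i].max() > 0.1  # keep only reasonably similar matches (guessed threshold)
]

# --- Stage 2: pre-train on pseudo pairs, then fine-tune on gold pairs -------
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def train_step(source: str, target: str) -> float:
    """One seq2seq step: ancient-Chinese source -> modern-Chinese description."""
    batch = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Retrieval-augmented "pre-training" pass over the pseudo-parallel pairs.
for src, tgt in pseudo_pairs:
    train_step(src, tgt)

# Fine-tuning on expert-annotated (entity, description) pairs; this list is a
# hypothetical placeholder for the annotated parts of the datasets.
expert_annotated_pairs = [("荀子", "荀子，战国末期赵国思想家、儒家代表人物之一。")]
for entity, description in expert_annotated_pairs:
    train_step(entity, description)
```

In practice a dense retriever would likely pair traditional and simplified text better than character TF-IDF, and real training would use batching, multiple epochs, and stricter filtering of pseudo pairs; the sketch only shows the shape of the pipeline: retrieve to pair, pre-train on the pairs, then fine-tune on gold annotations.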

Author information

Corresponding author

Correspondence to Yike Wu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hu, M. et al. (2023). rT5: A Retrieval-Augmented Pre-trained Model for Ancient Chinese Entity Description Generation. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol. 14302. Springer, Cham. https://doi.org/10.1007/978-3-031-44693-1_57

  • DOI: https://doi.org/10.1007/978-3-031-44693-1_57

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44692-4

  • Online ISBN: 978-3-031-44693-1

  • eBook Packages: Computer Science, Computer Science (R0)
