rT5: A Retrieval-Augmented Pre-trained Model for Ancient Chinese Entity Description Generation

  • Conference paper
  • First Online:
Natural Language Processing and Chinese Computing (NLPCC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14302)

Abstract

Ancient Chinese, the natural language of ancient China, serves as the key to understanding and propagating China's rich history and civilization. However, to facilitate comprehension and education, human experts have traditionally had to write modern-language descriptions for special entities, such as persons and locations, appearing in ancient Chinese texts. This process requires specialized knowledge and can be time-consuming. To address these challenges, we propose a new task called Ancient Chinese Entity Description Generation (ACEDG), which aims to automatically generate modern-language descriptions for ancient entities. For this task, we construct two expert-annotated datasets, XunZi and MengZi, each containing ancient Chinese texts, part of which has been annotated with entities and their descriptions by human experts. To leverage both the labeled and unlabeled texts, we propose a retrieval-augmented pre-trained model called rT5. Specifically, a pseudo-parallel corpus is constructed using retrieval techniques to augment the pre-training stage. Subsequently, the pre-trained model is fine-tuned on our high-quality human-annotated entity-description corpus. Experimental results, evaluated with various metrics, demonstrate the effectiveness of our method. By combining retrieval techniques and pre-training, our approach significantly advances the state of the art on the ACEDG task compared with strong pre-trained baselines.

This research is supported by the youth program of the National Science Fund of Tianjin, China (Grant No. 22JCQNJC01340) and the Fundamental Research Funds for the Central Universities, Nankai University (Grant Nos. 63221028 and 63232114).
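
The abstract outlines a two-stage recipe: pair unlabeled ancient and modern texts with a retriever to build a pseudo-parallel corpus for pre-training, then fine-tune on the expert-annotated entity-description pairs. The sketch below only illustrates that flow and is not the authors' implementation: the character-level TF-IDF retriever, the "google/mt5-small" backbone, the similarity threshold, the sample sentences, and the expert_annotated_pairs list are all stand-in assumptions for illustration.

```python
# Minimal sketch of a retrieval-augmented pre-train-then-fine-tune pipeline.
# NOT the paper's rT5 implementation; every component below is an assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Placeholder unlabeled ancient-Chinese and modern-Chinese sentences,
# standing in for the unannotated portions of the XunZi / MengZi texts.
ancient_sentences = ["青取之於藍而青於藍", "不積跬步無以至千里"]
modern_sentences = ["靛青是从蓝草中提取的，却比蓝草更青", "不积累小步，就无法到达千里之外"]

# --- Stage 1: build a pseudo-parallel corpus with retrieval -----------------
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 2))
modern_vecs = vectorizer.fit_transform(modern_sentences)
ancient_vecs = vectorizer.transform(ancient_sentences)
sims = cosine_similarity(ancient_vecs, modern_vecs)

pseudo_pairs = [
    (ancient_sentences[i], modern_sentences[sims[i].argmax()])
    for i in range(len(ancient_sentences))
    if sims[i].max() > 0.1  # keep only reasonably similar matches (guessed threshold)
]

# --- Stage 2: pre-train on pseudo pairs, then fine-tune on gold pairs -------
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def train_step(source: str, target: str) -> float:
    """One seq2seq step: ancient-Chinese source -> modern-Chinese description."""
    batch = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Retrieval-augmented "pre-training" pass over the pseudo-parallel pairs.
for src, tgt in pseudo_pairs:
    train_step(src, tgt)

# Fine-tuning on expert-annotated (entity, description) pairs; this list is a
# hypothetical placeholder for the annotated parts of the datasets.
expert_annotated_pairs = [("荀子", "荀子，战国末期赵国思想家、儒家代表人物之一。")]
for entity, description in expert_annotated_pairs:
    train_step(entity, description)
```

In practice a dense retriever would likely pair traditional and simplified text better than character TF-IDF, and real training would use batching, multiple epochs, and stricter filtering of pseudo pairs; the sketch only shows the shape of the pipeline: retrieve to pair, pre-train on the pairs, then fine-tune on gold annotations.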

Author information

Corresponding author

Correspondence to Yike Wu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hu, M. et al. (2023). rT5: A Retrieval-Augmented Pre-trained Model for Ancient Chinese Entity Description Generation. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol. 14302. Springer, Cham. https://doi.org/10.1007/978-3-031-44693-1_57

  • DOI: https://doi.org/10.1007/978-3-031-44693-1_57

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44692-4

  • Online ISBN: 978-3-031-44693-1

  • eBook Packages: Computer Science, Computer Science (R0)
