
Augmenting Low-Resource Cross-Lingual Summarization with Progression-Grounded Training and Prompting

Published: 16 August 2024

Abstract

Cross-lingual summarization (CLS), generating a summary in one language from a source document in another, helps make information accessible to readers worldwide. State-of-the-art neural summarization models typically train or fine-tune language models on large-scale corpora, which is difficult in realistic low-resource scenarios where domain-specific annotated data are scarce. In this article, we present a cross-lingual summarization model that addresses low-resource CLS through a two-pronged approach: progressive training built on mBART, and discrete prompts optimized with reinforcement learning. During training, the progressive scheme lets the pre-trained model gradually acquire the ability to compress information, develop cross-lingual capabilities, and finally adapt to the specific summarization task. For downstream summarization, we couple the pre-trained model with reinforcement-learning-optimized discrete prompts to perform low-resource cross-lingual summarization. Experimental results on four cross-lingual summarization datasets show that our model outperforms six baselines and achieves state-of-the-art performance in low-resource scenarios.
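To make the first prong concrete, the sketch below shows one way progressive training on mBART could be organized with Hugging Face Transformers. It is a minimal illustration under stated assumptions, not the authors' implementation: the checkpoint name, the three stage definitions, the language codes, and the training_step helper are assumptions for exposition; the paper's actual stage design, data, and hyperparameters are given in the full text.

# Minimal sketch (assumed, not the authors' code): progressive fine-tuning of
# mBART in three stages -- monolingual compression, cross-lingual alignment,
# then cross-lingual summarization -- using Hugging Face Transformers
# (text_target requires a recent transformers release).
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

MODEL_NAME = "facebook/mbart-large-50"          # assumed checkpoint
tokenizer = MBart50TokenizerFast.from_pretrained(MODEL_NAME)
model = MBartForConditionalGeneration.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def training_step(src_texts, tgt_texts, src_lang, tgt_lang):
    """One seq2seq update; every progressive stage reuses it with its own data."""
    tokenizer.src_lang, tokenizer.tgt_lang = src_lang, tgt_lang
    batch = tokenizer(src_texts, text_target=tgt_texts,
                      padding=True, truncation=True, return_tensors="pt")
    loss = model(**batch).loss                  # cross-entropy on the targets
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Stage 1: learn to compress (monolingual summarization, e.g. zh -> zh).
# Stage 2: learn the cross-lingual mapping (translation pairs, e.g. zh -> en).
# Stage 3: adapt to the target task (zh documents -> en summaries, few examples).
# for src, tgt in stage_batches:                # hypothetical per-stage data loader
#     training_step(src, tgt, "zh_CN", "en_XX")

The second prong, reinforcement-learning-optimized discrete prompts, would then be applied on top of the staged model at inference/fine-tuning time rather than inside this loop.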


Cited By

  • (2024) Cross-lingual prompting method with semantic-based answer space clustering. Applied Intelligence 55(2). DOI: 10.1007/s10489-024-06101-w. Online publication date: 12-Dec-2024.

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 9
    September 2024
    186 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613646

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 August 2024
    Online AM: 26 June 2024
    Accepted: 22 June 2024
    Revised: 24 March 2024
    Received: 22 September 2023
    Published in TALLIP Volume 23, Issue 9


    Author Tags

    1. CLS
    2. pretrain+finetune paradigm
    3. low-resource languages
    4. progressive training
    5. reinforcement learning
    6. discrete-prompts

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Yunnan provincial major science and technology special plan projects
    • Yunnan Provincial Key Research and Development Plan
    • Yunnan Fundamental Research Projects
    • Kunming University of Science and Technology's "Double First-rate" construction joint project
