
Augmenting Low-Resource Cross-Lingual Summarization with Progression-Grounded Training and Prompting

Published: 16 August 2024

Abstract

Cross-lingual summarization (CLS), generating a summary in one language from a source document in another, helps make information accessible to readers worldwide. State-of-the-art neural summarization models typically train or fine-tune language models on large-scale corpora, which is difficult in realistic low-resource scenarios where domain-specific annotated data are scarce. In this article, we present a cross-lingual summarization model that addresses low-resource CLS through a two-pronged approach: progressive training built on mBART, and discrete prompts optimized with reinforcement learning. During training, the progressive scheme lets the pre-trained model gradually acquire the ability to compress information, develop cross-lingual capabilities, and finally adapt to the specific summarization task. For downstream summarization, we couple the pre-trained model with reinforcement-learning-optimized discrete prompts to perform low-resource cross-lingual summarization. Experimental results on four cross-lingual summarization datasets show that our model outperforms six baselines and achieves state-of-the-art performance in low-resource scenarios.
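To make the first prong concrete, the sketch below shows one way progressive training on mBART could be organized with Hugging Face Transformers. It is a minimal illustration under stated assumptions, not the authors' implementation: the checkpoint name, the three stage definitions, the language codes, and the training_step helper are assumptions for exposition; the paper's actual stage design, data, and hyperparameters are given in the full text.

# Minimal sketch (assumed, not the authors' code): progressive fine-tuning of
# mBART in three stages -- monolingual compression, cross-lingual alignment,
# then cross-lingual summarization -- using Hugging Face Transformers
# (text_target requires a recent transformers release).
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

MODEL_NAME = "facebook/mbart-large-50"          # assumed checkpoint
tokenizer = MBart50TokenizerFast.from_pretrained(MODEL_NAME)
model = MBartForConditionalGeneration.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def training_step(src_texts, tgt_texts, src_lang, tgt_lang):
    """One seq2seq update; every progressive stage reuses it with its own data."""
    tokenizer.src_lang, tokenizer.tgt_lang = src_lang, tgt_lang
    batch = tokenizer(src_texts, text_target=tgt_texts,
                      padding=True, truncation=True, return_tensors="pt")
    loss = model(**batch).loss                  # cross-entropy on the targets
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Stage 1: learn to compress (monolingual summarization, e.g. zh -> zh).
# Stage 2: learn the cross-lingual mapping (translation pairs, e.g. zh -> en).
# Stage 3: adapt to the target task (zh documents -> en summaries, few examples).
# for src, tgt in stage_batches:                # hypothetical per-stage data loader
#     training_step(src, tgt, "zh_CN", "en_XX")

The second prong, reinforcement-learning-optimized discrete prompts, would then be applied on top of the staged model at inference/fine-tuning time rather than inside this loop.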


Cited By

  • (2024) Cross-lingual prompting method with semantic-based answer space clustering. Applied Intelligence 55(2). DOI: 10.1007/s10489-024-06101-w. Online publication date: 12-Dec-2024.

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 9
    September 2024
    186 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613646

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 August 2024
    Online AM: 26 June 2024
    Accepted: 22 June 2024
    Revised: 24 March 2024
    Received: 22 September 2023
    Published in TALLIP Volume 23, Issue 9


    Author Tags

    1. CLS
    2. pretrain+finetune paradigm
    3. low-resource languages
    4. progressive training
    5. reinforcement learning
    6. discrete-prompts

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Yunnan provincial major science and technology special plan projects
    • Yunnan Provincial Key Research and Development Plan
    • Yunnan Fundamental Research Projects
    • Kunming University of Science and Technology's "Double First-rate" construction joint project
