Abstract
Along with automatic question answering and machine reading comprehension, question generation has become a popular yet challenging natural language understanding task in recent years. However, to the best of our knowledge, no prior study has focused on question generation methods for Vietnamese, a low-resource language. In this paper, we evaluate several strong question generation systems on two benchmark Vietnamese datasets: UIT-ViNewsQA and UIT-ViQuAD. First, we conduct experiments with deep neural network and sequence-to-sequence approaches that generate a question from a context and an answer. In addition, we investigate two strong pre-trained language models (LMs): the monolingual model PhoBERT and the massively multilingual model mT5. To obtain higher performance, we enhance the LM-based methods with reinforcement learning during the decoding process. Our experiments show that the best model achieves BLEU-4 scores of 19.77 on UIT-ViNewsQA and 20.43 on UIT-ViQuAD.
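For readers unfamiliar with the reported metric, the following is a minimal sketch of sentence-level BLEU-4 with a single reference and no smoothing. It is not the evaluation script used in the paper; the function names `ngrams` and `bleu4` and the whitespace tokenization are illustrative assumptions. The sketch returns a value in [0, 1], whereas the scores above appear to be on a 0 to 100 scale and are presumably computed at corpus level.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level BLEU-4 against one reference, without smoothing.

    Hypothetical helper for illustration only; returns a value in [0, 1].
    """
    log_precisions = []
    for n in range(1, 5):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the four modified n-gram precisions.
    return bp * math.exp(sum(log_precisions) / 4)


if __name__ == "__main__":
    generated = "a b c d e".split()
    gold = "a b c d f".split()
    print(round(bleu4(generated, gold), 4))
```

In the usage example the two sequences differ only in the last token, so the modified precisions are 4/5, 3/4, 2/3 and 1/2, and the score is the fourth root of their product.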
Acknowledgement
This research was supported by The VNUHCM-University of Information Technology's Scientific Research Support Fund.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vu, N., Van Nguyen, K. (2022). Enhancing Vietnamese Question Generation with Reinforcement Learning. In: Nguyen, N.T., Tran, T.K., Tukayev, U., Hong, TP., Trawiński, B., Szczerbicki, E. (eds) Intelligent Information and Database Systems. ACIIDS 2022. Lecture Notes in Computer Science, vol 13757. Springer, Cham. https://doi.org/10.1007/978-3-031-21743-2_45
DOI: https://doi.org/10.1007/978-3-031-21743-2_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21742-5
Online ISBN: 978-3-031-21743-2
eBook Packages: Computer Science, Computer Science (R0)