Abstract
Generating questions from long passages is an important and challenging task. Most recent work focuses on generating questions whose answers are contiguous text spans in the given passage. Realistic questions, however, are more complicated, and their answers are often inductive and summative. In this paper, we focus on a complex form of the question generation task in which the answer is only implied by the long passage, so sentences directly relevant to the question can no longer be located in it. To this end, we first construct a dataset that meets these needs on top of RACE. Building on it, we propose an Intent-aware Complex Question Generation model (ICQG). It first encodes the long passage with a gated mechanism that retains the information valuable for elaborating the question. Both the passage and the answer then support question decoding through a model of their interaction. Finally, an intent classifier predicts what kind of question tends to be asked and guides the decoding. We conduct both qualitative and quantitative evaluations, and the experimental results demonstrate that the proposed model is effective on this task and superior to competing methods.
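The abstract's gated encoding and intent prediction can be illustrated schematically. The following is a minimal NumPy sketch, not the authors' implementation: all names, dimensions, and the pooling choice are assumptions, and real models would use trained parameters and a full encoder–decoder. It shows (a) a sigmoid gate filtering passage token states conditioned on an answer vector, and (b) an intent classifier producing a distribution over question types from the pooled gated representation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d = 8          # hidden size (assumed)
T = 5          # passage length in tokens (assumed)
n_intents = 4  # number of question-intent classes (assumed)

# Stand-ins for encoder outputs: passage token states and a pooled answer vector.
H = rng.normal(size=(T, d))
a = rng.normal(size=(d,))

# Gated mechanism: a per-token sigmoid gate decides how much passage
# information to keep, conditioned on the answer context.
W_g = rng.normal(size=(2 * d, d))
gate = sigmoid(np.concatenate([H, np.tile(a, (T, 1))], axis=1) @ W_g)
H_gated = gate * H   # filtered passage representation

# Intent classifier: a distribution over question types from the mean-pooled
# gated passage states; in the full model this would guide decoding.
W_c = rng.normal(size=(d, n_intents))
intent_probs = softmax(H_gated.mean(axis=0) @ W_c)

print(intent_probs.shape)  # (4,)
```

In a full system, `intent_probs` would condition the decoder (e.g., as an extra input at each decoding step) so the generated question matches the predicted type.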
Code Availability
The code of this work is available from the corresponding author.
Acknowledgements
This work is jointly supported by the Natural Science Foundation of China (No. 62006061), the Strategic Emerging Industry Development Special Funds of Shenzhen (No. JCYJ20190806112210067 and JCYJ20200109113403826), and the Stability Support Program for Higher Education Institutions of Shenzhen (No. GXWD20201230155427003-20200824155011001).
Funding
This work is supported by the Natural Science Foundation of China (No. 62006061), the Strategic Emerging Industry Development Special Funds of Shenzhen (No. JCYJ20190806112210067 and JCYJ20200109113403826), and the Stability Support Program for Higher Education Institutions of Shenzhen (No. GXWD20201230155427003-20200824155011001).
Author information
Authors and Affiliations
Contributions
Youcheng Pan proposed the method, designed the experiments, and drafted the manuscript.
Baotian Hu supervised the research and critically revised the manuscript.
Shiyue Wang carried out the experiments and revised the manuscript.
Xiaolong Wang and Qingcai Chen provided guidance and reviewed the manuscript.
Zenglin Xu and Min Zhang participated in the manuscript review.
Corresponding author
Ethics declarations
Conflict of Interests
Not applicable.
Additional information
Availability of data and material
The data used in this work is available from the corresponding author.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Pan, Y., Hu, B., Wang, S. et al. Learning to generate complex question with intent prediction from long passage. Appl Intell 53, 5823–5833 (2023). https://doi.org/10.1007/s10489-022-03651-9