Syntax-guided question generation using prompt learning

Neural Computing and Applications
Abstract

Question generation (QG) aims to generate natural questions from relevant input. Existing state-of-the-art QG approaches primarily leverage pre-trained language models (PLMs) to encode the deep semantics of the input. Meanwhile, studies show that the input's dependency parse tree (referred to as syntactic information) can improve a range of NLP tasks. However, how to incorporate syntactic information into PLMs so that it effectively guides the QG process remains an open problem. This paper introduces a syntax-guided sentence-level QG model based on prompt learning. Specifically, we encode the syntactic information with soft prompts, jointly considering the syntactic information from a constructed dependency parse graph and the PLM to guide question generation. We conduct experiments on two benchmark datasets, SQuAD1.1 and MS MARCO. The results show that our model outperforms mainstream approaches on both automatic and human evaluation metrics. Moreover, a case study shows that the model generates more fluent questions carrying richer information.
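A rough sense of the prompt-learning component can be given with a short sketch. The code below is not the paper's implementation; it only illustrates, under assumed choices (the t5-base backbone, the prompt length of 20, and the helper name forward_with_prompt are all hypothetical), how trainable soft-prompt vectors can be prepended to a PLM's input embeddings so that learned prompts, rather than fine-tuned weights alone, steer question generation.

```python
# Minimal sketch (not the authors' implementation): prepend trainable
# soft-prompt embeddings to a pre-trained encoder-decoder PLM so that
# prompt vectors learned for the task guide question generation.
# Model name, prompt length, and hyper-parameters are illustrative.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-base"        # assumed backbone; the paper may use another PLM
SOFT_PROMPT_LEN = 20          # assumed number of soft-prompt vectors

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
hidden_size = model.config.d_model

# Trainable soft prompt: one embedding per virtual prompt token.
soft_prompt = nn.Parameter(torch.randn(SOFT_PROMPT_LEN, hidden_size) * 0.02)

def forward_with_prompt(sentence: str, target_question: str) -> torch.Tensor:
    """Compute the generation loss with the soft prompt prepended."""
    enc = tokenizer(sentence, return_tensors="pt")
    labels = tokenizer(target_question, return_tensors="pt").input_ids

    # Embed the input tokens, then prepend the soft-prompt vectors.
    token_embeds = model.get_input_embeddings()(enc.input_ids)       # (1, L, H)
    prompt_embeds = soft_prompt.unsqueeze(0)                         # (1, P, H)
    inputs_embeds = torch.cat([prompt_embeds, token_embeds], dim=1)  # (1, P+L, H)

    # Extend the attention mask to cover the prompt positions.
    prompt_mask = torch.ones(1, SOFT_PROMPT_LEN, dtype=enc.attention_mask.dtype)
    attention_mask = torch.cat([prompt_mask, enc.attention_mask], dim=1)

    out = model(inputs_embeds=inputs_embeds,
                attention_mask=attention_mask,
                labels=labels)
    return out.loss  # optimize soft_prompt (and optionally the PLM) on this loss
```

In the model described in the abstract, the prompt additionally carries syntactic information from a dependency parse graph; the sketch omits that coupling and shows only the generic soft-prompt mechanism.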

Data Availability

The datasets generated and/or analyzed during the current study are available at https://microsoft.github.io/msmarco/ and https://rajpurkar.github.io/SQuAD-explorer/.

Notes

  1. https://spacy.io/ (see the parsing sketch following these notes).

  2. Unless otherwise specified, W denotes a trainable matrix throughout this paper.
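Note 1 points to spaCy, which can supply the dependency parse behind the graph mentioned in the abstract. The following is a minimal sketch, not the paper's graph construction; the pipeline name en_core_web_sm and the edge/adjacency format are assumptions for illustration.

```python
# Minimal sketch (assumptions noted): build a dependency parse graph with spaCy.
# The edge and adjacency formats are illustrative, not the paper's construction.
import spacy
import torch

nlp = spacy.load("en_core_web_sm")  # any spaCy pipeline with a parser works

def dependency_graph(sentence: str):
    """Return token texts, labeled dependency edges, and an adjacency matrix."""
    doc = nlp(sentence)
    tokens = [tok.text for tok in doc]

    # One edge per token: child index -> head index, labeled with the relation
    # (the root token is its own head, so it contributes no edge).
    edges = [(tok.i, tok.head.i, tok.dep_) for tok in doc if tok.i != tok.head.i]

    # Symmetric adjacency with self-loops, as commonly used by graph encoders.
    adj = torch.eye(len(doc))
    for child, head, _ in edges:
        adj[child, head] = 1.0
        adj[head, child] = 1.0
    return tokens, edges, adj

tokens, edges, adj = dependency_graph("The quick brown fox jumps over the lazy dog.")
print(edges[:3])  # e.g. [(0, 3, 'det'), (1, 3, 'amod'), (2, 3, 'amod')]
```

How such edges are then injected into the prompt (for example, through a graph encoder built from trainable matrices W, in the sense of Note 2) is a modeling choice described in the paper body, not in this sketch.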

Funding

Research in this paper was partially supported by the Natural Science Foundation of China (U21A20488). We thank the Big Data Computing Center of Southeast University for providing facility support for the numerical calculations in this paper.

Author information

Contributions

ZH performed the conceptualization, wrote the original draft, and developed the software; SB contributed to the conceptualization and methodology and reviewed the writing; GQ provided supervision, reviewed the writing, and acquired the funding; YZ, ZR, and YL carried out review and data analysis.

Corresponding author

Correspondence to Guilin Qi.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hou, Z., Bi, S., Qi, G. et al. Syntax-guided question generation using prompt learning. Neural Comput & Applic 36, 6271–6282 (2024). https://doi.org/10.1007/s00521-024-09421-7
