Enhancing multiple-choice question answering through sequential fine-tuning and Curriculum Learning strategies

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

With transformer-based pre-trained language models, multiple-choice question answering (MCQA) systems can reach a certain level of performance. This study focuses on inheriting the benefits of the contextualized language representations acquired by such models and on transferring and sharing information among MCQA datasets. We present a method called multi-stage fine-tuning that follows a Curriculum Learning strategy: rather than ordering individual training samples, it sequences the source datasets themselves in a meaningful, non-random order. An extensive series of experiments over various MCQA datasets shows that the proposed method achieves notable performance improvements over classical fine-tuning with the chosen baselines, T5 and RoBERTa. Further experiments on merged source datasets likewise show improved performance. The study indicates that increasing the number of source datasets, even including some small-scale ones, helps build well-generalized models, and that higher similarity between the source datasets and the target also plays a vital role in performance.
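The core idea described above — sequencing whole source datasets, rather than individual samples, before a final pass on the target — can be sketched in a few lines. This is an illustrative sketch only: `similarity_to_target` (a simple vocabulary-overlap proxy) and the least-to-most-similar ordering are assumptions standing in for whatever similarity notion and ordering the paper actually uses, and `fine_tune` is a hypothetical stand-in for a real training step on T5 or RoBERTa.

```python
def similarity_to_target(source, target):
    # Hypothetical proxy for dataset similarity: Jaccard overlap between
    # the word vocabularies of the two datasets' questions. The paper's
    # actual similarity measure may differ; this is only for illustration.
    src_vocab = {w for q in source for w in q.split()}
    tgt_vocab = {w for q in target for w in q.split()}
    union = src_vocab | tgt_vocab
    return len(src_vocab & tgt_vocab) / len(union) if union else 0.0


def multi_stage_fine_tune(model, sources, target, fine_tune):
    # Curriculum over datasets: order the source datasets (not individual
    # samples) in a meaningful, non-random way -- here assumed to be least
    # to most similar to the target, so the stage closest to the target
    # distribution comes last -- then fine-tune sequentially on each stage
    # and finish on the target dataset itself.
    ordered = sorted(sources, key=lambda s: similarity_to_target(s[1], target))
    for _name, data in ordered:
        model = fine_tune(model, data)   # one fine-tuning stage per dataset
    model = fine_tune(model, target)     # final stage on the target task
    return model, [name for name, _ in ordered]
```

Because each stage starts from the weights of the previous one, knowledge from earlier, less-related datasets is refined rather than overwritten as training moves toward the target distribution.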


Figures 1–3 (thumbnails only; captions not included in this preview)

Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


Acknowledgements

This research is supported in part by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under Grant Number 120E100. G. Yigit is supported by the TÜBİTAK BİDEB 2211/A national fellowship program for PhD studies.

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to the study conception and design. Manuscript preparation, experiments, and analysis were performed by all authors.

Corresponding author

Correspondence to Gulsum Yigit.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yigit, G., Amasyali, M.F. Enhancing multiple-choice question answering through sequential fine-tuning and Curriculum Learning strategies. Knowl Inf Syst 65, 5025–5042 (2023). https://doi.org/10.1007/s10115-023-01918-2

