Abstract
With transformer-based pre-trained language models, multiple-choice question answering (MCQA) systems can reach a certain level of performance. This study focuses on exploiting the contextualized language representations acquired by pre-trained language models and on transferring and sharing information among MCQA datasets. We present a method called multi-stage fine-tuning, based on the Curriculum Learning strategy, which sequences not the training samples but the source datasets themselves in a meaningful, non-random order. An extensive series of experiments over various MCQA datasets shows that the proposed method achieves notable performance improvements over classical fine-tuning with the chosen baselines, T5 and RoBERTa. Experiments on merged source datasets likewise show improved performance. This study shows that increasing the number of source datasets, even including some small-scale ones, helps build well-generalized models, and that higher similarity between the source datasets and the target also plays a vital role in performance.
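The multi-stage fine-tuning idea described above can be illustrated with a minimal sketch: order the source datasets by some measure of similarity to the target, then fine-tune on each stage in turn before the target itself. Everything below is hypothetical, not from the paper: the dataset names, the similarity scores, the dissimilar-first ordering, and the `fine_tune` stand-in (in practice this would be a full training loop over a model such as T5 or RoBERTa).

```python
# Hypothetical sketch of multi-stage fine-tuning with a curriculum over
# source datasets. Names, scores, and the ordering criterion are
# illustrative assumptions, not the authors' actual configuration.

def order_sources(sources, similarity_to_target):
    """Sort source datasets so the most target-similar one is trained last."""
    return sorted(sources, key=lambda name: similarity_to_target[name])

def fine_tune(model, dataset):
    # Stand-in for an actual fine-tuning stage; here the "model" is just
    # a list recording the order in which datasets were seen.
    return model + [dataset]

def multi_stage_fine_tune(model, ordered_sources, target):
    """Fine-tune sequentially on each source dataset, then on the target."""
    for dataset in ordered_sources:
        model = fine_tune(model, dataset)  # one curriculum stage
    return fine_tune(model, target)        # final stage on the target itself

# Illustrative similarity-to-target scores for three MCQA source datasets.
similarity = {"RACE": 0.9, "OpenBookQA": 0.4, "CosmosQA": 0.7}
stages = order_sources(["RACE", "OpenBookQA", "CosmosQA"], similarity)
trained = multi_stage_fine_tune([], stages, "CommonsenseQA")
print(trained)  # ['OpenBookQA', 'CosmosQA', 'RACE', 'CommonsenseQA']
```

The key design point the sketch captures is that the curriculum operates over whole datasets rather than individual training samples: each stage completes before the next begins, and the target dataset is always the final stage.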
Data availability
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Acknowledgements
This research is supported in part by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under Grant Number 120E100. G. Yigit is supported by the TÜBİTAK BİDEB 2211/A national fellowship program for PhD studies.
Author information
Authors and Affiliations
Contributions
All authors contributed to the study conception and design. Manuscript preparation, experiments, and analysis were performed by all authors.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yigit, G., Amasyali, M.F. Enhancing multiple-choice question answering through sequential fine-tuning and Curriculum Learning strategies. Knowl Inf Syst 65, 5025–5042 (2023). https://doi.org/10.1007/s10115-023-01918-2