FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages

Conference paper in Technology Enhanced Learning for Inclusive and Equitable Quality Education (EC-TEL 2024)

Abstract

Question Answering (QA) datasets are crucial in assessing reading comprehension skills for both machines and humans. While numerous datasets have been developed in English for this purpose, a noticeable void exists in less-resourced languages. To alleviate this gap, our paper introduces machine-translated versions of FairytaleQA, a renowned QA dataset designed to assess and enhance narrative comprehension skills in young children. By employing fine-tuned, modest-scale models, we establish benchmarks for both Question Generation (QG) and QA tasks on the translated datasets. In addition, we present a case study proposing a model for generating question-answer pairs, with an evaluation incorporating quality metrics such as question well-formedness, answerability, relevance, and suitability for children. Our evaluation prioritizes quantifying and describing error cases, along with providing directions for future work. This paper contributes to the advancement of QA and QG research in less-resourced languages, promoting accessibility and inclusivity in the development of these models for reading comprehension. The code and data are publicly available at github.com/bernardoleite/fairytaleqa-translated.
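For context, QG with fine-tuned seq2seq models of this kind is typically framed as a text-to-text task: the model receives an answer and its surrounding passage as input and must produce the question as output. A minimal sketch of how such training examples might be formatted follows; the "answer: ... context: ..." template and the function name are illustrative assumptions, not the exact format used in this work.

```python
# Hedged sketch: one plausible way to format a FairytaleQA example for a
# text-to-text (T5-style) question generation model. The input template
# below is an illustrative assumption, not the paper's exact format.

def format_qg_example(context: str, answer: str, question: str) -> dict:
    """Build a source/target pair for fine-tuning a seq2seq QG model."""
    source = f"answer: {answer} context: {context}"  # model input
    target = question                                # expected model output
    return {"source": source, "target": target}

example = format_qg_example(
    context="Once upon a time, a wolf lived at the edge of the forest.",
    answer="a wolf",
    question="Who lived at the edge of the forest?",
)
print(example["source"])
```

The same dataset can serve the QA task by swapping roles: the question and context become the source, and the answer becomes the target.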

Notes

  1. Henceforth, we use “QA dataset” to denote datasets comprising questions and answers, and “QA task” to refer to the computational task of answering a question.

  2. We have also included translated datasets for Italian and Romanian in our repository, although they were not studied in this research.

  3. Made available at github.com/bernardoleite/fairytaleqa-translated.

  4. Made available at github.com/bernardoleite/fairytaleqa-translated.

  5. QA datasets are typically also used for QG.

  6. It is worth noting recent efforts to make these models more accessible, for instance through quantization techniques.

  7. https://www.deepl.com/translator.

  8. https://huggingface.co/t5-base.

  9. While the difference is statistically significant, the effect size is small (Cohen’s d < 0.19).

  10. The 5 questions were randomly selected from a pool of 7, corresponding to the 7 narrative elements that can be controlled.

  11. We prompt GPT-4 Turbo to generate QA pairs from the narrative text in the target language, tailored for children aged 7 to 10, in line with FairytaleQA’s target audience.

  12. https://iave.pt/provas-e-exames/provas-e-exames/provas-de-afericao-eb/.

  13. In 4% of cases, votes were tied, leading us to seek an additional volunteer to break the tie.

  14. The following examples have been translated from Portuguese to English.
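The majority-vote procedure mentioned in note 13, where an extra volunteer is recruited only when votes tie, can be sketched as follows. The function name and label encoding are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

def majority_vote(votes):
    """Return (winner, needs_tiebreak).

    winner is None when the top two labels receive the same number of
    votes; in that case an additional vote is needed to break the tie.
    """
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None, True  # tied: recruit an additional volunteer
    return counts[0][0], False
```

For instance, `majority_vote(["well-formed", "well-formed", "not well-formed"])` yields `("well-formed", False)`, while a 1-1 split yields `(None, True)`.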

Acknowledgments

The authors thank the 15 participants who voluntarily participated in the evaluation process. We would like to specifically acknowledge those who agreed to share their names: Ângela Esteves, Bruno Miguel Pinto, David Reis, Diana Pinto, Maria de Abreu, Mariana Coelho, Pedro Costa, Rui Leixo and Vítor Magalhães. This work was financially supported by Base Funding - UIDB/00027/2020 and Programatic Funding - UIDP/00027/2020 of the Artificial Intelligence and Computer Science Laboratory - LIACC - funded by national funds through the FCT/MCTES (PIDDAC). Bernardo Leite is supported by a PhD studentship (with reference 2021.05432.BD), funded by Fundação para a Ciência e a Tecnologia (FCT).

Author information

Correspondence to Bernardo Leite.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Leite, B., Osório, T.F., Cardoso, H.L. (2024). FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages. In: Ferreira Mello, R., Rummel, N., Jivet, I., Pishtari, G., Ruipérez Valiente, J.A. (eds) Technology Enhanced Learning for Inclusive and Equitable Quality Education. EC-TEL 2024. Lecture Notes in Computer Science, vol 15159. Springer, Cham. https://doi.org/10.1007/978-3-031-72315-5_16

  • DOI: https://doi.org/10.1007/978-3-031-72315-5_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72314-8

  • Online ISBN: 978-3-031-72315-5

  • eBook Packages: Computer Science (R0)
