Abstract
Question answering (QA) systems are usually built as strict conditional generators that return an answer for every input question. Always responding, however, may be harmful, as the system may produce inaccurate answers, particularly to ambiguous or sensitive questions; it may be better for a QA system to decide which questions should be answered at all. In this paper, we explore dual system architectures that filter out unanswerable or meaningless questions, so that only a subset of the questions raised is answered. We perform two experiments to evaluate this modular approach: classification of unanswerable questions on SQuAD 2.0, and regression of question meaningfulness on Pirá. Despite the difficulty of these tasks, we show that filtering questions can improve the quality of the answers generated by QA systems. By using classification and regression models to filter questions, we gain better control over the accuracy of the answers produced by the answerer systems.
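The dual-system setup described above can be sketched as a two-stage pipeline: a filter scores each question, and the answerer is only invoked when the score clears a threshold. This is a minimal illustrative sketch; the names (`FilteredQA`, `score_fn`, `answer_fn`, `threshold`) are assumptions for exposition, not the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class FilteredQA:
    """Answer a question only if a filter model deems it worth answering."""
    score_fn: Callable[[str], float]   # classifier probability or regressed meaningfulness
    answer_fn: Callable[[str], str]    # any downstream QA model
    threshold: float = 0.5             # operating point: trades coverage for accuracy

    def __call__(self, question: str) -> Optional[str]:
        # Abstain (return None) when the filter judges the question
        # unanswerable or insufficiently meaningful.
        if self.score_fn(question) < self.threshold:
            return None
        return self.answer_fn(question)

# Toy usage with a keyword-based "filter" and a canned answerer.
qa = FilteredQA(
    score_fn=lambda q: 0.9 if "ocean" in q.lower() else 0.1,
    answer_fn=lambda q: "About 3,688 m on average.",
)
print(qa("How deep is the ocean?"))      # answered
print(qa("What is the sound of blue?"))  # abstains -> None
```

Raising `threshold` makes the system abstain more often but answer more accurately, which is the accuracy/coverage control the abstract refers to.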
This work was carried out at the Center for Artificial Intelligence (C4AI-USP), with support by the São Paulo Research Foundation (FAPESP grant #2019/07665-4) and by the IBM Corporation. Fabio G. Cozman thanks the support of the National Council for Scientific and Technological Development of Brazil (CNPq grant #312180/2018-7).
Notes
1. To ensure reproducibility, code, dataset partitions, and trained models are available at the project's GitHub repository: https://github.com/C4AI/Pira/tree/main/Triggering.
2. In Pirá, only QA sets with meaningfulness evaluations were used. For the original dataset, the numbers would be: train: 1896 (79.98%), validation: 225 (9.96%), test: 227 (10%), total: 2258 (100%).
3. F1-score is computed with the official SQuAD evaluation script, available at: https://rajpurkar.github.io/SQuAD-explorer/.
4. Both the classifiers described in this section and the regressors trained in the next one use random initializations that may result in slightly different predictions. To ensure the consistency of results, we repeated each experiment 10 times. The results described here are therefore representative of the trained models.
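Note 3 above refers to the token-level F1 used in SQuAD evaluation. A simplified sketch of that metric follows; the official script additionally normalizes answers (lowercasing, stripping articles and punctuation) before tokenizing, which this version only approximates with lowercasing.

```python
from collections import Counter

def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-overlap F1 in the style of the official SQuAD evaluation."""
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("in the pacific ocean", "the pacific ocean"))  # ~0.857
```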
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Pirozelli, P., Brandão, A.A.F., Peres, S.M., Cozman, F.G. (2022). To Answer or Not to Answer? Filtering Questions for QA Systems. In: Xavier-Junior, J.C., Rios, R.A. (eds.) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol. 13654. Springer, Cham. https://doi.org/10.1007/978-3-031-21689-3_33
Print ISBN: 978-3-031-21688-6
Online ISBN: 978-3-031-21689-3