A Reproducibility Study of Question Retrieval for Clarifying Questions

Cross, Sebastian; Zuccon, Guido; Mourad, Ahmed

doi:10.1007/978-3-031-28241-6_3

Conference paper
First Online: 16 March 2023

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13982)

Abstract

The use of clarifying questions within a search system can play a key role in improving retrieval effectiveness. The generation and exploitation of clarifying questions is an emerging area of research in information retrieval, especially in the context of conversational search.

In this paper, we attempt to reproduce and analyse a milestone work in this area. Through close communication with the original authors and data sharing, we were able to identify a key issue that impacted the original experiments and our independent attempts at reproduction; this issue relates to data preparation. In particular, the clarifying questions retrieval task consists of retrieving clarifying questions from a question bank for a given query. In the original data preparation, the question bank was split into separate folds for retrieval, so that each split contained (approximately) a fifth of the data in the full question bank. This setting does not resemble that of a production system; in addition, it was applied only to learnt methods, while keyword matching methods used the full question bank. This created inconsistency in the reporting of the results and led to overestimated findings. We demonstrate this through a set of empirical experiments and analyses.
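
The effect of retrieving from a fold rather than from the full question bank can be illustrated with a toy sketch. The Python below is not the pipeline of either paper: the question bank, the term-overlap scorer, and the names `overlap_score`, `rank`, and `reciprocal_rank` are all hypothetical. It only shows that, with a deterministic scorer, ranking against a subset that happens to contain the relevant question cannot place it lower than ranking against the full bank, because fewer distractors compete.

```python
from typing import List


def overlap_score(query: str, question: str) -> int:
    """Toy scorer: number of distinct query terms that occur in the question."""
    return len(set(query.lower().split()) & set(question.lower().split()))


def rank(query: str, bank: List[str]) -> List[str]:
    """Rank questions by descending score; ties are broken alphabetically."""
    return sorted(bank, key=lambda q: (-overlap_score(query, q), q))


def reciprocal_rank(ranking: List[str], relevant: str) -> float:
    """1 / rank of the relevant question, or 0.0 if it was not retrieved."""
    return 1.0 / (ranking.index(relevant) + 1) if relevant in ranking else 0.0


# Hypothetical data: one query, one relevant question, nine distractors.
query = "jaguar speed"
relevant = "are you asking about the jaguar animal"
full_bank = [
    relevant,
    "do you want the top speed of a jaguar car",
    "are you looking for jaguar speed records",
    "are you interested in internet speed tests",
    "do you need directions to the airport",
    "which operating system do you use",
    "do you want to book a flight",
    "are you looking for a recipe",
    "do you want jaguar dealership locations",
    "would you like to compare phone plans",
]
# A "fold" holding a fifth of the bank that happens to include the relevant question.
fold = [full_bank[0], full_bank[4]]

rr_fold = reciprocal_rank(rank(query, fold), relevant)
rr_full = reciprocal_rank(rank(query, full_bank), relevant)
print(f"fold RR = {rr_fold:.3f}, full-bank RR = {rr_full:.3f}")
```

On this toy data the relevant question is ranked first within the fold but third in the full bank, so the fold-based reciprocal rank (1.000) overstates the full-bank value (0.333).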

Notes

1. In that it resembles what a production system may look like.
2. https://sourceforge.net/p/lemur/wiki/RankLib.
3. Note this was true in early experiments, but in the experiments reported in this paper, we were able to reproduce the exact split of topics into folds that they used.
4. We note that different information retrieval toolkits follow different reference implementations of some of the keyword matching methods, e.g. of BM25.
5. Note that commonly in learning to rank, feature files are created for the top-k candidate documents. This, however, is not because retrieval only considers k documents. Learning to rank is infeasible for large collections, and is therefore part of a cascade pipeline where full-index retrieval occurs first with a cheaper model, and then learning to rank is applied to the top-k. Yet, retrieval considers the full index, not an arbitrary subset that (what are the chances?) contains all relevant documents.
6. https://github.com/aliannejadi/qulac.
7. Possibly tied with other questions that also have a zero-valued feature representation, which, in the dataset considered, are the majority of them.
8. Once we obtained the feature files for learning to rank, we knew which topics were grouped together in which fold, and thus could recreate the same topic-wise division.
9. Ours: (BM25) \(k_1 = 0.9\), \(b = 0.4\); (QL) \(\mu = 1000\); (RM3) \(fb_{terms} = 10\), \(fb_{docs} = 10\), \(original\_query\_weight = 0.5\). They do not report parameter values.
10. We used the Porter stemmer and Anserini’s default stop-list. They do not report their settings.
11. We used version 2.17; Aliannejadi et al. did not report the version.
12. https://huggingface.co/bert-base-uncased.
13. https://github.com/aliannejadi/ClariQ.
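
The cascade described in the notes (a cheap model scores the full index, then only the top-k candidates are passed to a costlier ranker) can be sketched as follows. This is a minimal illustration under stated assumptions, not Anserini's or the original authors' implementation: the function names `bm25_scores` and `first_stage` are hypothetical, tokenisation is plain whitespace splitting with no stemming or stop-list, and the IDF form is one common variant among the several that toolkits implement. Only the parameter values \(k_1 = 0.9\) and \(b = 0.4\) come from the notes.

```python
import math
from collections import Counter
from typing import List, Tuple

K1, B = 0.9, 0.4  # BM25 parameters reported in the notes for the reproduction


def bm25_scores(query: str, docs: List[str], k1: float = K1, b: float = B) -> List[float]:
    """Score every document against the query with one BM25 variant.

    Assumption: IDF = log(1 + (N - n + 0.5) / (n + 0.5)); other toolkits differ.
    """
    if not docs:
        return []
    tokenised = [d.lower().split() for d in docs]
    n_docs = len(tokenised)
    avg_len = (sum(len(t) for t in tokenised) / n_docs) or 1.0
    df = Counter(term for toks in tokenised for term in set(toks))
    query_terms = set(query.lower().split())
    scores = []
    for toks in tokenised:
        tf = Counter(toks)
        norm = k1 * (1 - b + b * len(toks) / avg_len)  # length normalisation
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


def first_stage(query: str, docs: List[str], k: int) -> List[Tuple[str, float]]:
    """Rank the FULL collection, then keep the top-k as reranking candidates.

    Ties (e.g. many zero scores) are broken by original position.
    """
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: (-scores[i], i))
    return [(docs[i], scores[i]) for i in order[:k]]


# Hypothetical three-question bank.
bank = [
    "do you want jaguar car prices",
    "are you asking about the jaguar animal",
    "which operating system do you use",
]
candidates = first_stage("jaguar animal", bank, k=2)
```

Here `candidates` holds the two best-scoring questions from the whole bank; a learning-to-rank model would then reorder only those candidates. The tie-breaking rule is made explicit because, as the notes point out, many questions can share a zero score.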

References

Aliannejadi, M., Kiseleva, J., Chuklin, A., Dalton, J., Burtsev, M.: ConvAI3: Generating Clarifying Questions for Open-Domain Dialogue Systems (ClariQ). arXiv:2009.11352 (2020)
Aliannejadi, M., Kiseleva, J., Chuklin, A., Dalton, J., Burtsev, M.: Building and evaluating open-domain dialogue corpora with clarifying questions. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 4473–4484 (2021)
Aliannejadi, M., Zamani, H., Crestani, F., Croft, W.B.: Asking clarifying questions in open-domain information-seeking conversations. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 475–484 (2019)
Bi, K., Ai, Q., Croft, W.B.: Asking clarifying questions based on negative feedback in conversational search. In: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pp. 157–166 (2021)
Cabanac, G., Hubert, G., Boughanem, M., Chrisment, C.: Tie-breaking bias: effect of an uncontrolled parameter on information retrieval evaluation. In: Agosti, M., Ferro, N., Peters, C., de Rijke, M., Smeaton, A. (eds.) CLEF 2010. LNCS, vol. 6360, pp. 112–123. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15998-5_13
Cai, F., De Rijke, M., et al.: A survey of query auto completion in information retrieval. Found. Trends® Inf. Retrieval 10(4), 273–363 (2016)
Carterette, B.: System effectiveness, user models, and user utility: a conceptual framework for investigation. In: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 903–912 (2011)
Cartright, M.A., Huston, S.J., Feild, H.: Galago: a modular distributed processing and retrieval system. In: Proceedings of the SIGIR 2012 Workshop on Open Source Information Retrieval, pp. 25–31 (2012)
Clarke, C.L., Craswell, N., Soboroff, I.: Overview of the TREC 2009 web track. In: Proceedings of TREC (2009)
Dubiel, M., Halvey, M., Azzopardi, L., Anderson, D., Daronnat, S.: Conversational strategies: impact on search performance in a goal-oriented task. In: The Third International Workshop on Conversational Approaches to Information Retrieval (2020)
Fails, J.A., Pera, M.S., Anuyah, O., Kennington, C., Wright, K.L., Bigirimana, W.: Query formulation assistance for kids: what is available, when to help & what kids want. In: Proceedings of the 18th ACM International Conference on Interaction Design and Children, pp. 109–120 (2019)
Kim, J.K., Wang, G., Lee, S., Kim, Y.B.: Deciding whether to ask clarifying questions in large-scale spoken language understanding. In: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 869–876. IEEE (2021)
Krasakis, A.M., Aliannejadi, M., Voskarides, N., Kanoulas, E.: Analysing the effect of clarifying questions on document ranking in conversational search. In: Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval, pp. 129–132 (2020)
Lavrenko, V., Croft, W.B.: Relevance-based language models. In: ACM SIGIR Forum, vol. 51, pp. 260–267. ACM, New York (2017)
Lee, C.-J., Lin, Y.-C., Chen, R.-C., Cheng, P.-J.: Selecting effective terms for query formulation. In: Lee, G.G., et al. (eds.) AIRS 2009. LNCS, vol. 5839, pp. 168–180. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04769-5_15
Li, H.: Learning to rank for information retrieval and natural language processing. Synth. Lect. Hum. Lang. Technol. 7(3), 1–121 (2014)
Lin, J., Nogueira, R., Yates, A.: Pretrained transformers for text ranking: BERT and beyond. Synth. Lect. Hum. Lang. Technol. 14(4), 1–325 (2021)
Lin, J., Yang, P.: The impact of score ties on repeatability in document ranking. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1125–1128 (2019)
Liu, T.Y., et al.: Learning to rank for information retrieval. Found. Trends® Inf. Retrieval 3(3), 225–331 (2009)
Lotze, T., Klut, S., Aliannejadi, M., Kanoulas, E.: Ranking clarifying questions based on predicted user engagement. In: MICROS Workshop at ECIR 2021 (2021)
McSherry, F., Najork, M.: Computing information retrieval performance measures efficiently in the presence of tied scores. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds.) ECIR 2008. LNCS, vol. 4956, pp. 414–421. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78646-7_38
Nogueira, R., Cho, K.: Passage re-ranking with BERT. arXiv preprint arXiv:1901.04085 (2019)
Robertson, S., Zaragoza, H., et al.: The probabilistic relevance framework: BM25 and beyond. Found. Trends® Inf. Retrieval 3(4), 333–389 (2009)
Russell-Rose, T., Chamberlain, J., Shokraneh, F.: A visual approach to query formulation for systematic search. In: Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, pp. 379–383 (2019)
Scells, H., Zuccon, G., Koopman, B.: A comparison of automatic boolean query formulation for systematic reviews. Inf. Retrieval J. 24(1), 3–28 (2021)
Scells, H., Zuccon, G., Koopman, B., Clark, J.: Automatic boolean query formulation for systematic review literature search. In: Proceedings of the Web Conference 2020, pp. 1071–1081 (2020)
Sekulić, I., Aliannejadi, M., Crestani, F.: Towards facet-driven generation of clarifying questions for conversational search. In: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pp. 167–175 (2021)
Soboroff, I.M., Craswell, N., Clarke, C.L., Cormack, G., et al.: Overview of the TREC 2011 web track. In: Proceedings of TREC (2011)
Tavakoli, L.: Generating clarifying questions in conversational search systems. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 3253–3256 (2020)
Tonellotto, N.: Lecture notes on neural information retrieval. arXiv preprint arXiv:2207.13443 (2022)
Vakulenko, S., Kanoulas, E., De Rijke, M.: A large-scale analysis of mixed initiative in information-seeking dialogues for conversational search. ACM Trans. Inf. Syst. (TOIS) 39(4), 1–32 (2021)
Wang, J., Li, W.: Template-guided clarifying question generation for web search clarification. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 3468–3472 (2021)
Yang, P., Fang, H., Lin, J.: Anserini: reproducible ranking baselines using Lucene. J. Data Inf. Qual. (JDIQ) 10(4), 1–20 (2018)
Yang, Z., Moffat, A., Turpin, A.: How precise does document scoring need to be? In: Ma, S., et al. (eds.) AIRS 2016. LNCS, vol. 9994, pp. 279–291. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48051-0_21
Zamani, H., Dumais, S., Craswell, N., Bennett, P., Lueck, G.: Generating clarifying questions for information retrieval. In: Proceedings of the Web Conference 2020, pp. 418–428 (2020)
Zhai, C.: Statistical language models for information retrieval. Synth. Lect. Hum. Lang. Technol. 1(1), 1–141 (2008)
Zhai, C., Lafferty, J.: A study of smoothing methods for language models applied to information retrieval. ACM Trans. Inf. Syst. (TOIS) 22(2), 179–214 (2004)
Zhao, Z., Dou, Z., Mao, J., Wen, J.R.: Generating clarifying questions with web search results. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 234–244 (2022)
Zou, J., Kanoulas, E., Liu, Y.: An empirical study on clarifying question-based systems. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 2361–2364 (2020)

Acknowledgments

This work was partially supported by an Australian Research Council DECRA Research Fellowship (DE180101579).

Author information

Authors and Affiliations

The University of Queensland, St Lucia, Australia
Sebastian Cross, Guido Zuccon & Ahmed Mourad

Corresponding author

Correspondence to Sebastian Cross.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Jaap Kamps
Université Grenoble-Alpes, Saint-Martin-d’Hères, France
Lorraine Goeuriot
Università della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani
University of Copenhagen, Copenhagen, Denmark
Maria Maistro
University of Tsukuba, Ibaraki, Japan
Hideo Joho
Dublin City University, Dublin, Ireland
Brian Davis
Dublin City University, Dublin, Ireland
Cathal Gurrin
Universität Regensburg, Regensburg, Germany
Udo Kruschwitz
Dublin City University, Dublin, Ireland
Annalina Caputo

About this paper

Cite this paper

Cross, S., Zuccon, G., Mourad, A. (2023). A Reproducibility Study of Question Retrieval for Clarifying Questions. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13982. Springer, Cham. https://doi.org/10.1007/978-3-031-28241-6_3

DOI: https://doi.org/10.1007/978-3-031-28241-6_3
Published: 16 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28240-9
Online ISBN: 978-3-031-28241-6
eBook Packages: Computer Science, Computer Science (R0)
