Abstract
The purpose of an automatic query difficulty predictor is to decide whether an information retrieval system will be able to provide an appropriate answer for a given query. Researchers have investigated many types of automatic query difficulty predictors. Most are tied to how search engines process queries and documents: they rely on the inner workings of the search and ranking functions, and therefore they neither offer insightful explanations of the reasons for the difficulty nor account for user-oriented aspects. In this paper we study whether humans can provide useful explanations, or reasons, of why they think a query will be easy or difficult for a search engine. We run two experiments that vary the TREC reference collection, the amount of information available about the query, and the method of annotation generation. We examine the correlations between human predictions, the reasons annotators provide, automatic predictions, and actual system effectiveness. The main findings of this study are twofold. First, we confirm the result of previous studies that human predictions correlate only weakly with system effectiveness. Second, and probably more important, after analyzing the reasons given by the annotators we find that: (i) overall, the reasons seem coherent, sensible, and informative; (ii) humans have an accurate picture of some query or term characteristics; and (iii) yet, they cannot reliably predict system/query difficulty.
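To make the correlation analysis mentioned above concrete, the sketch below (not the authors' code; all data values are hypothetical placeholders) shows how per-query human difficulty ratings and an automatic predictor score could be compared against actual system effectiveness, e.g. Average Precision, using Kendall's tau, a correlation measure commonly used in query performance prediction studies.

```python
# Minimal sketch of correlating difficulty predictions with system effectiveness.
# The per-query values below are hypothetical, for illustration only.
from scipy.stats import kendalltau, pearsonr

# Hypothetical per-query data: higher "difficulty" means the annotator or
# predictor expects the system to perform worse on that query.
human_difficulty = [4, 2, 5, 1, 3, 4, 2]                  # annotator ratings (1 = easy, 5 = hard)
automatic_score  = [0.8, 0.3, 0.9, 0.2, 0.5, 0.7, 0.4]    # e.g. a pre-retrieval predictor
system_ap        = [0.12, 0.45, 0.08, 0.60, 0.33, 0.20, 0.50]  # Average Precision per query

# An informative difficulty prediction should correlate negatively with effectiveness.
tau_human, p_human = kendalltau(human_difficulty, system_ap)
tau_auto, p_auto = kendalltau(automatic_score, system_ap)
r_human, _ = pearsonr(human_difficulty, system_ap)

print(f"Human vs. AP:     tau={tau_human:.2f} (p={p_human:.3f}), r={r_human:.2f}")
print(f"Automatic vs. AP: tau={tau_auto:.2f} (p={p_auto:.3f})")
```

A weak (near-zero) correlation for the human ratings, as reported in the abstract, would indicate that annotators cannot reliably anticipate system effectiveness even when their stated reasons are coherent.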
Notes
1. We use the terms “predictors” and “annotators” or “participants” to distinguish between automatic and human prediction, respectively.