Human-Based Query Difficulty Prediction

Conference paper in: Advances in Information Retrieval (ECIR 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10193)

Abstract

The purpose of an automatic query difficulty predictor is to decide whether an information retrieval system will be able to provide an appropriate answer for a given query. Researchers have investigated many types of automatic query difficulty predictors. Most are tied to how search engines process queries and documents: because they are based on the inner workings of the search and ranking functions, they offer little insight into the reasons for the difficulty, and they neglect user-oriented aspects. In this paper we study whether humans can provide useful explanations, or reasons, for why they think a query will be easy or difficult for a search engine. We run two experiments that vary the TREC reference collection, the amount of information available about the query, and the method of annotation generation. We examine the correlations between the human predictions, the reasons the annotators provide, the automatic predictions, and the actual system effectiveness. The main findings of this study are twofold. First, we confirm the result of previous studies that human predictions correlate only weakly with system effectiveness. Second, and probably more important, after analyzing the reasons given by the annotators we find that: (i) overall, the reasons seem coherent, sensible, and informative; (ii) humans have an accurate picture of some query or term characteristics; and (iii) nevertheless, they cannot reliably predict system/query difficulty.
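To make the kind of comparison the abstract describes concrete, here is a minimal sketch (not code from the paper): it computes a classic IDF-style pre-retrieval predictor and rank-correlates it, alongside hypothetical human difficulty ratings, with per-query effectiveness (e.g., average precision). All term frequencies, queries, ratings, and scores below are invented placeholders.

```python
# Minimal sketch, assuming hypothetical data: correlate a simple
# pre-retrieval predictor and human difficulty ratings with
# measured per-query effectiveness.

import math
from scipy.stats import kendalltau

# Hypothetical document frequencies from an indexed collection.
doc_freq = {"airport": 120, "security": 300, "perception": 15}
num_docs = 10_000

def mean_idf(query_terms):
    """A classic pre-retrieval predictor: average inverse document
    frequency of the query terms (more specific terms often mean
    an easier query)."""
    return sum(math.log(num_docs / doc_freq.get(t, 1))
               for t in query_terms) / len(query_terms)

# Hypothetical per-query data: annotator difficulty ratings (1-5, higher
# = judged harder) and the system's measured average precision.
queries = [["airport", "security"], ["perception", "security"], ["airport"]]
human_rating = [2, 4, 3]
effectiveness = [0.45, 0.12, 0.30]

predictor_scores = [mean_idf(q) for q in queries]

# Rank correlation, as commonly used in query performance prediction.
tau_human, _ = kendalltau(human_rating, effectiveness)
tau_auto, _ = kendalltau(predictor_scores, effectiveness)
print(f"human vs. effectiveness:     tau = {tau_human:.2f}")
print(f"predictor vs. effectiveness: tau = {tau_auto:.2f}")
```

A weak (near-zero) tau for the human ratings against effectiveness, as reported in the abstract, would indicate that annotators' perceived difficulty does not track how the system actually performs.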

Notes

  1. We use the terms “predictors” and “annotators” or “participants” to distinguish between automatic and human prediction, respectively.

Author information

Correspondence to Josiane Mothe.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Chifu, A.G., Déjean, S., Mizzaro, S., Mothe, J. (2017). Human-Based Query Difficulty Prediction. In: Jose, J., et al. (eds.) Advances in Information Retrieval. ECIR 2017. Lecture Notes in Computer Science, vol. 10193. Springer, Cham. https://doi.org/10.1007/978-3-319-56608-5_27

  • DOI: https://doi.org/10.1007/978-3-319-56608-5_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56607-8

  • Online ISBN: 978-3-319-56608-5

  • eBook Packages: Computer Science, Computer Science (R0)
