
A Tutorial on Measuring Document Retrievability

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9022)

Abstract

Retrievability is an important and interesting indicator that can be used in a number of ways to analyse Information Retrieval systems and document collections. Rather than focusing solely on relevance, retrievability examines what is retrieved, how often it is retrieved, and how likely a user is to retrieve it. This matters because a document must first be retrieved before it can be judged for relevance. In this tutorial, we explained the concept of retrievability, introduced a number of retrievability measures, and showed how retrievability can be estimated and used for analysis. Since retrieval precedes relevance, we described how retrievability relates to effectiveness, along with some of the insights that researchers have discovered thus far. We also showed how retrievability relates to efficiency, and how the theory of retrievability can be used to improve both effectiveness and efficiency. We then presented an overview of applications of retrievability, such as search engine bias and corpus profiling, before wrapping up with challenges and opportunities. The final session of the day worked through example problems and techniques for analysing and applying retrievability in other problems and domains. This tutorial was designed for: (i) researchers curious about retrievability and wanting to see how it can impact their research, (ii) researchers who would like to expand their set of analysis techniques, and/or (iii) researchers who would like to use retrievability to perform their own analysis.
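To make the measures concrete, here is a minimal sketch of one common cumulative formulation from the retrievability literature: a document's retrievability r(d) counts how many queries retrieve it within a rank cutoff c, and a Gini coefficient over the r(d) scores summarises how unequally the system distributes access (0 = all documents equally retrievable, approaching 1 = maximal bias). The function names, the toy cutoff, and the equal query weighting are illustrative assumptions, not details taken from the tutorial itself.

```python
from collections import defaultdict

def retrievability(rankings, c=10):
    """Cumulative retrievability r(d): the number of queries that
    retrieve document d at or above rank cutoff c.

    `rankings` is a list of ranked document-id lists, one per query;
    all queries are weighted equally (an assumption for this sketch)."""
    r = defaultdict(int)
    for ranked_docs in rankings:
        for doc in ranked_docs[:c]:   # only the top-c results count
            r[doc] += 1
    return dict(r)

def gini(scores):
    """Gini coefficient of a list of retrievability scores,
    computed from the sorted scores via the Lorenz curve."""
    xs = sorted(scores)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n

# Toy example: three queries, rank cutoff 2.
rankings = [["d1", "d2", "d3"], ["d1", "d4"], ["d1", "d2"]]
r = retrievability(rankings, c=2)   # d3 falls below the cutoff
bias = gini(list(r.values()))
```

In a real analysis the query set is typically generated automatically from the collection (e.g. frequent terms or bigrams), and the cutoff c models how deep a user is willing to look in the ranking.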





Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Azzopardi, L. (2015). A Tutorial on Measuring Document Retrievability. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds) Advances in Information Retrieval. ECIR 2015. Lecture Notes in Computer Science, vol 9022. Springer, Cham. https://doi.org/10.1007/978-3-319-16354-3_92

  • DOI: https://doi.org/10.1007/978-3-319-16354-3_92

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16353-6

  • Online ISBN: 978-3-319-16354-3

  • eBook Packages: Computer Science (R0)
