Abstract
Retrievability is an indicator that can be used in a number of ways to analyse Information Retrieval systems and document collections. Rather than focusing solely on relevance, retrievability examines what is retrieved, how often it is retrieved, and whether a user is likely to retrieve it or not. This is important because a document needs to be retrieved before it can be judged for relevance. In this tutorial, we explained the concept of retrievability along with a number of retrievability measures, how retrievability can be estimated, and how it can be used for analysis. Since retrieval precedes relevance, we described how retrievability relates to effectiveness, along with some of the insights that researchers have discovered thus far. We also showed how retrievability relates to efficiency, and how the theory of retrievability can be used to improve both effectiveness and efficiency. We then presented an overview of the different applications of retrievability, such as Search Engine Bias and Corpus Profiling, before wrapping up with challenges and opportunities. The final session of the day examined example problems and techniques to analyse and apply retrievability to other problems and domains. This tutorial was designed for: (i) researchers curious about retrievability and wanting to see how it can impact their research, (ii) researchers who would like to expand their set of analysis techniques, and/or (iii) researchers who would like to use retrievability to perform their own analysis.
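As an illustration of what "how often it is retrieved" can mean in practice, the following is a minimal sketch of one commonly used measure, cumulative retrievability, together with a Gini coefficient as a summary of how unevenly retrievability is spread over a collection. The abstract itself does not specify any formula, so the measure chosen here, the function names, the cutoff, and the toy rankings are all assumptions made for illustration only.

```python
from collections import Counter


def cumulative_retrievability(rankings, cutoff):
    """Count, per document, how many queries retrieve it at rank <= cutoff.

    `rankings` maps a query id to its ranked list of document ids
    (best first). All queries are weighted equally (an assumption).
    """
    scores = Counter()
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values.

    0 means every document is equally retrievable; values closer to 1
    mean retrievability is concentrated on a few documents.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Hypothetical toy data: three queries over a five-document collection.
rankings = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d4"],
    "q3": ["d1", "d2", "d4"],
}
collection = ["d1", "d2", "d3", "d4", "d5"]

scores = cumulative_retrievability(rankings, cutoff=2)
# Documents that were never retrieved still count, with a score of 0.
r = [scores[d] for d in collection]
print(r, round(gini(r), 3))
```

In a real analysis the query set would be much larger (for example, generated from the collection's vocabulary) and the rankings would come from the retrieval system under study; the toy values above only show the mechanics.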
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Azzopardi, L. (2015). A Tutorial on Measuring Document Retrievability. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds) Advances in Information Retrieval. ECIR 2015. Lecture Notes in Computer Science, vol 9022. Springer, Cham. https://doi.org/10.1007/978-3-319-16354-3_92
DOI: https://doi.org/10.1007/978-3-319-16354-3_92
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16353-6
Online ISBN: 978-3-319-16354-3
eBook Packages: Computer Science (R0)