Abstract
Search engines benefit from the ability to rank documents for novelty and diversity: the top k retrieved documents should cover the possible interpretations of a query according to some distribution, contain a diverse set of subtopics related to the user’s information need, or contain nuggets of information with little redundancy. Evaluation measures have been introduced to assess how effectively systems perform this task, but computing these measures is NP-complete in the worst case. We use simulation to investigate the implications of this for the optimization and evaluation of retrieval systems.
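To illustrate why the worst case is hard, the following is a minimal sketch, not the method of the paper. It assumes binary document-to-subtopic judgments and scores a set of k documents by the fraction of subtopics it covers. The function names (`subtopic_recall`, `best_set_exhaustive`, `best_set_greedy`) and the toy judgments are hypothetical and chosen only for illustration. Finding the best set exactly amounts to a maximum-coverage problem, so the exhaustive search enumerates every k-subset, while the greedy heuristic runs quickly but can miss the optimum.

```python
from itertools import combinations


def subtopic_recall(docs, judgments, n_subtopics):
    """Fraction of the n_subtopics covered by the given documents."""
    covered = set()
    for doc in docs:
        covered |= judgments.get(doc, set())
    return len(covered) / n_subtopics


def best_set_exhaustive(judgments, k):
    """Exact optimum: try every k-subset (exponential in general)."""
    docs = sorted(judgments)
    return max(
        combinations(docs, k),
        key=lambda c: len(set().union(*(judgments[d] for d in c))),
    )


def best_set_greedy(judgments, k):
    """Greedy heuristic: repeatedly add the document covering the most new subtopics."""
    chosen, covered = [], set()
    remaining = set(judgments)
    for _ in range(min(k, len(remaining))):
        # sorted() makes tie-breaking deterministic (alphabetical order)
        doc = max(sorted(remaining), key=lambda d: len(judgments[d] - covered))
        chosen.append(doc)
        covered |= judgments[doc]
        remaining.remove(doc)
    return chosen


# Hypothetical toy judgments: three documents, six subtopics.
judgments = {
    "d1": {1, 2, 3, 4},
    "d2": {1, 2, 5},
    "d3": {3, 4, 6},
}

greedy = best_set_greedy(judgments, 2)
exact = best_set_exhaustive(judgments, 2)
print(greedy, subtopic_recall(greedy, judgments, 6))
print(list(exact), subtopic_recall(exact, judgments, 6))
```

In this toy case the greedy choice starts with the document covering the most subtopics and ends up covering five of six, whereas the exhaustive search finds the pair that covers all six. The gap between the two is the kind of effect that matters when an evaluation measure is normalized by an ideal ranking.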
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Carterette, B. (2009). An Analysis of NP-Completeness in Novelty and Diversity Ranking. In: Azzopardi, L., et al. Advances in Information Retrieval Theory. ICTIR 2009. Lecture Notes in Computer Science, vol 5766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04417-5_18
DOI: https://doi.org/10.1007/978-3-642-04417-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04416-8
Online ISBN: 978-3-642-04417-5
eBook Packages: Computer Science, Computer Science (R0)