ABSTRACT
Search result diversification has attracted considerable attention as a means of tackling users' ambiguous or multi-faceted information needs. One of its key problems is novelty, that is, how to measure the novelty of a candidate document with respect to other documents. Heuristic approaches define novelty directly through predefined document similarity functions. Learning approaches characterize novelty with a set of handcrafted features. Both the similarity functions and the features are difficult to design manually in real-world settings because of the complexity of modeling document novelty. In this paper, we propose to model the novelty of a document with a neural tensor network. Instead of manually defining similarity functions or features, the new method automatically learns a nonlinear novelty function from the preliminary representations of the candidate document and the other documents. New diverse learning-to-rank models can be derived under the relational learning-to-rank framework. To determine the model parameters, loss functions are constructed and optimized with stochastic gradient descent. Extensive experiments on three public TREC datasets show that the newly derived algorithms significantly outperform the baselines, including the state-of-the-art relational learning-to-rank models.
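To make the idea of a learned nonlinear novelty function concrete, the following is a minimal sketch and not the paper's exact formulation, which the abstract does not spell out. It assumes a bilinear tensor layer in the style of neural tensor networks, a tanh nonlinearity, and max-pooling over the already selected documents. The names `bilinear`, `novelty_score`, `tensor`, and `mu` are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def bilinear(v: Vector, W: Matrix, d: Vector) -> float:
    """Return the bilinear form v^T W d for a single tensor slice W."""
    return sum(
        v[i] * W[i][j] * d[j]
        for i in range(len(v))
        for j in range(len(d))
    )


def novelty_score(
    candidate: Vector,
    selected: Sequence[Vector],
    tensor: Sequence[Matrix],
    mu: Vector,
) -> float:
    """Hypothetical novelty score of `candidate` given `selected` documents.

    Assumptions (not taken from the abstract):
      * each slice of `tensor` yields one hidden interaction feature,
      * tanh is the nonlinearity,
      * features are max-pooled over the selected documents,
      * `mu` linearly combines the pooled features into a scalar.
    In a trained model, `tensor` and `mu` would be learned, e.g. with SGD.
    """
    if not selected:
        # With nothing selected yet there is nothing to be redundant with.
        return 0.0
    pooled = [
        max(math.tanh(bilinear(candidate, W, d)) for d in selected)
        for W in tensor
    ]
    return sum(m * p for m, p in zip(mu, pooled))


if __name__ == "__main__":
    # Toy parameters: one identity slice and a negative weight, so that a
    # strong interaction with a selected document lowers the score.
    toy_tensor = [[[1.0, 0.0], [0.0, 1.0]]]
    toy_mu = [-1.0]
    print(novelty_score([1.0, 0.0], [[1.0, 0.0]], toy_tensor, toy_mu))
    print(novelty_score([1.0, 0.0], [[0.0, 1.0]], toy_tensor, toy_mu))
```

With these toy parameters, a candidate identical to a selected document receives a lower score than one orthogonal to it. In a learned model the sign and shape of this relationship would come from training rather than from hand-set weights.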