
Modeling Document Novelty with Neural Tensor Network for Search Result Diversification

Published: 7 July 2016

ABSTRACT

Search result diversification has attracted considerable attention as a means to tackle the ambiguous or multi-faceted information needs of users. One of the key problems in search result diversification is novelty, that is, how to measure the novelty of a candidate document with respect to other documents. In heuristic approaches, predefined document similarity functions are directly utilized to define novelty. In learning approaches, novelty is characterized by a set of handcrafted features. Both the similarity functions and the features are difficult to design manually in real-world settings because of the complexity of modeling document novelty. In this paper, we propose to model the novelty of a document with a neural tensor network. Instead of manually defining similarity functions or features, the new method automatically learns a nonlinear novelty function based on the preliminary representations of the candidate document and the other documents. New diverse learning-to-rank models can then be derived under the relational learning-to-rank framework. To determine the model parameters, loss functions are constructed and optimized with stochastic gradient descent. Extensive experiments on three public TREC datasets show that the newly derived algorithms significantly outperform the baselines, including state-of-the-art relational learning-to-rank models.
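To make the idea concrete, the sketch below shows a minimal neural-tensor-network style novelty scorer in the general form popularized by Socher et al. for knowledge base completion: a bilinear tensor term plus a feed-forward term over a candidate document embedding and a summary of the already-selected documents, trained with a pairwise hinge loss under stochastic gradient descent. This is not the authors' code; the class names, dimensions, mean-pooling of selected documents, and the specific loss are illustrative assumptions, not the model described in the paper.

```python
# Minimal sketch (assumptions, not the paper's implementation) of an
# NTN-style novelty scorer:
#   score(x_d, x_s) = u^T tanh( x_d^T W[1:k] x_s + V [x_d; x_s] + b )
# where x_d is the candidate document embedding and x_s summarizes the
# documents already selected for the ranking.

import torch
import torch.nn as nn


class NTNNoveltyScorer(nn.Module):          # hypothetical class name
    def __init__(self, dim: int, num_slices: int = 4):
        super().__init__()
        # Bilinear tensor: one dim x dim matrix per slice.
        self.W = nn.Parameter(torch.randn(num_slices, dim, dim) * 0.01)
        # Standard feed-forward part over the concatenated pair (includes bias b).
        self.V = nn.Linear(2 * dim, num_slices)
        # Combination vector u producing a scalar novelty score.
        self.u = nn.Linear(num_slices, 1, bias=False)

    def forward(self, cand: torch.Tensor, selected: torch.Tensor) -> torch.Tensor:
        # cand: (batch, dim); selected: (batch, dim), e.g. the mean embedding
        # of the documents chosen so far (an assumption for this sketch).
        bilinear = torch.einsum("bd,kde,be->bk", cand, self.W, selected)
        linear = self.V(torch.cat([cand, selected], dim=-1))
        return self.u(torch.tanh(bilinear + linear)).squeeze(-1)


def train_step(model, opt, cand_pos, cand_neg, selected, margin=1.0):
    # Hypothetical pairwise hinge loss: prefer the more novel document
    # (cand_pos) over the less novel one (cand_neg) given the same selected set,
    # optimized with SGD as the abstract describes.
    s_pos = model(cand_pos, selected)
    s_neg = model(cand_neg, selected)
    loss = torch.clamp(margin - (s_pos - s_neg), min=0).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In the paper's relational learning-to-rank setting, such a learned novelty score would be combined with a relevance score when documents are selected sequentially; the sketch above covers only the novelty component.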


Published in

SIGIR '16: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2016, 1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451
Copyright © 2016 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGIR '16 paper acceptance rate: 62 of 341 submissions (18%). Overall acceptance rate: 792 of 3,983 submissions (20%).
