Modeling Document Novelty with Neural Tensor Network for Search Result Diversification

Authors: Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng

Chinese Academy of Sciences, Beijing, China

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, July 2016, Pages 395–404
https://doi.org/10.1145/2911451.2911498

Published: 07 July 2016

ABSTRACT

Search result diversification has attracted considerable attention as a means of addressing users' ambiguous or multi-faceted information needs. A key problem in search result diversification is novelty: how to measure the novelty of a candidate document with respect to other documents. Heuristic approaches define novelty directly through predefined document similarity functions. Learning approaches characterize novelty with a set of handcrafted features. Both the similarity functions and the features are difficult to design manually in the real world because modeling document novelty is complex. In this paper, we propose to model the novelty of a document with a neural tensor network. Instead of manually defining similarity functions or features, the new method automatically learns a nonlinear novelty function from the preliminary representations of the candidate document and other documents. New diverse learning to rank models can be derived under the relational learning to rank framework. To determine the model parameters, loss functions are constructed and optimized with stochastic gradient descent. Extensive experiments on three public TREC datasets show that the newly derived algorithms significantly outperform the baselines, including the state-of-the-art relational learning to rank models.
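
The idea of a learned nonlinear novelty function can be illustrated with a small sketch. The abstract does not give the exact architecture, so everything below is an assumption made for illustration: a bilinear tensor layer with one weight matrix per slice, a tanh nonlinearity, max-pooling over the other documents, and a final weight vector `mu`. The names `bilinear`, `novelty_score`, `tensor`, and `mu` are hypothetical, and the weights are fixed toy values rather than parameters learned with stochastic gradient descent.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear(v: Vector, w: Matrix, s: Vector) -> float:
    """Compute the bilinear form v^T W s."""
    return sum(v[i] * w[i][j] * s[j]
               for i in range(len(v)) for j in range(len(s)))


def novelty_score(candidate: Vector, selected: List[Vector],
                  tensor: List[Matrix], mu: Vector) -> float:
    """Hypothetical novelty function; layer and pooling choices are assumptions.

    Each tensor slice W_k relates the candidate to one selected document,
    tanh adds the nonlinearity, max-pooling aggregates over the selected
    documents, and mu combines the pooled slice responses into one score.
    """
    if not selected:
        return 0.0  # assumption: with nothing selected, the novelty term is zero
    pooled = [
        max(math.tanh(bilinear(candidate, w_k, s)) for s in selected)
        for w_k in tensor
    ]
    return sum(m * p for m, p in zip(mu, pooled))


# Toy 2-dimensional example with two tensor slices (made-up numbers).
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
tensor = [identity, swap]
mu = [-1.0, -0.5]  # negative weights: a strong slice response lowers the score

selected_docs = [[1.0, 0.0]]
duplicate = novelty_score([1.0, 0.0], selected_docs, tensor, mu)
different = novelty_score([0.0, 1.0], selected_docs, tensor, mu)
```

With these toy weights, the candidate that duplicates the already selected document receives a lower score than the candidate that differs from it. In a trained model the slices and `mu` would be fitted from data instead of being set by hand.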

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Information retrieval diversity
      2. Learning to rank

Published in
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016, 1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451
General Chairs: Raffaele Perego (ISTI-CNR, Italy), Fabrizio Sebastiani (Qatar Computing Research Institute, HBKU, Qatar)
Program Chairs: Javed Aslam (Northeastern University, US), Ian Ruthven (University of Strathclyde, UK), Justin Zobel (University of Melbourne, Australia)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
neural tensor network
relational learning to rank
search result diversification
Qualifiers
- research-article
Acceptance Rates
SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%