Search Result Reranking with Visual and Structure Information Sources

Authors:

Leyu Lin

ACM Transactions on Information Systems (TOIS), Volume 37, Issue 3

Article No.: 38, Pages 1 - 38

https://doi.org/10.1145/3329188

Published: 26 June 2019

Abstract

Relevance estimation is among the most important tasks in the ranking of search results. Current methodologies mainly concentrate on text matching, link analysis, and user behavior models. However, users judge the relevance of search results directly from Search Engine Result Pages (SERPs), which provide valuable signals for reranking. In this article, we propose two approaches that aggregate the visual, structure, and textual information sources of search results for relevance estimation. The first is a late-fusion framework named the Joint Relevance Estimation model (JRE). JRE estimates relevance independently from the screenshots, textual contents, and HTML source code of search results and makes the final decision jointly through an inter-modality attention mechanism. The second is an early-fusion framework named the Tree-based Deep Neural Network (TreeNN), which embeds the texts and images into the HTML parse tree through a recursive process. To evaluate the proposed models, we construct a large-scale practical Search Result Relevance (SRR) dataset that consists of multiple information sources and relevance labels for over 60,000 search results. Experimental results show that the two proposed models outperform state-of-the-art ranking solutions as well as the original rankings of commercial search engines.
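The two fusion styles named in the abstract can be illustrated with a toy sketch. This is not the authors' JRE or TreeNN implementation; it only shows the general ideas under simplifying assumptions. In the late-fusion part, per-modality relevance scores are assumed to be already computed and are combined with softmax attention weights. In the early-fusion part, a parse tree is embedded bottom-up by a recursive function. All names (`late_fusion`, `Node`, `embed_tree`), the fixed attention logits, and the mean-then-tanh composition are hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(scores: Sequence[float], attn_logits: Sequence[float]) -> float:
    """Late fusion (illustrative): each modality is scored independently,
    then the scores are mixed with attention weights. In a trained model
    the logits would be learned; here they are plain inputs."""
    weights = softmax(attn_logits)
    return sum(w * s for w, s in zip(weights, scores))


@dataclass
class Node:
    """Hypothetical parse-tree node holding a feature vector for its own
    content (for example text or image features) and its child nodes."""
    features: List[float]
    children: List["Node"] = field(default_factory=list)


def embed_tree(node: Node) -> List[float]:
    """Early fusion (illustrative): embed the tree bottom-up. A leaf maps
    its features through tanh; an inner node adds its own features to the
    mean of its children's embeddings, then applies tanh. This composition
    rule is an assumption, not the one used in the article."""
    if not node.children:
        return [math.tanh(x) for x in node.features]
    child_vecs = [embed_tree(child) for child in node.children]
    dim = len(node.features)
    mean_child = [sum(v[i] for v in child_vecs) / len(child_vecs) for i in range(dim)]
    return [math.tanh(a + b) for a, b in zip(node.features, mean_child)]


if __name__ == "__main__":
    # Scores from screenshot, text, and HTML models (made-up numbers).
    print(late_fusion([0.2, 0.8, 0.5], [0.0, 0.0, 0.0]))
    page = Node([0.1, 0.2], [Node([0.5, -0.5]), Node([0.3, 0.3])])
    print(embed_tree(page))
```

With equal attention logits the late-fusion score reduces to the plain average of the modality scores; raising one logit shifts the decision toward that modality.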

Index Terms

Search Result Reranking with Visual and Structure Information Sources
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

ACM Transactions on Information Systems Volume 37, Issue 3

July 2019

335 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/3320115

Editor:
Maarten de Rijke
University of Amsterdam, The Netherlands

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 June 2019

Accepted: 01 April 2019

Revised: 01 March 2019

Received: 01 August 2018

Published in TOIS Volume 37, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Key Basic Research Program
Natural Science Foundation of China

Cited By

Dr. Pankaj Kumar Alok Verma Anurag Srivastava Ayushi Pal (2023)Fake Review Detection System: A ReviewInternational Journal of Scientific Research in Science and Technology10.32628/IJSRST52310378(346-354)Online publication date: 3-May-2023
https://doi.org/10.32628/IJSRST52310378
Dang PZhu HGuo TWan CZhao TSalama PWang YCao SZhang CSingh ASun YAkoglu LGunopulos DYan XKumar ROzcan FYe J(2023)Generalized Matrix Local Low Rank Representation by Random Projection and Submatrix PropagationProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599361(390-401)Online publication date: 6-Aug-2023
https://dl.acm.org/doi/10.1145/3580305.3599361
Zhang JLiu YMao JMa WXu JMa STian Q(2023)User Behavior Simulation for Search Result Re-rankingACM Transactions on Information Systems10.1145/351146941:1(1-35)Online publication date: 20-Jan-2023
https://dl.acm.org/doi/10.1145/3511469
Ma THuang LLu QHu S(2023)KR-GCN: Knowledge-Aware Reasoning with Graph Convolution Network for Explainable RecommendationACM Transactions on Information Systems10.1145/351101941:1(1-27)Online publication date: 20-Jan-2023
https://dl.acm.org/doi/10.1145/3511019
Liu HJing LWen JXu PYu JNg M(2021)Bayesian Additive Matrix Approximation for Social RecommendationACM Transactions on Knowledge Discovery from Data10.1145/345139116:1(1-34)Online publication date: 20-Jul-2021
https://dl.acm.org/doi/10.1145/3451391
Aage NGiele RAndreasen C(2021)Length scale control for high-resolution three-dimensional level set–based topology optimizationStructural and Multidisciplinary Optimization10.1007/s00158-021-02904-464:3(1127-1139)Online publication date: 1-Sep-2021
https://dl.acm.org/doi/10.1007/s00158-021-02904-4
Zhang JMao JLiu YZhang RZhang MMa SXu JTian QZhu WTao DCheng XCui PRundensteiner ECarmel DHe QXu Yu J(2019)Context-Aware Ranking by Constructing a Virtual Environment for Reinforcement LearningProceedings of the 28th ACM International Conference on Information and Knowledge Management10.1145/3357384.3357945(1603-1612)Online publication date: 3-Nov-2019
https://dl.acm.org/doi/10.1145/3357384.3357945
