Abstract
Learning-to-rank methods learn a ranking model from labeled data to achieve a desired ranking performance. However, a learned model may not improve performance on every individual query, because the distribution of relevant documents in the document feature space varies from query to query. The performance of learned ranking models also depends heavily on the usefulness of the document features. To generate high-quality document ranking features, we use pseudo-relevance feedback to capture the local context of each query from the top-ranked documents of an initial retrieval. Based on these top-ranked feedback documents, we propose an attentive semi-supervised autoencoder that refines the ranked results using an optimized ranking-oriented reconstruction loss. Furthermore, we devise hybrid listwise query constraints to capture the characteristics of relevant documents for different queries. We evaluate the proposed ranking model on LETOR collections, including OHSUMED, MQ2007, and MQ2008. Our model outperforms the baseline methods, with consistent improvements in ranking performance.
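The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the abstract gives no equations, so every name here (`attentive_context`, `hybrid_loss`, `init_params`, `top_k`, `alpha`) is hypothetical, the attention is a simple dot product against the mean of the feedback documents, and the listwise term is the ListNet top-one cross entropy used as a stand-in for the paper's hybrid listwise query constraints.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def attentive_context(docs, top_k):
    """Attention-weighted summary of the top_k feedback documents.

    docs: (n_docs, n_features) array, assumed already sorted by the
    initial retrieval score, so the first rows are the feedback set.
    """
    feedback = docs[:top_k]
    # Assumption: attention scores are dot products with the feedback mean.
    reference = feedback.mean(axis=0)
    weights = softmax(feedback @ reference)
    return weights @ feedback  # shape: (n_features,)


def init_params(n_features, n_hidden, rng):
    """Small random weights for the encoder, decoder, and scoring layer."""
    w_enc = 0.1 * rng.normal(size=(2 * n_features, n_hidden))
    w_dec = 0.1 * rng.normal(size=(n_hidden, n_features))
    w_rank = 0.1 * rng.normal(size=n_hidden)
    return w_enc, w_dec, w_rank


def hybrid_loss(docs, labels, params, top_k=3, alpha=0.5):
    """Reconstruction loss plus a listwise loss for one query.

    Returns the combined loss and the per-document ranking scores.
    """
    w_enc, w_dec, w_rank = params
    context = attentive_context(docs, top_k)
    # Each document is encoded together with the query's local context.
    inputs = np.concatenate([docs, np.tile(context, (len(docs), 1))], axis=1)
    hidden = np.tanh(inputs @ w_enc)
    recon = hidden @ w_dec      # unsupervised part: rebuild the features
    scores = hidden @ w_rank    # supervised part: one score per document
    recon_loss = np.mean((recon - docs) ** 2)
    # ListNet top-one cross entropy between label and score distributions.
    target = softmax(labels.astype(float))
    predicted = softmax(scores)
    list_loss = -np.sum(target * np.log(predicted + 1e-12))
    return alpha * recon_loss + (1.0 - alpha) * list_loss, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    docs = rng.normal(size=(6, 4))           # 6 documents, 4 features
    labels = np.array([2, 1, 0, 0, 1, 0])    # graded relevance labels
    loss, scores = hybrid_loss(docs, labels, init_params(4, 5, rng))
    print("loss:", loss)
    print("refined order:", np.argsort(-scores))
```

In this sketch `alpha` trades off feature reconstruction against ranking quality. In a real system the weights would be trained by gradient descent over many queries, which is omitted here.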
Data availability
Enquiries about data availability should be directed to the authors.
Funding
This work is partially supported by grants from the Natural Science Foundation of China (No. 62006034), the Natural Science Foundation of Liaoning Province (No. 2021-BS-067), and the Fundamental Research Funds for the Central Universities (No. DUT21RC(3)015).
Author information
Contributions
Bo Xu was involved in conceptualization, methodology, and writing (original draft preparation). Hongfei Lin contributed to formal analysis and funding acquisition. Yuan Lin helped with writing (review and editing) and with the experiments. Kan Xu contributed to writing (review and editing) and to validation.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This manuscript does not contain any studies with human participants or animals performed by any of the authors. We have read and have abided by the statement of ethical standards for manuscripts.
Informed consent
Informed consent was obtained from all authors of the submitted manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xu, B., Lin, H., Lin, Y. et al. Context-aware ranking refinement with attentive semi-supervised autoencoders. Soft Comput 26, 13941–13952 (2022). https://doi.org/10.1007/s00500-022-07433-w
DOI: https://doi.org/10.1007/s00500-022-07433-w