Context-aware ranking refinement with attentive semi-supervised autoencoders

  • Application of soft computing
Soft Computing

Abstract

Learning-to-rank methods learn a ranking model from labeled data to improve retrieval performance. However, a learned model may not improve performance on every individual query, because the distribution of relevant documents in the document feature space varies from query to query. The performance of a learned ranking model also depends heavily on how useful the document features are. To generate high-quality document ranking features, we use pseudo-relevance feedback to capture the local context of each query from the top-ranked documents of an initial retrieval. Based on these feedback documents, we propose an attentive semi-supervised autoencoder that refines the ranked results using an optimized ranking-oriented reconstruction loss. Furthermore, we devise hybrid listwise query constraints to capture the characteristics of relevant documents for different queries. We evaluate the proposed ranking model on LETOR collections, including OHSUMED, MQ2007 and MQ2008. Our model consistently improves ranking performance over the baseline methods.
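
The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the autoencoder itself is omitted, and the mean pooling, dot-product attention and ListNet-style loss below are assumed stand-ins chosen only to show how a per-query local context from top-ranked feedback documents and a listwise objective might fit together. All function names and the toy data are hypothetical.

```python
import math


def top_k_context(features, initial_scores, k):
    """Mean feature vector of the k documents ranked highest by the initial retrieval.

    Assumption: mean pooling stands in for whatever context encoder the paper uses.
    """
    order = sorted(range(len(features)), key=lambda i: initial_scores[i], reverse=True)
    top = order[:k]
    dim = len(features[0])
    return [sum(features[i][d] for i in top) / len(top) for d in range(dim)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(features, context):
    """Dot-product attention of each document against the query's local context."""
    scores = [sum(f * c for f, c in zip(doc, context)) for doc in features]
    return softmax(scores)


def listwise_loss(predicted_scores, relevance_labels):
    """ListNet-style top-one cross entropy between label and score distributions.

    Assumption: this is a generic listwise loss, not the paper's hybrid constraint.
    """
    p_true = softmax(relevance_labels)
    p_pred = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))


# Toy query with four documents and two features each (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
initial_scores = [0.9, 0.1, 0.8, 0.2]

context = top_k_context(features, initial_scores, k=2)
weights = attention_weights(features, context)
loss = listwise_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
```

In this toy setup the context is built only from the two documents the initial retrieval ranked highest, so the attention weights favor documents that resemble that feedback set, and the listwise loss is smaller when predicted scores follow the same order as the relevance labels.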

Data availability

Enquiries about data availability should be directed to the authors.

Notes

  1. http://research.microsoft.com/enus/um/people/letor/.

Funding

This work is partially supported by grants from the Natural Science Foundation of China (No. 62006034), the Natural Science Foundation of Liaoning Province (No. 2021-BS-067) and the Fundamental Research Funds for the Central Universities (No. DUT21RC(3)015).

Author information

Contributions

Bo Xu was involved in conceptualization, methodology, and original draft preparation. Hongfei Lin contributed to formal analysis and funding acquisition. Yuan Lin helped with reviewing and editing and with the experiments. Kan Xu contributed to reviewing and editing and to validation.

Corresponding author

Correspondence to Bo Xu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This manuscript does not contain any studies with human participants or animals performed by any of the authors. We have read and have abided by the statement of ethical standards for manuscripts.

Informed consent

The submitted manuscript has obtained informed consent from all authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, B., Lin, H., Lin, Y. et al. Context-aware ranking refinement with attentive semi-supervised autoencoders. Soft Comput 26, 13941–13952 (2022). https://doi.org/10.1007/s00500-022-07433-w

  • DOI: https://doi.org/10.1007/s00500-022-07433-w
