
An In-Depth Comparison of Neural and Probabilistic Tree Models for Learning-to-rank

  • Conference paper
  • First Online:
Advances in Information Retrieval (ECIR 2024)

Abstract

Learning-to-rank has been intensively studied and has demonstrated significant value in fields such as web search and recommender systems. On learning-to-rank datasets in which items are represented as vectors of feature values, LambdaMART, proposed more than a decade ago, and its descendants based on gradient-boosted decision trees (GBDT) have long delivered leading performance. Recently, novel tree models have been developed, such as neural tree ensembles, which use neural networks to emulate decision tree models, and probabilistic gradient boosting machines (PGBM). However, the effectiveness of these tree models for learning-to-rank has not been comprehensively explored. This study bridges that gap by systematically comparing several representative neural tree ensembles (e.g., TabNet, NODE, and GANDALF), PGBM, and traditional learning-to-rank models on two benchmark datasets. The experimental results reveal that, benefiting from end-to-end gradient-based optimization and the power of feature representation and adaptive feature selection, neural tree ensembles do hold an advantage for learning-to-rank over conventional tree-based ranking models such as LambdaMART. This finding is significant, as LambdaMART has maintained leading performance over a long period.
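Ranking quality in comparisons of this kind is commonly measured with normalized discounted cumulative gain (NDCG) [20]. As a minimal, self-contained sketch (not the paper's evaluation code, which is in the ptranking repository), NDCG@k with the exponential gain of Järvelin and Kekäläinen can be computed as follows:

```python
import math

def dcg_at_k(labels, k):
    # Discounted cumulative gain over the top-k positions:
    # gain 2^rel - 1, discount log2(rank + 1), ranks starting at 1.
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(labels[:k]))

def ndcg_at_k(pred_scores, true_labels, k=10):
    # Rank documents by predicted score (descending), then normalise the
    # DCG of that ordering by the DCG of the ideal (label-sorted) ordering.
    order = sorted(range(len(pred_scores)), key=lambda i: -pred_scores[i])
    ranked_labels = [true_labels[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(true_labels, reverse=True), k)
    return dcg_at_k(ranked_labels, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# A score list that ranks documents in exact label order scores 1.0;
# a reversed ranking scores strictly less.
labels = [3, 2, 1, 0]
print(ndcg_at_k([4.0, 3.0, 2.0, 1.0], labels))  # → 1.0
```

Both LambdaMART-style GBDT rankers and the neural tree ensembles compared in the paper are typically tuned and reported against this metric.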


Notes

  1. https://www.microsoft.com/en-us/research/project/mslr/.

  2. https://webscope.sandbox.yahoo.com/catalog.php?datatype=c.

  3. http://quickrank.isti.cnr.it/istella-dataset/.

  4. https://pytorch.org/.

  5. The source code for reproducing the results: https://github.com/wildltr/ptranking/.
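The MSLR and Istella benchmarks referenced above distribute query-document pairs in the LETOR/SVMlight text format, one line per document: a relevance label, a `qid:` token, and `index:value` feature pairs. As an illustrative sketch (the helper name `parse_letor_line` is hypothetical, not from the paper's codebase), one such line can be parsed like this:

```python
def parse_letor_line(line):
    # Parse one line of the LETOR/MSLR text format, e.g.
    # "2 qid:10 1:0.5 2:1.0 # optional trailing comment"
    parts = line.split('#')[0].split()
    label = int(parts[0])                      # graded relevance label
    qid = parts[1].split(':')[1]               # query identifier
    feats = {int(k): float(v)                  # feature index -> value
             for k, v in (p.split(':') for p in parts[2:])}
    return label, qid, feats

label, qid, feats = parse_letor_line("2 qid:10 1:0.5 2:1.0")
print(label, qid, feats)  # → 2 10 {1: 0.5, 2: 1.0}
```

Grouping lines by `qid` then yields the per-query document lists that listwise rankers consume.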

References

  1. Ai, Q., Bi, K., Guo, J., Croft, W.B.: Learning a deep listwise context model for ranking refinement. In: Proceedings of the 41st SIGIR, pp. 135–144 (2018)

  2. Arik, S.O., Pfister, T.: TabNet: attentive interpretable tabular learning. Proc. AAAI Conf. Artif. Intell. 35(8), 6679–6687 (2021)

  3. Bruch, S., Han, S., Bendersky, M., Najork, M.: A stochastic treatment of learning to rank scoring functions. In: Proceedings of the 13th WSDM, pp. 61–69 (2020)

  4. Bruch, S., Lucchese, C., Nardini, F.M.: Efficient and effective tree-based and neural learning to rank. Found. Trends Inf. Retrieval 17(1), 1–123 (2023)

  5. Bruch, S., Zoghi, M., Bendersky, M., Najork, M.: Revisiting approximate metric optimization in the age of deep neural networks. In: Proceedings of the 42nd SIGIR, pp. 1241–1244 (2019)

  6. Bruch, S.: An alternative cross entropy loss for learning-to-rank. In: Proceedings of the Web Conference 2021 (WWW), pp. 118–126 (2021)

  7. Burges, C.J.C., Ragno, R., Le, Q.V.: Learning to rank with nonsmooth cost functions. In: Proceedings of NeurIPS, pp. 193–200 (2006)

  8. Cao, Z., Qin, T., Liu, T.Y., Tsai, M.F., Li, H.: Learning to rank: from pairwise approach to listwise approach. In: Proceedings of the 24th ICML, pp. 129–136 (2007)

  9. Chapelle, O., Chang, Y.: Yahoo! learning to rank challenge overview. In: Proceedings of the Learning to Rank Challenge, PMLR, vol. 14, pp. 1–24 (2011)

  10. Chapelle, O., Le, Q., Smola, A.: Large margin optimization of ranking measures. In: NIPS Workshop on Machine Learning for Web Search (2007)

  11. Chu, W., Ghahramani, Z.: Gaussian processes for ordinal regression. J. Mach. Learn. Res. 6, 1019–1041 (2005)

  12. Chu, W., Keerthi, S.S.: New approaches to support vector ordinal regression. In: Proceedings of the 22nd ICML, pp. 145–152 (2005)

  13. Cossock, D., Zhang, T.: Subset ranking using regression. In: Proceedings of the 19th COLT, pp. 605–619 (2006)

  14. Dato, D., et al.: Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM Trans. Inf. Syst. 35(2) (2016)

  15. Dato, D., MacAvaney, S., Nardini, F.M., Perego, R., Tonellotto, N.: The Istella22 dataset: bridging traditional and neural learning to rank evaluation. In: Proceedings of the 45th SIGIR, pp. 3099–3107 (2022)

  16. Freund, Y., Iyer, R., Schapire, R.E., Singer, Y.: An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933–969 (2003)

  17. Ganjisaffar, Y., Caruana, R., Lopes, C.V.: Bagging gradient-boosted trees for high precision, low variance ranking models. In: Proceedings of the 34th SIGIR, pp. 85–94 (2011)

  18. Guiver, J., Snelson, E.: Learning to rank with SoftRank and Gaussian processes. In: Proceedings of the 31st SIGIR, pp. 259–266 (2008)

  19. Guo, J., et al.: A deep look into neural ranking models for information retrieval. Inf. Process. Manag. (2019)

  20. Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)

  21. Joachims, T.: Training linear SVMs in linear time. In: Proceedings of the 12th KDD, pp. 217–226 (2006)

  22. Joseph, M., Raj, H.: GANDALF: gated adaptive network for deep automated learning of features. arXiv:2207.08548 (2022)

  23. Ke, G., et al.: LightGBM: a highly efficient gradient boosting decision tree. In: Proceedings of NeurIPS, pp. 3149–3157 (2017)

  24. Ke, G., Xu, Z., Zhang, J., Bian, J., Liu, T.Y.: DeepGBM: a deep learning framework distilled by GBDT for online prediction tasks. In: Proceedings of the 25th KDD, pp. 384–394 (2019)

  25. Lan, Y., Zhu, Y., Guo, J., Niu, S., Cheng, X.: Position-aware ListMLE: a sequential learning process for ranking. In: Proceedings of the 30th UAI, pp. 449–458 (2014)

  26. Lin, J., Nogueira, R., Yates, A.: Pretrained transformers for text ranking: BERT and beyond. arXiv:2010.06467 (2020)

  27. Lucchese, C., Nardini, F.M., Perego, R., Orlando, S., Trani, S.: Selective gradient boosting for effective learning to rank. In: Proceedings of the 41st SIGIR, pp. 155–164 (2018)

  28. Lyzhin, I., Ustimenko, A., Gulin, A., Prokhorenkova, L.: Which tricks are important for learning to rank? In: Proceedings of the 40th ICML (2023)

  29. Nardini, F., Rulli, C., Trani, S., Venturini, R.: Distilled neural networks for efficient learning to rank. IEEE Trans. Knowl. Data Eng. 35(5), 4695–4712 (2023)

  30. Onal, K.D., Zhang, Y., Altingovde, I.S., et al.: Neural information retrieval: at the end of the early years. Inf. Retrieval J. 21(2–3), 111–182 (2018)

  31. Peters, B., Niculae, V., Martins, A.F.T.: Sparse sequence-to-sequence models. In: Proceedings of the 57th ACL, pp. 1504–1519 (2019)

  32. Pobrotyn, P., Bartczak, T., Synowiec, M., Bialobrzeski, R., Bojar, J.: Context-aware learning to rank with self-attention. CoRR (2020)

  33. Popov, S., Morozov, S., Babenko, A.: Neural oblivious decision ensembles for deep learning on tabular data. arXiv:1909.06312 (2019)

  34. Qin, T., Liu, T.Y., Li, H.: A general approximation framework for direct optimization of information retrieval measures. Inf. Retrieval J. 13(4), 375–397 (2010)

  35. Qin, T., Liu, T.Y., Xu, J., Li, H.: LETOR: a benchmark collection for research on learning to rank for information retrieval. Inf. Retrieval J. 13(4), 346–374 (2010)

  36. Shen, L., Joshi, A.K.: Ranking and reranking with perceptron. Mach. Learn. 60(1–3), 73–96 (2005)

  37. Sprangers, O., Schelter, S., de Rijke, M.: Probabilistic gradient boosting machines for large-scale probabilistic regression. In: Proceedings of the 27th KDD, pp. 1510–1520 (2021)

  38. Taylor, M., Guiver, J., Robertson, S., Minka, T.: SoftRank: optimizing non-smooth rank metrics. In: Proceedings of the 1st WSDM, pp. 77–86 (2008)

  39. Volkovs, M.N., Zemel, R.S.: BoltzRank: learning to maximize expected ranking gain. In: Proceedings of ICML, pp. 1089–1096 (2009)

  40. Wang, J., et al.: IRGAN: a minimax game for unifying generative and discriminative information retrieval models. In: Proceedings of the 40th SIGIR, pp. 515–524 (2017)

  41. Wang, X., Li, C., Golbandi, N., Bendersky, M., Najork, M.: The LambdaLoss framework for ranking metric optimization. In: Proceedings of the 27th CIKM, pp. 1313–1322 (2018)

  42. Wu, Q., Burges, C.J., Svore, K.M., Gao, J.: Adapting boosting for information retrieval measures. Inf. Retrieval J. 13(3), 254–270 (2010)

  43. Xu, J., Li, H.: AdaRank: a boosting algorithm for information retrieval. In: Proceedings of the 30th SIGIR, pp. 391–398 (2007)

  44. Yu, H.: Optimize what you evaluate with: a simple yet effective framework for direct optimization of IR metrics. arXiv:2008.13373 (2020)

  45. Yu, H.T., Huang, D., Ren, F., Li, L.: Diagnostic evaluation of policy-gradient-based ranking. Electronics 11(1) (2022)

  46. Yu, H.T., Jatowt, A., Joho, H., Jose, J.M., Yang, X., Chen, L.: WassRank: listwise document ranking using optimal transport theory. In: Proceedings of the 12th WSDM, pp. 24–32 (2019)

  47. Yu, H.T., Piryani, R., Jatowt, A., Inagaki, R., Joho, H., Kim, K.S.: An in-depth study on adversarial learning-to-rank. Inf. Retrieval J. 26(1) (2023)

  48. Yu, H.T.: Optimize what you evaluate with: search result diversification based on metric optimization. Proc. AAAI Conf. Artif. Intell. 36(9), 10399–10407 (2022)

  49. Yue, Y., Finley, T., Radlinski, F., Joachims, T.: A support vector method for optimizing average precision. In: Proceedings of the 30th SIGIR, pp. 271–278 (2007)


Acknowledgments

This research has been supported by JSPS KAKENHI Grant Number 19H04215.

Author information

Correspondence to Haitao Yu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tan, H., Yang, K., Yu, H. (2024). An In-Depth Comparison of Neural and Probabilistic Tree Models for Learning-to-rank. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14610. Springer, Cham. https://doi.org/10.1007/978-3-031-56063-7_39

  • DOI: https://doi.org/10.1007/978-3-031-56063-7_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56062-0

  • Online ISBN: 978-3-031-56063-7

  • eBook Packages: Computer Science, Computer Science (R0)
