Abstract
Learning-to-rank has been studied intensively and has demonstrated significant value in several fields, such as web search and recommender systems. On learning-to-rank datasets given as vectors of feature values, LambdaMART, proposed more than a decade ago, and its descendants based on gradient-boosted decision trees (GBDT) have demonstrated leading performance. Recently, novel tree models have been developed, such as neural tree ensembles, which use neural networks to emulate decision tree models, and probabilistic gradient boosting machines (PGBM). However, the effectiveness of these tree models for learning-to-rank has not been comprehensively explored. This study bridges that gap by systematically comparing several representative neural tree ensembles (e.g., TabNet, NODE, and GANDALF), PGBM, and traditional learning-to-rank models on two benchmark datasets. The experimental results reveal that, benefiting from end-to-end gradient-based optimization and the power of feature representation and adaptive feature selection, neural tree ensembles do have an advantage for learning-to-rank over conventional tree-based ranking models such as LambdaMART. This finding is important, as LambdaMART has held leading performance over a long period.
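To make the comparison concrete, the conventional baseline discussed above, LambdaMART, combines GBDT with LambdaRank-style gradients: for each document pair within a query, the pairwise RankNet gradient is scaled by the change in NDCG obtained by swapping the two documents. The following is a minimal, self-contained Python sketch of that gradient computation for a single toy query; it is an illustration of the general idea, not the authors' implementation, and the function names and the `sigma` parameter are our own choices.

```python
import math

def dcg(gains):
    """Discounted cumulative gain for gains listed in rank order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def lambda_gradients(scores, labels, sigma=1.0):
    """LambdaRank-style gradients for one query.

    For each document pair (i, j) with labels[i] > labels[j], the
    pairwise RankNet gradient is scaled by |delta NDCG|, the change in
    NDCG from swapping the two documents in the current ranking.
    Positive lambdas push a document's score up, negative ones down.
    """
    n = len(scores)
    # Current ranking: document indices sorted by descending score.
    order = sorted(range(n), key=lambda k: -scores[k])
    rank = {doc: pos for pos, doc in enumerate(order)}
    gains = [2 ** labels[d] - 1 for d in range(n)]
    ideal = dcg(sorted(gains, reverse=True))  # normalizer for NDCG
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where i should rank above j
            ri, rj = rank[i], rank[j]
            # |delta NDCG| from swapping documents i and j.
            delta = abs(gains[i] - gains[j]) * abs(
                1 / math.log2(ri + 2) - 1 / math.log2(rj + 2)
            ) / ideal
            # Sigmoid of the score difference (RankNet), scaled by delta.
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += sigma * delta * rho
            lambdas[j] -= sigma * delta * rho
    return lambdas
```

In GBDT-based LambdaMART these per-document lambdas serve as the pseudo-residuals each new tree is fit to, whereas the neural tree ensembles compared in this study can instead backpropagate a differentiable ranking loss end-to-end.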
Notes
1. https://www.microsoft.com/en-us/research/project/mslr/.
2. https://webscope.sandbox.yahoo.com/catalog.php?datatype=c.
3. http://quickrank.isti.cnr.it/istella-dataset/.
4. https://pytorch.org/.
5. The source code for reproducing the results: https://github.com/wildltr/ptranking/.
References
Ai, Q., Bi, K., Guo, J., Croft, W.B.: Learning a deep listwise context model for ranking refinement. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 135–144. SIGIR ’18, Association for Computing Machinery, New York, NY, USA (2018)
Arik, S.O., Pfister, T.: TabNet: attentive interpretable tabular learning. Proc. AAAI Conf. Artif. Intell. 35(8), 6679–6687 (2021)
Bruch, S., Han, S., Bendersky, M., Najork, M.: A stochastic treatment of learning to rank scoring functions. In: Proceedings of the 13th WSDM, pp. 61–69 (2020)
Bruch, S., Lucchese, C., Nardini, F.M.: Efficient and effective tree-based and neural learning to rank. Found. Trends® Inf. Retrieval 17(1), 1–123 (2023)
Bruch, S., Zoghi, M., Bendersky, M., Najork, M.: Revisiting approximate metric optimization in the age of deep neural networks. In: Proceedings of the 42nd SIGIR, pp. 1241–1244 (2019)
Bruch, S.: An alternative cross entropy loss for learning-to-rank. In: Proceedings of the Web Conference 2021, pp. 118–126. WWW ’21, Association for Computing Machinery, New York, NY, USA (2021)
Burges, C.J.C., Ragno, R., Le, Q.V.: Learning to rank with nonsmooth cost functions. In: Proceedings of NeurIPS, pp. 193–200 (2006)
Cao, Z., Qin, T., Liu, T.Y., Tsai, M.F., Li, H.: Learning to Rank: from pairwise approach to listwise approach. In: Proceedings of the 24th ICML, pp. 129–136 (2007)
Chapelle, O., Chang, Y.: Yahoo! learning to rank challenge overview. In: Chapelle, O., Chang, Y., Liu, T.Y. (eds.) Proceedings of the Learning to Rank Challenge. Proceedings of Machine Learning Research, vol. 14, pp. 1–24. PMLR, Haifa, Israel (2011)
Chapelle, O., Le, Q., Smola, A.: Large margin optimization of ranking measures. In: NIPS workshop on Machine Learning for Web Search (2007)
Chu, W., Ghahramani, Z.: Gaussian processes for ordinal regression. J. Mach. Learn. Res. 6, 1019–1041 (2005)
Chu, W., Keerthi, S.S.: New approaches to support vector ordinal regression. In: Proceedings of the 22nd ICML, pp. 145–152 (2005)
Cossock, D., Zhang, T.: Subset ranking using regression. In: Proceedings of the 19th Annual Conference on Learning Theory, pp. 605–619 (2006)
Dato, D., et al.: Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM Trans. Inf. Syst. 35(2) (2016)
Dato, D., MacAvaney, S., Nardini, F.M., Perego, R., Tonellotto, N.: The istella22 dataset: bridging traditional and neural learning to rank evaluation. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3099–3107 (2022)
Freund, Y., Iyer, R., Schapire, R.E., Singer, Y.: An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933–969 (2003)
Ganjisaffar, Y., Caruana, R., Lopes, C.V.: Bagging gradient-boosted trees for high precision, low variance ranking models. In: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 85–94. SIGIR ’11, Association for Computing Machinery, New York, NY, USA (2011)
Guiver, J., Snelson, E.: Learning to rank with SoftRank and Gaussian processes. In: Proceedings of the 31st SIGIR, pp. 259–266 (2008)
Guo, J., et al.: A deep look into neural ranking models for information retrieval. Inf. Process. Manag. (2019)
Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)
Joachims, T.: Training linear SVMs in linear time. In: Proceedings of the 12th KDD, pp. 217–226 (2006)
Joseph, M., Raj, H.: GANDALF: gated adaptive network for deep automated learning of features. arXiv:2207.08548 [cs.LG] (2022)
Ke, G., et al.: LightGBM: a highly efficient gradient boosting decision tree. In: Proceedings of NeurIPS, pp. 3149–3157 (2017)
Ke, G., Xu, Z., Zhang, J., Bian, J., Liu, T.Y.: DeepGBM: a deep learning framework distilled by GBDT for online prediction tasks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 384–394. KDD ’19, Association for Computing Machinery, New York, NY, USA (2019)
Lan, Y., Zhu, Y., Guo, J., Niu, S., Cheng, X.: Position-aware ListMLE: a sequential learning process for ranking. In: Proceedings of the 30th Conference on UAI, pp. 449–458 (2014)
Lin, J., Nogueira, R., Yates, A.: Pretrained transformers for text ranking: BERT and beyond. arXiv:2010.06467 (2020)
Lucchese, C., Nardini, F.M., Perego, R., Orlando, S., Trani, S.: Selective gradient boosting for effective learning to rank. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 155–164. SIGIR ’18, New York, NY, USA (2018)
Lyzhin, I., Ustimenko, A., Gulin, A., Prokhorenkova, L.: Which tricks are important for learning to rank? In: Proceedings of the 40th International Conference on Machine Learning, ICML’23, JMLR.org (2023)
Nardini, F., Rulli, C., Trani, S., Venturini, R.: Distilled neural networks for efficient learning to rank. IEEE Trans. Knowl. Data Eng. 35(05), 4695–4712 (2023)
Onal, K.D., Zhang, Y., Altingovde, I.S., et al.: Neural information retrieval: at the end of the early years. J. Inf. Retrieval 21(2–3), 111–182 (2018)
Peters, B., Niculae, V., Martins, A.F.T.: Sparse sequence-to-sequence models. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1504–1519. Association for Computational Linguistics, Florence, Italy (2019)
Pobrotyn, P., Bartczak, T., Synowiec, M., Bialobrzeski, R., Bojar, J.: Context-aware learning to rank with self-attention. CoRR abs/2005.10084 (2020)
Popov, S., Morozov, S., Babenko, A.: Neural oblivious decision ensembles for deep learning on tabular data. CoRR abs/1909.06312 (2019)
Qin, T., Liu, T.Y., Li, H.: A general approximation framework for direct optimization of information retrieval measures. J. Inf. Retrieval 13(4), 375–397 (2010)
Qin, T., Liu, T.Y., Xu, J., Li, H.: LETOR: a benchmark collection for research on learning to rank for information retrieval. Inf. Retrieval J. 13(4), 346–374 (2010)
Shen, L., Joshi, A.K.: Ranking and reranking with perceptron. Mach. Learn. 60(1–3), 73–96 (2005)
Sprangers, O., Schelter, S., de Rijke, M.: Probabilistic gradient boosting machines for large-scale probabilistic regression. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 1510–1520 (2021)
Taylor, M., Guiver, J., Robertson, S., Minka, T.: SoftRank: optimizing non-smooth rank metrics. In: Proceedings of the 1st WSDM, pp. 77–86 (2008)
Volkovs, M.N., Zemel, R.S.: BoltzRank: learning to maximize expected ranking gain. In: Proceedings of ICML, pp. 1089–1096 (2009)
Wang, J., et al.: IRGAN: a minimax game for unifying generative and discriminative information retrieval models. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 515–524. SIGIR ’17, Association for Computing Machinery, New York, NY, USA (2017)
Wang, X., Li, C., Golbandi, N., Bendersky, M., Najork, M.: The lambdaloss framework for ranking metric optimization. In: Proceedings of the 27th CIKM, pp. 1313–1322 (2018)
Wu, Q., Burges, C.J., Svore, K.M., Gao, J.: Adapting boosting for information retrieval measures. J. Inf. Retrieval 13(3), 254–270 (2010)
Xu, J., Li, H.: AdaRank: a boosting algorithm for information retrieval. In: Proceedings of the 30th SIGIR, pp. 391–398 (2007)
Yu, H.: Optimize what you evaluate with: a simple yet effective framework for direct optimization of IR metrics. CoRR abs/2008.13373 (2020)
Yu, H.T., Huang, D., Ren, F., Li, L.: Diagnostic evaluation of policy-gradient-based ranking. Electronics 11(1) (2022)
Yu, H.T., Jatowt, A., Joho, H., Jose, J.M., Yang, X., Chen, L.: WassRank: listwise document ranking using optimal transport theory. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 24–32. WSDM ’19, Association for Computing Machinery, New York, NY, USA (2019)
Yu, H.T., Piryani, R., Jatowt, A., Inagaki, R., Joho, H., Kim, K.S.: An in-depth study on adversarial learning-to-rank. Inf. Retr. 26(1) (2023)
Yu, H.T.: Optimize what you evaluate with: search result diversification based on metric optimization. Proc. AAAI Conf. Artif. Intell. 36(9), 10399–10407 (2022)
Yue, Y., Finley, T., Radlinski, F., Joachims, T.: A support vector method for optimizing average precision. In: Proceedings of the 30th SIGIR, pp. 271–278 (2007)
Acknowledgments
This research has been supported by JSPS KAKENHI Grant Number 19H04215.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tan, H., Yang, K., Yu, H. (2024). An In-Depth Comparison of Neural and Probabilistic Tree Models for Learning-to-rank. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14610. Springer, Cham. https://doi.org/10.1007/978-3-031-56063-7_39
DOI: https://doi.org/10.1007/978-3-031-56063-7_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56062-0
Online ISBN: 978-3-031-56063-7
eBook Packages: Computer Science, Computer Science (R0)