Graph-based comparative analysis of learning to rank datasets

  • Regular Paper
  • International Journal of Data Science and Analytics

Abstract

The relative success of learning to rank algorithms has drawn the research community's attention to developing efficient and effective ranking methods. Proposed ranking algorithms are usually evaluated on available benchmark datasets. However, these datasets have different characteristics, and using them to evaluate learning to rank algorithms may yield completely different experimental results. Consequently, a proper understanding of the specifications of benchmark datasets would benefit both the analysis of experimental results and the development of new benchmark datasets. To this end, this research proposes a graph-based framework for the comparative analysis of learning to rank datasets. For a given dataset, a feature–similarity graph is produced in which nodes represent the features of the dataset and edge weights are the Kendall’s Tau similarity values of the connected pairs of features. Thereafter, a variety of structural and node-based attributes are extracted from either the feature–similarity graph or its giant component. The method is applied to four learning to rank datasets: MSLR-Web10K, Istella, WCL2R, and dotIR, the last of which is the only available Persian learning to rank dataset. Based on the experiments, WCL2R differs completely from the other evaluated datasets in its structural and node-based properties. Among the three remaining datasets (MSLR-Web10K, Istella, and dotIR), the last two are more similar to each other.
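
The graph construction described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: it assumes Kendall's Tau-a computed per query and averaged over queries, an edge kept only when the mean value reaches a threshold `sigma`, and a plain traversal to find the giant component. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's Tau-a of two equally long score lists (tied pairs count as neither)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


def feature_similarity_graph(queries, sigma):
    """Build {(j, k): weight} over feature pairs.

    queries: list of per-query matrices (rows = documents, columns = features).
    Assumption: the weight is the mean per-query Tau, kept only if >= sigma.
    """
    n_features = len(queries[0][0])
    edges = {}
    for j, k in combinations(range(n_features), 2):
        taus = [
            kendall_tau_a([row[j] for row in docs], [row[k] for row in docs])
            for docs in queries
        ]
        weight = sum(taus) / len(taus)
        if weight >= sigma:
            edges[(j, k)] = weight
    return edges


def giant_component(n_features, edges):
    """Return the node set of the largest connected component."""
    adj = defaultdict(set)
    for j, k in edges:
        adj[j].add(k)
        adj[k].add(j)
    seen, best = set(), set()
    for start in range(n_features):
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in component:
                    component.add(v)
                    stack.append(v)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy data: two queries, three features per query-document pair.
queries = [
    [[1, 2, 9], [2, 3, 7], [3, 5, 8], [4, 4, 1]],
    [[5, 1, 2], [6, 2, 1], [7, 3, 3]],
]
edges = feature_similarity_graph(queries, sigma=0.5)
component = giant_component(3, edges)
```

With this toy input only features 0 and 1 rank the documents similarly enough to stay connected, so the giant component contains those two nodes. Whether the threshold should apply to the signed or the absolute Tau value is a design choice this sketch does not settle.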

Data availability

Data analyzed in this study were a re-analysis of existing data, which are openly available at locations cited in the reference section.

Notes

  1. http://www.istella.it.

  2. https://dbrg.ut.ac.ir/webir-dotir/.

References

  1. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press (2008)

  2. Liu, T.Y.: Learning to rank for information retrieval. Found. Trends Inf. Retr. 3, 225–231 (2009). https://doi.org/10.1561/1500000016

  3. Li, H.: Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition. Synth. Lect. Hum. Lang. Technol. 7, 1–123 (2015). https://doi.org/10.2200/S00607ED2V01Y201410HLT026

  4. Li, P., Burges, C.J.C., Wu, Q.: McRank: learning to rank using multiple classification and gradient boosting. Adv. Neural Inf. Process. Syst. 20, 897–904 (2007)

  5. Crammer, K., Singer, Y.: Pranking with ranking. Adv. Neural Inf. Process. Syst. 14, 641–647 (2001)

  6. Shashua, A., Levin, A.: Ranking with large margin principle: two approaches. Adv. Neural Inf. Process. Syst. 15, 961–968 (2002)

  7. Smola, A., Bartlett, P., Schölkopf, B., Schuurmans, D.: Large margin rank boundaries for ordinal regression. Adv. Large Margin Classif. 115–132 (2000)

  8. Burges, C.J., Ragno, R., Viet Le, Q.: Learning to Rank with Nonsmooth Cost Functions. In: Advances in Neural Information Processing Systems. pp. 193–200 (2006)

  9. Cao, Y., Xu, J., Liu, T.Y., Li, H., Huang, Y., Hon, H.W.: Adapting ranking SVM to document retrieval. Proc. Twenty-Ninth Annu. Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. 2006, 186–193 (2006). https://doi.org/10.1145/1148170.1148205

  10. Freund, Y., Iyer, R., Schapire, R.E., Singer, Y., Dietterich, T.G.: An efficient boosting algorithm for combining preferences. (2003)

  11. Xu, J., Li, H.: AdaRank: A boosting algorithm for information retrieval. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’07. pp. 391–398 (2007)

  12. Cao, Z., Qin, T., Liu, T.Y., Tsai, M.F., Li, H.: Learning to rank: From pairwise approach to listwise approach. In: ACM International Conference Proceeding Series. pp. 129–136 (2007)

  13. Xu, J., Liu, T.Y., Lu, M., Li, H., Ma, W.Y.: Directly optimizing evaluation measures in learning to rank. ACM SIGIR 2008-31st Annu. Int. ACM SIGIR Conf. Res. Dev. Inf. Retrieval Proc. (2008). https://doi.org/10.1145/1390334.1390355

  14. Sibony, E.: Multiresolution analysis of ranking data, (2016)

  15. Tax, N., Bockting, S., Hiemstra, D.: A cross-benchmark comparison of 87 learning to rank methods. Inf. Process. Manag. 51, 757–772 (2015). https://doi.org/10.1016/J.IPM.2015.07.002

  16. Moreira, C., Calado, P., Martins, B.: Learning to rank academic experts in the DBLP dataset. Expert Syst. J. Knowl. Eng. 32, 477–493 (2015). https://doi.org/10.1111/EXSY.12062

  17. Yu, W., Qin, Z.: Spectrum-enhanced pairwise learning to rank. In: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019. pp. 2247–2257. Association for Computing Machinery, Inc (2019)

  18. Zhang, Y., Wang, D., Zhang, Y.: Neural IR meets graph embedding: a ranking model for product search. Web Conf. 2019 Proc. World Wide Web Conf WWW 2019 (2019). https://doi.org/10.1145/3308558.3313468

  19. Ferraro, A., Porcaro, L., Serra, X.: Balancing Exposure and Relevance in Academic Search. In: The Twenty-Ninth Text Retrieval Conference (2020)

  20. Maqsood, S., Islam, M.A., Afzal, M.T., Masood, N.: A comprehensive author ranking evaluation of network and bibliographic indices. Malaysian J. Libr. Inf. Sci. 25, 31–45 (2020). https://doi.org/10.22452/MJLIS.VOL25NO1.2

  21. Yang, X., Wang, B.: Local ranking and global fusion for personalized recommendation. Appl Soft Comput. 96, 106636 (2020). https://doi.org/10.1016/J.ASOC.2020.106636

  22. Sanz-Cruzado, J., Castells, P., Macdonald, C., Ounis, I.: Effective contact recommendation in social networks by adaptation of information retrieval models. Inf. Process. Manag. 57, 102285 (2020). https://doi.org/10.1016/J.IPM.2020.102285

  23. Nabua, E.B., Falcasantos, J.O., Joy, M., Jerez, Y., Wang, J., Yan, F., Zhang, Y.M., Zou, X.: A survey on application of knowledge graph. J. Phys. Conf. Ser. 1487, 012016 (2020). https://doi.org/10.1088/1742-6596/1487/1/012016

  24. Ji, S., Pan, S., Cambria, E., Marttinen, P., Yu, P.S.: A survey on knowledge graphs: representation, acquisition, and applications. IEEE Trans. Neural Netw. Learn. Syst. 33, 494–514 (2022). https://doi.org/10.1109/TNNLS.2021.3070843

  25. Gao, H., Wu, L., Hu, P., Wei, Z., Xu, F., Long, B.: Graph-augmented Learning to Rank for Querying Large-scale Knowledge Graph. arXiv:2111.10541 (2021)

  26. Wu, H., Meng, F.J.: Research on the application of personalized course recommendation of learn to rank based on knowledge graph. Lect. Notes Inst. Comput. Sci. Soc. Telecommun. Eng. LNICST. 331, 19–30 (2020). https://doi.org/10.1007/978-3-030-62205-3_2

  27. Su, Y., Xing, Z., Peng, X., Xia, X., Wang, C., Xu, X., Zhu, L.: Reducing bug triaging confusion by learning from mistakes with a bug tossing knowledge graph. Proc. - 2021 36th IEEE/ACM Int Conf. Autom. Softw. Eng. ASE 2021, 191–202 (2021). https://doi.org/10.1109/ASE51524.2021.9678574

  28. Jafarzadeh, P., Amirmahani, Z., Ensan, F.: Learning to rank knowledge subgraph nodes for entity retrieval. SIGIR 2022 Proc. 45th Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. (2022). https://doi.org/10.1145/3477495.3531888

  29. Devezas, J., Nunes, S.: A review of graph-based models for entity-oriented search. SN Comput. Sci. 26(2), 1–36 (2021). https://doi.org/10.1007/S42979-021-00828-W

  30. Ni, Y., Xu, Q.K., Cao, F., Mass, Y., Sheinwald, D., Zhu, H.J., Cao, S.S.: Semantic documents relatedness using concept graph representation. WSDM 2016 Proc 9th ACM Int. Conf. Web Search Data Min. (2016). https://doi.org/10.1145/2835776.2835801

  31. Irrera, O., Silvello, G.: Background Linking: Joining Entity Linking with Learning to Rank Models. (2021)

  32. Hosseini, H., Bagheri, E.: Learning to rank implicit entities on Twitter. Inf. Process. Manag. 58, 102503 (2021). https://doi.org/10.1016/J.IPM.2021.102503

  33. Menezes, T., Roth, C.: Semantic Hypergraphs. (2019)

  34. Dietz, L.: ENT rank: retrieving entities for topical information needs through entity-neighbor-text relations. SIGIR 2019 Proc 42nd Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. (2019). https://doi.org/10.1145/3331184.3331257

  35. Yeh, J.Y., Tsai, C.J.: A graph-based feature selection method for learning to rank using spectral clustering for redundancy minimization and biased pagerank for relevance analysis. Comput. Sci. Inf. Syst. 19, 141–164 (2022). https://doi.org/10.2298/CSIS201220042Y

  36. Yeh, J.Y., Tsai, C.J.: Graph-based feature selection method for learning to rank. ACM Int Conf Proceeding Ser. (2020). https://doi.org/10.1145/3442555.3442567

  37. Geng, B., Yang, L., Hua, X.-S.: Learning to Rank with Graph Consistency. (2009)

  38. Fan, J., Luo, H., Gao, Y., Jain, R.: Incorporating concept ontology for hierarchical video classification, annotation, and visualization. IEEE Trans. Multimed. 9, 939–957 (2007). https://doi.org/10.1109/TMM.2007.900143

  39. Bałchanowski, M., Boryczka, U.: Aggregation of rankings using metaheuristics in recommendation systems. Electron 11, 369 (2022). https://doi.org/10.3390/ELECTRONICS11030369

  40. Zhang, Y., Xiao, Y., Wu, J., Lu, X.: Comprehensive world university ranking based on ranking aggregation. Comput. Stat. 36, 1139–1152 (2021). https://doi.org/10.1007/S00180-020-01033-8

  41. Valem, L.P., Pedronette, D.C.G.: Graph-based selective rank fusion for unsupervised image retrieval. Pattern Recognit. Lett. 135, 82–89 (2020). https://doi.org/10.1016/J.PATREC.2020.03.032

  42. Kendall, M.G.: A new measure of rank correlation. Biometrika 30, 81–93 (1938). https://doi.org/10.1093/BIOMET/30.1-2.81

  43. Vathy-Fogarassy, Á., Abonyi, J.: Graph-based clustering and data visualization algorithms. Springer-Verlag, London (2013)

  44. Dai, X., Xi, Y., Zhang, W., Liu, Q., Tang, R., He, X., Hou, J., Wang, J., Yu, Y.: Beyond relevance ranking: a general graph matching framework for utility-oriented learning to rank. ACM Trans. Inf. Syst. (2021). https://doi.org/10.1145/3464303

  45. Pahikkala, T., Tsivtsivadze, E., Airola, A., Järvinen, J., Boberg, J.: An efficient algorithm for learning to rank from preference graphs. Mach. Learn. 75, 129–165 (2009). https://doi.org/10.1007/S10994-008-5097-Z

  46. Agarwal, A., Chakrabarti, S., Aggarwal, S.: Learning to rank networked entities. Proc ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. 2006, 14–23 (2006). https://doi.org/10.1145/1150402.1150409

  47. Agarwal, S.: Learning to rank on graphs. Mach. Learn. 81, 333–357 (2010). https://doi.org/10.1007/S10994-010-5185-8

  48. Johnson, R., Zhang, T.: Graph-based semi-supervised learning and spectral kernel design. IEEE Trans. Inf. Theory. 54, 275–288 (2008). https://doi.org/10.1109/TIT.2007.911294

  49. Shi, J., Tian, X.Y.: Learning to Rank Sports Teams on a Graph. Appl. Sci. 10, 5833 (2020). https://doi.org/10.3390/APP10175833

  50. Qi, Y., Zhang, J., Liu, Y., Xu, W., Guo, J.: CGTR: Convolution Graph Topology Representation for Document Ranking. Int. Conf. Inf. Knowl. Manag. Proc. 2173–2176 (2020). https://doi.org/10.1145/3340531.3412073

  51. Fan, L., Li, Q., Liu, B., Wu, X.M., Zhang, X., Lv, F., Lin, G., Li, S., Jin, T., Yang, K.: Modeling User Behavior with Graph Convolution for Personalized Product Search. In: ACM Web Conference 2022. pp. 203–212. Association for Computing Machinery, Inc (2022)

  52. Sawhney, R., Agarwal, S., Wadhwa, A., Shah, R.: Exploring the scale-free nature of stock markets: Hyperbolic graph learning for algorithmic trading. Web Conf. 2021 - Proc. World Wide Web Conf. WWW 2021. 11–22 (2021). https://doi.org/10.1145/3442381.3450095

  53. Zhang, Y., Zhang, Q., Zhang, L.L., Yang, Y., Yan, C., Gao, X., Yang, Y.: Learning to Rank Ace Neural Architectures via Normalized Discounted Cumulative Gain. (2021). https://doi.org/10.48550/arxiv.2108.03001

  54. Formal, T., Clinchant, S., Renders, J.M., Lee, S., Cho, G.H.: Learning to Rank Images with Cross-Modal Graph Convolutions. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics). 12035 LNCS, 589–604 (2020). https://doi.org/10.1007/978-3-030-45439-5_39

  55. Narang, K., Krishnan, A., Wang, J., Yang, C., Sundaram, H., Sutter, C.: Ranking User-Generated Content via Multi-Relational Graph Convolution. SIGIR 2021 - Proc. 44th Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. 470–480 (2021). https://doi.org/10.1145/3404835.3462857

  56. Feng, F., He, X., Wang, X., Luo, C., Liu, Y., Chua, T.S.: Temporal Relational Ranking for Stock Prediction. ACM Trans. Inf. Syst. 37, (2019). https://doi.org/10.1145/3309547

  57. Bianchi, F., Palmonari, M., Cremaschi, M., Fersini, E.: Actively learning to rank semantic associations for personalized contextual exploration of knowledge graphs. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics). 10249 LNCS, 120–135 (2017). https://doi.org/10.1007/978-3-319-58068-5_8

  58. Muhammad, I., Bollegala, D., Coenen, F., Gamble, C., Kearney, A., Williamson, P.: Document Ranking for Curated Document Databases Using BERT and Knowledge Graph Embeddings: Introducing GRAB-Rank. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics). 12925 LNCS, 116–127 (2021). https://doi.org/10.1007/978-3-030-86534-4_10

  59. Ni, C.C., Sum Liu, K., Torzec, N.: Layered Graph Embedding for Entity Recommendation using Wikipedia in the Yahoo! Knowledge Graph. Web Conf. 2020 - Companion World Wide Web Conf. WWW 2020. 811–818 (2020). https://doi.org/10.1145/3366424.3383570

  60. Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A., Lehmann, J.: Learning to Rank Query Graphs for Complex Question Answering over Knowledge Graphs. In: The 18th International Semantic Web Conference (ISWC 2019). pp. 487–504. Springer (2019)

  61. Liu, S., Gu, W., Cong, G., Zhang, F.: Structural Relationship Representation Learning with Graph Embedding for Personalized Product Search. Int. Conf. Inf. Knowl. Manag. Proc. 915–924 (2020). https://doi.org/10.1145/3340531.3411936

  62. Pang, Y., Ji, Z., Jing, P., Li, X.: Ranking graph embedding for learning to rerank. IEEE Trans. Neural Networks Learn. Syst. 24, 1292–1303 (2013). https://doi.org/10.1109/TNNLS.2013.2253798

  63. Yang, S. Bin, Yang, B.: Learning to rank paths in spatial networks. Proc. - Int. Conf. Data Eng. 2020-April, 2006–2009 (2020). https://doi.org/10.1109/ICDE48307.2020.00225

  64. Xu, Q., Li, M., Yu, M.: Learning to rank with relational graph and pointwise constraint for cross-modal retrieval. Soft Comput. 23, 9413–9427 (2019). https://doi.org/10.1007/S00500-018-3608-9

  65. Al-Tashi, Q., Abdulkadir, S.J., Rais, H.M., Mirjalili, S., Alhussian, H.: Approaches to Multi-Objective Feature Selection: A Systematic Literature Review. IEEE Access. 8, 125076–125096 (2020). https://doi.org/10.1109/ACCESS.2020.3007291

  66. Shirzad, M.B., Keyvanpour, M.R.: A Systematic Study of Feature Selection Methods for Learning to Rank Algorithms. Int. J. Inf. Retr. Res. 8, 46–67 (2018). https://doi.org/10.4018/IJIRR.2018070104

  67. Li, W., Chai, Z., Tang, Z.: A decomposition-based multi-objective immune algorithm for feature selection in learning to rank. Knowledge-Based Syst. 234, 107577 (2021). https://doi.org/10.1016/J.KNOSYS.2021.107577

  68. Lai, H.J., Pan, Y., Tang, Y., Yu, R.: FSMRank: Feature selection algorithm for learning to rank. IEEE Trans. Neural Networks Learn. Syst. 24, 940–952 (2013). https://doi.org/10.1109/TNNLS.2013.2247628

  69. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. 236 (2004). https://doi.org/10.1007/978-1-4419-8853-9

  70. Lei, S., Han, X.: Feature Selection and Model Comparison on Microsoft Learning-to-Rank Data Sets. (2018)

  71. Cheng, F., Guo, W., Zhang, X.: MOFSRank: A Multiobjective Evolutionary Algorithm for Feature Selection in Learning to Rank. Complexity. 2018, (2018). https://doi.org/10.1155/2018/7837696

  72. Moura, D., Petrucci, V., Mosse, D.: Learning to Rank Graph-based Application Objects on Heterogeneous Memories. In: ACM International Conference Proceeding Series. pp. 1–14. Association for Computing Machinery (2021)

  73. Chen, T., Guestrin, C.: XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 785–794. ACM, New York, NY, USA (2016)

  74. Sousa, D.X., Canuto, S., Gonçalves, M.A., Rosa, T.C., Martins, W.S.: Risk-Sensitive Learning to Rank with Evolutionary Multi-Objective Feature Selection. ACM Trans. Inf. Syst. 37, (2019). https://doi.org/10.1145/3300196

  75. Purpura, A., Buchner, K., Silvello, G., Susto, G.A.: Neural Feature Selection for Learning to Rank. In: 32nd International Conference on Neural Information Processing Systems. pp. 9525–9536. Springer Science and Business Media Deutschland GmbH (2018)

  76. Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity Checks for Saliency Maps. Adv. Neural Inf. Process. Syst. 2018-December, 9505–9515 (2018). https://doi.org/10.48550/arxiv.1810.03292

  77. Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: The 14th International Conference on Neural Information Processing Systems: Natural and Synthetic. pp. 849–856 (2001)

  78. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008). https://doi.org/10.1088/1742-5468/2008/10/P10008

  79. Newman, M.E.J., Girvan, M.: Mixing Patterns and Community Structure in Networks. Presented at the (2003)

  80. Newman, M.: Networks: An Introduction. Oxford University Press (2018)

  81. Barabási, A.-L., Pósfai, M.: Network Science. Cambridge University Press (2016)

  82. Newman, M.: The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003). https://doi.org/10.1137/S003614450342480

  83. Barrat, A., Barthélemy, M., Pastor-Satorras, R., Vespignani, A.: The architecture of complex weighted networks. Proc. Natl. Acad. Sci. 101, 3747–3752 (2004). https://doi.org/10.1073/PNAS.0400087101

  84. Zhou, B., Meng, X., Stanley, H.E.: Power-law distribution of degree–degree distance: A better representation of the scale-free property of complex networks. Proc. Natl. Acad. Sci. U. S. A. 117, 14812–14818 (2020). https://doi.org/10.1073/PNAS.1918901117

  85. Pothen, A., Simon, H.D., Liou, K.-P.: Partitioning Sparse Matrices with Eigenvectors of Graphs. SIAM J. Matrix Anal. Appl. 11, 430–452 (1990). https://doi.org/10.1137/0611030

  86. Hespanha, J.P.: An Efficient MATLAB Algorithm for Graph Partitioning. (2004)

  87. Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103, 8577–8582 (2006). https://doi.org/10.1073/PNAS.0601602103

  88. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E. 69, 026113 (2004). https://doi.org/10.1103/PhysRevE.69.026113

  89. Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E - Stat. Physics, Plasmas, Fluids, Relat. Interdiscip. Top. 69, 5 (2004). https://doi.org/10.1103/PHYSREVE.69.066133

  90. Qin, T., Liu, T.-Y.: Introducing LETOR 4.0 Datasets. (2013). https://doi.org/10.48550/arxiv.1306.2597

  91. Qin, T., Liu, T.-Y.: Microsoft Learning to Rank Datasets - Microsoft Research, https://www.microsoft.com/en-us/research/project/mslr/

  92. Dato, D., Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM Trans. Inf. Syst. 35, (2016). https://doi.org/10.1145/2987380

  93. Dato, D., Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: istella, http://blog.istella.it/istella-learning-to-rank-dataset/

  94. Alcântara, O.D.A., Pereira, Á.R., De Almeida, H.M., Gonçalves, M.A., Middleton, C., Baeza-Yates, R.: WCL2R: A Benchmark Collection for Learning to Rank Research with Clickthrough Data. (2010)

  95. Darrudi, E., Hashemi, H.B., AleAhmad, A., Zare Bidoki, A., Habibian, A., Mahdikhani, F., Rahgozar, M.: dotIR collection for Persian web retrieval. (2009)

  96. Karmaker, S.S.K., Sondhi, P., Zhai, C.X.: Empirical Analysis of Impact of Query-Specific Customization of nDCG: A Case-Study with Learning-to-Rank Methods. Int. Conf. Inf. Knowl. Manag. Proc. 3281–3284 (2020). https://doi.org/10.1145/3340531.3417454

Funding

This research received no specific financial support from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Contributions

The author confirms sole responsibility for the following: survey of the related works, design and implementation of the proposed method, data collection, analysis and interpretation of results, and manuscript preparation.

Corresponding author

Correspondence to Amir Hosein Keyhanipour.

Ethics declarations

Conflict of interest

The author has no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Table of notations used in this paper

| Notation | Meaning |
| --- | --- |
| \(Q\) | Set of queries |
| \(D\) | Set of documents |
| \(q_i\) | The ith query in the set of queries |
| \(d_j\) | The jth document in the set of documents |
| \(d^{(i)}\) | Set of documents associated with query \(q_i\) |
| \(y^{(i)}\) | Relevance labels of \(d^{(i)}\) |
| \(m(i)\) | Number of documents associated with query \(q_i\) |
| \(d_j^{(i)}\) | The jth document in the set of documents associated with query \(q_i\) |
| \(y_j^{(i)}\) | Relevance label of the jth document in the set of documents associated with query \(q_i\) |
| \(\overrightarrow{F}(q, d)\) | The vector of features associated with a given query-document pair |
| \(n\) | Number of features in a given L2R dataset |
| \(G\) | A weighted graph |
| \(V\) | The set of vertices of the graph \(G\) |
| \(E\) | The set of edges of the graph \(G\) |
| \(R_k^{(i)}\) | Ranking of \(d^{(i)}\) based on the kth feature of the L2R dataset |
| \(\tau\) | Kendall’s Tau |
| \(A_{\mathrm{w}}\) | The weighted adjacency matrix of the feature–similarity graph |
| \(A\) | The unweighted adjacency matrix of the feature–similarity graph |
| \(\sigma\) | The edge-pruning threshold |
| \(\mathbf{L}\) | The graph Laplacian |
| \(\mathrm{sp}_{jk}\) | The total number of shortest paths from node \(j\) to node \(k\) |
| \(d_{w(i)}\) | The weighted degree of node \(i\) |
| \(\mathrm{EC}_i\) | The eigenvector centrality of node \(i\) |
| \(\mathrm{BC}_i\) | The betweenness centrality of node \(i\) |
| \(\mathrm{CC}_i\) | The closeness centrality of node \(i\) |
| \(\mathrm{LC}_i\) | The local clustering coefficient of node \(i\) |
| \(C_1\), \(C_2\) | The global clustering coefficient values based on Eq. (17) |
| \(C^{w}\) | The average of the weighted clustering coefficients |
| \(d_i\) | The degree of node \(i\) |
| \(\mathrm{CT}\) | The number of connected triples |
| \(p_k\) | The probability that a node in a given network has degree \(k\) |
| \(\alpha\) | The power-law exponent |
| \(M\) | The modularity of a given network |
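
Some of the node-based quantities named in the notation table, such as the weighted degree, the local clustering coefficient, and the degree distribution \(p_k\), can be illustrated with a short sketch. It uses common textbook definitions, which may differ in detail from the paper's own equations (those are not reproduced in this excerpt). The helper names and the toy edge weights are hypothetical.

```python
from collections import Counter
from itertools import combinations


def neighbours(edges):
    """Turn {(j, k): weight} into a symmetric adjacency map {node: {nbr: weight}}."""
    adj = {}
    for (j, k), w in edges.items():
        adj.setdefault(j, {})[k] = w
        adj.setdefault(k, {})[j] = w
    return adj


def weighted_degree(adj, i):
    """Sum of the weights of the edges incident to node i."""
    return sum(adj[i].values())


def local_clustering(adj, i):
    """Fraction of pairs of neighbours of i that are themselves connected."""
    nbrs = list(adj[i])
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (d * (d - 1))


def degree_distribution(adj):
    """Empirical p_k: share of nodes with (unweighted) degree k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in counts.items()}


# Toy feature-similarity graph: a triangle 0-1-2 with a pendant node 3.
edges = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.7, (2, 3): 0.6}
adj = neighbours(edges)
```

Isolated nodes never appear in `edges`, so they are absent from `adj`. Restricting the analysis to the giant component sidesteps that case.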

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Keyhanipour, A.H. Graph-based comparative analysis of learning to rank datasets. Int J Data Sci Anal 17, 165–187 (2024). https://doi.org/10.1007/s41060-023-00406-8

  • DOI: https://doi.org/10.1007/s41060-023-00406-8
