Graph-based comparative analysis of learning to rank datasets

Keyhanipour, Amir Hosein

doi:10.1007/s41060-023-00406-8

Regular Paper
Published: 30 June 2023

Volume 17, pages 165–187 (2024)

International Journal of Data Science and Analytics

Abstract

The relative success of learning to rank algorithms has drawn the attention of the research community to developing efficient and effective ranking methods. Proposed ranking algorithms are usually evaluated on available benchmark datasets. However, these datasets have different characteristics, and using them to evaluate learning to rank algorithms may yield completely different experimental results. Consequently, a proper understanding of the specifications of benchmark datasets would benefit both the analysis of experimental results and the development of new benchmark datasets. In this regard, the current research proposes a graph-based framework for the comparative analysis of learning to rank datasets. For a given dataset, a feature–similarity graph is produced in which nodes represent the features of the dataset and edge weights indicate the Kendall’s Tau similarity of the connected pairs of features. Thereafter, a variety of structural and node-based attributes are extracted either from the produced feature–similarity graph or from its giant component. The method is applied to four learning to rank datasets: MSLR-Web10K, Istella, WCL2R, and dotIR, the last of which is the only available Persian learning to rank dataset. Based on the experiments, WCL2R differs completely from the other evaluated datasets in its structural and node-based properties. Among the three remaining datasets, MSLR-Web10K, Istella, and dotIR, the last two are more similar to each other.
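The construction described in the abstract can be illustrated with a minimal plain-Python sketch. It is not the author's implementation. Several points are assumptions that the text above does not state: Kendall's Tau is computed per query in its tau-a form (tied pairs count as neither concordant nor discordant) and then averaged over queries, edges whose mean value falls below a threshold `sigma` are pruned, and the function names and toy data are hypothetical.

```python
from collections import deque
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equally long score lists (assumed variant)."""
    n = len(x)
    if n < 2:
        return 0.0
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def feature_similarity_graph(queries, sigma):
    """Build a weighted adjacency matrix over features.

    queries: list of per-query matrices (rows = documents, columns = features).
    An edge (j, k) is kept only if the mean per-query tau is >= sigma (assumption).
    """
    n = len(queries[0][0])
    a_w = [[0.0] * n for _ in range(n)]
    for j, k in combinations(range(n), 2):
        taus = []
        for docs in queries:
            col_j = [row[j] for row in docs]
            col_k = [row[k] for row in docs]
            taus.append(kendall_tau(col_j, col_k))
        weight = sum(taus) / len(taus)
        if weight >= sigma:
            a_w[j][k] = a_w[k][j] = weight
    return a_w


def giant_component(a_w):
    """Return the sorted node indices of the largest connected component (BFS)."""
    n = len(a_w)
    seen = set()
    best = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        component = []
        queue = deque([start])
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in range(n):
                if a_w[u][v] > 0 and v not in seen:
                    seen.add(v)
                    queue.append(v)
        if len(component) > len(best):
            best = component
    return sorted(best)


# Hypothetical toy data: 2 queries, 3 documents each, 5 features per document.
queries = [
    [[1, 2, 9, 5, 1], [2, 4, 5, 1, 3], [3, 6, 1, 3, 2]],
    [[5, 50, 2, 7, 6], [1, 10, 8, 9, 2], [3, 30, 4, 8, 4]],
]
a_w = feature_similarity_graph(queries, sigma=0.5)
giant = giant_component(a_w)
print(giant)
```

On this toy input, features 0, 1, and 4 rank the documents similarly and end up in one component, while features 2 and 3 form a smaller one. Structural and node-based attributes would then be computed on either the full graph or the component returned by `giant_component`.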

Data availability

Data analyzed in this study were a re-analysis of existing data, which are openly available at locations cited in the reference section.

References

Manning, C.D., Raghavan, P., Schutze, H.: Introduction to Information Retrieval. Cambridge University Press (2008)
Liu, T.Y.: Learning to rank for information retrieval. Found. Trends Inf. Retr. 3, 225–231 (2009). https://doi.org/10.1561/1500000016
Li, H.: Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition. Synth. Lect. Hum. Lang. Technol. 7, 1–123 (2015). https://doi.org/10.2200/S00607ED2V01Y201410HLT026/SUPPL_FILE/LI_CH1.PDF
Li, P., Burges, C.J.C., Wu, Q.: McRank: learning to rank using multiple classification and gradient boosting. Adv. Neural Inf. Process. Syst. 20, 897–904 (2007)
Crammer, K., Singer, Y.: Pranking with ranking. Adv. Neural Inf. Process. Syst. 14, 641–647 (2001)
Shashua, A., Levin, A.: Ranking with large margin principle: two approaches. Adv. Neural Inf. Process. Syst. 15, 961–968 (2002)
Smola, A., Bartlett, P., Schölkopf, B., Schuurmans, D.: Large margin rank boundaries for ordinal regression. Adv. Large Margin Classif. 115–132 (2000)
Burges, C.J., Ragno, R., Viet Le, Q.: Learning to Rank with Nonsmooth Cost Functions. In: Advances in Neural Information Processing Systems. pp. 193–200 (2006)
Cao, Y., Xu, J., Liu, T.Y., Li, H., Huang, Y., Hon, H.W.: Adapting ranking SVM to document retrieval. Proc. Twenty-Ninth Annu. Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. 2006, 186–193 (2006). https://doi.org/10.1145/1148170.1148205
Freund, Y., Iyer, R., Schapire, R.E., Singer, Y., Dietterich, T.G.: An efficient boosting algorithm for combining preferences. (2003)
Xu, J., Li, H.: AdaRank: A boosting algorithm for information retrieval. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’07. pp. 391–398 (2007)
Cao, Z., Qin, T., Liu, T.Y., Tsai, M.F., Li, H.: Learning to rank: From pairwise approach to listwise approach. In: ACM International Conference Proceeding Series. pp. 129–136 (2007)
Xu, J., Liu, T.Y., Lu, M., Li, H., Ma, W.Y.: Directly optimizing evaluation measures in learning to rank. ACM SIGIR 2008-31st Annu. Int. ACM SIGIR Conf. Res. Dev. Inf. Retrieval Proc. (2008). https://doi.org/10.1145/1390334.1390355
Sibony, E.: Multiresolution analysis of ranking data, (2016)
Tax, N., Bockting, S., Hiemstra, D.: A cross-benchmark comparison of 87 learning to rank methods. Inf. Process. Manag. 51, 757–772 (2015). https://doi.org/10.1016/J.IPM.2015.07.002
Moreira, C., Calado, P., Martins, B.: Learning to rank academic experts in the DBLP dataset. Expert Syst. J. Knowl. Eng. 32, 477–493 (2015). https://doi.org/10.1111/EXSY.12062
Yu, W., Qin, Z.: Spectrum-enhanced pairwise learning to rank. In: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019. pp. 2247–2257. Association for Computing Machinery, Inc (2019)
Zhang, Y., Wang, D., Zhang, Y.: Neural IR meets graph embedding: a ranking model for product search. Web Conf. 2019 Proc. World Wide Web Conf WWW 2019 (2019). https://doi.org/10.1145/3308558.3313468
Ferraro, A., Porcaro, L., Serra, X.: Balancing Exposure and Relevance in Academic Search. In: The Twenty-Ninth Text Retrieval Conference (2020)
Maqsood, S., Islam, M.A., Afzal, M.T., Masood, N.: A comprehensive author ranking evaluation of network and bibliographic indices. Malaysian J. Libr. Inf. Sci. 25, 31–45 (2020). https://doi.org/10.22452/MJLIS.VOL25NO1.2
Yang, X., Wang, B.: Local ranking and global fusion for personalized recommendation. Appl Soft Comput. 96, 106636 (2020). https://doi.org/10.1016/J.ASOC.2020.106636
Sanz-Cruzado, J., Castells, P., Macdonald, C., Ounis, I.: Effective contact recommendation in social networks by adaptation of information retrieval models. Inf. Process. Manag. 57, 102285 (2020). https://doi.org/10.1016/J.IPM.2020.102285
Nabua, E.B., Falcasantos, J.O., Joy, M., Jerez, Y., Wang, J., Yan, F., Zhang, Y.M., Zou, X.: A survey on application of knowledge graph. J. Phys. Conf. Ser. 1487, 012016 (2020). https://doi.org/10.1088/1742-6596/1487/1/012016
Ji, S., Pan, S., Cambria, E., Marttinen, P., Yu, P.S.: A survey on knowledge graphs: representation, acquisition, and applications. IEEE Trans. Neural Netw. Learn. Syst. 33, 494–514 (2022). https://doi.org/10.1109/TNNLS.2021.3070843
Gao, H., Wu, L., Hu, P., Wei, Z., Xu, F., Long, B.: Graph-augmented Learning to Rank for Querying Large-scale Knowledge Graph. arXiv:2111.10541 (2021)
Wu, H., Meng, F.J.: Research on the application of personalized course recommendation of learn to rank based on knowledge graph. Lect. Notes Inst. Comput. Sci. Soc. Telecommun. Eng. LNICST. 331, 19–30 (2020). https://doi.org/10.1007/978-3-030-62205-3_2/COVER
Su, Y., Xing, Z., Peng, X., Xia, X., Wang, C., Xu, X., Zhu, L.: Reducing bug triaging confusion by learning from mistakes with a bug tossing knowledge graph. Proc. - 2021 36th IEEE/ACM Int Conf. Autom. Softw. Eng. ASE 2021, 191–202 (2021). https://doi.org/10.1109/ASE51524.2021.9678574
Jafarzadeh, P., Amirmahani, Z., Ensan, F.: Learning to rank knowledge subgraph nodes for entity retrieval. SIGIR 2022 Proc. 45th Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. (2022). https://doi.org/10.1145/3477495.3531888
Devezas, J., Nunes, S.: A review of graph-based models for entity-oriented search. SN Comput. Sci. 26(2), 1–36 (2021). https://doi.org/10.1007/S42979-021-00828-W
Ni, Y., Xu, Q.K., Cao, F., Mass, Y., Sheinwald, D., Zhu, H.J., Cao, S.S.: Semantic documents relatedness using concept graph representation. WSDM 2016 Proc 9th ACM Int. Conf. Web Search Data Min. (2016). https://doi.org/10.1145/2835776.2835801
Irrera, O., Silvello, G.: Background Linking: Joining Entity Linking with Learning to Rank Models. (2021)
Hosseini, H., Bagheri, E.: Learning to rank implicit entities on Twitter. Inf. Process. Manag. 58, 102503 (2021). https://doi.org/10.1016/J.IPM.2021.102503
Menezes, T., Roth, C.: Semantic Hypergraphs. (2019)
Dietz, L.: ENT rank: retrieving entities for topical information needs through entity-neighbor-text relations. SIGIR 2019 Proc 42nd Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. (2019). https://doi.org/10.1145/3331184.3331257
Yeh, J.Y., Tsai, C.J.: A graph-based feature selection method for learning to rank using spectral clustering for redundancy minimization and biased pagerank for relevance analysis. Comput. Sci. Inf. Syst. 19, 141–164 (2022). https://doi.org/10.2298/CSIS201220042Y
Yeh, J.Y., Tsai, C.J.: Graph-based feature selection method for learning to rank. ACM Int Conf Proceeding Ser. (2020). https://doi.org/10.1145/3442555.3442567
Geng, B., Yang, L., Hua, X.-S.: Learning to Rank with Graph Consistency. (2009)
Fan, J., Luo, H., Gao, Y., Jain, R.: Incorporating concept ontology for hierarchical video classification, annotation, and visualization. IEEE Trans. Multimed. 9, 939–957 (2007). https://doi.org/10.1109/TMM.2007.900143
Bałchanowski, M., Boryczka, U.: Aggregation of rankings using metaheuristics in recommendation systems. Electron 11, 369 (2022). https://doi.org/10.3390/ELECTRONICS11030369
Zhang, Y., Xiao, Y., Wu, J., Lu, X.: Comprehensive world university ranking based on ranking aggregation. Comput. Stat. 36, 1139–1152 (2021). https://doi.org/10.1007/S00180-020-01033-8/METRICS
Valem, L.P., Pedronette, D.C.G.: Graph-based selective rank fusion for unsupervised image retrieval. Pattern Recognit. Lett. 135, 82–89 (2020). https://doi.org/10.1016/J.PATREC.2020.03.032
Kendall, M.G.: A new measure of rank correlation. Biometrika 30, 81–93 (1938). https://doi.org/10.1093/BIOMET/30.1-2.81
Vathy-Fogarassy, Á., Abonyi, J.: Graph-based clustering and data visualization algorithms. Springer-Verlag, London (2013)
Dai, X., Xi, Y., Zhang, W., Liu, Q., Tang, R., He, X., Hou, J., Wang, J., Yu, Y.: Beyond relevance ranking: a general graph matching framework for utility-oriented learning to rank. ACM Trans. Inf. Syst. (2021). https://doi.org/10.1145/3464303
Pahikkala, T., Tsivtsivadze, E., Airola, A., Järvinen, J., Boberg, J.: An efficient algorithm for learning to rank from preference graphs. Mach. Learn. 75, 129–165 (2009). https://doi.org/10.1007/S10994-008-5097-Z/METRICS
Agarwal, A., Chakrabarti, S., Aggarwal, S.: Learning to rank networked entities. Proc ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. 2006, 14–23 (2006). https://doi.org/10.1145/1150402.1150409
Agarwal, S.: Learning to rank on graphs. Mach. Learn. 81, 333–357 (2010). https://doi.org/10.1007/S10994-010-5185-8/METRICS
Johnson, R., Zhang, T.: Graph-based semi-supervised learning and spectral kernel design. IEEE Trans. Inf. Theory. 54, 275–288 (2008). https://doi.org/10.1109/TIT.2007.911294
Shi, J., Tian, X.Y.: Learning to Rank Sports Teams on a Graph. Appl. Sci. 10, 5833 (2020). https://doi.org/10.3390/APP10175833
Qi, Y., Zhang, J., Liu, Y., Xu, W., Guo, J.: CGTR: Convolution Graph Topology Representation for Document Ranking. Int. Conf. Inf. Knowl. Manag. Proc. 2173–2176 (2020). https://doi.org/10.1145/3340531.3412073
Fan, L., Li, Q., Liu, B., Wu, X.M., Zhang, X., Lv, F., Lin, G., Li, S., Jin, T., Yang, K.: Modeling User Behavior with Graph Convolution for Personalized Product Search. In: ACM Web Conference 2022. pp. 203–212. Association for Computing Machinery, Inc (2022)
Sawhney, R., Agarwal, S., Wadhwa, A., Shah, R.: Exploring the scale-free nature of stock markets: Hyperbolic graph learning for algorithmic trading. Web Conf. 2021 - Proc. World Wide Web Conf. WWW 2021. 11–22 (2021). https://doi.org/10.1145/3442381.3450095
Zhang, Y., Zhang, Q., Zhang, L.L., Yang, Y., Yan, C., Gao, X., Yang, Y.: Learning to Rank Ace Neural Architectures via Normalized Discounted Cumulative Gain. (2021). https://doi.org/10.48550/arxiv.2108.03001
Formal, T., Clinchant, S., Renders, J.M., Lee, S., Cho, G.H.: Learning to Rank Images with Cross-Modal Graph Convolutions. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics). 12035 LNCS, 589–604 (2020). https://doi.org/10.1007/978-3-030-45439-5_39/FIGURES/2
Narang, K., Krishnan, A., Wang, J., Yang, C., Sundaram, H., Sutter, C.: Ranking User-Generated Content via Multi-Relational Graph Convolution. SIGIR 2021 - Proc. 44th Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. 470–480 (2021). https://doi.org/10.1145/3404835.3462857
Feng, F., He, X., Wang, X., Luo, C., Liu, Y., Chua, T.S.: Temporal Relational Ranking for Stock Prediction. ACM Trans. Inf. Syst. 37, (2019). https://doi.org/10.1145/3309547
Bianchi, F., Palmonari, M., Cremaschi, M., Fersini, E.: Actively learning to rank semantic associations for personalized contextual exploration of knowledge graphs. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics). 10249 LNCS, 120–135 (2017). https://doi.org/10.1007/978-3-319-58068-5_8/TABLES/4
Muhammad, I., Bollegala, D., Coenen, F., Gamble, C., Kearney, A., Williamson, P.: Document Ranking for Curated Document Databases Using BERT and Knowledge Graph Embeddings: Introducing GRAB-Rank. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics). 12925 LNCS, 116–127 (2021). https://doi.org/10.1007/978-3-030-86534-4_10/COVER
Ni, C.C., Sum Liu, K., Torzec, N.: Layered Graph Embedding for Entity Recommendation using Wikipedia in the Yahoo! Knowledge Graph. Web Conf. 2020 - Companion World Wide Web Conf. WWW 2020. 811–818 (2020). https://doi.org/10.1145/3366424.3383570
Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A., Lehmann, J.: Learning to Rank Query Graphs for Complex Question Answering over Knowledge Graphs. In: The 18th International Semantic Web Conference (ISWC 2019). pp. 487–504. Springer (2019)
Liu, S., Gu, W., Cong, G., Zhang, F.: Structural Relationship Representation Learning with Graph Embedding for Personalized Product Search. Int. Conf. Inf. Knowl. Manag. Proc. 915–924 (2020). https://doi.org/10.1145/3340531.3411936
Pang, Y., Ji, Z., Jing, P., Li, X.: Ranking graph embedding for learning to rerank. IEEE Trans. Neural Networks Learn. Syst. 24, 1292–1303 (2013). https://doi.org/10.1109/TNNLS.2013.2253798
Yang, S. Bin, Yang, B.: Learning to rank paths in spatial networks. Proc. - Int. Conf. Data Eng. 2020-April, 2006–2009 (2020). https://doi.org/10.1109/ICDE48307.2020.00225
Xu, Q., Li, M., Yu, M.: Learning to rank with relational graph and pointwise constraint for cross-modal retrieval. Soft Comput. 23, 9413–9427 (2019). https://doi.org/10.1007/S00500-018-3608-9/METRICS
Al-Tashi, Q., Abdulkadir, S.J., Rais, H.M., Mirjalili, S., Alhussian, H.: Approaches to Multi-Objective Feature Selection: A Systematic Literature Review. IEEE Access. 8, 125076–125096 (2020). https://doi.org/10.1109/ACCESS.2020.3007291
Shirzad, M.B., Keyvanpour, M.R.: A Systematic Study of Feature Selection Methods for Learning to Rank Algorithms. Int. J. Inf. Retr. Res. 8, 46–67 (2018). https://doi.org/10.4018/IJIRR.2018070104
Li, W., Chai, Z., Tang, Z.: A decomposition-based multi-objective immune algorithm for feature selection in learning to rank. Knowledge-Based Syst. 234, 107577 (2021). https://doi.org/10.1016/J.KNOSYS.2021.107577
Lai, H.J., Pan, Y., Tang, Y., Yu, R.: FSMRank: Feature selection algorithm for learning to rank. IEEE Trans. Neural Networks Learn. Syst. 24, 940–952 (2013). https://doi.org/10.1109/TNNLS.2013.2247628
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. 236 (2004). https://doi.org/10.1007/978-1-4419-8853-9
Lei, S., Han, X.: Feature Selection and Model Comparison on Microsoft Learning-to-Rank Data Sets. (2018)
Cheng, F., Guo, W., Zhang, X.: MOFSRank: A Multiobjective Evolutionary Algorithm for Feature Selection in Learning to Rank. Complexity. 2018, (2018). https://doi.org/10.1155/2018/7837696
Moura, D., Petrucci, V., Mosse, D.: Learning to Rank Graph-based Application Objects on Heterogeneous Memories. In: ACM International Conference Proceeding Series. pp. 1–14. Association for Computing Machinery (2021)
Chen, T., Guestrin, C.: XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 785–794. ACM, New York, NY, USA (2016)
Sousa, D.X., Canuto, S., Gonçalves, M.A., Rosa, T.C., Martins, W.S.: Risk-Sensitive Learning to Rank with Evolutionary Multi-Objective Feature Selection. ACM Trans. Inf. Syst. 37, (2019). https://doi.org/10.1145/3300196
Purpura, A., Buchner, K., Silvello, G., Susto, G.A.: Neural Feature Selection for Learning to Rank. In: 32nd International Conference on Neural Information Processing Systems. pp. 9525–9536. Springer Science and Business Media Deutschland GmbH (2018)
Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity Checks for Saliency Maps. Adv. Neural Inf. Process. Syst. 2018-December, 9505–9515 (2018). https://doi.org/10.48550/arxiv.1810.03292
Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: The 14th International Conference on Neural Information Processing Systems: Natural and Synthetic. pp. 849–856 (2001)
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008). https://doi.org/10.1088/1742-5468/2008/10/P10008
Newman, M.E.J., Girvan, M.: Mixing Patterns and Community Structure in Networks. Presented at the (2003)
Newman, M.: Networks: An Introduction. Oxford University Press (2018)
Barabási, A.-L., Pósfai, M.: Network Science. Cambridge University Press (2016)
Newman, M.: The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003). https://doi.org/10.1137/S003614450342480
Barrat, A., Barthélemy, M., Pastor-Satorras, R., Vespignani, A.: The architecture of complex weighted networks. Proc. Natl. Acad. Sci. 101, 3747–3752 (2004). https://doi.org/10.1073/PNAS.0400087101
Zhou, B., Meng, X., Stanley, H.E.: Power-law distribution of degree–degree distance: A better representation of the scale-free property of complex networks. Proc. Natl. Acad. Sci. U. S. A. 117, 14812–14818 (2020). https://doi.org/10.1073/PNAS.1918901117/SUPPL_FILE/PNAS.1918901117.SAPP.PDF
Pothen, A., Simon, H.D., Liou, K.-P.: Partitioning Sparse Matrices with Eigenvectors of Graphs. SIAM J. Matrix Anal. Appl. 11, 430–452 (1990). https://doi.org/10.1137/0611030
Hespanha, J.P.: An Efficient MATLAB Algorithm for Graph Partitioning. (2004)
Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103, 8577–8582 (2006). https://doi.org/10.1073/PNAS.0601602103
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E. 69, 026113 (2004). https://doi.org/10.1103/PhysRevE.69.026113
Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E - Stat. Physics, Plasmas, Fluids, Relat. Interdiscip. Top. 69, 5 (2004). https://doi.org/10.1103/PHYSREVE.69.066133/FIGURES/5/MEDIUM
Qin, T., Liu, T.-Y.: Introducing LETOR 4.0 Datasets. (2013). https://doi.org/10.48550/arxiv.1306.2597
Qin, T., Liu, T.-Y.: Microsoft Learning to Rank Datasets - Microsoft Research, https://www.microsoft.com/en-us/research/project/mslr/
Dato, D., Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM Trans. Inf. Syst. 35, (2016). https://doi.org/10.1145/2987380
Dato, D., Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: istella, http://blog.istella.it/istella-learning-to-rank-dataset/
Alcântara, O.D.A., Pereira, Á.R., De Almeida, H.M., Gonçalves, M.A., Middleton, C., Baeza-Yates, R.: WCL2R: A Benchmark Collection for Learning to Rank Research with Clickthrough Data. (2010)
Darrudi, E., Hashemi, H.B., AleAhmad, A., Zare Bidoki, A., Habibian, A., Mahdikhani, F., Rahgozar, M.: dotIR collection for Persian web retrieval. (2009)
Karmaker, S.S.K., Sondhi, P., Zhai, C.X.: Empirical Analysis of Impact of Query-Specific Customization of nDCG: A Case-Study with Learning-to-Rank Methods. Int. Conf. Inf. Knowl. Manag. Proc. 3281–3284 (2020). https://doi.org/10.1145/3340531.3417454

Funding

This research received no specific financial support from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Computer Engineering Department, Faculty of Engineering, College of Farabi, University of Tehran, Tehran, Iran
Amir Hosein Keyhanipour

Contributions

The author confirms sole responsibility for the following: survey of the related works, design and implementation of the proposed method, data collection, analysis and interpretation of results, and manuscript preparation.

Corresponding author

Correspondence to Amir Hosein Keyhanipour.

Ethics declarations

Conflict of interest

The author has no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Table of notations used in this paper

Notation	Meaning
\(Q\)	Set of queries
\(D\)	Set of documents
\({q}_{i}\)	The ith query in the set of queries
\({d}_{j}\)	The jth document in the set of documents
\({d}^{(i)}\)	Set of documents associated with query \({q}_{i}\)
\({y}^{(i)}\)	Relevance labels of \({d}^{(i)}\)
\(m(i)\)	Number of documents associated with query \({q}_{i}\)
\({d}_{j}^{(i)}\)	The jth document in the set of documents associated with query \({q}_{i}\)
\({y}_{j}^{(i)}\)	Relevance label of the jth document in the set of documents associated with query \({q}_{i}\)
\(\overrightarrow{F}(q, d)\)	The vector of features associated with a given query-document pair
\(n\)	Number of features in a given L2R dataset
\(G\)	A weighted graph
\(V\)	The set of the vertices of the graph \(G\)
\(E\)	The set of the edges of the graph \(G\)
\({R}_{k}^{(i)}\)	Ranking of \({d}^{(i)}\) based on the kth feature of the L2R dataset
\(\tau \)	Kendall’s Tau rank correlation coefficient
\({A}_{\mathrm{w}}\)	The weighted adjacency matrix of the feature–similarity graph
\(A\)	The unweighted adjacency matrix of the feature–similarity graph
\(\sigma \)	The edge-pruning threshold
\(\boldsymbol{L}\)	The graph Laplacian
\({\mathrm{sp}}_{jk}\)	The total number of shortest paths from node \(j\) to node \(k\)
\({d}_{w(i)}\)	The weighted degree of node \(i\)
\({\mathrm{EC}}_{i}\)	The eigenvector centrality of node \(i\)
\({\mathrm{BC}}_{i}\)	The betweenness centrality of node \(i\)
\({\mathrm{CC}}_{i}\)	The closeness centrality of node \(i\)
\({\mathrm{LC}}_{i}\)	The local clustering coefficient of node \(i\)
\({C}_{1}\), \({C}_{2}\)	The global clustering coefficient values based on Eq. (17)
\({C}^{w}\)	The average of the weighted clustering coefficients
\({d}_{i}\)	The degree of node \(i\)
\(\mathrm{CT}\)	The number of connected triples
\({p}_{k}\)	The probability of having a node in a given network with degree \(k\)
\(\alpha \)	The power-law exponent
\(M\)	The modularity of a given network
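
Some of the node-based quantities in the table can be illustrated with a short plain-Python sketch. The paper's own equations are not reproduced on this page, so the code uses common textbook definitions, which is an assumption: the degree \({d}_{i}\) as a row sum of \(A\), the weighted degree \({d}_{w(i)}\) as a row sum of \({A}_{\mathrm{w}}\), the local clustering coefficient \({\mathrm{LC}}_{i}\) as the fraction of connected neighbor pairs, and the Laplacian as the degree matrix minus \(A\). The function names and the toy matrix are hypothetical.

```python
from itertools import combinations


def unweighted(a_w):
    """Derive the unweighted adjacency matrix A from the weighted one A_w."""
    return [[1 if w > 0 else 0 for w in row] for row in a_w]


def degrees(a):
    """Degree d_i: number of edges incident to node i."""
    return [sum(row) for row in a]


def weighted_degrees(a_w):
    """Weighted degree d_w(i): sum of the weights of edges incident to node i."""
    return [sum(row) for row in a_w]


def local_clustering(a):
    """Local clustering LC_i: share of neighbor pairs of i that are linked.

    Nodes with fewer than two neighbors get 0.0 (a convention, assumed here).
    """
    n = len(a)
    result = []
    for i in range(n):
        nbrs = [j for j in range(n) if j != i and a[i][j]]
        k = len(nbrs)
        if k < 2:
            result.append(0.0)
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if a[u][v])
        result.append(2.0 * links / (k * (k - 1)))
    return result


def laplacian(a):
    """Graph Laplacian L = D - A, with D the diagonal degree matrix."""
    d = degrees(a)
    n = len(a)
    return [[(d[i] if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)]


# Hypothetical weighted graph on 4 features: a triangle 0-1-2 plus a pendant node 3.
a_w = [
    [0.0, 1.0, 0.8, 0.0],
    [1.0, 0.0, 0.6, 0.0],
    [0.8, 0.6, 0.0, 0.7],
    [0.0, 0.0, 0.7, 0.0],
]
a = unweighted(a_w)
deg = degrees(a)
wdeg = weighted_degrees(a_w)
lc = local_clustering(a)
lap = laplacian(a)
print(deg, wdeg, lc)
```

In this toy graph, nodes 0 and 1 sit inside a closed triangle, node 2 bridges the triangle and the pendant node 3, and node 3 has a single neighbor. Each row of the Laplacian sums to zero, which is a quick sanity check on the construction.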

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Keyhanipour, A.H. Graph-based comparative analysis of learning to rank datasets. Int J Data Sci Anal 17, 165–187 (2024). https://doi.org/10.1007/s41060-023-00406-8

Received: 27 February 2023
Accepted: 13 June 2023
Published: 30 June 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s41060-023-00406-8
