Comparing Ranking Learning Algorithms for Information Retrieval Systems

Zilles, J.; Borges, E. N.; Lucca, G.; Marco-Detchart, C.; Berri, Rafael A.; Dimuro, G. P.

doi:10.1007/978-3-031-48232-8_26

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14404)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

The growth of machine learning applications in various fields has driven advances in Information Retrieval systems. As a result of this evolution, it has become possible to solve the well-known document classification problem. Originally, the position of each document in a result list was determined by a score assigned to the document based on the terms used in the input query. The use of machine learning in this field is known as Learning to Rank; it allows documents to be classified so as to better meet user search requirements, taking into account aspects such as document preference, importance, and relevance. This paper presents a comparison of different algorithms for ranking documents using machine learning. The results indicate that RankSVM performs relatively well on smaller datasets, while algorithms based on Gradient Boosting obtain better results on larger datasets.
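
The score-based ordering mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain term-frequency score (how often the query terms occur in each document); the abstract does not specify the scoring function, and the names `term_score` and `rank_documents`, as well as the sample documents, are hypothetical and used only for illustration.

```python
from collections import Counter


def term_score(query: str, document: str) -> int:
    """Sum, over the distinct query terms, how often each occurs in the document.

    Assumption: whitespace tokenization and lowercasing; not the paper's method.
    """
    doc_counts = Counter(document.lower().split())
    return sum(doc_counts[term] for term in set(query.lower().split()))


def rank_documents(query: str, documents: list[str]) -> list[tuple[int, int]]:
    """Return (document index, score) pairs ordered from highest to lowest score."""
    scored = [(i, term_score(query, doc)) for i, doc in enumerate(documents)]
    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection.
docs = [
    "gradient boosting builds trees in sequence",
    "ranking documents with gradient boosting and ranking losses",
    "support vector machines for classification",
]
ranking = rank_documents("ranking gradient boosting", docs)
print(ranking)
```

A Learning to Rank approach would replace the fixed `term_score` with a model trained on labeled query-document pairs, which is where algorithms such as RankSVM or Gradient Boosting come in.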

Notes

1. https://lightgbm.readthedocs.io/en/latest/.
2. https://xgboost.readthedocs.io/en/latest/index.html.
3. For more information about LETOR, see https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/.
4. The configuration of the PC used in the experiments was: nim i5 8400, 24 GB RAM, and a GTX 1060 graphics card with 6 GB.

Acknowledgements

This work was partially supported by grant PID2021-123673OB-C31 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”, and by a grant from the Research Services of UPV (PAID-PD-22). The authors would also like to thank FAPERGS/Brazil (Proc. 23/2551-0000126-8) and CNPq/Brazil (3305805/2021-5, 150160/2023-2).

Author information

Authors and Affiliations

Centro de Ciências Computacionais, Universidade Federal do Rio Grande, Av. Itália km 08, Campus Carreiros, Rio Grande, 96201-900, Brazil
J. Zilles, E. N. Borges, Rafael A. Berri & G. P. Dimuro
Programa de Pós-Graduação em Engenharia Eletrônica e Computação, Universidade Católica de Pelotas, Gonçalves Chaves, Pelotas, 96015-560, Brazil
G. Lucca
Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València (UPV), Camino de Vera s/n, 46022, Valencia, Spain
C. Marco-Detchart

Corresponding author

Correspondence to G. Lucca.

Editor information

Editors and Affiliations

University of Évora, Évora, Portugal
Paulo Quaresma
Technical University of Madrid, Madrid, Spain
David Camacho
University of Manchester, Manchester, UK
Hujun Yin
University of Évora, Évora, Portugal
Teresa Gonçalves
Polytechnic University of Valencia, Valencia, Spain
Vicente Julian
University of Huelva, Huelva, Spain
Antonio J. Tallón-Ballesteros

About this paper

Cite this paper

Zilles, J., Borges, E.N., Lucca, G., Marco-Detchart, C., Berri, R.A., Dimuro, G.P. (2023). Comparing Ranking Learning Algorithms for Information Retrieval Systems. In: Quaresma, P., Camacho, D., Yin, H., Gonçalves, T., Julian, V., Tallón-Ballesteros, A.J. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2023. IDEAL 2023. Lecture Notes in Computer Science, vol 14404. Springer, Cham. https://doi.org/10.1007/978-3-031-48232-8_26

DOI: https://doi.org/10.1007/978-3-031-48232-8_26
Published: 15 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48231-1
Online ISBN: 978-3-031-48232-8
eBook Packages: Computer Science, Computer Science (R0)
