Abstract
The growth of machine learning applications across many fields has driven advances in Information Retrieval systems. As a result, it has become possible to solve the well-known document classification problem. Originally, the position of each document in a result list was determined by a score, with each document assigned a value based on the terms used in the input query. The use of machine learning in this field is known as Learning to Rank; it allows documents to be classified so that they better meet user search requirements, taking into account aspects such as document preference, importance, and relevance. This paper presents a comparison of different machine learning algorithms for ranking documents. RankSVM yields relatively satisfactory results on smaller datasets, while algorithms based on Gradient Boosting obtain better results on larger datasets.
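The score-based ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the function names `score` and `rank`, the whitespace tokenization, and the sample documents are all assumptions made for illustration. Each document receives a value based on how often the query terms occur in it, and the result list is sorted by that value.

```python
from collections import Counter
from typing import List, Tuple


def score(query: str, document: str) -> int:
    """Sum how often each distinct query term occurs in the document.

    Hypothetical scoring rule: lowercase, whitespace-split term counts.
    """
    term_counts = Counter(document.lower().split())
    return sum(term_counts[term] for term in set(query.lower().split()))


def rank(query: str, documents: List[str]) -> List[Tuple[int, str]]:
    """Return (score, document) pairs sorted by descending score."""
    scored = [(score(query, doc), doc) for doc in documents]
    # sorted() is stable, so documents with equal scores keep input order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative documents, not taken from any dataset used in the paper.
docs = [
    "learning to rank with gradient boosting",
    "support vector machines for ranking documents",
    "a cooking recipe for pasta",
]
ranked = rank("learning to rank documents", docs)
for value, doc in ranked:
    print(value, doc)
```

A Learning to Rank model replaces a fixed rule such as `score` with a function learned from labeled query-document examples, which is what lets it account for preference, importance, and relevance.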
Notes
- 1.
- 2.
- 3. For more information about accessing LETOR, see https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/
- 4. The PC used in the experiments has the following configuration: Intel i5 8400 CPU, 24 GB of RAM, and a GTX 1060 graphics card with 6 GB of memory.
Acknowledgements
This work was partially supported by grant PID2021-123673OB-C31, funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”, and by a grant from the Research Services of UPV (PAID-PD-22). The authors would also like to thank FAPERGS/Brazil (Proc. 23/2551-0000126-8) and CNPq/Brazil (3305805/2021-5, 150160/2023-2).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zilles, J., Borges, E.N., Lucca, G., Marco-Detchart, C., Berri, R.A., Dimuro, G.P. (2023). Comparing Ranking Learning Algorithms for Information Retrieval Systems. In: Quaresma, P., Camacho, D., Yin, H., Gonçalves, T., Julian, V., Tallón-Ballesteros, A.J. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2023. IDEAL 2023. Lecture Notes in Computer Science, vol 14404. Springer, Cham. https://doi.org/10.1007/978-3-031-48232-8_26
DOI: https://doi.org/10.1007/978-3-031-48232-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48231-1
Online ISBN: 978-3-031-48232-8
eBook Packages: Computer Science (R0)