Comparing Ranking Learning Algorithms for Information Retrieval Systems

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2023 (IDEAL 2023)

Abstract

The growth of machine learning applications across many fields has advanced Information Retrieval systems. As a result, it has become possible to address the well-known document classification problem. Originally, a document's position within a result list was determined by a score: each document was assigned a value based on the terms used in the input query. The use of machine learning in this field is known as Learning to Rank, which allows documents to be classified so that they better meet user search requirements, taking into account aspects such as document preference, importance, and relevance. This paper compares different algorithms for ranking documents using machine learning. RankSVM gives relatively satisfactory results on smaller datasets, while algorithms based on Gradient Boosting obtain better results on larger datasets.
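The score-based ordering described in the abstract, which predates Learning to Rank, can be illustrated with a minimal sketch. The abstract does not say which scoring function was used, so the term-count rule below and the names `term_score` and `rank_documents` are assumptions made purely for illustration, not the authors' method.

```python
from collections import Counter


def term_score(query: str, document: str) -> int:
    """Count occurrences of the query's distinct terms in the document.

    This term-count rule is a hypothetical stand-in for a real scoring function.
    """
    doc_terms = Counter(document.lower().split())
    return sum(doc_terms[term] for term in set(query.lower().split()))


def rank_documents(query: str, documents: list[str]) -> list[tuple[int, str]]:
    """Return (score, document) pairs ordered from highest to lowest score."""
    scored = [(term_score(query, doc), doc) for doc in documents]
    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "ranking with gradient boosting",
        "learning to rank documents",
        "learning to rank with learning machines",
    ]
    for score, doc in rank_documents("learning to rank", docs):
        print(score, doc)
```

A Learning to Rank model would replace the hand-written `term_score` with a function learned from labelled query-document pairs, while the final sorting step would stay the same.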

Notes

  1. https://lightgbm.readthedocs.io/en/latest/.

  2. https://xgboost.readthedocs.io/en/latest/index.html.

  3. For more information about access to LETOR, see https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/.

  4. The PC used in the experiments had an i5 8400 processor, 24 GB of RAM, and a GTX 1060 graphics card with 6 GB.

Acknowledgements

This work was partially supported by grant PID2021-123673OB-C31, funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”, and by a grant from the Research Services of UPV (PAID-PD-22). The authors would also like to thank FAPERGS/Brazil (Proc. 23/2551-0000126-8) and CNPq/Brazil (3305805/2021-5, 150160/2023-2).

Corresponding author

Correspondence to G. Lucca.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zilles, J., Borges, E.N., Lucca, G., Marco-Detchart, C., Berri, R.A., Dimuro, G.P. (2023). Comparing Ranking Learning Algorithms for Information Retrieval Systems. In: Quaresma, P., Camacho, D., Yin, H., Gonçalves, T., Julian, V., Tallón-Ballesteros, A.J. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2023. IDEAL 2023. Lecture Notes in Computer Science, vol 14404. Springer, Cham. https://doi.org/10.1007/978-3-031-48232-8_26

  • DOI: https://doi.org/10.1007/978-3-031-48232-8_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48231-1

  • Online ISBN: 978-3-031-48232-8

  • eBook Packages: Computer Science, Computer Science (R0)
