
Similarity-Based Résumé Matching via Triplet Loss with BERT Models

  • Conference paper
  • Intelligent Systems and Applications (IntelliSys 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 544)


Abstract

Automatic résumé matching for recruitment engines is an important task because of the vast volume and varied backgrounds of applicants. We propose a résumé matching method intended to serve as a recommendation engine for recruiters. Our approach combines transformer-based natural language processing with the triplet loss, a training objective originally developed in the computer vision domain. By treating the output embeddings of a transformer model in the same way as those of a convolutional neural network, we build a model for the document retrieval task. The paper also investigates a clustering-based pretraining step applied before fine-tuning with the triplet loss. The method is evaluated on data extracted from an online recruitment website, where real users actively create their own résumés. Measured by the precision-at-k score, the method yields an accuracy boost of 12% compared to a base model.
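The pipeline described in the abstract can be sketched with off-the-shelf tooling. The snippet below is a minimal illustration, not the authors' implementation: it fine-tunes a Sentence-BERT-style encoder with a triplet objective on hypothetical (anchor résumé, similar résumé, dissimilar résumé) triples and scores retrieval with a small precision-at-k helper. The model name, the toy triples, and the `precision_at_k` function are assumptions made for illustration only.

```python
# Minimal sketch (not the paper's exact setup): triplet fine-tuning of a
# sentence-transformer encoder plus a precision-at-k helper for retrieval.
from sentence_transformers import SentenceTransformer, InputExample, losses, util
from torch.utils.data import DataLoader

# Hypothetical model choice; the paper's exact BERT variant is not shown here.
model = SentenceTransformer("bert-base-multilingual-cased")

# Toy triples: (anchor, positive, negative). Real training data would come
# from résumés grouped by job category or recruiter feedback.
train_examples = [
    InputExample(texts=[
        "Backend developer, 5 years Java and Spring",    # anchor
        "Java engineer with Spring Boot microservices",  # positive (similar résumé)
        "Graphic designer skilled in Illustrator",       # negative (dissimilar résumé)
    ]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.TripletLoss(model=model, triplet_margin=0.5)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1, warmup_steps=10)

def precision_at_k(model, query, candidates, relevant, k=5):
    """Fraction of the top-k retrieved candidates that are relevant."""
    query_emb = model.encode(query, convert_to_tensor=True)
    cand_emb = model.encode(candidates, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, cand_emb, top_k=k)[0]
    retrieved = {candidates[hit["corpus_id"]] for hit in hits}
    return len(retrieved & set(relevant)) / k
```

In practice, the triples would be mined from the recruitment data itself (for example, résumés applying to the same job posting as positives), and precision-at-k would be measured against recruiter-validated matches.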



Author information


Corresponding author

Correspondence to Ö. Anıl Özlü.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Özlü, Ö.A., Orman, G.K., Daniş, F.S., Turhan, S.N., Kara, K.C., Yücel, T.A. (2023). Similarity-Based Résumé Matching via Triplet Loss with BERT Models. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 544. Springer, Cham. https://doi.org/10.1007/978-3-031-16075-2_37

