
Skills prediction based on multi-label resume classification using CNN with model predictions explanation

  • Original Article
  • Neural Computing and Applications

Abstract

Skills extraction is a critical task in building job recommender systems. It is also useful for building skills profiles and skills knowledge bases for organizations. The aim of skills extraction is to identify the skills expressed in documents such as resumes or job postings. Several methods have been proposed to tackle this problem, and they already perform well at extracting skills that are explicitly mentioned in resumes. However, skills have different levels of abstraction: a high-level skill can be inferred from low-level ones. Instead of just extracting skill-related terms, we propose a multi-label classification architecture based on convolutional neural networks that predicts high-level skills from resumes even when they are not explicitly mentioned. Experiments carried out on a set of anonymous IT resumes collected from the Internet show the effectiveness of our method, which reaches 98.79% recall and 91.34% precision. In addition, the features (terms) detected by the convolutional filters are projected onto the input resumes to show the user which terms contributed to the model's decision.
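The idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes a generic text CNN (n-gram filters, ReLU, max-over-time pooling) with one independent sigmoid output per label, and it assumes that "projecting" a filter onto the input means tracing its maximum activation back to the n-gram that produced it. The weights are random, and the token list, label names, and the function `predict_with_explanation` are hypothetical, chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv_max_pool(embedded, filt, bias):
    """Slide one filter over the token embeddings.

    Returns the maximum ReLU activation and the start index of the
    window (n-gram) that produced it.
    """
    width = len(filt)  # the filter spans `width` consecutive tokens
    best_val, best_pos = float("-inf"), 0
    for start in range(len(embedded) - width + 1):
        act = bias
        for k in range(width):
            act += sum(w * x for w, x in zip(filt[k], embedded[start + k]))
        act = max(0.0, act)  # ReLU
        if act > best_val:
            best_val, best_pos = act, start
    return best_val, best_pos


def predict_with_explanation(tokens, embeddings, filters, biases,
                             out_w, out_b, labels, threshold=0.5):
    """Multi-label prediction plus, per label, the most influential n-gram."""
    embedded = [embeddings[t] for t in tokens]
    pooled, spans = [], []
    for filt, b in zip(filters, biases):
        val, pos = conv_max_pool(embedded, filt, b)
        pooled.append(val)
        spans.append(tokens[pos:pos + len(filt)])  # n-gram behind the max

    results = {}
    for j, label in enumerate(labels):
        # Contribution of each pooled feature to this label's logit.
        contribs = [out_w[j][i] * pooled[i] for i in range(len(pooled))]
        prob = sigmoid(sum(contribs) + out_b[j])  # independent per label
        top = max(range(len(contribs)), key=lambda i: contribs[i])
        results[label] = {
            "prob": prob,
            "predicted": prob >= threshold,
            "top_ngram": spans[top],
        }
    return results


# Toy setup: random weights stand in for a trained model (assumption).
random.seed(0)
dim = 4
tokens = "developed rest services with spring boot and hibernate".split()
embeddings = {t: [random.uniform(-1, 1) for _ in range(dim)] for t in tokens}
widths = [1, 2, 3]  # unigram, bigram, and trigram filters
filters = [[[random.uniform(-1, 1) for _ in range(dim)] for _ in range(w)]
           for w in widths]
biases = [0.0] * len(filters)
labels = ["java_developer", "web_developer"]  # hypothetical high-level skills
out_w = [[random.uniform(-1, 1) for _ in filters] for _ in labels]
out_b = [0.0] * len(labels)

result = predict_with_explanation(tokens, embeddings, filters, biases,
                                  out_w, out_b, labels)
for label, info in result.items():
    print(label, round(info["prob"], 3), info["predicted"], info["top_ngram"])
```

Because each label has its own sigmoid rather than sharing a softmax, several high-level skills can be predicted for the same resume, which is what makes the setting multi-label. The returned `top_ngram` is the kind of term that could be highlighted in the resume to explain a prediction.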

Data availability statement

Resumes repository: https://github.com/florex/resume_corpus.

Code availability

Preprocessing code: https://github.com/florex/preprocessing. CNN classifier code: https://github.com/florex/cnn_classifiers.

Notes

  1. https://www.dice.com/skills.

  2. https://www.indeed.com (consulted on 21 Aug 2019).

  3. https://chrome.google.com/webstore/detail/multi-highlight/pfgfgjlejbbpfmcfjhdmikihihddeeji: a browser extension used to highlight words in web pages.

Author information

Corresponding author

Correspondence to Kameni Florentin Flambeau Jiechieu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiechieu, K.F.F., Tsopze, N. Skills prediction based on multi-label resume classification using CNN with model predictions explanation. Neural Comput & Applic 33, 5069–5087 (2021). https://doi.org/10.1007/s00521-020-05302-x

  • DOI: https://doi.org/10.1007/s00521-020-05302-x
