Comparing Machine Learning Correlations to Domain Experts’ Causal Knowledge: Employee Turnover Use Case

Meddeb, Eya; Bowers, Christopher; Nichol, Lynn

doi:10.1007/978-3-031-14463-9_22

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13480)

Included in the following conference series:

International Cross-Domain Conference for Machine Learning and Knowledge Extraction

Abstract

This paper addresses two major phenomena, machine learning and causal knowledge discovery, in the context of human resources management. First, we examine previous work that analyses employee turnover predictions, and the most important factors affecting them, using regular machine learning (ML) algorithms; we then interpret the results of developing and testing different classification models on the IBM Human Resources (HR) data. Second, we explore an alternative process of extracting causal knowledge from semi-structured interviews with HR experts to form an expert-derived causal graph (map). By comparing the machine learning results with the findings of the interviews, we explore the benefits of adding domain experts’ causal knowledge to data knowledge. Recommendations are provided on the best methods and techniques to consider for causal graph learning to improve decision making.
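To make the comparison described in the abstract concrete, the following is a minimal sketch of how a ranking of important features from a classifier might be set against an expert-derived causal map. It is not the authors' procedure. The graph edges, the importance scores, the node names, and the helper functions `causal_ancestors` and `compare` are all hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical expert-derived causal map: cause -> list of direct effects.
# These edges are illustrative only; they are not taken from the paper.
expert_graph = {
    "WorkLifeBalance": ["JobSatisfaction"],
    "OverTime": ["WorkLifeBalance"],
    "MonthlyIncome": ["JobSatisfaction"],
    "JobSatisfaction": ["Attrition"],
    "ManagerRelationship": ["Attrition"],
}

# Hypothetical feature-importance scores from some fitted classifier.
ml_importance = {
    "OverTime": 0.21,
    "MonthlyIncome": 0.17,
    "Age": 0.12,
    "JobSatisfaction": 0.09,
    "DistanceFromHome": 0.05,
}


def causal_ancestors(graph, target):
    """Return every node that has a directed path to `target`."""
    # Invert the adjacency map so we can walk from effects back to causes.
    parents = {}
    for cause, effects in graph.items():
        for effect in effects:
            parents.setdefault(effect, set()).add(cause)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def compare(graph, importance, target="Attrition", top_k=4):
    """Split factors into those both sources agree on and those only one names."""
    expert = causal_ancestors(graph, target)
    ml = set(sorted(importance, key=importance.get, reverse=True)[:top_k])
    return {
        "agreed": sorted(ml & expert),
        "ml_only": sorted(ml - expert),
        "expert_only": sorted(expert - ml),
    }


result = compare(expert_graph, ml_importance)
print(result)
```

In this toy setup, factors listed under `ml_only` are correlated with the prediction but absent from the experts' map, while those under `expert_only` are causes the experts name that the model did not rank highly. These are the two kinds of disagreement such a comparison is meant to surface.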

Author information

Authors and Affiliations

Department of Computing, University of Worcester, Worcester, WR1 3AS, UK
Eya Meddeb & Christopher Bowers
Worcester Business School, University of Worcester, Worcester, WR1 3AS, UK
Eya Meddeb, Christopher Bowers & Lynn Nichol

Authors

Eya Meddeb
Christopher Bowers
Lynn Nichol

Corresponding author

Correspondence to Eya Meddeb.

Editor information

Editors and Affiliations

University of Natural Resources and Life Sciences Vienna, Vienna, Austria
Andreas Holzinger
St. Pölten University of Applied Sciences, St. Pölten, Austria
Peter Kieseberg
TU Wien, Vienna, Austria
A Min Tjoa
SBA Research, Vienna, Austria
Edgar Weippl

About this paper

Cite this paper

Meddeb, E., Bowers, C., Nichol, L. (2022). Comparing Machine Learning Correlations to Domain Experts’ Causal Knowledge: Employee Turnover Use Case. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds) Machine Learning and Knowledge Extraction. CD-MAKE 2022. Lecture Notes in Computer Science, vol 13480. Springer, Cham. https://doi.org/10.1007/978-3-031-14463-9_22

DOI: https://doi.org/10.1007/978-3-031-14463-9_22
Published: 11 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14462-2
Online ISBN: 978-3-031-14463-9
eBook Packages: Computer Science, Computer Science (R0)
