Utilizing data science techniques to analyze skill and demand changes in healthcare occupations: case study on USA and UAE healthcare sector

  • Methodologies and Application
Soft Computing

Abstract

New technologies are emerging continually, on steep trajectories and with growing penetration into the job market. Health care is among the top ten sectors in terms of talent turnover rates. Being proactive by predicting which skills will be in demand is therefore essential to prepare for these changes. To predict such transitions, this paper develops a job analysis system, illustrated with the healthcare field in two countries: the United States of America and the United Arab Emirates. The empirical research applies several data science techniques. To study changes in job demand, we deployed Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) models; for changes in skills, we used Factor Analysis and Non-Negative Matrix Factorization. Heatmap visualizations of the LSI and LDA weights provided significant insights into skill sets and demand changes in both job markets. The study concludes that low-skilled jobs are constantly being replaced by automated systems, while some high-level skill sets are also at risk.
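To make the skill-analysis step more concrete, the following is a minimal sketch of Non-Negative Matrix Factorization using the standard multiplicative update rules. It is not the authors' implementation: the occupation-by-skill matrix `demand`, the rank of 2, the iteration counts, and all function names are hypothetical choices made only for illustration. The idea is that a non-negative matrix of skill importance scores is approximated by the product of two smaller non-negative matrices, whose rows and columns can then be read as latent skill groups.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - W H."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so that no entry is stuck at zero.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: rows are occupations, columns are skill scores.
demand = [
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [0, 0, 4, 5],
]

w_first, h_first = nmf(demand, rank=2, iterations=1)
first_error = frobenius_error(demand, w_first, h_first)

w, h = nmf(demand, rank=2, iterations=200)
final_error = frobenius_error(demand, w, h)
```

In this sketch each row of `w` gives an occupation's loading on the two latent skill groups, and each row of `h` gives a group's weight on the original skills. For real data a maintained library implementation would normally be preferable to hand-written updates.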

Notes

  1. https://www.onetcenter.org/.

Author information

Corresponding author

Correspondence to Armin Alibasic.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alibasic, A., Simsekler, M.C.E., Kurfess, T. et al. Utilizing data science techniques to analyze skill and demand changes in healthcare occupations: case study on USA and UAE healthcare sector. Soft Comput 24, 4959–4976 (2020). https://doi.org/10.1007/s00500-019-04247-1

  • DOI: https://doi.org/10.1007/s00500-019-04247-1
