Abstract
New technologies emerge continually, follow steep trajectories, and penetrate ever more of the job market. Health care is among the top ten sectors in terms of talent turnover rates. Hence, being proactive by predicting which skills will be in demand is essential for preparing for these changes. To predict such transitions, this paper aims to develop a job analysis system, illustrated with an example from the healthcare field in two countries: the United States of America and the United Arab Emirates. This empirical research applies several data science techniques. To study changes in job demand, we used Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) models; to study changes in skills, we used Factor Analysis and Non-Negative Matrix Factorization. Heatmap visualizations of the LSI and LDA weights provided significant insights into the skill sets and demand changes in both job markets. The study concludes that low-skilled jobs are continually being replaced by automated systems, while some high-level skill sets are also at risk.
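As a rough illustration of the skill-analysis step, the sketch below factorizes a small ad-by-skill count matrix with Non-Negative Matrix Factorization. It is not the authors' implementation: the `ADS` matrix and `SKILLS` labels are hypothetical toy data, the helper names are invented for this example, and the solver is the standard multiplicative-update rule of Lee and Seung written with the Python standard library only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH (the reconstruction error)."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V (n x m) as W (n x rank) times H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: rows are job ads, columns are skill-term counts.
SKILLS = ["patient care", "charting", "python", "statistics"]
ADS = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
]

if __name__ == "__main__":
    w, h = nmf(ADS, rank=2)
    for k, row in enumerate(h):
        top = max(range(len(SKILLS)), key=lambda j: row[j])
        print(f"factor {k}: strongest skill = {SKILLS[top]}")
    print("reconstruction error:", round(frobenius_error(ADS, w, h), 4))
```

Each row of `H` can be read as a skill group and each row of `W` as how strongly an ad draws on those groups; comparing such factors across time periods is one way skill changes might be tracked, under the assumptions stated above.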
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with animals performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Alibasic, A., Simsekler, M.C.E., Kurfess, T. et al. Utilizing data science techniques to analyze skill and demand changes in healthcare occupations: case study on USA and UAE healthcare sector. Soft Comput 24, 4959–4976 (2020). https://doi.org/10.1007/s00500-019-04247-1
DOI: https://doi.org/10.1007/s00500-019-04247-1