Abstract
A significant problem in many Thai university organizations is the inability to effectively identify employees’ roles and functions. This research studies the workload management of university personnel using text mining techniques. It has two main objectives. The first is to handle the highly complex task of Thai word segmentation. The second is to produce a predictive model that identifies job performance for the human resource management of university personnel. The research tools are machine learning algorithms, namely Decision Tree (DT), Generalized Linear Model (GLM), K-Nearest Neighbors (K-NN), Naïve Bayes (NB), and Support Vector Machine (SVM), together with word segmentation analysis using the Term Frequency-Inverse Document Frequency (TF-IDF), Term Frequency (TF), Term Occurrences (TO), and Binary Term Occurrences (BTO) techniques. The research data is compiled from job descriptions for three positions at the School of Information and Communication Technology, University of Phayao. The results show that the best predictive model is the Generalized Linear Model (GLM) combined with the Binary Term Occurrences (BTO) technique, which achieves an accuracy of 89.80%. In future work, the researchers plan to develop an information system to support work within the School of Information and Communication Technology, University of Phayao.
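The four term-weighting techniques named in the abstract (TO, BTO, TF, and TF-IDF) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The token lists are hypothetical English stand-ins for already-segmented Thai job-description text, and the formulas follow common textbook definitions (TF as count divided by document length, IDF as the natural log of the number of documents over the document frequency), which may differ from the tool used in the study.

```python
import math
from collections import Counter


def term_occurrences(tokens, vocab):
    """TO: raw count of each vocabulary term in one document."""
    counts = Counter(tokens)
    return [counts[t] for t in vocab]


def binary_term_occurrences(tokens, vocab):
    """BTO: 1 if the term appears in the document at all, else 0."""
    present = set(tokens)
    return [1 if t in present else 0 for t in vocab]


def term_frequency(tokens, vocab):
    """TF: term count divided by document length (one common definition)."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[t] / total if total else 0.0 for t in vocab]


def tf_idf(docs, vocab):
    """TF-IDF: TF scaled by log(N / document frequency) for each term."""
    n_docs = len(docs)
    doc_freq = {t: sum(1 for d in docs if t in d) for t in vocab}
    idf = {t: math.log(n_docs / doc_freq[t]) if doc_freq[t] else 0.0
           for t in vocab}
    return [
        [tf * idf[t] for tf, t in zip(term_frequency(d, vocab), vocab)]
        for d in docs
    ]


# Hypothetical, pre-segmented job-description tokens (illustration only).
docs = [
    ["teach", "course", "research", "teach"],
    ["repair", "network", "server"],
    ["teach", "research", "publish"],
]
vocab = sorted({t for d in docs for t in d})

to_vectors = [term_occurrences(d, vocab) for d in docs]
bto_vectors = [binary_term_occurrences(d, vocab) for d in docs]
tf_vectors = [term_frequency(d, vocab) for d in docs]
tfidf_vectors = tf_idf(docs, vocab)
```

Each resulting vector can serve as the feature input to any of the listed classifiers. BTO discards how often a term occurs and keeps only whether it occurs, which is the representation the abstract reports as giving the best result with GLM.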
Acknowledgements
This research project was supported by the Thailand Science Research and Innovation Fund and the University of Phayao (Grant No. FF66-UoE002). In addition, this research was supported by many advisors, academics, researchers, students, and staff. The authors would like to thank all of them for their support and collaboration in making this research possible.
Ethics declarations
The authors declare no conflict of interest.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nuankaew, W.S., Thipmontha, R., Jeefoo, P., Nasa-ngium, P., Nuankaew, P. (2023). Using Text Mining and Tokenization Analysis to Identify Job Performance for Human Resource Management at the University of Phayao. In: Nguyen, N.T., et al. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2023. Communications in Computer and Information Science, vol 1863. Springer, Cham. https://doi.org/10.1007/978-3-031-42430-4_47
DOI: https://doi.org/10.1007/978-3-031-42430-4_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42429-8
Online ISBN: 978-3-031-42430-4
eBook Packages: Computer Science, Computer Science (R0)