Abstract
Understanding personality is useful for many purposes; for example, it is natural to predict a user’s personality before offering him or her a service. Personality is intrinsic to a person’s behavior in all its aspects, including written text. Several recent methods attempt to classify a person’s personality from text, but the task remains a significant challenge because the accuracy achieved so far is low; the proposed work addresses this issue. Effective feature selection improves accuracy in multi-label classification, and personality trait identification, framed as a multi-label classification problem, depends on it. Therefore, to improve accuracy through feature selection, this paper proposes a method for personality trait recognition from textual data called Personality Trait Classification based on Linguistic and Feature selection as Multi-label classification (PTLFM). It combines the analysis of variance (ANOVA) F-statistic, Chi-square, and Mutual information with the sequential feature selection wrapper method to rank features. These three criteria capture different aspects of the dataset. The experimental results demonstrate that the proposed PTLFM method achieves higher accuracy across all the personality traits than the prevailing state-of-the-art machine learning and deep learning models. PTLFM provides an absolute improvement of 2.23% and a comparative improvement of 3.84% over the existing prevalent method, with more than 90% of features discarded. Furthermore, across the individual personality traits Extraversion, Neuroticism, Agreeableness, Conscientiousness, and Openness, PTLFM gains over the competitive methods 1.17, 1.94, 2.35, 1.64, and 0.35 respectively in absolute terms, and 2.01, 3.27, 4.14, 2.86, and 0.56 respectively in comparative terms.
The results suggest that although deep learning is a popular paradigm, it does not always lead to a better predictive performance than machine learning models in all the problem domains.
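The filter-plus-wrapper idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three filter scores are combined, so the sketch makes several assumptions that are not taken from the paper: the scores are merged by averaging per-feature ranks, the wrapper is scikit-learn's forward `SequentialFeatureSelector` around a logistic regression, each trait is treated as an independent binary label, and the data are synthetic. The helper names `filter_rank` and `select_features` are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.feature_selection import (
    SequentialFeatureSelector,
    chi2,
    f_classif,
    mutual_info_classif,
)
from sklearn.linear_model import LogisticRegression

TRAITS = ["Extraversion", "Neuroticism", "Agreeableness",
          "Conscientiousness", "Openness"]


def filter_rank(X, y, random_state=0):
    """Mean rank of every feature over the three filter criteria (1 = best).

    Averaging ranks is an assumption of this sketch, not the paper's rule.
    """
    f_scores, _ = f_classif(X, y)
    chi_scores, _ = chi2(X, y)  # chi2 requires non-negative feature values
    mi_scores = mutual_info_classif(X, y, random_state=random_state)
    # rankdata ranks ascending, so negate: the highest score gets rank 1.
    ranks = [rankdata(-s) for s in (f_scores, chi_scores, mi_scores)]
    return np.mean(ranks, axis=0)


def select_features(X, y, n_filter=10, n_final=3):
    """Keep the n_filter best-ranked features, then run a forward wrapper."""
    candidates = np.argsort(filter_rank(X, y))[:n_filter]
    wrapper = SequentialFeatureSelector(
        LogisticRegression(max_iter=1000),
        n_features_to_select=n_final,
        direction="forward",
        cv=3,
    )
    wrapper.fit(X[:, candidates], y)
    # Map the wrapper's boolean mask back to original column indices.
    return candidates[wrapper.get_support()]


# Synthetic stand-in data: 200 documents, 30 non-negative features,
# and one binary label per trait driven by two of the features.
rng = np.random.default_rng(0)
X = rng.random((200, 30))
noise = rng.normal(0.0, 0.1, size=(200, 5))
Y = ((X[:, :5] + X[:, 5:10] + noise) > 1.0).astype(int)

selected_by_trait = {
    trait: select_features(X, Y[:, j]) for j, trait in enumerate(TRAITS)
}
for trait, columns in selected_by_trait.items():
    print(trait, sorted(columns.tolist()))
```

Running the filter first keeps the wrapper search small, since sequential selection refits the classifier once per candidate feature at every step. On real text features, the retained fraction and the base classifier would need to be tuned rather than fixed as they are here.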
Data Availability statement
This paper reuses existing data, and a citation to the data has been added to the reference list of the manuscript.
References
Aguilar AG, Guillén MJY, Roman NV (2014) Destination brand personality: an application to Spanish tourism. Int J Tour Res 18(3):210–219
Al Marouf A, Hasan MK, Mahmud H (2020) Comparative analysis of feature selection algorithms for computational personality prediction from social media. IEEE Trans Comput Soc Syst 7(3):587–599
Arya R, Singh J, Kumar A (2021) A survey of multidisciplinary domains contributing to affective computing. Comput Sci Rev 40:100399
Bergner RM (2020) What is personality? two myths and a definition. New Ideas Psychol 57:100759
Bhardwaj S, Atrey PK, Saini MK, El Saddik A (2016) Personality assessment using multiple online social networks. Multimed Tools Appl 75(21):13237–13269
Capretz LF, Ahmed F (2010) Making sense of software development and personality types. IT Prof 12(1):6–13
Coltheart M (1981) The mrc psycholinguistic database. Quart J Exper Psychol Sect A 33(4):497–505
Dhelim S, Aung N, Ning H (2020) Mining user interest based on personality-aware hybrid filtering in social networks. Knowl-Based Syst 206:106227
El-Demerdash K, El-Khoribi RA, Shoman MAI, Abdou S (2021) Deep learning based fusion strategies for personality prediction. Egyptian Informatics Journal
Elngar AA, Jain N, Sharma D, Negi H, Trehan A, Srivastava A (2020) A deep learning based analysis of the big five personality traits from handwriting samples using image processing. J Inf Technol Manag 12:3–35. Special Issue: Deep Learning for Visual Information Analytics and Management
Goldberg LR (1993) The structure of phenotypic personality traits. Am Psychol 48(1):26–34
Gulseven O, Mostert J (2019) The role of phenotypic personality traits as dimensions of decision-making styles. Open Psychol J 12(1):84–95
Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the 2014 conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics
Lerner MJ, Millon T, Weiner IB (2003) Handbook of psychology, volume 5: personality and social psychology. Wiley
Mairesse F, Walker MA, Mehl MR, Moore RK (2007) Using linguistic cues for the automatic recognition of personality in conversation and text. J Artif Intell Res 30:457–500
Majumder N, Poria S, Gelbukh A, Cambria E (2017) Deep learning-based document modeling for personality detection from text. IEEE Intell Syst 32(2):74–79
Mehta Y, Majumder N, Gelbukh A, Cambria E (2019) Recent trends in deep learning based personality detection. Artif Intell Rev:1–27
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, vol 26, pp 3111–3119
Mishra NK, Singh PK (2020) Fs-mlc: feature selection for multi-label classification using clustering in feature space. Inf Process Manag 57(4):102240
Mishra NK, Singh PK (2021) Feature construction and smote-based imbalance handling for multi-label learning. Inf Sci 563:342–357
Mishra R, Barnwal SK, Malviya S, Mishra P, Tiwary US (2018) Prosodic feature selection of personality traits for job interview performance. In: International Conference on Intelligent Systems Design and Applications. Springer, pp 673–682
Mohammad SM, Kiritchenko S (2015) Using hashtags to capture fine emotion categories from tweets. Comput Intell 31(2):301–326
Myers IB (1998) Mbti manual: A guide to the development and use of the myers-briggs type indicator. Consulting Psychologists Press, Palo Alto
Pennebaker JW, Francis ME, Booth RJ (2001) Linguistic inquiry and word count: Liwc 2001. Lawrence Erlbaum Associates, Mahwah
Pennebaker JW, King LA (1999) Linguistic styles: language use as an individual difference. J Person Soc Psychol 77(6):1296–1312
Pohjalainen J, Räsänen O, Kadioglu S (2015) Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits. Comput Speech Lang 29(1):145–171
Quercia D, Lambiotte R, Stillwell D, Kosinski M, Crowcroft J (2012) The personality of popular facebook users. In: Proceedings of the ACM 2012 conference on computer supported cooperative work, pp 955–964
Rifkin R, Klautau A (2004) In defense of one-vs-all classification. J Mach Learn Res 5:101–141
Sharma A, Jayagopi DB (2021) Towards efficient unconstrained handwriting recognition using dilated temporal convolution network. Expert Syst Appl 164:114004
Tang B, Kay S, He H (2016) Toward optimal feature selection in naive bayes for text categorization. IEEE Trans Knowl Data Eng 28(9):2508–2521
Tayarani M, Esposito A, Vinciarelli A (2019) What an “ehm” leaks about you: Mapping fillers into personality traits with quantum evolutionary feature selection algorithms. IEEE Trans Affect Comput
Thakur D, Gera T, Singh J (2015) The senti strength calculator: Engineering the sentiment from the opinionated text. In: 2015 Fifth international conference on communication systems and network technologies. IEEE, pp 1103–1108
Tighe EP, Ureta JC, Pollo BAL, Cheng CK, Bulos RDD (2016) Personality trait classification of essays with the application of feature reduction [internet]. In: Proceedings of the 4th workshop on Sentiment Analysis where AI meets Psychology (SAAIP) co-located with 25th International Joint Conference on Artificial Intelligence (IJCAI), pp 22–28
Vuttipittayamongkol P, Elyan E (2020) Neighbourhood-based undersampling approach for handling imbalanced and overlapped data. Inf Sci 509:47–70
Wang C, Han Y (2011) Linking properties of knowledge with innovation performance: the moderate role of absorptive capacity. J Knowl Manag 15(5):802–819
Wang Y, Zhao N, Liu X, Karaburun S, Chen M, Zhu T (2020) Identifying big five personality traits through controller area network bus data. J Adv Transp 2020
Xue D, Wu L, Hong Z, Guo S, Gao L, Wu Z, Zhong X, Sun J (2018) Deep learning-based personality recognition from text posts of online social networks. Appl Intell 48(11):4232–4246
Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Proceedings of the fourteenth International Conference on Machine Learning, ICML 97. Morgan Kaufmann Publishers Inc., San Francisco, pp 412–420
Zhao J, Zeng D, Xiao Y, Che L, Wang M (2020) User personality prediction based on topic preference and sentiment analysis using lstm model. Pattern Recogn Lett 138:397–402
Zhao S, Gholaminejad A, Ding G, Gao Y, Han J, Keutzer K (2019) Personalized emotion recognition by personality-aware high-order learning of physiological signals. ACM Trans Multimed Comput Commun Appl 15(1s):1–18
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Mishra, N.K., Singh, A. & Singh, P.K. Multi-label personality trait identification from text. Multimed Tools Appl 81, 21503–21519 (2022). https://doi.org/10.1007/s11042-022-12548-1
DOI: https://doi.org/10.1007/s11042-022-12548-1