Abstract
Personality traits can reveal useful information about Internet users. Such information can be applied in areas such as law enforcement, advertising, recruitment, and e-commerce. This study aims to create a dataset and a prediction model for predicting the personality traits of Internet users who produce Turkish content. The main contribution of the study is a personality traits dataset composed of Turkish Twitter content. In addition, the comparisons of preprocessing steps, vectorization methods, and deep learning models carried out in the proposed prediction system will contribute to both current applications and future studies in the relevant literature. On the Turkish personality traits dataset, the Bidirectional Encoder Representations from Transformers (BERT) vectorization method and the stemming preprocessing step achieved high success. Previous studies reported lower success rates for these processes on English datasets. The results also show that the Bidirectional Long Short-Term Memory (BiLSTM) deep learning method is more successful than the other methods on both the Turkish dataset and English datasets.
Data availability statement
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Funding
This research received no external funding.
Author information
Contributions
MAK and HK contributed to conceptualization; MAK was responsible for methodology, software, investigation, and resources; MAK, HK, and BAU contributed to validation; MAK and HK contributed to data curation; MAK wrote the original draft and prepared the visualizations; HK and BAU contributed to formal analysis, review and editing of the manuscript, supervision, and project administration.
Ethics declarations
Conflict of interest
All authors declare that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kosan, M.A., Karacan, H. & Urgen, B.A. Personality traits prediction model from Turkish contents with semantic structures. Neural Comput & Applic 35, 17147–17165 (2023). https://doi.org/10.1007/s00521-023-08603-z
DOI: https://doi.org/10.1007/s00521-023-08603-z