A lexical psycholinguistic knowledge-guided graph neural network for interpretable personality detection

doi:10.1016/j.knosys.2022.108952

Knowledge-Based Systems

Volume 249, 5 August 2022, 108952

https://doi.org/10.1016/j.knosys.2022.108952

Abstract

With the blossoming of online social media, personality detection based on user-generated content has a significant impact on information science and industrial applications. Most existing approaches rely heavily on semantic features or superficial psycholinguistic statistical features calculated by existing tools and fail to effectively exploit psycholinguistic knowledge that can help determine and interpret people’s personality traits. In this paper, we propose a novel lexical psycholinguistic knowledge-guided graph neural model for interpretable personality detection, which leverages personality lexicons as a bridge for injecting relevant external knowledge to enrich the semantics of a document. Specifically, we learn a kind of personality-aware word embedding that encodes psycholinguistic information in the continuous representations of words. Then, a Heterogeneous Personality word graph is constructed by aligning the personality lexicons with the personality knowledge graph, which is fed into a Message-passing graph Network (HPMN) to extract explicit lexicon and knowledge relations through the interactions among heterogeneous graph nodes. Finally, through a carefully designed readout function, all heterogeneous nodes are selectively incorporated as knowledge-guided document embeddings for user-generated text personality understanding and interpretation. Experiments show that our model effectively detects personality traits. Moreover, it provides a certain level of support for lexical hypotheses in psycholinguistic research from a computational linguistics perspective.

Introduction

With the rapid development of social media platforms, a large amount of user-generated content (UGC) can be accessed and analyzed to automatically identify authors’ personality traits. Many studies have shown that automatic personality detection systems play an essential role in various applications, such as user interest mining [1], information dissemination [2], recommendation systems [3], [4], [5], and intelligent machine design [6]. Therefore, analyzing and detecting users’ personality traits is significant for grasping users’ current and future psychologies and predicting their reactions and behaviors.

Personality detection research based on user-generated text can be mainly divided into psycholinguistic lexicon-based approaches, neural language model-based approaches, and interpretability research. Earlier researchers captured psycholinguistic lexicon statistical features, such as Linguistic Inquiry and Word Count (LIWC) [7] and Medical Research Council (MRC) [8] features, in texts for personality detection [9], [10]. However, obtaining such handcrafted features is costly, and a statistical analysis cannot effectively represent the original semantics. To avoid feature engineering, deep neural models are employed to learn distributed text representations end to end, and the resulting detection accuracy is greatly improved [11], [12], [13]. However, neural language model embeddings lack the ability to explain personality. Recently, some researchers combined common knowledge to detect personality [14], [15], providing some ability to explain personality and contributing to the analysis of personality traits. More recently, researchers have employed interpretable machine learning to clearly quantify the impacts of various psycholinguistic statistical features [16], [17]. However, these methods do not deeply exploit psycholinguistic domain knowledge and fail to effectively integrate psycholinguistic knowledge and text semantics into the associated neural models.
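To make the notion of psycholinguistic lexicon statistical features concrete, the following is a minimal sketch of category-frequency features in the spirit of LIWC-style counting. The lexicon `TOY_LEXICON`, its category names, and the function `lexicon_features` are hypothetical and purely illustrative; the actual LIWC and MRC resources are far larger and are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (category -> words). These entries are
# illustrative assumptions and are not taken from LIWC or MRC.
TOY_LEXICON = {
    "negemo": {"hate", "hell", "damn"},
    "posemo": {"love", "happy", "great"},
    "social": {"friend", "family", "we"},
}


def lexicon_features(text, lexicon=TOY_LEXICON):
    """Return the fraction of tokens that fall into each lexicon category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    total = max(len(tokens), 1)  # avoid division by zero on empty input
    return {category: counts[category] / total for category in lexicon}


features = lexicon_features("I hate Mondays, but we love our family.")
print(features)
```

Such features are simple counts per category, which illustrates the limitation noted above: they discard word order and context and therefore cannot represent the original semantics of the text.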

In the psychology field, personality traits are defined as attribute combinations of individual thoughts and emotions that explain the differences in human behaviors [18]. The most commonly used measure is the Big Five personality model, which includes openness, conscientiousness, extroversion, agreeableness, and neuroticism [19]. The relationship between personality and language has been studied for a long time. Empirical research in psycholinguistics has found an interesting phenomenon: personality traits affect people’s use of language, in particular their choice of vocabulary. Specifically, the LIWC lexicon [20], [21] and some personality adjectives [22] (Personality Adjectives Check List)¹ have linear correlations with each personality trait. In addition, people with the same personality traits usually have the same fixed emotional polarities [23]. The details regarding this topic are described in the Appendix. Fig. 1 shows a visual example of a neurotic user’s psycholinguistic knowledge. From the words “hate”, “murder”, and “hell”, we can roughly infer that he/she is a neurotic user. Based on the relationship between the synonym “damn”, emotional polarity, and personality traits, this inference can be made with greater confidence. Thus, approaching personality detection from the perspective of lexical psycholinguistic knowledge can bring rich structured domain knowledge rather than superficial psycholinguistic statistical information. Although research on personality detection has achieved remarkable results, some challenges still remain.

• Fusion of text semantics and psycholinguistic knowledge: It is a challenge to fully fuse lexical psycholinguistic knowledge and text semantics while accurately representing the personality traits derived from the user’s language.
• Interpretability of personality detection: It is a challenge to utilize personality psychology knowledge to realize explainable personality detection in neural models.

To meet the above challenges, we propose a novel lexical psycholinguistic knowledge-guided graph neural network model for interpretable personality detection. Our model enriches personality document representations by incorporating heterogeneous external knowledge through the use of personality lexicons as intermediaries. In particular, instead of directly using existing pretrained word embeddings, we first refine a kind of personality-aware word embedding via position encoding and an attention mechanism. Second, to fully fuse knowledge and semantics, we align the personality lexicons with the constructed personality knowledge graph and automatically build a heterogeneous personality word graph for each user. Then, we develop a Heterogeneous Personality Message-passing graph neural Network (HPMN) and perform interactions among the heterogeneous word, emotion, and personality nodes along directed edges. Finally, regarding the interpretability of personality traits, we design a graph-level readout function, which selectively incorporates all heterogeneous nodes as knowledge-guided document embeddings to achieve user-generated text personality understanding and interpretation. Therefore, personality detection is transformed into a heterogeneous word graph classification problem. After conducting a verification on four public personality datasets, the results show that our model can effectively improve the accuracy of personality detection and pay more attention to critical knowledge.
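The pipeline described above (a per-user heterogeneous word graph, message passing along directed edges, and a selective graph-level readout) can be illustrated with a deliberately simplified sketch. This is not the HPMN architecture: the knowledge table `TOY_KNOWLEDGE`, the type-based initial vectors, the mean-aggregation update, and the dot-product softmax readout are all assumptions chosen for brevity, and nothing here is learned.

```python
import math

# Hypothetical knowledge: word -> (emotion polarity, personality trait).
# Illustrative only; not taken from the personality knowledge graph of the paper.
TOY_KNOWLEDGE = {
    "hate": ("negative", "neuroticism"),
    "hell": ("negative", "neuroticism"),
    "love": ("positive", "agreeableness"),
}


def build_graph(tokens, knowledge=TOY_KNOWLEDGE):
    """Build typed nodes (type, name) and directed edges (source, target)."""
    nodes, edges = set(), set()
    for token in tokens:
        if token not in knowledge:
            continue  # words outside the lexicon are skipped in this sketch
        emotion, trait = knowledge[token]
        word_n, emo_n, trait_n = ("word", token), ("emotion", emotion), ("trait", trait)
        nodes.update([word_n, emo_n, trait_n])
        edges.update([(word_n, emo_n), (word_n, trait_n), (emo_n, trait_n)])
    return sorted(nodes), sorted(edges)


def message_pass(nodes, edges, features):
    """One synchronous round: each node averages itself with its in-neighbours."""
    incoming = {node: [] for node in nodes}
    for source, target in edges:
        incoming[target].append(source)
    updated = {}
    for node in nodes:
        group = [features[node]] + [features[src] for src in incoming[node]]
        dim = len(features[node])
        updated[node] = [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
    return updated


def readout(features, query):
    """Softmax-weighted sum of node vectors; weights indicate node importance."""
    scores = {n: sum(q * x for q, x in zip(query, vec)) for n, vec in features.items()}
    top = max(scores.values())
    exp = {n: math.exp(s - top) for n, s in scores.items()}
    norm = sum(exp.values())
    weights = {n: e / norm for n, e in exp.items()}
    doc = [sum(weights[n] * features[n][d] for n in features) for d in range(len(query))]
    return doc, weights


# Assumed initial vectors, one per node type (a real model would use embeddings).
TYPE_INIT = {"word": [1.0, 0.0], "emotion": [0.0, 1.0], "trait": [0.0, 0.0]}

nodes, edges = build_graph("i hate this hell".split())
features = {node: TYPE_INIT[node[0]] for node in nodes}
hidden = message_pass(nodes, edges, features)
doc_embedding, weights = readout(hidden, query=[1.0, 1.0])
```

In this sketch, the readout weights are the only source of interpretability: inspecting `weights` shows which word, emotion, or trait nodes contributed most to the document vector. A learned model would replace the fixed `query` and the averaging update with trainable parameters.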

In summary, our contributions are as follows.

• To the best of our knowledge, this is the first work that integrates lexical psycholinguistic knowledge and text semantics information into a neural model to achieve interpretable personality detection. Moreover, it provides support for lexical hypotheses in psycholinguistic research from a computational linguistic perspective.
• Our model incorporates the distributed representations of words and the lexical knowledge by learning personality-aware word embeddings. In addition, we construct a heterogeneous personality word graph and develop a message-passing network, which extracts explicit lexicon and knowledge relations via the interactions among heterogeneous graph nodes. All heterogeneous nodes are selectively incorporated as knowledge-guided document embeddings for personality understanding and interpretation through a carefully designed graph readout layer.
• Experimental results on four public datasets demonstrate that our model outperforms the state-of-the-art techniques in terms of personality detection. Our model can help various types of social software mine user information and help psychologists study and analyze personality traits in depth.

The rest of this paper is organized as follows. Section 2 introduces the work related to personality detection. Section 3 provides the problem formulation, and Section 4 describes the proposed method. Section 5 presents and analyzes the experimental results obtained on four public datasets. Finally, Section 6 outlines the conclusion and future research.

Related work

Owing to its wide potential application value, personality detection has gradually attracted the attention of computer science researchers [24], [25]. Although personality detection in social networks is in its infancy, scholars have achieved fruitful results from multiple research perspectives. In view of the challenges mentioned in the previous section, this section focuses on the achievements of scholars in terms of four aspects: (1) psycholinguistic lexicon-based, (2) neural language

Problem formulation

Personality detection can be formulated as a user-level multilabel classification problem. Mathematically, we are given a user-generated document $D = \{s_{1}, s_{2}, \dots, s_{n}\}$, where $s_{i} = \{w_{i}^{1}, w_{i}^{2}, \dots, w_{i}^{m}\}$ is the $i$th sentence with $m$ words. Our goal is to detect $T$ personality traits $Y = \{y^{t}\}_{t = 1}^{T}$ for this user based on document $D$, where $y^{t} \in \{0, 1\}$ is a binary variable.
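As an illustration of this formulation, the sketch below represents a document $D$ as a list of tokenized sentences and the label set $Y$ as one binary value per trait. The class `UserDocument`, the choice of $T = 5$ with the Big Five trait names, the 0.5 threshold, and the binary cross-entropy objective are assumptions made for illustration; the text above does not specify a loss function, and datasets with a different trait inventory would use a different $T$.

```python
import math
from dataclasses import dataclass
from typing import List

# Assumed trait inventory (T = 5); other datasets may define T differently.
TRAITS = ["openness", "conscientiousness", "extroversion",
          "agreeableness", "neuroticism"]


@dataclass
class UserDocument:
    sentences: List[List[str]]  # D = {s_1, ..., s_n}; each s_i is a list of words
    labels: List[int]           # Y = {y^t}, one binary label per trait


def predict(probabilities, threshold=0.5):
    """Turn T per-trait probabilities into T binary decisions."""
    return [1 if p >= threshold else 0 for p in probabilities]


def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean binary cross-entropy over the T traits (an assumed objective)."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


doc = UserDocument(
    sentences=[["i", "hate", "mondays"], ["we", "love", "family"]],
    labels=[1, 0, 0, 1, 1],
)
probs = [0.9, 0.2, 0.4, 0.7, 0.6]  # hypothetical model outputs, one per trait
predicted = predict(probs)
loss = binary_cross_entropy(probs, doc.labels)
```

Treating each trait as an independent binary decision is what makes the problem multilabel rather than multiclass: a user may be positive for several traits at once.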

Proposed method

In this section, we present our GNN-based personality detection model guided by lexical psycholinguistic knowledge. Our model takes full advantage of personality lexicons as a bridge to enrich the representations of personality documents with the incorporation of heterogeneous external knowledge. As illustrated in Fig. 2, our model contains three main parts.

(1) Personality-aware word embedding: To fully fuse lexical psycholinguistic knowledge and text semantics, we design personality word position

Experimental settings

In this section, we introduce the datasets used in the experiments and present the baseline methods. After introducing the parameter settings, we present the evaluation metrics used to assess the performance of the models.

Conclusion and future research

In this paper, we present a novel personality detection model guided by lexical psycholinguistic knowledge, which not only achieves accurate personality detection results for social media texts but also enables us to explore the interpretability of personality traits via word knowledge. First, we compile a personality dictionary containing 2043 words and learn personality-aware word embeddings to refine more accurate word vectors. Then, in combination with the background psychological

CRediT authorship contribution statement

Yangfu Zhu: Conceptualization, Methodology, Software, Writing – original draft. Linmei Hu: Methodology, Writing – review & editing. Nianwen Ning: Writing – review & editing. Wei Zhang: Writing – review & editing. Bin Wu: Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the NSFC-General Technology Basic Research Joint Funds, China under Grant (U1936220), the National Natural Science Foundation of China under Grant (61972047) and the National Key Research and Development Program of China (2018YFC0831500).

References (44)

Dhelim S. et al., Mining user interest based on personality-aware hybrid filtering in social networks, Knowl.-Based Syst. (2020)
Yin C. et al., Reposting negative information on microblogs: Do personality traits matter?, Inf. Process. Manage. (2020)
Wang H. et al., Cross-domain recommendation with user personality, Knowl.-Based Syst. (2021)
Tandera T. et al., Personality prediction system from Facebook users, Procedia Comput. Sci. (2017)
Ren Z. et al., A sentiment-aware deep learning approach for personality detection from text, Inf. Process. Manage. (2021)
Han S. et al., Knowledge of words: An interpretable approach for personality recognition from social media, Knowl.-Based Syst. (2020)
Yarkoni T., Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers, J. Res. Personal. (2010)
Xue G. et al., Dynamic network embedding survey, Neurocomputing (2022)
Hu G. et al., FSS-GCN: A graph convolutional networks with fusion of semantic and structure for emotion cause analysis, Knowl.-Based Syst. (2021)
Song X. et al., Jkt: A joint graph convolutional network based deep knowledge tracing, Inform. Sci. (2021)
Zhao P. et al., Modeling sentiment dependencies with graph convolutional networks for aspect-level sentiment classification, Knowl.-Based Syst. (2020)
T. Shen, J. Jia, Y. Li, Y. Ma, Y. Bu, H. Wang, B. Chen, T.-S. Chua, W. Hall, Peia: Personality and emotion integrated...
Xu C. et al., Recommendation by users’ multimodal preferences for smart city applications, IEEE Trans. Ind. Inf. (2020)
Guo A. et al., From affect, behavior, and cognition to personality: an integrated personal character model for individual-like intelligent artifacts, World Wide Web (2020)
Pennebaker J.W. et al., Linguistic inquiry and word count: LIWC 2001, Mahwah: Lawrence Erlbaum Assoc. (2001)
Coltheart M., The MRC psycholinguistic database, Q. J. Exp. Psychol. A (1981)
P.-H. Arnoux, A. Xu, N. Boyette, J. Mahmud, R. Akkiraju, V. Sinha, 25 tweets to know you: A new model to predict...
Xue D. et al., Deep learning-based personality recognition from text posts of online social networks, Appl. Intell. (2018)
Majumder N. et al., Deep learning-based document modeling for personality detection from text, IEEE Intell. Syst. (2017)
Sun X. et al., Who am I? Personality detection based on deep learning for texts
Poria S. et al., Common sense knowledge based personality recognition from text
Mehta Y. et al., Bottom-up and top-down: Predicting personality with psycholinguistic and language model features

