
Cognition-driven multimodal personality classification

  • Research Paper
  • Published in Science China Information Sciences

Abstract

In this paper, we address a novel task, cognition-driven multimodal personality classification (CMPC), which aims to infer the personality traits (e.g., romantic, humorous, and gloomy) that a person exhibits in real time, from the perspective of cognitive psychology. The task is motivated by a cognitive-difference phenomenon: people with different personality traits tend to give different personality-oriented textual descriptions when observing an image. To tackle the inherent noise in the CMPC task, we propose a tailored reinforcement learning approach, multi-agent SelectNet, which integrates opinion-word and image-region selection strategies to pick out the informative opinion-word and image-region features for CMPC. To verify the effectiveness of our approach, we construct six multimodal personality classification datasets and conduct extensive experiments on them. Experimental results demonstrate that our approach significantly outperforms strong competitors, including state-of-the-art unimodal and multimodal approaches.
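The core idea sketched in the abstract is that two cooperating agents learn to discard noisy opinion words and image regions before a shared personality classifier makes its prediction. The snippet below is a minimal illustration of that general pattern, not the authors' actual SelectNet implementation: it assumes PyTorch, hypothetical module names, BERT-like (768-d) text features and ResNet-like (2048-d) region features, and a simple log-likelihood reward with plain REINFORCE and no baseline.

```python
# Illustrative sketch only; the paper's multi-agent SelectNet is not reproduced here.
# Two REINFORCE-trained "selector" agents keep or drop opinion-word and image-region
# features before a shared personality classifier. Names, dimensions, and the reward
# design are assumptions, not details taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Selector(nn.Module):
    """Policy network emitting a keep/drop probability per feature vector."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, feats):                                   # feats: (batch, n, dim)
        probs = torch.sigmoid(self.score(feats)).squeeze(-1)    # (batch, n)
        actions = torch.bernoulli(probs)                        # sampled keep/drop mask
        log_prob = (actions * torch.log(probs + 1e-8)
                    + (1 - actions) * torch.log(1 - probs + 1e-8)).sum(-1)
        return actions, log_prob


class SelectNetSketch(nn.Module):
    def __init__(self, text_dim=768, img_dim=2048, n_classes=5):
        super().__init__()
        self.word_agent = Selector(text_dim)
        self.region_agent = Selector(img_dim)
        self.classifier = nn.Linear(text_dim + img_dim, n_classes)

    def forward(self, word_feats, region_feats):
        # Each agent masks out the features it judges uninformative (noisy).
        w_act, w_logp = self.word_agent(word_feats)
        r_act, r_logp = self.region_agent(region_feats)
        w_pooled = (word_feats * w_act.unsqueeze(-1)).mean(1)
        r_pooled = (region_feats * r_act.unsqueeze(-1)).mean(1)
        logits = self.classifier(torch.cat([w_pooled, r_pooled], dim=-1))
        return logits, w_logp + r_logp


def reinforce_step(model, optimizer, word_feats, region_feats, labels):
    logits, log_prob = model(word_feats, region_feats)
    # Reward: log-likelihood of the true personality label under the classifier.
    reward = -F.cross_entropy(logits, labels, reduction="none").detach()
    loss = F.cross_entropy(logits, labels) - (reward * log_prob).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because the sampled keep/drop masks are non-differentiable, the selectors are updated only through the policy-gradient term while the classifier is trained with the usual cross-entropy loss; the paper's actual reward shaping and agent coordination may differ from this sketch.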

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62006166, 62076175, 62076176), the China Postdoctoral Science Foundation (Grant No. 2019M661930), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). We thank our anonymous reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Jingjing Wang.

Additional information

Supporting information

Appendixes A and B. The supporting information is available online at info.scichina.com and link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.

About this article

Cite this article

Gao, X., Wang, J., Li, S. et al. Cognition-driven multimodal personality classification. Sci. China Inf. Sci. 65, 202104 (2022). https://doi.org/10.1007/s11432-020-3307-3
