Multimodal Analysis and Prediction of Latent User Dimensions

  • Conference paper
  • Social Informatics (SocInfo 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10539)

Abstract

Humans upload over 1.8 billion digital images to the internet each day, yet the relationship between the images that a person shares with others and his/her psychological characteristics remains poorly understood. In the current research, we analyze the relationship between images, captions, and the latent demographic/psychological dimensions of personality and gender. We consider a wide range of automatically extracted visual and textual features of images/captions that are shared by a large sample of individuals (\(N \approx 1,350\)). Using correlational methods, we identify several visual and textual properties that show strong relationships with individual differences between participants. Additionally, we explore the task of predicting user attributes using a multimodal approach that simultaneously leverages images and their captions. Results from these experiments suggest that images alone have significant predictive power and, additionally, multimodal methods outperform both visual features and textual features in isolation when attempting to predict individual differences.
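
To make the multimodal setup concrete, the following is a minimal early-fusion sketch: per-user image features and caption features are concatenated and fed to a regularized linear model. The choice of Ridge regression, plain concatenation, and 10-fold cross-validation are illustrative assumptions, not the paper's exact pipeline.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    def multimodal_trait_score(visual_feats, textual_feats, trait_scores):
        """Predict one latent dimension (e.g., a Big Five trait) from fused features.

        visual_feats:  (n_users, d_img) image-derived features
        textual_feats: (n_users, d_txt) caption-derived features
        trait_scores:  (n_users,) self-reported trait scores
        """
        fused = np.hstack([visual_feats, textual_feats])  # early fusion by concatenation
        return cross_val_score(Ridge(alpha=1.0), fused, trait_scores,
                               cv=10, scoring="r2").mean()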

Notes

  1. This data was collected under IRB approval at UT Austin.

  2. For prediction experiments, we use a slightly different version of dominance (\(Dominance = 0.76y + 0.32s\)), as formulated in [24].

  3. We use the OpenCV probabilistic Hough transform function with an accumulator threshold of 50, a minimum line length of 50, and a maximum line gap of 10 (see the sketch after this list).

  4. We use the OpenCV Hough circles function, with a minimum distance of 8 and method-specific parameters set to 170 and 45 (see the sketch after this list).

  5. We use the Edge Boxes parameters \(\alpha = 0.65\) and \(\beta = 0.55\) (also covered in the sketch after this list).

  6. Available at https://code.google.com/archive/p/word2vec/ (see the loading sketch after this list).
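
A minimal sketch (not the authors' code) of the visual features described in notes 3-5, using OpenCV in Python. Only the numeric parameters come from the notes; the grayscale conversion, Canny thresholds, dp value, and the structured-edge model path are assumptions, and the opencv-contrib Edge Boxes module stands in here for the original Edge Boxes toolbox of Zitnick and Dollár [46].

    import cv2
    import numpy as np

    def hough_counts(gray):
        """Count straight lines (note 3) and circles (note 4) in a grayscale image."""
        # Probabilistic Hough transform: accumulator threshold 50,
        # minimum line length 50, maximum line gap 10.
        edges = cv2.Canny(gray, 50, 150)  # Canny thresholds are an assumption
        lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50,
                                minLineLength=50, maxLineGap=10)

        # Hough circle detection: minimum distance of 8 between centers; the
        # method-specific parameters 170 and 45 are the upper Canny threshold
        # and the accumulator threshold of cv2.HOUGH_GRADIENT. dp=1 is assumed.
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 8,
                                   param1=170, param2=45)

        n_lines = 0 if lines is None else len(lines)
        n_circles = 0 if circles is None else circles.shape[1]
        return n_lines, n_circles

    def edge_box_proposals(bgr):
        """Object proposals with Edge Boxes (note 5), alpha = 0.65, beta = 0.55.

        Requires the opencv-contrib build; the model file path is an assumed placeholder.
        """
        model = cv2.ximgproc.createStructuredEdgeDetection("model.yml.gz")
        rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
        edges = model.detectEdges(rgb)
        orientation = model.computeOrientation(edges)
        edges_nms = model.edgesNms(edges, orientation)

        eb = cv2.ximgproc.createEdgeBoxes()
        eb.setAlpha(0.65)  # step size of the sliding-window search
        eb.setBeta(0.55)   # NMS threshold between proposals
        result = eb.getBoundingBoxes(edges_nms, orientation)
        # Newer OpenCV versions return (boxes, scores); older ones return boxes only.
        return result[0] if isinstance(result, tuple) else result

The word2vec vectors from note 6 can be turned into caption-level features as in the following sketch; gensim, the GoogleNews-vectors-negative300.bin file name, and the simple vector-averaging step are assumptions for illustration, not details taken from the paper.

    import numpy as np
    from gensim.models import KeyedVectors

    # Pretrained embeddings distributed via the word2vec archive linked in note 6;
    # the exact file used by the authors is an assumption.
    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    def caption_embedding(caption):
        """Average the word2vec vectors of in-vocabulary caption tokens."""
        vecs = [w2v[token] for token in caption.lower().split() if token in w2v]
        return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)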

References

  1. Bentivogli, L., Forner, P., Magnini, B., Pianta, E.: Revising the WordNet domains hierarchy: semantics, coverage and balancing. In: Proceedings of the Workshop on Multilingual Linguistic Resources, pp. 101–108. Association for Computational Linguistics (2004)

  2. Boyd, R.L.: Psychological text analysis in the digital humanities. In: Hai-Jew, S. (ed.) Data Analytics in the Digital Humanities. MMSA, pp. 161–189. Springer Science, New York City (2017). doi:10.1007/978-3-319-54499-1_7. In Press

  3. Bruni, E., Tran, N.K., Baroni, M.: Multimodal distributional semantics. J. Artif. Intell. Res. 49, 1–47 (2014)

  4. Paice, C.D.: Another stemmer. In: ACM SIGIR Forum, vol. 24, pp. 56–61 (1990)

  5. Ciaramita, M., Johnson, M.: Supersense tagging of unknown nouns in WordNet. In: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pp. 168–175. Association for Computational Linguistics (2003)

  6. Coltheart, M.: The MRC psycholinguistic database. Q. J. Exp. Psychol. 33(4), 497–505 (1981)

  7. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)

  8. Fellbaum, C.: WordNet. Wiley Online Library, Hoboken (1998)

  9. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 363–370 (2005)

  10. Gosling, S.D., Craik, K.H., Martin, N.R., Pryor, M.R.: Material attributes of personal living spaces. Home Cultures 2(1), 51–87 (2005)

  11. Gosling, S.D., Ko, S.J., Mannarelli, T., Morris, M.E.: A room with a cue: personality judgments based on offices and bedrooms. J. Personal. Soc. Psychol. 82(3), 379 (2002)

  12. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675–678 (2014)

  13. John, O.P., Srivastava, S.: The big five trait taxonomy: history, measurement, and theoretical perspectives. Handb. Personal.: Theory Res. 2(1999), 102–138 (1999)

  14. Johnson, J., Karpathy, A., Fei-Fei, L.: DenseCap: Fully convolutional localization networks for dense captioning. arXiv preprint arXiv:1511.07571 (2015)

  15. Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137 (2015)

  16. Kelly, E.L., Conley, J.J.: Personality and compatibility: a prospective analysis of marital stability and marital satisfaction. J. Personal. Soc. Psychol. 52(1), 27 (1987)

  17. Khouw, N.: The meaning of color for gender. In: Color Matters - Research (2002)

  18. Koppel, M., Argamon, S., Shimoni, A.R.: Automatically categorizing written texts by author gender. Literary Linguist. Comput. 17(4), 401–412 (2002)

  19. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

  20. Li, J.J., Nenkova, A.: Fast and accurate prediction of sentence specificity. In: AAAI, pp. 2281–2287 (2015)

  21. Liu, H., Mihalcea, R.: Of men, women, and computers: data-driven gender modeling for improved user interfaces. In: International Conference on Weblogs and Social Media (2007)

  22. Liu, L., Preotiuc-Pietro, D., Samani, Z.R., Moghaddam, M.E., Ungar, L.: Analyzing personality through social media profile picture choice. In: Tenth International AAAI Conference on Web and Social Media (2016)

  23. Lovato, P., Bicego, M., Segalin, C., Perina, A., Sebe, N., Cristani, M.: Faved! biometrics: tell me which image you like and I’ll tell you who you are. IEEE Trans. Inf. Forensics Secur. 9(3), 364–374 (2014)

  24. Machajdik, J., Hanbury, A.: Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 83–92. ACM (2010)

  25. Mairesse, F., Walker, M.A., Mehl, M.R., Moore, R.K.: Using linguistic cues for the automatic recognition of personality in conversation and text. J. Artif. Intell. Res. 30, 457–500 (2007)

  26. Marcus, M.P., Marcinkiewicz, M.A., Santorini, B.: Building a large annotated corpus of English: the Penn treebank. Comput. Linguist. 19(2), 313–330 (1993)

  27. Mathias, M., Benenson, R., Pedersoli, M., van Gool, L.: Face detection without bells and whistles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 720–735. Springer, Cham (2014). doi:10.1007/978-3-319-10593-2_47

  28. McCrae, R.R., John, O.P.: An introduction to the five-factor model and its applications. J. Personal. 60(2), 175–215 (1992)

  29. Meeker, M.: Internet trends 2014-Code conference (2014). Accessed 28 May 2014

  30. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  31. Newman, M.L., Groom, C.J., Handelman, L.D., Pennebaker, J.W.: Gender differences in language use: an analysis of 14,000 text samples. Discourse Process. 45(3), 211–236 (2008)

  32. Oberlander, J., Nowson, S.: Whose thumb is it anyway? Classifying author personality from weblog text. In: COLING/ACL, pp. 627–634 (2006)

  33. Park, G., Schwartz, H.A., Eichstaedt, J.C., Kern, M.L., Kosinski, M., Stillwell, D.J., Ungar, L.H., Seligman, M.E.P.: Automatic personality assessment through social media language. J. Personal. Soc. Psychol. 108(6), 934–952 (2014)

  34. Pennebaker, J.W., King, L.A.: Linguistic styles: language use as an individual difference. J. Personal. Soc. Psychol. 77(6), 1296 (1999)

  35. Redi, M., Quercia, D., Graham, L., Gosling, S.: Like partying? Your face says it all. Predicting the ambiance of places with profile pictures. In: Ninth International AAAI Conference on Web and Social Media (2015)

  36. Roberts, B., Kuncel, N., Shiner, R., Caspi, A., Goldberg, L.: The power of personality: the comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspect. Psychol. Sci. 4(2), 313–345 (2007)

  37. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)

  38. Segalin, C., Cheng, D.S., Cristani, M.: Social profiling through image understanding: personality inference using convolutional neural networks. Comput. Vis. Image Understanding 156, 34–50 (2016)

  39. Segalin, C., Perina, A., Cristani, M., Vinciarelli, A.: The pictures we like are our image: continuous mapping of favorite pictures into self-assessed and attributed personality traits. IEEE Trans. Affect. Comput. 8(2), 268–285 (2016)

  40. Valdez, P., Mehrabian, A.: Effects of color on emotions. J. Exp. Psychol.: Gen. 123(4), 394 (1994)

  41. Van De Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. IEEE Trans. Image Process. 18(7), 1512–1523 (2009)

  42. Yoder, P.J., Blackford, J.U., Waller, N.G., Kim, G.: Enhancing power while controlling family-wise error: an illustration of the issues using electrocortical studies. J. Clin. Exp. Neuropsychol. 26(3), 320–331 (2004)

  43. You, Q., Bhatia, S., Sun, T., Luo, J.: The eyes of the beholder: gender prediction using images posted in online social networks. In: 2014 IEEE International Conference on Data Mining Workshop, pp. 1026–1030. IEEE (2014)

  44. Zhang, D., Islam, M.M., Lu, G.: A review on automatic image annotation techniques. Pattern Recogn. 45(1), 346–362 (2012)

  45. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems, pp. 487–495 (2014)

  46. Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Cham (2014). doi:10.1007/978-3-319-10602-1_26

Acknowledgments

This material is based in part upon work supported by the National Science Foundation (NSF #1344257), the John Templeton Foundation (#48503), and the Michigan Institute for Data Science (MIDAS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF, the John Templeton Foundation, or MIDAS. We would like to thank Chris Pittman for his aid with the data collection, Shibamouli Lahiri for the readability code, and Steven R. Wilson for the implementation of Mairesse et al.

Author information

Correspondence to Laura Wendlandt.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wendlandt, L., Mihalcea, R., Boyd, R.L., Pennebaker, J.W. (2017). Multimodal Analysis and Prediction of Latent User Dimensions. In: Ciampaglia, G., Mashhadi, A., Yasseri, T. (eds) Social Informatics. SocInfo 2017. Lecture Notes in Computer Science, vol 10539. Springer, Cham. https://doi.org/10.1007/978-3-319-67217-5_20

  • DOI: https://doi.org/10.1007/978-3-319-67217-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67216-8

  • Online ISBN: 978-3-319-67217-5

  • eBook Packages: Computer Science, Computer Science (R0)
