Multimodal Analysis and Prediction of Latent User Dimensions

  • Conference paper
  • Social Informatics (SocInfo 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10539)

Abstract

Humans upload over 1.8 billion digital images to the internet each day, yet the relationship between the images that a person shares with others and his/her psychological characteristics remains poorly understood. In the current research, we analyze the relationship between images, captions, and the latent demographic/psychological dimensions of personality and gender. We consider a wide range of automatically extracted visual and textual features of images/captions that are shared by a large sample of individuals (\(N \approx 1,350\)). Using correlational methods, we identify several visual and textual properties that show strong relationships with individual differences between participants. Additionally, we explore the task of predicting user attributes using a multimodal approach that simultaneously leverages images and their captions. Results from these experiments suggest that images alone have significant predictive power and, additionally, multimodal methods outperform both visual features and textual features in isolation when attempting to predict individual differences.
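
To make the multimodal setup concrete, the following is a minimal early-fusion sketch: per-user image features and caption features are concatenated and fed to a regularized linear model. The choice of Ridge regression, plain concatenation, and 10-fold cross-validation are illustrative assumptions, not the paper's exact pipeline.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    def multimodal_trait_score(visual_feats, textual_feats, trait_scores):
        """Predict one latent dimension (e.g., a Big Five trait) from fused features.

        visual_feats:  (n_users, d_img) image-derived features
        textual_feats: (n_users, d_txt) caption-derived features
        trait_scores:  (n_users,) self-reported trait scores
        """
        fused = np.hstack([visual_feats, textual_feats])  # early fusion by concatenation
        return cross_val_score(Ridge(alpha=1.0), fused, trait_scores,
                               cv=10, scoring="r2").mean()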

Notes

  1. This data was collected under IRB approval at UT Austin.

  2. For prediction experiments, we use a slightly different version of dominance (\(Dominance = 0.76y + 0.32s\)), as formulated in [24].

  3. We use the OpenCV probabilistic Hough transform function with an accumulator threshold of 50, a minimum line length of 50, and a maximum line gap of 10 (see the sketch after this list).

  4. We use the OpenCV Hough circles function, with a minimum distance of 8 and method-specific parameters set to 170 and 45 (see the sketch after this list).

  5. We use the Edge Boxes parameters \(\alpha = 0.65\) and \(\beta = 0.55\) (also covered in the sketch after this list).

  6. Available at https://code.google.com/archive/p/word2vec/ (see the loading sketch after this list).
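
A minimal sketch (not the authors' code) of the visual features described in notes 3-5, using OpenCV in Python. Only the numeric parameters come from the notes; the grayscale conversion, Canny thresholds, dp value, and the structured-edge model path are assumptions, and the opencv-contrib Edge Boxes module stands in here for the original Edge Boxes toolbox of Zitnick and Dollár [46].

    import cv2
    import numpy as np

    def hough_counts(gray):
        """Count straight lines (note 3) and circles (note 4) in a grayscale image."""
        # Probabilistic Hough transform: accumulator threshold 50,
        # minimum line length 50, maximum line gap 10.
        edges = cv2.Canny(gray, 50, 150)  # Canny thresholds are an assumption
        lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50,
                                minLineLength=50, maxLineGap=10)

        # Hough circle detection: minimum distance of 8 between centers; the
        # method-specific parameters 170 and 45 are the upper Canny threshold
        # and the accumulator threshold of cv2.HOUGH_GRADIENT. dp=1 is assumed.
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 8,
                                   param1=170, param2=45)

        n_lines = 0 if lines is None else len(lines)
        n_circles = 0 if circles is None else circles.shape[1]
        return n_lines, n_circles

    def edge_box_proposals(bgr):
        """Object proposals with Edge Boxes (note 5), alpha = 0.65, beta = 0.55.

        Requires the opencv-contrib build; the model file path is an assumed placeholder.
        """
        model = cv2.ximgproc.createStructuredEdgeDetection("model.yml.gz")
        rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
        edges = model.detectEdges(rgb)
        orientation = model.computeOrientation(edges)
        edges_nms = model.edgesNms(edges, orientation)

        eb = cv2.ximgproc.createEdgeBoxes()
        eb.setAlpha(0.65)  # step size of the sliding-window search
        eb.setBeta(0.55)   # NMS threshold between proposals
        result = eb.getBoundingBoxes(edges_nms, orientation)
        # Newer OpenCV versions return (boxes, scores); older ones return boxes only.
        return result[0] if isinstance(result, tuple) else result

The word2vec vectors from note 6 can be turned into caption-level features as in the following sketch; gensim, the GoogleNews-vectors-negative300.bin file name, and the simple vector-averaging step are assumptions for illustration, not details taken from the paper.

    import numpy as np
    from gensim.models import KeyedVectors

    # Pretrained embeddings distributed via the word2vec archive linked in note 6;
    # the exact file used by the authors is an assumption.
    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    def caption_embedding(caption):
        """Average the word2vec vectors of in-vocabulary caption tokens."""
        vecs = [w2v[token] for token in caption.lower().split() if token in w2v]
        return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)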

References

  1. Bentivogli, L., Forner, P., Magnini, B., Pianta, E.: Revising the WordNet domains hierarchy: semantics, coverage and balancing. In: Proceedings of the Workshop on Multilingual Linguistic Resources, pp. 101–108. Association for Computational Linguistics (2004)

  2. Boyd, R.L.: Psychological text analysis in the digital humanities. In: Hai-Jew, S. (ed.) Data Analytics in the Digital Humanities. MMSA, pp. 161–189. Springer Science, New York City (2017). doi:10.1007/978-3-319-54499-1_7. In Press

  3. Bruni, E., Tran, N.K., Baroni, M.: Multimodal distributional semantics. J. Artif. Intell. Res. 49, 1–47 (2014)

  4. Paice, C.D.: Another stemmer. In: ACM SIGIR Forum, vol. 24, pp. 56–61 (1990)

  5. Ciaramita, M., Johnson, M.: Supersense tagging of unknown nouns in WordNet. In: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pp. 168–175. Association for Computational Linguistics (2003)

  6. Coltheart, M.: The MRC psycholinguistic database. Q. J. Exp. Psychol. 33(4), 497–505 (1981)

  7. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)

  8. Fellbaum, C.: WordNet. Wiley Online Library, Hoboken (1998)

  9. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 363–370 (2005)

  10. Gosling, S.D., Craik, K.H., Martin, N.R., Pryor, M.R.: Material attributes of personal living spaces. Home Cultures 2(1), 51–87 (2005)

  11. Gosling, S.D., Ko, S.J., Mannarelli, T., Morris, M.E.: A room with a cue: personality judgments based on offices and bedrooms. J. Personal. Soc. Psychol. 82(3), 379 (2002)

  12. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675–678 (2014)

  13. John, O.P., Srivastava, S.: The big five trait taxonomy: history, measurement, and theoretical perspectives. Handb. Personal.: Theory Res. 2(1999), 102–138 (1999)

  14. Johnson, J., Karpathy, A., Fei-Fei, L.: DenseCap: Fully convolutional localization networks for dense captioning. arXiv preprint arXiv:1511.07571 (2015)

  15. Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137 (2015)

  16. Kelly, E.L., Conley, J.J.: Personality and compatibility: a prospective analysis of marital stability and marital satisfaction. J. Personal. Soc. Psychol. 52(1), 27 (1987)

  17. Khouw, N.: The meaning of color for gender. In: Color Matters - Research (2002)

  18. Koppel, M., Argamon, S., Shimoni, A.R.: Automatically categorizing written texts by author gender. Literary Linguist. Comput. 17(4), 401–412 (2002)

  19. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

  20. Li, J.J., Nenkova, A.: Fast and accurate prediction of sentence specificity. In: AAAI, pp. 2281–2287 (2015)

  21. Liu, H., Mihalcea, R.: Of men, women, and computers: data-driven gender modeling for improved user interfaces. In: International Conference on Weblogs and Social Media (2007)

  22. Liu, L., Preotiuc-Pietro, D., Samani, Z.R., Moghaddam, M.E., Ungar, L.: Analyzing personality through social media profile picture choice. In: Tenth International AAAI Conference on Web and Social Media (2016)

  23. Lovato, P., Bicego, M., Segalin, C., Perina, A., Sebe, N., Cristani, M.: Faved! biometrics: tell me which image you like and I’ll tell you who you are. IEEE Trans. Inf. Forensics Secur. 9(3), 364–374 (2014)

  24. Machajdik, J., Hanbury, A.: Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 83–92. ACM (2010)

  25. Mairesse, F., Walker, M.A., Mehl, M.R., Moore, R.K.: Using linguistic cues for the automatic recognition of personality in conversation and text. J. Artif. Intell. Res. 30, 457–500 (2007)

  26. Marcus, M.P., Marcinkiewicz, M.A., Santorini, B.: Building a large annotated corpus of English: the Penn treebank. Comput. Linguist. 19(2), 313–330 (1993)

  27. Mathias, M., Benenson, R., Pedersoli, M., van Gool, L.: Face detection without bells and whistles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 720–735. Springer, Cham (2014). doi:10.1007/978-3-319-10593-2_47

  28. McCrae, R.R., John, O.P.: An introduction to the five-factor model and its applications. J. Personal. 60(2), 175–215 (1992)

  29. Meeker, M.: Internet trends 2014-Code conference (2014). Accessed 28 May 2014

  30. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  31. Newman, M.L., Groom, C.J., Handelman, L.D., Pennebaker, J.W.: Gender differences in language use: an analysis of 14,000 text samples. Discourse Process. 45(3), 211–236 (2008)

  32. Oberlander, J., Nowson, S.: Whose thumb is it anyway? Classifying author personality from weblog text. In: COLING/ACL, pp. 627–634 (2006)

  33. Park, G., Schwartz, H.A., Eichstaedt, J.C., Kern, M.L., Kosinski, M., Stillwell, D.J., Ungar, L.H., Seligman, M.E.P.: Automatic personality assessment through social media language. J. Personal. Soc. Psychol. 108(6), 934–952 (2014)

  34. Pennebaker, J.W., King, L.A.: Linguistic styles: language use as an individual difference. J. Personal. Soc. Psychol. 77(6), 1296 (1999)

  35. Redi, M., Quercia, D., Graham, L., Gosling, S.: Like partying? Your face says it all. Predicting the ambiance of places with profile pictures. In: Ninth International AAAI Conference on Web and Social Media (2015)

  36. Roberts, B., Kuncel, N., Shiner, R., Caspi, A., Goldberg, L.: The power of personality: the comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspect. Psychol. Sci. 4(2), 313–345 (2007)

  37. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)

  38. Segalin, C., Cheng, D.S., Cristani, M.: Social profiling through image understanding: personality inference using convolutional neural networks. Comput. Vis. Image Understanding 156, 34–50 (2016)

  39. Segalin, C., Perina, A., Cristani, M., Vinciarelli, A.: The pictures we like are our image: continuous mapping of favorite pictures into self-assessed and attributed personality traits. IEEE Trans. Affect. Comput. 8(2), 268–285 (2016)

  40. Valdez, P., Mehrabian, A.: Effects of color on emotions. J. Exp. Psychol.: Gen. 123(4), 394 (1994)

  41. Van De Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. IEEE Trans. Image Process. 18(7), 1512–1523 (2009)

  42. Yoder, P.J., Blackford, J.U., Waller, N.G., Kim, G.: Enhancing power while controlling family-wise error: an illustration of the issues using electrocortical studies. J. Clin. Exp. Neuropsychol. 26(3), 320–331 (2004)

  43. You, Q., Bhatia, S., Sun, T., Luo, J.: The eyes of the beholder: gender prediction using images posted in online social networks. In: 2014 IEEE International Conference on Data Mining Workshop, pp. 1026–1030. IEEE (2014)

  44. Zhang, D., Islam, M.M., Lu, G.: A review on automatic image annotation techniques. Pattern Recogn. 45(1), 346–362 (2012)

  45. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems, pp. 487–495 (2014)

  46. Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Cham (2014). doi:10.1007/978-3-319-10602-1_26

Acknowledgments

This material is based in part upon work supported by the National Science Foundation (NSF #1344257), the John Templeton Foundation (#48503), and the Michigan Institute for Data Science (MIDAS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF, the John Templeton Foundation, or MIDAS. We would like to thank Chris Pittman for his aid with the data collection, Shibamouli Lahiri for the readability code, and Steven R. Wilson for the implementation of Mairesse et al.

Author information

Correspondence to Laura Wendlandt.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wendlandt, L., Mihalcea, R., Boyd, R.L., Pennebaker, J.W. (2017). Multimodal Analysis and Prediction of Latent User Dimensions. In: Ciampaglia, G., Mashhadi, A., Yasseri, T. (eds) Social Informatics. SocInfo 2017. Lecture Notes in Computer Science, vol 10539. Springer, Cham. https://doi.org/10.1007/978-3-319-67217-5_20

  • DOI: https://doi.org/10.1007/978-3-319-67217-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67216-8

  • Online ISBN: 978-3-319-67217-5

  • eBook Packages: Computer Science, Computer Science (R0)
