Abstract
Humans communicate face-to-face through at least two modalities: the auditory modality, speech, and the visual modality, gestures, which comprise, for example, gaze, facial expressions, head movements, and hand gestures. The relation between speech and gesture is complex and depends in part on factors such as culture, the communicative situation, and the interlocutors and their relationship. Investigating these factors in real data is vital for studying multimodal communication and for building models of natural multimodal communicative interfaces able to interact with individuals of different ages, cultures, and needs. In this paper, we discuss to what extent big data “in the wild”, which are growing explosively on the internet, are useful for this purpose, also in light of the legal constraints on the use of personal data, including multimodal data downloaded from social media.
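To make the kind of automatic processing at stake concrete, below is a minimal sketch of per-frame face detection on a downloaded video, assuming OpenCV (opencv-python) is installed; the file name interaction.mp4 is a hypothetical, lawfully obtained recording, and a production pipeline would use richer extractors (e.g., facial action units or body keypoints) rather than this deliberately simple detector.

```python
# Minimal sketch: count visible faces per frame of a video "in the wild".
# Assumes opencv-python is installed; "interaction.mp4" is a hypothetical file.
import cv2

# Haar cascade face detector shipped with OpenCV (a deliberately simple choice).
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("interaction.mp4")
frame_idx, face_counts = 0, []
while True:
    ok, frame = cap.read()
    if not ok:  # end of video (or unreadable file)
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    face_counts.append((frame_idx, len(faces)))  # faces visible in this frame
    frame_idx += 1
cap.release()
print(f"processed {frame_idx} frames; sample counts: {face_counts[:5]}")
```

Even such a crude pass already produces personal data (detected faces), which is why the legal notes below matter for any corpus built from social media material.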
Notes
- 1.
- 2. In Article 4(7) of the GDPR, the controller is defined as “the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data; where the purposes and means of such processing are determined by Union or Member State law, the controller or the specific criteria for its nomination may be provided for by Union or Member State law”.
- 3. This includes all information about (i) the identity and the contact details of the controller, (ii) the contact details of the data protection officer (where applicable), (iii) the purposes of the processing for which the personal data are intended and the legal basis for the processing, (iv) the recipients or categories of recipients of the personal data (if any), (v) the fact (when relevant) that the controller intends to transfer personal data to a third country or international organisation, (vi) the period for which the personal data will be stored, or if that is not possible, the criteria used to determine that period, (vii) the existence of the right to request from the controller access to and rectification or erasure of personal data or restriction of processing concerning the data subject or to object to processing as well as the right to data portability, (viii) in cases when the processing is based on consent, the existence of the right to withdraw consent at any time, without affecting the lawfulness of processing based on consent before its withdrawal, (ix) the right to lodge a complaint with a supervisory authority, and (x) the existence of automated decision-making, including profiling (if any). A sketch of how these items might be recorded as corpus metadata follows these notes.
- 4. Cf. Judgment of 6 November 2003, Lindqvist (C-101/01, EU:C:2003:596, paragraph 47).
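Purely as an illustration (and not a legal template), the sketch below shows how a corpus-collection tool might record the Article 13 items enumerated in note 3 as structured metadata; all field names are our own hypothetical choices, not terms from the regulation.

```python
# Hypothetical record of the GDPR Article 13 disclosure items (see note 3).
# Field names are illustrative only; they do not come from the regulation.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class GdprDisclosure:
    controller_identity: str                 # (i) identity and contact details of the controller
    dpo_contact: Optional[str] = None        # (ii) data protection officer, where applicable
    purposes_and_legal_basis: str = ""       # (iii) purposes and legal basis of the processing
    recipients: List[str] = field(default_factory=list)  # (iv) recipients, if any
    third_country_transfer: bool = False     # (v) intended transfer to a third country
    storage_period: str = ""                 # (vi) storage period, or criteria to determine it
    subject_rights_notice: bool = True       # (vii) access, rectification, erasure, portability
    consent_withdrawable: bool = True        # (viii) right to withdraw consent at any time
    complaint_authority: str = ""            # (ix) supervisory authority for complaints
    automated_decisions: bool = False        # (x) automated decision-making, incl. profiling
```

Storing such a record alongside every downloaded item would make it possible to audit, for each data subject, which disclosures were made and on what legal basis the material is processed.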
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Navarretta, C., Oemig, L. (2019). Big Data and Multimodal Communication: A Perspective View. In: Esposito, A., Esposito, A., Jain, L. (eds) Innovations in Big Data Mining and Embedded Knowledge. Intelligent Systems Reference Library, vol 159. Springer, Cham. https://doi.org/10.1007/978-3-030-15939-9_9
DOI: https://doi.org/10.1007/978-3-030-15939-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15938-2
Online ISBN: 978-3-030-15939-9