MCBF: Multimodal Corpora Building Framework

Caschera, Maria Chiara; D’Ulizia, Arianna; Ferri, Fernando; Grifoni, Patrizia

doi:10.1007/978-3-319-43808-5_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9561)

Included in the following conference series:

Language and Technology Conference

Abstract

Designing an effective multimodal human-machine interaction environment requires addressing crucial issues such as the correct interpretation of complex user input arriving over different modal channels. In this context, corpora of multimodal sentences are very important because they make it possible to integrate properties and linguistic knowledge that are not formalised in the grammar. This paper presents a framework for dynamic multimodal corpus building that semi-automates the extraction of syntactic and semantic information from multimodal dialogues, using both grammar inference and interpretation methodologies based on hidden Markov models (HMMs). The method relies on a Multimodal Attribute Grammar and on an HMM-based approach to annotate new multimodal sentences syntactically and semantically. It improves human-computer dialogue because the multimodal corpus evolves, adapting itself to dynamic changes in human-computer interaction.
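
As a rough illustration of the HMM-based annotation idea mentioned in the abstract, the sketch below decodes the most likely sequence of syntactic labels for a short multimodal sentence with the standard Viterbi algorithm. The label set, the modality-tagged tokens (such as `speech:call`), and every probability are hypothetical placeholders chosen for the example, not values or names from the paper. The Multimodal Attribute Grammar component is not modelled here.

```python
import math

# Hypothetical toy model (assumption): hidden states are syntactic roles,
# observations are modality-tagged input elements. All numbers are made up.
STATES = ["VERB", "DEICTIC", "NOUN"]

START_P = {"VERB": 0.6, "DEICTIC": 0.3, "NOUN": 0.1}

TRANS_P = {
    "VERB": {"VERB": 0.1, "DEICTIC": 0.6, "NOUN": 0.3},
    "DEICTIC": {"VERB": 0.1, "DEICTIC": 0.1, "NOUN": 0.8},
    "NOUN": {"VERB": 0.4, "DEICTIC": 0.3, "NOUN": 0.3},
}

EMIT_P = {
    "VERB": {"speech:call": 0.8, "speech:this": 0.1,
             "gesture:point": 0.05, "sketch:phone": 0.05},
    "DEICTIC": {"speech:call": 0.05, "speech:this": 0.6,
                "gesture:point": 0.3, "sketch:phone": 0.05},
    "NOUN": {"speech:call": 0.05, "speech:this": 0.05,
             "gesture:point": 0.3, "sketch:phone": 0.6},
}


def viterbi(observations):
    """Return the most probable state sequence for a non-empty observation list."""
    # Log probabilities avoid numeric underflow on longer sentences.
    scores = [{s: math.log(START_P[s]) + math.log(EMIT_P[s][observations[0]])
               for s in STATES}]
    backptr = [{}]
    for t in range(1, len(observations)):
        scores.append({})
        backptr.append({})
        for s in STATES:
            # Best predecessor of state s at time t.
            best_prev = max(
                STATES,
                key=lambda p: scores[t - 1][p] + math.log(TRANS_P[p][s]),
            )
            scores[t][s] = (scores[t - 1][best_prev]
                            + math.log(TRANS_P[best_prev][s])
                            + math.log(EMIT_P[s][observations[t]]))
            backptr[t][s] = best_prev
    # Trace back from the best final state.
    last = max(STATES, key=lambda s: scores[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    sentence = ["speech:call", "speech:this", "sketch:phone"]
    print(list(zip(sentence, viterbi(sentence))))
```

In a corpus-building setting, the decoded labels would serve as proposed annotations for a new multimodal sentence; how such proposals are validated and fed back into the corpus is specific to the framework and is not shown.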

Author information

Authors and Affiliations

National Research Council – IRPPS, Via Palestro, 32, 00185, Rome, Italy
Maria Chiara Caschera, Arianna D’Ulizia, Fernando Ferri & Patrizia Grifoni

Corresponding author

Correspondence to Fernando Ferri.

Editor information

Editors and Affiliations

Adam Mickiewicz University, Poznań, Poland
Zygmunt Vetulani
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI GmbH), Saarbrücken, Saarland, Germany
Hans Uszkoreit
Adam Mickiewicz University, Poznań, Poland
Marek Kubis

About this paper

Cite this paper

Caschera, M.C., D’Ulizia, A., Ferri, F., Grifoni, P. (2016). MCBF: Multimodal Corpora Building Framework. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_14

DOI: https://doi.org/10.1007/978-3-319-43808-5_14
Published: 30 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43807-8
Online ISBN: 978-3-319-43808-5
eBook Packages: Computer Science, Computer Science (R0)
