Abstract
Designing an effective human-machine multimodal interaction environment requires addressing crucial issues such as the correct interpretation of complex user input arriving through different modal channels. In this context, corpora of multimodal sentences are very important because they make it possible to integrate properties and linguistic knowledge that are not formalised in the grammar. This paper provides a framework for dynamic multimodal corpora building that semi-automates the extraction of syntactic and semantic information from multimodal dialogues, using both grammar inference and interpretation methodologies based on hidden Markov models (HMMs). The method relies on a Multimodal Attribute Grammar and on an HMM-based approach to annotate new multimodal sentences syntactically and semantically. It improves human-computer dialogue because the multimodal corpus evolves, adapting itself to dynamic changes in human-computer interaction.
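As a rough illustration of what HMM-based annotation of a multimodal sentence can look like, the sketch below runs Viterbi decoding over a sequence of modality-tagged symbols and returns one label per symbol. It is not the model described in the paper: the label set (`NP`, `VP`, `DEIXIS`), the observation symbols (such as `speech:verb` and `gesture:point`), and all probabilities are hypothetical values chosen only to make the example runnable.

```python
import math

# Hypothetical toy HMM: hidden states are syntactic roles, observations are
# modality-tagged symbols. Every number below is illustrative, not from the paper.
STATES = ["NP", "VP", "DEIXIS"]

START = {"NP": 0.5, "VP": 0.4, "DEIXIS": 0.1}

TRANS = {
    "NP": {"NP": 0.2, "VP": 0.6, "DEIXIS": 0.2},
    "VP": {"NP": 0.5, "VP": 0.1, "DEIXIS": 0.4},
    "DEIXIS": {"NP": 0.6, "VP": 0.2, "DEIXIS": 0.2},
}

EMIT = {
    "NP": {"speech:noun": 0.7, "speech:verb": 0.05,
           "speech:pronoun": 0.15, "gesture:point": 0.1},
    "VP": {"speech:noun": 0.05, "speech:verb": 0.85,
           "speech:pronoun": 0.05, "gesture:point": 0.05},
    "DEIXIS": {"speech:noun": 0.05, "speech:verb": 0.05,
               "speech:pronoun": 0.4, "gesture:point": 0.5},
}


def viterbi(observations):
    """Return the most probable state sequence for the given observations."""
    if not observations:
        return []

    # Log-probabilities of the best path ending in each state at time 0.
    scores = [{
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }]
    backpointers = []

    for obs in observations[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for a path that ends in s at this step.
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = (prev[best_prev]
                          + math.log(TRANS[best_prev][s])
                          + math.log(EMIT[s][obs]))
            pointers[s] = best_prev
        scores.append(current)
        backpointers.append(pointers)

    # Trace the best path backwards from the highest-scoring final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    # A spoken verb, a spoken pronoun, and a pointing gesture.
    sentence = ["speech:verb", "speech:pronoun", "gesture:point"]
    for symbol, label in zip(sentence, viterbi(sentence)):
        print(f"{symbol}\t{label}")
```

In a corpus-building setting, the transition and emission tables would be estimated from already annotated multimodal sentences rather than written by hand, and re-estimated as new sentences are added to the corpus.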
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Caschera, M.C., D’Ulizia, A., Ferri, F., Grifoni, P. (2016). MCBF: Multimodal Corpora Building Framework. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_14
DOI: https://doi.org/10.1007/978-3-319-43808-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43807-8
Online ISBN: 978-3-319-43808-5
eBook Packages: Computer Science (R0)