MCBF: Multimodal Corpora Building Framework

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9561)

Abstract

Designing an effective multimodal human-machine interaction environment requires addressing crucial issues such as the correct interpretation of complex user input arriving over different modal channels. In this context, corpora of multimodal sentences are very important because they make it possible to integrate properties and linguistic knowledge that are not formalised in the grammar. This paper presents a framework for dynamic multimodal corpora building that semi-automates the extraction of syntactic and semantic information from multimodal dialogues, using both grammar inference and interpretation methodologies based on hidden Markov models (HMMs). The method relies on a Multimodal Attribute Grammar and on an HMM-based approach to annotate new multimodal sentences syntactically and semantically. It improves human-computer dialogue because the multimodal corpus evolves, adapting itself to dynamic changes in the human-computer interaction.
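
To make the idea of HMM-based annotation concrete, the following is a minimal sketch of Viterbi decoding applied to a toy multimodal sentence. It is not the authors' implementation: the tag set, the "modality:token" observation encoding, and all probabilities are illustrative assumptions, and the abstract does not specify how the framework's HMM is structured or trained.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state (tag) sequence for the observations."""

    def lg(p):
        # Log-probabilities avoid underflow; impossible events map to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s] = (log-probability of the best path ending in s at step t,
    #               predecessor state on that path)
    first = observations[0]
    best = [{s: (lg(start_p[s]) + lg(emit_p[s].get(first, 0.0)), None)
             for s in states}]

    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the best predecessor for state s.
            score, prev = max(
                (best[-1][p][0] + lg(trans_p[p][s]), p) for p in states
            )
            column[s] = (score + lg(emit_p[s].get(obs, 0.0)), prev)
        best.append(column)

    # Backtrack from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Hypothetical tag set and toy probabilities (assumptions for illustration only).
states = ["VERB", "DET", "NOUN"]
start_p = {"VERB": 0.6, "DET": 0.3, "NOUN": 0.1}
trans_p = {
    "VERB": {"VERB": 0.1, "DET": 0.6, "NOUN": 0.3},
    "DET": {"VERB": 0.05, "DET": 0.05, "NOUN": 0.9},
    "NOUN": {"VERB": 0.4, "DET": 0.3, "NOUN": 0.3},
}
emit_p = {
    "VERB": {"speech:show": 0.7, "speech:call": 0.3},
    "DET": {"speech:this": 0.8, "speech:that": 0.2},
    "NOUN": {"gesture:point_at_icon": 0.5, "speech:hotel": 0.5},
}

# A multimodal sentence: two spoken words followed by a pointing gesture.
sentence = ["speech:show", "speech:this", "gesture:point_at_icon"]
tags = viterbi(sentence, states, start_p, trans_p, emit_p)
print(list(zip(sentence, tags)))
```

In this toy setting the pointing gesture fills the noun slot left open by the spoken deictic "this". That is the kind of syntactic role an annotation step would need to assign before a new sentence is added to the corpus.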

Corresponding author

Correspondence to Fernando Ferri.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Caschera, M.C., D’Ulizia, A., Ferri, F., Grifoni, P. (2016). MCBF: Multimodal Corpora Building Framework. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_14

  • DOI: https://doi.org/10.1007/978-3-319-43808-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43807-8

  • Online ISBN: 978-3-319-43808-5

  • eBook Packages: Computer Science (R0)
