Multimodal Systems: An Excursus of the Main Research Questions

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9416)

Abstract

Multimodal systems integrate multiple interaction modalities (e.g., speech, sketch, and handwriting), enabling users to communicate in a way that more closely resembles human-human communication. Developing such systems has raised several research questions, addressed in the literature from the early 1980s to the present day, including multimodal fusion, recognition, dialogue interpretation and disambiguation, fission, and context adaptation. This paper surveys studies from the last decade, analyzing how the approaches to the main research questions of multimodal fusion, interpretation, and context adaptation have evolved. As a result, the paper discusses the reasons that led attention to shift from one methodology to another.

Author information

Corresponding author

Correspondence to Fernando Ferri.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Caschera, M.C., D’Ulizia, A., Ferri, F., Grifoni, P. (2015). Multimodal Systems: An Excursus of the Main Research Questions. In: Ciuciu, I., et al. On the Move to Meaningful Internet Systems: OTM 2015 Workshops. OTM 2015. Lecture Notes in Computer Science, vol 9416. Springer, Cham. https://doi.org/10.1007/978-3-319-26138-6_59

  • DOI: https://doi.org/10.1007/978-3-319-26138-6_59

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26137-9

  • Online ISBN: 978-3-319-26138-6

  • eBook Packages: Computer Science, Computer Science (R0)
