ABSTRACT
Several models and design spaces have been defined and are regularly used to describe how modalities can be fused together in an interactive multimodal system. However, models such as CASE, the CARE properties or TYCOON were all defined more than two decades ago. In this paper, we start with a critical review of these models, which notably highlights a confusion between the way the user side and the system side of a multimodal system are described. Based on this critical review, we define MMMM v1, an improved model for the description of multimodal fusion in interactive systems that targets completeness. A first user evaluation comparing the models revealed that MMMM v1 was indeed complete, but at the cost of user friendliness. Based on the results of this first evaluation, an improved version of MMMM, called MMMM v2, was defined. A second user evaluation showed that this model achieves a good balance between complexity, consistency and completeness compared to the state of the art.
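As a concrete illustration of the kind of fusion relationships these models describe, the sketch below encodes the four CARE properties (Complementarity, Assignment, Redundancy, Equivalence) of Coutaz et al. as a small Python vocabulary and uses it to annotate Bolt's classic "put-that-there" command, where speech and pointing complement each other. The class and field names are illustrative assumptions for this sketch, not the actual MMMM notation.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class CARE(Enum):
    """The four CARE properties of Coutaz et al. (1995)."""
    COMPLEMENTARITY = auto()  # modalities must be combined to form the command
    ASSIGNMENT = auto()       # a single modality is the only way to issue the command
    REDUNDANCY = auto()       # modalities convey the same information and are used together
    EQUIVALENCE = auto()      # any one of several modalities is sufficient on its own


@dataclass
class ModalityEvent:
    """One recognised input event on a given modality (hypothetical structure)."""
    modality: str     # e.g. "speech" or "pointing"
    content: str      # recognised token or pointed-at target
    timestamp: float  # seconds, used for time-window fusion


@dataclass
class FusionRule:
    """Describes how a set of modalities relates to one user command."""
    command: str
    modalities: List[str]
    relation: CARE


# Bolt's "put-that-there": speech supplies the verb, pointing supplies the
# source and destination, so the two modalities are complementary here.
put_that_there = FusionRule(
    command="move_object",
    modalities=["speech", "pointing"],
    relation=CARE.COMPLEMENTARITY,
)
```

A fusion engine, such as those surveyed in the references below, would typically match incoming streams of such modality events against rules of this kind within a time window; the sketch only captures the descriptive vocabulary, not the matching process.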
REFERENCES
- Richard A. Bolt. 1980. “Put-that-there”: Voice and Gesture at the Graphics Interface. In Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 1980). Seattle, USA, 262–270.
- Andrea Cherubini, Robin Passama, Philippe Fraisse, and André Crosnier. 2015. A Unified Multimodal Control Framework for Human-Robot Interaction. Robotics and Autonomous Systems 70 (2015), 106–115.
- Joëlle Coutaz, Laurence Nigay, Daniel Salber, Ann Blandford, Jon May, and Richard M. Young. 1995. Four Easy Pieces for Assessing the Usability of Multimodal Interaction: The CARE Properties. In Proceedings of the 5th International Conference on Human-Computer Interaction (Interact 1995). Lillehammer, Norway, 115–120.
- Fredy Cuenca, Jan Van den Bergh, Kris Luyten, and Karin Coninx. 2015. Hasselt UIMS: A Tool for Describing Multimodal Interactions with Composite Events. In Proceedings of the 7th ACM SIGCHI Symposium on Engineering Interactive Computing Systems. Duisburg, Germany, 226–229.
- Bruno Dumas, Rolf Ingold, and Denis Lalanne. 2009. Benchmarking Fusion Engines of Multimodal Interactive Systems. In Proceedings of the 2009 International Conference on Multimodal Interfaces. Cambridge, Massachusetts, USA, 169–176.
- Bruno Dumas, Denis Lalanne, and Rolf Ingold. 2010. Description Languages for Multimodal Interaction: A Set of Guidelines and its Illustration with SMUIML. Journal on Multimodal User Interfaces: “Special Issue on the Challenges of Engineering Multimodal Interaction” 3, 3 (February 2010), 237–247.
- Bruno Dumas, Denis Lalanne, and Sharon Oviatt. 2009. Multimodal Interfaces: A Survey of Principles, Models and Frameworks. In Human Machine Interaction: Research Results of the MMI Program. Springer-Verlag, Berlin, Heidelberg, 3–26.
- Lode Hoste, Bruno Dumas, and Beat Signer. 2011. Mudra: A Unified Multimodal Interaction Framework. In Proceedings of the 13th International Conference on Multimodal Interfaces (ICMI 2011). Alicante, Spain, 97–104.
- Marc Erich Latoschik. 2005. A User Interface Framework for Multimodal VR Interactions. In Proceedings of the 7th International Conference on Multimodal Interfaces. ACM, Trento, Italy, 76–83.
- Jean-Claude Martin. 1998. TYCOON: Theoretical Framework and Software Tools for Multimodal Interfaces. Intelligence and Multimodality in Multimedia Interfaces (1998), 1–25.
- Jean-Claude Martin and Dominique Béroule. 1999. TYCOON: Six Primitive Types of Cooperation for Observing, Evaluating and Specifying Cooperations. In Proceedings of the AAAI Fall 1999 Symposium on Psychological Models of Communication in Collaborative Systems, Vol. 16.
- David R. McGee, Philip R. Cohen, and Lizhong Wu. 2000. Something from Nothing: Augmenting a Paper-based Work Practice via Multimodal Interaction. In Proceedings of DARE 2000 on Designing Augmented Reality Environments. Elsinore, Denmark, 71–80.
- Laurence Nigay. 1994. Conception et modélisation logicielles des systèmes interactifs: application aux interfaces multimodales. Ph.D. Dissertation. Université Joseph-Fourier-Grenoble I.
- Laurence Nigay. 2004. Design Space for Multimodal Interaction. In Building the Information Society. Springer, 403–408.
- Laurence Nigay and Joëlle Coutaz. 1993. A Design Space for Multimodal Systems: Concurrent Processing and Data Fusion. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1993). 172–178.
- Donald A. Norman. 2013. The Design of Everyday Things: Revised and Expanded Edition. Basic Books.
- Sharon Oviatt. 1999. Ten Myths of Multimodal Interaction. Communications of the ACM 42, 11 (1999), 74–81.
- M. Serrano, L. Nigay, J.Y. Lawson, A. Ramsay, R. Murray-Smith, and S. Denef. 2008. The OpenInterface Framework: A Tool for Multimodal Interaction. In Proceedings of the 26th International Conference on Human Factors in Computing Systems (CHI 2008). Florence, Italy, 3501–3506.