Article

A user interface framework for multimodal VR interactions

Author:

Marc Erich Latoschik

ICMI '05: Proceedings of the 7th international conference on Multimodal interfaces

Pages 76 - 83

https://doi.org/10.1145/1088463.1088479

Published: 04 October 2005

Abstract

This article presents a User Interface (UI) framework for multimodal interactions targeted at immersive virtual environments. Its configurable input and gesture processing components provide an advanced behavior graph capable of routing continuous data streams asynchronously. The framework introduces a Knowledge Representation Layer which augments objects of the simulated environment with Semantic Entities, a central object model that bridges and interfaces Virtual Reality (VR) and Artificial Intelligence (AI) representations. Specialized node types use these facilities to implement required processing tasks such as gesture detection, preprocessing of the visual scene for multimodal integration, or translation of movements into multimodally initialized gestural interactions. A modified Augmented Transition Network (ATN) approach accesses the knowledge layer as well as the preprocessing components to integrate linguistic, gestural, and context information in parallel. The overall framework emphasizes extensibility, adaptivity, and reusability, e.g., by utilizing persistent and interchangeable XML-based formats to describe its processing stages.
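The integration idea in the abstract can be illustrated with a very small sketch. It is not the framework's implementation: the `SemanticEntity`, `Event`, and `integrate` names, the two-state network, the tiny vocabulary, and the `max_gap` time window are all assumptions made for illustration. It only shows how a transition network might combine a spoken deictic word with a temporally close pointing gesture and resolve it to a semantically annotated scene object.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticEntity:
    """Hypothetical stand-in for a scene object augmented with semantic attributes."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Event:
    """A time-stamped input event from one modality ('speech' or 'gesture')."""
    modality: str
    value: str
    time: float


def integrate(events, entities, max_gap=1.0):
    """Toy transition network: verb, then deictic word plus pointing gesture.

    States: 'start' -> 'have_verb' -> result. A deictic word is resolved
    against the pointing gesture closest in time, within max_gap seconds.
    Returns (verb, entity), or None if the input does not parse.
    """
    gestures = [e for e in events if e.modality == "gesture"]
    state, verb = "start", None
    for ev in sorted(events, key=lambda e: e.time):
        if ev.modality != "speech":
            continue
        if state == "start" and ev.value in {"take", "rotate"}:
            state, verb = "have_verb", ev.value
        elif state == "have_verb" and ev.value in {"that", "this"}:
            near = [g for g in gestures if abs(g.time - ev.time) <= max_gap]
            if not near:
                return None  # no gesture close enough to ground the deictic word
            best = min(near, key=lambda g: abs(g.time - ev.time))
            target = entities.get(best.value)
            return (verb, target) if target is not None else None
    return None


# Assumed example scene and input: "take that" with a pointing gesture at wheel_1.
entities = {"wheel_1": SemanticEntity("wheel_1", {"type": "wheel"})}
events = [
    Event("speech", "take", 0.0),
    Event("gesture", "wheel_1", 0.4),
    Event("speech", "that", 0.5),
]
result = integrate(events, entities)
print(result)
```

A real ATN would carry registers, recursive sub-networks, and parallel hypotheses; the sketch keeps a single path so that the temporal grounding step stays visible.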

Index Terms

A user interface framework for multimodal VR interactions
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Computer graphics
    1. Graphics systems and interfaces
      1. Virtual reality
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction devices
      1. Graphics input devices
    2. Interaction paradigms
      1. Natural language interfaces

Published In

ICMI '05: Proceedings of the 7th international conference on Multimodal interfaces

October 2005

344 pages

ISBN:1595930280

DOI:10.1145/1088463

General Chairs:
Gianni Lazzari, ITC-irst, Trento (Italy)
Fabio Pianesi, ITC-irst, Trento (Italy)

Program Chairs:
James Crowley, I.N.P. Grenoble (France)
Kenji Mase, Nagoya University (Japan)
Sharon Oviatt, Oregon Health & Sciences University

Copyright © 2005 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

ICMI05: Seventh International Conference on Multimodal Interfaces 2005

October 4 - 6, 2005

Trento, Italy

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Bibliometrics

Total Citations: 34
Total Downloads: 1,395
Downloads (Last 12 months): 46
Downloads (Last 6 weeks): 6

Reflects downloads up to 01 Mar 2025
