DOI: 10.1145/1088463.1088479

A user interface framework for multimodal VR interactions

Published: 04 October 2005

Abstract

This article presents a User Interface (UI) framework for multimodal interactions targeted at immersive virtual environments. Its configurable input and gesture processing components provide an advanced behavior graph capable of routing continuous data streams asynchronously. The framework introduces a Knowledge Representation Layer which augments objects of the simulated environment with Semantic Entities as a central object model that bridges and interfaces Virtual Reality (VR) and Artificial Intelligence (AI) representations. Specialized node types use these facilities to implement required processing tasks like gesture detection, preprocessing of the visual scene for multimodal integration, or translation of movements into multimodally initialized gestural interactions. A modified Augmented Transition Network (ATN) approach accesses the knowledge layer as well as the preprocessing components to integrate linguistic, gestural, and context information in parallel. The overall framework emphasizes extensibility, adaptivity, and reusability, e.g., by utilizing persistent and interchangeable XML-based formats to describe its processing stages.
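
To make the integration idea more concrete, the following is a minimal sketch, not the framework's actual implementation. It assumes a much-simplified, sequential transition network with three states, whereas the abstract describes a modified ATN that integrates its inputs in parallel. All names here (`SemanticEntity`, `PointingEvent`, `resolve_deictic`, `integrate`, the one-second time window, and the tiny vocabulary) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticEntity:
    """Hypothetical stand-in for a scene object annotated in the knowledge layer."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class PointingEvent:
    """Hypothetical output of a gesture-detection node."""
    time: float   # seconds since the start of the interaction
    target: str   # name of the entity hit by the pointing ray


def resolve_deictic(word_time, gestures, window=1.0):
    """Return the gesture closest in time to a deictic word, or None.

    Only gestures within `window` seconds of the word are considered.
    """
    near = [g for g in gestures if abs(g.time - word_time) <= window]
    return min(near, key=lambda g: abs(g.time - word_time)) if near else None


def integrate(utterance, gestures, entities):
    """Walk a three-state network: START -> ACTION -> OBJECT (final).

    utterance: list of (word, time) pairs from a speech recognizer.
    gestures:  list of PointingEvent instances.
    entities:  dict mapping entity names to SemanticEntity instances.
    Returns a command dict, or None if the final state is not reached.
    """
    state, command = "START", {}
    for word, time in utterance:
        if state == "START" and word in {"rotate", "delete"}:
            command["action"] = word
            state = "ACTION"
        elif state == "ACTION" and word in {"that", "this"}:
            # Arc test that consults the gesture stream (deictic reference).
            gesture = resolve_deictic(time, gestures)
            if gesture is None or gesture.target not in entities:
                return None
            command["object"] = entities[gesture.target]
            state = "OBJECT"
        elif state == "ACTION" and word in entities:
            # Arc test that consults the knowledge layer directly by name.
            command["object"] = entities[word]
            state = "OBJECT"
        # Words that match no arc (e.g., articles) are skipped.
    return command if state == "OBJECT" else None
```

In this sketch, a spoken "rotate that" accompanied by a pointing gesture toward a known entity yields a command that references the entity's semantic description, while the same utterance without a temporally close gesture is rejected.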

Published In

ICMI '05: Proceedings of the 7th international conference on Multimodal interfaces
October 2005
344 pages
ISBN: 1595930280
DOI: 10.1145/1088463

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. gesture and speech processing
  2. multimodal interaction
  3. semantic scene description
  4. user interface framework
  5. virtual reality

Qualifiers

  • Article

Conference

ICMI05

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Cited By

  • (2022) A Case Study on the Rapid Development of Natural and Synergistic Multimodal Interfaces for XR Use-Cases. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (1-8). DOI: 10.1145/3491101.3503552. Online publication date: 27-Apr-2022
  • (2021) eXtended Artificial Intelligence: New Prospects of Human-AI Interaction Research. Frontiers in Virtual Reality 2. DOI: 10.3389/frvir.2021.686783. Online publication date: 6-Sep-2021
  • (2020) Finally on Par?! Multimodal and Unimodal Interaction for Open Creative Design Tasks in Virtual Reality. Proceedings of the 2020 International Conference on Multimodal Interaction (222-231). DOI: 10.1145/3382507.3418850. Online publication date: 21-Oct-2020
  • (2019) “Paint that object yellow”: Multimodal Interaction to Enhance Creativity During Design Tasks in VR. 2019 International Conference on Multimodal Interaction (195-204). DOI: 10.1145/3340555.3353724. Online publication date: 14-Oct-2019
  • (2018) Semantic Fusion for Natural Multimodal Interfaces using Concurrent Augmented Transition Networks. Multimodal Technologies and Interaction 2:4 (81). DOI: 10.3390/mti2040081. Online publication date: 6-Dec-2018
  • (2018) Space Tentacles - Integrating Multimodal Input into a VR Adventure Game. 2018 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (745-746). DOI: 10.1109/VR.2018.8446151. Online publication date: Mar-2018
  • (2017) “Stop over there”: natural gesture and speech interaction for non-critical spontaneous intervention in autonomous driving. Proceedings of the 19th ACM International Conference on Multimodal Interaction (91-100). DOI: 10.1145/3136755.3136787. Online publication date: 3-Nov-2017
  • (2017) Modelling fusion of modalities in multimodal interactive systems with MMMM. Proceedings of the 19th ACM International Conference on Multimodal Interaction (288-296). DOI: 10.1145/3136755.3136768. Online publication date: 3-Nov-2017
  • (2017) Semantic Entity-Component State Management Techniques to Enhance Software Quality for Multimodal VR-Systems. IEEE Transactions on Visualization and Computer Graphics 23:4 (1342-1351). DOI: 10.1109/TVCG.2017.2657098. Online publication date: 1-Apr-2017
  • (2017) Spatial and rotation invariant 3D gesture recognition based on sparse representation. 2017 IEEE Symposium on 3D User Interfaces (3DUI) (158-167). DOI: 10.1109/3DUI.2017.7893333. Online publication date: 2017
