Consistent categorization of multimodal integration patterns during human–computer interaction

Journal on Multimodal User Interfaces

Abstract

Multimodal interaction offers a more natural style of human–computer interaction, allowing users to apply their well-developed communicative skills when interacting with computer systems. Designing reliable multimodal systems, however, remains a challenging task. Methods that deliver optimal performance depend on precise modeling of integration patterns, which allows a system to adapt to the preferences and differences of individual users. Although the basic foundations and empirical evidence for these differences have been described and confirmed in previous research, the measures and classifications introduced so far appear oversimplified and insufficiently precise for designing reliable and robust interaction models. This paper presents the results of our study of multimodal integration patterns in systems combining speech and gesture input. We confirm the interaction differences among subjects and their specific multimodal integration patterns reported earlier, and complement them with our own findings. Based on the obtained results, a new integration pattern categorization is defined and analyzed. The introduced categorization yields more reliable and consistent results than the classifications presented in the related literature and, owing to its generality, is applicable to other combinations of input modalities.




Notes

  1. The original algorithm is rotation invariant (see the first sketch following these notes).

  2. http://www.anvil-software.org

  3. To distinguish between the two definitions, the one from Oviatt et al. will be denoted SEQ\(_O\)/SIM\(_O\) and our redefinition SEQ\(_R\)/SIM\(_R\) throughout the rest of the work (see the second sketch following these notes).
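
Note 1 most likely refers to the $1 gesture recognizer of Wobbrock et al. [24], whose original form is rotation invariant because every candidate stroke is rotated so that its "indicative angle" (the angle from the centroid to the first sampled point) becomes zero before template matching. The following is a minimal Python sketch of that normalization step, with illustrative names rather than the authors' implementation:

    import math

    def centroid(points):
        """Arithmetic mean of a list of (x, y) points."""
        xs, ys = zip(*points)
        return sum(xs) / len(xs), sum(ys) / len(ys)

    def rotate_to_zero(points):
        """Rotate a stroke about its centroid so that the indicative
        angle (centroid -> first point) becomes zero; this is the step
        that makes the original $1 recognizer rotation invariant."""
        cx, cy = centroid(points)
        x0, y0 = points[0]
        theta = math.atan2(cy - y0, cx - x0)  # indicative angle
        cos_t, sin_t = math.cos(-theta), math.sin(-theta)
        return [(cx + (x - cx) * cos_t - (y - cy) * sin_t,
                 cy + (x - cx) * sin_t + (y - cy) * cos_t)
                for x, y in points]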
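
Note 3 contrasts the original definitions by Oviatt et al., under which a multimodal construction is simultaneous (SIM) when the speech and gesture signals overlap in time and sequential (SEQ) when one signal ends before the other begins, with the redefined variants used in this work. The Python sketch below illustrates only the original SEQ\(_O\)/SIM\(_O\) overlap test; the redefined SEQ\(_R\)/SIM\(_R\) criterion is specific to this paper and is not reproduced here, and the interval representation is an assumption made for illustration:

    from dataclasses import dataclass

    @dataclass
    class Signal:
        onset: float   # signal start time in seconds
        offset: float  # signal end time in seconds

    def classify_o(speech: Signal, gesture: Signal) -> str:
        """SEQ_O/SIM_O in the sense of Oviatt et al.: SIM if the two
        signals overlap in time, SEQ if there is a lag between them."""
        overlap = (min(speech.offset, gesture.offset)
                   - max(speech.onset, gesture.onset))
        return "SIM_O" if overlap > 0 else "SEQ_O"

    # Example: speech starting 0.4 s after the gesture ends is sequential.
    print(classify_o(Signal(1.8, 2.9), Signal(0.2, 1.4)))  # -> SEQ_O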

References

  1. Bangalore S, Johnston M (2009) Robust understanding in multimodal interfaces. Comput Linguist 35(3):345–397. doi:10.1162/coli.08-022-R2-06-26

  2. Billinghurst M, Lee M (2012) Multimodal interfaces for augmented reality. In: Expanding the frontiers of visual analytics and visualization. Springer, pp 449–465. doi:10.1007/978-1-4471-2804-5

  3. Bolt RA (1980) Put-that-there: voice and gesture at the graphics interface. In: Proceedings of the 7th annual conference on computer graphics and interactive techniques - SIGGRAPH ’80, vol 32. ACM Press, pp 262–270. doi:10.1145/800250.807503

  4. Cohen PR, Johnston M, McGee D, Oviatt S, Pittman J, Smith I, Chen L, Clow J (1997) QuickSet: multimodal interaction for distributed applications. In: Proceedings of the fifth ACM international conference on multimedia-MULTIMEDIA ’97, ACM Press, pp 31–40. doi:10.1145/266180.266328

  5. Cohen PR, Kaiser EC, Buchanan MC, Lind S, Corrigan MJ, Wesson RM (2015) Sketch-Thru-Plan: a multimodal interface for command and control. Commun ACM 58(4):56–65. doi:10.1145/2735589

  6. Dumas B, Lalanne D, Oviatt S (2009) Multimodal interfaces: a survey of principles, models and frameworks. In: Lalanne D, Kohlas J (eds) Human machine interaction, Lecture notes in computer science, vol 5440. Springer, Berlin, pp 3–26. doi:10.1007/978-3-642-00437-7_1

  7. Ehlen P, Johnston M (2012) Multimodal interaction patterns in mobile local search. In: Proceedings of the 2012 ACM international conference on intelligent user interfaces - IUI ’12, pp 21–24 . doi:10.1145/2166966.2166970

  8. Haas EC, Pillalamarri KS, Stachowiak CC, McCullough G (2011) Temporal binding of multimodal controls for dynamic map displays. In: Proceedings of the 13th international conference on multimodal interfaces - ICMI ’11, ACM Press, p 409. doi:10.1145/2070481.2070558

  9. Huang X, Oviatt S (2006) Toward adaptive information fusion in multimodal systems. In: Renals S, Bengio S (eds) Machine learning for multimodal interaction, Lecture notes in computer science, vol 3869. Springer, Berlin, pp 15–27. doi:10.1007/11677482_2

  10. Huang X, Oviatt S, Lunsford R (2006) Combining user modeling and machine learning to predict users’ multimodal integration patterns. In: Renals S, Bengio S, Fiscus JG (eds) Machine learning for multimodal interaction, Lecture notes in computer science, vol 4299. Springer, Berlin, pp 50–62. doi:10.1007/11965152_5

  11. Huggins-Daines D, Kumar M, Chan A, Black A, Ravishankar M, Rudnicky A (2006) Pocketsphinx: a free, real-time continuous speech recognition system for hand-held devices. In: Proceedings of IEEE international conference on acoustics speech and signal processing. pp 185–188. doi:10.1109/ICASSP.2006.1659988

  12. Johnston M, Bangalore S (2005) Finite-state multimodal integration and understanding. Nat Lang Eng 11(2):159–187. doi:10.1017/S1351324904003572

  13. Johnston M, Bangalore S, Vasireddy G, Stent A, Ehlen P, Walker M, Whittaker S, Maloor P (2002) MATCH: an architecture for multimodal dialogue systems. In: Proceedings of the 40th annual meeting on association for computational linguistics - ACL ’02, pp 376–383. doi:10.3115/1073083.1073146

  14. Lee M, Billinghurst M, Baek W, Green R, Woo W (2013) A usability study of multimodal input in an augmented reality environment. Virtual Real 17(4):293–305. doi:10.1007/s10055-013-0230-0

  15. Lewis JR (2012) Usability testing. In: Handbook of human factors and ergonomics. Wiley, pp 1267–1312. doi:10.1002/9781118131350.ch46

  16. Oviatt S (1999) Ten myths of multimodal interaction. Commun ACM 42(11):74–81. doi:10.1145/319382.319398

  17. Oviatt S (2003) User-centered modeling and evaluation of multimodal interfaces. Proc IEEE 91(9):1457–1468. doi:10.1109/JPROC.2003.817127

  18. Oviatt S, Coulston R, Lunsford R (2004) When do we interact multimodally? In: Proceedings of the 6th international conference on multimodal interfaces - ICMI ’04, ACM Press, pp 129–136. doi:10.1145/1027933.1027957

  19. Oviatt S, Coulston R, Tomko S, Xiao B, Lunsford R, Wesson M, Carmichael L (2003) Toward a theory of organized multimodal integration patterns during human-computer interaction. In: Proceedings of the 5th international conference on multimodal interfaces - ICMI ’03, ACM Press, pp 44–51. doi:10.1145/958432.958443

  20. Oviatt S, DeAngeli A, Kuhn K (1997) Integration and synchronization of input modes during multimodal human-computer interaction. In: Proceedings of the SIGCHI conference on human factors in computing systems - CHI ’97, ACM Press, pp 415–422. doi:10.1145/258549.258821

  21. Oviatt S, Lunsford R, Coulston R (2005) Individual differences in multimodal integration patterns: what are they and why do they exist? In: Proceedings of the SIGCHI conference on human factors in computing systems - CHI ’05, ACM Press, pp 241–249. doi:10.1145/1054972.1055006

  22. Schüssel F, Honold F, Schmidt M, Bubalo N, Huckauf A, Weber M (2014) Multimodal interaction history and its use in error detection and recovery. In: Proceedings of the 16th international conference on multimodal interaction - ICMI ’14, ACM Press, pp 164–171. doi:10.1145/2663204.2663255

  23. Serrano M, Nigay L (2010) A Wizard of Oz component-based approach for rapidly prototyping and testing input multimodal interfaces. J Multimodal User Interfaces 3(3):215–225. doi:10.1007/s12193-010-0042-4

  24. Wobbrock JO, Wilson AD, Li Y (2007) Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes. In: Proceedings of the 20th annual ACM symposium on user interface software and technology - UIST ’07, ACM Press, pp 159–169. doi:10.1145/1294211.1294238

  25. Xiao B, Girand C, Oviatt S (2002) Multimodal integration patterns in children. In: Proceedings of international conference on spoken language processing, pp 629–632

  26. Xiao B, Oviatt S (2003) Modeling multimodal integration patterns and performance in seniors: toward adaptive processing of individual differences. In: Proceedings of the 5th international conference on multimodal interfaces - ICMI ’03, pp 256–272. doi:10.1145/958432.958480

Acknowledgements

We would like to thank Michal Vondra for providing initial feedback during a pilot test, and all volunteers for participating in the study. Thanks also to the anonymous reviewers for their helpful comments and suggestions. This work has been supported by the Grant Agency of the Czech Technical University in Prague, Grant No. SGS16/156/OHK3/2T/13.

Author information

Correspondence to Roman Hak.

Appendix

Testing scenarios

The following list contains the complete set of objectives as presented to the test subjects:

  1. Zoom in and out of the map view.

  2. Get your current location and find the nearest petrol station.

  3. Get detailed information about two gas stations.

  4. Get directions between Olomouc and Liberec.

  5. Get information about cinemas at your location.

  6. Find the estimated travel time between an airport near Prague and a theatre in downtown Prague.

  7. Get the coordinates of at least 3 hospitals in Pilsen.

  8. Find the nearest police station and emergency services.

  9. Find the travel distance between a railway station in Brno and the closest airport.

  10. Find the name of the nearest bus and subway station.

  11. Find the names of some pubs and restaurants in downtown Ceske Budejovice.

  12. Find the phone numbers of libraries in the surrounding area.

  13. Get the postal address of a coffeehouse near a museum in Cesky Krumlov.

  14. Get the phone numbers and postal addresses of churches in the surrounding area of Brno.

  15. Get details of the two nearest restaurants at your current location.

  16. Find the travel distance from the westernmost to the easternmost point and then from the northernmost to the southernmost point of the Czech Republic.


Cite this article

Hak, R., Zeman, T. Consistent categorization of multimodal integration patterns during human–computer interaction. J Multimodal User Interfaces 11, 251–265 (2017). https://doi.org/10.1007/s12193-017-0243-1
