Abstract
Multimodal interfaces are expected to improve the input and output capabilities of increasingly sophisticated applications. Several approaches aim at formally describing multimodal interaction, but they rarely treat it as a continuous flow of actions, preserving its dynamic nature and considering all modalities at the same level. This work proposes a model-based approach called Practice-oriented Analysis and Description of Multimodal Interaction (PALADIN), which describes sequential multimodal interaction in a way that overcomes these problems. It arranges a set of parameters to quantify multimodal interaction as a whole, in order to minimise the existing differences between modalities. Furthermore, interaction is described stepwise to preserve the dynamic nature of the dialogue process. PALADIN defines a common notation to describe interaction in different multimodal contexts, providing a framework to assess and compare the usability of systems. Our approach was integrated into four real applications to conduct two experiments with users. The experiments demonstrate the validity and effectiveness of the proposed model for analysing and evaluating multimodal interaction.







Notes
A barge-in attempt occurs when the user intentionally addresses the system while the system is still speaking, displaying information in a GUI, performing a gesture, or sending information through another modality.
A facade is an object that provides a simplified interface to a larger body of code, such as a class library or a software framework.
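To make the facade pattern concrete, here is a minimal sketch in Java. The class and method names (`Logger`, `Annotator`, `InteractionFacade`, `capture`) are hypothetical illustrations, not taken from the PALADIN code base; the point is only that a single simplified entry point hides two subsystems.

```java
// Two subsystems that client code would otherwise have to call separately.
// All names here are illustrative, not from the PALADIN implementation.
class Logger {
    String record(String event) { return "logged:" + event; }
}

class Annotator {
    String annotate(String event) { return "annotated:" + event; }
}

// The facade: one simplified method that coordinates both subsystems,
// so clients depend on a single interface instead of the larger body of code.
class InteractionFacade {
    private final Logger logger = new Logger();
    private final Annotator annotator = new Annotator();

    String capture(String event) {
        return logger.record(event) + " / " + annotator.annotate(event);
    }
}

public class FacadeDemo {
    public static void main(String[] args) {
        System.out.println(new InteractionFacade().capture("tap"));
    }
}
```

In the same spirit, an instrumentation facade lets an application report interaction events through one call while the details of logging and annotation stay hidden behind it.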
An open-source implementation of the Android HCI Extractor [36] can be downloaded from http://code.google.com/p/android-hci-extractor. More information related to this tool and its integration with the model and the framework described above can be found in [35].
References
Araki M, Kouzawa A, Tachibana K (2005) Proposal of a multimodal interaction description language for various interactive agents. Trans Inf Syst E88-D(11):2469–2476
Balbo S, Coutaz J, Salber D (1993) Towards automatic evaluation of multimodal user interfaces. In: Proceedings of the 1st international conference on intelligent user interfaces, IUI ’93. ACM, New York, NY, USA, pp 201–208
Balme L, Demeure A, Barralon N, Coutaz J, Calvary G (2004) Cameleon-rt: a software architecture reference model for distributed, migratable, and plastic user interfaces. In: Markopoulos P, Eggen B, Aarts EHL, Crowley JL (eds) EUSAI. Lecture Notes in Computer Science, vol 3295. Springer, Berlin, pp 291–302. http://dblp.uni-trier.de/db/conf/eusai/eusai2004.html#BalmeDBCC04
Bayer S, Damianos LE, Kozierok R, Mokwa J (1999) The MITRE multi-modal logger: its use in evaluation of collaborative systems. ACM Comput Surv 31(2es):17
Beringer N, Kartal U, Louka K, Schiel F, Türk U (2002) PROMISE—a procedure for multimodal interactive system evaluation. In: Proceedings of multimodal resources and multimodal systems evaluation workshop (LREC 2002), pp 77–80
Bernsen NO, Dybkjær L (2009) Multimodal usability. Springer, Berlin
Bourguet ML (2003) Designing and prototyping multimodal commands. In: Rauterberg M, Menozzi M, Wesson J (eds) INTERACT. IOS Press, Amsterdam. http://dblp.uni-trier.de/db/conf/interact/interact2003.html#Bourguet03
Carey R, Bell G (1997) The annotated VRML 2.0 reference manual. Addison-Wesley, Boston
Cohen PR, McGee DR (2004) Tangible multimodal interfaces for safety–critical applications. Commun ACM 47(1):41–46
Coutaz J, Nigay L, Salber D, Blandford A, May J, Young RM (1995) Four easy pieces for assessing the usability of multimodal interaction: the CARE properties. In: Arnesen SA, Gilmore D (eds) Proceedings of INTERACT’95 conference. Chapman & Hall, London, pp 115–120
Damianos LE, Drury J, Fanderclai T, Hirschman L, Kurtz J, Oshika B (2000) Evaluating multi-party multimodal systems. In: Proceedings of the second international conference on language resources and evaluation, vol 3. MIT Media Laboratory, pp 1361–1368
Diefenbach S, Hassenzahl M (2011) Handbuch zur Fun-ni Toolbox. Manual, Folkwang Universität der Künste. Retrieved 16 Oct 2013. http://fun-ni.org/wp-content/uploads/Diefenbach+Hassenzahl_2010_HandbuchFun-niToolbox.pdf
Ergonomics of human-system interaction (2006) Part 110: Dialogue principles (ISO 9241-110:2006)
Dumas B, Lalanne D, Ingold R (2010) Description languages for multimodal interaction: a set of guidelines and its illustration with SMUIML. J Multimodal User Interfaces 3:237–247
Dybkjær L, Bernsen NO, Minker W (2004) Evaluation and usability of multimodal spoken language dialogue systems. Speech Commun 43:33–54
Engelbrecht KP, Kruppa M, Möller S, Quade M (2008) MeMo Workbench for semi-automated usability testing. In: Proceedings of Interspeech 2008 incorporating SST 2008. ISCA, Brisbane, Australia, pp 1662–1665
Fraser NM (1997) Spoken dialogue system evaluation: a first framework for reporting results. In: EUROSPEECH-1997, pp 1907–1910
Fraser NM, Gilbert G (1991) Simulating speech systems. Comput Speech Lang 5(1):81–99
Göbel S, Hartmann F, Kadner K, Pohl C (2006) A device-independent multimodal mark-up language. In: Hochberger C, Liskowsky R (eds) INFORMATIK 2006. Informatik für Menschen, LNI, vol 94. Gesellschaft für Informatik, pp 170–177
Gong XG, Engelbrecht KP (2013) The influence of user characteristics on the quality of judgment prediction models for tablet applications. In: 10. Berliner Werkstatt, pp 198–204
GNU general public license. http://www.gnu.org/licenses/gpl.html
Grice HP (1975) Logic and conversation. Syntax Semant 3:41–58
Johnston M (2009) EMMA: extensible multimodal annotation markup language. W3C recommendation, W3C (2009) http://www.w3.org/TR/2009/REC-emma-20090210/
Jöst M, Häußler J, Merdes M, Malaka R (2005) Multimodal interaction for pedestrians: an evaluation study. In: Amant RS, Riedl J, Jameson A (eds) Proceedings of the 10th international conference on intelligent user interfaces. ACM, New York, pp 59–66
Jouault F, Allilaire F, Bézivin J, Kurtev I (2008) ATL: a model transformation tool. Sci Comput Program 72(1–2):31–39
Kranstedt A, Kopp S, Wachsmuth I (2002) MURML: a multimodal utterance representation markup language for conversational agents. In: Proceedings of AAMAS02 workshop on embodied conversational agents—let’s specify and evaluate them
Kühnel C, Weiss B, Möller S (2010) Parameters describing multimodal interaction—definitions and three usage scenarios. In: Kobayashi T, Hirose K, Nakamura S (eds) Proceedings of the 11th annual conference of the ISCA (Interspeech 2010). ISCA, Makuhari, pp 2014–2017
Larson JA, Raggett D, Raman TV (2003) W3C multimodal interaction framework. W3C note, W3C (2003). http://www.w3.org/TR/2003/NOTE-mmi-framework-20030506/
Leech G, Wilson A (1996) EAGLES recommendations for the morphosyntactic annotation of corpora. http://www.ilc.cnr.it/EAGLES96/annotate/annotate.html
Lemmelä S, Vetek A, Mäkelä K, Trendafilov D (2008) Designing and evaluating multimodal interaction for mobile contexts. In: Digalakis V, Potamianos A, Turk M, Pieraccini R, Ivanov Y (eds) Proceedings of the 10th international conference on multimodal interfaces. ACM, New York, pp 265–272
Limbourg Q, Vanderdonckt J, Michotte B, Bouillon L, López-Jaquero V (2005) USIXML: a language supporting multi-path development of user interfaces. In: Bastide R, Palanque P, Roth J (eds) Engineering human computer interaction and interactive systems. Lecture Notes in Computer Science, vol 3425, chap. 12. Springer, Berlin, pp 134–135. doi:10.1007/11431879_12
Malhotra A, Biron PV (2004) XML schema part 2: datatypes second edition. W3C recommendation, W3C. http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/
Manca M, Paternó F (2010) Supporting multimodality in service-oriented model-based development environments. In: Bernhaupt R, Forbrig P, Gulliksen J, Lárusdóttir M (eds) HCSE. Lecture Notes in Computer Science, vol 6409. Springer, Berlin, pp 135–148. http://dblp.uni-trier.de/db/conf/hcse/hcse2010.html#MancaP10
Martin JC, Kipp M (2002) Annotating and measuring multimodal behaviour—tycoon metrics in the anvil tool. In: LREC. European Language Resources Association. http://dblp.uni-trier.de/db/conf/lrec/lrec2002.html#MartinK02
Mateo P (2012) Android HCI extractor and the MIM project: integration and usage tutorial. http://www.catedrasaes.org/wiki/MIM. Accessed 04 Nov 2013
Mateo P, Hillmann S (2012) Android HCI Extractor. http://code.google.com/p/android-hci-extractor. Accessed 04 Nov 2013
Mateo P, Hillmann S (2013) Instantiation framework for the PALADIN interaction model. https://github.com/pedromateo/paladin_instantiation. Accessed 04 Nov 2013
Mateo P, Hillmann S (2013) PALADIN: a run-time model for automatic evaluation of multimodal interfaces. https://github.com/pedromateo/paladin. Accessed 04 Nov 2013
Mateo Navarro PL, Martínez Pérez G, Sevilla Ruiz D (2014) A context-aware interaction model for the analysis of users' QoE in mobile environments. Int J Hum Comput Interact, Taylor & Francis (in press)
Möller S (2005) Parameters describing the interaction with spoken dialogue systems. ITU-T Recommendation Supplement 24 to P-Series, International Telecommunication Union, Geneva, Switzerland. Based on ITU-T Contr. COM 12–17 (2009)
Möller S (2005) Quality of telephone-based spoken dialogue systems. Springer, New York
Möller S (2011) Parameters describing the interaction with multimodal dialogue systems. ITU-T Recommendation Supplement 25 to P-Series Rec., International Telecommunication Union, Geneva, Switzerland
Nigay L, Coutaz J (1993) A design space for multimodal systems: concurrent processing and data fusion. In: Ashlund S, Mullet K, Henderson A, Hollnagel E, White TN (eds) Proceedings of INTERACT ’93 and CHI ’93 conference on human factors in computing systems. ACM, New York, pp 172–178
Olmedo-Rodríguez H, Escudero-Mancebo D, Cardeñoso Payo V (2009) Evaluation proposal of a framework for the integration of multimodal interaction in 3D worlds. In: Proceedings of the 13th international conference on human–computer interaction. Part II: Novel Interaction Methods and Techniques. Springer, Berlin, pp 84–92
Oshry M, Baggia P, Rehor K, Young M, Akolkar R, Yang X, Barnett J, Hosn R, Auburn R, Carter J, McGlashan S, Bodell M, Burnett DC (2009) Voice extensible markup language (VoiceXML) 3.0. W3C working draft, W3C. http://www.w3.org/TR/2009/WD-voicexml30-20091203/
Oviatt S (1999) Ten myths of multimodal interaction. Commun ACM 42:74–81
Oviatt S (2003) Advances in robust multimodal interface design. IEEE Comput Graph Appl 23:62–68
Palanque PA, Schyn A (2003) A model-based approach for engineering multimodal interactive systems. In: Rauterberg M, Menozzi M, Wesson J (eds) INTERACT’03. IOS Press, Amsterdam, pp 543–550
Paternò F, Santoro C, Spano LD (2009) MARIA: a universal, declarative, multiple abstraction-level language for service-oriented applications in ubiquitous environments. ACM Trans Comput Hum Interact 16(4):1–30. doi:10.1145/1614390.1614394
Pelachaud C (2005) Multimodal expressive embodied conversational agents. In: Zhang H, Chua TS, Steinmetz R, Kankanhalli MS, Wilcox L (eds) ACM multimedia. ACM, New York, pp 683–689
Perakakis M, Potamianos A (2007) The effect of input mode on inactivity and interaction times of multimodal systems. In: Massaro DW, Takeda K, Roy D, Potamianos A (eds) Proceedings of the 9th international conference on multimodal interfaces (ICMI 2007). ACM, New York, pp 102–109
Perakakis M, Potamianos A (2008) Multimodal system evaluation using modality efficiency and synergy metrics. In: Proceedings of the 10th international conference on multimodal interfaces (ICMI’08). ACM, New York, pp 9–16
Schatzmann J, Georgila K, Young S (2005) Quantitative evaluation of user simulation techniques for spoken dialogue systems. In: Dybkjær L, Minker W (eds) Proceedings of the 6th SIGdial workshop on discourse and dialogue. Special Interest Group on Discourse and Dialogue (SIGdial), Association for Computational Linguistics (ACL), pp 45–54
Schatzmann J, Young S (2009) The hidden agenda user simulation model. IEEE Trans Audio Speech Lang Process 17(4):733–747
Schmidt S, Engelbrecht KP, Schulz M, Meister M, Stubbe J, Töppel M, Möller S (2010) Identification of interactivity sequences in interactions with spoken dialog systems. In: Proceedings of the 3rd international workshop on perceptual quality of systems. Chair of Communication Acoustics, TU Dresden, pp 109–114
Serrano M, Nigay L (2010) A wizard of Oz component-based approach for rapidly prototyping and testing input multimodal interfaces. J Multimodal User Interfaces 3(3):215–225. doi:10.1007/s12193-010-0042-4
Serrano M, Nigay L, Demumieux R, Descos J, Losquin P (2006) Multimodal interaction on mobile phones: development and evaluation using ACICARE. In: Nieminen M, Röykkee M (eds) MobileHCI ’06: Proceedings of the 8th conference on human–computer interaction with mobile devices and services. ACM, New York, pp 129–136
Sonntag D (2012) Collaborative multimodality. KI 26(2):161–168 http://dblp.uni-trier.de/db/journals/ki/ki26.html#Sonntag12
Steinberg D, Budinsky F, Paternostro M, Merks E (2009) EMF: Eclipse Modeling Framework, 2nd edn. Addison-Wesley, Upper Saddle River
Sturm J, Bakx I, Cranen B, Terken J, Wang F (2002) Usability evaluation of a Dutch multimodal system for train timetable information. In: Rodriguez MG, Araujo CS (eds) Proceedings of LREC 2002. 3rd International conference on language resources and evaluation, pp 255–261
Sutcliffe A (2008) Multimedia user interface design, chap. 20. Lawrence Erlbaum Associates, New Jersey, pp 393–410
Thompson HS, Maloney M, Beech D, Mendelsohn N (2004) XML schema part 1: Structures second edition. W3C recommendation, W3C. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/
Vanacken D, Boeck JD, Raymaekers C, Coninx K (2006) NIMMIT: a notation for modeling multimodal interaction techniques. In: Braz J, Jorge JA, Dias M, Marcos A (eds) GRAPP. INSTICC—Institute for Systems and Technologies of Information, Control and Communication, pp 224–231. http://dblp.uni-trier.de/db/conf/grapp/grapp2006.html#VanackenBRC06
Walker M, Litman D, Kamm C, Abella A (1997) PARADISE: a framework for evaluating spoken dialogue agents. In: Proceedings of the 35th annual meeting of the association for computational linguistics, ACL 97, pp 262–270
Wechsung I, Engelbrecht KP, Kühnel C, Möller S, Weiss B (2012) Measuring the quality of service and quality of experience of multimodal human–machine interaction. J Multimodal User Interfaces 6:73–85
Acknowledgments
This work has been supported by the Cátedra SAES (http://www.catedrasaes.org), a private initiative of the University of Murcia (http://www.um.es) and SAES (Sociedad Anónima de Electrónica Submarina) (http://www.electronica-submarina.com), as well as by the Telekom Innovation Laboratories (http://www.laboratories.telekom.com) within Technische Universität Berlin (http://www.tu-berlin.de).
Appendices
Appendix A: Translations
Appendix B: Guidelines on features of multimodal description languages
See Table 10.
Appendix C: Parameters used in the model
The tables in this section give an overview of all parameters that are modified or newly introduced in PALADIN compared to ITU-T Suppl. 25 to P-Series Rec. [42]. Table 12 explains the abbreviations used in the subsequent tables. Furthermore, Table 11 provides an index listing each parameter (by its abbreviation) and the table or reference describing it.
See Tables 11, 12, 13, 14, 15 and 16.
Cite this article
Mateo Navarro, P.L., Hillmann, S., Möller, S. et al. Run-time model based framework for automatic evaluation of multimodal interfaces. J Multimodal User Interfaces 8, 399–427 (2014). https://doi.org/10.1007/s12193-014-0170-3