Abstract
This paper proposes a multimodal interaction API (MMI-API) and an accompanying library for developing web-based multimodal applications. With them, a developer can embed synchronized inputs and outputs across multiple modalities into an application and specify concrete speech inputs and outputs as well as the actions of dialogue agents. Because the API and library are provided for JavaScript, a widely used web-development language, they run in general-purpose web browsers without special add-ons. Users can therefore experience multimodal interaction simply by visiting a web site in their browser. In addition to outlining the API and library, we present a practical example of the multimodal interaction system: an English pronunciation training application for Japanese students.
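To give a rough sense of what synchronizing several inputs and outputs in plain JavaScript can involve, the following is a minimal sketch built only on standard Promises. It is not the MMI-API: every name here (`makeOutput`, `parallel`, `sequential`, `firstInput`, `demo`) is hypothetical, and timers stand in for real speech synthesis, speech recognition, and agent rendering.

```javascript
// Hypothetical sketch: none of these names come from the MMI-API itself.

// Wrap an output action (e.g. speech or an agent gesture) as an async task.
// The timer is a stand-in for a real renderer or synthesizer.
function makeOutput(modality, content, durationMs, log) {
  return () =>
    new Promise((resolve) => {
      log.push(`start ${modality}: ${content}`);
      setTimeout(() => {
        log.push(`end ${modality}: ${content}`);
        resolve({ modality, content });
      }, durationMs);
    });
}

// Start all outputs together and finish when every one has finished.
function parallel(...tasks) {
  return () => Promise.all(tasks.map((task) => task()));
}

// Run outputs one after another, in the order given.
function sequential(...tasks) {
  return async () => {
    const results = [];
    for (const task of tasks) {
      results.push(await task());
    }
    return results;
  };
}

// Accept whichever input modality responds first.
function firstInput(...sources) {
  return Promise.race(sources.map((source) => source()));
}

// Simulated input source that "recognizes" a value after a delay.
function makeInput(modality, value, delayMs) {
  return () =>
    new Promise((resolve) => {
      setTimeout(() => resolve({ modality, value }), delayMs);
    });
}

async function demo() {
  const log = [];
  // The agent waves while the greeting is spoken, then a prompt follows.
  const greet = parallel(
    makeOutput("speech", "Hello", 20, log),
    makeOutput("agent", "wave", 10, log)
  );
  const prompt = makeOutput("speech", "Please say a word", 5, log);
  await sequential(greet, prompt)();

  // Speech and pointing compete; the earlier response is used.
  const input = await firstInput(
    makeInput("speech", "apple", 5),
    makeInput("pointing", "button-1", 50)
  );
  return { log, input };
}
```

Calling `demo()` resolves to the ordered event log and the input that arrived first. Composing small task functions keeps the parallel and sequential timing explicit, which is one plausible way to express synchronization without browser add-ons; the actual library may be organized quite differently.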
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Katsurada, K., Kikuchi, T., Iribe, Y., Nitta, T. (2012). Proposal of MMI-API and Library for JavaScript. In: Watanabe, T., Watada, J., Takahashi, N., Howlett, R., Jain, L. (eds) Intelligent Interactive Multimedia: Systems and Services. Smart Innovation, Systems and Technologies, vol 14. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29934-6_49
DOI: https://doi.org/10.1007/978-3-642-29934-6_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29933-9
Online ISBN: 978-3-642-29934-6
eBook Packages: Engineering, Engineering (R0)