
MIMOSE: multimodal interaction for music orchestration sheet editors

An integrable multimodal music editor interaction system

Published in: Multimedia Tools and Applications

Abstract

The increasing number and accuracy of sensors devoted to human-computer input are supporting the emergence of novel multimodal interaction paradigms. These, in turn, unlock additional strategies for designing innovative, user-friendly systems. The underlying approaches to user-computer interaction leverage natural communication channels (e.g., gestures and voice) and are therefore often less cumbersome than traditional interface modalities. This paper proposes a wrapper-based strategy to easily map keyboard shortcuts onto multimodal actions. The presented case study is a music editor. Such applications are often overwhelming for novice users, and therefore discourage interaction. MIMOSE (Multimodal Interaction for Music Orchestration Sheet Editors) addresses these limitations. Instead of relying on buttons and mixture pads for composing a musical piece, it provides a gesture- and voice-based multimodal wrapper for music editor applications. The user assumes the role of an orchestra conductor: the wrapper translates user gestures and music-jargon keywords into mouse clicks or key presses, substituting keyboard shortcuts with multimodal actions. This provides an ecologically tuned, immersive interaction environment. It is worth noting that the wrapped application need not be open source; events already captured by that application are simply delivered over channels other than the keyboard and mouse, triggered by multimodal actions instead of key presses. After presenting the features of the wrapper, we describe its application to an open source music editing tool and present twofold evaluation results. We separately evaluated the performance of each interaction modality in terms of accuracy and F1 score. Furthermore, we asked real users to evaluate the usability of the application extended by the wrapper.
The user evaluation relies on ad hoc tailored QUIS and SUXES questionnaires to assess the user-friendliness of the resulting application. The results are encouraging from both the technical quality and usability points of view. The wrapper at the core of MIMOSE can be adapted to other kinds of applications with minimal coding effort.
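The shortcut-substitution idea at the core of the wrapper can be sketched as a small dispatch table that maps recognized multimodal actions onto the wrapped editor's keyboard shortcuts. The bindings and names below are purely illustrative (a minimal sketch; MIMOSE's actual vocabulary and the editor's real shortcut set are not reproduced here):

```python
# Illustrative sketch of a multimodal-to-shortcut wrapper.
# All bindings and function names are hypothetical examples,
# not taken from MIMOSE or from any specific editor.

# Dispatch table: (modality, recognized token) -> keyboard shortcut.
MULTIMODAL_TO_SHORTCUT = {
    ("voice", "quarter"): "5",                 # e.g. select quarter-note duration
    ("voice", "eighth"): "4",                  # e.g. select eighth-note duration
    ("gesture", "swipe_right"): "ctrl+right",  # e.g. move to next measure
    ("gesture", "circle"): "ctrl+z",           # e.g. undo
}

def translate(modality, token):
    """Return the keyboard shortcut bound to a recognized multimodal
    action, or None if the action has no binding."""
    return MULTIMODAL_TO_SHORTCUT.get((modality, token))

def dispatch(modality, token, send_keys):
    """Translate an action and forward the resulting shortcut to a
    key-injection backend (platform-specific in practice; any callable
    accepting a shortcut string works here). Returns True if an event
    was injected."""
    shortcut = translate(modality, token)
    if shortcut is None:
        return False
    send_keys(shortcut)
    return True

# Usage with a stub backend that records injected shortcuts:
sent = []
dispatch("voice", "quarter", sent.append)
dispatch("gesture", "circle", sent.append)
print(sent)  # ['5', 'ctrl+z']
```

Because the wrapper only synthesizes the same key events the editor already handles, the wrapped application needs no modification and no access to its source code.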





Author information


Corresponding author

Correspondence to Maria De Marsico.



About this article


Cite this article

Coletta, A., De Marsico, M., Panizzi, E. et al. MIMOSE: multimodal interaction for music orchestration sheet editors. Multimed Tools Appl 78, 33041–33068 (2019). https://doi.org/10.1007/s11042-019-07838-0

