DOI: 10.1145/2556288.2557000
research-article
Open Access

The boomRoom: mid-air direct interaction with virtual sound sources

Published: 26 April 2014

ABSTRACT

In this paper we present a system that allows users to "touch", grab, and manipulate sounds in mid-air. Further, arbitrary objects can seem to emit sound. We use spatial sound reproduction for sound rendering and computer vision for tracking. Using our approach, sounds can be heard from anywhere in the room and always appear to originate from the same (possibly moving) position, regardless of the listener's position. We demonstrate that direct "touch" interaction with sound is an interesting alternative to indirect interaction mediated through controllers or visual interfaces. We show that sound localization is surprisingly accurate (11.5 cm), even in the presence of distractors. We propose to leverage the ventriloquist effect to further increase localization accuracy. Finally, we demonstrate how affordances of real objects can create synergies of auditory and visual feedback. As an application of the system, we built a spatial music mixing room.
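The "touch" and grab interaction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `SoundSource`, `touched_source`, and `drag` are hypothetical, positions are assumed to be room coordinates in metres supplied by some tracker, and the default radius of 0.115 m simply borrows the reported 11.5 cm localization figure as a plausible threshold rather than a documented system parameter.

```python
import math
from dataclasses import dataclass


@dataclass
class SoundSource:
    """A virtual sound source at a room position in metres (hypothetical type)."""
    name: str
    x: float
    y: float
    z: float


def distance(source, hand):
    """Euclidean distance between a source and a tracked hand position (x, y, z)."""
    return math.dist((source.x, source.y, source.z), hand)


def touched_source(sources, hand, radius=0.115):
    """Return the nearest source within `radius` metres of the hand, or None.

    The radius is an assumption for illustration, not a value from the paper's system.
    """
    nearest = min(sources, key=lambda s: distance(s, hand), default=None)
    if nearest is not None and distance(nearest, hand) <= radius:
        return nearest
    return None


def drag(source, hand):
    """Move a grabbed source so that it follows the hand position."""
    source.x, source.y, source.z = hand


# Example: a hand 5 cm from the first source "touches" it and then drags it.
sources = [SoundSource("drums", 1.0, 1.5, 2.0), SoundSource("vocals", 2.5, 1.2, 0.8)]
hit = touched_source(sources, (1.05, 1.5, 2.0))
if hit is not None:
    drag(hit, (1.4, 1.6, 2.0))
```

A real system would feed the updated source positions to a spatial sound renderer each frame; that step is omitted here because the abstract does not specify the interface.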


Published in

CHI '14: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
April 2014
4206 pages
ISBN: 9781450324731
DOI: 10.1145/2556288

Copyright © 2014 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

CHI '14 Paper Acceptance Rate: 465 of 2,043 submissions, 23%
Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
