DOI: 10.1145/3411109.3411113
Research article

Voice-based interface for accessible soundscape composition: composing soundscapes by vocally querying online sounds repositories

Published: 16 September 2020

Abstract

This paper presents an Internet of Audio Things ecosystem devised to support soundscape composition via vocal interactions. The ecosystem involves a commercial voice-based interface and Freesound.org, a cloud-based repository of audio content. User-system interactions are based exclusively on vocal inputs and outputs, and differ from the conventional methods for sound retrieval and editing, which involve a browser and programs running on a desktop PC. The ecosystem targets sound designers interested in soundscape composition, in particular visually impaired ones, with the aim of making the practice of soundscape composition more accessible. We report the results of a user study conducted with twelve participants. Overall, the results show that the interface was found usable and was deemed easy to use and to learn. Participants reported enjoying the system and generally felt that it effectively supported their creativity during the process of composing a soundscape.
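In the workflow the abstract describes, a spoken request is ultimately resolved into a text search against the Freesound repository. As a rough illustration only, and not the paper's actual implementation, the sketch below builds a text-search request against Freesound's public API v2; the query term and API key are placeholders.

```python
from urllib.parse import urlencode

# Freesound's public text-search endpoint (API v2).
FREESOUND_SEARCH = "https://freesound.org/apiv2/search/text/"

def build_search_url(query: str, api_key: str, page_size: int = 5) -> str:
    """Return a Freesound text-search URL for a transcribed vocal query."""
    params = urlencode({
        "query": query,                # e.g. the transcribed speech "rain"
        "page_size": page_size,        # keep the result list short for audition
        "fields": "id,name,previews",  # request only what playback needs
        "token": api_key,              # placeholder: a real API key is required
    })
    return f"{FREESOUND_SEARCH}?{params}"

url = build_search_url("rain", "YOUR_API_KEY")
```

Issuing the resulting URL as an HTTP GET would return a JSON page of matching sounds, whose preview links could then be played back to the user.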




Published In

AM '20: Proceedings of the 15th International Audio Mostly Conference
September 2020
281 pages
ISBN:9781450375634
DOI:10.1145/3411109

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. conversational AI
  2. freesound
  3. internet of audio things
  4. online sound repository
  5. voice assistant


Conference

AM '20: Audio Mostly 2020
September 15–17, 2020
Graz, Austria

Acceptance Rates

AM '20 Paper Acceptance Rate 29 of 47 submissions, 62%;
Overall Acceptance Rate 177 of 275 submissions, 64%

Article Metrics

  • Downloads (last 12 months): 8
  • Downloads (last 6 weeks): 0
Reflects downloads up to 13 Feb 2025


Cited By

  • (2023) The Internet of Sounds: Convergent Trends, Insights, and Future Directions. IEEE Internet of Things Journal 10(13), 11264–11292. DOI: 10.1109/JIOT.2023.3253602
  • (2023) "Give me happy pop songs in C major and with a fast tempo": A vocal assistant for content-based queries to online music repositories. International Journal of Human-Computer Studies 173, 103007. DOI: 10.1016/j.ijhcs.2023.103007
  • (2023) LyricJam Sonic: A Generative System for Real-Time Composition and Musical Improvisation. In Artificial Intelligence in Music, Sound, Art and Design, 292–307. DOI: 10.1007/978-3-031-29956-8_19
  • (2022) FreesoundVR: soundscape composition in virtual reality using online sound repositories. Virtual Reality 27(2), 903–915. DOI: 10.1007/s10055-022-00705-8