poster

Audio cloud: creation and rendering

Authors:
Jitendra Ajmera, IBM Research, New Delhi, Delhi, India
Om D Deshmukh, IBM Research, New Delhi, Delhi, India
Anupam Jain, IBM Research Lab, New Delhi, Delhi, India
Amit Anil Nanavati, IBM Research, New Delhi, Delhi, India
Nitendra Rajput, IBM Research, New Delhi, Delhi, India
Saurabh Srivastava, IBM Research, New Delhi, Delhi, India

IUI '12: Proceedings of the 2012 ACM international conference on Intelligent User Interfaces, February 2012, Pages 277–280
https://doi.org/10.1145/2166966.2167017

Published: 14 February 2012

ABSTRACT

Word clouds are extensively used to present a summary of the prominent words in a document on the World Wide Web. Such clouds give the user an idea about the content of the document. In this paper we present a mechanism to create and render an audio cloud for audio content. Such audio clouds are expected to provide a similar summary of the audio documents. They have wide applicability in various domains, especially for low-literate users who currently do not use the Internet but interact with audio-based systems.

Detecting words in audio content is challenging, especially when the audio is in a language for which no speech recognition system exists. We present a language-independent mechanism to detect frequently occurring words within an audio document. We then present four ways to render these words as an audio cloud. The four rendering prototypes vary the amplitude, the voice quality, the echo, and the repetition of the audio words. An evaluation study with 32 users suggests that both literate and low-literate users easily understand the concept of an audio cloud.
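
As a rough illustration of three of the four rendering cues named in the abstract (amplitude, echo, and repetition), the sketch below maps a word's prominence to an audible property. It is not the authors' implementation: the function names, the `weight` parameter (prominence scaled to [0, 1]), the gain and decay mappings, and the representation of audio as a plain list of samples are all assumptions made for this example. Voice quality is left out because it would require a voice transformation or synthesis step that the abstract does not describe.

```python
from typing import List

Samples = List[float]  # mono audio samples in [-1.0, 1.0] (assumed representation)


def _clip(x: float) -> float:
    """Keep a sample inside the valid [-1.0, 1.0] range."""
    return max(-1.0, min(1.0, x))


def render_by_amplitude(word: Samples, weight: float, min_gain: float = 0.25) -> Samples:
    """Play more prominent words louder (hypothetical linear gain mapping)."""
    gain = min_gain + (1.0 - min_gain) * weight
    return [_clip(s * gain) for s in word]


def render_by_echo(word: Samples, weight: float, delay: int = 4) -> Samples:
    """Mix in one delayed copy whose strength grows with prominence."""
    decay = 0.5 * weight  # hypothetical mapping: at most a half-strength echo
    out = word + [0.0] * delay  # extend the buffer so the echo tail fits
    for i, s in enumerate(word):
        out[i + delay] += decay * s
    return [_clip(s) for s in out]


def render_by_repetition(word: Samples, weight: float,
                         max_repeats: int = 3, gap: int = 2) -> Samples:
    """Repeat more prominent words more often, separated by short silences."""
    repeats = 1 + round(weight * (max_repeats - 1))
    out: Samples = []
    for k in range(repeats):
        if k > 0:
            out.extend([0.0] * gap)  # silence between repetitions
        out.extend(word)
    return out
```

In this sketch a weight of 0 leaves the word quiet, dry, and played once, while a weight of 1 plays it at full gain, with the strongest echo, or `max_repeats` times. Real audio would use a delay and gap measured in thousands of samples rather than the tiny defaults chosen here for readability.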

Index Terms

Audio cloud: creation and rendering
1. Human-centered computing
  1. Human computer interaction (HCI)

Published in
IUI '12: Proceedings of the 2012 ACM international conference on Intelligent User Interfaces
February 2012
436 pages
ISBN:9781450310482
DOI:10.1145/2166966
General Chairs:
Carlos Duarte, University of Lisbon, Portugal
Luís Carriço, University of Lisbon, Portugal
Program Chairs:
Joaquim Jorge, INESC-ID, Portugal
Sharon Oviatt, Incaa Designs, USA
Daniel Gonçalves, INESC-ID, Portugal
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
audio cloud
language independent
low-literate
Qualifiers
- poster
Acceptance Rates
Overall Acceptance Rate: 746 of 2,811 submissions, 27%