ABSTRACT
Sound recognition tools have wide-ranging impacts for deaf and hard of hearing (DHH) people, from conveying safety-critical information (e.g., fire alarms, sirens) to more mundane but still useful cues (e.g., door knocks, microwave beeps). However, prior sound recognition systems use models that are pre-trained on generic sound datasets and do not adapt well to the diverse variations of real-world sounds. We introduce AdaptiveSound, a real-time system for portable devices (e.g., smartphones) that allows DHH users to provide corrective feedback to the sound recognition model, adapting it to diverse acoustic environments. AdaptiveSound is informed by prior surveys of sound recognition systems, in which DHH users strongly desired the ability to give feedback to a pre-trained sound recognition model to fine-tune it to their environments. Through quantitative experiments and field evaluations with 12 DHH users, we show that AdaptiveSound achieves significantly higher accuracy (+14.6%) than prior state-of-the-art systems in diverse real-world locations (e.g., homes, parks, streets, and malls) with little end-user effort (about 10 minutes of feedback).
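The corrective-feedback loop the abstract describes can be sketched in miniature: a common way to adapt a pre-trained classifier on-device is to freeze the feature extractor and take small gradient steps on a lightweight classification head whenever the user corrects a label. The sketch below is illustrative only and is not the paper's actual architecture; the class name `AdaptableHead`, the feature dimensionality, and the learning rate are all assumptions.

```python
import math

def softmax(z):
    # Numerically stable softmax over raw class scores.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class AdaptableHead:
    """Hypothetical linear head fine-tuned online from user corrections.

    The pre-trained model is treated as a frozen feature extractor;
    only these weights change in response to feedback.
    """

    def __init__(self, n_features, n_classes, lr=0.1):
        self.w = [[0.0] * n_features for _ in range(n_classes)]
        self.b = [0.0] * n_classes
        self.lr = lr

    def predict_proba(self, x):
        z = [sum(wi * xi for wi, xi in zip(row, x)) + b
             for row, b in zip(self.w, self.b)]
        return softmax(z)

    def feedback(self, x, correct_class):
        # One cross-entropy gradient step on a single corrected example:
        # the probability assigned to the user's label is pushed up,
        # the others are pushed down.
        p = self.predict_proba(x)
        for c in range(len(self.w)):
            err = p[c] - (1.0 if c == correct_class else 0.0)
            self.b[c] -= self.lr * err
            for j in range(len(x)):
                self.w[c][j] -= self.lr * err * x[j]

# Example: a few corrections on a misrecognized sound shift the head
# toward the user's label (features here are made up).
head = AdaptableHead(n_features=2, n_classes=3, lr=0.5)
for _ in range(10):
    head.feedback([1.0, 0.0], correct_class=0)
```

In a real mobile system the per-example update would typically be batched or regularized to avoid catastrophic forgetting of the pre-trained classes; this sketch omits that for brevity.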
AdaptiveSound: An Interactive Feedback-Loop System to Improve Sound Recognition for Deaf and Hard of Hearing Users