DOI: 10.1145/3597638.3608390

AdaptiveSound: An Interactive Feedback-Loop System to Improve Sound Recognition for Deaf and Hard of Hearing Users

Published: 22 October 2023

ABSTRACT

Sound recognition tools have wide-ranging impacts on deaf and hard of hearing (DHH) people, from conveying safety-critical information (e.g., fire alarms, sirens) to more mundane but still useful information (e.g., door knocks, microwave beeps). However, prior sound recognition systems use models that are pre-trained on generic sound datasets and do not adapt well to diverse variations of real-world sounds. We introduce AdaptiveSound, a real-time system for portable devices (e.g., smartphones) that allows DHH users to provide corrective feedback to the sound recognition model, adapting it to diverse acoustic environments. AdaptiveSound is informed by prior surveys of sound recognition systems, in which DHH users strongly desired the ability to provide feedback to a pre-trained sound recognition model to fine-tune it to their environments. Through quantitative experiments and field evaluations with 12 DHH users, we show that AdaptiveSound can achieve significantly higher accuracy (+14.6%) than prior state-of-the-art systems in diverse real-world locations (e.g., homes, parks, streets, and malls) with little end-user effort (about 10 minutes of feedback).
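
The paper itself provides no code, but the feedback-loop idea the abstract describes — a pre-trained model adapted on-device from a small amount of user-confirmed labels — can be sketched as a frozen feature extractor plus a lightweight classification head that is fine-tuned on corrections. Everything below is an illustrative stand-in, not the authors' implementation: the random "backbone", the `fine_tune_on_feedback` helper, and the simulated data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_CLASSES = 3   # e.g., door_knock, fire_alarm, microwave_beep (illustrative)
EMBED_DIM = 8     # hypothetical embedding size from a frozen backbone

# Frozen "backbone": stubbed here as a fixed random projection. In a real
# system this would be a pre-trained audio CNN whose weights stay fixed.
W_backbone = rng.normal(size=(16, EMBED_DIM))

def embed(x):
    return np.tanh(x @ W_backbone)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Trainable classification head: the only part user feedback updates.
W_head = rng.normal(scale=0.1, size=(EMBED_DIM, NUM_CLASSES))

def predict(x):
    return softmax(embed(x) @ W_head)

def fine_tune_on_feedback(samples, labels, lr=0.3, steps=300):
    """One round of corrective feedback: user-confirmed (sample, label)
    pairs update the head via cross-entropy gradient descent."""
    global W_head
    onehot = np.eye(NUM_CLASSES)[labels]
    z = embed(samples)  # embeddings are computed once; backbone is frozen
    for _ in range(steps):
        probs = softmax(z @ W_head)
        grad = z.T @ (probs - onehot) / len(labels)
        W_head -= lr * grad

# Simulated environment-specific sounds whose labels are learnable from the
# embeddings (stand-ins for sounds the generic model initially gets wrong).
W_true = rng.normal(size=(EMBED_DIM, NUM_CLASSES))
X = rng.normal(size=(30, 16))
y = (embed(X) @ W_true).argmax(axis=1)

before = (predict(X).argmax(axis=1) == y).mean()
fine_tune_on_feedback(X, y)
after = (predict(X).argmax(axis=1) == y).mean()
print(f"accuracy before feedback: {before:.2f}, after: {after:.2f}")
```

Training only a small head (rather than the full network) is one plausible way to keep the adaptation cheap enough for a phone and for the roughly ten minutes of user feedback the abstract reports; whether AdaptiveSound uses exactly this split is not stated in the abstract.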


Published in:
ASSETS '23: Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility
October 2023, 1163 pages
ISBN: 9798400702204
DOI: 10.1145/3597638
Copyright © 2023 ACM
Publisher: Association for Computing Machinery, New York, NY, United States



          Qualifiers

          • research-article
          • Research
          • Refereed limited

Acceptance Rates

ASSETS '23 paper acceptance rate: 55 of 182 submissions (30%). Overall acceptance rate: 436 of 1,556 submissions (28%).
