A Human-AI Collaborative Approach for Designing Sound Awareness Systems

Authors:

Jeremy Zhengqi Huang,

Reyna Wood,

Hriday Chhabria,

Dhruv Jain

CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

Article No.: 884, Pages 1 - 11

https://doi.org/10.1145/3613904.3642062

Published: 11 May 2024

Abstract

Current sound recognition systems for deaf and hard of hearing (DHH) people identify sound sources or discrete events. However, these systems do not distinguish between similar-sounding events (e.g., a patient monitor beep vs. a microwave beep). In this paper, we introduce HACS, a novel futuristic approach to designing human-AI sound awareness systems. HACS assigns AI models to identify sounds based on their characteristics (e.g., a beep) and prompts DHH users to combine this information with their contextual knowledge (e.g., “I am in a kitchen”) to recognize sound events (e.g., a microwave). As a first step toward implementing HACS, we articulated a sound taxonomy that classifies sounds based on sound characteristics, using insights from a multi-phased research process with people of mixed hearing abilities. We then performed a qualitative evaluation (with 9 DHH people) and a quantitative evaluation (with a sound recognition model). Findings demonstrate the initial promise of HACS for designing accurate and reliable human-AI systems.

Supplemental Material

MP4 File - Video Presentation

ZIP File - Study Materials

A compressed folder containing:
- Formative-Codebook.pdf: The final codebook for the formative study
- Interview Protocol - Formative.pdf: The interview protocol we used for the formative study
- PE1-Codebook.pdf: The final codebook for the Preliminary Evaluation 1
- Interview Protocol - Formative.pdf: The interview protocol we used for the Preliminary Evaluation 1
