ABSTRACT
Sensing movements and gestures inside the oral cavity has been a long-standing challenge for the wearable research community. This paper introduces EchoNose, a novel nose interface that recognizes mouth, breathing, and tongue gestures by analyzing acoustic signal reflections inside the nasal and oral cavities. The interface places a speaker and a microphone at the nostrils, emitting inaudible acoustic signals and capturing the corresponding reflections. The received signals are processed by a customized data processing and machine learning pipeline that distinguishes 16 gestures involving speech, tongue movement, and breathing. A user study with 10 participants shows that EchoNose recognizes these 16 gestures with an average accuracy of 93.7%. Based on these promising results, we discuss the opportunities and challenges of applying this nose interface in future applications.
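The abstract does not detail the signal design, but the active acoustic sensing approach it describes can be sketched as follows: transmit a near-inaudible sweep from the speaker, then cross-correlate each received microphone frame with the transmitted template to obtain an echo profile whose peaks correspond to reflections at different time-of-flight delays. The sample rate, sweep band, frame length, and the `make_template`/`echo_profile` helpers below are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the active acoustic sensing idea described in the
# abstract: emit a near-inaudible chirp, then cross-correlate the received
# microphone signal with the transmitted template to obtain an "echo
# profile" of nearby reflections. All parameters here (sample rate, sweep
# band, frame length) are assumptions for illustration only.
import numpy as np
from scipy.signal import chirp, correlate

FS = 48_000              # sample rate (Hz), assumed
FRAME = 600              # samples per transmitted frame (12.5 ms), assumed
F0, F1 = 17_000, 20_000  # near-inaudible sweep band (Hz), assumed

def make_template() -> np.ndarray:
    """Linear frequency sweep used as the transmitted signal."""
    t = np.arange(FRAME) / FS
    return chirp(t, f0=F0, t1=t[-1], f1=F1)

def echo_profile(received: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Cross-correlate one received frame with the template.

    Peaks in the result correspond to reflections at different
    time-of-flight delays, i.e., different distances from the nostrils.
    """
    return np.abs(correlate(received, template, mode="valid"))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tmpl = make_template()
    # Simulate a received frame: one attenuated, delayed echo plus noise.
    frame = np.zeros(2 * FRAME)
    frame[40:40 + FRAME] += 0.3 * tmpl   # echo delayed by 40 samples
    frame += 0.01 * rng.standard_normal(frame.size)
    profile = echo_profile(frame, tmpl)
    print("strongest reflection at delay (samples):", int(np.argmax(profile)))
```

Stacking consecutive echo profiles over time yields a 2D representation in which mouth, tongue, and breathing movements appear as changing reflection patterns; a classifier trained on such representations is one plausible realization of the gesture-recognition pipeline the abstract mentions.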