DOI: 10.1145/3458709.3458985

SilentMask: Mask-type Silent Speech Interface with Measurement of Mouth Movement

Published: 11 July 2021

Abstract

Silent Speech Interaction (SSI) enables speech-based input without vocalization, serving both as an input method for speech recognition devices such as smartphones and as a support tool for people with speech difficulties. Conventional SSI methods based on lip reading, electromyography (EMG), ultrasonic echo, and electrostatic position sensing on the palate have been proposed, but they suffer from issues such as requiring the use of one hand and being easily noticeable to others.
In this study, we propose a mask-based SSI that recognizes silent speech by measuring the motion around the mouth with acceleration and angular velocity sensors attached to a mask.
Using two acceleration and angular velocity sensors to acquire 12-dimensional motion information around the mouth and analyzing it with deep learning, we identified a total of 22 states (21 voice commands and no speech) with 79.9% accuracy.
The results also showed that the device can be worn for longer periods than methods that attach sensors directly to the skin. This research presents new possibilities for masks as a non-contact, unobtrusive interface that does not rely on camera images and is therefore independent of lighting conditions.
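
The abstract does not detail the recognition model, but the pipeline it describes (two 6-axis IMUs on the mask producing a 12-channel motion signal, classified by a deep neural network into 22 classes) can be sketched roughly as below. This is a minimal, hypothetical PyTorch sketch: the window length, layer sizes, and sampling details are assumptions, not the authors' published architecture.

    # Hypothetical sketch of the pipeline described in the abstract:
    # two 6-axis IMUs on a mask -> 12-channel motion windows -> 22-way classifier.
    # Window length, layer sizes, and training details are assumed, not taken
    # from the paper.
    import torch
    import torch.nn as nn

    NUM_CHANNELS = 12   # 2 sensors x (3-axis acceleration + 3-axis angular velocity)
    NUM_CLASSES = 22    # 21 silent voice commands + "no speech"
    WINDOW_LEN = 128    # samples per classification window (assumed)

    class SilentMaskNet(nn.Module):
        def __init__(self):
            super().__init__()
            # 1-D convolutions extract short-term motion features from each window.
            self.features = nn.Sequential(
                nn.Conv1d(NUM_CHANNELS, 32, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),
            )
            self.classifier = nn.Linear(64, NUM_CLASSES)

        def forward(self, x):
            # x: (batch, 12, WINDOW_LEN) raw accelerometer/gyro window
            h = self.features(x).squeeze(-1)   # (batch, 64)
            return self.classifier(h)          # (batch, 22) class logits

    if __name__ == "__main__":
        model = SilentMaskNet()
        dummy = torch.randn(8, NUM_CHANNELS, WINDOW_LEN)  # batch of synthetic IMU windows
        print(model(dummy).shape)  # torch.Size([8, 22])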

Supplementary Material

Supplementary video (3458709.3458985.mp4)




Published In

AHs '21: Proceedings of the Augmented Humans International Conference 2021
February 2021
321 pages

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Mask
  2. Silent Speech Interface
  3. Wearable Device

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

AHs '21: Augmented Humans International Conference 2021
February 22-24, 2021
Rovaniemi, Finland

