Abstract
Sign language serves as an important bridge between d/Deaf and hard-of-hearing (DHH) individuals and hearing people. Unfortunately, most hearing people cannot understand sign language, creating a need for sign language translation. However, state-of-the-art wearable-based techniques concentrate mainly on recognizing manual markers (e.g., hand gestures) while frequently overlooking non-manual markers, such as negative head shaking, question markers, and mouthing. This oversight discards substantial grammatical and semantic information in sign language. To address this limitation, we introduce SmartASL, a novel proof-of-concept system that can 1) recognize both manual and non-manual markers simultaneously using a combination of earbuds and a wrist-worn IMU, and 2) translate the recognized American Sign Language (ASL) glosses into spoken language. Our experiments demonstrate SmartASL's significant potential to accurately recognize the manual and non-manual markers in ASL, effectively bridging the communication gap between ASL signers and hearing people using commercially available devices.
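The two-stage pipeline described above (fusing manual and non-manual markers into glosses, then rendering glosses as spoken language) can be illustrated with a minimal sketch. All names, the marker vocabulary, and the rule-based translator below are illustrative assumptions, not the authors' implementation; a real system would use learned sequence models at both stages.

```python
# Hypothetical sketch of a SmartASL-style pipeline:
# stage 1 fuses manual-marker predictions (wrist IMU) with non-manual-marker
# predictions (earbuds) into an annotated gloss stream; stage 2 renders the
# glosses as English text. The rules below are toy placeholders.
from dataclasses import dataclass
from typing import Optional

@dataclass
class GlossToken:
    gloss: str                        # manual marker, e.g. "YOU", "DEAF"
    non_manual: Optional[str] = None  # e.g. "head-shake", "brow-raise"

def fuse(manual: list, non_manual: dict) -> list:
    """Attach non-manual markers (keyed by gloss position) to each gloss."""
    return [GlossToken(g, non_manual.get(i)) for i, g in enumerate(manual)]

def translate(tokens: list) -> str:
    """Toy rule-based gloss-to-English rendering (real systems use seq2seq)."""
    negated = any(t.non_manual == "head-shake" for t in tokens)
    question = any(t.non_manual == "brow-raise" for t in tokens)
    sentence = " ".join(t.gloss.lower() for t in tokens)
    if negated:                       # negative head shake negates the clause
        sentence = "not: " + sentence
    return sentence + ("?" if question else ".")

tokens = fuse(["YOU", "DEAF"], {1: "brow-raise"})
print(translate(tokens))  # → "you deaf?"
```

The example shows why non-manual markers matter: the same gloss sequence `YOU DEAF` becomes a question only because of the raised-brow marker carried alongside the manual channel.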
SmartASL: "Point-of-Care" Comprehensive ASL Interpreter Using Wearables