EarIO: A Low-power Acoustic Sensing Earable for Continuously Tracking Detailed Facial Movements

Published: 07 July 2022

Abstract

This paper presents EarIO, an AI-powered acoustic sensing technology that allows an earable (e.g., an earphone) to continuously track facial expressions using two microphone-speaker pairs (one pair on each side), which are widely available in commodity earphones. EarIO emits acoustic signals from a speaker on the earable towards the face. Depending on the facial expression, the muscles, tissues, and skin around the ear deform differently, producing unique echo profiles in the reflected signals captured by an on-device microphone. These received signals are processed and learned by a customized deep learning pipeline to continuously infer full facial expressions, represented by the 52 parameters captured with a TrueDepth camera. Compared to similar technologies, EarIO consumes significantly less power, sampling at 86 Hz with a power signature of 154 mW. A user study with 16 participants across three scenarios showed that EarIO can reliably estimate detailed facial movements while participants are sitting, walking, or after remounting the device. Based on these encouraging results, we further discuss the opportunities and challenges of applying EarIO to future ear-mounted wearables.
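The abstract only sketches the sensing principle, so the following is a minimal, hypothetical illustration of how an active acoustic pipeline of this kind can be structured. It assumes a chirp-style transmit signal and cross-correlation echo profiling; the chirp band, sample rate, and window size are assumptions, not details from the paper (only the 86 Hz frame rate and the 52-parameter output come from the abstract).

```python
# Hypothetical echo-profile extraction: cross-correlate each received frame
# with the transmitted chirp so that correlation lags map to reflection path
# lengths. Deformations of the skin and tissue around the ear change the
# pattern of reflections across lags as facial expressions change.
import numpy as np
from scipy.signal import chirp, correlate

FS = 44_100                    # audio sample rate (assumed)
FRAME_RATE = 86                # echo-profile rate reported in the paper
FRAME_LEN = FS // FRAME_RATE   # samples per transmit/receive frame (~512)

t = np.arange(FRAME_LEN) / FS
tx = chirp(t, f0=16_000, t1=t[-1], f1=20_000)   # near-inaudible band (assumed)

def echo_profile(rx_frame: np.ndarray) -> np.ndarray:
    """One column of the echo profile: |cross-correlation| at non-negative lags."""
    cc = correlate(rx_frame, tx, mode="full")
    return np.abs(cc[len(rx_frame) - 1:])

# Stacking consecutive frames yields a 2-D "echo image" for the learner.
rx = np.random.randn(10 * FRAME_LEN)             # placeholder received audio
profiles = np.stack([echo_profile(f) for f in rx.reshape(10, FRAME_LEN)])
```

The learning stage can then be read as a regression from a window of echo profiles to the 52 expression parameters. The small convolutional regressor below is equally illustrative; its layers are placeholders, not the paper's architecture.

```python
# Hypothetical regressor from an echo-profile window to 52 blendshape-style
# parameters (the facial-expression representation of a TrueDepth camera).
import torch
import torch.nn as nn

class EchoToExpression(nn.Module):
    def __init__(self, n_params: int = 52):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_params)

    def forward(self, x):
        # x: (batch, 1, frames, lags) echo-profile window
        return self.head(self.features(x).flatten(1))

window = torch.randn(1, 1, 10, 512)       # e.g., 10 frames of 512-lag profiles
params = EchoToExpression()(window)       # (1, 52) predicted expression values
```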

Supplemental Material

ZIP File
Supplemental movie, appendix, image, and software files for "EarIO: A Low-power Acoustic Sensing Earable for Continuously Tracking Detailed Facial Movements"



Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 6, Issue 2
June 2022
1551 pages
EISSN: 2474-9567
DOI: 10.1145/3547347

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 07 July 2022
Published in IMWUT Volume 6, Issue 2

Author Tags

1. Acoustic sensing
2. Deep learning
3. Facial expression reconstruction
4. Low-power

Qualifiers

• Research-article
• Research
• Refereed

