DOI: 10.1145/3560905.3568528
research-article

Room-Scale Hand Gesture Recognition Using Smart Speakers

Published: 24 January 2023

ABSTRACT

Acoustic signals have recently been adopted for contact-free hand gesture recognition because of their fine sensing granularity and the wide availability of microphones and speakers in consumer-grade electronic devices such as smartphones. However, a very limited sensing range confines acoustic sensing to scenarios in which users interact with devices at close proximity. In this paper, we extend the range of acoustic sensing and demonstrate the feasibility of room-scale hand gesture recognition using commodity smart speakers. We develop a series of novel signal processing techniques and implement our system on two commodity smart speaker prototypes with different numbers of microphones. Extensive evaluations are performed in three different environments with 1440 gestures collected from 16 participants. Experimental results show that our system significantly increases the sensing range from 1 m to 4-5 m. Even in the challenging scenario where the user is 4 m away from the smart speaker and there is strong interference, the gesture recognition accuracy remains higher than 90%.

Published in

SenSys '22: Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems
November 2022
1280 pages
ISBN: 9781450398862
DOI: 10.1145/3560905

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SenSys '22 paper acceptance rate: 52 of 187 submissions, 28%. Overall acceptance rate: 174 of 867 submissions, 20%.
