UniFi: A Unified Framework for Generalizable Gesture Recognition with Wi-Fi Signals Using Consistency-guided Multi-View Networks

Published: 12 January 2024


In recent years, considerable endeavors have been devoted to exploring Wi-Fi-based sensing technologies by modeling the intricate mapping between received signals and corresponding human activities. However, the inherent complexity of Wi-Fi signals poses significant challenges for practical applications due to their pronounced susceptibility to deployment environments. To address this challenge, we delve into the distinctive characteristics of Wi-Fi signals and distill three pivotal factors that can be leveraged to enhance generalization capabilities of deep learning-based Wi-Fi sensing models: 1) effectively capture valuable input to mitigate the adverse impact of noisy measurements; 2) adaptively fuse complementary information from multiple Wi-Fi devices to boost the distinguishability of signal patterns associated with different activities; 3) extract generalizable features that can overcome the inconsistent representations of activities under different environmental conditions (e.g., locations, orientations). Leveraging these insights, we design a novel and unified sensing framework based on Wi-Fi signals, dubbed UniFi, and use gesture recognition as an application to demonstrate its effectiveness. UniFi achieves robust and generalizable gesture recognition in real-world scenarios by extracting discriminative and consistent features unrelated to environmental factors from pre-denoised signals collected by multiple transceivers. To achieve this, we first introduce an effective signal preprocessing approach that captures the applicable input data from noisy received signals for the deep learning model. Second, we propose a multi-view deep network based on spatio-temporal cross-view attention that integrates multi-carrier and multi-device signals to extract distinguishable information. Finally, we present the mutual information maximization as a regularizer to learn environment-invariant representations via contrastive loss without requiring access to any signals from unseen environments for practical adaptation. Extensive experiments on the Widar 3.0 dataset demonstrate that our proposed framework significantly outperforms state-of-the-art approaches in different settings (99% and 90%-98% accuracy for in-domain and cross-domain recognition without additional data collection and model training).
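
The abstract does not spell out the form of the contrastive loss, so the following is only a minimal sketch of one common choice, the InfoNCE objective, whose minimization maximizes a lower bound on mutual information between paired embeddings. The function names, the batch pairing scheme, and the temperature value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE loss between two batches of paired embeddings.

    Assumption (not from the paper): z_a[i] and z_b[i] embed the same
    gesture sample under two views or environments; every other row in
    the batch serves as a negative.
    """
    # L2-normalize rows so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity logits, shape (N, N).
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


def mi_lower_bound(loss, batch_size):
    """InfoNCE bound on mutual information: I(a; b) >= log(N) - loss."""
    return float(np.log(batch_size) - loss)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchor = rng.normal(size=(8, 16))
    consistent = anchor + 0.01 * rng.normal(size=(8, 16))  # nearly identical view
    unrelated = rng.normal(size=(8, 16))                   # independent view

    print("consistent views:", info_nce_loss(anchor, consistent))
    print("unrelated views: ", info_nce_loss(anchor, unrelated))
```

In a training loop, a term of this kind would typically be added to the classification loss with a weighting coefficient, so that embeddings of the same gesture stay close across environments while different samples are pushed apart.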



    Published In

    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies  Volume 7, Issue 4
    December 2023
    1613 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 January 2024
    Published in IMWUT Volume 7, Issue 4

    Author Tags

    1. Channel State Information (CSI)
    2. Deep learning
    3. Gesture Recognition
    4. Wireless Sensing


    • Research-article
    • Research
    • Refereed

