Authors:
Mingxi Cheng¹; Fatima Daha¹; Amit Srivastava² and Ji Li¹
Affiliations:
¹ Microsoft, Mountain View, CA, U.S.A.; ² ServiceNow, Santa Clara, CA, U.S.A.
Keyword(s):
Gesture Recognition, GAN, 3D-CNN, Deep Learning.
Abstract:
With the outbreak of the SARS-CoV-2 pandemic, video conferencing tools have experienced huge spikes in usage. Gesture recognition can automatically translate non-verbal gestures into emoji reactions in these tools, making it easier for participants to express themselves. Nonetheless, certain rare gestures may trigger false alarms, and acquiring data for these negative classes in a timely manner is challenging. In this work, we develop a low-cost, fast-to-market, generation-based approach to effectively reduce the false alarm rate for any identified negative gesture. The proposed pipeline comprises data augmentation via generative adversarial networks, automatic gesture alignment, and model retraining with synthetic data. We evaluated our approach on a 3D-CNN-based real-time gesture recognition system at a large software company. Experimental results demonstrate that the proposed approach can effectively reduce the false alarm rate while maintaining similar accuracy on positive gestures.
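
The two quantities the abstract evaluates, the false alarm rate on negative gestures and the accuracy on positive gestures, can be made concrete with a minimal sketch. The function names, the "none" label for "no reaction", and the toy predictions are assumptions for illustration only; they are not taken from the authors' system and do not represent reported results.

```python
from typing import Sequence


def false_alarm_rate(predictions: Sequence[str], no_reaction: str = "none") -> float:
    """Fraction of negative-gesture clips for which the model fired any reaction.

    Every clip passed in is assumed to contain a negative gesture, so any
    prediction other than `no_reaction` counts as a false alarm.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    alarms = sum(1 for p in predictions if p != no_reaction)
    return alarms / len(predictions)


def positive_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of positive-gesture clips classified with the correct label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical model outputs on four clips of a negative gesture.
negative_clip_predictions = ["none", "thumbs_up", "none", "none"]

# Hypothetical model outputs and ground truth on positive-gesture clips.
positive_clip_predictions = ["thumbs_up", "clap", "wave", "clap"]
positive_clip_labels = ["thumbs_up", "clap", "wave", "wave"]

far = false_alarm_rate(negative_clip_predictions)
acc = positive_accuracy(positive_clip_predictions, positive_clip_labels)
print(f"false alarm rate: {far:.2f}, positive accuracy: {acc:.2f}")
```

Computing both numbers before and after retraining with synthetic data is one way to check the trade-off the abstract describes: the first should drop while the second stays roughly unchanged.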