Hand joints-based gesture recognition for noisy dataset using nested interval unscented Kalman filter with LSTM network

  • Original Article
The Visual Computer

Abstract

Neural networks that operate on hand joints perform well at hand gesture recognition. However, when sequential skeletal datasets are collected, the joint positions identified by hand pose estimation usually contain noise and sometimes outright errors, which often reduces recognition accuracy. To make hand gesture recognition usable on such noisy datasets, this paper presents a nested interval unscented Kalman filter (UKF) combined with a long short-term memory network (NIUKF-LSTM). The nested interval method changes the distribution of the UKF sigma points based on two sampling intervals. By taking information from previous frames into account, it helps the NIUKF-LSTM network correct the noise in sequential hand skeletal data and improve recognition accuracy. Experiments on removing noisy skeletal data from the dynamic hand gesture dataset demonstrate the effectiveness of the NIUKF-LSTM network, which outperforms other state-of-the-art methods.
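
For readers unfamiliar with sigma points, the following is a minimal sketch of how a standard scaled unscented transform places them around a state estimate. It is not the nested interval method: the abstract does not say how the two sampling intervals reshape the distribution, so that step is left out. The function names (`cholesky`, `sigma_points`, `weighted_stats`), the 3D joint position, and the covariance values are illustrative assumptions.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive-definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sigma_points(mean, cov, alpha=1.0, beta=2.0, kappa=0.0):
    """Standard scaled sigma points: 2n + 1 points plus mean and covariance weights.

    alpha, beta, kappa are the usual scaling parameters; the defaults here are
    chosen for readability and are assumptions, not values from the paper.
    """
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n
    scaled = [[(n + lam) * c for c in row] for row in cov]
    root = cholesky(scaled)

    points = [list(mean)]  # the central point is the mean itself
    for j in range(n):
        col = [root[i][j] for i in range(n)]  # j-th column of the matrix square root
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])

    wm = [lam / (n + lam)] + [1.0 / (2.0 * (n + lam))] * (2 * n)
    wc = list(wm)
    wc[0] += 1.0 - alpha ** 2 + beta
    return points, wm, wc


def weighted_stats(points, wm, wc):
    """Recover mean and covariance from (possibly transformed) sigma points."""
    n = len(points[0])
    mean = [sum(w * p[i] for w, p in zip(wm, points)) for i in range(n)]
    cov = [
        [
            sum(w * (p[i] - mean[i]) * (p[j] - mean[j]) for w, p in zip(wc, points))
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov


# Hypothetical noisy estimate of one hand joint position (x, y, z) and its covariance.
joint_mean = [0.1, -0.2, 0.3]
joint_cov = [[0.04, 0.01, 0.0], [0.01, 0.09, 0.02], [0.0, 0.02, 0.16]]
points, wm, wc = sigma_points(joint_mean, joint_cov)
recovered_mean, recovered_cov = weighted_stats(points, wm, wc)
```

In a full UKF, each sigma point would be passed through the motion and measurement models before `weighted_stats` is applied. Here the points are left untransformed, so the recovered statistics match the inputs up to rounding, which is a quick way to check that the points and weights are consistent.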

Acknowledgements

The authors acknowledge the Fundamental Research Funds for the Central Universities (No. 201762005), Qingdao major projects of independent innovation (No. 16-7-1-1-zdzx-xx), the National Key Scientific Instrument and Equipment Development Projects of National Natural Science Foundation of China (No. 41527901) and the Qingdao Source Innovation Program (No. 17-1-1-6-jch).

Author information

Corresponding author

Correspondence to Anni Wang.

About this article

Cite this article

Ma, C., Wang, A., Chen, G. et al. Hand joints-based gesture recognition for noisy dataset using nested interval unscented Kalman filter with LSTM network. Vis Comput 34, 1053–1063 (2018). https://doi.org/10.1007/s00371-018-1556-0

  • DOI: https://doi.org/10.1007/s00371-018-1556-0
