2D Hand Detection Using Multi-Feature Skin Model Supervised Cascaded CNN

Journal of Signal Processing Systems

A Correction to this article was published on 19 November 2019

This article has been updated

Abstract

Hand gesture recognition is one of the most popular forms of human-computer interface. The first step in most vision-based gesture recognition systems is hand detection and segmentation. Because hands are involved in a wide variety of daily tasks, detection suffers from both extreme illumination changes and the intrinsic variability of hand appearance. To overcome these problems, we propose a new method for 2D hand detection that combines multi-feature hand proposal generation with cascaded convolutional neural network (CCNN) classification. To account for varying luminance, we use color, Gabor, HOG, and SIFT features to discriminate skin regions and generate hand proposals. We also propose a cascaded CNN that preserves deep context information to detect hands among the proposals. The proposed Multi-Feature Supervised Cascaded CNN (MFS-CCNN) method is tested on a combination of several datasets: the Oxford Hands Dataset, the VIVA hand detection dataset, and the EgoHands dataset as positive samples, and ImageNet 2012 and the FDDB dataset as negative samples. The proposed method achieves competitive results.
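
The two-step idea in the abstract (skin-based proposal generation, then cascaded classification) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the color cue, with fixed Cb/Cr thresholds that are a commonly used range and an assumption here, and the cascade stages are hypothetical scoring functions standing in for the CNN stages. The names `is_skin`, `skin_proposal`, `box_area`, and `cascade_accepts` are illustrative.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B), each in 0..255
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive
Stage = Tuple[Callable[[Box], float], float]  # (scoring function, threshold)


def rgb_to_cbcr(pixel: Pixel) -> Tuple[float, float]:
    """Convert an RGB pixel to Cb/Cr chroma (full-range BT.601 coefficients)."""
    r, g, b = pixel
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(pixel: Pixel) -> bool:
    """Color-only skin test; the thresholds are an assumption, not the paper's model."""
    cb, cr = rgb_to_cbcr(pixel)
    return 77.0 <= cb <= 127.0 and 133.0 <= cr <= 173.0


def skin_proposal(image: Sequence[Sequence[Pixel]]) -> Optional[Box]:
    """Return the bounding box of all skin-colored pixels, or None if there are none."""
    xs: List[int] = []
    ys: List[int] = []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if is_skin(pixel):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def box_area(box: Box) -> float:
    """Area of an inclusive pixel box; used here as a toy stage score."""
    x_min, y_min, x_max, y_max = box
    return float((x_max - x_min + 1) * (y_max - y_min + 1))


def cascade_accepts(box: Box, stages: Sequence[Stage]) -> bool:
    """Apply stages in order and reject at the first score below its threshold."""
    for score_fn, threshold in stages:
        if score_fn(box) < threshold:
            return False
    return True


# Toy 4x4 image: blue background with a 2x2 skin-toned patch in the middle.
BLUE: Pixel = (0, 0, 255)
SKIN: Pixel = (224, 172, 105)
image = [[SKIN if 1 <= x <= 2 and 1 <= y <= 2 else BLUE for x in range(4)]
         for y in range(4)]
proposal = skin_proposal(image)
```

In this sketch a cheap stage placed first rejects most proposals before a more expensive stage runs, which is the general motivation for a cascade. A real pipeline would replace the toy scores with trained classifiers and would split the skin mask into connected regions rather than taking one box around all skin pixels.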

Change history

  • 19 November 2019

    The Publisher regrets an error on the printed front cover of the October 2019 issue. The issue numbers were incorrectly listed as Volume 91, Nos. 10-12, October 2019. The correct number should be: "Volume 91, No. 10, October 2019"

Acknowledgements

This work is supported by the Foundation of the China Institute of Water Resources and Hydropower Research (GE0145B112017).

Author information

Corresponding author

Correspondence to Qiangyu Wang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Q., Zhang, G. & Yu, S. 2D Hand Detection Using Multi-Feature Skin Model Supervised Cascaded CNN. J Sign Process Syst 91, 1105–1113 (2019). https://doi.org/10.1007/s11265-018-1406-3

  • DOI: https://doi.org/10.1007/s11265-018-1406-3
