2D Hand Detection Using Multi-Feature Skin Model Supervised Cascaded CNN

Wang, Qiangyu; Zhang, Guoying; Yu, Shu

doi:10.1007/s11265-018-1406-3

Published: 13 September 2018

Volume 91, pages 1105–1113, (2019)

Journal of Signal Processing Systems

A Correction to this article was published on 19 November 2019

This article has been updated

Abstract

Hand gesture recognition is one of the most popular forms of human-computer interface. The first step in most vision-based gesture recognition systems is hand detection and segmentation. Because hands are involved in a wide variety of daily tasks, detection suffers from both extreme illumination changes and the intrinsic variability of hand appearance. To overcome these problems, we propose a new method for 2D hand detection that combines multi-feature hand proposal generation with cascaded convolutional neural network (CCNN) classification. To account for varying luminance, we use color, Gabor, HOG and SIFT features to discriminate skin regions and generate hand proposals. We also propose a cascaded CNN that preserves deep context information to detect hands among the proposals. The proposed Multi-Feature Supervised Cascaded CNN (MFS-CCNN) method is tested on a combination of several datasets, with the Oxford Hands Dataset, VIVA hand detection and the Egohands Dataset as positive samples and ImageNet 2012 and the FDDB dataset as negative samples. The proposed method achieves competitive results.
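
The two-stage structure described in the abstract, skin-based proposal generation followed by cascaded classification, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The skin test below uses color only, with a commonly used fixed chrominance range that is an assumption here, whereas the paper combines color, Gabor, HOG and SIFT features. The cascade stages are plain scoring functions standing in for the CNN stages. All names (`is_skin`, `skin_proposal`, `run_cascade`) are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B), each in 0..255
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive
Stage = Tuple[Callable[[Box], float], float]  # (scoring function, minimum score)


def is_skin(pixel: Pixel) -> bool:
    """Color-only skin test on chrominance (threshold values are assumed)."""
    r, g, b = pixel
    # Full-range BT.601 RGB -> CbCr conversion, as used by JPEG/JFIF.
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return 77 <= cb <= 127 and 133 <= cr <= 173


def skin_proposal(image: Sequence[Sequence[Pixel]]) -> Optional[Box]:
    """Return the bounding box of all skin-colored pixels, or None if none."""
    coords = [
        (x, y)
        for y, row in enumerate(image)
        for x, px in enumerate(row)
        if is_skin(px)
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def run_cascade(proposals: Sequence[Box], stages: Sequence[Stage]) -> List[Box]:
    """Keep only proposals that pass every stage, rejecting as early as possible."""
    survivors = list(proposals)
    for score_fn, min_score in stages:
        survivors = [box for box in survivors if score_fn(box) >= min_score]
        if not survivors:
            break  # nothing left for later (more expensive) stages to score
    return survivors
```

In a real detector each stage would be a trained classifier of increasing cost, so that cheap early stages discard most non-hand proposals before the expensive ones run.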

Change history

19 November 2019
The Publisher regrets an error on the printed front cover of the October 2019 issue. The issue numbers were incorrectly listed as Volume 91, Nos. 10-12, October 2019. The correct number should be: "Volume 91, No. 10, October 2019"

Acknowledgements

This work is supported by the Foundation of the China Institute of Water Resources and Hydropower Research (GE0145B112017).

Author information

Authors and Affiliations

School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, 100083, People’s Republic of China
Qiangyu Wang & Guoying Zhang
State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, Beijing, 100048, China
Shu Yu
Department of Geotechnical Engineering, China Institute of Water Resources and Hydropower Research, Beijing, 100048, China
Shu Yu

Corresponding author

Correspondence to Qiangyu Wang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Q., Zhang, G. & Yu, S. 2D Hand Detection Using Multi-Feature Skin Model Supervised Cascaded CNN. J Sign Process Syst 91, 1105–1113 (2019). https://doi.org/10.1007/s11265-018-1406-3

Received: 16 April 2018
Revised: 19 July 2018
Accepted: 04 September 2018
Published: 13 September 2018
Issue Date: October 2019
DOI: https://doi.org/10.1007/s11265-018-1406-3
