Attention Based Detection and Recognition of Hand Postures Against Complex Backgrounds

  • Published in: International Journal of Computer Vision

Abstract

A system for the detection, segmentation and recognition of multi-class hand postures against complex natural backgrounds is presented. Visual attention, the cognitive process of selectively concentrating on a region of interest in the visual field, helps humans recognize objects in cluttered natural scenes. The proposed system utilizes a Bayesian model of visual attention to generate a saliency map, and to detect and identify the hand region. Feature-based visual attention is implemented using a combination of high-level (shape, texture) and low-level (color) image features. The shape and texture features are extracted from a skin similarity map, using a computational model of the ventral stream of the visual cortex. The skin similarity map, which represents the similarity of each pixel to human skin color in the HSI color space, enhances the edges and shapes within skin-colored regions. The color features used are the discretized chrominance components in the HSI and YCbCr color spaces, together with the skin similarity map. The hand postures are classified from the shape and texture features using a support vector machine (SVM) classifier. A new 10-class hand posture dataset with complex backgrounds, the NUS hand posture dataset-II, is developed for testing the proposed algorithm (40 subjects of different ethnicities, various hand sizes, 2750 hand posture images and 2000 background images). The algorithm is tested for hand detection and hand posture recognition using 10-fold cross-validation. The experimental results show that the algorithm performs in a person-independent manner and is reliable against variations in hand size and complex backgrounds, achieving a recognition rate of 94.36%. A comparison with existing methods demonstrates the better performance of the proposed algorithm.
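The skin similarity map described in the abstract can be sketched as follows. The paper's exact similarity measure is not reproduced here; this minimal illustration assumes a Gaussian similarity to a reference skin hue and saturation in HSI space, where the reference values (`h_ref`, `s_ref`) and the Gaussian widths are illustrative assumptions, not the authors' parameters:

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Convert an RGB image (floats in [0, 1]) to H, S, I components."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    i = (r + g + b) / 3.0
    s = 1.0 - np.minimum(np.minimum(r, g), b) / np.maximum(i, 1e-8)
    # Standard geometric hue formula for HSI
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8
    theta = np.arccos(np.clip(num / den, -1.0, 1.0))
    h = np.where(b <= g, theta, 2 * np.pi - theta) / (2 * np.pi)  # in [0, 1)
    return h, s, i

def skin_similarity_map(rgb, h_ref=0.05, s_ref=0.35, h_sigma=0.05, s_sigma=0.2):
    """Per-pixel similarity to a reference skin color (illustrative Gaussian
    in hue and saturation; luminance is deliberately ignored, cf. note 6)."""
    h, s, _ = rgb_to_hsi(rgb)
    dh = np.minimum(np.abs(h - h_ref), 1.0 - np.abs(h - h_ref))  # hue is circular
    return (np.exp(-(dh ** 2) / (2 * h_sigma ** 2))
            * np.exp(-((s - s_ref) ** 2) / (2 * s_sigma ** 2)))
```

In such a map, skin-toned pixels score close to 1 and strongly non-skin colors score near 0, so edges and shapes inside skin-colored regions stand out for the subsequent shape/texture feature extraction.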


Notes

  1. Graph matching is considered to be one of the most complex algorithms in vision based object recognition (Bienenstock and Malsburg 1987). The complexity is due to its combinatorial nature.

  2. The dataset is available for free download: http://www.ece.nus.edu.sg/stfpage/elepv/NUS-HandSet/.

  3. V1, V2, V3, V4, and V5 are visual areas in the visual cortex. V1 is the primary visual cortex; V2 to V5 are secondary visual areas, collectively termed the extrastriate visual cortex.

  4. Refer to Serre et al. (2007) for further explanation of the S1 and C1 stages (layers 1 and 2).

  5. The numbers of prototype patches and orientations are tunable parameters of the system; computational complexity increases with both. The reported values provided the best trade-off between accuracy and computational complexity.

  6. The luminance components are not utilized, as they are sensitive to lighting conditions as well as to variations in skin tone.

  7. The dataset consists of hand postures by 40 subjects, with different ethnic origins.

  8. 400 images (1 image per class per subject) are considered. During the training phase the hand area is selected manually.

  9. The dataset is available for academic research purposes: http://www.ece.nus.edu.sg/stfpage/elepv/NUS-HandSet/.

  10. For cross-validation, the dataset is divided into 10 subsets of 200 images each, with each subset containing the data from 4 subjects.
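The subject-disjoint split in note 10 can be sketched as below. The helper and the assumption of 50 posture images per subject (consistent with 200 images per 4-subject subset) are illustrative, not the authors' code:

```python
def subject_disjoint_folds(subject_ids, n_folds=10):
    """Group sample indices into folds so no subject appears in two folds.

    subject_ids: per-sample subject label (one entry per image).
    Returns a list of n_folds lists of sample indices.
    """
    subjects = sorted(set(subject_ids))
    per_fold = len(subjects) // n_folds  # e.g. 40 subjects / 10 folds = 4
    fold_of_subject = {s: i // per_fold for i, s in enumerate(subjects)}
    folds = [[] for _ in range(n_folds)]
    for idx, s in enumerate(subject_ids):
        folds[fold_of_subject[s]].append(idx)
    return folds

# Example: 40 subjects, 50 images each -> 10 folds of 200 images,
# each fold holding the data of exactly 4 subjects.
labels = [s for s in range(40) for _ in range(50)]
folds = subject_disjoint_folds(labels)
```

Keeping each subject's images inside a single fold is what makes the reported cross-validation a genuinely person-independent test: a subject seen at training time is never seen at test time.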

References

  • Alon, J., Athitsos, V., Yuan, Q., & Sclaroff, S. (2009). A unified framework for gesture recognition and spatiotemporal gesture segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(9), 1685–1699.

  • Athitsos, V., & Sclaroff, S. (2003). Estimating 3D hand pose from a cluttered image. In IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 432–439).

  • Bienenstock, E., & Malsburg, C. v. d. (1987). A neural network for invariant pattern recognition. Europhysics Letters, 4(1), 121–126.

  • Bishop, C. (1995). Neural networks for pattern recognition. London: Oxford University Press.

  • Chaves-González, J. M., Vega-Rodríguez, M. A., Gómez-Pulido, J. A., & Sánchez-Pérez, J. M. (2010). Detecting skin in face recognition systems: a colour spaces study. Digital Signal Processing, 20(3), 806–823.

  • Chen, F. S., Fu, C. M., & Huang, C. L. (2003). Hand gesture recognition using a real-time tracking method and hidden Markov models. Image and Vision Computing, 21, 745–758.

  • Chen, Q., Georganas, N. D., & Petriu, E. M. (2008). Hand gesture recognition using Haar-like features and a stochastic context-free grammar. IEEE Transactions on Instrumentation and Measurement, 57(8), 1562–1571.

  • Chikkerur, S., Serre, T., Tan, C., & Poggio, T. (2010). What and where: a Bayesian inference theory of attention. Vision Research, 50(22), 2233–2247.

  • Daniel, K., John, M., & Charles, M. (2010). A person independent system for recognition of hand postures used in sign language. Pattern Recognition Letters, 31, 1359–1368.

  • Dayan, P., Hinton, G. E., & Neal, R. M. (1995). The Helmholtz machine. Neural Computation, 7, 889–904.

  • Eng-Jon, O., & Bowden, R. (2004). A boosted classifier tree for hand shape detection. In IEEE conference on automatic face and gesture recognition (pp. 889–894).

  • Erol, A., Bebis, G., Nicolescu, M., Boyle, R. D., & Twombly, X. (2007). Vision-based hand pose estimation: a review. Computer Vision and Image Understanding, 108, 52–73.

  • Ge, S. S., Yang, Y., & Lee, T. H. (2008). Hand gesture recognition and tracking based on distributed locally linear embedding. Image and Vision Computing, 26, 1607–1620.

  • Hasanuzzaman, M., Zhang, T., Ampornaramveth, V., Gotoda, H., Shirai, Y., & Ueno, H. (2007). Adaptive visual gesture recognition for human-robot interaction using a knowledge-based software platform. Robotics and Autonomous Systems, 55(8), 643–657.

  • Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2(3), 194–203.

  • Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254–1259.

  • Jones, J. P., & Palmer, L. A. (1987). An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58(6), 1233–1258.

  • Jones, M., & Rehg, J. (1999). Statistical color models with application to skin detection. In IEEE conference on computer vision and pattern recognition (Vol. 1).

  • Just, A., & Marcel, S. (2009). A comparative study of two state-of-the-art sequence processing techniques for hand gesture recognition. Computer Vision and Image Understanding, 113(4), 532–543.

  • Kolsch, M., & Turk, M. (2004). Robust hand detection. In IEEE conference on automatic face and gesture recognition (pp. 614–619).

  • Lai, J., & Wang, W. X. (2008). Face recognition using cortex mechanism and SVM. In C. Xiong, H. Liu, Y. Huang, & Y. Xiong (Eds.), 1st international conference on intelligent robotics and applications, Wuhan, China (pp. 625–632).

  • Lee, K. H., & Kim, J. H. (1999). An HMM-based threshold model approach for gesture recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(10), 961–973.

  • Lee, J., & Kunii, T. (1995). Model-based analysis of hand posture. IEEE Computer Graphics and Applications, 15(5), 77–86.

  • Licsar, A., & Sziranyi, T. (2005). User-adaptive hand gesture recognition system with interactive training. Image and Vision Computing, 23, 1102–1114.

  • Mitra, S., & Acharya, T. (2007). Gesture recognition: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 37(3), 311–324.

  • Murphy, K. (2003). Bayes Net Toolbox for Matlab.

  • Niebur, E., & Koch, C. (1998). Computational architectures for attention. In R. Parasuraman (Ed.), The attentive brain (pp. 163–186). Cambridge: MIT Press.

  • Ong, S. C. W., & Ranganath, S. (2005). Automatic sign language analysis: a survey and the future beyond lexical meaning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 873–891.

  • Patwardhan, K. S., & Roy, S. D. (2007). Hand gesture modelling and recognition involving changing shapes and trajectories, using a predictive eigentracker. Pattern Recognition Letters, 28, 329–334.

  • Pavlovic, V. I., Sharma, R., & Huang, T. S. (1997). Visual interpretation of hand gestures for human-computer interaction: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 677–694.

  • Pearl, J. (1988). Probabilistic reasoning in intelligent systems: networks of plausible inference. San Mateo: Morgan Kaufmann.

  • Phung, S. L., Bouzerdoum, A., & Chai, D. (2005). Skin segmentation using color pixel classification: analysis and comparison. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(1), 148–154.

  • Poggio, T., & Bizzi, E. (2004). Generalization in vision and motor control. Nature, 431, 768–774.

  • Poggio, T., & Riesenhuber, M. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11), 1019–1025.

  • Pramod Kumar, P., Vadakkepat, P., & Loh, A. P. (2010a). Hand posture and face recognition using a fuzzy-rough approach. International Journal of Humanoid Robotics, 7(3), 331–356.

  • Pramod Kumar, P., Vadakkepat, P., & Loh, A. P. (2010b). Graph matching based hand posture recognition using neuro-biologically inspired features. In International conference on control, automation, robotics and vision (ICARCV) 2010, Singapore.

  • Pramod Kumar, P., Stephanie, Q. S. H., Vadakkepat, P., & Loh, A. P. (2010c). Hand posture recognition using neuro-biologically inspired features. In International conference on computational intelligence, robotics and autonomous systems (CIRAS) 2010, Bangalore.

  • Pramod Kumar, P., Vadakkepat, P., & Loh, A. P. (2011). Fuzzy-rough discriminative feature selection and classification algorithm, with application to microarray and image datasets. Applied Soft Computing, 11(4), 3429–3440.

  • Ramamoorthy, A., Vaswani, N., Chaudhury, S., & Banerjee, S. (2003). Recognition of dynamic hand gestures. Pattern Recognition, 36, 2069–2081.

  • Rao, R. (2005). Bayesian inference and attentional modulation in the visual cortex. NeuroReport, 16(16), 1843–1848.

  • Serre, T., Wolf, L., & Poggio, T. (2005). Object recognition with features inspired by visual cortex. In C. Schmid, S. Soatto, & C. Tomasi (Eds.), Conference on computer vision and pattern recognition, San Diego, CA (pp. 994–1000).

  • Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., & Poggio, T. (2007). Robust object recognition with cortex-like mechanisms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(3), 411–426.

  • Siagian, C., & Itti, L. (2007). Rapid biologically-inspired scene classification using features shared with visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(2), 300–312.

  • Su, M. C. (2000). A fuzzy rule-based approach to spatio-temporal hand gesture recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 30(2), 276–281.

  • Teng, X., Wu, B., Yu, W., & Liu, C. (2005). A hand gesture recognition system based on local linear embedding. Journal of Visual Languages and Computing, 16, 442–454.

  • Triesch, J., & Malsburg, C. (1996a). Robust classification of hand postures against complex backgrounds. In Proceedings of the second international conference on automatic face and gesture recognition, 1996, Killington, VT, USA (pp. 170–175).

  • Triesch, J., & Malsburg, C. (1996b). Sebastien Marcel hand posture and gesture datasets: Jochen Triesch static hand posture database [online]: http://www.idiap.ch/resources/gestures/.

  • Triesch, J., & Malsburg, C. (1998). A gesture interface for human-robot-interaction. In Proceedings of the third IEEE international conference on automatic face and gesture recognition, 1998, Nara, Japan (pp. 546–551).

  • Triesch, J., & Malsburg, C. (2001). A system for person-independent hand posture recognition against complex backgrounds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(12), 1449–1453.

  • Tsotsos, J. K., Culhane, S. M., Wai, Y. H., Lai, W. Y. K., Davis, N., & Nuflo, F. (1995). Modelling visual attention via selective tuning. Artificial Intelligence, 78(1–2), 507–545.

  • Ueda, E., Matsumoto, Y., Imai, M., & Ogasawara, T. (2003). A hand-pose estimation for vision-based human interfaces. IEEE Transactions on Industrial Electronics, 50(4), 676–684.

  • Van der Zant, T., Schomaker, L., & Haak, K. (2008). Handwritten-word spotting using biologically inspired features. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(11), 1945–1957.

  • Wang, W. H. A., & Tung, C. L. (2008). Dynamic hand gesture recognition using hierarchical dynamic Bayesian networks through low-level image processing. In 7th international conference on machine learning and cybernetics, Kunming, P.R. China (pp. 3247–3253).

  • Wiesel, T. N., & Hubel, D. H. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology, 160, 106–154.

  • Wu, Y., & Huang, T. S. (1999). Vision-based gesture recognition: a review. In A. Braffort, R. Gherbi, S. Gibet, J. Richardson, & D. Teil (Eds.), International gesture workshop on gesture-based communication in human computer interaction, Gif Sur Yvette, France (pp. 103–115). Berlin: Springer.

  • Wu, Y., & Huang, T. S. (2000). View-independent recognition of hand postures. In IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 88–94).

  • Yang, M. H., & Ahuja, N. (1998). Extraction and classification of visual motion patterns for hand gesture recognition. In Proceedings, IEEE computer society conference on computer vision and pattern recognition, Santa Barbara, CA, USA (pp. 892–897).

  • Yang, M. H., Ahuja, N., & Tabb, M. (2002). Extraction of 2D motion trajectories and its application to hand gesture recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8), 1061–1074.

  • Yang, H. D., Park, A. Y., & Lee, S. W. (2007). Gesture spotting and recognition for human–robot interaction. IEEE Transactions on Robotics, 23(2), 256–270.

  • Yin, X., & Xie, M. (2003). Estimation of the fundamental matrix from uncalibrated stereo hand images for 3D hand gesture recognition. Pattern Recognition, 36, 567–584.

  • Yoon, H. S., Soh, J., Bae, Y. J., & Yang, H. S. (2001). Hand gesture recognition using combined features of location, angle and velocity. Pattern Recognition, 34, 1491–1501.

  • Zhao, M., Quek, F. K. H., & Wu, X. (1998). RIEVL: recursive induction learning in hand gesture recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1174–1185.


Acknowledgements

The authors would like to thank Ms. Ma Zin Thu Shein for taking part in the shooting of the NUS hand posture dataset-II, and express their appreciation to all 40 subjects who volunteered for the development of the dataset.

Author information

Correspondence to Pramod Kumar Pisharady.

About this article

Cite this article

Pisharady, P.K., Vadakkepat, P. & Loh, A.P. Attention Based Detection and Recognition of Hand Postures Against Complex Backgrounds. Int J Comput Vis 101, 403–419 (2013). https://doi.org/10.1007/s11263-012-0560-5
