Abstract
The identification of a person’s gender plays an important role in a variety of visual surveillance and monitoring applications, which are becoming increasingly ubiquitous. This paper proposes a method for gender classification of pedestrians based on whole-body images, which, unlike face-based methods, allows observation from different viewpoints. We use a parts-based model that combines global and local information for inference, leveraging convolutional neural networks (CNNs) for their superior feature learning and classification capability. Our method requires only the gender label for the training images, without any other expensive annotation such as anatomical parts, key points or other attributes. We trained a CNN on the bounding box containing the whole body (global CNN) or on a defined portion of the body (local CNN). Experimental results show that the upper half of the body is the most discriminative region for gender, compared with the middle or lower half. The best model is a jointly trained combination of a global CNN and a local upper-body CNN, which achieves higher accuracy than previous works on publicly available datasets.
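For illustration, the sketch below shows one way such a combined global and local model could be wired up. It is a minimal sketch only, assuming PyTorch (the paper itself reports using Caffe and Theano); the layer sizes, the 3×128×64 input resolution, the simple "top half of the bounding box" crop, and the single-classifier joint training are placeholder assumptions rather than the authors' actual configuration.

```python
# Minimal sketch of a jointly trained global + local (upper-body) CNN.
# Assumes PyTorch; layer sizes and input resolution are hypothetical,
# not the configuration reported in the paper.
import torch
import torch.nn as nn


class SmallCNN(nn.Module):
    """Toy convolutional feature extractor with made-up layer sizes."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size output regardless of crop
        )

    def forward(self, x):
        return torch.flatten(self.features(x), 1)  # (N, 64 * 4 * 4)


class GlobalLocalGenderNet(nn.Module):
    """Global branch sees the whole pedestrian crop; local branch sees the
    upper half of the body. Features are concatenated and classified with a
    single head, so both branches are optimised jointly by one gender loss."""

    def __init__(self):
        super().__init__()
        self.global_cnn = SmallCNN()
        self.local_cnn = SmallCNN()
        self.classifier = nn.Linear(2 * 64 * 4 * 4, 2)  # two classes: male / female

    def forward(self, x):
        upper = x[:, :, : x.shape[2] // 2, :]  # crude upper-body crop (top half)
        g = self.global_cnn(x)
        l = self.local_cnn(upper)
        return self.classifier(torch.cat([g, l], dim=1))


# Example: a batch of 8 pedestrian crops resized to 3x128x64
logits = GlobalLocalGenderNet()(torch.randn(8, 3, 128, 64))
print(logits.shape)  # torch.Size([8, 2])
```

Because both branch outputs feed one classifier, a single cross-entropy loss on the gender label trains the global and local branches together, which is the spirit of the "jointly trained combination" described in the abstract.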
Acknowledgements
This work has been supported by the Ministry of Science, Technology and Innovation (MOSTI) Science Fund, Project No: 01-02-11-SF0199. We thank the anonymous reviewers for their comments and suggestions.
Cite this article
Ng, CB., Tay, YH. & Goi, BM. Pedestrian gender classification using combined global and local parts-based convolutional neural networks. Pattern Anal Applic 22, 1469–1480 (2019). https://doi.org/10.1007/s10044-018-0725-0