Abstract
Deep learning has revolutionized computer vision, achieving human-level performance in classifying and detecting objects in images. These algorithms are typically trained with supervised learning, which requires a labeled training dataset. Because labeling involves human intervention, human-interpretable tri-color images are needed for this purpose, even though image sensors capture subsampled images in the Bayer format. For inference, too, Bayer-format images are converted to tri-color images, which requires a sophisticated image processing pipeline. This conversion seems unnecessary given the proven ability of deep networks to learn in many different scenarios. Moreover, the image processing pipeline can be simplified by using Bayer images directly for computer vision tasks. The availability of datasets for training deep networks remains a challenge in building such systems, so a methodology is needed to use color images to train DNNs that can classify Bayer images for a given application. Existing tri-color images from training datasets can be subsampled into a pseudo-Bayer pattern to train such deep neural networks. This paper presents the methodology and results of one such method, applied to hand posture recognition using SqueezeNet, with encouraging performance. Using this methodology, we achieved 100% accuracy on nine out of ten hand postures with pseudo-Bayer images and on six out of ten hand postures with real Bayer images.
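The pseudo-Bayer subsampling described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: it assumes an RGGB color filter layout (the abstract does not specify which Bayer variant the authors used) and takes, at each pixel, only the color channel that a real sensor would capture at that position.

```python
import numpy as np

def rgb_to_pseudo_bayer(rgb):
    """Subsample an RGB image of shape (H, W, 3) into a single-channel
    pseudo-Bayer mosaic, assuming an RGGB filter layout:
        even row, even col -> R
        even row, odd  col -> G
        odd  row, even col -> G
        odd  row, odd  col -> B
    """
    h, w, _ = rgb.shape
    bayer = np.empty((h, w), dtype=rgb.dtype)
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites (even rows)
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites (odd rows)
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return bayer
```

Applying this function to every image in an existing labeled tri-color dataset yields single-channel training inputs that approximate what the sensor produces before demosaicing, which is the core idea of training on color datasets while running inference on raw Bayer frames.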
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Cite this article
Chandra, M., Lall, B. A Novel Method for CNN Training Using Existing Color Datasets for Classifying Hand Postures in Bayer Images. SN COMPUT. SCI. 2, 60 (2021). https://doi.org/10.1007/s42979-021-00450-w