Improved image classification with 4D light-field and interleaved convolutional neural network

Lu, Zhicheng; Yeung, Henry W. F.; Qu, Qiang; Chung, Yuk Ying; Chen, Xiaoming; Chen, Zhibo

doi:10.1007/s11042-018-6597-x


Published: 25 September 2018

Volume 78, pages 29211–29227, (2019)

Multimedia Tools and Applications

Zhicheng Lu¹,², Henry W. F. Yeung², Qiang Qu², Yuk Ying Chung², Xiaoming Chen¹ & Zhibo Chen¹

637 Accesses
15 Citations

Abstract

Image classification is a well-studied problem. However, challenges remain for some special categories of images. This paper proposes a new deep convolutional neural network that improves image classification by using the additional angular information in light fields. The proposed network model employs transfer learning by replacing the fully connected layer of a VGG network with a set of interleaved spatial-angular filters. The resulting model takes advantage of both the spatial and angular information of light-field images (LFIs), and thus classifies more accurately than traditional models. To evaluate the proposed network model, we established a light-field image dataset, currently consisting of 560 captured LFIs divided into 11 labeled categories. On this dataset, our experimental results show that the proposed LFI model yields an average classification accuracy of 92%, as opposed to 84% for the model using traditional 2D images and 85% for the model using stereo-pair images. In particular, on challenging objects such as the “screen” images, the proposed LFI model improves accuracy by 16% and 12% over the 2D image model and the stereo image model, respectively.
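The abstract does not spell out how the interleaved spatial-angular filters are built. The sketch below illustrates one common reading of the idea, under stated assumptions: a light field is stored as a 4D array indexed (u, v, s, t), with (u, v) the angular position of a view and (s, t) the pixel position; a 2D filter is applied over (s, t) for every view, then a 2D filter is applied over (u, v) for every pixel, with a ReLU after each step. The function names, the single-channel layout, the zero padding, and the ReLU placement are all illustrative choices, not the authors' implementation.

```python
import numpy as np


def conv2d_same(img, kernel):
    """Zero-padded 'same' 2D cross-correlation (odd-sized kernel assumed)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))  # constant zero padding
    out = np.zeros(img.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out


def spatial_filter(lf, kernel):
    """Filter every view (fixed u, v) over the spatial axes (s, t)."""
    n_u, n_v = lf.shape[:2]
    out = np.empty(lf.shape, dtype=float)
    for u in range(n_u):
        for v in range(n_v):
            out[u, v] = conv2d_same(lf[u, v], kernel)
    return out


def angular_filter(lf, kernel):
    """Filter every pixel (fixed s, t) over the angular axes (u, v)."""
    # Move the angular axes to the end, reuse the 2D routine, then move them back.
    swapped = lf.transpose(2, 3, 0, 1)
    return spatial_filter(swapped, kernel).transpose(2, 3, 0, 1)


def interleaved_block(lf, k_spatial, k_angular):
    """One spatial step followed by one angular step, each with a ReLU (assumed)."""
    after_spatial = np.maximum(spatial_filter(lf, k_spatial), 0.0)
    return np.maximum(angular_filter(after_spatial, k_angular), 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    light_field = rng.random((5, 5, 8, 8))  # hypothetical 5x5 views of 8x8 pixels
    box = np.full((3, 3), 1.0 / 9.0)        # placeholder kernels, not learned weights
    features = interleaved_block(light_field, box, box)
    print(features.shape)
```

Alternating two 2D filters in this way touches all four light-field dimensions while using far fewer weights than a full 4D kernel, which is the usual motivation for such a factorization. In a trained network the kernels would be learned and applied across many feature channels.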



Funding

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2016YFC0801001, the National Program on Key Basic Research Projects (973 Program) under Grant 2015CB351803, and NSFC under Grants 61571413, 61632001, and 61390514.

Author information

Authors and Affiliations

School of Information Science and Technology, University of Science and Technology of China, Hefei, China
Zhicheng Lu, Xiaoming Chen & Zhibo Chen
School of Information Technologies, The University of Sydney, Sydney, Australia
Zhicheng Lu, Henry W. F. Yeung, Qiang Qu & Yuk Ying Chung


Corresponding authors

Correspondence to Xiaoming Chen or Zhibo Chen.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Lu, Z., Yeung, H.W.F., Qu, Q. et al. Improved image classification with 4D light-field and interleaved convolutional neural network. Multimed Tools Appl 78, 29211–29227 (2019). https://doi.org/10.1007/s11042-018-6597-x


Received: 13 February 2018
Revised: 25 July 2018
Accepted: 15 August 2018
Published: 25 September 2018
Issue Date: October 2019
DOI: https://doi.org/10.1007/s11042-018-6597-x
