Improved image classification with 4D light-field and interleaved convolutional neural network

Multimedia Tools and Applications

Abstract

Image classification is a well-studied problem, but challenges remain for some special categories of images. This paper proposes a new deep convolutional neural network that improves image classification by using the extra angular information in light fields. The proposed network model employs transfer learning by replacing the fully connected layer of a VGG network with a set of interleaved spatial-angular filters. The resulting model takes advantage of both the spatial and angular information of light-field images (LFIs), thus providing more accurate classification than traditional models. To evaluate the proposed network model, we established a light-field image dataset, currently consisting of 560 captured LFIs divided into 11 labeled categories. On this dataset, our experimental results show that the proposed LFI model yields an average classification accuracy of 92%, as opposed to 84% for the model using traditional 2D images and 85% for the model using stereo-pair images. In particular, when classifying challenging objects such as the “screen” images, the proposed LFI model demonstrated significant improvements of 16% and 12% over the 2D image model and the stereo image model, respectively.
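
The abstract does not specify filter sizes, layer counts, or how the filters are trained, so the following is only a minimal sketch of what "interleaved spatial-angular" filtering over a 4D light field can look like. It assumes the light field is indexed as `lf[u][v][x][y]` (angular position `u, v`, spatial position `x, y`) and uses a fixed 3x3 mean filter as a stand-in for learned convolution kernels. All function names are hypothetical and are not taken from the paper.

```python
from itertools import product

def box_filter_2d(grid):
    """3x3 mean filter over a 2D list, averaging only in-bounds neighbours.
    Stand-in for a learned 2D convolution kernel (assumption)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r, c in product(range(rows), range(cols)):
        vals = [grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))]
        out[r][c] = sum(vals) / len(vals)
    return out

def spatial_step(lf):
    """Filter each view lf[u][v] over its spatial axes (x, y) only."""
    return [[box_filter_2d(view) for view in row] for row in lf]

def angular_step(lf):
    """Filter over the angular axes (u, v) at every spatial position (x, y)."""
    U, V = len(lf), len(lf[0])
    X, Y = len(lf[0][0]), len(lf[0][0][0])
    out = [[[[0.0] * Y for _ in range(X)] for _ in range(V)] for _ in range(U)]
    for x, y in product(range(X), range(Y)):
        # Gather the U x V grid of samples that share this spatial position.
        patch = [[lf[u][v][x][y] for v in range(V)] for u in range(U)]
        filtered = box_filter_2d(patch)
        for u, v in product(range(U), range(V)):
            out[u][v][x][y] = filtered[u][v]
    return out

def interleaved_block(lf, depth=2):
    """Alternate spatial and angular filtering `depth` times (hypothetical depth)."""
    for _ in range(depth):
        lf = angular_step(spatial_step(lf))
    return lf

def make_lf(U, V, X, Y, fill=0.0):
    """Build a U x V x X x Y light field filled with a constant value."""
    return [[[[fill] * Y for _ in range(X)] for _ in range(V)] for _ in range(U)]
```

The point of alternating the two steps is that each one is an ordinary 2D operation, yet after a few rounds every output sample depends on both neighbouring pixels and neighbouring views. For instance, if only the centre view of a 3x3 angular grid is non-zero, `spatial_step` leaves the other views untouched, while `angular_step` spreads that view's content into its angular neighbours.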

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2016YFC0801001, the National Program on Key Basic Research Projects (973 Program) under Grant 2015CB351803, and NSFC under Grants 61571413, 61632001, and 61390514.

Author information

Corresponding authors

Correspondence to Xiaoming Chen or Zhibo Chen.

About this article

Cite this article

Lu, Z., Yeung, H.W.F., Qu, Q. et al. Improved image classification with 4D light-field and interleaved convolutional neural network. Multimed Tools Appl 78, 29211–29227 (2019). https://doi.org/10.1007/s11042-018-6597-x

  • DOI: https://doi.org/10.1007/s11042-018-6597-x
