Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12606)

Abstract

In the past few years, the Convolutional Neural Network (CNN) has been successfully applied to many computer vision tasks. Most of these networks extract only first-order information from input images. Second-order statistical information refers to the second-order correlations obtained by computing a covariance matrix, a Fisher information matrix, or a vector outer product over groups of local features organized by channel. It has been shown that second-order information captures the distortion of facial-region features on facial expression datasets more effectively, but it also introduces many more parameters and therefore a much higher computational cost. In this article we propose a new CNN structure whose layers can (i) incorporate first-order information into the covariance matrix; (ii) use eigenvalue vectors to measure the importance of feature channels; (iii) reduce the bilinear dimensionality of the parameter matrix; and (iv) perform a Cholesky decomposition on the positive definite matrix to compress the second-order information matrix. Thanks to the combination of first-order and second-order information and the Cholesky compression strategy, the proposed method uses about half as many parameters as the SPDNet model while achieving better results on facial expression classification tasks than both the corresponding first-order model and the reference second-order model.
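
To make the compression idea concrete: a symmetric positive definite covariance matrix over C feature channels has C(C+1)/2 independent entries, so keeping only its lower-triangular Cholesky factor roughly halves the size of the second-order representation. The NumPy sketch below illustrates this step under simple assumptions; the function names second_order_pool and cholesky_compress, the (C, H, W) feature layout, and the small ridge term eps are illustrative choices, not the paper's actual layer definitions.

    import numpy as np

    def second_order_pool(features, eps=1e-5):
        # features: (C, H, W) feature map from a convolutional layer (hypothetical layout).
        C, H, W = features.shape
        X = features.reshape(C, H * W)            # one row per channel
        X = X - X.mean(axis=1, keepdims=True)     # center each channel
        cov = X @ X.T / (H * W - 1)               # C x C covariance (second-order statistics)
        return cov + eps * np.eye(C)              # small ridge keeps the matrix positive definite

    def cholesky_compress(cov):
        # cov = L @ L.T with L lower triangular; keep only the C*(C+1)/2 entries of L.
        L = np.linalg.cholesky(cov)
        return L[np.tril_indices_from(L)]

    feat = np.random.randn(64, 7, 7)              # toy feature map: 64 channels, 7x7 spatial grid
    cov = second_order_pool(feat)
    code = cholesky_compress(cov)
    print(cov.shape, code.shape)                  # (64, 64) (2080,)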

Acknowledgements

This work is supported by NSF of Hebei Province (No. F2018201096), NSF of Guangdong Province (No. 2018A0303130026), the Key Science and Technology Foundation of the Educational Department of Hebei Province (ZD 2019021), and NSFC (No. 61976141).

Author information

Corresponding author

Correspondence to Qiang Hua.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Y., Zhang, J., Hua, Q. (2021). Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy. In: Zhang, Y., Xu, Y., Tian, H. (eds.) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol. 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_30

  • DOI: https://doi.org/10.1007/978-3-030-69244-5_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69243-8

  • Online ISBN: 978-3-030-69244-5

  • eBook Packages: Computer Science, Computer Science (R0)
