AFS: Attention Using First and Second Order Information to Enrich Features

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13532)


Abstract

Convolutional Neural Networks (CNNs) form the basis of many computer vision tasks. To depict complex boundaries in visual tasks, it is essential to fully exploit the feature distribution so as to realize the potential of CNNs. However, most current research focuses on designing deeper architectures and rarely explores higher-order feature statistics. To address this problem, we propose a simple and effective plug-in attention module for neural networks, named Attention module using First and Second order information fusion (AFS). Our method combines first-order pooling and second-order pooling, applied respectively to the two independent dimensions of space and channel, and we verify the effectiveness of AFS under both connection orders. The feature map output by an intermediate convolutional layer is used to infer attention maps, in turn, along the channel and spatial dimensions; each attention map is then multiplied with the input feature map for adaptive feature refinement. AFS can be integrated into any feedforward CNN and trained end-to-end with negligible overhead. Extensive experiments on CIFAR-10 and CIFAR-100 show that the AFS module significantly improves classification and detection performance across different models.
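The abstract describes the attention pipeline only at a high level, so the following NumPy sketch is illustrative rather than the authors' implementation: channel attention is derived from second-order (covariance) pooling over spatial positions, spatial attention from first-order (average and max) pooling across channels, and the two are applied sequentially (channel, then space). The sigmoid squashing, the row-mean reduction of the covariance matrix, and the average/max fusion are stand-ins for the paper's learned layers.

```python
import numpy as np

def channel_attention_second_order(x):
    # x: (C, H, W). Second-order pooling: channel covariance over spatial positions.
    C, H, W = x.shape
    feats = x.reshape(C, H * W)
    feats = feats - feats.mean(axis=1, keepdims=True)
    cov = feats @ feats.T / (H * W - 1)      # (C, C) channel covariance matrix
    # Hypothetical reduction: row-wise mean of the covariance as a per-channel
    # statistic, squashed to (0, 1) with a sigmoid.
    stat = cov.mean(axis=1)
    return 1.0 / (1.0 + np.exp(-stat))       # (C,) channel attention weights

def spatial_attention_first_order(x):
    # x: (C, H, W). First-order pooling across channels: average and max maps.
    avg_map = x.mean(axis=0)
    max_map = x.max(axis=0)
    stat = (avg_map + max_map) / 2.0         # stand-in for a learned conv fusion
    return 1.0 / (1.0 + np.exp(-stat))       # (H, W) spatial attention map

def afs_refine(x):
    # Sequential channel -> spatial refinement, as the abstract describes:
    # each attention map multiplies the feature map it was inferred from.
    ca = channel_attention_second_order(x)
    x = x * ca[:, None, None]
    sa = spatial_attention_first_order(x)
    return x * sa[None, :, :]

x = np.random.default_rng(0).standard_normal((8, 4, 4))
y = afs_refine(x)
print(y.shape)  # (8, 4, 4): same shape as the input, only rescaled
```

Because both attention maps are elementwise multipliers, the module preserves the feature-map shape, which is what lets it be inserted after any convolutional block of a feedforward CNN.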

Supported in part by the Natural Science Foundation of Chongqing under Grant cstc2020jcyj-msxmX0057.



Author information

Corresponding author

Correspondence to Huiwei Wang.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zuo, Y., Lv, J., Wang, H. (2022). AFS: Attention Using First and Second Order Information to Enrich Features. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13532. Springer, Cham. https://doi.org/10.1007/978-3-031-15937-4_45

  • DOI: https://doi.org/10.1007/978-3-031-15937-4_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15936-7

  • Online ISBN: 978-3-031-15937-4

  • eBook Packages: Computer Science; Computer Science (R0)
