Abstract
The attention mechanism has become a widely researched method for improving the performance of convolutional neural networks (CNNs). Most research focuses on designing channel-wise and spatial-wise attention modules but neglects the unique information carried by each individual feature, which is critical for deciding both “what” and “where” to focus on. In this paper, a feature-wise attention module is proposed that assigns an attention weight to every feature of the input feature map. Specifically, the module is inspired by the well-known surround suppression phenomenon in neuroscience, and it consists of two sub-modules: a Minus-Square-Add (MSA) operation and a group of learnable nonlinear mapping functions. The MSA imitates surround suppression and defines an energy function that can be applied to each feature to measure its importance. The group of nonlinear functions refines the energies computed by the MSA into more reasonable values. Together, these two sub-modules capture feature-wise attention well. Moreover, owing to their simple structure and few parameters, the proposed module can be easily integrated into almost any CNN. To verify its performance and effectiveness, experiments were conducted on the CIFAR-10, CIFAR-100, CINIC-10, and Tiny-ImageNet datasets. The experimental results demonstrate that the proposed module is flexible and effective at improving CNN performance.
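The surround-suppression idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract only names the MSA operation, so the details below are assumptions: here each feature's energy is taken as its squared difference from the mean of its surrounding channel (minus, then square, then the add is implied by the mean), and a fixed sigmoid stands in for the paper's group of learnable nonlinear mapping functions. Function and variable names (`msa_energy`, `feature_wise_attention`) are hypothetical, not from the paper.

```python
import numpy as np

def msa_energy(x):
    """Sketch of a Minus-Square-Add style energy per feature.

    x: feature map of shape (C, H, W).
    Each feature's energy is its squared difference from the mean of
    its own channel, a crude stand-in for 'difference with surrounding
    features' (an assumption, not the paper's exact definition).
    """
    mu = x.mean(axis=(1, 2), keepdims=True)  # per-channel surround estimate
    return (x - mu) ** 2                      # minus, then square

def feature_wise_attention(x):
    """Weight every feature of x by a nonlinear mapping of its energy."""
    e = msa_energy(x)
    # A fixed sigmoid replaces the learnable nonlinear functions here,
    # so each feature gets an attention weight in (0, 1).
    w = 1.0 / (1.0 + np.exp(-e))
    return x * w
```

Because the energy is non-negative, the sigmoid weights fall in [0.5, 1), so features far from their channel's surround are emphasized while typical features are attenuated least, loosely mirroring surround suppression.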
Acknowledgements
This work was supported by the National Natural Science Fund for Distinguished Young Scholar (No. 62025601).
Author information
Shuo Tan is currently pursuing the MS degree at the Machine Intelligence Laboratory, College of Computer Science, Sichuan University, China. His current research interests include convolutional neural networks and medical image analysis.
Lei Zhang received the BS and MS degrees in mathematics and the PhD degree in computer science from the University of Electronic Science and Technology of China, China in 2002, 2005, and 2008, respectively. She was a Post-Doctoral Research Fellow with the Department of Computer Science and Engineering, Chinese University of Hong Kong, China from 2008 to 2009. She was an Associate Editor of IEEE Transactions on Neural Networks and Learning Systems and an Associate Editor of IEEE Transactions on Cognitive and Developmental Systems. Her current research interests include the theory and applications of neural networks based on neocortex computing, and big-data analysis methods using very deep neural networks.
Xin Shu is currently pursuing the PhD degree with the Machine Intelligence Laboratory, College of Computer Science, Sichuan University, China. His current research interests include neural networks and intelligent medicine.
Zizhou Wang is currently pursuing the PhD degree with the Machine Intelligence Laboratory, College of Computer Science, Sichuan University, China. His current research interests include neural networks and medical image analysis.
Cite this article
Tan, S., Zhang, L., Shu, X. et al. A feature-wise attention module based on the difference with surrounding features for convolutional neural networks. Front. Comput. Sci. 17, 176338 (2023). https://doi.org/10.1007/s11704-022-2126-1