Abstract
Channel attention mechanisms have attracted growing research interest because of their generality and effectiveness in deep convolutional neural networks (DCNNs). However, the signal encoding methods of current popular channel attention mechanisms are limited. For example, SENet encodes channel relevance with fully connected layers, which is costly in parameters; ECANet encodes channel relevance with a 1D convolution, which needs fewer parameters but can only encode relevance among k adjacent channels at a single fixed scale. This paper proposes a novel dilated efficient channel attention module (DECA), which consists of a novel multi-scale channel encoding method and a novel channel relevance feature fusion method. We show empirically that channel relevance at different scales also contributes to performance, and that fusing channel relevance features from several scales yields a more powerful channel feature representation. In addition, we use weight sharing extensively in the DECA module to make it more efficient. To evaluate its effectiveness, we apply the module to the task of fire detection in real-life images. Extensive experiments with different backbone depths, detectors, and fire datasets show that the DECA module improves average performance by more than 4.5% compared with the baselines. Meanwhile, DECA outperforms other state-of-the-art attention modules in these experiments while using fewer or a comparable number of parameters. The experimental results on different datasets also show that the DECA module generalizes well.
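The idea of encoding channel relevance at several scales with a shared kernel can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the dilation rates (1, 2, 3), the fusion by plain summation, and the sigmoid gating are all assumptions made for illustration, since the abstract does not specify them. The parameter-count helpers assume a bias-free SE block with reduction ratio r and a single 1D kernel of odd size k.

```python
import numpy as np


def dilated_conv1d_same(y, kernel, dilation):
    """Cross-correlate a 1D signal with a dilated kernel (odd size), zero-padded to keep its length."""
    k = len(kernel)
    pad = dilation * (k - 1) // 2          # symmetric padding for an odd kernel size
    padded = np.pad(y, pad)
    out = np.zeros_like(y, dtype=float)
    for j, w in enumerate(kernel):
        start = j * dilation               # tap j looks `dilation` channels further along
        out += w * padded[start:start + len(y)]
    return out


def multi_scale_channel_attention(x, kernel, dilations=(1, 2, 3)):
    """Hypothetical multi-scale channel attention on a feature map x of shape (C, H, W)."""
    y = x.mean(axis=(1, 2))                # global average pooling -> channel descriptor (C,)
    # The same kernel is reused at every dilation rate (weight sharing),
    # and the per-scale responses are fused by summation (an assumption).
    fused = sum(dilated_conv1d_same(y, kernel, d) for d in dilations)
    weights = 1.0 / (1.0 + np.exp(-fused)) # sigmoid gate per channel
    return x * weights[:, None, None], weights


def se_param_count(channels, reduction=16):
    """Weights of two bias-free fully connected layers: C -> C/r -> C."""
    return 2 * channels * (channels // reduction)


def shared_kernel_param_count(kernel_size=3):
    """A single shared 1D kernel costs k weights, regardless of how many dilation rates reuse it."""
    return kernel_size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((8, 4, 4))
    kernel = np.array([0.25, 0.5, 0.25])
    out, w = multi_scale_channel_attention(features, kernel)
    print(out.shape, w.shape)
    print(se_param_count(256), shared_kernel_param_count(3))
```

Under these assumptions, a larger dilation rate lets the same k weights relate channels that are further apart, so the receptive field over channels grows without adding parameters, whereas the fully connected encoding grows quadratically with the channel count.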
Acknowledgements
This research is supported by the National Natural Science Foundation of China (61862060). We thank Gaia corporation and Gengyan Lei for providing the fire datasets used in this paper.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Wang, J., Yu, J. & He, Z. DECA: a novel multi-scale efficient channel attention module for object detection in real-life fire images. Appl Intell 52, 1362–1375 (2022). https://doi.org/10.1007/s10489-021-02496-y
DOI: https://doi.org/10.1007/s10489-021-02496-y