CAL-SSD: lightweight SSD object detection based on coordinated attention

Zhong, Xin

doi:10.1007/s11760-024-03716-x

Original Paper
Published: 02 December 2024

Volume 19, article number 31, (2025)

Signal, Image and Video Processing

Xin Zhong¹

Abstract

Existing object detection algorithms achieve excellent accuracy, but the pursuit of ever-higher accuracy has made models larger and more complex, which makes them difficult to deploy on edge and mobile devices. To make object detection more practical on such devices, this paper proposes CAL-SSD, a lightweight object detection algorithm built on a coordinated attention mechanism. First, we embed the coordinated attention mechanism into MobileNetv2 to form CA_MobileNetv2, which serves as the backbone of CAL-SSD; this significantly reduces the model's parameters and complexity and improves the network's ability to distinguish objects from background. Second, we design a super-resolution feature fusion module (SFFM) that introduces deep semantic information into shallow feature maps. Third, we replace the traditional 3×3 convolution with depthwise separable convolution when constructing the additional feature layers and detection heads, further reducing the model parameters. Finally, we employ BiFPN to construct a new feature pyramid that makes full use of the target's multi-scale features. Experimental results on the PASCAL VOC and MS COCO datasets show that CAL-SSD significantly reduces model parameters and complexity and achieves an optimal balance of speed and accuracy.
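
To give a sense of why swapping a traditional 3×3 convolution for a depthwise separable one reduces parameters, the following is a minimal sketch of the standard weight-count arithmetic. It is not taken from the paper: the function names and the channel sizes (256 in, 512 out) are illustrative assumptions, and the actual layer widths used in CAL-SSD are not given in the abstract.

```python
# Illustrative parameter-count comparison (hypothetical helper names and
# channel sizes; not the paper's code or its reported figures).

def conv_params(c_in, c_out, k=3, bias=False):
    """Weights in a standard k x k convolution: one k x k x c_in filter per output channel."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k=3, bias=False):
    """Weights in a depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 convolution mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out = 256, 512  # assumed layer widths, for illustration only
    standard = conv_params(c_in, c_out)
    separable = depthwise_separable_params(c_in, c_out)
    print(f"standard 3x3:        {standard}")
    print(f"depthwise separable: {separable}")
    print(f"ratio:               {separable / standard:.4f}")
```

Without bias terms, the ratio of the two counts equals 1/c_out + 1/k², so for a 3×3 kernel with many output channels the separable layer needs a little over one ninth of the weights.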



Author information

Authors and Affiliations

Jiangsu Automation Research Institute, Jiangsu, 222000, China
Xin Zhong

Corresponding author

Correspondence to Xin Zhong.

Ethics declarations

Conflict of interest

The author declares no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhong, X. CAL-SSD: lightweight SSD object detection based on coordinated attention. SIViP 19, 31 (2025). https://doi.org/10.1007/s11760-024-03716-x

Received: 24 June 2024
Revised: 02 September 2024
Accepted: 07 September 2024
Published: 02 December 2024
DOI: https://doi.org/10.1007/s11760-024-03716-x
