Enhanced YOLOv8 framework for precision vehicle detection in high-resolution remote sensing images

Shao, Zhaowei; He, Kunyu; Yuan, Baohua; Xu, Sheng

doi:10.1007/s11760-024-03783-0

Original Paper
Published: 17 January 2025

Signal, Image and Video Processing
Volume 19, article number 218 (2025)

Zhaowei Shao¹, Kunyu He¹, Baohua Yuan² & Sheng Xu¹

Abstract

Vehicle detection in high-resolution remote sensing imagery faces challenges such as varying scales, complex backgrounds, and high intra-class variability. We propose an enhanced YOLOv8 framework, incorporating three key advancements: the Adaptive Feature Pyramid Network (AFPN), Omni-Dimensional Convolution (ODConv), and a Slim Neck with Generalized Shuffle Convolution (GSConv). These enhancements improve vehicle detection accuracy, computational efficiency, and visual AI capabilities for applications such as computer animation and virtual worlds. Our model achieves a Mean Average Precision (mAP) of 0.7153, representing a 4.99% improvement over the baseline YOLOv8. Precision and recall increase to 0.9233 and 0.9329, respectively, while box loss is reduced from 1.213 to 1.054. This framework supports real-time surveillance, traffic monitoring, and urban planning. The NEPU-OWOD V2.0 dataset, used for evaluation, includes high-resolution images from multiple regions and seasons, along with diverse annotations and augmentations. Our modular approach allows for separate assessments of each enhancement. The dataset and source code are available for future research and development at https://doi.org/10.5281/zenodo.13075939.
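The precision and recall figures quoted in the abstract are detection metrics that are commonly derived by matching predicted boxes to ground-truth boxes by intersection over union (IoU). The sketch below illustrates that general idea only; it is not the authors' evaluation code. The function names, the (x1, y1, x2, y2) box format, the greedy matching strategy, and the 0.5 IoU threshold are all assumptions made for illustration, since the abstract does not state how the metrics were computed.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], iou_thr: float = 0.5
) -> Tuple[float, float]:
    """Greedy one-to-one matching of predictions to ground truth.

    `preds` is assumed to be sorted by descending confidence.
    The 0.5 threshold is an assumed, conventional default.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, iou_thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were never matched
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    truths = [(1, 1, 11, 11), (20, 20, 30, 30)]
    print(precision_recall(preds, truths))
```

In this toy input, two of the three predictions overlap a ground-truth box strongly enough to count as true positives, and the third is a false positive. Mean average precision extends the same matching by sweeping the confidence threshold and averaging precision over recall levels and classes; which mAP variant the paper reports is not stated in the abstract.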

Author information

Authors and Affiliations

College of Information Science and Technology and Artificial Intelligence, Nanjing Forestry University, Nanjing, Jiangsu, China
Zhaowei Shao, Kunyu He & Sheng Xu
College of Information Science and Engineering, Changzhou University, Changzhou, Jiangsu, China
Baohua Yuan

Corresponding author

Correspondence to Sheng Xu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shao, Z., He, K., Yuan, B. et al. Enhanced YOLOv8 framework for precision vehicle detection in high-resolution remote sensing images. SIViP 19, 218 (2025). https://doi.org/10.1007/s11760-024-03783-0

Received: 23 August 2024
Revised: 29 November 2024
Accepted: 11 December 2024
Published: 17 January 2025
DOI: https://doi.org/10.1007/s11760-024-03783-0
