SwinCGH-Net: Enhancing Robustness of Object Detection in Autonomous Driving with Weather Noise via Attention

Cao, Shi; Zhu, Qing; Zhu, Wanting

doi:10.1007/978-981-99-4761-4_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14090)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Object detection in autonomous driving requires high accuracy and speed across different weather conditions. Many CNN-based networks now achieve high accuracy on academic datasets, but their performance degrades severely when images contain various kinds of noise, which can be fatal for autonomous driving. In this paper, we propose SwinCGH-Net, a detection network based on the shifted-window Transformer (Swin Transformer) with a new detector head built on a lightweight convolutional attention module, so that the attention mechanism is used in both the feature extraction and detection stages. Specifically, we use Swin Transformer as the backbone to extract features, obtaining effective information from a small number of pixels while also integrating global information. We then further improve the robustness of the network through a detector head that contains the lightweight attention block S-CBAM. Furthermore, we use Generalized Focal Loss to compute the loss, which effectively enhances the representation ability of the model. Experiments on the Cityscapes and Cityscapes-C datasets demonstrate the superiority and effectiveness of our method under different weather conditions. As the level of weather noise increases, our method shows strong robustness compared with previous methods, especially in small object detection.
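
To make the loss term concrete, the following is a minimal sketch of the Quality Focal Loss component of Generalized Focal Loss for a single prediction. It is not the authors' implementation: the function name `quality_focal_loss`, the scalar interface, and the choice `beta = 2.0` are assumptions made for illustration, and the Distribution Focal Loss component used for box regression is omitted.

```python
import math


def quality_focal_loss(logit, target, beta=2.0):
    """Quality Focal Loss for one class score (illustrative sketch).

    logit  : raw (pre-sigmoid) classification score
    target : soft quality label in [0, 1], e.g. the IoU between the
             predicted box and its ground-truth box (0 for negatives)
    beta   : focusing exponent (assumed value, not taken from the chapter)
    """
    # Sigmoid turns the raw score into an estimated quality in (0, 1).
    p = 1.0 / (1.0 + math.exp(-logit))

    # Clamp to keep the logarithms finite for extreme logits.
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)

    # Binary cross-entropy against the soft label.
    bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

    # Modulating factor: shrinks the loss as the estimate approaches the label.
    return abs(target - p) ** beta * bce


# Usage: a confident, well-matched prediction is penalised less
# than an uncertain one for the same soft label.
loss_uncertain = quality_focal_loss(0.0, 1.0)
loss_confident = quality_focal_loss(2.0, 1.0)
```

Unlike the original focal loss, the label here is continuous, so the modulating factor measures the distance between the estimated and target quality rather than the distance from a hard 0/1 class label.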

Acknowledgements

This work is supported by the Beijing Natural Science Foundation (4232017).

Author information

Authors and Affiliations

College of Software Engineering, Beijing University of Technology, Beijing, China
Shi Cao, Qing Zhu & Wanting Zhu

Editor information

Editors and Affiliations

Department of Computer Science, Eastern Institute of Technology, Zhejiang, China
De-Shuang Huang
University of Wollongong, North Wollongong, NSW, Australia
Prashan Premaratne
Zhengzhou University of Light Industry, Zhengzhou, China
Baohua Jin
Zhong Yuan University of Technology, Zhengzhou, China
Boyang Qu
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Department of Computer Science, Liverpool John Moores University, Liverpool, UK
Abir Hussain

About this paper

Cite this paper

Cao, S., Zhu, Q., Zhu, W. (2023). SwinCGH-Net: Enhancing Robustness of Object Detection in Autonomous Driving with Weather Noise via Attention. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_8

DOI: https://doi.org/10.1007/978-981-99-4761-4_8
Published: 31 July 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4760-7
Online ISBN: 978-981-99-4761-4
eBook Packages: Computer Science, Computer Science (R0)
