Abstract
Object detection of vehicles and pedestrians in fog is of great significance for intelligent transportation and autonomous driving, and polarization images can improve object detection under adverse weather conditions. This study proposes a polarization image fusion method based on a grouped convolutional attention network (GCAnet) to improve the detection of cars and persons in foggy street scenes. Using the internationally available PolarLITIS image dataset, a multi-channel grouped convolution matrix was first constructed to take different types of polarization images as input. A grouped attention module was then added to enhance the features of each type of polarization image, and finally each convolutional matrix was connected in series to the detection network to perform object detection. The experimental results show that fusing three types of polarization images clearly outperforms fusing any two types or using a single type, and that adding an ECA attention module after the multi-channel convolution further raises the accuracy of the I04590 + Pauli + Stokes fused image to the highest value, 76.46%. A lightweight variant of the network, Mobilenet-ECA, increases speed by 26% with slightly reduced accuracy. The proposed GCAnet method significantly surpasses the traditional object detection networks SSD300, SSD512, Faster R-CNN600, Yolov3, and Yolov4, increasing mAP@0.5 by 28.90%, 27.60%, 15.01%, 24.98%, and 16.45%, respectively, and it increases mAP@0.5 by 9.36% and 6.20% compared with the foggy image detection methods AOD-Net SSD and DeRF-Yolov3-X, respectively. This work demonstrates the potential of GCAnet-enabled polarization image fusion as an effective foggy object detection method for intelligent transportation and autonomous driving.
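The idea of enhancing each polarization image type with its own attention step before fusion can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the ECA-style weighting below uses a fixed, hypothetical 1-D kernel in place of learned weights, the feature maps are toy 2 x 2 arrays, and the function names and the group labels in the dictionary are assumptions made for illustration only.

```python
import math


def global_avg_pool(feature_maps):
    """Return one mean value per channel; each channel is a 2-D list."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def eca_weights(channel_means, kernel):
    """ECA-style weights: zero-padded 1-D convolution across neighbouring
    channels followed by a sigmoid. The kernel is fixed here (assumption);
    in a trained network it would be learned."""
    pad = len(kernel) // 2  # kernel length is assumed to be odd
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for i in range(len(channel_means)):
        z = sum(k * padded[i + j] for j, k in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights


def apply_eca(feature_maps, kernel):
    """Scale every channel by its attention weight."""
    weights = eca_weights(global_avg_pool(feature_maps), kernel)
    return [
        [[w * value for value in row] for row in channel]
        for channel, w in zip(feature_maps, weights)
    ]


def grouped_attention(groups, kernel=(0.25, 0.5, 0.25)):
    """Apply attention to each group independently, then concatenate the
    re-weighted channels of all groups along the channel axis."""
    fused = []
    for maps in groups.values():
        fused.extend(apply_eca(maps, kernel))
    return fused


def constant_map(value, size=2):
    """Toy feature map filled with a single value (illustration only)."""
    return [[value] * size for _ in range(size)]


# Three hypothetical groups, one per polarization image type, 3 channels each.
groups = {
    "I04590": [constant_map(0.2), constant_map(0.4), constant_map(0.6)],
    "Pauli": [constant_map(0.1), constant_map(0.5), constant_map(0.9)],
    "Stokes": [constant_map(0.3), constant_map(0.3), constant_map(0.3)],
}
fused = grouped_attention(groups)
print(len(fused))  # 9 channels: 3 groups x 3 channels
```

Keeping the attention step inside each group means the channel weights for one image type are computed only from that type's own statistics; the groups interact only after concatenation, when the fused channels are passed on to a detector.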
Acknowledgements
The authors would like to acknowledge the support from the Hebei Natural Science Foundation under Grant No. C2020203010 and the National Natural Science Foundation of China under Grant No. 62073280.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Tan, A., Guo, T., Zhao, Y. et al. Object detection based on polarization image fusion and grouped convolutional attention network. Vis Comput 40, 3199–3215 (2024). https://doi.org/10.1007/s00371-023-03022-6
DOI: https://doi.org/10.1007/s00371-023-03022-6