Abstract
Road surface defect detection plays an important role in road construction and maintenance. However, the irregular shape of road surface defects and the complexity of the background make them difficult to extract accurately. To address this challenge, we apply deep-learning-based image segmentation. Existing deep learning networks, however, suffer from insufficient segmentation accuracy, low robustness, and limited generalization ability. We therefore propose a novel deep learning network, the Strip Pyramid ConvNeXt Network (SPCNet), for detecting road surface defects. First, we adopt ConvNeXt as the encoder to ensure the segmentation accuracy of the model. Second, we design a strip pyramid pooling module with strong edge-detail extraction capability, together with a multi-feature fusion module. We also create a cementation fissure dataset (CE dataset) to test the accuracy of the model and to verify its generalization capability and robustness. Finally, we compare our model with ten recent advanced segmentation networks on the CRACK500, GAPs384, and CE datasets; our model outperforms the others on four metrics.
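The abstract does not describe the internals of the strip pyramid pooling module, so the following is only a minimal, dependency-free sketch of generic strip pooling, the idea the module's name suggests. It assumes a single 2-D feature map, averages along each row and each column, and fuses the two summaries by addition. The function name `strip_pool`, the additive fusion, and the toy input are illustrative assumptions, not the authors' implementation; the pyramid (multi-scale) part and the ConvNeXt encoder are not covered.

```python
from typing import List

Grid = List[List[float]]


def strip_pool(feature: Grid) -> Grid:
    """Fuse horizontal and vertical strip averages of one 2-D feature map.

    Assumption: generic strip pooling for illustration only, not the
    paper's exact module.
    """
    height, width = len(feature), len(feature[0])
    # Horizontal strips: one average per row (an H x 1 summary).
    row_means = [sum(row) / width for row in feature]
    # Vertical strips: one average per column (a 1 x W summary).
    col_means = [sum(feature[r][c] for r in range(height)) / height
                 for c in range(width)]
    # Broadcast both summaries back to H x W and fuse them by addition.
    return [[row_means[r] + col_means[c] for c in range(width)]
            for r in range(height)]


# Toy feature map with a vertical "crack" response in column 1.
crack = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]
pooled = strip_pool(crack)
```

Because a long, thin structure such as a crack lines up with one strip direction, its column (or row) average stays high while the background averages stay low, which is one plausible reason strip-shaped pooling windows suit elongated defects better than square ones.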







Data availability
The datasets analyzed during the current study are available from the corresponding author on reasonable request.
Funding
This work was supported by the National Natural Science Foundation of China (51805078), the Fundamental Research Funds for the Central Universities (N2103011), the Central Guidance on Local Science and Technology Development Fund (2022JH6/100100023), the 111 Project (B16009), and the National Training Program of Innovation and Entrepreneurship for Undergraduates (S202210145181, "Deep learning based highway defect detection algorithm research").
Author information
Contributions
Ziang Zhou and Wensong Zhao wrote the main manuscript text, Jun Li prepared Figures 1–7, and Kechen Song prepared Figures 8–10. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or personal interests to disclose and no competing interests to declare that are relevant to the content of this article.
Ethical approval
Not applicable.
About this article
Cite this article
Zhou, Z., Zhao, W., Li, J. et al. SPCNet: a strip pyramid ConvNeXt network for detection of road surface defects. SIViP 18, 37–45 (2024). https://doi.org/10.1007/s11760-023-02698-6
DOI: https://doi.org/10.1007/s11760-023-02698-6