Abstract
The importance of periodic corrosion inspection of steel structures cannot be overstated. However, current manual inspection approaches are fraught with challenges: they are time-consuming, subjective, and pose safety risks. To address these limitations, extensive research has been conducted over the past decade to gauge the feasibility of Convolutional Neural Networks (CNNs) for automating corrosion inspection. Meanwhile, Transformer networks have recently emerged as powerful tools in computer vision owing to their ability to model intricate global relationships. In this paper, a novel hybrid architecture, dubbed CorFormer, is proposed for effective and efficient automation of corrosion inspection. The CorFormer encoder fuses Transformer and CNN layers at different stages, capturing global context through the Transformer layers while leveraging the inherent inductive bias of the CNN layers. To bridge the semantic gap between the features generated by the Transformer and CNN layers, a Semantic Gap Merger (SGM) module is introduced after each feature merge operation. The encoder is complemented by a hierarchical decoder capable of decoding complex features at both large and small scales. CorFormer is compared against state-of-the-art CNN and Transformer architectures for corrosion segmentation and outperforms the best alternative by 2.7% in Intersection over Union (IoU) across 10 validation data splits, while enabling real-time inspection at 28 frames per second. Rigorous statistical tests support the findings presented in this study, and an extensive ablation study validates all design choices.
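To give a concrete picture of the fusion idea described above, the following is a minimal, hypothetical PyTorch sketch of merging a CNN feature map with Transformer tokens at one encoder stage. It is not the paper's actual SGM implementation (the abstract does not specify its internals); the module name `HypotheticalGapMerger`, the projection-plus-1x1-convolution design, and all tensor shapes are illustrative assumptions.

```python
# Illustrative sketch only: a hypothetical stand-in for a CNN/Transformer
# feature-merging module, NOT the published SGM design.
import torch
import torch.nn as nn


class HypotheticalGapMerger(nn.Module):
    """Fuses a CNN feature map with Transformer tokens from the same stage."""

    def __init__(self, cnn_channels: int, token_dim: int, out_channels: int):
        super().__init__()
        # Project Transformer tokens into the CNN channel space.
        self.token_proj = nn.Linear(token_dim, cnn_channels)
        # 1x1 convolution + normalization to reconcile the two feature styles.
        self.merge = nn.Sequential(
            nn.Conv2d(2 * cnn_channels, out_channels, kernel_size=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, cnn_feat: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # cnn_feat: (B, C, H, W); tokens: (B, H*W, D) on the same spatial grid.
        b, c, h, w = cnn_feat.shape
        tok = self.token_proj(tokens)                   # (B, H*W, C)
        tok = tok.transpose(1, 2).reshape(b, c, h, w)   # back to a feature map
        fused = torch.cat([cnn_feat, tok], dim=1)       # channel-wise concat
        return self.merge(fused)


# Example usage with toy shapes (batch of 2, 64-channel stage, 32x32 grid).
if __name__ == "__main__":
    merger = HypotheticalGapMerger(cnn_channels=64, token_dim=96, out_channels=64)
    cnn_feat = torch.randn(2, 64, 32, 32)
    tokens = torch.randn(2, 32 * 32, 96)
    print(merger(cnn_feat, tokens).shape)  # torch.Size([2, 64, 32, 32])
```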















Data availability
No datasets were generated or analysed during the current study.
Author information
Authors and Affiliations
Contributions
A.S. conceptualized and implemented the architecture, performed the experiments, and wrote the manuscript. C.Q. collected the dataset and wrote part of the introduction and literature review sections. R.S. assisted C.Q. with data collection and the literature review, and helped edit the manuscript. M.R.J. provided guidance throughout the project, from conception to completion, and helped edit the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Subedi, A., Qian, C., Sadeghian, R. et al. CorFormer: a hybrid transformer-CNN architecture for corrosion segmentation on metallic surfaces. Machine Vision and Applications 36, 45 (2025). https://doi.org/10.1007/s00138-025-01663-2