Abstract
Scrap steel is a green, renewable resource that can be recycled indefinitely, and its recycling is of great significance for reducing carbon emissions and promoting the green transformation of the steel industry. However, current scrap steel recycling faces a series of challenges, including high labor intensity and occupational risks for inspectors, complex and diverse sources of scrap, varying material types, and the difficulty of quantifying and standardizing manual visual inspection and rating. To overcome these challenges, we propose WaveSegNet, a scrap steel segmentation network based on wavelet transform and a multi-scale focusing structure. First, we apply wavelet transforms to decompose images and extract features at different frequencies, capturing both fine details and structural information. Second, we introduce a multi-scale focusing mechanism that further improves segmentation accuracy by extracting and perceiving features at multiple scales. Experiments on the public Cityscapes dataset and a scrap steel dataset show that WaveSegNet achieves outstanding performance and efficiency in semantic segmentation, outperforming other advanced models. These results attest to the potential of WaveSegNet for intelligent scrap rating and provide a new solution for the scrap steel recycling industry.
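To make the two ideas in the abstract concrete, the following is a minimal sketch, using NumPy and PyWavelets, of how a 2-D wavelet transform separates an image into frequency sub-bands and how repeating the decomposition yields features at multiple scales. It is illustrative only and does not reproduce the actual WaveSegNet architecture; the wavelet choice ("haar"), the number of levels, and the function names are assumptions.

```python
# Illustrative sketch (not the authors' implementation): a 2-D DWT splits an
# image into a low-frequency approximation (structure) and high-frequency
# detail sub-bands (edges, textures); repeating it on the approximation gives
# a crude multi-scale feature pyramid.
import numpy as np
import pywt  # PyWavelets


def wavelet_features(image: np.ndarray):
    """Single-level 2-D DWT: approximation plus horizontal/vertical/diagonal
    detail sub-bands."""
    approx, (horiz, vert, diag) = pywt.dwt2(image, "haar")
    return approx, horiz, vert, diag


def multiscale_pyramid(image: np.ndarray, levels: int = 3):
    """Repeatedly decompose the approximation band, keeping the detail
    sub-bands at every scale -- a stand-in for multi-scale feature extraction."""
    features = []
    current = image
    for _ in range(levels):
        current, details = pywt.dwt2(current, "haar")
        features.append(details)  # high-frequency detail at this scale
    features.append(current)      # coarsest low-frequency structure
    return features


if __name__ == "__main__":
    img = np.random.rand(256, 256).astype(np.float32)  # stand-in grayscale image
    scales = multiscale_pyramid(img, levels=3)
    for i, details in enumerate(scales[:-1]):
        print(f"scale {i}: detail sub-band shape {details[0].shape}")
    print(f"coarsest approximation shape {scales[-1].shape}")
```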
Acknowledgements
This work was supported by the Key R&D Program of Hebei Province (No. 21373802D) and the Artificial Intelligence Collaborative Education Project of the Ministry of Education (No. 201801003011).
The GPU server used in this work was jointly funded by Shijiazhuang Wusou Network Technology Co., Ltd. and Hebei Rouzun Technology Co., Ltd.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhong, J., Xu, Y., Liu, C. (2024). WaveSegNet: Wavelet Transform and Multi-scale Focusing Network for Scrap Steel Segmentation. In: Cao, C., Chen, H., Zhao, L., Arshad, J., Asyhari, T., Wang, Y. (eds.) Knowledge Science, Engineering and Management. KSEM 2024. Lecture Notes in Computer Science, vol. 14887. Springer, Singapore. https://doi.org/10.1007/978-981-97-5501-1_15