Abstract
In this paper, we adopt ConvNeXt as the model for waste classification from digital images. ConvNeXt is a CNN-based backbone network proposed to further improve the performance of convolutional models on visual tasks, following the wave of research built on Transformers. Taking ConvNeXt as the backbone, we obtain an efficient waste classification model. In our experiments, we categorize waste into four classes according to predefined classification criteria and collect 1,660 labeled images for model training. The best result in our experiments, achieved with ConvNeXt, is an accuracy of 79.88% in waste classification. To evaluate the model, we report AP and mAP. Our experimental results show that a Mask R-CNN network with ConvNeXt as the backbone outperforms existing methods for waste classification.
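The AP and mAP metrics mentioned in the abstract can be sketched as follows. This is a minimal illustration of one common definition of average precision (precision averaged over the ranks of the true positives, then averaged across classes), not the authors' exact evaluation code; the per-class scores and labels below are hypothetical.

```python
def average_precision(scores, labels):
    """AP for one class: mean precision at each rank where a positive occurs.

    scores: predicted confidence per sample.
    labels: 1 if the sample truly belongs to the class, else 0.
    """
    # Rank samples by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)  # precision at this positive's rank
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_class):
    """mAP: AP averaged over classes (four waste classes in this paper)."""
    aps = [average_precision(s, l) for s, l in per_class]
    return sum(aps) / len(aps)


# Hypothetical scores/labels for two classes, for illustration only.
data = [
    ([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 1]),
    ([0.95, 0.4, 0.3], [1, 0, 1]),
]
print(round(mean_average_precision(data), 4))  # → 0.875
```

Under this definition, each class contributes equally to mAP regardless of how many samples it has, which is why papers typically report per-class AP alongside the mean.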
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this paper
Qi, J., Nguyen, M., Yan, W.Q. (2023). Waste Classification from Digital Images Using ConvNeXt. In: Wang, H., et al. Image and Video Technology. PSIVT 2022. Lecture Notes in Computer Science, vol 13763. Springer, Cham. https://doi.org/10.1007/978-3-031-26431-3_1
Print ISBN: 978-3-031-26430-6
Online ISBN: 978-3-031-26431-3