
Waste Classification from Digital Images Using ConvNeXt

  • Conference paper
Image and Video Technology (PSIVT 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13763)


Abstract

In this paper, ConvNeXt is selected as the model for waste classification from digital images. ConvNeXt is a CNN-based backbone network proposed to further improve the performance of models for visual tasks, following the line of research inspired by the Transformer. We take ConvNeXt as the backbone to obtain an efficient waste classification model. In our experiments, we categorized waste into four classes based on predefined classification criteria and collected 1,660 labeled images for model training. The best experimental result in this paper, achieved with ConvNeXt, is an accuracy of 79.88% in waste classification. To evaluate the model, we report AP and mAP for waste classification. Our experimental results show that a Mask R-CNN network with ConvNeXt as the backbone outperforms the existing methods for waste classification.
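The building block underlying the backbone described above can be sketched in PyTorch. This is a simplified illustration of a single ConvNeXt block (7×7 depthwise convolution, channels-last LayerNorm, inverted-bottleneck MLP with GELU, and a residual connection), not the authors' implementation; it omits LayerScale and stochastic depth, and the dimensions chosen below are illustrative.

```python
import torch
import torch.nn as nn

class ConvNeXtBlock(nn.Module):
    """Simplified ConvNeXt block: depthwise 7x7 conv -> LayerNorm ->
    pointwise expansion with GELU -> pointwise projection, plus residual."""

    def __init__(self, dim: int):
        super().__init__()
        # Depthwise convolution: one 7x7 filter per channel, spatial size preserved.
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)           # applied in channels-last layout
        self.pwconv1 = nn.Linear(dim, 4 * dim)  # inverted-bottleneck expansion
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)  # project back to input width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)               # (N, C, H, W) -> (N, H, W, C)
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)               # back to (N, C, H, W)
        return x + residual

# A dummy forward pass; the block keeps the input shape unchanged.
block = ConvNeXtBlock(dim=96)
out = block(torch.randn(1, 96, 56, 56))
print(tuple(out.shape))  # (1, 96, 56, 56)
```

For a classification task such as the four-class waste problem here, a stack of such blocks would be followed by global average pooling and a linear head whose output width equals the number of classes.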




Corresponding author

Correspondence to Wei Qi Yan.


Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Qi, J., Nguyen, M., Yan, W.Q. (2023). Waste Classification from Digital Images Using ConvNeXt. In: Wang, H., et al. Image and Video Technology. PSIVT 2022. Lecture Notes in Computer Science, vol 13763. Springer, Cham. https://doi.org/10.1007/978-3-031-26431-3_1


  • DOI: https://doi.org/10.1007/978-3-031-26431-3_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26430-6

  • Online ISBN: 978-3-031-26431-3

  • eBook Packages: Computer Science, Computer Science (R0)
