GIGO, Garbage In, Garbage Out: An Urban Garbage Classification Dataset

Sukel, Maarten; Rudinac, Stevan; Worring, Marcel

doi:10.1007/978-3-031-27077-2_41

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13833)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This paper presents a real-world, domain-specific dataset that facilitates algorithm development and benchmarking on the challenging problem of multimodal classification of urban waste in street-level imagery. The dataset, which we have named “GIGO: Garbage In, Garbage Out,” consists of 25k images collected from a large geographic area of Amsterdam. It was created to help cities collect different types of garbage from the streets in a more sustainable fashion. The collected data differs from existing benchmarking datasets and introduces unique scientific challenges. In this fine-grained classification dataset, the garbage categories are visually heterogeneous: the objects of interest vary in size, origin, material, and visual appearance. In addition, we provide various open data statistics about the geographic area in which the images were collected, such as demographics, neighborhood statistics, and information about buildings in the vicinity. This allows for experimentation with multimodal approaches. Finally, we provide several state-of-the-art baselines utilizing the different modalities of the dataset.

Author information

Authors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Maarten Sukel, Stevan Rudinac & Marcel Worring

Corresponding author

Correspondence to Maarten Sukel.

Editor information

Editors and Affiliations

University of Bergen, Bergen, Norway
Duc-Tien Dang-Nguyen
Dublin City University, Dublin, Ireland
Cathal Gurrin
Radboud University Nijmegen, Nijmegen, The Netherlands
Martha Larson
Dublin City University, Dublin, Ireland
Alan F. Smeaton
University of Amsterdam, Amsterdam, The Netherlands
Stevan Rudinac
National Institute of Information and Communications Technology, Tokyo, Japan
Minh-Son Dao
Department of Information Science and Media Studies, University of Bergen, Bergen, Norway
Christoph Trattner
La Trobe University, Melbourne, VIC, Australia
Phoebe Chen

About this paper

Cite this paper

Sukel, M., Rudinac, S., Worring, M. (2023). GIGO, Garbage In, Garbage Out: An Urban Garbage Classification Dataset. In: Dang-Nguyen, DT., et al. MultiMedia Modeling. MMM 2023. Lecture Notes in Computer Science, vol 13833. Springer, Cham. https://doi.org/10.1007/978-3-031-27077-2_41

DOI: https://doi.org/10.1007/978-3-031-27077-2_41
Published: 29 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27076-5
Online ISBN: 978-3-031-27077-2
eBook Packages: Computer Science, Computer Science (R0)
