Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement

  • Conference paper
  • Intelligent Human Computer Interaction (IHCI 2023)

Abstract

Large-scale datasets for single-label multi-class classification, such as ImageNet-1k, have been instrumental in advancing deep learning and computer vision. However, a critical and often understudied aspect is the comprehensive quality assessment of these datasets, especially regarding potential multi-label annotation errors. In this paper, we introduce a lightweight, user-friendly, and scalable framework that synergizes human and machine intelligence for efficient dataset validation and quality enhancement. We term this novel framework Multilabelfy. Central to Multilabelfy is an adaptable web-based platform that systematically guides annotators through the re-evaluation process, effectively leveraging human-machine interactions to enhance dataset quality. By using Multilabelfy on the ImageNetV2 dataset, we found that approximately 47.88% of the images contained at least two labels, underscoring the need for more rigorous assessments of such influential datasets. Furthermore, our analysis showed a negative correlation between the number of potential labels per image and model top-1 accuracy, illuminating a crucial factor in model evaluation and selection. Our open-source framework, Multilabelfy, offers a convenient, lightweight solution for dataset enhancement, emphasizing multi-label proportions. This study tackles major challenges in dataset integrity and provides key insights into model performance evaluation. Moreover, it underscores the advantages of integrating human expertise with machine capabilities to produce more robust models and trustworthy data development.

Acknowledgment

This research was supported by Ghent University Global Campus (GUGC) in Korea. This research was also supported by the National Research Foundation of Korea (NRF) grant (2020K1A3A1A68093469), funded by the Korean Ministry of Science and ICT (MSIT). We specifically thank the following people for their contribution to the annotation process: Gayoung Lee, Gyubin Lee, Herim Lee, Hyesoo Hong, Jihyung Yoo, Jin-Woo Park, Kangmin Kim, Jongbum Won, Sohee Lee, Sohn Yerim, Taeyoung Choi, Younghyun Kim, Yujin Cho, and Wonjun Yang.

Author information

Correspondence to Esla Timothy Anzaku.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Anzaku, E.T. et al. (2024). Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement. In: Choi, B.J., Singh, D., Tiwary, U.S., Chung, WY. (eds) Intelligent Human Computer Interaction. IHCI 2023. Lecture Notes in Computer Science, vol 14531. Springer, Cham. https://doi.org/10.1007/978-3-031-53827-8_27

  • DOI: https://doi.org/10.1007/978-3-031-53827-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53826-1

  • Online ISBN: 978-3-031-53827-8

  • eBook Packages: Computer Science, Computer Science (R0)
