CCLM: Class-Conditional Label Noise Modelling

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2023)

Abstract

The performance of deep neural networks depends heavily on the quality and volume of the training data. However, cost-effective labelling processes such as crowdsourcing and web crawling often yield data with noisy (i.e., wrong) labels. Making models robust to this label noise is therefore of prime importance. A common approach is to model the label noise using the distribution of per-sample losses; the robustness of such methods, however, hinges on how accurately the training set is divided into clean and noisy samples. In this work, we pursue this research direction, highlight the problem of treating the loss distribution globally, and propose a class-conditional approach to splitting clean from noisy samples. We apply our approach to the popular DivideMix algorithm and show that this local, per-class treatment of the loss distribution fares better than the global one. We validate our hypothesis on two popular benchmark datasets and show substantial improvements over the baseline experiments. We further analyze the effectiveness of the proposal using two different metrics: Noise Division Accuracy and Classiness.
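To make the idea concrete, the following is a minimal sketch of what a class-conditional clean/noisy split could look like. Following DivideMix, per-sample training losses are fitted with a two-component Gaussian Mixture Model, and the posterior of the low-mean component serves as a clean probability; the class-conditional variant simply fits one such GMM per class instead of a single global one. The function names, the threshold `tau`, and the use of scikit-learn's `GaussianMixture` are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a class-conditional clean/noisy split, in the
# spirit of DivideMix's loss-based GMM modelling (not the authors' code).
import numpy as np
from sklearn.mixture import GaussianMixture


def clean_probabilities(losses: np.ndarray) -> np.ndarray:
    """Fit a 2-component GMM to 1-D losses; return P(clean) per sample.

    The component with the smaller mean is taken to be the 'clean' one,
    since correctly labelled samples tend to incur lower losses.
    """
    x = losses.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=5e-4)
    gmm.fit(x)
    clean_component = int(np.argmin(gmm.means_.ravel()))
    return gmm.predict_proba(x)[:, clean_component]


def class_conditional_split(losses, noisy_labels, num_classes, tau=0.5):
    """Split the training set per class instead of globally.

    losses:       (N,) per-sample training losses
    noisy_labels: (N,) observed (possibly wrong) integer labels
    tau:          clean-probability threshold (illustrative choice)
    Returns a boolean mask marking the samples treated as clean.
    """
    losses = np.asarray(losses, dtype=np.float64)
    noisy_labels = np.asarray(noisy_labels)
    clean_mask = np.zeros(len(losses), dtype=bool)
    for c in range(num_classes):
        idx = np.flatnonzero(noisy_labels == c)
        if len(idx) < 2:  # too few samples to fit a 2-component GMM
            continue
        p_clean = clean_probabilities(losses[idx])
        clean_mask[idx] = p_clean > tau
    return clean_mask
```

On synthetic-noise benchmarks where the ground-truth noise mask is known, the quality of such a split can be measured directly by comparing the predicted mask against it, which is the diagnostic role a metric such as the paper's Noise Division Accuracy plays.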

A. Tatjer and B. Nagarajan—Joint First Authors.

P. Radeva—IAPR Fellow.

R. Marques—Serra Húnter Fellow.


References

  1. Angluin, D., Laird, P.: Learning from noisy examples. Mach. Learn. 2, 343–370 (1988)

  2. Arazo, E., Ortego, D., Albert, P., O’Connor, N., McGuinness, K.: Unsupervised label noise modeling and loss correction. In: International Conference on Machine Learning, pp. 312–321. PMLR (2019)

  3. Berthelot, D., Carlini, N., Goodfellow, I., Papernot, N., Oliver, A., Raffel, C.A.: MixMatch: a holistic approach to semi-supervised learning. In: NeurIPS, vol. 32 (2019)

  4. Byrd, J., Lipton, Z.: What is the effect of importance weighting in deep learning? In: International Conference on Machine Learning, pp. 872–881. PMLR (2019)

  5. Chen, C., et al.: Generalized data weighting via class-level gradient manipulation. In: NeurIPS, vol. 34, pp. 14097–14109 (2021)

  6. Chen, Z., Song, A., Wang, Y., Huang, X., Kong, Y.: A noise rate estimation method for image classification with label noise. In: Journal of Physics: Conference Series, vol. 2433, p. 012039. IOP Publishing (2023)

  7. Cheng, D., et al.: Instance-dependent label-noise learning with manifold-regularized transition matrix estimation. In: CVPR, pp. 16630–16639 (2022)

  8. Ding, K., Shu, J., Meng, D., Xu, Z.: Improve noise tolerance of robust loss via noise-awareness. arXiv preprint arXiv:2301.07306 (2023)

  9. Han, B., et al.: Co-teaching: robust training of deep neural networks with extremely noisy labels. In: NeurIPS, vol. 31 (2018)

  10. Han, J., Luo, P., Wang, X.: Deep self-learning from noisy labels. In: ICCV, pp. 5138–5147 (2019)

  11. Hendrycks, D., Mazeika, M., Wilson, D., Gimpel, K.: Using trusted data to train deep networks on labels corrupted by severe noise. In: NeurIPS, vol. 31 (2018)

  12. Khetan, A., Lipton, Z.C., Anandkumar, A.: Learning from noisy singly-labeled data. arXiv preprint arXiv:1712.04577 (2017)

  13. Kim, D., Ryoo, K., Cho, H., Kim, S.: SplitNet: learnable clean-noisy label splitting for learning with noisy labels. arXiv preprint arXiv:2211.11753 (2022)

  14. Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)

  15. Li, J., Socher, R., Hoi, S.C.: DivideMix: learning with noisy labels as semi-supervised learning. arXiv preprint arXiv:2002.07394 (2020)

  16. Liao, Y.H., Kar, A., Fidler, S.: Towards good practices for efficiently annotating large-scale image classification datasets. In: CVPR, pp. 4350–4359 (2021)

  17. Liu, S., Zhu, Z., Qu, Q., You, C.: Robust training under label noise by over-parameterization. In: ICML, pp. 14153–14172. PMLR (2022)

  18. Liu, X., Luo, S., Pan, L.: Robust boosting via self-sampling. Knowl.-Based Syst. 193, 105424 (2020)

  19. Ma, X., Huang, H., Wang, Y., Romano, S., Erfani, S., Bailey, J.: Normalized loss functions for deep learning with noisy labels. In: ICML, pp. 6543–6553 (2020)

  20. Miyamoto, H.K., Meneghetti, F.C., Costa, S.I.: The Fisher-Rao loss for learning under label noise. Inf. Geometry 1–20 (2022)

  21. Nagarajan, B., Marques, R., Mejia, M., Radeva, P.: Class-conditional importance weighting for deep learning with noisy labels. In: VISIGRAPP (5: VISAPP), pp. 679–686 (2022)

  22. Nishi, K., Ding, Y., Rich, A., Höllerer, T.: Augmentation strategies for learning with noisy labels. In: CVPR, pp. 8022–8031 (2021)

  23. Northcutt, C., Jiang, L., Chuang, I.: Confident learning: estimating uncertainty in dataset labels. J. Artif. Intell. Res. 70, 1373–1411 (2021)

  24. Oyen, D., Kucer, M., Hengartner, N., Singh, H.S.: Robustness to label noise depends on the shape of the noise distribution in feature space. arXiv preprint arXiv:2206.01106 (2022)

  25. Patrini, G., Rozza, A., Krishna Menon, A., Nock, R., Qu, L.: Making deep neural networks robust to label noise: a loss correction approach. In: CVPR, pp. 1944–1952 (2017)

  26. Song, H., Kim, M., Park, D., Shin, Y., Lee, J.G.: Learning from noisy labels with deep neural networks: a survey. IEEE Trans. Neural Netw. Learn. Syst. (2022)

  27. Sun, Z., et al.: PNP: robust learning from noisy labels by probabilistic noise prediction. In: CVPR, pp. 5311–5320 (2022)

  28. Valle-Pérez, G., Camargo, C.Q., Louis, A.A.: Deep learning generalizes because the parameter-function map is biased towards simple functions. arXiv preprint arXiv:1805.08522 (2018)

  29. Wang, H., Xiao, R., Dong, Y., Feng, L., Zhao, J.: ProMix: combating label noise via maximizing clean sample utility. arXiv preprint arXiv:2207.10276 (2022)

  30. Wei, J., Zhu, Z., Cheng, H., Liu, T., Niu, G., Liu, Y.: Learning with noisy labels revisited: a study using real-world human annotations. arXiv preprint arXiv:2110.12088 (2021)

  31. Wu, P., Zheng, S., Goswami, M., Metaxas, D., Chen, C.: A topological filter for learning with label noise. In: NeurIPS, vol. 33, pp. 21382–21393 (2020)

  32. Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning (still) requires rethinking generalization. Commun. ACM 64(3), 107–115 (2021)

  33. Zhang, Y., Niu, G., Sugiyama, M.: Learning noise transition matrix from only noisy labels via total variation regularization. In: ICML, pp. 12501–12512 (2021)

  34. Zheltonozhskii, E., Baskin, C., Mendelson, A., Bronstein, A.M., Litany, O.: Contrast to divide: self-supervised pre-training for learning with noisy labels. In: WACV, pp. 1657–1667 (2022)

  35. Zhou, X., Liu, X., Zhai, D., Jiang, J., Ji, X.: Asymmetric loss functions for noise-tolerant learning: theory and applications. IEEE Trans. Pattern Anal. Mach. Intell. (2023)


Acknowledgements

This work was partially funded by the Horizon EU project MUSAE (No. 01070421), 2021-SGR-01094 (AGAUR), Icrea Academia’2022 (Generalitat de Catalunya), Robo STEAM (2022-1-BG01-KA220-VET-000089434, Erasmus+ EU), DeepSense (ACE053/22/000029, ACCIÓ), DeepFoodVol (AEI-MICINN, PDC2022-133642-I00) and CERCA Programme/Generalitat de Catalunya. B. Nagarajan acknowledges the support of FPI Becas, MICINN, Spain. We acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPUs. As Serra Húnter Fellow, Ricardo Marques acknowledges the support of the Serra Húnter Programme.

Author information


Corresponding author

Correspondence to Bhalaji Nagarajan.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Tatjer, A., Nagarajan, B., Marques, R., Radeva, P. (2023). CCLM: Class-Conditional Label Noise Modelling. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_1

  • DOI: https://doi.org/10.1007/978-3-031-36616-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36615-4

  • Online ISBN: 978-3-031-36616-1

  • eBook Packages: Computer Science, Computer Science (R0)
