Combining PENCIL with AMDIM for image classification with noisy and sparsely labeled data

ABSTRACT
Recent years have seen an increase in data availability and computational power, which has led to superior performance in training deep learning models for image classification. In many real-world use cases, however, training datasets come with noisy labels that have been automatically generated for only a small subset of the available data. In this paper, we explore strategies for maintaining classification performance when the labels become noisy and sparse. In particular, we evaluate the effectiveness of combining PENCIL, a framework for correcting noisy labels during training, with AMDIM, a self-supervised technique for learning good data representations from unlabeled data. We find that this combination is significantly more effective at dealing with sparse and noisy labels than using either approach alone.
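To make the combination concrete, below is a minimal PyTorch sketch, not the paper's implementation: a simplified single-scale InfoNCE objective stands in for AMDIM's multi-scale mutual-information loss during self-supervised pretraining, and a PENCIL-style joint loss over network predictions and learnable per-example label distributions handles the noisy, sparse labels. All names (info_nce_loss, PencilLabels, pencil_loss) and hyperparameter values (temperature, scale, alpha, beta) are illustrative assumptions.

```python
# Hypothetical sketch of the pipeline described above; names and
# hyperparameters (temperature, scale, alpha, beta) are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


def info_nce_loss(z1, z2, temperature=0.1):
    """Simplified single-scale contrastive objective standing in for
    AMDIM's multi-scale loss: features of two augmented views of the
    same image are positives; other images in the batch are negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature               # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)


class PencilLabels(nn.Module):
    """PENCIL-style learnable label distribution per training example,
    initialized from the noisy one-hot labels."""

    def __init__(self, noisy_labels, num_classes, scale=10.0):
        super().__init__()
        init = F.one_hot(noisy_labels, num_classes).float() * scale
        self.y_tilde = nn.Parameter(init)            # (N, C) label logits

    def forward(self, idx):
        return F.softmax(self.y_tilde[idx], dim=1)   # corrected soft labels


def pencil_loss(logits, soft_labels, noisy_labels, alpha=0.4, beta=0.1):
    """PENCIL-style joint loss: fit the corrected labels, stay compatible
    with the observed noisy labels, and regularize prediction entropy."""
    log_pred = F.log_softmax(logits, dim=1)
    lc = F.kl_div(log_pred, soft_labels, reduction='batchmean')
    lo = F.nll_loss(torch.log(soft_labels + 1e-8), noisy_labels)
    le = -(log_pred.exp() * log_pred).sum(dim=1).mean()
    return lc + alpha * lo + beta * le
```

Under this setup, the encoder would first be pretrained with info_nce_loss on two augmented views of every image, labeled or not, and then fine-tuned together with a classifier head on the labeled subset while the PencilLabels parameters are updated jointly (PENCIL proposes a separate, much larger learning rate for the label logits).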
REFERENCES
- Görkem Algan and Ilkay Ulusoy. 2019. Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey. arXiv:1912.05170 [cs, stat] (Dec. 2019).
- Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, and Simon Lacoste-Julien. 2017. A Closer Look at Memorization in Deep Networks. arXiv:1706.05394 [cs, stat] (July 2017).
- Philip Bachman, R Devon Hjelm, and William Buchwalter. 2019. Learning Representations by Maximizing Mutual Information Across Views. In Advances in Neural Information Processing Systems (NeurIPS 2019).
- Benoit Frenay and Michel Verleysen. 2014. Classification in the Presence of Label Noise: A Survey. IEEE Transactions on Neural Networks and Learning Systems 25, 5 (May 2014), 845–869. https://doi.org/10.1109/TNNLS.2013.2292894
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Identity Mappings in Deep Residual Networks. arXiv:1603.05027 [cs] (July 2016).
- Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, and Aaron van den Oord. 2019. Data-Efficient Image Recognition with Contrastive Predictive Coding. arXiv:1905.09272 [cs] (Dec. 2019).
- R. Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Phil Bachman, Adam Trischler, and Yoshua Bengio. 2019. Learning Deep Representations by Mutual Information Estimation and Maximization. arXiv:1808.06670 [cs, stat] (Feb. 2019).
- Longlong Jing and Yingli Tian. 2019. Self-Supervised Visual Feature Learning with Deep Neural Networks: A Survey. arXiv:1902.06162 [cs] (Feb. 2019).
- Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. Technical Report, University of Toronto.
- Kuang-Huei Lee, Xiaodong He, Lei Zhang, and Linjun Yang. 2018. CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise. arXiv:1711.07131 [cs] (March 2018).
- Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, and Xiaoou Tang. 2016. DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Vasileios Papapanagiotou, Christos Diou, and Anastasios Delopoulos. 2015. Improving Concept-Based Image Retrieval with Training Weights Computed from Tags. ACM Trans. Multimedia Comput. Commun. Appl. 12, 2, Article 32 (Nov. 2015), 22 pages. https://doi.org/10.1145/2790230
- Ioannis Sarafis, Christos Diou, Theodora Tsikrika, and Anastasios Delopoulos. 2014. Weighted SVM from Clickthrough Data for Image Retrieval. In 2014 IEEE International Conference on Image Processing (ICIP). IEEE, 3013–3017.
- Chen Sun, Abhinav Shrivastava, Saurabh Singh, and Abhinav Gupta. 2017. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). 843–852.
- Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, and Kiyoharu Aizawa. 2018. Joint Optimization Framework for Learning with Noisy Labels. arXiv:1803.11364 [cs, stat] (March 2018).
- Theodora Tsikrika, Christos Diou, Arjen P. de Vries, and Anastasios Delopoulos. 2009. Image Annotation Using Clickthrough Data. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR '09). ACM, New York, NY, USA, Article 14, 8 pages. https://doi.org/10.1145/1646396.1646415
- Theodora Tsikrika, Christos Diou, Arjen P. de Vries, and Anastasios Delopoulos. 2011. Reliability and Effectiveness of Clickthrough Data for Automatic Image Annotation. Multimedia Tools and Applications 55, 1 (2011), 27–52.
- Aaron van den Oord, Yazhe Li, and Oriol Vinyals. 2019. Representation Learning with Contrastive Predictive Coding. arXiv:1807.03748 [cs, stat] (Jan. 2019).
- Jesper E. Van Engelen and Holger H. Hoos. 2020. A Survey on Semi-Supervised Learning. Machine Learning 109, 2 (2020), 373–440.
- Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, and Serge Belongie. 2017. Learning From Noisy Large-Scale Datasets With Minimal Supervision. arXiv:1701.01619 [cs] (April 2017).
- Kun Yi and Jianxin Wu. 2019. Probabilistic End-to-End Noise Correction for Learning with Noisy Labels. arXiv:1903.07788 [cs] (March 2019).