Towards instance-dependent label noise-tolerant classification: a probabilistic approach

Bootkrajang, Jakramate; Chaijaruwanich, Jeerayut

doi:10.1007/s10044-018-0750-z

Theoretical Advances
Published: 30 August 2018

Volume 23, pages 95–111 (2020)

Pattern Analysis and Applications

Abstract

Learning from labelled data is becoming increasingly challenging because training labels are inherently imperfect. Existing label noise-tolerant learning machines were primarily designed to tackle class-conditional noise, which occurs at random and independently of the input instances. Less attention has been given to a more general type of label noise that is influenced by the input features. In this paper, we address the problem of learning a classifier in the presence of instance-dependent label noise by developing a novel label noise model that is expected to capture the variation of the label noise rate within a class. This is accomplished by adopting the probability density function of a mixture of Gaussians to approximate the label flipping probabilities. Experimental results demonstrate the effectiveness of the proposed method over existing approaches.
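The idea of letting a Gaussian mixture density drive the label flipping probability can be illustrated with a minimal sketch. This is not the authors' model, whose exact parameterisation is not given in the abstract. The function names (`gaussian_pdf`, `flip_probability`, `corrupt_labels`), the isotropic covariances, the `max_rate` cap and the rescaling by an upper bound on the density are all assumptions made here so that the result is a valid probability.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Density of an isotropic Gaussian N(mean, var * I) at each row of x."""
    d = x.shape[1]
    sq_dist = np.sum((x - mean) ** 2, axis=1)
    norm = (2.0 * np.pi * var) ** (-d / 2.0)
    return norm * np.exp(-0.5 * sq_dist / var)


def flip_probability(x, weights, means, variances, max_rate=0.5):
    """Hypothetical instance-dependent flip rate from a Gaussian mixture density.

    Assumption: the raw density is divided by an upper bound on its value and
    scaled by max_rate, so every returned rate lies in [0, max_rate].
    """
    d = x.shape[1]
    density = sum(w * gaussian_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
    # Each component peaks at its normalising constant, so this weighted sum
    # bounds the mixture density from above.
    upper = sum(w * (2.0 * np.pi * v) ** (-d / 2.0)
                for w, v in zip(weights, variances))
    return max_rate * density / upper


def corrupt_labels(y, flip_prob, rng):
    """Flip each binary label independently with its own probability."""
    flip = rng.random(y.shape[0]) < flip_prob
    return np.where(flip, 1 - y, y)


# Toy usage: noise is highest near the two (made-up) mixture centres.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 2))
y = (x[:, 0] > 0).astype(int)
rates = flip_probability(
    x,
    weights=[0.6, 0.4],
    means=[np.array([0.0, 0.0]), np.array([2.0, 2.0])],
    variances=[0.5, 1.0],
)
y_noisy = corrupt_labels(y, rates, rng)
```

In contrast to class-conditional noise, where every instance of a class shares one flip rate, `rates` here varies from instance to instance within each class.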

Notes

We used LIBLINEAR [31] in this study.

References

Beigman E, Klebanov BB (2009) Learning with annotation noise. In: ACL 2009, Proceedings of the 47th annual meeting of the association for computational linguistics, 2–7 August 2009, Singapore, pp 280–287
Kolcz A, Cormack GV (2009) Genre-based decomposition of email class noise. In: SIGKDD’09, pp 427–436
Johnson BA, Iizuka K (2016) Integrating OpenStreetMap crowdsourced data and Landsat time-series imagery for rapid land use/land cover (LULC) mapping: case study of the Laguna de Bay area of the Philippines. Appl Geogr 67:140–149
Snow R, O’Connor B, Jurafsky D, Ng AY (2008) Cheap and fast—but is it good? Evaluating non-expert annotations for natural language tasks. In: EMNLP, pp 254–263
Shen D, Ruvini J-D, Sarwar B (2012) Large-scale item categorization for e-commerce. In: Proceedings of the 21st ACM international conference on information and knowledge management, CIKM ’12, New York, NY, USA. ACM, pp 595–604
Xiao T, Xia T, Yang Y, Huang C, Wang X (2015) Learning from massive noisy labeled data for image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2691–2699
Frénay B, Verleysen M (2014) Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst 25(5):845–869
Menon AK, van Rooyen B, Natarajan N (2016) Learning from binary labels with instance-dependent corruption. arXiv preprint arXiv:1605.00751
Biggio B, Nelson B, Laskov P (2011) Support vector machines under adversarial label noise. In: ACML, volume 20 of JMLR proceedings, pp 97–112. JMLR.org
Chhikara RS, McKeon J (1984) Linear discriminant analysis with misallocation in training samples. J Am Stat Assoc 79(388):899–906
Lawrence ND, Schölkopf B (2001) Estimating a kernel Fisher discriminant in the presence of label noise. In: ICML’01. Morgan Kaufmann, pp 306–313
Li Y, Wessels LFA, de Ridder D, Reinders MJT (2007) Classification in the presence of class noise using a probabilistic kernel Fisher method. Pattern Recognit 40(12):3349–3357
Raykar VC, Yu S, Zhao LH, Valadez GH, Florin C, Bogoni L, Moy L (2010) Learning from crowds. J Mach Learn Res 11:1297–1322
Bootkrajang J, Kabán A (2012) Label-noise robust logistic regression and its applications. In: ECML-PKDD’12, pp 143–158
Bootkrajang J, Kabán A (2014) Learning kernel logistic regression in the presence of class label noise. Pattern Recognit 47(11):3641–3655
Lugosi G (1992) Learning with an unreliable teacher. Pattern Recognit 25:79–87
Long PM, Servedio RA (2010) Random classification noise defeats all convex potential boosters. Mach Learn 78(3):287–304
Natarajan N, Dhillon IS, Ravikumar PK, Tewari A (2013) Learning with noisy labels. In: NIPS’13, pp 1196–1204
Manwani N, Sastry PS (2013) Noise tolerance under risk minimization. IEEE Trans Cybernet 43(3):1146–1151
Ghosh A, Manwani N, Sastry PS (2015) Making risk minimization tolerant to label noise. Neurocomputing 160:93–107
Lachenbruch PA (1974) Discriminant analysis when the initial samples are misclassified II: non-random misclassification models. Technometrics 16(3):419–424
Bootkrajang J (2016) A generalised label noise model for classification in the presence of annotation errors. Neurocomputing 192:61–71
Du J, Cai Z (2015) Modelling class noise with symmetric and asymmetric distributions. In: AAAI, pp 2589–2595
Schmidt M (2005) minFunc: unconstrained differentiable multivariate optimization in matlab. http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html
Chen Y, Ye X (2011) Projection onto a simplex. arXiv preprint arXiv:1101.6081
West M, Blanchette C, Dressman H, Huang E, Ishida S, Spang R, Zuzan H, Olson JA Jr, Marks JR, Nevins JR (2001) Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci USA 98(20):11462–11467
Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96(12):6745–6750
Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531–537
Dua D, Karra Taniskidou E (2017) UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine, CA. http://archive.ics.uci.edu/ml
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874

Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive comments. This research was financially supported by the Thailand Research Fund (Grant No. MRG59080235). The Department of Computer Science, Faculty of Science, Chiang Mai University provided research and computing facilities.

Author information

Authors and Affiliations

Department of Computer Science, Chiang Mai University, Chiang Mai, 50200, Thailand
Jakramate Bootkrajang & Jeerayut Chaijaruwanich

Corresponding author

Correspondence to Jakramate Bootkrajang.

About this article

Cite this article

Bootkrajang, J., Chaijaruwanich, J. Towards instance-dependent label noise-tolerant classification: a probabilistic approach. Pattern Anal Applic 23, 95–111 (2020). https://doi.org/10.1007/s10044-018-0750-z

Received: 12 March 2018
Accepted: 23 August 2018
Published: 30 August 2018
Issue Date: February 2020
DOI: https://doi.org/10.1007/s10044-018-0750-z
