Evaluating Noise Correction

Teng, Choh Man

doi:10.1007/3-540-44533-1_22


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1886)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence


Abstract

Data quality is a prime concern for many tasks in learning and induction. In a previous paper we proposed a noise correction mechanism called polishing, which exploits the interdependence between the different components of a data set to identify noisy values and their appropriate replacements. Designing a sound and informative metric for evaluating the effectiveness of a noise correction scheme turned out to be non-trivial. We motivate here a number of classifier-dependent measures and proximity measures, each focusing on a different aspect of the corrected data and the associated classifier. We report on some extended experimentation with polishing, as measured by the proposed metrics. The results suggest that polishing is able to repair a corrupted data set to some extent, and that the metrics we devised appear to be reasonable.
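The abstract does not define the proposed measures. As a rough illustration only, one possible reading of a "proximity measure" is the fraction of cells in a candidate data set that agree with the original clean data set. The sketch below makes that assumption; the function name `proximity`, the cell-wise agreement formula, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence

Row = Sequence[object]


def proximity(clean: Sequence[Row], candidate: Sequence[Row]) -> float:
    """Fraction of cells in `candidate` that equal the matching cell in `clean`.

    Assumed illustrative metric, not the paper's definition.
    """
    if len(clean) != len(candidate):
        raise ValueError("data sets must have the same number of rows")
    total = 0
    matches = 0
    for clean_row, cand_row in zip(clean, candidate):
        if len(clean_row) != len(cand_row):
            raise ValueError("rows must have the same number of fields")
        for a, b in zip(clean_row, cand_row):
            total += 1
            if a == b:
                matches += 1
    if total == 0:
        raise ValueError("data sets must not be empty")
    return matches / total


# Hypothetical toy data: the last field of each row is the class label.
clean = [("sunny", "hot", "no"), ("rainy", "mild", "yes"), ("sunny", "mild", "yes")]
noisy = [("sunny", "mild", "no"), ("rainy", "mild", "no"), ("rainy", "mild", "yes")]
corrected = [("sunny", "hot", "no"), ("rainy", "mild", "no"), ("sunny", "mild", "yes")]

print(proximity(clean, noisy))      # agreement before correction
print(proximity(clean, corrected))  # agreement after correction
```

Under this reading, a correction scheme helps when the corrected data scores higher against the clean data than the noisy data does. A classifier-dependent measure would instead compare classifiers built from each version of the data, which this sketch does not attempt.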


References

Carla E. Brodley and Mark A. Friedl. Identifying and eliminating mislabeled training instances. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996.
P. Clark and T. Niblett. The CN2 induction algorithm. Machine Learning, 3(4):261–283, 1989.
George Drastal. Informed pruning in constructive induction. In Proceedings of the Eighth International Workshop on Machine Learning, pages 132–136, 1991.
Dragan Gamberger, Nada Lavrač, and Sašo Džeroski. Noise elimination in inductive concept learning: A case study in medical diagnosis. In Proceedings of the Seventh International Workshop on Algorithmic Learning Theory, pages 199–212, 1996.
George H. John. Robust decision trees: Removing outliers from databases. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining, pages 174–179, 1995.
P. M. Murphy and D. W. Aha. UCI repository of machine learning databases. University of California, Irvine, Department of Information and Computer Science, 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html.
J. Ross Quinlan. Simplifying decision trees. International Journal of Man-Machine Studies, 27(3):221–234, 1987.
J. Ross Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
Peter J. Rousseeuw and Annick M. Leroy. Robust Regression and Outlier Detection. John Wiley & Sons, 1987.
Alen D. Shapiro. Structured Induction in Expert Systems. Addison-Wesley, 1987.
Choh Man Teng. Correcting noisy data. In Proceedings of the Sixteenth International Conference on Machine Learning, pages 239–248, 1999.


Author information

Authors and Affiliations

School of Computer Science and Engineering, University of New South Wales, Sydney, NSW, 2052, Australia
Choh Man Teng


Editor information

Editors and Affiliations

Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047, Japan
Riichiro Mizoguchi
Computer Sciences Laboratory, Research School of Information Sciences and Engineering, Australian National University, Canberra, ACT, 0200, Australia
John Slaney


About this paper

Cite this paper

Teng, C.M. (2000). Evaluating Noise Correction. In: Mizoguchi, R., Slaney, J. (eds) PRICAI 2000 Topics in Artificial Intelligence. PRICAI 2000. Lecture Notes in Computer Science, vol 1886. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44533-1_22


DOI: https://doi.org/10.1007/3-540-44533-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67925-7
Online ISBN: 978-3-540-44533-3
eBook Packages: Springer Book Archive
