Additional information
Xingquan Zhu is an Assistant Professor in the Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL. He received his Ph.D. degree in computer science from Fudan University, Shanghai, P.R. China, in 2001. He spent four months with Microsoft Research Asia, Beijing, P.R. China, where he worked on content-based image retrieval with relevance feedback. From 2001 to 2002, he was a Postdoctoral Associate in the Department of Computer Science, Purdue University, West Lafayette, IN. From October 2002 to July 2006, he was a Research Assistant Professor in the Department of Computer Science, University of Vermont, Burlington, VT. His research interests include data mining, machine learning, data quality, multimedia computing, and information retrieval.
Taghi M. Khoshgoftaar is a Professor in the Department of Computer Science and Engineering, Florida Atlantic University and the Director of the Graduate Programs and Research. His research interests are in software engineering, software metrics, software reliability and quality engineering, computational intelligence applications, computer security, computer performance evaluation, data mining, machine learning, statistical modeling, and intelligent data analysis. He has published more than 300 refereed papers in these areas. He is a member of the IEEE, IEEE Computer Society, and IEEE Reliability Society. He was the General Chair of the IEEE International Conference on Tools with Artificial Intelligence 2005.
Ian Davidson is an Assistant Professor in Computer Science at the State University of New York (SUNY) at Albany. His research interests are in the area of designing efficient data mining, machine learning, and artificial intelligence algorithms and their applications to novel areas. He has published over 30 papers in various conferences and journals and holds a Ph.D. in computer science from Monash University, Australia.
Shichao Zhang is a Senior Research Fellow in the Faculty of Information Technology at the University of Technology, Sydney, and a Professor at Guangxi Normal University. He received his Ph.D. degree in computer science from Deakin University, Australia. His research interests include data analysis and smart pattern discovery. He has published over 30 international journal papers (including 6 in IEEE/ACM Transactions, 2 in Information Systems, 1 in Data Mining and Knowledge Discovery, and 6 in IEEE magazines) and over 30 international conference papers (including 2 ICML papers, 1 AAAI paper, 1 KDD paper, and 3 FUZZ-IEEE/AAMAS papers). He has won 4 China NSF/863 grants, 3 Australian large ARC grants, and 2 Australian small ARC grants. He is a Senior Member of the IEEE and a Member of the ACM, and serves as an Associate Editor for Knowledge and Information Systems and IEEE Intelligent Informatics Bulletin.
About this article
Cite this article
Zhu, X., Khoshgoftaar, T.M., Davidson, I. et al. Editorial: Special issue on mining low-quality data. Knowl Inf Syst 11, 131–136 (2007). https://doi.org/10.1007/s10115-006-0058-y