Abstract
In most data mining applications where induction is used as the primary tool for knowledge extraction, it is difficult to precisely identify a complete set of relevant attributes. The real-world database from which knowledge is to be extracted usually contains a combination of relevant, noisy and irrelevant attributes. Therefore, pre-processing the database to select relevant attributes becomes a very important task in knowledge discovery and data mining. This paper starts with two existing induction systems, C4.5 and HCV, and uses one of them to select relevant attributes for the other. Experimental results on 12 standard data sets show that using HCV induction for C4.5 attribute selection is generally useful.
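The idea of using one induction system as a pre-processing step for another can be illustrated with a small sketch. It is not the paper's implementation: neither HCV nor C4.5 is reproduced here. A simple ID3-style tree learner stands in for the first (selecting) system, and the function names `attributes_used` and `project`, along with the toy data, are assumptions made purely for illustration. The attributes that the first learner actually tests are kept, and the data set is projected onto them before being handed to a second learner.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the examples on `attr`."""
    parts = {}
    for row, label in zip(rows, labels):
        parts.setdefault(row[attr], []).append(label)
    remainder = sum(len(p) / len(labels) * entropy(p) for p in parts.values())
    return entropy(labels) - remainder


def attributes_used(rows, labels, attrs):
    """Stand-in first learner (hypothetical): grow an ID3-style tree and
    return the set of attributes it tests. This is NOT HCV or C4.5."""
    if len(set(labels)) <= 1 or not attrs:
        return set()
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    if info_gain(rows, labels, best) <= 0:
        return set()  # no attribute separates the classes any further
    used = {best}
    rest = [a for a in attrs if a != best]
    for value in {row[best] for row in rows}:
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        used |= attributes_used([r for r, _ in pairs], [l for _, l in pairs], rest)
    return used


def project(rows, keep):
    """Keep only the selected attributes in every example."""
    return [{a: row[a] for a in keep} for row in rows]


# Toy data: the class depends on `outlook` and `windy`; `noise` is irrelevant.
rows = [
    {"outlook": o, "windy": w, "noise": n}
    for o in ("sunny", "rain")
    for w in ("no", "yes")
    for n in ("a", "b")
]
labels = ["yes" if r["outlook"] == "sunny" and r["windy"] == "no" else "no" for r in rows]

selected = attributes_used(rows, labels, ["outlook", "windy", "noise"])
reduced = project(rows, sorted(selected))
print(sorted(selected))  # ['outlook', 'windy']
```

In this sketch the reduced data set `reduced` would then be passed to the second induction system in place of the original data. Swapping which system selects and which system learns gives the other combination mentioned in the abstract.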
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, X. (1999). Induction as Pre-processing. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_16
DOI: https://doi.org/10.1007/3-540-48912-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65866-5
Online ISBN: 978-3-540-48912-2
eBook Packages: Springer Book Archive