Induction as Pre-processing

  • Conference paper
Methodologies for Knowledge Discovery and Data Mining (PAKDD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1574)

Abstract

In most data mining applications where induction is used as the primary tool for knowledge extraction, it is difficult to precisely identify a complete set of relevant attributes. The real-world database from which knowledge is to be extracted usually contains a combination of relevant, noisy and irrelevant attributes. Therefore, pre-processing the database to select relevant attributes becomes a very important task in knowledge discovery and data mining. This paper starts with two existing induction systems, C4.5 and HCV, and uses one of them to select relevant attributes for the other. Experimental results on 12 standard data sets show that using HCV induction for C4.5 attribute selection is generally useful.
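The idea of letting one induction system select attributes for another can be sketched in a few lines. The abstract does not state the selection criterion, so this sketch assumes one plausible reading: keep only the attributes that appear in the rules induced by the first system, then pass the reduced data to the second. The rule representation, the function names `attributes_used` and `project`, and the toy data are all hypothetical. Neither HCV nor C4.5 is implemented here.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical rule representation (an assumption, not HCV's actual output format):
# a rule is a conjunction of (attribute, allowed-values) selectors plus a class label.
Selector = Tuple[str, FrozenSet[str]]
Rule = Tuple[List[Selector], str]
Example = Dict[str, str]


def attributes_used(rules: List[Rule]) -> List[str]:
    """Return the attributes mentioned in at least one rule, in first-seen order."""
    seen: List[str] = []
    for selectors, _label in rules:
        for attribute, _values in selectors:
            if attribute not in seen:
                seen.append(attribute)
    return seen


def project(examples: List[Example], keep: List[str], class_attr: str) -> List[Example]:
    """Drop every attribute not in `keep`, always retaining the class attribute."""
    columns = keep + [class_attr]
    return [{name: row[name] for name in columns} for row in examples]


# Toy data: the 'id' attribute carries no information about the class.
data = [
    {"outlook": "sunny", "windy": "no", "id": "a1", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "id": "a2", "play": "no"},
]

# Rules as a first learner might output them (made up for illustration).
rules = [
    ([("outlook", frozenset({"sunny"}))], "yes"),
    ([("outlook", frozenset({"rain"})), ("windy", frozenset({"yes"}))], "no"),
]

selected = attributes_used(rules)
reduced = project(data, selected, class_attr="play")
print(selected)
print(reduced[0])
```

In this sketch the second learner would be trained on `reduced` instead of `data`, so it never sees the attributes the first learner ignored.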

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, X. (1999). Induction as Pre-processing. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_16

  • DOI: https://doi.org/10.1007/3-540-48912-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65866-5

  • Online ISBN: 978-3-540-48912-2

  • eBook Packages: Springer Book Archive
