Improved rule discovery performance on uncertainty

Tolun, Mehmet R.; Sever, Hayri; Uludağ, Mahmut

doi:10.1007/3-540-64383-4_26


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1394)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

In this paper we describe an improved version of a novel rule induction algorithm, namely ILA. We first outline the basic algorithm and then present how it is enhanced with a new evaluation metric that handles uncertainty in a given data set. In addition to inducing rules faster than the original, the improved algorithm contributes a new metric that lets users define their preferences through a penalty factor. We use this penalty factor to tackle the over-fitting bias inherent in a great many inductive algorithms. We compare the improved algorithm, ILA-2, to a variety of induction algorithms, including ID3, OC1, C4.5, CN2, and ILA. According to our preliminary experimental work, ILA-2 appears to be comparable to well-known algorithms such as CN2 and C4.5 in terms of accuracy and size.
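
The abstract does not state the evaluation metric itself, so the following is only a minimal sketch of how a penalty factor can trade rule generality against misclassification. It assumes a score of the form "correctly covered examples minus penalty times incorrectly covered examples"; the function `rule_score`, the parameter `penalty`, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# An example is a pair: (attribute values, class label).
Example = Tuple[Dict[str, str], str]


def rule_score(condition: Dict[str, str], target: str,
               examples: List[Example], penalty: float) -> float:
    """Assumed score: correct coverage minus penalty * incorrect coverage.

    This form is an illustration only; the paper's metric may differ.
    """
    # Labels of all examples whose attributes satisfy the rule condition.
    covered = [label for attrs, label in examples
               if all(attrs.get(k) == v for k, v in condition.items())]
    correct = sum(1 for label in covered if label == target)
    wrong = len(covered) - correct
    return correct - penalty * wrong


# Hypothetical toy data with one conflicting (noisy) example.
examples: List[Example] = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]

general = {"outlook": "sunny"}                  # covers 4 examples, 1 wrong
specific = {"outlook": "sunny", "windy": "no"}  # covers 2 examples, 0 wrong

for pf in (0.5, 2.0):
    print(pf,
          rule_score(general, "play", examples, pf),
          rule_score(specific, "play", examples, pf))
```

Under this assumed score, a small penalty ranks the general rule above the specific one despite its single misclassification, while a large penalty reverses the ranking. That is one way a user-chosen penalty could express a preference between compact rules that tolerate noise and rules that fit the training data exactly.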


References

Clark, P., & Niblett, T., (1989). “The CN2 Induction Algorithm”, Machine Learning, 3, pp. 261–283.
Deogun, J. S., Raghavan, V. V., Sarkar, A., & Sever, H., (1997). “Data Mining: Trends in Research and Development”, Rough Sets and Data Mining: Analysis for Imprecise Data (T. Y. Lin and N. Cercone, Eds.), Kluwer Academic Publishers.
Fayyad, U.M., (1996). “Data Mining and Knowledge Discovery: Making Sense Out of Data”, IEEE Expert, October, pp. 20–25.
Kohavi, R., Sommerfield, D., & Dougherty, J., (1996). “Data Mining Using MLC++: A Machine Learning Library in C++”, Tools with AI, pp. 234–245.
Langley, P., (1996). Elements of Machine Learning. San Francisco: Morgan Kaufmann Publishers.
Matheus, C.J., Chan, P. K., & Piatetsky-Shapiro, G., (1993). “Systems for Knowledge Discovery in Databases”, IEEE Trans. on Knowledge and Data Engineering, 5(6), pp. 903–912.
Merz, C. J., & Murphy, P. M., (1997). UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, Irvine, CA: University of California, Department of Information and Computer Science.
Murthy, S.K., Kasif, S., & Salzberg, S., (1994). “A System for Induction of Oblique Decision Trees”, Journal of Artificial Intelligence Research, 2, pp. 1–32.
Quinlan, J.R., (1986). “Induction of Decision Trees”, Machine Learning, 1, pp. 81–106.
Quinlan, J.R., (1993). C4.5: Programs for Machine Learning. Philadelphia, PA: Morgan Kaufmann.
Quinlan, J.R., (1994). “The Minimum Description Length Principle and Categorical Theories”, Proceedings of the 11th International Conference on Machine Learning, pp. 233–241.
Salzberg, S., (1995). “On Comparing Classifiers: A Critique of Current Research and Methods”, Technical Report JHU-5/06, Department of Computer Science, Johns Hopkins University, May 1995.
Simoudis, E., (1996). “Reality Check for Data Mining”, IEEE Expert, October, pp. 26–33.
Tolun, M.R., & Abu-Soud, S.M., (1998). “ILA: An Inductive Learning Algorithm for Rule Extraction”, to appear in Expert Systems with Applications.
Zadeh, L. A., (1994). “Soft Computing and Fuzzy Logic”, IEEE Software, pp. 48–56.


Author information

Authors and Affiliations

Department of Computer Engineering, Middle East Technical University, 06531, Inönü Bulvari - Ankara, Turkey
Mehmet R. Tolun
Department of Computer Sc. & Eng., Hacettepe University, 06532, Beytepe - Ankara, Turkey
Hayri Sever
Artificial Intelligence Group, TÜBİTAK Marmara Research Center, PK 21, 41470, Gebze Kocaeli, Turkey
Mahmut Uludağ


Editor information

Editors and Affiliations

School of Computer Science and Software Engineering, Monash University, 900 Dandenong Road, Caulfield East, Victoria, 3145, Australia
Xindong Wu
Department of Computer Science, The University of Melbourne, Parkville, Victoria, 3052, Australia
Ramamohanarao Kotagiri
School of Computer Science and Engineering, Monash University, Clayton, Victoria, 3168, Australia
Kevin B. Korb


About this paper

Cite this paper

Tolun, M.R., Sever, H., Uludağ, M. (1998). Improved rule discovery performance on uncertainty. In: Wu, X., Kotagiri, R., Korb, K.B. (eds) Research and Development in Knowledge Discovery and Data Mining. PAKDD 1998. Lecture Notes in Computer Science, vol 1394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64383-4_26


DOI: https://doi.org/10.1007/3-540-64383-4_26
Published: 25 August 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64383-8
Online ISBN: 978-3-540-69768-8
eBook Packages: Springer Book Archive
