Abstract
The covering algorithm is ubiquitous in the induction of classification rules. This approach to machine learning uses heuristic search to find a minimum number of rules that adequately explain the data. However, recent research has provided evidence that learning redundant classifiers can increase predictive accuracy. Learning all possible classifiers is a plausible ultimate form of this notion of redundant classifiers. This paper presents an algorithm that in effect learns all classifiers. A preliminary investigation (Webb, 1996b) suggested that a heuristic covering algorithm in general learns classification rules with higher predictive accuracy than this new approach. In this paper we present an extensive empirical comparison between the learning-all-rules algorithm and three varied, established approaches to inductive learning: a covering algorithm, an instance-based learner, and a decision-tree learner. The empirical evaluation provides strong evidence in support of learning all rules as a plausible approach to inductive learning.
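The covering approach contrasted above can be sketched as follows. This is a minimal illustrative sequential-covering loop: the toy data representation (dictionaries of attribute values) and the deliberately naive single-test rule heuristic are assumptions for exposition only, not the algorithms compared in the paper.

```python
# A minimal sketch of sequential covering for rule induction.
# Illustrative only: the rule-scoring heuristic and the data
# representation are assumptions, not the paper's algorithms.

def covers(rule, example):
    """A rule is a dict mapping attribute -> required value."""
    return all(example.get(attr) == val for attr, val in rule.items())

def best_rule(examples, target_class, attributes):
    """Greedily pick the single attribute-value test that covers the
    most examples of target_class while covering none of the others.
    Returns None if no such pure single-test rule exists."""
    best, best_count = None, 0
    for attr in attributes:
        for val in {e[attr] for e in examples}:
            rule = {attr: val}
            pos = sum(1 for e in examples
                      if covers(rule, e) and e["class"] == target_class)
            neg = sum(1 for e in examples
                      if covers(rule, e) and e["class"] != target_class)
            if neg == 0 and pos > best_count:
                best, best_count = rule, pos
    return best

def sequential_cover(examples, target_class, attributes):
    """Learn rules one at a time, removing covered examples after
    each round, until the target class is fully covered."""
    rules, remaining = [], list(examples)
    while any(e["class"] == target_class for e in remaining):
        rule = best_rule(remaining, target_class, attributes)
        if rule is None:
            break
        rules.append(rule)
        remaining = [e for e in remaining if not covers(rule, e)]
    return rules
```

Because each iteration discards the examples the new rule covers, the loop drives toward a small rule set; a learning-all-rules approach instead retains every acceptable rule rather than stopping at the first covering set.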
References
Aha, D.W. (1990). A Study of Instance-Based Algorithms for Supervised Learning Tasks. PhD Thesis, Department of Information and Computer Science, University of California, Irvine, Technical Report 90-42.
Aha, D. W. (1997). Editorial on Lazy Learning. Artificial Intelligence Review, 11: 7–10.
Aha, D. W., Kibler, D., and Albert, M. (1991). Instance-based learning algorithms. Machine Learning, 6: 37–66.
Ali, K., Brunk, C., and Pazzani, M. (1994). On learning multiple descriptions of a concept. In Proceedings of Tools with Artificial Intelligence. New Orleans, LA.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24: 123–140.
Clark, P. and Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3: 261–283.
Clark, P. and Boswell, R. (1991). Rule induction with CN2: Some recent improvements. In Proceedings of the Fifth European Working Session on Learning, pp. 151–163.
Dietterich, T. G. and Bakiri, G. (1994). Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2: 263–286.
Domingos, P. (1995). Rule induction and instance-based learning: A unified approach. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, Montreal, Morgan Kaufmann, pp. 1226–1232.
Fayyad, U. M. and Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Morgan Kaufmann, pp. 1022–1027.
Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R. (1996). Advances in Knowledge Discovery and Data Mining. MIT Press, Menlo Park, CA.
Fix, E. and Hodges, J. L. (1952). Discriminatory analysis — Nonparametric discrimination: Consistency properties. From Project 21-49-004, Report Number 4, USAF School of Aviation Medicine, Randolph Field, Texas, pp. 261–279.
Friedman, J. H., Kohavi, R., and Yun, Y. (1996). Lazy decision trees. In Proceedings of the Thirteenth National Conference on Artificial Intelligence. AAAI Press, Portland, OR, pp. 717–724.
Kwok, S. W. and Carter, C. (1990). Multiple decision trees. In Shachter, R. D. and Levitt, T. S. and Kanal, L. N. and Lemmer, J. F. (Eds.) Uncertainty in Artificial Intelligence 4. North Holland, Amsterdam, pp. 327–335.
Michalski, R. S. (1984). A theory and methodology of inductive learning. In Michalski, R. S., Carbonell, J. G., and Mitchell, T. M. (Eds.) Machine Learning: An Artificial Intelligence Approach. Springer-Verlag, Berlin, pp. 83–129.
Merz, C. J. and Murphy, P. M. (1997). UCI Repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.
Muggleton, S. and Feng, C. (1990). Efficient induction of logic programs. In Proceedings of the First Conference on Algorithmic Learning Theory, Tokyo.
Nock, R. and Gascuel, O. (1995). On learning decision committees. In Proceedings of the Twelfth International Conference on Machine Learning, Morgan Kaufmann, Tahoe City, CA, pp. 413–420.
Oliver, J. J. and Hand, D. J. (1995). On pruning and averaging decision trees. In Proceedings of the Twelfth International Conference on Machine Learning, Morgan Kaufmann, Tahoe City, CA, pp. 430–437.
Quinlan, J. R. (1990). Learning logical definitions from relations. Machine Learning, 5: 239–266.
Quinlan, J.R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. World Scientific, Singapore.
Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5: 197–227.
Ting, K. M. (1995). Common Issues in Instance-based and Naive Bayesian Classifiers. PhD thesis, Basser Department of Computer Science, University of Sydney.
Webb, G. I. (1993). Systematic search for categorical attribute-value data-driven machine learning. In AI'93 — Proceedings of the Sixth Australian Joint Conference on Artificial Intelligence, World Scientific, Melbourne, pp. 342–347.
Webb, G.I. (1995). An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, 3: 431–465.
Webb, G. I. (1996a). Further experimental evidence against the utility of Occam's razor. Journal of Artificial Intelligence Research, 4: 397–417.
Webb, G. I. (1996b). A heuristic covering algorithm has higher predictive accuracy than learning all rules. In Proceedings of Information, Statistics and Induction in Science, Melbourne, pp. 20–30.
Wogulis, J. and Langley, P. (1989). Improving efficiency by learning intermediate concepts. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, Morgan Kaufmann, San Mateo, CA, pp. 657–662.
© 1998 Springer-Verlag Berlin Heidelberg
Viswanathan, M., Webb, G.I. (1998). Classification learning using all rules. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026685
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7