Abstract
This paper deals with learning decision lists from examples. In real-world problems, data are often noisy and imperfectly described. It is commonly acknowledged that, in such cases, consistent but inevitably complex classification procedures cause overfitting: they are perfect on the learning set but perform worse on new examples. One therefore searches for less complex procedures that are almost consistent or, in other words, for a good compromise between complexity and goodness-of-fit. Such a requirement, however, generally leads to NP-complete optimization problems. CN2 can be seen as a greedy approach to this problem. In this paper, we propose to search the solution space more extensively, using a stochastic procedure that combines simulated annealing (SA) and simple tabu search (TS) in two distinct phases: in the first phase, SA diversifies the search; in the second, TS intensifies it. We compare CART, CN2, and our method on natural and artificial domains.
References
E.H.L. Aarts and P.J.M. van Laarhoven, Statistical cooling: A general approach to combinatorial optimization problems, Philips J. Res. 40 (1985) 193.
P. Bonelli, A. Parodi, S. Sen and S.W. Wilson, NEWBOOLE: A fast GBML system, Proc. Int. Conf. on Machine Learning, Austin, TX (1990) p. 153.
E. Bonomi and J.L. Lutton, The N-city travelling salesman problem: Statistical mechanics and the Metropolis algorithm, SIAM Rev. 26 (1984) 551.
A. Blumer, A. Ehrenfeucht, D. Haussler and M.K. Warmuth, Occam's razor, Information Processing Lett. 24 (1987) 377.
L. Breiman, J.H. Friedman, R.A. Olshen and C.J. Stone, Classification and Regression Trees (Wadsworth, 1984).
P. Clark and T. Niblett, The CN2 induction algorithm, Machine Learning 3 (1989) 261.
P.R. Cohen and E.A. Feigenbaum (eds.), The Artificial Intelligence Handbook, Vol. 3 (Pitman, 1982) chap. 14.
K.A. De Jong and W.M. Spears, Learning concept classification rules using genetic algorithms, Proc. Int. Joint Conf. on Artificial Intelligence, Sydney (1991) p. 651.
O. Gascuel and G. Caraux, Distribution-free performance bounds with the resubstitution error estimate, Pattern Recognition Lett. 13 (1992) 757.
F. Glover, Tabu search — Part I, ORSA J. Comput. 1 (1989) 190.
F. Glover, Tabu search — Part II, ORSA J. Comput. 2 (1990) 4.
J.J. Grefenstette, C.L. Ramsey and A.C. Schultz, Learning sequential decision rules using simulation models and competition, Machine Learning 5 (1990) 355.
A. Hertz and D. de Werra, The tabu search metaheuristic: How we used it, Ann. Math. and AI 1 (1990) 111.
M.D. Huang, F. Romeo and A. Sangiovanni-Vincentelli, An efficient general cooling schedule for simulated annealing, Proc. IEEE Int. Conf. on CAD, Santa Clara (1986) p. 381.
L. Hyafil and R. Rivest, Constructing optimal binary decision trees is NP-complete, Information Processing Lett. 5 (1976) 15.
J.D. Kelly, Jr. and L. Davis, A hybrid genetic algorithm for classification, Proc. Int. Joint Conf. on Artificial Intelligence, Sydney (1991) p. 645.
S. Kirkpatrick, C.D. Gelatt, Jr. and M.P. Vecchi, Optimization by simulated annealing, Science 220 (1983) 671.
E.M. Kleinberg, Stochastic discrimination, Ann. Math. and AI 1 (1990) 207.
P.J.M. van Laarhoven and E.H.L. Aarts, Simulated Annealing: Theory and Applications (Reidel, 1987).
M. Lundy, Applications of the annealing algorithm to combinatorial problems in statistics, Biometrika 72 (1985) 191.
M. Lundy and A. Mees, Convergence of an annealing algorithm, Math. Program. 34 (1986) 111.
N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (1953) 1087.
A. Parodi and P. Bonelli, An efficient classifier system and its comparison with two representative machine-learning methods in three medical domains, in: Actes Journées Françaises d'Apprentissage, Sète (1991).
J. Rissanen, Modeling by shortest data description, Automatica 14 (1978) 465.
R. Rivest, Learning decision lists, Machine Learning 2 (1987) 229.
L.G. Valiant, A theory of the learnable, Commun. ACM 27 (1984) 1134.
V.N. Vapnik,Estimation of Dependencies Based on Empirical Data (Springer, 1974) chap. 6.
S.W. Wilson, Classifier systems and the animat problem, Machine Learning 2 (1987) 199.
K. Yamanishi, A learning criterion for stochastic rules, Machine Learning 9 (1992) 165.
Additional information
On leave from Departamento de Ciências de Computação, UECE, 60715 Fortaleza CE, Brazil, supported in part by CAPES under grant number 3563/89.
Cite this article
de Carvalho Gomes, F., Gascuel, O. SDL, a stochastic algorithm for learning decision lists with limited complexity. Ann Math Artif Intell 10, 281–302 (1994). https://doi.org/10.1007/BF01530954