ABSTRACT
We describe a lightweight learning method that induces an ensemble of decision-rule solutions for regression problems. Instead of directly predicting the continuous output variable, the method discretizes it by k-means clustering and solves the resulting classification problem. Predictions on new examples are made by averaging the mean values of the classes whose vote counts are close to that of the most likely class. We provide experimental evidence that this indirect approach often yields strong results, generally outperforming direct methods such as regression trees and rivaling bagged regression trees.
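To make the three steps concrete, the sketch below implements the regression-via-classification scheme in the spirit of the abstract, using scikit-learn. It is a minimal sketch, not the paper's method: bagged decision trees stand in for the lightweight rule ensemble, and the vote-closeness threshold `tau`, the class count `n_classes`, and the class name `RegressionViaClassification` are illustrative assumptions rather than details from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import BaggingClassifier


class RegressionViaClassification:
    """Discretize y with k-means, classify, then average near-winning class means."""

    def __init__(self, n_classes=8, n_estimators=50, tau=0.9, random_state=0):
        self.n_classes = n_classes  # k in the k-means discretization of y
        self.tau = tau              # vote-closeness threshold (illustrative, not from the paper)
        self.kmeans = KMeans(n_clusters=n_classes, n_init=10, random_state=random_state)
        # Bagged decision trees stand in for the paper's decision-rule ensemble.
        self.clf = BaggingClassifier(n_estimators=n_estimators, random_state=random_state)

    def fit(self, X, y):
        y = np.asarray(y, dtype=float)
        # Step 1: discretize the continuous target into k classes by 1-D k-means.
        labels = self.kmeans.fit_predict(y.reshape(-1, 1))
        # Mean target value of each class, used later to map classes back to numbers.
        self.class_means_ = np.array([y[labels == c].mean() for c in range(self.n_classes)])
        # Step 2: solve the resulting classification problem with a voting ensemble.
        self.clf.fit(X, labels)
        return self

    def predict(self, X):
        # predict_proba of a bagging classifier is the vote proportion of its members.
        votes = self.clf.predict_proba(X)
        preds = np.empty(votes.shape[0])
        for i, v in enumerate(votes):
            # Step 3: average the mean values of all classes whose vote counts
            # are close in number to that of the most likely class.
            close = v >= self.tau * v.max()
            preds[i] = self.class_means_[self.clf.classes_[close]].mean()
        return preds


# Illustrative usage: model = RegressionViaClassification().fit(X_train, y_train)
#                     y_hat = model.predict(X_test)
```

One way to read the final averaging step: pooling the means of all classes that nearly tied with the winner smooths out the quantization error introduced by discretizing the target, which is presumably part of why the indirect approach can rival direct regression methods.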