research-article

Incremental multi-target model trees for data streams

Authors:
Elena Ikonomovska, Jožef Stefan Institute, Ljubljana, Slovenia
João Gama, Porto, Portugal
Sašo Džeroski, Jožef Stefan Institute, Ljubljana, Slovenia

SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing, March 2011, Pages 988–993
https://doi.org/10.1145/1982185.1982402

Published: 21 March 2011

ABSTRACT

As in batch learning, one may identify a class of streaming real-world problems that require the modeling of several targets simultaneously. Due to the dependencies among the targets, simultaneous modeling can be more successful and informative than creating independent models for each target. As a result, one may obtain a smaller model able to simultaneously explain the relations between the input attributes and the targets. This problem has not been addressed previously in the streaming setting. We propose an algorithm for inducing multi-target model trees with low computational complexity, based on the principles of predictive clustering trees and probability bounds for supporting splitting decisions. Linear models are computed for each target separately, by incremental training of perceptrons in the leaves of the tree. Experiments are performed on synthetic and real-world datasets. The multi-target regression tree algorithm produces equally accurate and smaller models for simultaneous prediction of all the target attributes, as compared to a set of independent regression trees built separately for each target attribute. When the regression surface is smooth, the linear models computed in the leaves significantly improve the accuracy for all of the targets.
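
The two mechanisms named in the abstract, a probability bound that supports splitting decisions and perceptrons trained incrementally in the leaves, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the probability bound is the Hoeffding bound (which appears among the author tags), and the split rule `should_split`, the class `MultiTargetPerceptronLeaf`, the learning rate, and the value of `delta` are all hypothetical choices made for illustration only.

```python
import math
import random


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the mean of n independent
    observations of a variable with the given range is within epsilon of the
    true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_score, second_score, n, delta=1e-6):
    """Hypothetical split rule (an assumption, not taken from the paper):
    split when the ratio of the second-best to the best candidate score,
    a value in [0, 1], is below 1 - epsilon."""
    if best_score <= 0.0:
        return False
    ratio = second_score / best_score
    return ratio < 1.0 - hoeffding_bound(1.0, delta, n)


class MultiTargetPerceptronLeaf:
    """One linear model per target, each updated online with the delta rule."""

    def __init__(self, n_inputs, n_targets, learning_rate=0.01):
        self.learning_rate = learning_rate
        # One weight vector per target; the last weight acts as the bias.
        self.weights = [[0.0] * (n_inputs + 1) for _ in range(n_targets)]

    def predict(self, x):
        xb = list(x) + [1.0]
        return [sum(w_i * x_i for w_i, x_i in zip(w, xb)) for w in self.weights]

    def update(self, x, y):
        # Predictions are computed before any weight changes, so each
        # target's model is updated independently of the others.
        xb = list(x) + [1.0]
        for w, pred, target in zip(self.weights, self.predict(x), y):
            err = target - pred
            for i, x_i in enumerate(xb):
                w[i] += self.learning_rate * err * x_i


if __name__ == "__main__":
    rng = random.Random(0)
    leaf = MultiTargetPerceptronLeaf(n_inputs=2, n_targets=2, learning_rate=0.05)
    # Synthetic stream with two targets that are linear in the inputs.
    for _ in range(5000):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = [2.0 * x[0] - x[1] + 0.5, -x[0] + 3.0 * x[1]]
        leaf.update(x, y)
    print(leaf.predict([0.5, -0.5]))
    print(should_split(best_score=1.0, second_score=0.8, n=1000))
```

Because the bound shrinks as the number of observed examples grows, a rule of this kind postpones a split until enough of the stream has been seen to separate the two best candidates with confidence 1 - delta. Each example is processed once and then discarded, which is what keeps the per-example cost low.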


Index Terms

Computing methodologies → Machine learning → Learning settings

Published in
SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing
March 2011
1868 pages
ISBN: 9781450301138
DOI: 10.1145/1982185
Conference Chairs: William Chu (Tunghai University, TaiChung, Taiwan); W. Eric Wong (University of Texas at Dallas, Richardson, Texas)
Program Chairs: Mathew J. Palakal (Indiana University Purdue University, Indianapolis); Chih-Cheng Hung (Southern Polytechnic State University, Marietta)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Hoeffding bound
data streams
incremental learning
incremental model trees
multi-target regression trees
Acceptance Rates
Overall acceptance rate: 1,650 of 6,669 submissions, 25%