DOI: 10.1145/1982185.1982402

Incremental multi-target model trees for data streams

Published: 21 March 2011

ABSTRACT

As in batch learning, one may identify a class of streaming real-world problems which require the modeling of several targets simultaneously. Due to the dependencies among the targets, simultaneous modeling can be more successful and informative than creating independent models for each target. As a result, one may obtain a smaller model that simultaneously explains the relations between the input attributes and the targets. This problem has not been addressed previously in the streaming setting. We propose an algorithm for inducing multi-target model trees with low computational complexity, based on the principles of predictive clustering trees and probability bounds for supporting splitting decisions. Linear models are computed for each target separately, by incremental training of perceptrons in the leaves of the tree. Experiments are performed on synthetic and real-world datasets. Compared to a set of independent regression trees built separately for each target attribute, the multi-target regression tree algorithm produces equally accurate and smaller models for simultaneous prediction of all the target attributes. When the regression surface is smooth, the linear models computed in the leaves significantly improve the accuracy for all of the targets.
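The two ingredients named in the abstract can be sketched briefly: a Hoeffding-style probability bound, which shrinks with the number of observed examples and supports confident splitting decisions, and a set of perceptrons (one linear model per target) trained incrementally in each leaf. The sketch below assumes standard formulations of both; the class and parameter names are illustrative and not taken from the paper.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: with probability 1 - delta, the observed
    mean of n i.i.d. samples from a range of width `value_range` lies
    within epsilon of the true mean. A split is accepted once the
    observed advantage of the best attribute exceeds epsilon."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

class MultiTargetPerceptron:
    """One linear model per target, all sharing the same inputs and
    updated incrementally with the delta rule on each new example,
    as done in the leaves of the multi-target model tree."""

    def __init__(self, n_inputs, n_targets, learning_rate=0.05):
        self.lr = learning_rate
        # weights[t] holds the coefficients (plus a trailing bias) for target t
        self.weights = [[0.0] * (n_inputs + 1) for _ in range(n_targets)]

    def predict(self, x):
        xb = list(x) + [1.0]  # append bias input
        return [sum(w_i * x_i for w_i, x_i in zip(w, xb))
                for w in self.weights]

    def update(self, x, y):
        """Single incremental step on one example (x, y), where y holds
        one value per target."""
        xb = list(x) + [1.0]
        preds = self.predict(x)
        for t, (pred, target) in enumerate(zip(preds, y)):
            err = target - pred
            self.weights[t] = [w_i + self.lr * err * x_i
                               for w_i, x_i in zip(self.weights[t], xb)]
```

Note how the bound decreases as 1/sqrt(n): early in the stream epsilon is large and the learner defers splitting, while the leaf perceptrons keep refining their per-target linear fits on every arriving example.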


Published in

SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing
March 2011
1868 pages
ISBN: 9781450301138
DOI: 10.1145/1982185

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States



Qualifiers

• research-article

Acceptance Rates

Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
