ABSTRACT
Decision rules are among the most interpretable and flexible models for predictive data mining tasks. Until now, few works have presented online, any-time, one-pass algorithms for learning decision rules in the stream mining scenario. A fairly recent algorithm, Very Fast Decision Rules (VFDR), learns a set of rules in which each rule discriminates one class from all the others. In this work we extend the VFDR algorithm by decomposing a multi-class problem into a set of two-class problems and inducing a set of discriminative rules for each binary problem. The proposed algorithm maintains all properties required when learning from stationary data streams: it is an online, any-time classifier that processes each example once. Moreover, it is able to learn both ordered and unordered rule sets. The new approach is evaluated on various real and artificial datasets. The new algorithm improves on the performance of the previous version and is competitive with the state-of-the-art decision tree learning method for data streams.
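The decomposition step can be illustrated with a small sketch. The abstract does not say which decomposition scheme is used, so the code below assumes a pairwise (one class against one other class) scheme purely for illustration. The function names `pairwise_problems`, `route_example`, and `count_per_problem` are hypothetical and do not come from the paper. The sketch covers only how a single pass over a stream could route each example to the binary problems that involve its class; it does not perform any rule induction.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_problems(classes):
    """Return every unordered pair of classes, one pair per binary problem.

    Assumption: pairwise decomposition; the paper may use a different scheme.
    """
    return list(combinations(sorted(classes), 2))


def route_example(label, problems):
    """Return the binary problems that an example with this label belongs to."""
    return [pair for pair in problems if label in pair]


def count_per_problem(stream, classes):
    """Make one pass over the stream and count the examples per binary problem.

    Each example is read once and then discarded, matching the one-pass
    constraint described in the abstract. A real learner would update the
    rule set of each binary problem instead of a counter.
    """
    problems = pairwise_problems(classes)
    counts = defaultdict(int)
    for _features, label in stream:
        for pair in route_example(label, problems):
            counts[pair] += 1
    return dict(counts)
```

With k classes this scheme yields k(k-1)/2 binary problems, and each example is relevant to k-1 of them, so the per-example cost grows linearly with the number of classes.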
Index Terms
- Very Fast Decision Rules for multi-class problems