Abstract
We study tree induction over a stream of perennial objects. Perennial objects are dynamic in nature and cannot be forgotten. They arrive in a multi-table stream, e.g., a stream of Customers and a stream of Transactions. As Transactions arrive, the profiles of the perennial Customers grow and accumulate over time. We propose a tree induction algorithm that handles perennial objects and includes a method that identifies and adapts to concept drift in the stream. We also place a conventional classifier (kNN) at the leaves to further improve the classification accuracy of the algorithm. We evaluate our method on a synthetic dataset and on the PKDD Challenge 1999 dataset.
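To make the notion of a perennial object concrete, the following minimal Python sketch shows how a Customer profile might accumulate as Transactions arrive. It is an illustration only, not the algorithm proposed in the paper: the class `CustomerProfile`, the function `process_stream`, and the choice of aggregate features (transaction count and mean amount) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CustomerProfile:
    """Hypothetical perennial object: it is never discarded, only updated."""
    customer_id: str
    n_transactions: int = 0
    total_amount: float = 0.0

    def absorb(self, amount):
        # Each arriving transaction grows the profile incrementally.
        self.n_transactions += 1
        self.total_amount += amount

    def features(self):
        # Assumed summary features that a classifier could consume.
        if self.n_transactions == 0:
            return (0.0, 0.0)
        return (float(self.n_transactions),
                self.total_amount / self.n_transactions)


def process_stream(transactions):
    """Consume (customer_id, amount) pairs and return the updated profiles."""
    profiles = {}
    for customer_id, amount in transactions:
        if customer_id not in profiles:
            profiles[customer_id] = CustomerProfile(customer_id)
        profiles[customer_id].absorb(amount)
    return profiles


profiles = process_stream([("c1", 10.0), ("c2", 5.0), ("c1", 30.0)])
```

In this sketch the transactions themselves could be discarded after they are absorbed, while the customer profiles persist and keep changing, which is the property that separates perennial objects from ordinary stream items.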
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Siddiqui, Z.F., Spiliopoulou, M. (2010). Tree Induction over Perennial Objects. In: Gertz, M., Ludäscher, B. (eds) Scientific and Statistical Database Management. SSDBM 2010. Lecture Notes in Computer Science, vol 6187. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13818-8_43
DOI: https://doi.org/10.1007/978-3-642-13818-8_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13817-1
Online ISBN: 978-3-642-13818-8
eBook Packages: Computer Science, Computer Science (R0)