Tree Induction over Perennial Objects

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6187)

Abstract

We study tree induction over a stream of perennial objects. Perennial objects are dynamic in nature and cannot be forgotten. They come from a multi-table stream, e.g., streams of Customer and Transaction: as Transactions arrive, the profiles of the perennial Customers grow and accumulate over time. We propose a tree induction algorithm that can handle perennial objects. The algorithm includes a method that identifies and adapts to concept drift in the stream. We also incorporate a conventional classifier (kNN) at the leaves to further improve classification accuracy. We evaluate our method on a synthetic dataset and on the PKDD Challenge 1999 dataset.
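The notion of a growing profile can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the class `CustomerProfile`, the function `consume`, and the two aggregate features (transaction count and mean amount) are hypothetical choices made only to show how a perennial Customer accumulates information as Transactions arrive.

```python
from dataclasses import dataclass


@dataclass
class CustomerProfile:
    """Hypothetical profile of one perennial object (a customer)."""
    customer_id: str
    n_transactions: int = 0
    total_amount: float = 0.0

    def update(self, amount):
        # The profile is never discarded; each transaction only extends it.
        self.n_transactions += 1
        self.total_amount += amount

    def features(self):
        # Flat feature view (assumed for illustration): count and mean amount.
        if self.n_transactions == 0:
            return (0.0, 0.0)
        return (float(self.n_transactions),
                self.total_amount / self.n_transactions)


def consume(stream, profiles=None):
    """Fold a stream of (customer_id, amount) pairs into growing profiles."""
    profiles = {} if profiles is None else profiles
    for customer_id, amount in stream:
        if customer_id not in profiles:
            profiles[customer_id] = CustomerProfile(customer_id)
        profiles[customer_id].update(amount)
    return profiles


transactions = [("c1", 10.0), ("c2", 5.0), ("c1", 30.0)]
profiles = consume(transactions)
print(profiles["c1"].features())  # (2.0, 20.0)
```

Passing the existing `profiles` dictionary back into `consume` with a later batch keeps extending the same objects, which is the sense in which they are perennial; a learner built on such profiles would have to cope with feature values that change after an object has already been seen.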

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siddiqui, Z.F., Spiliopoulou, M. (2010). Tree Induction over Perennial Objects. In: Gertz, M., Ludäscher, B. (eds) Scientific and Statistical Database Management. SSDBM 2010. Lecture Notes in Computer Science, vol 6187. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13818-8_43

  • DOI: https://doi.org/10.1007/978-3-642-13818-8_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13817-1

  • Online ISBN: 978-3-642-13818-8

  • eBook Packages: Computer Science, Computer Science (R0)
