Evolving Chinese Restaurant Processes for Modeling Evolutionary Traces in Temporal Data

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Due to the evolving nature of temporal data, clusters often exhibit complex dynamic patterns such as birth and death. In particular, a cluster can branch into multiple clusters simultaneously, so clusters can intuitively be seen as evolving along evolutionary trees over time. However, existing models are incapable of recovering this tree-like evolutionary trace in temporal data. To this end, we propose the Evolving Chinese Restaurant Process (ECRP), a temporal non-parametric clustering model. ECRP incorporates the dynamics of the number of clusters, their parameters, and their popularity, and it allows each cluster to have multiple branches over time. We design an online learning framework based on Gibbs sampling to infer the evolutionary traces of clusters over time. In experiments, we validate that ECRP can capture tree-like evolutionary traces of clusters from real-world data sets and achieve better clustering results than state-of-the-art methods.
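For readers unfamiliar with the building block that ECRP extends, the following is a minimal sketch of the standard (static) Chinese Restaurant Process seating rule. It is not the authors' ECRP: it has no temporal dynamics, no branching, and no Gibbs sampler. The function name `crp_assignments` and the concentration parameter name `alpha` are assumptions chosen for illustration only.

```python
import random
from typing import List, Optional


def crp_assignments(n: int, alpha: float,
                    rng: Optional[random.Random] = None) -> List[int]:
    """Sample cluster labels for n items from a standard CRP.

    Illustrative only: this is the classic static CRP, not ECRP.
    Item i joins existing cluster k with probability count_k / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha).
    """
    if n < 0 or alpha <= 0:
        raise ValueError("n must be >= 0 and alpha must be > 0")
    rng = rng or random.Random()
    counts: List[int] = []       # counts[k] = number of items in cluster k
    assignments: List[int] = []  # assignments[i] = cluster label of item i
    for i in range(n):
        # Total unnormalised weight: i items already seated, plus alpha.
        r = rng.random() * (i + alpha)
        cluster = len(counts)    # default: open a new cluster
        cumulative = 0.0
        for k, c in enumerate(counts):
            cumulative += c
            if r < cumulative:
                cluster = k      # join an existing cluster
                break
        if cluster == len(counts):
            counts.append(1)
        else:
            counts[cluster] += 1
        assignments.append(cluster)
    return assignments


if __name__ == "__main__":
    labels = crp_assignments(20, alpha=1.0, rng=random.Random(0))
    print(labels)
```

Because the number of clusters is not fixed in advance and grows with the data, this rule is the usual starting point for non-parametric clustering; larger `alpha` values make new clusters more likely.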

Author information

Corresponding author

Correspondence to Peng Wang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, P., Zhou, C., Zhang, P., Feng, W., Guo, L., Fang, B. (2015). Evolving Chinese Restaurant Processes for Modeling Evolutionary Traces in Temporal Data. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_7

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
