Classification and clustering with continuous time Bayesian network models

Codecasa, Daniele; Stella, Fabio

doi:10.1007/s10844-014-0345-0

Published: 22 November 2014

Volume 45, pages 187–220, (2015)

Journal of Intelligent Information Systems

Daniele Codecasa¹ &
Fabio Stella¹

Abstract

Classification and clustering of streaming data are relevant in finance, computer science, and engineering, and they are becoming increasingly important in medicine and biology. Streaming data are analyzed with algorithms and models capable of representing dynamics, sequences, and time. Dynamic Bayesian networks and hidden Markov models are commonly used to analyze streaming data. However, they are concerned with evenly spaced time series data and thus suffer from several limitations. Indeed, it is not clear how timestamps should be discretized, even though some approaches to mitigate this problem have recently been made available. In this paper we describe the class of continuous time Bayesian network classifiers and develop algorithms for their parametric and structural learning to solve classification and clustering of multivariate discrete state continuous time trajectories. Numerical experiments on synthetic and real world data are used to compare the performance of continuous time Bayesian network models to that achieved by dynamic Bayesian networks. In particular, post-stroke rehabilitation data are used for the classification task, while urban traffic data from continuous time loop are used for the clustering task. The achieved results confirm the effectiveness of the proposed approaches.

Notes

http://www.swarco.net/
This definition differs from the one proposed in Stella and Amer (2012). In fact, this definition does not require the CTBNC graph \(\mathcal{G}\) to be connected. Therefore, feature selection is achieved as the product of any structural learning algorithm.
The time count sufficient statistic refers to the time spent in a particular state by a variable given the state of its parents.
With \(\alpha_y\) we refer to the hyperparameter associated with the class value \(y\).
For further experiments on continuous time Bayesian network classifiers for classification purposes refer to Codecasa and Stella (2013, 2014b).
With \(\alpha_y\) we refer to the hyperparameter associated with the class value \(y\).
http://www.civitas.eu/archimedes
For computational reasons a random subset of trajectories is used in the Monza road network tests.
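
The time count sufficient statistic mentioned in the notes above can be illustrated with a minimal sketch. It is not taken from the paper: the trajectory layout (a time-sorted list of `(timestamp, state)` records, each state holding until the next timestamp), the function name `time_counts`, and the toy variables `X` and `U` are all assumptions made for illustration.

```python
from collections import defaultdict


def time_counts(trajectory, variable, parents):
    """Total time `variable` spends in each state, per parent configuration.

    Assumed (hypothetical) layout: `trajectory` is a time-sorted list of
    (timestamp, {name: state}) records; each record holds until the next
    timestamp, and the last record only marks the end of observation.
    Returns {(parent_states_tuple, state): elapsed_time}.
    """
    totals = defaultdict(float)
    for (t0, state), (t1, _) in zip(trajectory, trajectory[1:]):
        key = (tuple(state[p] for p in parents), state[variable])
        totals[key] += t1 - t0
    return dict(totals)


# Toy trajectory with one variable X and one parent U (illustrative values).
trajectory = [
    (0.0, {"X": "a", "U": 0}),
    (1.5, {"X": "b", "U": 0}),
    (2.0, {"X": "b", "U": 1}),
    (4.0, {"X": "a", "U": 1}),
    (4.5, {"X": "a", "U": 1}),  # end-of-observation marker
]

counts = time_counts(trajectory, "X", ["U"])
for (parent_states, x_state), elapsed in sorted(counts.items()):
    print(parent_states, x_state, elapsed)
```

Because the statistic accumulates the actual elapsed time between transitions, no discretization step for the timestamps has to be chosen.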

References

Aggarwal, C.C., Han, J., Wang, J., Yu, P.S. (2004). On demand classification of data streams. In Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’04 (pp. 503–508). New York: ACM.
Angulo, E., Romero, F.P., García, R., Serrano-Guerrero, J., Olivas, J.A. (2011). An adaptive approach to enhanced traffic signal optimization by using soft-computing techniques. Expert Systems with Applications, 38(3), 2235–2247.
Arnott, R., & Small, K. (1994). The economics of traffic congestion. American Scientist, 82(5), 446–455.
Barber, D., & Cemgil, A. (2010). Graphical models for time-series. IEEE Signal Processing Magazine, 27(6), 18–28.
Bradley, P., Fayyad, U., Reina, C. (1998). Scaling clustering algorithms to large databases, (pp. 9–15): AAAI Press.
Chatterjee, S., & Russell, S.J. (2010). Why are dbns sparse? In Y.W. Teh, & D.M. Titterington (Eds.), JMLR proceedings of aistats (Vol. 9, pp. 81–88). JMLR.org.
Chickering, D.M., Heckerman, D., Meek, C. (2004). Large-sample learning of bayesian networks is np-hard. The Journal of Machine Learning Research, 5, 1287–1330.
Codecasa, D., & Stella, F. (2013). Conditional log-likelihood for continuous time bayesian network classifiers. In International workshop NFMCP held at ECML-PKDD.
Codecasa, D., & Stella, F. (2014a). CTBNCToolkit: Continuous time bayesian network classifier toolkit. ArXiv e-prints.
Codecasa, D., & Stella, F. (2014b). Learning continuous time bayesian network classifiers. International Journal of Approximate Reasoning (Vol. 55, pp. 1728–1746), Elsevier.
Cohn, I., El-Hay, T., Friedman, N., Kupferman, R. (2010). Mean field variational approximation for continuous-time bayesian networks. The Journal of Machine Learning Research, 2745–2783.
Dacorogna, M.M. (2001). An introduction to high-frequency finance. Academic Press.
Daigle, G., Krueger, G.D., Clark, J. (1997). Tsis: Advanced traffic software tools for the user. In Traffic congestion and traffic safety in the 21st century: Challenges, innovations, and opportunities.
Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(2), 142–150.
Domingos, P., & Hulten, G. (2000). Mining high-speed data streams. In Proceedings of the 6th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’00 (pp. 71–80). New York: ACM.
El-Hay, T., Cohn, I., Friedman, N., Kupferman, R. (2010). Continuous-time belief propagation. In: J. Fürnkranz, & T. Joachims (Eds.) In Proceedings of the 27th international conference on machine learning (ICML). Omnipress, Haifa, (pp. 343–350).
El-Hay, T., Friedman, N., Kupferman, R. (2008). Gibbs sampling in factorized continuous-time markov processes. In: D.A. McAllester, & P. Myllym (Eds.) In Proceedings of the 24th conference on UAI, (pp. 169–178).
Enright, C. (2012). A probabilistic framework based on mathematical models with application to medical data streams. Ph.D. Thesis.
Fan, Y., & Shelton, C. (2008). Sampling for approximate inference in continuous time bayesian networks. In 10th International symposium on artificial intelligence and mathematics.
Farnstrom, F., Lewis, J., Elkan, C. (2000). Scalability for clustering algorithms revisited. In SIGKDD Exploration Newsletter (pp. 51–57). doi:10.1145/360402.360419.
Favilla, J., Machion, A., Gomide, F. (1993). Fuzzy traffic control: Adaptive strategies. In 2nd IEEE international conference on Fuzzy systems (pp. 506–511). IEEE.
Felici, G., Rinaldi, G., Sforza, A., Truemper, K. (2006). A logic programming based approach for on-line traffic control. Transportation Research Part C: Emerging Technologies, 14(3), 175–189.
Forster, A., & Young, J. (2002). The clinical and cost effectiveness of physiotherapy in the management of elderly people following a stroke. London: The Chartered Society of Physiotherapy.
Fowlkes, E.B., & Mallows, C.L. (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78(383), 553–569.
Friedman, N., Geiger, D., Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29(2), 131–163.
Gaber, M.M., Zaslavsky, A., Krishnaswamy, S. (2005). Mining data streams: A review. SIGMOD Record, 34(2), 18–26.
Gan, G., Ma, C., Wu, J. (2007). Data clustering: Theory, algorithms, and applications (Vol. 20). SIAM.
Gershenson, C. (2004). Self-organizing traffic lights. arXiv: nlin/0411066.
Gopalratnam, K., Kautz, H., Weld, D.S. (2005). Extending continuous time bayesian networks. In Proceedings of the national conference on artificial intelligence (p. 981). Menlo Park; Cambridge : AAAI Press; MIT Press; 1999.
Grossman, D., & Domingos, P. (2004). Learning bayesian network classifiers by maximizing conditional likelihood. In Proceedings of the 21st international conference on machine learning (pp. 361–368). ACM.
Gualtieri, M., Rigamonti, L., Galeotti, V., Camatini, M. (2005). Toxicity of tire debris extracts on human lung cell line a549. Toxicology in Vitro, 19(7), 1001–1008.
Gunawardana, A., Meek, C., Xu, P. (2011). A model for temporal dependencies in event streams. In Neural information processing systems (pp. 1962–1970). Neural information processing systems foundation.
Haijema, R., & van der Wal, J. (2008). An mdp decomposition approach for traffic control at isolated signalized intersections. Probability in the Engineering and Informational Sciences, 22(04), 587–602.
Halkidi, M., Batistakis, Y., Vazirgiannis, M. (2001). On clustering validation techniques. Journal of Intelligent Information Systems, 17(2-3), 107–145.
Halkidi, M., Batistakis, Y., Vazirgiannis, M. (2002). Cluster validity methods: part i. ACM Sigmod Record, 31(2), 40–45.
Hirankitti, V., & Krohkaew, J. (2007). An agent approach for intelligent traffic-light control. In 1st Asia international conference on modelling & simulation, AMS’07 (pp. 496–501). IEEE.
Hulten, G., Spencer, L., Domingos, P. (2001). Mining time-changing data streams. In Proceedings of the 7th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’01 (pp. 97–106). New York: ACM.
Keogh, E., & Ratanamahatana, C.A. (2005). Exact indexing of dynamic time warping. Knowledge and Information Systems, 7(3), 358–386.
Koller, D., & Friedman, N. (2009). Probabilistic graphical models: principles and techniques. Cambridge, MA: The MIT Press.
Kranen, P., Assent, I., Baldauf, C., Seidl, T. (2011). The clustree: Indexing micro-clusters for anytime stream mining. In Knowledge and information systems journal (Springer KAIS) (Vol. 29, number 2, pp. 249–272). London: Springer.
Kwakkel, G., van Peppen, R., Wagenaar, R.C, Dauphinee, S.W., Richards, C., Ashburn, A., Miller, K., Lincoln, N., Partridge, C., Wellwood, I., et al. (2004). Effects of augmented exercise therapy time after stroke a meta-analysis. Stroke, 35(11), 2529–2539.
Lämmer, S., & Helbing, D. (2008). Self-control of traffic lights and vehicle flows in urban road networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(04), P04019.
Law, Y.N., & Zaniolo, C. (2005). An adaptive nearest neighbor classification algorithm for data streams. In Proceedings of the 9th european conference on principles and practice of knowledge discovery in databases, PKDD’05 (pp. 108–120). Berlin, Heidelberg: Springer-Verlag.
Mantecca, P., Gualtieri, M., Andrioletti, M., Bacchetta, R., Vismara, C., Vailati, G., Camatini, M. (2007). Tire debris organic extract affects Xenopus development. Environment International, 33(5), 642–648.
Murphy, K., & et al. (2001). The bayes net toolbox for matlab. Computing Science and Statistics, 33(2), 1024–1034.
Nodelman, U., Koller, D., Shelton, C. (2005). Expectation propagation for continuous time bayesian networks. In Proceedings of the 21st conference on UAI (pp. 431–40). Edinburgh, Scotland.
Nodelman, U., Shelton, C., Koller, D. (2002a). Continuous time bayesian networks. In Proceedings of the 18th conference on UAI (pp. 378–387). San Mateo:Morgan Kaufmann.
Nodelman, U., Shelton, C.R., Koller, D. (2002b). Learning continuous time bayesian networks. In Proceedings of the 19th conference on UAI (pp. 451–458). San Mateo:Morgan Kaufmann.
Nodelman, U., Shelton, C.R., Koller, D. (2012). Expectation maximization and complex duration distributions for continuous time bayesian networks. CoRR, abs/1207.1402.
Owen, L.E., Zhang, Y., Rao, L., McHale, G. (2000). Street and traffic simulation: traffic flow simulation using corsim. In Proceedings of the 32nd conference on winter simulation, Society for computer simulation international (pp. 1143–1147).
Paolucci, S., Antonucci, G., Grasso, M.G., Morelli, D., Troisi, E., Coiro, P., Bragoni, M. (2000). Early versus delayed inpatient stroke rehabilitation: a matched comparison conducted in Italy. Archives of Physical Medicine and Rehabilitation, 81 (6), 695–700.
Papageorgiou, M., Diakaki, C., Dinopoulou, V., Kotsialos, A., Wang, Y. (2003). Review of road traffic control strategies. Proceedings of the IEEE, 91(12), 2043–2067.
Park, B., & Messer, C. (1998). A genetic algorithm-based signal optimization program for oversaturated intersections. In Proceeding of the 5th world congress of intelligent transport system towards the new horizon together (pp. 12–16), Seoul.
Rabiner, L. (1989). A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286.
Rajaram, S., Graepel, T., Herbrich, R. (2005). Poisson-networks: A model for structured point processes. In Proceedings of the 10th international workshop on artificial intelligence and statistics (pp. 277–284).
Rand, W.M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66 (336), 846–850.
Rao, V., & Teh, Y.W. (2012). Fast mcmc sampling for markov jump processes and continuous time bayesian networks. arXiv preprint arXiv: http://arxiv.org/abs/1202.3760.
Robertson, D.I. (1969). “Transyt” method for area traffic control. Traffic Engineering & Control, 11(6).
Robertson, D.I., & Bretherton, R.D. (1991). Optimizing networks of traffic signals in real time-the scoot method. IEEE Transactions on Vehicular Technology, 40(1), 11–15.
Saria, S., Nodelman, U., Koller, D. (2007). Reasoning at the right time granularity. In UAI (pp. 326–334).
Silva, J.A., Faria, E.R., Barros, R.C., Hruschka, E.R., Carvalho, A.C.P.L.F.d., Gama, J.a. (2013). Data stream clustering: A survey. ACM Computing Surveys, 46(1), 13:1–13:31. doi:10.1145/2522968.2522981.
Simma, A., Goldszmidt, M., MacCormick, J., Barham, P., Black, R., Isaacs, R., Mortier, R. (2008). Ct-nor: Representing and reasoning about events in continuous time. In Proceedings of the 24th conference on UAI (pp. 484–493). AUAI.
Simma, A., & Jordan, M. (2010). Modeling events with cascades of poisson processes. In Proceedings of the 26th conference on UAI (pp. 546–555). AUAI.
Sims, A.G., & Dobinson, K. (1980). The sydney coordinated adaptive traffic (scat) system philosophy and benefits. IEEE Transactions on Vehicular Technology, 29(2), 130–137.
Spall, J.C., & Chin, D.C. (1997). Traffic-responsive signal timing for system-wide traffic control. Transportation Research Part C: Emerging Technologies, 5(3–4), 153–163.
Stella, F., & Amer, Y. (2012). Continuous time bayesian network classifiers. Journal of Biomedical Informatics, 45(6), 1108–1119.
Acerbi, E., & Stella, F. (2014). Continuous time bayesian networks for gene network reconstruction: a comparative study on time course data. In 10th international symposium on bioinformatics research and applications.
Stella, F., Viganò, V., Bogni, D., Benzoni, M. (2006). An integrated forecasting and regularization framework for light rail transit systems. Journal of Intelligent Transportation Systems, 10(2), 59–73.
Thorpe, T.L., & Anderson, C.W. (1996). Traffic light control using sarsa with three state representations. Technical report, Citeseer.
Tormene, P., Giorgino, T., Quaglini, S., Stefanelli, M. (2009). Matching incomplete time series with dynamic time warping: an algorithm and an application to post-stroke rehabilitation. Artificial Intelligence in Medicine, 45(1), 11–34.
Truccolo, W., Eden, U., Fellows, M., Donoghue, J., Brown, E. (2005). A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2), 1074–1089.
Villa, S., & Stella, F. (2014). A continuous time bayesian network classifier for intraday fx prediction (pp. 1–14). Quantitative Finance. doi:10.1080/14697688.2014.906811.
Voit, E. (2012). A first course in systems biology. New York: Garland Science.
Weiss, J.C., & Page, D. (2013a). Forest-based point process for event prediction from electronic health records. In Machine learning and knowledge discovery in databases (pp. 547–562). Berlin Heidelberg New York:Springer.
Weiss, J.C., & Page, D. (2013b). Forest-based point processes for event prediction from electronic health records. Machine Learning and Knowledge Discovery in Databases, 8190, 547–562.
Wiering, M. (2000). Multi-agent reinforcement learning for traffic light control.
Xu, R., & Wunsch, D. (2008). Clustering (Vol. 10). Wiley.
Yilmaz, A., Javed, O., Shah, M. (2006). Object tracking: A survey. ACM Computing Surveys, 38(4), 1–45. doi:10.1145/1177352.1177355.
Yu, X.H., & Recker, W.W. (2006). Stochastic adaptive control model for traffic signal systems. Transportation Research Part C: Emerging Technologies, 14(4), 263–282.
Zhou, A., Cao, F., Qian, W., Jin, C. (2008). Tracking clusters in evolving data streams over sliding windows. Knowledge and Information Systems, 15(2), 181–214.

Author information

Authors and Affiliations

DISCo, Università degli Studi di Milano-Bicocca, Viale Sarca 336, 20126, Milano, Italy
Daniele Codecasa & Fabio Stella

Corresponding author

Correspondence to Fabio Stella.

About this article

Cite this article

Codecasa, D., Stella, F. Classification and clustering with continuous time Bayesian network models. J Intell Inf Syst 45, 187–220 (2015). https://doi.org/10.1007/s10844-014-0345-0

Received: 20 May 2014
Revised: 27 September 2014
Accepted: 07 November 2014
Published: 22 November 2014
Issue Date: October 2015
DOI: https://doi.org/10.1007/s10844-014-0345-0
