
Pattern Recognition Letters

Volume 26, Issue 14, 15 October 2005, Pages 2295-2308

Learning dynamic Bayesian network models via cross-validation

https://doi.org/10.1016/j.patrec.2005.04.005

Abstract

We study cross-validation as a scoring criterion for learning dynamic Bayesian network models that generalize well. We argue that cross-validation is more suitable than the Bayesian scoring criterion for one of the most common interpretations of generalization. We confirm this by carrying out an experimental comparison of cross-validation and the Bayesian scoring criterion, as implemented by the Bayesian Dirichlet metric and the Bayesian information criterion. The results show that cross-validation leads to models that generalize better for a wide range of sample sizes.


Motivation

Let X_t = {X_t^1, …, X_t^I} denote a set of I discrete random variables that represents the state of a temporal process at a discrete time point t. A dynamic Bayesian network (DBN) is a pair (G, θ) that models the temporal process by specifying a probability distribution for X_0, …, X_T, p(X_0, …, X_T | G, θ) (Friedman et al., 1998; Neapolitan, 2003). The first component of the DBN, G, is an acyclic directed graph (DAG) whose nodes correspond to the random variables in X_0 and X_1. Edges from X_1 to X_0 are not allowed.
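To make the factorization concrete, the following is a minimal toy sketch (not from the paper; all variables, CPTs, and numbers are illustrative) of a stationary first-order DBN over two binary variables, where the joint p(X_0, …, X_T | G, θ) decomposes into an initial distribution and transition CPTs whose edges run forward in time only:

```python
import math

# Initial distribution p(X_0): two binary variables A, B, assumed
# independent at t = 0 for simplicity (hypothetical values).
p0 = {"A": [0.6, 0.4], "B": [0.5, 0.5]}

# Transition CPTs: p(A_{t+1} | A_t) and p(B_{t+1} | A_t, B_t).
# Parents of slice t+1 lie in slice t; no edges point back in time.
trans = {
    "A": (["A"], {(0,): [0.9, 0.1], (1,): [0.3, 0.7]}),
    "B": (["A", "B"], {(0, 0): [0.8, 0.2], (0, 1): [0.4, 0.6],
                       (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]}),
}

def log_prob(sequence):
    """log p(x_0, ..., x_T): the same transition CPTs are reused at
    every step, i.e. the process is assumed stationary."""
    lp = sum(math.log(p0[v][sequence[0][v]]) for v in p0)
    for t in range(1, len(sequence)):
        for v, (parents, cpt) in trans.items():
            pa = tuple(sequence[t - 1][p] for p in parents)
            lp += math.log(cpt[pa][sequence[t][v]])
    return lp

seq = [{"A": 0, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 1}]
print(log_prob(seq))
```

The product being logged is 0.6 · 0.5 · 0.9 · 0.2 · 0.1 · 0.6, i.e. one factor per variable per time slice, which is exactly the DBN factorization described above.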

Experiments

In this section, we evaluate CV as a scoring criterion for learning DBN models that generalize well. We use BSC (BD and BIC implementations) as a benchmark. All the experiments involve data sampled from known DBNs. This enables us to assess the topological accuracy of the models learnt, in addition to their generalization ability. We first describe the experimental setting.
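As a rough illustration of CV as a scoring criterion (a hedged sketch, not the paper's implementation: the data generator, fold count, and Laplace smoothing are all assumptions), the snippet below scores two candidate structures for a single binary process, with and without an edge from the previous time slice, by k-fold held-out log-likelihood; the structure with the higher CV score would be selected:

```python
import math
import random

random.seed(0)

# Toy data: binary sequences from a persistent first-order Markov chain,
# standing in for data sampled from a known DBN.
def sample_seq(T=20, stay=0.85):
    x = [int(random.random() < 0.5)]
    for _ in range(T - 1):
        x.append(x[-1] if random.random() < stay else 1 - x[-1])
    return x

data = [sample_seq() for _ in range(60)]

def heldout_loglik(train, test, markov):
    """Fit transition counts on train (Laplace smoothing), score test.
    markov=True conditions x_t on x_{t-1}; False fits one marginal."""
    counts = {}
    for seq in train:
        for t in range(1, len(seq)):
            key = seq[t - 1] if markov else None
            counts.setdefault(key, [1, 1])[seq[t]] += 1
    ll = 0.0
    for seq in test:
        for t in range(1, len(seq)):
            key = seq[t - 1] if markov else None
            c = counts.get(key, [1, 1])
            ll += math.log(c[seq[t]] / sum(c))
    return ll

def cv_score(data, markov, k=5):
    """k-fold CV score: average held-out log-likelihood."""
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i in range(k):
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        total += heldout_loglik(train, folds[i], markov)
    return total / k

print("CV score, no edge:    ", cv_score(data, markov=False))
print("CV score, Markov edge:", cv_score(data, markov=True))
```

Because the generating process is strongly persistent, the structure containing the temporal edge should obtain the higher held-out log-likelihood, which is the sense of generalization that CV targets directly.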

Discussion

BSC is probably the most commonly used scoring criterion for learning DBN models from data. Typically, BSC is regarded as scoring the likelihood of a model having generated the learning data. Alternatively, BSC can be seen as scoring the accuracy of the model as a sequential predictor of the learning data. This alternative view is interesting because it reflects that BSC scores some sort of generalization. In this paper, we are concerned with a different interpretation of generalization, namely

Acknowledgements

We thank Roland Nilsson for providing us with the code for the Wilcoxon test, and Magnus Ekdahl and the three anonymous referees for their valuable comments on this paper. This work is funded by the Swedish Foundation for Strategic Research (SSF) and Linköping Institute of Technology.

References (27)

  • H. Akaike, A new look at the statistical model identification, IEEE Trans. Automatic Control (1974)
  • Bouckaert, R.R., 2003. Choosing between two learning algorithms based on calibrated tests. In: Proc. Twentieth...
  • D.M. Chickering, Optimal structure identification with greedy search, J. Machine Learning Res. (2002)
  • D.M. Chickering et al., A comparison of scientific and engineering criteria for Bayesian model selection, Statist. Comput. (2000)
  • R.G. Cowell et al., Probabilistic Networks and Expert Systems (1999)
  • P. D’haeseleer et al., Genetic network inference: From co-expression clustering to reverse engineering, Bioinformatics (2000)
  • Friedman, N., Murphy, K., Russell, S., 1998. Learning the structure of dynamic probabilistic networks. In: Proc....
  • D. Heckerman et al., Learning Bayesian networks: The combination of knowledge and statistical data, Machine Learning (1995)
  • R. Hofmann et al., Nonlinear Markov networks for continuous variables, Advances in Neural Information Processing Systems (1998)
  • D. Husmeier, Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks, Bioinformatics (2003)
  • Kauffman, S., Peterson, C., Samuelsson, B., Troein, C., 2003. Random Boolean network models and the yeast...
  • E. Keogh et al., Learning the structure of augmented Bayesian classifiers, Internat. J. Artificial Intelligence Tools (2002)
  • Kim, S., Imoto, S., Miyano, S., 2003. Dynamic Bayesian network and nonparametric regression for nonlinear modeling of...