
Pattern Recognition Letters

Volume 26, Issue 14, 15 October 2005, Pages 2295-2308

Learning dynamic Bayesian network models via cross-validation

https://doi.org/10.1016/j.patrec.2005.04.005

Abstract

We study cross-validation as a scoring criterion for learning dynamic Bayesian network models that generalize well. We argue that cross-validation is more suitable than the Bayesian scoring criterion for one of the most common interpretations of generalization. We confirm this by carrying out an experimental comparison of cross-validation and the Bayesian scoring criterion, as implemented by the Bayesian Dirichlet metric and the Bayesian information criterion. The results show that cross-validation leads to models that generalize better for a wide range of sample sizes.


Motivation

Let X_t = {X_t^1, …, X_t^I} denote a set of I discrete random variables that represents the state of a temporal process at a discrete time point t. A dynamic Bayesian network (DBN) is a pair (G, θ) that models the temporal process by specifying a probability distribution for X_0, …, X_T, p(X_0, …, X_T | G, θ) (Friedman et al., 1998; Neapolitan, 2003). The first component of the DBN, G, is an acyclic directed graph (DAG) whose nodes correspond to the random variables in X_0 and X_1. Edges from X_1 to X_0 are not allowed.
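To make the factorization concrete, the following is a minimal toy sketch (not from the paper; all variables, CPTs, and numbers are illustrative) of a stationary first-order DBN over two binary variables, where the joint p(X_0, …, X_T | G, θ) decomposes into an initial distribution and transition CPTs whose edges run forward in time only:

```python
import math

# Initial distribution p(X_0): two binary variables A, B, assumed
# independent at t = 0 for simplicity (hypothetical values).
p0 = {"A": [0.6, 0.4], "B": [0.5, 0.5]}

# Transition CPTs: p(A_{t+1} | A_t) and p(B_{t+1} | A_t, B_t).
# Parents of slice t+1 lie in slice t; no edges point back in time.
trans = {
    "A": (["A"], {(0,): [0.9, 0.1], (1,): [0.3, 0.7]}),
    "B": (["A", "B"], {(0, 0): [0.8, 0.2], (0, 1): [0.4, 0.6],
                       (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]}),
}

def log_prob(sequence):
    """log p(x_0, ..., x_T): the same transition CPTs are reused at
    every step, i.e. the process is assumed stationary."""
    lp = sum(math.log(p0[v][sequence[0][v]]) for v in p0)
    for t in range(1, len(sequence)):
        for v, (parents, cpt) in trans.items():
            pa = tuple(sequence[t - 1][p] for p in parents)
            lp += math.log(cpt[pa][sequence[t][v]])
    return lp

seq = [{"A": 0, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 1}]
print(log_prob(seq))
```

The product being logged is 0.6 · 0.5 · 0.9 · 0.2 · 0.1 · 0.6, i.e. one factor per variable per time slice, which is exactly the DBN factorization described above.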

Experiments

In this section, we evaluate CV as a scoring criterion for learning DBN models that generalize well. We use BSC (BD and BIC implementations) as a benchmark. All the experiments involve data sampled from known DBNs. This enables us to assess the topological accuracy of the models learnt, in addition to their generalization ability. We first describe the experimental setting.
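As a rough illustration of CV as a scoring criterion (a hedged sketch, not the paper's implementation: the data generator, fold count, and Laplace smoothing are all assumptions), the snippet below scores two candidate structures for a single binary process, with and without an edge from the previous time slice, by k-fold held-out log-likelihood; the structure with the higher CV score would be selected:

```python
import math
import random

random.seed(0)

# Toy data: binary sequences from a persistent first-order Markov chain,
# standing in for data sampled from a known DBN.
def sample_seq(T=20, stay=0.85):
    x = [int(random.random() < 0.5)]
    for _ in range(T - 1):
        x.append(x[-1] if random.random() < stay else 1 - x[-1])
    return x

data = [sample_seq() for _ in range(60)]

def heldout_loglik(train, test, markov):
    """Fit transition counts on train (Laplace smoothing), score test.
    markov=True conditions x_t on x_{t-1}; False fits one marginal."""
    counts = {}
    for seq in train:
        for t in range(1, len(seq)):
            key = seq[t - 1] if markov else None
            counts.setdefault(key, [1, 1])[seq[t]] += 1
    ll = 0.0
    for seq in test:
        for t in range(1, len(seq)):
            key = seq[t - 1] if markov else None
            c = counts.get(key, [1, 1])
            ll += math.log(c[seq[t]] / sum(c))
    return ll

def cv_score(data, markov, k=5):
    """k-fold CV score: average held-out log-likelihood."""
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i in range(k):
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        total += heldout_loglik(train, folds[i], markov)
    return total / k

print("CV score, no edge:    ", cv_score(data, markov=False))
print("CV score, Markov edge:", cv_score(data, markov=True))
```

Because the generating process is strongly persistent, the structure containing the temporal edge should obtain the higher held-out log-likelihood, which is the sense of generalization that CV targets directly.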

Discussion

BSC is probably the most commonly used scoring criterion for learning DBN models from data. Typically, BSC is regarded as scoring the likelihood of a model having generated the learning data. Alternatively, BSC can be seen as scoring the accuracy of the model as a sequential predictor of the learning data. This alternative view is interesting because it reflects that BSC scores some sort of generalization. In this paper, we are concerned with a different interpretation of generalization, namely

Acknowledgements

We thank Roland Nilsson for providing us with the code for the Wilcoxon test, and Magnus Ekdahl and the three anonymous referees for their valuable comments on this paper. This work is funded by the Swedish Foundation for Strategic Research (SSF) and Linköping Institute of Technology.

References (27)

  • H. Akaike, A new look at the statistical model identification, IEEE Trans. Automatic Control (1974)
  • Bouckaert, R.R., 2003. Choosing between two learning algorithms based on calibrated tests. In: Proc. Twentieth...
  • D.M. Chickering, Optimal structure identification with greedy search, J. Machine Learning Res. (2002)
  • D.M. Chickering et al., A comparison of scientific and engineering criteria for Bayesian model selection, Statist. Comput. (2000)
  • R.G. Cowell et al., Probabilistic Networks and Expert Systems (1999)
  • P. D’haeseleer et al., Genetic network inference: From co-expression clustering to reverse engineering, Bioinformatics (2000)
  • Friedman, N., Murphy, K., Russell, S., 1998. Learning the structure of dynamic probabilistic networks. In: Proc....
  • D. Heckerman et al., Learning Bayesian networks: The combination of knowledge and statistical data, Machine Learning (1995)
  • R. Hofmann et al., Nonlinear Markov networks for continuous variables, Advances in Neural Information Processing Systems (1998)
  • D. Husmeier, Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks, Bioinformatics (2003)
  • Kauffman, S., Peterson, C., Samuelsson, B., Troein, C., 2003. Random Boolean network models and the yeast...
  • E. Keogh et al., Learning the structure of augmented Bayesian classifiers, Internat. J. Artificial Intelligence Tools (2002)
  • Kim, S., Imoto, S., Miyano, S., 2003. Dynamic Bayesian network and nonparametric regression for nonlinear modeling of...