Abstract:
Minimum description length (MDL) model selection, in its modern NML formulation, involves a model complexity term which is equivalent to minimax/maximin regret. When the ...Show MoreMetadata
Abstract:
Minimum description length (MDL) model selection, in its modern NML formulation, involves a model complexity term which is equivalent to minimax/maximin regret. When the data are discrete-valued, the complexity term is a logarithm of a sum of maximized likelihoods over all possible data-sets. Because the sum has an exponential number of terms, its evaluation is in many cases intractable. In the continuous case, the sum is replaced by an integral for which a closed form is available in only a few cases. We present an approach based on Monte Carlo sampling, which works for all model classes, and gives strongly consistent estimators of the minimax regret. The estimates convergence almost surely to the correct value with increasing number of iterations. For the important class of Markov models, one of the presented estimators is particularly efficient: in empirical experiments, accuracy that is sufficient for model selection is usually achieved already on the first iteration, even for long sequences.
Published in: 2008 IEEE Information Theory Workshop
Date of Conference: 05-09 May 2008
Date Added to IEEE Xplore: 25 July 2008
ISBN Information: