Abstract:
We observe a length-n sample generated by an unknown, stationary ergodic Markov process (model) over a finite alphabet A. In this paper, we do not assume any bound on the...Show MoreMetadata
Abstract:
We observe a length-n sample generated by an unknown, stationary ergodic Markov process (model) over a finite alphabet A. In this paper, we do not assume any bound on the memory of the source, nor do we assume that the source is rapidly mixing. Rather we consider a class Md of all Markov sources where for all i ∈ ℕ, the mutual information between bits i apart, conditioned on all bits in between, is bounded by log(1 + d(i)). Given any string w of symbols from A and an unknown source in Md, we want estimates of the conditional probability distribution of symbols following w (model parameters), as well as the stationary probability of w. In this setting, we can only have estimators of model parameters converge to the underlying truth in a pointwise sense over Md. However, can we look at a length-n sample and identify if an estimate is likely to be accurate? In this paper we specifically address the case where d(i) diminishes exponentially with i. Since the memory is unknown a-priori, a natural approach is to estimate a potentially coarser model with memory kn = O(log n). As n grows, estimates get refined and this approach is consistent with the above scaling of kn also known to be essentially optimal. But while effective asymptotically, the situation is quite different when we want the best answers possible with a length-n sample, rather than just consistency. Combining results in universal compression with Aldous' coupling arguments, we obtain sufficient conditions on the length-n sample (even for slow mixing models) to identify when naive (i) estimates of the model parameters and (ii) estimates related to the stationary probabilities are accurate; and also bound the deviations of the naive estimates from true values.
Published in: 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Date of Conference: 02-04 October 2013
Date Added to IEEE Xplore: 13 February 2014
ISBN Information: