ABSTRACT
Sequential learning and decision algorithms, with various application areas, are investigated under a family of additive loss functions for individual data sequences. Simple universal sequential schemes are known, under certain conditions, to approach optimality uniformly as fast as n^{-1} log n, where n is the sample size. For the case of finite-alphabet observations, the class of schemes that can be implemented by finite-state machines (FSMs) is studied. It is shown that Markovian machines with sufficiently long memory exist that are asymptotically nearly as good as any given FSM (deterministic or randomized) for the purpose of sequential decision. For the continuous-valued observation case, a useful class of parametric schemes is discussed, with special attention to the recursive least squares (RLS) algorithm.
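To make the continuous-valued case concrete, here is a minimal sketch of the textbook RLS recursion used as a sequential linear predictor under squared-error loss. It is an illustration only, not the specific scheme analyzed in the paper: the function name `rls_predict`, the predictor order, the forgetting factor `lam`, the initialization constant `delta`, and the cosine test sequence are all assumptions chosen for the example.

```python
import math

def rls_predict(seq, order=2, lam=1.0, delta=1e4):
    """Sequentially predict seq[t] from the previous `order` samples with RLS.

    Returns the final weight vector and the list of per-step squared errors.
    All parameter defaults are illustrative assumptions.
    """
    w = [0.0] * order
    # P tracks the inverse of the regularized regressor correlation matrix.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    losses = []
    for t in range(order, len(seq)):
        x = [seq[t - 1 - i] for i in range(order)]  # most recent sample first
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = seq[t] - pred
        losses.append(err * err)  # the loss is incurred before the update
        Px = [sum(P[i][j] * x[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(xi * pxi for xi, pxi in zip(x, Px))
        k = [pxi / denom for pxi in Px]  # gain vector
        w = [wi + ki * err for wi, ki in zip(w, k)]
        # P is symmetric, so x^T P equals Px transposed.
        P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(order)]
             for i in range(order)]
    return w, losses

# A sampled cosine obeys y[t] = 2*cos(1)*y[t-1] - y[t-2] exactly,
# so an order-2 linear predictor can track it.
seq = [math.cos(t) for t in range(200)]
weights, losses = rls_predict(seq)
avg_loss = sum(losses) / len(losses)
```

The decision at time t uses only past samples, and the per-step losses add up, which matches the additive-loss, individual-sequence setting described in the abstract. On this sequence the weights should settle near 2*cos(1) and -1, and the average loss is dominated by the first few predictions.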
Index Terms
- Universal sequential learning and decision from individual data sequences