Elsevier

Information Sciences

Volume 35, Issue 3, June 1985, Pages 183-198
Multiaction learning automata possessing ergodicity of the mean

https://doi.org/10.1016/0020-0255(85)90049-0

Abstract

This paper considers multiaction learning automata that update their action probabilities according to whether the environment responds with a reward or a penalty. A learning automaton is said to possess ergodicity of the mean if its mean action probability is the state probability (or unconditional probability) of an ergodic Markov chain. In an earlier paper [11] we considered the problem of a two-action learning automaton being ergodic in the mean (EM) and characterized the family of such automata completely by proving necessary and sufficient conditions for an automaton to be EM. In this paper, we generalize the results of [11] and obtain necessary and sufficient conditions for a multiaction learning automaton to be EM. These conditions involve two families of probability updating functions. It is shown that, for the automaton to be EM, the two families must be linearly dependent. The vector defining the linear dependence is the only vector parameter that controls the rate of convergence of the automaton. A technique for reducing the variance of the limiting distribution is also discussed. Just as in the two-action case, it is shown that the set of absolutely expedient schemes and the set of schemes that possess ergodicity of the mean are disjoint.
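To make the reward/penalty updating described in the abstract concrete, the following minimal sketch simulates a multiaction automaton in a stationary random environment. It uses the standard textbook linear reward-penalty rule (L_R-P), a well-known ergodic scheme, purely as an illustration; it is not the general family of updating functions characterized in the paper. The names lrp_update and simulate, the step sizes a and b, and the penalty probabilities are all assumptions introduced for this example.

```python
import random


def lrp_update(p, chosen, rewarded, a=0.05, b=0.05):
    """One step of the textbook linear reward-penalty (L_R-P) rule.

    p        -- current action probability vector (sums to 1)
    chosen   -- index of the action just performed
    rewarded -- True if the environment answered with a reward
    a, b     -- reward and penalty step sizes (illustrative values)
    """
    r = len(p)
    q = []
    for j in range(r):
        if rewarded:
            # Reward: move probability mass toward the chosen action.
            q.append(p[j] + a * (1.0 - p[j]) if j == chosen else (1.0 - a) * p[j])
        else:
            # Penalty: shrink the chosen action, spread the mass over the rest.
            q.append((1.0 - b) * p[j] if j == chosen
                     else b / (r - 1) + (1.0 - b) * p[j])
    return q


def simulate(penalty_probs, steps=20000, seed=0, a=0.05, b=0.05):
    """Run the automaton and return (final p, time-averaged p).

    penalty_probs[i] is the (hypothetical) probability that action i
    draws a penalty from the environment.
    """
    rng = random.Random(seed)
    r = len(penalty_probs)
    p = [1.0 / r] * r
    running_mean = [0.0] * r
    for n in range(1, steps + 1):
        chosen = rng.choices(range(r), weights=p)[0]
        rewarded = rng.random() >= penalty_probs[chosen]
        p = lrp_update(p, chosen, rewarded, a, b)
        # Incremental average of the action probability vector over time.
        running_mean = [m + (x - m) / n for m, x in zip(running_mean, p)]
    return p, running_mean


if __name__ == "__main__":
    final_p, mean_p = simulate([0.8, 0.2, 0.5])
    print("final p:", final_p)
    print("time-averaged p:", mean_p)
```

In this sketch the probability vector never locks onto a single action; it keeps fluctuating, and the time average is what settles down. That is the qualitative behaviour an ergodic scheme is expected to show, in contrast to absolutely expedient schemes, which the abstract places in a disjoint class.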

References (17)

  • M.F. Norman, Some convergence theorems for stochastic learning models with distance diminishing operators, J. Math. Psych. (1968)
  • M.L. Tsetlin, On the behaviour of finite automata in random media, Avtomat. i Telemekh. (1961)
  • M.L. Tsetlin, Automaton Theory and the Modelling of Biological Systems (1973)
  • A. Paz, Introduction to Probabilistic Automata (1971)
  • V.I. Varshavskii et al., On the behaviour of stochastic automata with variable structure, Avtomat. i Telemekh. (1963)
  • K. S. Narendra and M. A. L. Thathachar, to...
  • K.S. Narendra et al., Learning automata - a survey, IEEE Trans. Systems Man Cybernet. (1974)
  • D.L. Isaacson et al., Markov Chains: Theory and Applications (1976)
There are more references available in the full text version of this article.

Cited by (16)

  • Continuous and discretized pursuit learning schemes: Various algorithms and their comparison, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (2001)
  • Multiple response learning automata, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (1996)
  • Adaptive Directional Neighbor Discovery Schemes in Wireless Networks, 2020 International Conference on Computing, Networking and Communications, ICNC 2020 (2020)
  • Evaluating an Adaptive Web Traffic Routing Method for the Cloud, 2019 IEEE ComSoc International Communications Quality and Reliability Workshop, CQR 2019 (2019)