Bayesian inference based learning automaton scheme in Q-model environments

Di, Chong; Li, Fangqi; Li, Shenghong; Tian, Jianwei

doi:10.1007/s10489-021-02230-8


Published: 10 March 2021

Volume 51, pages 7453–7468 (2021)

Applied Intelligence

Chong Di¹, Fangqi Li¹, Shenghong Li¹ (ORCID: orcid.org/0000-0002-0767-2307) & Jianwei Tian²


Abstract

A learning automaton (LA) is a reinforcement learning unit that learns the optimal action in a stochastic environment. Considerable effort has gone into improving the performance of LA in environments that provide only a reward or a penalty. In many practical scenarios, however, the feedback from the environment takes multiple levels. The LA community refers to this latter kind of environment as the Q-model. This paper studies LA in Q-model environments, which have received little attention. We propose BILA_ML, a novel Bayesian inference-based LA that can operate in Q-model environments. We use Bayesian inference to estimate the environment’s response to each action, and then adopt a KL divergence metric for adaptive decision-making. The BILA_ML scheme is proved to be 𝜖-optimal, and comprehensive experiments show that it outperforms established LA frameworks.
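
The two ingredients named in the abstract, Bayesian estimation of each action's multi-level response and a KL divergence between the resulting estimates, can be illustrated with a minimal sketch. This is not the BILA_ML algorithm, whose update and decision rules are not given in this preview. The symmetric Dirichlet prior, the round-robin exploration, the convention that a higher feedback level is better, and all names (`posterior_mean`, `kl_divergence`, `explore`, `LEVELS`, `ENV`) are assumptions made purely for illustration.

```python
import math
import random


def posterior_mean(counts, prior=1.0):
    """Posterior-mean estimate of the level probabilities for one action.

    Assumes a symmetric Dirichlet(prior) prior over the feedback levels;
    this is an illustrative choice, not taken from the paper.
    """
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_level(probs, levels):
    """Mean feedback value under the given level probabilities."""
    return sum(p * v for p, v in zip(probs, levels))


def explore(env_probs, levels, steps=2000, seed=0):
    """Sample every action in turn, then return (best action, estimates).

    Round-robin sampling is a placeholder for an adaptive decision rule.
    """
    rng = random.Random(seed)
    n_actions = len(env_probs)
    counts = [[0] * len(levels) for _ in range(n_actions)]
    for t in range(steps):
        a = t % n_actions
        # Draw the index of the feedback level returned by the environment.
        k = rng.choices(range(len(levels)), weights=env_probs[a])[0]
        counts[a][k] += 1
    estimates = [posterior_mean(c) for c in counts]
    best = max(range(n_actions),
               key=lambda a: expected_level(estimates[a], levels))
    return best, estimates


# Hypothetical three-level environment with two actions.
LEVELS = [0.0, 0.5, 1.0]
ENV = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]

best, estimates = explore(ENV, LEVELS)
gaps = [kl_divergence(estimates[best], e) for e in estimates]
print("best action:", best)
print("KL from the best action's estimate:", gaps)
```

A larger divergence between the leading action's estimated response distribution and a competitor's suggests the two are easier to tell apart; how the paper turns such a quantity into a decision rule is not stated in the abstract.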


Notes

In this paper, the terms “multi-level environment” and “Q-model environment” are used interchangeably.


Acknowledgements

This research is funded by the National Natural Science Foundation of China under Grant 61971283 and by the 2020 Industrial Internet Innovation Development Project of the Ministry of Industry and Information Technology of P.R. China, “Smart energy Internet security situation awareness platform project”.

Author information

Authors and Affiliations

School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
Chong Di, Fangqi Li & Shenghong Li
State Grid Information Communication Company of Hunan, Hunan Key Laboratory for Internet of Things in Electricity, Changsha, China
Jianwei Tian


Corresponding author

Correspondence to Shenghong Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Di, C., Li, F., Li, S. et al. Bayesian inference based learning automaton scheme in Q-model environments. Appl Intell 51, 7453–7468 (2021). https://doi.org/10.1007/s10489-021-02230-8


Accepted: 22 January 2021
Published: 10 March 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s10489-021-02230-8
