Abstract
Learning automata (LA), a powerful reinforcement learning tool within Artificial Intelligence, can adaptively search for the optimal state in a random environment. Over the past decades, many finite action-set learning automata (FALA) algorithms have been developed to maturity, yet they expose critical defects when applied to optimizing continuous functions. To overcome these shortcomings and obtain a higher-performance LA, we propose a novel continuous action-set learning automaton (CALA) algorithm that solves function optimization problems via one class of LA prototypes, i.e., the continuous action-set reinforcement learning automata, abbreviated as CARLA. The key mechanism of the proposed algorithm is a combination of equidistant discretization and linear interpolation. Specifically, four categories of application models are constructed. Two of them generate continuous actions from finite prior information, thereby avoiding the drawbacks of FALA; this functionality relies on the cumulative distribution function (CDF) and on a new concept, the area surrounded by curves (AsbC), respectively. The other two models are modified versions that balance the trade-off between accuracy and speed. Moreover, all of these models are extended to generalized versions so that multidimensional function optimization problems can be handled as well. Extensive experiments covering four benchmarks and three scenarios demonstrate the effectiveness and efficiency of the proposed application models. The proposed algorithm outperforms state-of-the-art LA and optimization algorithms, achieving high accuracy, fast convergence, and competitive time consumption, especially in noisy environments.
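As a rough illustration of the key mechanism named above, the sketch below draws a continuous action by discretizing a density on an equidistant grid, accumulating it into a CDF, and inverting that CDF with linear interpolation (inverse-transform sampling). This is a minimal sketch only: the grid size r, the interval bounds, the Gaussian-shaped density, and the function name sample_action are illustrative assumptions, not the paper's exact CDF model.

```python
import numpy as np

# Minimal sketch: equidistant discretization + linear interpolation of a CDF
# to obtain a continuous action. All names and parameters are illustrative.

def sample_action(density, a_min, a_max, r=100, seed=None):
    rng = np.random.default_rng(seed)
    grid = np.linspace(a_min, a_max, r)   # equidistant discretization of the action interval
    pdf = density(grid)
    cdf = np.cumsum(pdf)
    cdf = cdf / cdf[-1]                   # accumulate and normalize into a CDF on [0, 1]
    u = rng.uniform()                     # one uniform draw
    # Invert the discretized CDF by linear interpolation -> a continuous action
    return float(np.interp(u, cdf, grid))

# Example: an unnormalized Gaussian-shaped density centered at 3.0
gaussian = lambda x: np.exp(-0.5 * ((x - 3.0) / 2.0) ** 2)
print(sample_action(gaussian, -10.0, 10.0, seed=0))
```

Because the CDF is stored only at the r grid points, refining the grid trades speed for accuracy, which is the trade-off the modified models in the abstract are said to balance.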
Notes
The parameter space is the region over which we search for an optimum.
$A$ could be either a finite set of points $\{\alpha_1, \alpha_2, \dots, \alpha_r\}$ or a continuous interval $(\alpha_{\min}, \alpha_{\max})$ of the real line, corresponding to FALA and CALA respectively.
Throughout the paper, $\beta = 1$ means the environment rewards the selected action to the maximum extent, and vice versa.
Throughout the paper, $D_\alpha$ does not change over time; that is, only a stationary random environment is considered (a minimal sketch follows these notes).
This is exactly the order in which the existing algorithms were introduced in Section 2.2.
The parameters of FALA are I: n=7; II: n=7; III: r=5; IV: D=600.
Different cases represent different initial parameters $\mu_0$ and $\sigma_0$ in CALA, which are (3,5), (3,6), (−10,5), (−10,7), (10,5), (10,7), and (7,3) respectively.
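For concreteness, here is a minimal sketch of the stationary random environment referred to above: the reward $\beta \in [0,1]$ for an action $\alpha$ is drawn from a fixed, time-invariant distribution $D_\alpha$ whose mean peaks at the (unknown) optimum. The target location, the Gaussian reward shape, the noise level, and the name environment are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a stationary random environment: the reward law D_alpha
# for each action alpha is fixed over time. Shapes and constants are
# illustrative assumptions only.

def environment(alpha, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    target = 2.5                                         # the unknown optimum (illustration only)
    mean_reward = np.exp(-0.5 * (alpha - target) ** 2)   # expected reward peaks at the optimum
    noise = rng.normal(0.0, 0.1)                         # stationary: same noise law at every step
    return float(np.clip(mean_reward + noise, 0.0, 1.0))  # beta = 1 is the maximal reward
```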
Acknowledgments
This research work is funded by the National Science Foundation of China (61271316), Key Laboratory for Shanghai Integrated Information Security Management Technology Research, and Chinese National Engineering Laboratory for Information Content Analysis Technology.