Abstract
The problem of optimal decision making based on Markovian learning methods is investigated. Stochastic and deterministic learning methods are described, and the decision-making problem is formulated. The problem of Markovian learning for an agent making optimal decisions in a deterministic environment is solved using the example of finding the shortest path in a cellular space. A mathematical formulation of the decision-making problem with deterministic and stochastic strategies is provided, based on recurrent estimation of criterion functions for the utility of states and the efficiency of the agent's action options. The criterion function values are estimated in real time by reinforcement Q-learning and require no model of the environment, which is important for practical decision making under uncertainty. Algorithmic and software tools for modelling decision making under uncertainty are developed, and computer simulation results for the decision-making process in a cellular space are presented and discussed.
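The approach described in the abstract can be illustrated with a minimal sketch: tabular Q-learning of the shortest path in a small grid ("cellular space"), with an epsilon-greedy stochastic strategy during learning and a greedy deterministic strategy after convergence. The grid size, reward scheme, and all parameter values below are illustrative assumptions, not the authors' implementation.

```python
import random

# Illustrative tabular Q-learning for shortest-path search in a cellular
# (grid) space. Grid size, rewards, and hyperparameters are assumptions.

ROWS, COLS = 4, 4
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Deterministic environment: move if inside the grid, else stay put."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < ROWS and 0 <= c < COLS:
        state = (r, c)
    # A cost of -1 per move makes the shortest path the optimal policy.
    reward = 0.0 if state == GOAL else -1.0
    return state, reward, state == GOAL

def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Model-free, recurrent (incremental) estimation of Q-values."""
    rng = random.Random(seed)
    Q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS)
         for a in range(len(ACTIONS))}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Epsilon-greedy stochastic strategy: explore with prob. epsilon.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda x: Q[(state, x)])
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = max(Q[(nxt, x)] for x in range(len(ACTIONS)))
            # Q-learning update: no environment model is required.
            Q[(state, a)] += alpha * (reward + gamma * best_next - Q[(state, a)])
            state = nxt
    return Q

def greedy_path(Q, limit=50):
    """Follow the learned deterministic (greedy) strategy from START."""
    state, path = START, [START]
    while state != GOAL and len(path) < limit:
        a = max(range(len(ACTIONS)), key=lambda x: Q[(state, x)])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path

if __name__ == "__main__":
    Q = train()
    path = greedy_path(Q)
    # The shortest path from (0,0) to (3,3) on a 4x4 grid takes 6 moves.
    print(len(path) - 1)
```

Because the update uses only observed transitions and rewards, the agent needs no transition model of the cellular space, matching the model-free, real-time estimation emphasized in the abstract.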
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kravets, P. et al. (2022). Markovian Learning Methods in Decision-Making Systems. In: Babichev, S., Lytvynenko, V. (eds) Lecture Notes in Computational Intelligence and Decision Making. ISDMCI 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 77. Springer, Cham. https://doi.org/10.1007/978-3-030-82014-5_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82013-8
Online ISBN: 978-3-030-82014-5
eBook Packages: Intelligent Technologies and Robotics (R0)