Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing

Liu, Larkin

doi:10.1007/978-3-031-73497-7_29

Larkin Liu

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14967)

Included in the following conference series:

EPIA Conference on Artificial Intelligence

Abstract

We investigate Nash equilibrium learning in a competitive Markov Game (MG) environment in which multiple agents compete and multiple Nash equilibria can exist. In particular, in an oligopolistic dynamic pricing environment, exact Nash equilibria are difficult to obtain due to the curse of dimensionality. We develop a new model-free method to find approximate Nash equilibria. Gradient-free black-box optimization is applied to estimate $\epsilon$, the maximum reward advantage an agent can gain by unilaterally deviating from any joint policy, and to estimate the $\epsilon$-minimizing policy for any given state. The policy-to-$\epsilon$ correspondence and the mapping from state to $\epsilon$-minimizing policy are each represented by a neural network, the latter being the Nash Policy Net. During batch updates, we perform Nash Q-learning on the system, adjusting the action probabilities using the Nash Policy Net. We demonstrate that an approximate Nash equilibrium can be learned, particularly in the dynamic pricing domain, where exact solutions are often intractable.
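
To make the quantity $\epsilon$ concrete, here is a minimal sketch that is not the chapter's method. It assumes a single-state (normal-form) game whose payoff tensors are fully known, and it computes $\epsilon$ exactly as the largest expected-reward gain any one player can obtain by deviating unilaterally from a joint mixed policy. The function name `epsilon_of_joint_policy` and the prisoner's dilemma payoffs are illustrative assumptions. The chapter itself works model-free in a Markov game and estimates $\epsilon$ with black-box optimization and neural networks.

```python
import numpy as np


def epsilon_of_joint_policy(payoffs, policies):
    """Return the largest expected-reward gain any one player can obtain
    by deviating unilaterally from the given joint (mixed) policy.

    payoffs  : list of n arrays, each of shape (A_1, ..., A_n);
               payoffs[i][a_1, ..., a_n] is player i's reward.
    policies : list of n probability vectors, policies[i] of length A_i.
    """
    n = len(policies)
    policies = [np.asarray(p, dtype=float) for p in policies]
    eps = 0.0
    for i in range(n):
        # Average out every opponent's action, leaving player i's expected
        # reward for each of its own pure actions. Contracting the highest
        # axis first keeps the remaining axis indices valid.
        own_action_values = np.asarray(payoffs[i], dtype=float)
        for j in range(n - 1, -1, -1):
            if j != i:
                own_action_values = np.tensordot(
                    own_action_values, policies[j], axes=([j], [0])
                )
        current_value = float(own_action_values @ policies[i])
        best_deviation = float(own_action_values.max())
        eps = max(eps, best_deviation - current_value)
    return eps


# Hypothetical two-player example (prisoner's dilemma):
# action 0 = cooperate, action 1 = defect.
row = np.array([[-1.0, -3.0], [0.0, -2.0]])  # row player's rewards
col = row.T                                  # symmetric game
cooperate = np.array([1.0, 0.0])
defect = np.array([0.0, 1.0])

print(epsilon_of_joint_policy([row, col], [cooperate, cooperate]))  # 1.0
print(epsilon_of_joint_policy([row, col], [defect, defect]))        # 0.0
```

A joint policy with $\epsilon = 0$ is an exact Nash equilibrium (mutual defection here), while a small positive $\epsilon$ marks an approximate one. In this toy setting the best deviation can be enumerated directly, which is exactly what becomes infeasible as the state and action spaces grow.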

Author information

Authors and Affiliations

Technical University of Munich, Munich, Germany
Larkin Liu

Corresponding author

Correspondence to Larkin Liu.

Editor information

Editors and Affiliations

University of Minho, Braga, Portugal
Manuel Filipe Santos, José Machado, Paulo Novais, Paulo Cortez
Polytechnic Institute of Viana do Castelo, Viana do Castelo, Portugal
Pedro Miguel Moreira

About this paper

Cite this paper

Liu, L. (2025). Approximate Nash Equilibrium Learning for n-Player Markov Games in Dynamic Pricing. In: Santos, M.F., Machado, J., Novais, P., Cortez, P., Moreira, P.M. (eds) Progress in Artificial Intelligence. EPIA 2024. Lecture Notes in Computer Science, vol 14967. Springer, Cham. https://doi.org/10.1007/978-3-031-73497-7_29

DOI: https://doi.org/10.1007/978-3-031-73497-7_29
Published: 16 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73496-0
Online ISBN: 978-3-031-73497-7
eBook Packages: Computer Science, Computer Science (R0)
