ABSTRACT
There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive strategies an individual can use that give strong guarantees on performance even in adversarially-changing environments. There has also been substantial work on analyzing properties of Nash equilibria in routing games. In this paper, we consider the question: if each player in a routing game uses a no-regret strategy, will behavior converge to a Nash equilibrium? In general games the answer to this question is known to be no in a strong sense, but routing games have substantially more structure.

In this paper we show that in the Wardrop setting of multicommodity flow and infinitesimal agents, behavior will approach Nash equilibrium (formally, on most days, the cost of the flow will be close to the cost of the cheapest paths possible given that flow) at a rate that depends polynomially on the players' regret bounds and the maximum slope of any latency function. We also show that price-of-anarchy results may be applied to these approximate equilibria, and also consider the finite-size (non-infinitesimal) load-balancing model of Azar [2].
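The notion of approaching equilibrium used above (the cost of the flow is close to the cost of the cheapest path given that flow) can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the paper's construction: the two-link network, the latency functions in `latencies`, the learning rate `eta`, and the number of days are all hypothetical, and the Hedge (multiplicative weights) update is just one example of a regret-minimizing rule, run here with a fixed rate chosen for illustration.

```python
import math

def latencies(flow):
    """Hypothetical latency functions for two parallel links (an assumption)."""
    f1, f2 = flow
    return [f1, 0.5 + 0.5 * f2]

def nash_gap(flow):
    """Average latency of the flow minus the latency of the cheapest link."""
    lat = latencies(flow)
    average_cost = sum(f * l for f, l in zip(flow, lat))
    return average_cost - min(lat)

def simulate(days=500, eta=0.5):
    """Run a Hedge-style update on a unit mass of infinitesimal agents."""
    flow = [0.5, 0.5]  # start with the population split evenly
    gaps = []
    for _ in range(days):
        gaps.append(nash_gap(flow))
        lat = latencies(flow)
        # Multiplicative-weights step: shift mass away from slower links.
        weights = [f * math.exp(-eta * l) for f, l in zip(flow, lat)]
        total = sum(weights)
        flow = [w / total for w in weights]
    return flow, gaps

final_flow, gaps = simulate()
print("final flow:", final_flow)
print("gap on first day:", gaps[0])
print("gap on last day:", gaps[-1])
```

In this toy instance the Wardrop equilibrium places two thirds of the flow on the first link, where both links have equal latency, so the gap computed by `nash_gap` shrinks toward zero as the days pass.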
- B. Awerbuch and R. Kleinberg. Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches. In Proceedings of the 36th ACM Symposium on Theory of Computing, 2004.
- Y. Azar. On-line Load Balancing. Online Algorithms - The State of the Art, pages 178--195, Springer, 1998.
- M. Beckmann, C. B. McGuire, and C. B. Winsten. Studies in the Economics of Transportation. Yale University Press, 1956.
- P. Berenbrink, T. Friedetzky, L. A. Goldberg, P. Goldberg, Z. Hu, and R. Martin. Distributed selfish load balancing. In Proc. 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2006.
- N. Cesa-Bianchi, Y. Freund, D. Helmbold, D. Haussler, R. Schapire, and M. Warmuth. How to use expert advice. JACM, 44(3):427--485, 1997.
- A. Czumaj, P. Krysta, and B. Vöcking. Selfish traffic allocation for server farms. In Proc. 34th Annual ACM Symp. Theory of Computing, pages 287--296, 2002.
- A. Czumaj and B. Vöcking. Tight bounds on worst case equilibria. In Proc. 13th Annual ACM-SIAM Symp. Discrete Algorithms, pages 413--420, 2002.
- E. Even-Dar, A. Kesselman, and Y. Mansour. Convergence time to Nash equilibria. In 30th International Conference on Automata, Languages and Programming (ICALP), pages 502--513, 2003.
- E. Even-Dar and Y. Mansour. Fast convergence of selfish rerouting. In Proc. 16th Annual ACM-SIAM Symp. Discrete Algorithms, pages 772--781, 2005.
- A. Fabrikant, A. Luthra, E. Maneva, C. H. Papadimitriou, and S. Shenker. On a network creation game. In Proceedings of the 22nd Annual Symposium on Principles of Distributed Computing (PODC), pages 347--351, 2003.
- S. Fischer, H. Raecke, and B. Vöcking. Fast convergence to Wardrop equilibria by adaptive sampling methods. In Proceedings of 38th ACM Symposium on Theory of Computing (STOC), 2006.
- S. Fischer and B. Vöcking. On the evolution of selfish routing. In Proc. 12th Annual European Symposium on Algorithms (ESA), pages 323--334, 2004.
- S. Fischer and B. Vöcking. Adaptive routing with stale information. In Proceedings of the 24th Annual ACM Symposium on Principles of Distributed Computing (PODC), 2005.
- D. P. Foster and R. V. Vohra. Calibrated learning and correlated equilibrium. Games and Economic Behavior, 1997.
- Y. Freund and R. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., 55(1):119--139, 1997.
- Y. Freund and R. Schapire. Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29:79--103, 1999.
- P. Goldberg. Bounds for the convergence rate of randomized local search in multiplayer games, uniform resource sharing game. In Proceedings of the Twenty-Third PODC, pages 131--144, 2004.
- J. Hannan. Approximation to Bayes risk in repeated play. In M. Dresher, A. Tucker, and P. Wolfe, editors, Contributions to the Theory of Games, volume III, pages 97--139. Princeton University Press, 1957.
- S. Hart and A. Mas-Colell. A simple adaptive procedure leading to correlated equilibrium. Econometrica, 2000.
- A. Kalai and S. Vempala. Efficient algorithms for on-line optimization. In Journal of Computer and System Sciences, 71(3):291--307, 2005.
- E. Koutsoupias and C. H. Papadimitriou. Worst-case equilibria. In Proceedings of 16th STACS, pages 404--413, 1999.
- N. Littlestone and M. K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212--261, 1994.
- H. McMahan and A. Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In Proc. 17th Annual Conference on Learning Theory (COLT), pages 109--123, 2004.
- I. Milchtaich. Congestion games with player-specific payoff functions. Games and Economic Behavior, 13:111--124, 1996.
- T. Roughgarden. On the severity of Braess's paradox: Designing networks for selfish users is hard. In 42nd Annual IEEE Symposium on Foundations of Computer Science, 2001.
- T. Roughgarden and E. Tardos. How bad is selfish routing? Journal of the ACM, 49(2):236--259, 2002.
- E. Takimoto and M. K. Warmuth. Path kernels and multiplicative updates. In Proc. 15th Annual Conference on Learning Theory (COLT), 2002.
- J. G. Wardrop. Some theoretical aspects of road traffic research. In Proceedings of the Institute of Civil Engineers, Pt. II, volume 1, pages 325--378, 1952.
- M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning, pages 928--936, 2003.
- M. Zinkevich. Theoretical guarantees for algorithms in multi-agent settings. Technical Report CMU-CS-04-161, Carnegie Mellon University, 2004.
Index Terms
- Routing without regret: on convergence to Nash equilibria of regret-minimizing algorithms in routing games