The Shortest Path Problem Under Partial Monitoring

György, András; Linder, Tamás; Ottucsák, György

doi:10.1007/11776420_35

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4005)

Included in the following conference series:

International Conference on Computational Learning Theory

Abstract

The on-line shortest path problem is considered under partial monitoring scenarios. In each round, a decision maker must choose a path between two distinguished vertices of a weighted directed acyclic graph whose edge weights can change in an arbitrary (adversarial) way, with the goal of keeping the loss of the chosen path (defined as the sum of the weights of its edges) small. In the multi-armed bandit setting, after choosing a path, the decision maker learns only the weights of those edges that belong to the chosen path. For this scenario, an algorithm is given whose average cumulative loss in n rounds exceeds that of the best path, matched off-line to the entire sequence of the edge weights, by a quantity that is proportional to \(1/\sqrt{n}\) and depends only polynomially on the number of edges of the graph. The algorithm can be implemented with complexity that is linear in the number of rounds n and in the number of edges. This result improves on earlier bandit algorithms, whose performance bounds either depend exponentially on the number of edges or converge to zero at a slower rate than \(O(1/\sqrt{n})\). An extension to the so-called label efficient setting is also given, where the decision maker is informed about the weight of the chosen path only with probability ε < 1. Applications to routing in packet switched networks, along with simulation results, are also presented.
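To make the bandit feedback model in the abstract concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes edge losses in [0, 1], enumerates all paths explicitly (so its cost can grow exponentially with the graph size, unlike the linear-complexity algorithm the abstract describes), and uses a generic exponentially weighted sampler with importance-weighted edge estimates. The names `enumerate_paths`, `bandit_shortest_path`, `loss_fn`, `eta`, and `gamma` are hypothetical and chosen only for this example.

```python
import math
import random


def enumerate_paths(edges, source, target):
    """Return every source-to-target path as a tuple of edge indices (DAG assumed)."""
    out = {}
    for idx, (u, v) in enumerate(edges):
        out.setdefault(u, []).append((idx, v))
    paths = []

    def walk(node, prefix):
        if node == target:
            paths.append(tuple(prefix))
            return
        for idx, nxt in out.get(node, []):
            walk(nxt, prefix + [idx])

    walk(source, [])
    return paths


def bandit_shortest_path(edges, source, target, loss_fn, n_rounds,
                         eta=0.1, gamma=0.1, seed=0):
    """Play n_rounds and return the average loss per round.

    loss_fn(t) returns the list of edge losses for round t (assumed in [0, 1]).
    Only the losses of edges on the chosen path are read, which mimics the
    multi-armed bandit feedback described in the abstract.
    """
    rng = random.Random(seed)
    paths = enumerate_paths(edges, source, target)
    est = [0.0] * len(edges)  # cumulative importance-weighted edge loss estimates
    total = 0.0
    for t in range(n_rounds):
        # Exponential weights on the estimated cumulative loss of each path.
        scores = [sum(est[e] for e in p) for p in paths]
        best = min(scores)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (s - best)) for s in scores]
        z = sum(weights)
        # Mix with uniform exploration so every edge keeps positive probability.
        probs = [(1 - gamma) * w / z + gamma / len(paths) for w in weights]
        chosen = paths[rng.choices(range(len(paths)), weights=probs)[0]]
        losses = loss_fn(t)
        total += sum(losses[e] for e in chosen)
        for e in chosen:
            # Probability that edge e lies on the sampled path.
            q_e = sum(pr for p, pr in zip(paths, probs) if e in p)
            est[e] += losses[e] / q_e  # unbiased estimate of the edge loss
    return total / n_rounds
```

On a toy graph with edges `[(0, 1), (1, 3), (0, 2), (2, 3)]` and fixed losses that favour the path through vertex 1, calling `bandit_shortest_path(edges, 0, 3, lambda t: [0.1, 0.1, 0.9, 0.9], 2000)` should return an average loss close to that of the better path, with the gap due mostly to the uniform exploration term.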

This research was supported in part by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences, the Mobile Innovation Center of Hungary, by the Natural Sciences and Engineering Research Council (NSERC) of Canada, and by the Hungarian Inter-University Center for Telecommunications and Informatics (ETIK).

Author information

Authors and Affiliations

Informatics Laboratory, Computer and Automation Research Institute of the Hungarian Academy of Sciences, Lágymányosi u. 11, Budapest, H-1111, Hungary
András György & Tamás Linder
Department of Mathematics and Statistics, Queen’s University, Kingston, Ontario, K7L 3N6, Canada
Tamás Linder
Department of Computer Science and Information Theory, Budapest University of Technology and Economics, Magyar Tudósok Körútja 2, Budapest, H-1117, Hungary
György Ottucsák

Editor information

Editors and Affiliations

ICREA and Department of Economics, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, 08005, Barcelona, Spain
Gábor Lugosi
Ruhr-Universität Bochum, Germany
Hans Ulrich Simon

About this paper

Cite this paper

György, A., Linder, T., Ottucsák, G. (2006). The Shortest Path Problem Under Partial Monitoring. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_35

DOI: https://doi.org/10.1007/11776420_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9
eBook Packages: Computer Science, Computer Science (R0)
