The Shortest Path Problem Under Partial Monitoring

  • Conference paper
Learning Theory (COLT 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4005)

Abstract

The on-line shortest path problem is considered under partial monitoring scenarios. At each round, a decision maker has to choose a path between two distinguished vertices of a weighted directed acyclic graph whose edge weights can change in an arbitrary (adversarial) way, with the goal of keeping the loss of the chosen path (defined as the sum of the weights of its edges) small. In the multi-armed bandit setting, after choosing a path, the decision maker learns only the weights of those edges that belong to the chosen path. For this scenario, an algorithm is given whose average cumulative loss in n rounds exceeds that of the best path, matched off-line to the entire sequence of the edge weights, by a quantity that is proportional to \(1/\sqrt{n}\) and depends only polynomially on the number of edges of the graph. The algorithm can be implemented with complexity that is linear in the number of rounds n and in the number of edges. This result improves on earlier bandit algorithms, whose performance bounds either depend exponentially on the number of edges or converge to zero at a slower rate than \(O(1/\sqrt{n})\). An extension to the so-called label efficient setting is also given, where the decision maker is informed about the weight of the chosen path only with probability ε < 1. Applications to routing in packet-switched networks, along with simulation results, are also presented.
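To make the bandit setting above concrete, the following is a minimal sketch of a naive baseline, not the algorithm of the paper: it treats every source-to-sink path as a separate arm and runs a standard Exp3-style exponential-weights update on importance-weighted loss estimates. Because the number of paths can grow exponentially with the size of the graph, this is the kind of approach the abstract contrasts against. The toy graph `EDGES`, the helper names `all_paths` and `exp3_over_paths`, and the values of `eta` and `gamma` are all illustrative assumptions.

```python
import math
import random

# Hypothetical toy DAG: edges are (tail, head); "s" is the source, "t" the sink.
EDGES = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]


def all_paths(edges, source, sink):
    """Enumerate every source-to-sink path as a tuple of edges (tiny DAGs only)."""
    if source == sink:
        return [()]
    paths = []
    for edge in edges:
        if edge[0] == source:
            for rest in all_paths(edges, edge[1], sink):
                paths.append((edge,) + rest)
    return paths


def exp3_over_paths(edges, weight_rounds, eta, gamma, rng):
    """Exp3 with one arm per path; returns (learner_loss, best_fixed_path_loss)."""
    paths = all_paths(edges, "s", "t")
    max_len = max(len(p) for p in paths)  # scales path losses into [0, 1]
    est = [0.0] * len(paths)              # importance-weighted loss estimates
    true_cum = [0.0] * len(paths)         # full-information totals, evaluation only
    learner_loss = 0.0
    for weights in weight_rounds:
        low = min(est)                    # shift for numerical stability
        w = [math.exp(-eta * (e - low)) for e in est]
        total = sum(w)
        # Mix with the uniform distribution so every path keeps positive probability.
        probs = [(1 - gamma) * wi / total + gamma / len(paths) for wi in w]
        i = rng.choices(range(len(paths)), weights=probs)[0]
        # Bandit feedback: only edge weights on the chosen path are used.
        loss = sum(weights[e] for e in paths[i])
        learner_loss += loss
        est[i] += (loss / max_len) / probs[i]
        for j, p in enumerate(paths):
            true_cum[j] += sum(weights[e] for e in p)
    return learner_loss, min(true_cum)


# Usage: random edge weights in [0, 1]; eta and gamma are untuned illustrative values.
rng = random.Random(0)
n = 2000
rounds = [{e: rng.random() for e in EDGES} for _ in range(n)]
learner, best = exp3_over_paths(EDGES, rounds, eta=0.05, gamma=0.1, rng=rng)
print("average excess loss per round:", (learner - best) / n)
```

The printed quantity is the average cumulative loss of the learner minus that of the best fixed path in hindsight, which is the performance measure the abstract bounds by a term proportional to \(1/\sqrt{n}\). The sketch discards the per-edge information that the bandit setting actually provides, since it only uses the total loss of the chosen path.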

This research was supported in part by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences, the Mobile Innovation Center of Hungary, by the Natural Sciences and Engineering Research Council (NSERC) of Canada, and by the Hungarian Inter-University Center for Telecommunications and Informatics (ETIK).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

György, A., Linder, T., Ottucsák, G. (2006). The Shortest Path Problem Under Partial Monitoring. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_35

  • DOI: https://doi.org/10.1007/11776420_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35294-5

  • Online ISBN: 978-3-540-35296-9

  • eBook Packages: Computer Science (R0)
