Abstract
Many big data applications involve data with abundant latent structure. Online learning algorithms for action selection in the literature typically rely on exponential weighting. However, such strategies are provably suboptimal, computationally inefficient, or both. The difficulty of the action selection problem stems from the combinatorial structure of the action set \(\mathscr{A}\), whose size is \(|\mathscr{A}| = n^d\), where n is the number of instances and d is the dimensionality of the problem. Here, we develop an online algorithm for structured big data by adapting techniques from discrete optimization and approximation algorithms known as ‘extended formulations’. Such formulations exploit the underlying geometry of the action set, allowing efficient exploration of feasible actions with a guaranteed logarithmic dependence on the dimensionality. An empirical evaluation on simulated and real datasets shows that our method outperforms state-of-the-art online algorithms over combinatorial action sets.
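To make the cost of exponential weighting concrete, the following is a minimal sketch of the baseline strategy the abstract criticizes, not the extended-formulation method itself. It assumes a full-information setting (the loss of every action is observed each round) and explicitly enumerates all \(n^d\) actions; the function names, the `loss_fn` callback, and the learning rate `eta` are hypothetical choices made for illustration.

```python
import itertools
import math
import random


def enumerate_actions(n, d):
    """List all n**d actions: one of n choices in each of d coordinates."""
    return list(itertools.product(range(n), repeat=d))


def hedge_probabilities(cum_losses, eta):
    """Exponential weights: p(a) is proportional to exp(-eta * cumulative loss of a)."""
    m = min(cum_losses)  # shift by the minimum for numerical stability
    weights = [math.exp(-eta * (loss - m)) for loss in cum_losses]
    total = sum(weights)
    return [w / total for w in weights]


def run_hedge(n, d, loss_fn, rounds, eta, seed=0):
    """Run exponential weighting over the explicitly enumerated action set.

    loss_fn(action, t) is an assumed callback returning a loss in [0, 1].
    Each round touches every one of the n**d actions, which is what makes
    this approach impractical for large n and d.
    """
    rng = random.Random(seed)
    actions = enumerate_actions(n, d)
    cum_losses = [0.0] * len(actions)
    total_loss = 0.0
    for t in range(rounds):
        probs = hedge_probabilities(cum_losses, eta)
        chosen = rng.choices(range(len(actions)), weights=probs, k=1)[0]
        total_loss += loss_fn(actions[chosen], t)
        # Full-information update: every action's loss is accumulated.
        for i, action in enumerate(actions):
            cum_losses[i] += loss_fn(action, t)
    return actions, cum_losses, total_loss
```

Even for modest values such as n = 10 and d = 10, the list of weights would hold ten billion entries, which is the kind of blow-up that motivates working with a compact description of the action set instead.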
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ghosh, S., Prügel-Bennett, A. (2017). Extended Formulations for Online Action Selection on Big Action Sets. In: Angelov, P., Manolopoulos, Y., Iliadis, L., Roy, A., Vellasco, M. (eds) Advances in Big Data. INNS 2016. Advances in Intelligent Systems and Computing, vol 529. Springer, Cham. https://doi.org/10.1007/978-3-319-47898-2_17
DOI: https://doi.org/10.1007/978-3-319-47898-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47897-5
Online ISBN: 978-3-319-47898-2
eBook Packages: Engineering, Engineering (R0)