Abstract
Multi-satellite scheduling often involves generating a fixed number of candidate task schedules, evaluating them all, and selecting the schedule that yields the highest expected reward. This approach, however accurate, is nearly impossible to scale to large, realistic problems because of combinatorial explosion. Furthermore, regenerating solutions each time the tasks change is costly and slow. To address these issues, we adapt a deep reinforcement learning method that automatically learns both a policy for multi-satellite scheduling and a representation of the problems. The algorithm learns a heuristic that selects the next best task given the current problem and partial solution, avoiding any search while the schedule is constructed. Although preliminary results in learning a heuristic for satellite collection scheduling still fail to outperform domain-specific baseline methods, the trained system may be fast enough to generate decisions in near real time.
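The search-free construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the task tuple layout, the single-resource overlap check in `is_feasible`, and the `reward_density` scoring function are all assumptions made for illustration. In the paper's setting, the scoring function would be a learned value estimate over a learned problem representation, and feasibility would account for multiple satellites.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical task layout (an assumption): (task_id, start, end, reward).
Task = Tuple[int, float, float, float]
ScoreFn = Callable[[Task, Sequence[Task]], float]


def is_feasible(task: Task, schedule: Sequence[Task]) -> bool:
    """Simplified single-resource check: the task's time window must not
    overlap any task already in the partial schedule."""
    _, start, end, _ = task
    return all(end <= s or start >= e for _, s, e, _ in schedule)


def greedy_schedule(tasks: Sequence[Task], score: ScoreFn) -> List[Task]:
    """Build a schedule one task at a time, with no search or backtracking.

    At each step, every feasible remaining task is scored against the
    current partial schedule and the highest-scoring one is appended.
    """
    schedule: List[Task] = []
    remaining = list(tasks)
    while True:
        candidates = [t for t in remaining if is_feasible(t, schedule)]
        if not candidates:
            break
        best = max(candidates, key=lambda t: score(t, schedule))
        schedule.append(best)
        remaining.remove(best)
    return schedule


def reward_density(task: Task, schedule: Sequence[Task]) -> float:
    """Hand-written stand-in for a learned scoring function (an assumption):
    reward per unit of time, ignoring the partial schedule."""
    _, start, end, reward = task
    return reward / (end - start)


if __name__ == "__main__":
    demo_tasks = [(0, 0.0, 10.0, 5.0), (1, 0.0, 4.0, 4.0), (2, 5.0, 9.0, 3.0)]
    chosen = greedy_schedule(demo_tasks, reward_density)
    print([task_id for task_id, _, _, _ in chosen])
```

Because each decision requires only one scoring pass over the remaining tasks, the cost of building a schedule grows with the number of tasks rather than with the number of possible schedules, which is what makes near-real-time decisions plausible.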
Copyright information
© 2019 Crown
About this paper
Cite this paper
Lam, J.T., Rivest, F., Berger, J. (2019). Deep Reinforcement Learning for Multi-satellite Collection Scheduling. In: Martín-Vide, C., Pond, G., Vega-Rodríguez, M. (eds) Theory and Practice of Natural Computing. TPNC 2019. Lecture Notes in Computer Science, vol 11934. Springer, Cham. https://doi.org/10.1007/978-3-030-34500-6_13
DOI: https://doi.org/10.1007/978-3-030-34500-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34499-3
Online ISBN: 978-3-030-34500-6
eBook Packages: Computer Science, Computer Science (R0)