Deep Reinforcement Learning for Multi-satellite Collection Scheduling

Lam, Jason T.; Rivest, François; Berger, Jean

doi:10.1007/978-3-030-34500-6_13

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11934)

Included in the following conference series:

International Conference on Theory and Practice of Natural Computing

Abstract

Multi-satellite scheduling often involves generating a fixed number of candidate task schedules, evaluating them all, and selecting the one that yields the highest expected reward. This approach, however accurate, is nearly impossible to scale to large, realistic problems because of combinatorial explosion. Furthermore, regenerating solutions each time the tasks change is costly and slow. To address these issues, we adapt a deep reinforcement learning approach that automatically learns a policy for multi-satellite scheduling, as well as a representation of the problems. The algorithm learns a heuristic that selects the next best task given the current problem and partial solution, avoiding any search when constructing the schedule. Although preliminary results in learning a satellite collection scheduling heuristic still fail to outperform baseline domain-specific methods, the trained system may be fast enough to generate decisions in near real time.
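
The search-free construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Task` fields, the single-satellite feasibility check, and the hand-written `reward_rate` scorer are all hypothetical stand-ins. In the paper's setting the scoring function would be a learned value estimate rather than a fixed rule. The sketch shows only the control flow: score every feasible task against the current partial schedule, append the best one, and repeat without backtracking.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Task:
    """Hypothetical collection task with a time window (arbitrary time units)."""
    name: str
    start: int      # earliest time the collection may begin
    end: int        # latest time the collection must finish
    duration: int
    reward: float


# A scorer maps (problem, partial schedule, candidate task) to a number.
# Here it stands in for a learned heuristic; the signature is an assumption.
ScoreFn = Callable[[Sequence[Task], List[Task], Task], float]


def feasible(partial: List[Task], task: Task) -> bool:
    """Check whether `task` can be appended to `partial` on one satellite.

    Simplifying assumption: tasks run in the order scheduled, each starting
    as soon as both the satellite and the task window allow.
    """
    t = 0
    for done in partial:
        t = max(t, done.start) + done.duration
    return max(t, task.start) + task.duration <= task.end


def build_schedule(tasks: Sequence[Task], score: ScoreFn) -> List[Task]:
    """Construct a schedule greedily: one scored decision per step, no search."""
    partial: List[Task] = []
    remaining = list(tasks)
    while True:
        candidates = [t for t in remaining if feasible(partial, t)]
        if not candidates:
            return partial
        best = max(candidates, key=lambda t: score(tasks, partial, t))
        partial.append(best)
        remaining.remove(best)


def reward_rate(problem: Sequence[Task], partial: List[Task], task: Task) -> float:
    """Placeholder scorer: reward per unit of duration (not a learned policy)."""
    return task.reward / task.duration


if __name__ == "__main__":
    demo = [
        Task("a", 0, 4, 2, 6.0),
        Task("b", 0, 3, 3, 3.0),
        Task("c", 2, 6, 2, 5.0),
    ]
    print([t.name for t in build_schedule(demo, reward_rate)])
```

Each step costs one scoring pass over the remaining tasks, so a schedule is built in at most quadratic time in the number of tasks. This is the property behind the abstract's near-real-time claim, although the quality of the result depends entirely on the scorer.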

Author information

Authors and Affiliations

School of Computing, Queen’s University, Kingston, ON, Canada
Jason T. Lam & François Rivest
Department of Mathematics and Computer Science, Royal Military College of Canada, Kingston, ON, Canada
François Rivest
Defence Research Development Canada, Valcartier, QC, Canada
Jean Berger

Corresponding author

Correspondence to Jean Berger.

Editor information

Editors and Affiliations

Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Royal Military College of Canada, Kingston, ON, Canada
Geoffrey Pond
University of Extremadura, Cáceres, Spain
Miguel A. Vega-Rodríguez

Cite this paper

Lam, J.T., Rivest, F., Berger, J. (2019). Deep Reinforcement Learning for Multi-satellite Collection Scheduling. In: Martín-Vide, C., Pond, G., Vega-Rodríguez, M. (eds) Theory and Practice of Natural Computing. TPNC 2019. Lecture Notes in Computer Science, vol 11934. Springer, Cham. https://doi.org/10.1007/978-3-030-34500-6_13

DOI: https://doi.org/10.1007/978-3-030-34500-6_13
Published: 22 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34499-3
Online ISBN: 978-3-030-34500-6
eBook Packages: Computer Science, Computer Science (R0)