Deep Reinforcement Learning for Multi-satellite Collection Scheduling

  • Conference paper
Theory and Practice of Natural Computing (TPNC 2019)

Abstract

Multi-satellite scheduling often involves generating a fixed number of candidate task schedules, evaluating all of them, and selecting the one that yields the highest expected reward. This approach, however accurate, is nearly impossible to scale to large, realistic problems because of combinatorial explosion. Furthermore, regenerating solutions each time the tasks change is costly and slow. To address these issues, we adapt a deep reinforcement learning solution that automatically learns both a policy for multi-satellite scheduling and a representation of the problems. The algorithm learns a heuristic that selects the next best task given the current problem and partial solution, avoiding any search while building the schedule. Although preliminary results in learning a collection satellite scheduling heuristic still fail to outperform baseline domain-specific methods, the trained system may be fast enough to generate decisions in near real time.
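
As a rough illustration of the search-free construction the abstract describes, the sketch below builds a schedule by repeatedly appending the highest-scoring feasible task. It is not the paper's model: the Task fields, the single time budget (horizon), and the reward_density score are all assumptions made for this example, and a learned system would replace reward_density with a trained value function of the problem and partial solution.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Task:
    name: str
    reward: float    # value of collecting this task (hypothetical units)
    duration: float  # time the task consumes (hypothetical units, must be > 0)


# A score maps (task, partial schedule, remaining time) to a number.
# A learned heuristic would plug in here; this signature is an assumption.
ScoreFn = Callable[[Task, Sequence[Task], float], float]


def reward_density(task: Task, partial: Sequence[Task], remaining: float) -> float:
    """Hand-written stand-in score: reward per unit of time."""
    return task.reward / task.duration


def build_schedule(tasks: Sequence[Task], horizon: float, score: ScoreFn) -> List[Task]:
    """Append the best-scoring feasible task until nothing fits; no search or backtracking."""
    schedule: List[Task] = []
    remaining = horizon
    candidates = list(tasks)
    while True:
        # Feasibility here is only "fits in the remaining time budget".
        feasible = [t for t in candidates if t.duration <= remaining]
        if not feasible:
            return schedule
        best = max(feasible, key=lambda t: score(t, schedule, remaining))
        schedule.append(best)
        candidates.remove(best)
        remaining -= best.duration


tasks = [
    Task("A", reward=10.0, duration=5.0),
    Task("B", reward=6.0, duration=2.0),
    Task("C", reward=4.0, duration=4.0),
]
plan = build_schedule(tasks, horizon=8.0, score=reward_density)
print([t.name for t in plan])
```

Each decision costs one pass over the remaining tasks, which is why a construction of this kind can answer quickly once the score is available. The quality of the schedule depends entirely on the score function.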




Corresponding author

Correspondence to Jean Berger.


Copyright information

© 2019 Crown

About this paper


Cite this paper

Lam, J.T., Rivest, F., Berger, J. (2019). Deep Reinforcement Learning for Multi-satellite Collection Scheduling. In: Martín-Vide, C., Pond, G., Vega-Rodríguez, M. (eds) Theory and Practice of Natural Computing. TPNC 2019. Lecture Notes in Computer Science, vol 11934. Springer, Cham. https://doi.org/10.1007/978-3-030-34500-6_13


  • DOI: https://doi.org/10.1007/978-3-030-34500-6_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34499-3

  • Online ISBN: 978-3-030-34500-6

  • eBook Packages: Computer Science, Computer Science (R0)
