Cooperative Dynamic Domain Reduction

  • Conference paper
Distributed Autonomous Robotic Systems

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 9)

Abstract

Unmanned vehicles (UxVs) are increasingly deployed in a wide range of challenging scenarios, including disaster response, surveillance, and search and rescue. This paper is motivated by scenarios where a heterogeneous swarm of UxVs is tasked with completing a variety of objectives, some of which may require cooperation among vehicles of varying capabilities. Our goal is to develop an approach that enables vehicles to aid each other in servicing these objectives in a distributed and autonomous fashion. To address this problem, we build on Dynamic domain reduction for multi-agent planning (DDRP), a framework that utilizes model-based hierarchical reinforcement learning and spatial state abstractions crafted for robotic planning. Our strategy to tackle the exponential complexity of reasoning over the joint action space of the multi-agent system is to have agents reason over single-agent trajectories, evaluate the result as a function of the cooperative objectives that can be completed, and use simulated annealing to refine the search for the best set of joint trajectories. The resulting algorithm is termed Cooperative dynamic domain reduction for multi-agent planning (CDDRP). Our analysis characterizes the long-term convergence in probability to the optimal set of trajectories. We provide simulations to estimate the performance of CDDRP in the context of swarm deployment.
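The search strategy described in the abstract (agents propose single-agent trajectories, the joint set is scored by the cooperative objectives it completes, and simulated annealing refines the selection) can be illustrated with a small sketch. This is not the CDDRP algorithm itself: the names `joint_value` and `anneal`, the toy objectives, the rule that an objective pays out once enough agents visit it, and the logarithmic cooling schedule `c / log(k + 2)` are all assumptions made for illustration.

```python
import math
import random


def joint_value(selection, candidates, required, reward):
    """Total reward when agent i follows candidates[i][selection[i]].

    Assumption: an objective pays its reward once at least required[o]
    distinct agents include it in their trajectory.
    """
    visits = {}
    for agent, idx in enumerate(selection):
        for objective in set(candidates[agent][idx]):
            visits[objective] = visits.get(objective, 0) + 1
    return sum(reward[o] for o, n in visits.items() if n >= required[o])


def anneal(candidates, required, reward, steps=2000, c=5.0, seed=0):
    """Generic simulated annealing over one trajectory index per agent."""
    rng = random.Random(seed)
    current = [rng.randrange(len(options)) for options in candidates]
    current_value = joint_value(current, candidates, required, reward)
    best, best_value = list(current), current_value
    for k in range(steps):
        # Logarithmic cooling (an assumed schedule, chosen for illustration).
        temperature = c / math.log(k + 2)
        # Propose a change to a single agent's trajectory.
        agent = rng.randrange(len(candidates))
        proposal = list(current)
        proposal[agent] = rng.randrange(len(candidates[agent]))
        value = joint_value(proposal, candidates, required, reward)
        # Always accept improvements; accept worse proposals with
        # probability exp(delta / temperature).
        delta = value - current_value
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_value = proposal, value
        if current_value > best_value:
            best, best_value = list(current), current_value
    return best, best_value


# Hypothetical toy instance: objective "a" needs two agents, "b" and "c" one.
candidates = [[("a",), ("b",)], [("a",), ("c",)]]
required = {"a": 2, "b": 1, "c": 1}
reward = {"a": 10.0, "b": 3.0, "c": 3.0}
best, best_value = anneal(candidates, required, reward)
print(best, best_value)
```

In this toy instance the jointly best selection is `[0, 0]`, where both agents service objective "a". The selection `[1, 1]` is a local optimum, because changing either agent alone lowers the value; accepting occasional worse proposals is what lets the search leave it. A finite run is not guaranteed to return the optimum.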

References

  1. Ma, A., Ouimet, M., Cortés, J.: Dynamic domain reduction for multi-agent planning. In: International Symposium on Multi-Robot and Multi-Agent Systems, pp. 142–149, Los Angeles, CA (2017)

  2. Gerkey, B.P., Mataric, M.J.: A formal analysis and taxonomy of task allocation in multi-robot systems. Int. J. Robot. Res. 23(9), 939–954 (2004)

  3. Bullo, F., Cortés, J., Martínez, S.: Distributed Control of Robotic Networks. Applied Mathematics Series. Princeton University Press (2009). Electronically available at http://coordinationbook.info

  4. Mesbahi, M., Egerstedt, M.: Graph Theoretic Methods in Multiagent Networks. Applied Mathematics Series. Princeton University Press (2010)

  5. Dunbabin, M., Marques, L.: Robots for environmental monitoring: significant advancements and applications. IEEE Robot. Autom. Mag. 19(1), 24–39 (2012)

  6. Das, J., Py, F., Harvey, J.B.J., Ryan, J.P., Gellene, A., Graham, R., Caron, D.A., Rajan, K., Sukhatme, G.S.: Data-driven robotic sampling for marine ecosystem monitoring. Int. J. Robot. Res. 34(12), 1435–1452 (2015)

  7. Cortés, J., Egerstedt, M.: Coordinated control of multi-robot systems: a survey. SICE J. Control Meas. Syst. Integr. 10(6), 495–503 (2017)

  8. Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112(1–2), 181–211 (1999)

  9. Broz, F., Nourbakhsh, I., Simmons, R.: Planning for human-robot interaction using time-state aggregated POMDPs. In: AAAI, vol. 8, pp. 1339–1344 (2008)

  10. Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley (2014)

  11. Howard, R.: Dynamic Programming and Markov Processes. M.I.T. Press (1960)

  12. Bertsekas, D.P.: Dynamic Programming and Optimal Control. Athena Scientific (1995)

  13. Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: ECML, vol. 6, pp. 282–293. Springer (2006)

  14. Parr, R., Russell, S.: Hierarchical control and learning for Markov decision processes, University of California, Berkeley, Berkeley, CA (1998)

  15. Bai, A., Srivastava, S., Russell, S.: Markovian state and action abstractions for MDPs via hierarchical MCTS. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI, pp. 3029–3039, New York, NY (2016)

  16. Barto, A., Mahadevan, S.: Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. Syst. 13(4), 341–379 (2003)

  17. Boutilier, C.: Sequential optimality and coordination in multiagent systems. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), vol. 1, pp. 478–485 (1999)

  18. Kirkpatrick, S., Gelatt, C., Vecchi, M.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)

  19. Laarhoven, P., Aarts, E.: Simulated annealing. In: Simulated Annealing: Theory and Applications, pp. 7–15. Springer (1987)

  20. Malek, M., Guruswamy, M., Pandya, M., Owens, H.: Serial and parallel simulated annealing and tabu search algorithms for the traveling salesman problem. Ann. Oper. Res. 21(1), 59–84 (1989)

  21. Hajek, B.: Cooling schedules for optimal annealing. Math. Oper. Res. 13(2), 311–329 (1988)

  22. Suman, B., Kumar, P.: A survey of simulated annealing as a tool for single and multiobjective optimization. J. Oper. Res. Soc. 57(10), 1143–1160 (2006)

Acknowledgements

This work was supported by ONR Award N00014-16-1-2836.

Author information

Corresponding author

Correspondence to Aaron Ma.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ma, A., Ouimet, M., Cortés, J. (2019). Cooperative Dynamic Domain Reduction. In: Correll, N., Schwager, M., Otte, M. (eds) Distributed Autonomous Robotic Systems. Springer Proceedings in Advanced Robotics, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-030-05816-6_35
