Abstract
Unmanned vehicles (UxVs) are increasingly deployed in challenging scenarios such as disaster response, surveillance, and search and rescue. This paper is motivated by scenarios where a heterogeneous swarm of UxVs is tasked with completing a variety of objectives, some of which may require cooperation among vehicles of varying capabilities. Our goal is to develop an approach that enables vehicles to aid each other in the service of these objectives in a distributed and autonomous fashion. To address this problem, we build on dynamic domain reduction for multi-agent planning (DDRP), a framework that combines model-based hierarchical reinforcement learning with spatial state abstractions crafted for robotic planning. To tackle the exponential complexity of reasoning over the joint action space of the multi-agent system, agents reason over single-agent trajectories, evaluate the result as a function of the cooperative objectives that can be completed, and use simulated annealing to refine the search for the best set of joint trajectories. We term the resulting algorithm cooperative dynamic domain reduction for multi-agent planning (CDDRP). Our analysis characterizes its long-term convergence in probability to the optimal set of trajectories. We provide simulations to estimate the performance of CDDRP in the context of swarm deployment.
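The search strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' CDDRP implementation: the objective names, required agent counts, rewards, candidate trajectories, and the logarithmic cooling schedule with constant `c` are all assumptions chosen for illustration. Each agent holds a few candidate single-agent trajectories, a joint selection is scored by the cooperative objectives it completes, and simulated annealing perturbs one agent's choice at a time.

```python
import math
import random


def joint_value(selection, candidates, objectives):
    """Sum the rewards of objectives visited by at least the required number of agents."""
    visits = {}
    for agent, idx in enumerate(selection):
        for obj in set(candidates[agent][idx]):
            visits[obj] = visits.get(obj, 0) + 1
    return sum(
        reward
        for obj, (required, reward) in objectives.items()
        if visits.get(obj, 0) >= required
    )


def anneal(candidates, objectives, steps=2000, c=1.0, seed=0):
    """Simulated annealing over joint selections of single-agent trajectories."""
    rng = random.Random(seed)
    current = [0] * len(candidates)
    current_value = joint_value(current, candidates, objectives)
    best, best_value = list(current), current_value
    for k in range(steps):
        # Assumed logarithmic cooling schedule; always positive since k + 2 >= 2.
        temperature = c / math.log(k + 2)
        # Propose a new trajectory for one randomly chosen agent.
        agent = rng.randrange(len(candidates))
        proposal = list(current)
        proposal[agent] = rng.randrange(len(candidates[agent]))
        proposal_value = joint_value(proposal, candidates, objectives)
        delta = proposal_value - current_value
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_value = proposal, proposal_value
            if current_value > best_value:
                best, best_value = list(current), current_value
    return best, best_value


# Hypothetical toy instance: objective -> (agents required, reward).
objectives = {"A": (1, 1.0), "B": (2, 5.0), "C": (1, 2.0)}
# Hypothetical candidate trajectories per agent, each a tuple of objectives visited.
candidates = [
    [("A",), ("B",), ("C",)],
    [("A",), ("B",), ("A", "C")],
]
best, best_value = anneal(candidates, objectives)
print(best, best_value)
```

In this toy instance, objective B pays off only when both agents visit it, so the highest-value joint selection is one that neither agent would pick when scoring its own trajectory in isolation. Evaluating selections jointly while proposing changes one agent at a time keeps each step cheap without enumerating the full joint space.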
Acknowledgements
This work was supported by ONR Award N00014-16-1-2836.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ma, A., Ouimet, M., Cortés, J. (2019). Cooperative Dynamic Domain Reduction. In: Correll, N., Schwager, M., Otte, M. (eds) Distributed Autonomous Robotic Systems. Springer Proceedings in Advanced Robotics, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-030-05816-6_35
DOI: https://doi.org/10.1007/978-3-030-05816-6_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05815-9
Online ISBN: 978-3-030-05816-6
eBook Packages: Intelligent Technologies and Robotics (R0)