Abstract
We consider the problem of dynamically adjusting the formation and size of robot teams performing distributed area coverage when they encounter obstacles or occlusions along their path. Building on our earlier formulation of the robotic team formation problem as a coalitional game called a weighted voting game (WVG), we show that the robot team size can be dynamically adapted by adjusting the WVG's quota parameter. We use a Q-learning algorithm to learn the value of the quota parameter, together with a policy reuse mechanism that adapts the learning process to changes in the underlying environment. Experimental results using simulated e-puck robots within the Webots simulator show that our Q-learning algorithm converges within a finite number of steps in different types of environments. Using the learning algorithm also improves the performance of an area coverage application, in which multiple robot teams move in formation to explore an initially unknown environment, by 5–10%.
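The core idea of learning the quota parameter can be illustrated with a minimal tabular Q-learning sketch. This is an illustrative assumption, not the paper's actual formulation: the state abstraction, candidate quota values, and reward signal below are all hypothetical placeholders.

```python
import random

# Hypothetical sketch: Q-learning over a WVG quota parameter.
# States, actions (quota values), and rewards are illustrative
# assumptions, not the authors' actual design.

QUOTAS = [0.2, 0.4, 0.6, 0.8]              # candidate quota values (actions)
STATES = ["open", "obstacle", "occluded"]  # coarse environment states

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2      # learning rate, discount, exploration

# Q-table initialized to zero for every (state, quota) pair
Q = {(s, q): 0.0 for s in STATES for q in QUOTAS}

def choose_quota(state):
    """Epsilon-greedy selection of the next quota value."""
    if random.random() < EPSILON:
        return random.choice(QUOTAS)
    return max(QUOTAS, key=lambda q: Q[(state, q)])

def update(state, quota, reward, next_state):
    """Standard one-step Q-learning update for the chosen quota."""
    best_next = max(Q[(next_state, q)] for q in QUOTAS)
    Q[(state, quota)] += ALPHA * (reward + GAMMA * best_next - Q[(state, quota)])
```

A policy reuse mechanism, as described in the abstract, would additionally seed this Q-table (or bias `choose_quota`) from policies learned in previously encountered environments; that layer is omitted here for brevity.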
© 2012 Springer-Verlag Berlin Heidelberg
Dasgupta, P., Cheng, K., Banerjee, B. (2012). Adaptive Multi-robot Team Reconfiguration Using a Policy-Reuse Reinforcement Learning Approach. In: Dechesne, F., Hattori, H., ter Mors, A., Such, J.M., Weyns, D., Dignum, F. (eds) Advanced Agent Technology. AAMAS 2011. Lecture Notes in Computer Science(), vol 7068. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27216-5_23
Print ISBN: 978-3-642-27215-8
Online ISBN: 978-3-642-27216-5