Performance Competitions as Research Infrastructure: Large Scale Comparative Studies of Multi-Agent Teams

Kaminka, Gal A.; Frank, Ian; Arai, Katsuto; Tanaka-Ishii, Kumiko

doi:10.1023/A:1024180921782

Published: July 2003

Autonomous Agents and Multi-Agent Systems, Volume 7, pages 121–144 (2003)

Abstract

Performance competitions (events that pit many different programs against each other on a standardized task) provide a way for a research community to promote research progress towards challenging goals. In this paper, we argue that for maximum research benefit, any such competition must involve comparative studies under closely controlled, varying conditions. We demonstrate the critical role of comparative studies in the context of one well-known and growing performance competition: the annual Robotic Soccer World Cup (RoboCup) Championship. Specifically, over the past three years, we have carried out annual large-scale comparative evaluations—distinct from the competition itself—of the multi-agent teams taking part in the largest RoboCup league. Our study, which involved 30 different teams of agents produced by dozens of different research groups, focused on robustness. We show that (i) multi-agent teams exhibit a clear performance-robustness tradeoff; (ii) teams tend to over-specialize, so that they cannot handle beneficial changes we make to their operating environment; and (iii) teams improve in performance more than in robustness from one year to the next, despite the emphasis by RoboCup organizers on robustness as a key challenge. These results demonstrate the potential of large-scale comparative studies for producing important results otherwise difficult to discover, and are significant both in the lessons they raise for designers of multi-agent teams, and in understanding the place of performance competitions within the multi-agent research infrastructure.

Author information

Authors and Affiliations

Computer Science Department, Carnegie-Mellon University, Pittsburgh, PA, 15213, U.S.A.
Gal A. Kaminka
Future University-Hakodate, 116-2 Kamedanakano, Hakodate-shi, Hokkaido, 041-8655, Japan
Ian Frank
Department of Cognitive and Information Sciences, Faculty of Letters, Chiba University, Japan
Katsuto Arai
Interfaculty Initiative in Information Studies, Graduate School of Interdisciplinary Information Studies, University of Tokyo, Tokyo, Japan
Kumiko Tanaka-Ishii

About this article

Cite this article

Kaminka, G.A., Frank, I., Arai, K. et al. Performance Competitions as Research Infrastructure: Large Scale Comparative Studies of Multi-Agent Teams. Autonomous Agents and Multi-Agent Systems 7, 121–144 (2003). https://doi.org/10.1023/A:1024180921782

Issue Date: July 2003
DOI: https://doi.org/10.1023/A:1024180921782
