1 Introduction

UT Austin Villa won the 2017 RoboCup 3D Simulation League for the sixth time in the past seven years, having also won the competition in 2011 [1], 2012 [2], 2014 [3], 2015 [4], and 2016 [5] while finishing second in 2013. During the course of the competition the team scored 171 goals and conceded none on the way to winning all 23 games it played. Many of the components of the 2017 UT Austin Villa agent were reused from the team’s successful entries in previous years. This paper does not attempt a complete description of the 2017 UT Austin Villa agent, the foundation of which is the team’s 2011 championship agent fully described in a team technical report [6], but instead focuses on changes made in 2017 that helped the team repeat as champions.

In addition to winning the main RoboCup 3D Simulation League competition, UT Austin Villa also won the RoboCup 3D Simulation League technical challenge by winning each of the three league challenges: free, passing and scoring, and Gazebo running challenge. This paper also serves to document these challenges and the approaches used by UT Austin Villa when competing in the challenges.

The remainder of the paper is organized as follows. In Sect. 2 a description of the 3D simulation domain is given. Section 3 details the most important improvement to the 2017 UT Austin Villa team: fast walk kicks, while Sect. 4 analyzes the contribution of this improvement in addition to the overall performance of the team at the competition. Section 5 describes and analyzes the league challenges that were used to determine the winner of the technical challenge, and Sect. 6 concludes.

2 Domain Description

The RoboCup 3D simulation environment is based on SimSpark [7, 8], a generic physical multiagent system simulator. SimSpark uses the Open Dynamics Engine (ODE) library for its realistic simulation of rigid body dynamics with collision detection and friction. ODE also provides support for the modeling of advanced motorized hinge joints used in the humanoid agents.

Games consist of 11 versus 11 agents playing two 5-minute halves of soccer on a \(30 \times 20\) m field. The robot agents in the simulation are modeled after the Aldebaran Nao robot, which has a height of about 57 cm and a mass of 4.5 kg. Each robot has 22 degrees of freedom: six in each leg, four in each arm, and two in the neck. In order to monitor and control its hinge joints, an agent is equipped with joint perceptors and effectors. Joint perceptors provide the agent with noise-free angular measurements every simulation cycle (20 ms), while joint effectors allow the agent to specify the speed/direction in which to move a joint.

Visual information about the environment is given to an agent every third simulation cycle (60 ms) through noisy measurements of the distance and angle to objects within a restricted vision cone (\(120^\circ \)). Agents are also outfitted with noisy accelerometer and gyroscope perceptors, as well as force resistance perceptors on the sole of each foot. Additionally, agents can communicate with each other every other simulation cycle (40 ms) by sending 20 byte messages.

In addition to the standard Nao robot model, four additional variations of the standard model, known as heterogeneous types, are available for use. These variations from the standard model include changes in leg and arm length, hip width, and also the addition of toes to the robot’s foot. Teams must use at least three different robot types, no more than seven agents of any one robot type, and no more than nine agents of any two robot types.
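The three roster constraints above can be checked mechanically. The following is a minimal sketch, assuming each agent's robot type is given as an integer id (the ids and the function name `roster_is_legal` are illustrative and not part of the league software):

```python
from collections import Counter


def roster_is_legal(robot_types):
    """Check a list of per-agent robot type ids against the type limits."""
    counts = Counter(robot_types)
    # Rule 1: at least three different robot types must be used.
    if len(counts) < 3:
        return False
    # Sort the per-type counts from most used to least used.
    ordered = sorted(counts.values(), reverse=True)
    # Rule 2: no more than seven agents of any one type.
    if ordered[0] > 7:
        return False
    # Rule 3: no more than nine agents of any two types combined.
    # Checking the two most common types covers every possible pair.
    if ordered[0] + ordered[1] > 9:
        return False
    return True
```

Only the two most common types need to be compared against the nine-agent limit, since every other pair of types has a combined count that is no larger.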

The main change for the 2017 RoboCup 3D Simulation League competition from previous years was the removal of crowding rules. Previously, when too many players crowded around the ball, players were penalized and beamed to the sideline. Crowding rules were primarily enforced to decrease the number of collisions between robots, as multiple simultaneous collisions can slow down the simulator and potentially cause it to crash. Given the existing touching rules, under which a player is beamed to the sideline if a group of three or more players are touching each other, and the addition in 2016 of charging fouls that penalize players for running into opponents, it was determined that crowding rules were no longer needed.

Figure 1 shows a visualization of the Nao robot and the soccer field during a game.

Fig. 1. A screenshot of the Nao humanoid robot (left), and a view of the soccer field during an 11 versus 11 game (right).

3 Fast Walk Kicks

Many components developed prior to 2017 contributed to the success of the UT Austin Villa team including dynamic role assignment [9], marking [10], and an optimization framework used to learn low level behaviors for walking and kicking via an overlapping layered learning approach [11]. This section discusses the development of a new and important component for 2017: fast walk kicks. Fast walk kicks refer to the ability of agents to approach the ball and quickly kick it without having to first come to a stop and enter a stable standing position. The amount of time it takes for agents to approach and kick the ball is an important consideration as kick attempts that take longer to perform give opponents a better chance to stop them from being executed.

The UT Austin Villa team specifies kicking motions through a periodic state machine with multiple key frames, where a key frame is a parameterized static pose of fixed joint positions. Figure 2 shows an example series of poses for a kicking motion. While some joint positions are specified by hand, a subset of values for joint positions are optimized using the CMA-ES [12] algorithm and overlapping layered learning [11] methodologies.
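To make the key frame representation concrete, the sketch below models a kick as an ordered list of static poses and looks up which pose is active at a given simulation cycle. It is only an illustration under stated assumptions: the class `KeyFrame`, the field `hold_cycles`, the function `frame_at_cycle`, and the joint names are hypothetical and are not taken from the team's code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class KeyFrame:
    """A static pose: a target angle (degrees) for each named joint."""
    joint_targets: Dict[str, float]
    hold_cycles: int  # assumed: number of 20 ms cycles the pose is held


def frame_at_cycle(frames: List[KeyFrame], cycle: int) -> KeyFrame:
    """Return the key frame active at a given cycle since the kick began."""
    elapsed = 0
    for frame in frames:
        elapsed += frame.hold_cycles
        if cycle < elapsed:
            return frame
    # Past the end of the motion: keep holding the final pose.
    return frames[-1]
```

In such a representation, the values in `joint_targets` would be the quantities that are either set by hand or exposed to the optimizer as parameters.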

Prior to the 2017 competition all kicking motions performed by the UT Austin Villa team first required the agent to come to a stable standing position with both feet on the ground before kicking the ball. The team’s fastest kicks took about 0.5 s to execute but only traveled a little over 5 m. Longer kicks, traveling as far as 20 m, were slower and could take up to 2 s to execute.

Fig. 2. Example of a fixed series of poses that make up a kicking motion.

The UT Austin Villa team has noticed a couple of trends when optimizing parameter values for kicks: policies with more parameters allow for longer kicks, and policies with more parameters allow for kicking motions with shorter durations that are quicker to execute without the robot becoming unstable and falling over. As adding more parameters to a policy increases the space of policies that can be represented, it is not surprising that policies with more parameters have allowed for kicks that can travel farther and be executed faster. However, adding more parameters to a kick can make learning slower and more difficult, and there is likely an upper limit on the number of parameters that can effectively be learned, as CMA-ES does not scale well to thousands of parameters [13].

Given a desire to develop a kick with good distance that is very fast to execute, we decided to learn kicking motion parameters for every joint over 12 simulation cycles (240 ms at 20 ms per cycle). Such a kicking motion is thus learned over the entire range of possible poses for any kick less than 0.25 s in duration. We optimized \({\approx }260\) parameters for this kick across 1000 generations of CMA-ES using a population size of 300. Previously we have used a CMA-ES population size of 150 when optimizing kicks consisting of \({\approx }75\) parameters; however, we decided to double the size of the population due to the larger number of parameters being optimized. Initial parameter values were seeded with joint angles taken from a subset of poses used by our longest kick: joint angle values across a 12 simulation cycle window of the kick that includes the moment the ball is struck by the foot. During learning we used the following fitness function, which rewards the agent for the distance the ball is kicked, encourages accuracy by applying a Gaussian penalty to the difference/offset between the desired and actual angles that the ball travels, and promotes stability via a negative value if the agent falls over during kicking:

$$\begin{aligned} \textit{fitness}_{\text {kick}} = \left\{ \begin{array}{cl} -1 &{} : \text {Agent Fell}\\ \text {distBallTraveledForward}*e^{-\text {angleOffset}^2/360} &{} : \text {Otherwise} \end{array} \right. \end{aligned}$$
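As a minimal sketch, the fitness function above can be written directly in code. The units are an assumption here (distance in meters and angle offset in degrees; the text does not state them), and the function and argument names are illustrative:

```python
import math


def kick_fitness(agent_fell, dist_ball_traveled_forward, angle_offset):
    """Fitness of one kick attempt, following the piecewise definition above."""
    if agent_fell:
        # A fall always scores -1, regardless of how far the ball went.
        return -1.0
    # Forward distance scaled by a Gaussian-shaped penalty on the angle error
    # (assumed to be in degrees); the penalty factor is 1 for a perfect angle.
    return dist_ball_traveled_forward * math.exp(-(angle_offset ** 2) / 360.0)
```

Because the angle offset is squared, errors to the left and right of the target direction are penalized equally.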

The resulting kick learned from this optimization takes 0.24 s to execute, travels close to 20 m in distance (nearly the same distance as our longest kicks that can take up to 2 s to execute), and provides a substantial increase in the team’s performance—a performance analysis of using fast walk kicks is provided in Sect. 4.1.

As our learned kick takes less than 0.25 s to execute, the robot must begin the kicking motion from a walking position and perform a “walk kick”, because there is not enough time for the robot to first assume a standing position before striking the ball. During walk kicks it is important that a robot has its non-kicking support leg on the ground before initiating a kicking motion, as otherwise the robot will likely fall over. When attempting walk kicks, the UT Austin Villa agent waits until its support leg is on the ground, as determined by a large enough force measured by the force resistance perceptor on the sole of the support leg’s foot, before beginning a kick. The magmaOffenburg team, who also developed a walk kick for this year’s competition, similarly ensures that a robot’s support leg is on the ground before attempting a kick [14].
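The support-leg condition can be sketched as a simple threshold test on the force resistance perceptor readings. This is an illustration only: the function name, the dictionary layout, and in particular the threshold value of 20.0 are placeholders, since the text does not give the force value the team uses.

```python
def can_start_walk_kick(kicking_leg, foot_forces, min_support_force=20.0):
    """Return True when the non-kicking (support) foot carries enough force.

    kicking_leg: "left" or "right".
    foot_forces: force magnitude measured on each sole, keyed by leg name.
    min_support_force: placeholder threshold, not a value from the paper.
    """
    if kicking_leg not in ("left", "right"):
        raise ValueError("kicking_leg must be 'left' or 'right'")
    # The support leg is the one that is not kicking.
    support_leg = "left" if kicking_leg == "right" else "right"
    return foot_forces[support_leg] >= min_support_force
```

An agent using such a check would keep walking and re-evaluate it every simulation cycle until it returns `True`.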

4 Main Competition Results and Analysis

In winning the 2017 RoboCup competition UT Austin Villa finished with a perfect record of 23 wins and no losses. During the competition the team scored 171 goals while conceding none. Despite this perfect record, the relatively small number of games played at the competition, coupled with the complex and stochastic environment of the RoboCup 3D simulator, makes it difficult to determine whether UT Austin Villa was better than other teams by a statistically significant margin. At the end of the competition, however, all teams were required to release the binaries they used during the competition. Results of UT Austin Villa playing 1000 games against each of the other twelve teams’ released binaries from the competition are shown in Table 1.

Table 1. UT Austin Villa’s released binary’s performance when playing 1000 games against the released binaries of all other teams at RoboCup 2017. This includes place (the rank a team achieved at the 2017 competition), average goal difference (values in parentheses are the standard error), win-loss-tie record, and goals for/against.

UT Austin Villa finished with an average goal difference greater than 3.75 goals against every opponent. Additionally, of the 12,000 games played in Table 1, UT Austin Villa won all but 22, which ended in ties, and lost none, giving a win percentage greater than 98% against every team. These results show that UT Austin Villa winning the 2017 competition was far from a chance occurrence. The following subsection analyzes the contribution of fast walk kicks (described in Sect. 3) to the team’s dominant performance.
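The kind of summary statistics reported in Table 1, an average goal difference with its standard error, can be computed from per-game results as in the sketch below. It assumes the standard error of the mean based on the sample standard deviation; the text does not state which estimator was used.

```python
import math
import statistics


def goal_difference_summary(goal_diffs):
    """Return (mean, standard error of the mean) of per-game goal differences."""
    n = len(goal_diffs)
    if n < 2:
        raise ValueError("need at least two games to estimate a standard error")
    mean = statistics.fmean(goal_diffs)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    std_err = statistics.stdev(goal_diffs) / math.sqrt(n)
    return mean, std_err
```

With 1000 games per opponent, the standard error shrinks by a factor of about 32 relative to the per-game standard deviation, which is why large batches of games make the differences between teams measurable.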

4.1 Analysis of Fast Walk Kicks

To analyze the contribution of fast walk kicks (Sect. 3) to the UT Austin Villa team’s performance, we played 1000 games between a version of the 2017 UT Austin Villa team with fast walk kicks turned off—and no other changes—against each of the RoboCup 2017 teams’ released binaries. Results comparing the performance of the UT Austin Villa team with and without using fast walk kicks are shown in Table 2.

Table 2. Average goal difference achieved by versions of the UT Austin Villa team with and without fast walk kicks, and the gain in average goal difference by using fast walk kicks, when playing 1000 games against all teams at RoboCup 2017.

Against all opponents the average goal difference was higher when using fast walk kicks, with the gain in average goal difference performance against each opponent averaging 1.423 goals. These results show that fast walk kicks provide a substantial improvement in game performance to the UT Austin Villa team.

4.2 Additional Tournament Competition Analysis

To further analyze the tournament competition, Table 3 shows the average goal difference for each team at RoboCup 2017 when playing 1000 games against all other teams at RoboCup 2017.

Table 3. Average goal difference for each team at RoboCup 2017 (rows) when playing 1000 games against the released binaries of all other teams at RoboCup 2017 (columns). Teams are ordered from most to least dominant in terms of winning (positive goal difference) and losing (negative goal difference).

It is interesting to note that the ordering of teams in terms of winning (positive goal difference) and losing (negative goal difference) is strictly dominant: every opponent that a team wins against also loses to every opponent that defeats that same team. Relative goal difference does not have this same property, as a team that does better against one opponent relative to another team does not always do better against a second opponent relative to that same team. UT Austin Villa is nevertheless dominant in terms of relative goal difference, as it has a higher goal difference against each opponent than all other teams have against the same opponent.
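The strict dominance property can be stated as a check on a table of average goal differences: in a valid ordering, every team has a positive goal difference against every team ranked below it. The sketch below assumes the table is stored as a nested dictionary `goal_diff[team][opponent]`; the function name is illustrative.

```python
def is_strictly_dominant_order(order, goal_diff):
    """True if each team beats (positive goal difference) all teams below it."""
    for i, team in enumerate(order):
        for lower in order[i + 1:]:
            # A non-positive value means the higher-ranked team did not win
            # on average against a lower-ranked one, breaking the ordering.
            if goal_diff[team][lower] <= 0:
                return False
    return True
```

For a toy three-team table (invented numbers, not taken from Table 3) in which A beats B and C, and B beats C, the order `["A", "B", "C"]` passes the check while `["B", "A", "C"]` does not.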

5 Technical Challenges

For the fourth straight year there was an overall technical challenge consisting of three different league challenges: free, passing and scoring, and Gazebo running challenge. For each league challenge in which a team participated, points were awarded toward the overall technical challenge based on the following equation:

$$\begin{aligned} \texttt {points}(\textit{rank}) = 25 - 20*(\textit{rank}-1)/(\textit{numberOfParticipants}-1) \end{aligned}$$

Table 4. Overall ranking and points totals for each team participating in the RoboCup 2017 3D Simulation League technical challenge as well as ranks and points awarded for each of the individual league challenges that make up the technical challenge.

Table 4 shows the ranking and cumulative team point totals for the technical challenge as well as for each individual league challenge. UT Austin Villa earned the most points and won the technical challenge by taking first in each of the league challenges. The following subsections detail UT Austin Villa’s participation in each league challenge.
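The points equation maps first place to 25 points and last place to 5 points, with the ranks in between spaced linearly. A minimal sketch follows; the handling of a challenge with fewer than two participants is an assumption, since the equation is undefined in that case.

```python
def challenge_points(rank, number_of_participants):
    """Points awarded for finishing at `rank` (1 = best) in a league challenge."""
    if number_of_participants < 2:
        # Assumption: the formula divides by zero here, so reject the input.
        raise ValueError("the formula needs at least two participants")
    return 25 - 20 * (rank - 1) / (number_of_participants - 1)
```

For example, with four participants the ranks 1 through 4 earn 25, about 18.33, about 11.67, and 5 points respectively.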

5.1 Free Challenge

During the free challenge, teams give a five-minute presentation on a research topic related to their team. Each team in the league then ranks the top five presentations, with the best receiving 5 votes and the 5th best receiving 1 vote. Additionally, several respected research members of the RoboCup community outside the league vote, with their votes being counted double. The winner of the free challenge is the team that receives the most votes. Table 5 shows the results of the free challenge, in which UT Austin Villa was awarded first place.
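The voting scheme can be sketched as a weighted tally, assuming each ballot is a list of team names ordered from best to fifth best. The function name and ballot format are illustrative; details such as tie-breaking are not specified in the text and are not modeled here.

```python
from collections import defaultdict


def tally_free_challenge(team_ballots, external_ballots):
    """Sum votes per team; ballots from outside the league count double."""
    totals = defaultdict(int)
    for ballots, weight in ((team_ballots, 1), (external_ballots, 2)):
        for ranking in ballots:
            # Position 0 earns 5 votes, position 4 earns 1 vote.
            for position, team in enumerate(ranking[:5]):
                totals[team] += weight * (5 - position)
    return dict(totals)
```

The winner is then the team with the largest total, for example `max(totals, key=totals.get)`.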

Table 5. Results of the free challenge.

UT Austin Villa’s free challenge submission presented the team’s fast walk kicks discussed in Sect. 3. Additionally, the submission described preliminary work on representing the policy of a kicking motion as a neural network, and using deep learning [15] and the Trust Region Policy Optimization (TRPO) algorithm [16] to learn longer kicks. The BahiaRT team provided details about an optimization framework they created, the magmaOffenburg team talked about a 2D simulator they use for testing the strategy layer of their team, and the AIUT3D team introduced a motion editor for 3D Simulation League agents.

5.2 Passing and Scoring Challenge

In the passing and scoring challenge, a group of four agents on one team attempts to pass the ball among themselves, such that each agent touches the ball at least once, before scoring a goal in as little time as possible. At the beginning of the challenge the ball is placed at the center of the field and the agents must start at least three meters apart from each other along the X axis. If the initial position of the agents does not comply with the rules, the team is awarded a score of 85. The challenge ends when a goal is scored, the ball leaves the field, or 80 s have passed. For each distinct agent kicking the ball (judged as the ball traveling freely for at least 2.5 m after being kicked), the score is reduced by one point. If a goal is scored, the score is reduced by one point. If the goal is scored after the ball has been kicked by all four players, the score is the time (in seconds) from the start of the trial until the scoring event. The objective of the challenge is to get as low a score as possible.
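The scoring rules can be sketched as a single function. One point is an assumption rather than something the text states: the starting value from which points are deducted is taken to be 85, the same value given for a non-compliant starting position. The function and parameter names are illustrative.

```python
def passing_challenge_score(legal_start, distinct_kickers, goal_scored,
                            goal_time_s=None, base_score=85):
    """Score one trial of the passing and scoring challenge (lower is better).

    base_score is an ASSUMED starting value for the point deductions.
    """
    if not legal_start:
        # Illegal starting positions are scored 85, as stated in the rules.
        return 85.0
    if goal_scored and distinct_kickers >= 4:
        if goal_time_s is None:
            raise ValueError("goal_time_s is required for a completed trial")
        # All four agents kicked before the goal: the score is the time.
        return float(goal_time_s)
    # Otherwise deduct one point per distinct kicker and one for a goal.
    score = base_score - distinct_kickers
    if goal_scored:
        score -= 1
    return float(score)
```

Under this reading, any completed trial scores at most 80 (the time limit), which is below every score that an incomplete trial can receive.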

The starting position and strategy used by UT Austin Villa for the passing and scoring challenge are shown in Fig. 3. Whichever agent is closest to the ball passes the ball to a position about a meter in front of the next farthest agent from the goal, as shown by the yellow arrows in Fig. 3. Once the ball has been sequentially passed forward between agents and the agent closest to the goal receives the ball, that agent kicks the ball into the goal, as shown by the pink arrow in Fig. 3. Agents that are not the closest agent to the ball simply stand in place.

Fig. 3. Starting positions and strategy for the passing and scoring challenge. Yellow arrows represent passes between agents and the pink arrow represents a shot on goal. (Color figure online)

Table 6 shows the results of the passing and scoring challenge, where teams were ranked by the average score of a team’s best (lowest) three out of four trials. UT Austin Villa won the challenge with an average score/time of less than 20. Each of UT Austin Villa’s passing and scoring challenge trial scores was better than all the scores of other teams’ trials.
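The ranking statistic, the average of a team's best three out of four trials, can be sketched as follows. The `lower_is_better` flag is an addition for illustration so that the same helper also covers the Gazebo running challenge in Sect. 5.3, where higher speeds are better.

```python
def best_three_average(trial_values, lower_is_better=True):
    """Average the three best values among a team's trials."""
    if len(trial_values) < 3:
        raise ValueError("need at least three trials")
    # Put the best values first: ascending for scores, descending for speeds.
    ranked = sorted(trial_values, reverse=not lower_is_better)
    return sum(ranked[:3]) / 3
```

With invented trial scores of 18.0, 21.0, 85.0, and 19.5, the worst trial (85.0) is dropped and the average of the remaining three is 19.5.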

Table 6. Scores for each of the teams competing in the passing and scoring challenge.

5.3 Gazebo Running Challenge

Ongoing work within the RoboCup community is the development of a plugin for the Gazebo [17] robotics simulator to support the RoboCup 3D Simulation League. As such, a challenge was held where robots attempt to walk forward as fast as possible for 20 s in the Gazebo simulator without falling. In preparation for the challenge UT Austin Villa optimized fast walking parameters for the team’s omnidirectional walk engine [18] within the Gazebo simulator using the CMA-ES algorithm [12]. Walk engine parameters were optimized for 300 generations of CMA-ES with a population size of 150.

Results of the Gazebo running challenge are shown in Table 7. Each participating team performed four running attempts and was scored by the average forward walking speed across its three best attempts. UT Austin Villa won the challenge with all of the team’s runs having a speed of over 1.15 m/s. Each of UT Austin Villa’s running attempt speeds was greater than all other teams’ attempts. UT Austin Villa also won this same challenge at RoboCup 2016 [5].

Table 7. Speed in meters per second for each of the teams competing in the Gazebo running challenge.

6 Conclusion

UT Austin Villa won the 2017 RoboCup 3D Simulation League main competition as well as all technical league challenges. Data taken using released binaries from the competition show that UT Austin Villa winning the competition was statistically significant. The 2017 UT Austin Villa team also improved dramatically from 2016 as it was able to beat the team’s 2016 champion binary by an average of 1.339 (±0.039) goals across 1000 games.

In an effort both to make it easier for new teams to join the RoboCup 3D Simulation League and to provide a resource that can be beneficial to existing teams, the UT Austin Villa team has released their base code [19]. This code release provides a fully functioning agent and a good starting point for new teams to the RoboCup 3D Simulation League (it was used by six teams at the 2017 competition: AIUT3D, HfutEngine3D, KgpKubs, Miracle3D, Nexus3D, and RIC-AASTMT). Additionally, the code release offers a foundational platform for conducting research in multiple areas including robotics, multiagent systems, and machine learning.