1 Introduction

Data is having an increasing impact on the world around us, and sports such as soccer are no exception, owing to developments in sensor technology and optical tracking [18]. Recently, in addition to the annotated event data of soccer match-play, competition-wide high-quality tracking data of the players and the ball on the pitch during match-play has become available. This spatio-temporal data is rich and complex and offers many opportunities for analyzing and optimizing tactics in soccer by applying modern data science techniques [22]. By adopting an exploratory data mining technique, we will demonstrate that it is possible to analyze tactical behavior in soccer without having to strictly define beforehand which specific metrics constitute this tactical behavior.

1.1 Tactical Analysis in Practice

In technical terms, tactics concern how teams and individuals manage space and time, and adapt to the opponent and conditions of play [10]. The coach, often supported by their staff, is the ‘tactical mastermind’ who designs the game plan. Essentially, the coach has to make a head-to-head comparison: which strategy works best specifically for us, against a specific opponent.

Currently, it is common practice to have a video analyst and often also an embedded scientist to provide additional insights for the coaching staff. Video analysts laboriously go through video footage of match-play to highlight specific game situations (e.g., typical strengths and weaknesses). This kind of qualitative analysis is highly tuned to the coaching staff’s philosophy, but often relies a great deal on the ‘expert eye’, making it prone to bias.

An embedded scientist is more focused on quantitative analyses, for example using annotated event data. This data varies from straightforward performance indicators such as the number of successful passes per player, to more complicated analyses such as passing networks among players highlighting who is well-connected to whom [6]. The type of event is recorded (e.g., a pass), but also the estimated location and the players involved [4]. Interestingly, these events are available for almost every professional league worldwide. As such, this data is even used to inform clubs about potential new acquisitions on the transfer market [2, 20] and can certainly be used for tactical analysis. However, the manually coded data does not provide all the context in which a play took place (i.e., what did all the players do up until the event).

Given the subjectivity of video analysis and the lack of context in event data, the coaching staff could benefit from more systematic analyses of tactics that allow for an objective comparison between the success rates of different playing styles.

1.2 Positional Tracking Data

In one form or another, ‘data’ already plays a role in the decision-making process of the coaching staff. However, a new type of data will make the role of data in sports even more important. The latest development in soccer data is the semi-automatic tracking of the positions of the players on the pitch by professional leagues such as the German Bundesliga and the Dutch Eredivisie. This spatio-temporal data has the potential to allow for the systematic analysis of tactical behavior in invasion-based team sports (e.g., soccer, hockey, rugby) [18, 22].

Although event data also contains information about space and time, event data is much more superficial as only the locations of some players are known. With positional tracking data, on the other hand, the positions of the players (and the ball) leading up to an event can be assessed (e.g., a pass to a closely guarded or an open player), thus providing the necessary context as to why a sequence of actions may have been successful.

By coupling the events and the positioning of the players, novel and insightful patterns can be uncovered in the tactics of an invasion-based game such as soccer. In fact, the positional tracking data is so rich and complex that numerous hand-crafted metrics could be conceived. Indeed, in recent years metrics have been developed that describe, for example, how threatening a player is on offence (e.g., Dangerousity) [15], how well a player positioned themselves off the ball [26], or how effective a pass was, based on the displacement in the defensive team it triggered [9, 12]. Gudmundsson and Horton [11] provide a clear overview of the pioneering work on tracking data in sports. They highlight that one of the open problems is that not many spatially informed metrics for player and team performance have been rigorously tested, often because only limited match data is available. It is thus pertinent to carefully consider which (variation of a) metric is the most indicative of success.

1.3 Aims

In the current paper, we aim to demonstrate a methodological approach that deals with two challenges for tactical analysis in an invasion-based team sport. First, current analyses either lack objectivity and scalability (video analysis), or lack reliability and context (annotated event data). We will demonstrate that with tracking data, the subjective constructs deemed important by experts can be operationalized algorithmically. The second challenge is that so many (slightly) different metrics can be derived from tracking data that it is difficult to assess which is the most informative. Fortunately, now that tracking data is abundantly available, it is possible to discover the metrics that are most informative of success using a descriptive data mining technique.

2 Methodological Approach

To put this spatio-temporal information in an actionable and interpretable context, we adopt an event-based approach. For a chosen type of event, we compute many of the metrics that already exist in scientific literature. By formulating a qualification of success, we can then use a descriptive data mining technique to uncover actionable and interpretable patterns. Here, a pattern refers to the features and their value ranges that best classify success. For our experiment, we selected the event Turnovers, which we classified based on the location (in- or outside the opponent’s penalty box). We used Subgroup Discovery to identify patterns in the data.

2.1 Key Events

During a match, discrete events occur that can be the key to understanding successful performance. Taking a shot on goal, for example, is a key occurrence that is directly related to winning a match. The outcome of an event can be classified: if the shot on goal led to a goal, it was a successful event. However, even with the increase of available data, goals and even shots on goal occur so infrequently that other, more frequently occurring key events can be analyzed more productively. Although dealing with sparse successful events is a key challenge in soccer analytics, we here circumvent this issue by looking at a more frequently occurring event: the moment that a team loses possession of the ball.

Such changes in possession are an important part of soccer match-play as they can reveal successful match-play, without relying on infrequent events such as a scored goal. For our methodology, we define a turnover as the instant that the opposing team gains possession of the ball. Typically, any turnover would be considered as unsuccessful from the perspective of the team losing the ball. However, our definition of Turnovers includes any change in possession, also as a consequence of a goal or some other proxy of success (e.g., an intercepted cross pass, a shot on or off target). Based on the tracking data, our automated methodological approach identifies the location of the ball at the instant that the possession changes team. Successful events are then the events where the defending team gained possession inside their own penalty box. That is, if an attacking team managed to get the ball inside the opponent’s penalty box at the instant the possession ended, the turnover was classified as successful. From here onward, we refer somewhat counter-intuitively to possession sequences that ended inside the opponent’s penalty box as successful Turnovers and all other possession sequences as unsuccessful Turnovers. In other words, a successful Turnover marks an attacking sequence in which the team in possession got very close to (scoring) a goal.
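The classification of Turnovers described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiment: per-frame possession labels are assumed to be given, the coordinate origin is assumed to lie at the pitch centre, and the pitch length of 105 m is an assumption. The names `Frame`, `find_turnovers` and `direction_of` are hypothetical. The penalty box dimensions (16.5 m deep, 40.32 m wide) follow the Laws of the Game.

```python
from dataclasses import dataclass

# Assumed geometry: origin at the pitch centre, x along the length,
# y along the width, all values in metres.
PITCH_LENGTH = 105.0  # assumption, not stated in the text
BOX_DEPTH = 16.5      # penalty box depth (Laws of the Game)
BOX_WIDTH = 40.32     # penalty box width (Laws of the Game)


@dataclass
class Frame:
    ball_x: float
    ball_y: float
    team_in_possession: str  # hypothetical label, e.g. "A" or "B"


def in_opponent_box(x: float, y: float, attacking_direction: int) -> bool:
    """True if (x, y) lies inside the penalty box the team attacks.

    attacking_direction is +1 when the team plays towards positive x
    and -1 when it plays towards negative x.
    """
    goal_line_x = attacking_direction * PITCH_LENGTH / 2
    depth = abs(goal_line_x - x)
    on_pitch = abs(x) <= PITCH_LENGTH / 2
    return on_pitch and depth <= BOX_DEPTH and abs(y) <= BOX_WIDTH / 2


def find_turnovers(frames, direction_of):
    """Return (frame_index, losing_team, successful) per possession change.

    A Turnover is 'successful' for the team losing the ball when the ball
    is inside the opponent's penalty box at the instant possession changes.
    """
    turnovers = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        if cur.team_in_possession != prev.team_in_possession:
            losing_team = prev.team_in_possession
            success = in_opponent_box(
                cur.ball_x, cur.ball_y, direction_of[losing_team]
            )
            turnovers.append((i, losing_team, success))
    return turnovers
```

In practice, the attacking direction would have to be switched at half-time, and the pitch dimensions would be set per stadium.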

2.2 Feature Construction

The features that describe the key events are constructed from a range of theory-driven metrics [1, 7, 8, 15, 19, 22]. Here, we briefly explain the metrics conceptually, but for algorithmic details we refer the reader to the literature. All metrics require some form of spatial aggregation: a distance-based interpretation of what is happening on the pitch. This could for example be the Width of the team, which is the distance between the player closest to one and the player closest to the other sideline. The metrics are always considered with respect to a team (i.e., the Width of the team with the ball). Additionally, it is possible to formulate slight variations (see A-D in Table 1) by for example excluding the goalkeeper or looking at other subsets of players (e.g., the defenders, midfielders or attackers).

Distance-Based Metrics. In addition to the Width, we incorporated some other distance-based metrics of what happens on the pitch. The Centroid refers to the average positioning of the players on a team [7, 8]. Similarly, the Spread is the standard deviation of the distances between each player and their team’s Centroid [1, 19]. The Surface refers to the area of the Convex Hull that can be drawn around different subsets of players (e.g., the defenders) [8, 22]; we also take the Circumference of this hull [22]. Finally, the shape ratio is the ratio between the Width and the distance between the player closest to and the player farthest away from the goalkeeper [7].
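To make these definitions concrete, the following sketch computes the Width, Centroid, Spread, Surface and Circumference for a single frame. It is an illustration only, assuming that player positions are given as `(x, y)` tuples in metres with `y` running across the width of the pitch, and that the Spread uses the population standard deviation; the cited literature may differ in such details. All function names are hypothetical.

```python
import math


def centroid(points):
    """Average position of the players."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def width(points):
    """Distance between the players closest to either sideline."""
    ys = [y for _, y in points]
    return max(ys) - min(ys)


def spread(points):
    """Population standard deviation of the player-to-centroid distances."""
    cx, cy = centroid(points)
    d = [math.hypot(x - cx, y - cy) for x, y in points]
    mean = sum(d) / len(d)
    return math.sqrt(sum((v - mean) ** 2 for v in d) / len(d))


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def surface_and_circumference(points):
    """Area (shoelace formula) and perimeter of the convex hull."""
    hull = convex_hull(points)
    edges = list(zip(hull, hull[1:] + hull[:1]))
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2
    perimeter = sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in edges)
    return area, perimeter
```

The variations in Table 1 (e.g., excluding the goalkeeper) would correspond to passing a different subset of players to the same functions.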

Potential Danger. Moreover, the context of the players relative to each other can be taken into account. Link and colleagues [15], for example, developed a measure called Dangerousity which captures how threatening a ball carrier is. It is a combination of the Pressure exerted by the defending team, the Zone the player is in (i.e., closer to the goal and inside the penalty box is more threatening), the Control the player has over the ball and the Density of the players around the ball carrier. Pressure is based on the position of the defender(s) with respect to the ball carrier. The closer a defender is to the ball carrier, the higher the pressure. Additionally, pressure is scaled based on the defender’s position with respect to the goal and the ball carrier. The pressure of a defender in the ‘head-on’ zone (between the goal and the ball carrier) is weighted higher than a defender in the ‘hind’ zone (i.e., the ball carrier is in-between the defender and the goal). Zone is a value assigned to each location in the final third of the pitch. The Zone-values increase as the distance to the goal gets smaller, with an additional increase for zones that Link and colleagues deemed threatening (e.g., inside the penalty box). Control is based on the difference in velocity between the ball and the ball carrier, where a small difference indicates high control. Finally, the Density is based on the number of players and how crowded they are on the line between the ball carrier and the goal. For more details on how Dangerousity and its components are defined, see Link and colleagues [15].

Temporal Aggregation. From the available metrics, the event-based features are generated by reducing them to scalar values by systematically compressing the temporal dimension. As can be seen in Table 1, from every metric we construct multiple features. Multiple windows could be examined, but for the sake of simplicity we limit ourselves to one specific window. We aggregate the metrics from 10 until 5 s preceding each event. We opted for a window of 5 s as it captures a relatively short-term process, as many decisive moments have a rather immediate effect. By excluding the time directly preceding the event, we force a more predictive analysis that captures what happens preceding, rather than at the instant of, the event. We aggregate over time by taking the average and the standard deviation of the metrics. Furthermore, each metric is aggregated for each team separately and for some of the metrics there are some more specific variations, as can be seen in Table 1.
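The temporal aggregation can be illustrated with a short sketch. It assumes a per-frame metric series sampled at 10 Hz and a known frame index for the event; the function name `aggregate_window`, the half-open window boundaries, the use of the population standard deviation and the handling of events too close to the start of a recording are all assumptions made for illustration.

```python
import math


def aggregate_window(values, event_index, hz=10, start_s=10.0, end_s=5.0):
    """Mean and standard deviation of a metric over the window that runs
    from start_s until end_s seconds before the event.

    values is a per-frame metric series sampled at hz; event_index is the
    frame at which the event occurs. Returns None when the window would
    start before the first frame (an assumed convention).
    """
    first = event_index - int(round(start_s * hz))  # inclusive
    last = event_index - int(round(end_s * hz))     # exclusive
    if first < 0 or last <= first:
        return None
    window = values[first:last]
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    return mean, std
```

With the defaults, an event at frame 150 is described by frames 50 up to and including 99, that is, 50 samples covering 5 s. Applying the function to each metric, for each team, yields two features (average and standard deviation) per metric per team.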

Table 1. An overview of all metrics and the features that were constructed from them.

2.3 Discovering Patterns

Once the rich and complex positional data has been reduced to a tabular format, the data can be explored for patterns. We will use Subgroup Discovery with the tool Cortana [17]. Subgroup discovery is an exploratory, descriptive data mining technique, targeted at labeled examples. It has previously been shown to be informative in a sports-related setting [13, 14]. A subgroup is a part of the dataset that has a distribution of the target attribute that stands out compared to that of the rest of the dataset.

Take the following example: of a dataset with shot attempts, each shot is labeled as on target or not on target. In the whole dataset, the percentage of shots on targets may be rather low. A subgroup, identified by a (set of) condition(s), of the dataset might have a larger percentage of successful events. It could be, for example, that the percentage of successful events increases when the distance of the ball carrier to the goal is small.
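The shot example can be worked through on a small, entirely made-up dataset. The sketch below computes the prior (share of successful events in the whole dataset), the coverage (share of examples in the subgroup) and the posterior (share of successful events within the subgroup) for one condition. The data, the threshold of 15 m and the name `describe_subgroup` are hypothetical and serve only to illustrate the terminology.

```python
def describe_subgroup(rows, condition, target="on_target"):
    """Return (prior, coverage, posterior) for a single-condition subgroup."""
    covered = [r for r in rows if condition(r)]
    prior = sum(r[target] for r in rows) / len(rows)
    coverage = len(covered) / len(rows)
    if covered:
        posterior = sum(r[target] for r in covered) / len(covered)
    else:
        posterior = float("nan")  # empty subgroup: posterior is undefined
    return prior, coverage, posterior


# Made-up shots: distance to the goal in metres and whether the shot was on target.
shots = [
    {"distance": 6.0, "on_target": True},
    {"distance": 9.0, "on_target": True},
    {"distance": 11.0, "on_target": False},
    {"distance": 14.0, "on_target": True},
    {"distance": 18.0, "on_target": False},
    {"distance": 22.0, "on_target": False},
    {"distance": 25.0, "on_target": True},
    {"distance": 30.0, "on_target": False},
]

# Subgroup condition: the shot was taken from at most 15 m.
prior, coverage, posterior = describe_subgroup(shots, lambda r: r["distance"] <= 15.0)
print(prior, coverage, posterior)  # 0.5 0.5 0.75
```

In this toy dataset, half of all shots are on target, whereas three out of the four shots taken from within 15 m are on target, so the subgroup stands out from the rest of the dataset.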

3 Experiment

In this experiment, we show an implementation of our methodology for the event Turnovers. We defined success based on whether the possession ended inside the opponent’s penalty box. We generated the features as presented in Table 1, which we explored using Subgroup Discovery.

3.1 Data

We used a database with 48 matches from the seasons 2014–2018 from two top-level soccer clubs in the Dutch premier division (‘Eredivisie’). The data was collected by the clubs for performance analysis. The database included matches from the regular competition, the national cup and the Europa League. The clubs obtained written consent from the players to collect, share and store their data. We, in turn, obtained written informed consent from the clubs, to allow us to use the data for scientific purposes. All personal data was anonymized and the principles of the Declaration of Helsinki were adhered to throughout the research project. The X and Y coordinates of all players and the ball were recorded at 10 Hz with a video-based tracking system (SportsVU, STATS LLC, Chicago, IL, USA). For our experiment, we used tracking data only; the ball possession and key events were all computed algorithmically.

Table 2. Overview of significant subgroups (p < 0.05) ranked based on the WRAcc. The coverage and posterior indicate how large and successful a subgroup is. The condition of the subgroup is specified by the interval of a feature constructed from a metric referring to a specific Team (attacking or defending) and aggregation method (average or standard deviation).

3.2 Subgroup Discovery

Our tabular data contained 6729 examples (i.e., Turnovers) and 72 features (see Table 1). The prior was 13.5%, that is, 910 Turnovers took place inside the opponent’s penalty box. To assess the quality of the subgroups, we will use the Weighted Relative Accuracy (WRAcc):

$$\begin{aligned} WRAcc(S,T)=p(S)*(p(T|S)-p(T)), \end{aligned}$$

where S is the subgroup indicator variable (a binary function that decides for each example whether it is covered by the subgroup) and T the target variable. Additionally, we compute the Area Under the Convex Hull of all of the subgroups’ True and False Positive Rates (ROC AUC). Subgroups that have no correlation with the target (i.e., based on a random subset) will lie on the diagonal of the ROC-curve, yielding an ROC AUC of 0.5 (i.e., the naive baseline). We searched at depth 1, meaning that the exploration was restricted to one condition per subgroup. We adopted the intervals strategy, which means that conditions for subgroups could be formulated both as a range and as a cut-off. Through swap-randomization with 100 repetitions we determined that subgroups with a WRAcc of at least 0.076 were not found by chance (p < 0.05).
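The quality measure and the significance threshold can be sketched in a few lines. This is a simplified stand-in rather than Cortana's implementation: the depth-1 search below exhaustively tries every interval whose bounds are observed feature values, and the randomization simply shuffles the target labels, reruns the search and takes an upper quantile of the best qualities found. The names `best_interval` and `chance_threshold` are hypothetical.

```python
import math
import random


def wracc(covered, target):
    """WRAcc(S, T) = p(S) * (p(T|S) - p(T)) for two boolean sequences."""
    n = len(target)
    n_s = sum(covered)
    if n_s == 0:
        return 0.0
    p_s = n_s / n
    p_t = sum(target) / n
    p_t_given_s = sum(t for c, t in zip(covered, target) if c) / n_s
    return p_s * (p_t_given_s - p_t)


def best_interval(feature, target):
    """Depth-1 search: best (quality, (lo, hi)) over intervals lo <= v <= hi.

    Exhaustive over observed values, so only suitable for small examples.
    """
    cuts = sorted(set(feature))
    best = (float("-inf"), None)
    for i, lo in enumerate(cuts):
        for hi in cuts[i:]:
            covered = [lo <= v <= hi for v in feature]
            q = wracc(covered, target)
            if q > best[0]:
                best = (q, (lo, hi))
    return best


def chance_threshold(feature, target, repetitions=100, alpha=0.05, seed=0):
    """Quality that shuffled (uninformative) targets rarely exceed."""
    rng = random.Random(seed)
    shuffled = list(target)
    scores = []
    for _ in range(repetitions):
        rng.shuffle(shuffled)  # break any relation between feature and target
        scores.append(best_interval(feature, shuffled)[0])
    scores.sort()
    k = max(0, math.ceil((1 - alpha) * repetitions) - 1)
    return scores[k]
```

A subgroup found on the real data would then be regarded as significant if its WRAcc exceeds the value returned by `chance_threshold`. With multiple features, the search and the randomization would run over all features and keep the overall best quality per repetition.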

3.3 Subgroups

We found 24 significant (p < 0.05) subgroups with an ROC AUC of 0.627 (see Table 2). With a prior of 13.5%, the increase in the percentage of successful events varied from 1.7 to 10.6 percentage points for the different subgroups. The subgroups ranked highest had the best combination of a percentage-point increase in successful events whilst still covering many examples. Given the similarity of some of the constructed features, we present the similar subgroups together.

Dispersion. Many of the subgroups concern a measure that captures the dispersion of the players on the pitch. Subgroups of offensive sequences where the ball ended inside the penalty box were either characterized by a relatively compact defending team, or a relatively spread out attacking team. Note that although there might be overlap between these subgroups, they do not necessarily concern the same subsets of Turnovers. A compact defending team could refer to specific game situations where all defenders are bunched together, such as a corner or a free kick on the attacking half. A spread-out attacking team might in practice correspond to a counter attack situation, where the attacking players are unorganized and thus spread out.

The width-related subgroups show us that an offensive success is slightly more likely (increase from prior 13.5 to posterior 16.2%) if the defending team is rather narrowly positioned (less than 40.11 m). In contrast, the attacking team should be rather broadly positioned (given that the width of the pitch is 70 m).

Potential Danger. The various components of Dangerousity, and Dangerousity itself, are all normalized between 0 and 1 to denote more (closer to 1) and less (closer to 0) threatening situations. In terms of potential Control, the subgroups indicate that the attacking team should, and the defending team should not, be in control of the ball. Furthermore, the more threatening the Zone, the more likely it is that success follows 5 s later. The Density reflects how many players there were around the ball carrier. The interval of the related subgroups indicates that it should not be too crowded around the ball carrier. The subgroup based on the standard deviation of the compound measure Dangerousity (rank 19) tells us that there must have been a stark increase of Dangerousity.

4 Discussion

The focus of the current paper was on demonstrating the potential value of the relatively new positional tracking data which could be employed to enrich event data. First of all, the scalability and objectivity of current daily practice can be improved by using tracking data. We demonstrated that key events can be identified automatically, making it easier to analyze many matches at the same time and reducing the variable errors. Secondly, the numerous features that can be generated from tracking data can be dealt with by using an exploratory data mining technique. We demonstrated that the most prominent patterns in the data can be discovered among many features by using Subgroup Discovery.

Admittedly, the discovery that counter attacks lead to situations where the ball is likely to end in the opponent’s penalty box will not revolutionize soccer. Nevertheless, being able to quantify the importance of specific game situations, regardless of how obvious these situations are, is a step forward in objectifying soccer analyses. Moreover, our methodology can be tuned to a coaching staff’s specific interests in many ways. The most difficult parameter choice is the label of success. In our case, the label is reaching the opponent’s penalty box, for which it is safe to assume that there is some correlation with success. However, reaching the penalty box will never by itself result in winning a match. In our current approach, we simplified the setting by reducing success to a Boolean. Future implementations of this approach should consider other (numeric) targets as well.

There are also some other notable parameters that could be tuned. It might be that a coach is interested in an entirely different type of key event. One could apply specific conditions to an event (e.g., turnovers on the opponent’s half), but also consider other familiar events such as Passes and Shots on goal. Another aspect that could be examined further is how to compress the temporal dimension. Specifically, the window within which the metrics are aggregated could be further explored. Windows could be chosen to reflect specific short- and/or long-term processes. Currently, how the spatial relations develop over time is often neglected by taking either an arbitrarily chosen window [8, 9] or sometimes an instantaneous value [15, 19]. Our methodology allows for the systematic comparison of various windows, which could yield interesting insights on short- and long-term processes during a match. Moreover, there are other aggregation functions that could be considered in addition to the average and standard deviation that we included in the current analysis. Particularly the minimum and maximum could be interesting, as in soccer success can be the result of seizing a small window of opportunity. For example, Link and colleagues [15] aggregate their Dangerousity measures over time by looking at the ‘peak danger’ during specific periods of time. Finally, it is possible to extend the metrics that are included in the analysis. Although we implemented a broad range of metrics, the list is definitely not exhaustive. There are many more existing and yet-to-be-formulated metrics that could be incorporated in our methodology. Most notably, there are many more ways to quantify how an area on the pitch is controlled [3, 5, 21, 24, 25]. Each of these features could extend our methodology to cover more ground in finding the key tactics that lead to success.

Furthermore, to link the findings from our methodology to practice, it is pertinent to create a tool that demonstrates the metrics and their subgroups. Therefore, future work would extend the practical value of our modelling approach by creating an interactive and dynamic plotting tool where the (sometimes rather abstract) features are plotted over time, for example in combination with the positions of the players on the pitch in a two-dimensional bird’s eye view. Also, the discovery of actionable patterns should be aimed at identifying strengths and weaknesses of specific teams with respect to each other. By contrasting two teams, a head-to-head comparison could be made that informs about what works against whom, and vice versa (i.e., which strategy is typically successful for a specific team).

From a scientific point of view, it is important to note that findings from data mining are not the same as scientific facts in the traditional sense. For the more applied sports science domain, these findings can still be highly informative as a basis for generating new data-driven hypotheses. The findings from our approach could be used to inform about a fingerprint of tactical behavior. With such a fingerprint, pertinent questions in the sports science domain can be further examined, such as how tactics develop with age, or the differences between countries. With better tools to analyze tactics in soccer, the next step will be to test the outcome of a specific intervention. When this can be reliably done, this kind of analysis could be used in a tool for coaches. Our long-term vision is that practitioners can experiment with their tactics in a ‘Cockpit’ that would help them come up with the best intervention for the specific situation at hand.

4.1 Conclusions

Tracking data has the clear potential to add context of the goings-on on the pitch to key events. Our methodological approach shows that interpretable and actionable results could be obtained by systematically exploring many metrics and determining how well they represent success and with which thresholds. Our work also shows that care must be taken in generating features, as it is inevitable that many arbitrary decisions have to be made. With more data available, data mining techniques should be employed to critically assess which metrics best represent success. Future work should focus on further developing the metrics that represent the context-of-play. Moreover, future analyses should take the playing style of specific teams, and maybe even players, into account. By making a head-to-head comparison between teams, the contrasts could best highlight the strengths and weaknesses for a specific team against another specific team.

Our proposed modelling approach can be used to further understand tactics in invasion-based team sports by comparing specific targets, teams and styles of play. When the full potential of tracking data is captured, it will affect the way soccer and possibly other team sports are played.