Secrets of soccer: Neural network flows and game performance

https://doi.org/10.1016/j.compeleceng.2019.106505

Abstract

Soccer is the most popular sport in the world, currently with over three billion fans. There are various reasons for this success, but a unique feature of soccer stands out: every match has a high level of unpredictability. For instance, it is not uncommon for a less skilled team to defeat a better team. Moreover, a match that is apparently decided in favor of a team can suddenly change course, even with a few minutes left, ending with a completely opposite result. These highly dramatic effects have brought popularity to this sport, making every match a complex event where the outcome is far from guaranteed. This same complexity has however challenged all attempts at data analysis: the “secrets of soccer”, that is to say the recipes for success, are still an unknown realm, defying all common statistical approaches. In this study we try to shed some light on these secrets by introducing a novel approach that uses neural network flows. We transform a team play into a corresponding brain-like structure, an abstraction that we analyze using measures of efficiency, assessing the “quality of thinking” of the brain. This way, we can view any soccer match as an alternate battle of minds and explore how far this parallelism can help to solve some fundamental open problems, like finding an effective recipe for success, and establishing the best field control strategies.

Introduction

The game of soccer has in recent years gained enormous traction, becoming widespread in almost all parts of the world. For example, soccer is the number one sport for global audience, TV viewers, internet popularity, number of professionals, market value and more. Its success is due to several factors, including its unpredictability: lovers of this sport praise the fact that soccer is fascinating because of its apparent lack of determinism. Having a better team does not guarantee success, and even teams belonging to inferior leagues can sometimes beat world-famous top teams. Moreover, even within a single game playing better does not guarantee success: a team can be vastly superior as far as all common statistics (like ball possession, shots on target and so on) are concerned, and still lose the game. An extreme case of this paradox occurred in the English Premier League, when a team of underdogs (Leicester City Football Club) won the 2016 title: Leicester won many matches despite very often being puzzlingly behind in most game statistics. This uncertainty has brought soccer to an exceptional level of popularity: each game, despite the level of strategy put in by managers and players, still has some surprise factor.

Along with the success of soccer, there has also been interest in studying the complexity of the game and unraveling some of the strategies that influence the final score. However, even with the increasing progress in technology and data science techniques, the secrets of this game have proved hard to unlock. There have been many works on soccer analysis using various tools (see for instance [1], [2], [3], [4], [5]), studies trying to identify the various kinds of team formation [6], as well as studies performing general statistical analysis to help understand the mysteries of this sport [7]. However, despite all the attempts to define and analyze significant metrics for soccer analysis (see e.g. Clemente et al. [8]) or to make sense of statistical data on matches [9], soccer has remained a baffling quest for data analysis, even with the growing availability of big data sets (cf. Rein and Memmert [10]). In fact, the most commonly used statistics for team performance (like total shots, shots on target, shots off target, ball possession, number of offsides, fouls received, corners and so on) have not managed to capture what makes a team win a match (cf. Clemente et al. [8]). This complexity is part of the fascination of the game: whatever the reasons that eventually make a team win a single match, they are neither simple nor obvious.

In this paper, which is a revised and expanded version of [11], we tackle the problem of soccer analysis using a different approach. The key point is to shift perspective, starting from a fundamental question: what is the central aspect to consider in a soccer game? The most natural answer would be the players: they are the prime actors in the game, and they determine success or failure. Having better players thus looks like an obvious recipe for success. Following this intuition, works in the literature have studied individual player behavior, attempting to grasp how such behavior can actually impact the game (see for instance [12], [13], [14], [15]). However, as said earlier, these approaches have not managed to provide definitive answers, failing to find meaningful metrics for success. In this work we change perspective, and do not consider the individual soccer players as the main actors of the game. Instead, we view players as subordinate to another component: the play field. The soccer field itself is considered the primary component, and players are seen as functional objects that enable information exchange among the various zones of the field. In doing so, we turn each match into a kind of brain-like network, where ball passing becomes an instance of a neural network flow, and the whole entity “playfield plus players plus ball passing” becomes the equivalent of a so-called Team Brain. This way, instead of looking at a match as a soccer battle, we view it as a corresponding “battle of minds”: whoever thinks better wins.

In this study we introduce the approach, and then test it on real data from an international soccer tournament. The results show that this parallelism seems to work and can provide valuable insights that unlock at least some of the complexity of the game. This allows us to go beyond classic statistical approaches (which proved to be unsuitable for the game), and to start answering the fundamental question that is the key to this sport: what really makes a team win or lose a single soccer match? We show how the quality (efficiency) of the corresponding Team Brain looks like the secret key to winning a game, correlating qualitatively with actual win-draw-lose results. Moreover, the Team Brain model also provides quantitative information about the final score, making it possible to guess the result of a game with high probability.

Extending the analysis, we also investigate the Team Brain dynamics with respect to time. In order to win, do we need a Team Brain with a few intense moments of thought, or is a steady thinking flow better? In other words, is the recipe for success given by peaks of brilliant play, or is it more important to be consistent all along the match?

Another important problem we tackle is in-game prediction power: how well can we infer the final result in mid-game, without waiting for the complete end-of-game data?

Last but not least, we examine whether there are areas of the playfield that are more important than others, and find some surprising results about the perception of a Team Brain. We show that the “mind” composed by a Team Brain views the play field using different geospatial metrics, and this discrepancy from our classical perception could also explain the previous difficulties in identifying significant statistics for soccer. Relatedly, we also focus on the actual physical size of the neural zones and investigate which areas are the most meaningful to consider in order to build a winning Team Brain.

Answering all these questions provides interesting food for thought and helps to unravel some of the secrets of this complex sport.

The paper is organized as follows. In Section 2 we introduce the concept of Team Brain, transforming the match of a team into a brain-like network structure. In Section 3 we introduce the machinery that allows us to define how efficient such a structure can be, which is the prelude to Section 4, where we recast a match between teams as the new concept of “battle of minds”. In Section 5 we apply this approach to real game data, showing how the performance of the Team Brain is a key metric correlated to the final result of a game. In Section 6 we delve deeper, and see whether we can actually get a result predictor for a game. In Section 7 we study the impact of time on the battles of minds, and investigate its relationship with goals and game progression. We then proceed in Section 9 to analyze whether there are zones of the field that are more critical than others for the success of a team, obtaining rather surprising results that revisit concepts like midfield, forward, wings and so on. Section 10 then critically analyses the concept of neural zone with respect to size, trying to determine the meaningful physical size range of the neural zones. In Section 11 we briefly hint at some possible future lines of research expanding on the current analysis. Finally, Section 12 ends the paper summarizing the results.

Section snippets

The team brain

In this Section we introduce the concept of Team Brain and show how to turn a soccer match into a corresponding neural structure. As hinted before, we reverse the classic approach of considering the team players as first-class actors, abstracting from them and focusing instead on the playfield as the primary factor, seeing players as components of a bigger cyber-structure. In particular, we consider team players as functional to one basic action: information exchange. In this interpretation of the

Measuring team brains

In the previous section we have seen how to turn the game play of a team into a Team Brain, represented by a multigraph network. The next step is to define a measure telling us how well this brain can “reason”, following the underlying idea that better Team Brains would then correspond to winning teams. Given that the Team Brain is modeled with the idea of neural zones that exchange information, we use the notion of network efficiency introduced in [16], [17]. As the name suggests, network
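As a rough illustration of the kind of measure involved, the sketch below computes the usual global efficiency of a weighted directed graph, E = (1 / (N(N-1))) Σ_{i≠j} 1/d(i, j), where d(i, j) is the shortest path length between zones i and j. It assumes that this is the efficiency notion referred to in [16], [17]; how the paper actually turns the pass multigraph into edge lengths is not shown in this excerpt, so the helper `edges_from_passes` uses a purely hypothetical weighting (length = 1 / number of passes), and all function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import inf


def edges_from_passes(passes):
    """Collapse a multigraph of passes into weighted directed edges.

    ASSUMPTION (illustrative only, not taken from the paper): an edge that
    carries more passes is treated as "shorter", with length 1 / count.
    """
    counts = Counter(passes)
    return {pair: 1.0 / count for pair, count in counts.items()}


def shortest_paths(n, edges):
    """All-pairs shortest path lengths via Floyd-Warshall."""
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for (i, j), length in edges.items():
        if i != j:  # passes inside a single zone do not connect zones
            dist[i][j] = min(dist[i][j], length)
    # product() varies its first element slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def global_efficiency(n, edges):
    """Mean of 1 / d(i, j) over ordered pairs i != j; unreachable pairs add 0."""
    if n < 2:
        return 0.0
    dist = shortest_paths(n, edges)
    total = sum(
        1.0 / dist[i][j]
        for i in range(n)
        for j in range(n)
        if i != j and dist[i][j] != inf
    )
    return total / (n * (n - 1))


# Toy example with 3 zones: two passes 0 -> 1 and one pass 1 -> 2.
toy_passes = [(0, 1), (0, 1), (1, 2)]
toy_efficiency = global_efficiency(3, edges_from_passes(toy_passes))
```

With unit lengths, a fully connected graph reaches the maximum efficiency of 1, while zones that cannot reach each other contribute nothing to the sum, which is what makes the measure usable on sparse pass networks.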

Team performance and battles of minds

The alternate view of a soccer match as a “battle of minds” then allows us to look for correlations between these two worlds.

However, when dealing with the soccer world, we first have to define what level of detail we are interested in. For instance, we might be interested either in the final score of the game (a very precise outcome), or in a lower level of detail, such as just knowing who the winner is, independently of the number of goals scored. We investigate both choices, defining

Correlation results

Having defined how to turn a soccer match into a battle of minds leads to the next step: is there a correlation between these two views of the game? In other words, is the concept of a Battle of Minds (with its related concept of better thinking given by the measure of network efficiency) truly representative of the game result? Can we semantically compress a complex game dynamic into this abstraction, therefore obtaining insights on the fundamental forces leading to winning or losing a game?

In

Prediction results

Apart from correlation, we might also want to investigate the relationship between battles of minds and the actual result of a match in a different way. Correlation tells us there is a general correspondence between these values, but like every average value there might also be cases where the correspondence is not perfect (and so, for a few individual matches, the battle of minds might not perfectly coincide with the outcome of a match). So, in addition to a global correlation measure, we can
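One simple way to look at individual matches rather than averages is to count how often the more efficient Team Brain belongs to the team that actually won. The sketch below is only an illustration under stated assumptions: the efficiency values and scores are made-up toy numbers, the tolerance `tol` used to call a battle of minds a draw is hypothetical, and the excerpt does not say how the paper defines its predictor.

```python
def sign(value, tol=0.0):
    """Return 1, -1 or 0 depending on whether value is above, below or within tol."""
    if value > tol:
        return 1
    if value < -tol:
        return -1
    return 0


def battle_of_minds(eff_a, eff_b, tol=0.0):
    """Outcome of the battle of minds: 1 if brain A is more efficient, -1 if B, 0 if tied."""
    return sign(eff_a - eff_b, tol)


def match_result(goals_a, goals_b):
    """Win-draw-lose result from team A's point of view: 1, 0 or -1."""
    return sign(goals_a - goals_b)


def hit_rate(matches, tol=0.0):
    """Fraction of matches where the battle of minds agrees with the real result."""
    if not matches:
        raise ValueError("at least one match is required")
    hits = sum(
        battle_of_minds(eff_a, eff_b, tol) == match_result(goals_a, goals_b)
        for eff_a, eff_b, goals_a, goals_b in matches
    )
    return hits / len(matches)


# Made-up toy data: (efficiency A, efficiency B, goals A, goals B).
toy_matches = [
    (0.62, 0.48, 2, 0),  # more efficient brain wins
    (0.51, 0.55, 0, 1),  # more efficient brain wins
    (0.50, 0.50, 1, 1),  # tied brains, drawn match
    (0.58, 0.44, 0, 1),  # more efficient brain loses: a miss
]
toy_hit_rate = hit_rate(toy_matches)
```

A per-match hit rate of this kind complements a correlation coefficient: it makes the exceptions visible one by one instead of folding them into a single average.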

Global versus local time

So far, we have correlated the outcome of battles of minds with the game result. The very definition of battles of minds relies on calculating the overall efficiency of the brain-like structure generated during the whole game. One might wonder whether in fact there are also local variations of efficiency that matter during the match. In other words, is the overall result the by-product of a “thinking process” that has to take into account the whole match, or can we also focus on smaller time

Underlying field structure

We have previously stressed the fact that the basic constitutive layer of the Team Brain is made by the physical soccer field. The correlation results between Team Brain efficiency and performance then tell us that what really matters is the ball movement, that is the equivalent of a transmission exchange (a “thought” in brain terminology). This information exchange (ball passing) occurs within a geometrical environment (the soccer field), with its notion of distance.

We might wonder how much

Critical areas and field perception

Given that distances do matter, we might go further, and ask an even more challenging question. Does the Team Brain perceive the field in exactly the way we perceive it? In other words, the Team Brain interacts with the underlying field structure, and dropping its metrical information (as seen previously) degrades performance. But the metric information we have provided to the Team Brain is the plain one given by field geometry: Euclidean distances on the field. It might be that the Team Brain

Neural zones size

In order to extract a brain-like network structure from the playfield, we have divided it into 26 zones. Apart from the two special zones corresponding to the goals, one might wonder whether this division is the only possible one. While it is an intuitively reasonable division, allowing us to distinguish between center and wing areas (in the horizontal axis), and between forward and backward zones in each midfield, the question may arise of whether there are other
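To make the idea of varying the zone size concrete, here is a minimal sketch that maps a ball position to a zone of a regular grid. The actual layout of the 24 field zones is not given in this excerpt, so the 6 x 4 grid, the 105 m x 68 m pitch and the names `zone_of` and `zone_count` are all hypothetical choices for illustration only.

```python
FIELD_LENGTH = 105.0  # metres along the goal-to-goal axis (assumed pitch size)
FIELD_WIDTH = 68.0    # metres across the pitch (assumed pitch size)


def zone_of(x, y, cols=6, rows=4):
    """Index of the grid zone containing the point (x, y).

    x runs along the length of the field and y across its width; zones are
    numbered row by row, so index = row * cols + col. The default 6 x 4
    grid is a hypothetical layout that happens to give 24 field zones.
    """
    if not (0 <= x <= FIELD_LENGTH and 0 <= y <= FIELD_WIDTH):
        raise ValueError("position outside the field")
    # min() keeps points lying exactly on the far boundary in the last zone.
    col = min(int(x / FIELD_LENGTH * cols), cols - 1)
    row = min(int(y / FIELD_WIDTH * rows), rows - 1)
    return row * cols + col


def zone_count(cols=6, rows=4):
    """Total number of zones: the grid cells plus the two special goal zones."""
    return cols * rows + 2


center_zone = zone_of(52.5, 34.0)                 # default 6 x 4 grid
coarse_zone = zone_of(52.5, 34.0, cols=3, rows=2)  # coarser 3 x 2 grid
```

Changing `cols` and `rows` gives finer or coarser divisions of the same field, which is the kind of variation one would need in order to compare Team Brains built on different zone sizes.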

Future work

To further expand on the current analysis, the key factor for future work is collecting more data at a sufficient level of precision: data about match results are widespread, whereas data about ball passes are available commercially but not academically. In fact, such larger datasets could be collected automatically using computer vision techniques for ball tracking (see for example Kamble et al. [21]), allowing us not only to better validate the results presented here but also to reason at a

Conclusions

In this study we have introduced a new model of soccer analysis, based on the key intuition that players are not first-class entities, but are rather parts of a primary entity (the soccer field), and are functional to the common goal of information exchanges among its various zones. The soccer field becomes the equivalent of a brain container, and the players play the role of neurons, thus acting within a combined cyber-entity by activating neural network flows. This way we are able to turn

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Massimo Marchiori is Professor at UNIPD (Italy) and CTO of EISMD (Belgium). At MIT/W3C (USA) he led the development of several world standards. He created Hypersearch (Google’s forerunner), Volunia (the world-first social search engine), Negapedia (the negative version of Wikipedia). He won several awards, including the IBM research award, the Microsoft Data Science Award, the MIT TR35 award.

References (25)

  • P. Halvorsen et al.

    Bagadus: An Integrated System for Arena Sports Analytics: A Soccer Case Study

    Proceedings of the 4th ACM multimedia systems conference. MMSys ’13

    (2013)
  • C. Perin et al.

    Soccerstories: a kick-off for visual soccer analysis

    IEEE Trans Vis Comput Graph

    (2013)
  • A. Rehman et al.

    Features extraction for soccer video semantic analysis: current achievements and remaining issues

    Artif Intell Rev

    (2014)
  • H. Janetzko et al.

    Feature-driven visual analytics of soccer data

    2014 IEEE conference on visual analytics science and technology (VAST)

    (2014)
  • M. Schlipsing et al.

    Adaptive pattern recognition in real-time video-based soccer analysis

    J Real-Time Image Process

    (2017)
  • A. Bialkowski et al.

    Large-scale analysis of soccer matches using spatiotemporal tracking data

    2014 IEEE international conference on data mining

    (2014)
  • M. Shafizadeh et al.

    Performance consistency of international soccer teams in euro 2012: a time series analysis

    J Hum Kinet

    (2013)
  • F.M. Clemente et al.

    Computational metrics for soccer analysis: connecting the dots

    (2017)
  • J. Castellano et al.

    The use of match statistics that discriminate between successful and unsuccessful soccer teams

    J Hum Kinet

    (2012)
  • R. Rein et al.

    Big data and tactical analysis in elite soccer: future challenges and opportunities for sports science

    Springerplus

    (2016)
  • M. Marchiori et al.

    The team brain: Soccer analysis and battles of minds

    Proceedings of the IEEE cyber science and technology congress (CyberSciTech 2018)

    (2018)
  • W. Gregson et al.

    Match-to-match variability of high-speed activities in premier league soccer

    Int J Sports Med

    (2010)

Marco de Vecchi, data analyst, graduated in Computer Science at the University of Padua (Italy) under the supervision of Professor Massimo Marchiori.

This paper is for CAEE special section SI-csc. Reviews processed and recommended for publication to the Editor-in-Chief by Guest Editor Dr. Xiaokang Zhou.
