Football Pass Prediction Using Player Locations

Fournier-Viger, Philippe; Liu, Tianbiao; Chun-Wei Lin, Jerry

doi:10.1007/978-3-030-17274-9_13

Conference paper
First Online: 07 April 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11330)

Abstract

In many sports, predicting the passing behavior of players is desirable as it provides insights that can help to understand and improve player performance. In this paper, we describe a novel model for football pass prediction, developed to participate in the Prediction Challenge of the 5th Workshop on Machine Learning and Data Mining for Sports Analytics, collocated with ECML PKDD 2018. The model, called Football Pass Predictor (FPP), considers various aspects to generate predictions, such as the distance between players, the proximity of players from the opposite team, and the direction of each pass. Experimental results show that the model can achieve a prediction accuracy of 33.8%, and more than 50% if two guesses are allowed. This is considerably more than the random predictor, which obtains 8.3%.

1 Introduction

Passing is a key aspect of football. It is performed between players, occurs almost everywhere on the pitch, and creates scoring opportunities. According to statistics from the Union of European Football Associations (UEFA), during the 2017–2018 season, a top European team could perform more than 400 passes in a Champions League match [1]. Passing can account for the majority of tactical/technical behaviors in a game. Many researchers have studied passing behavior from the perspective of football tactics and techniques [2, 3]. Nowadays, with the development of data science and computer technology, researchers can analyze and simulate passing using massive databases and complicated models to acquire a deeper understanding of passing behaviors. For instance, Liu et al. [4, 5] evaluated the effect of passing on creating scoring opportunities using a Markov chain model. In another study, an Apriori-based algorithm was applied to perform a descriptive analysis of passing behavior [4, 5]. A diagnostic analysis of football passes was also done using a sequential rule discovery approach [6] based on the RuleGrowth algorithm [7].

Passing behavior is influenced by many factors, and a player must react quickly to sudden changes in the situation around him. Considering the playing context and different playing situations, Stöckl et al. [10] provided an approach to describe the tactical difficulty of passes. In another study, Rein et al. [11] assessed passing effectiveness in elite soccer using two algorithms that consider the number of defenders and players' control of space. Gyarmati et al. [12] developed the QPass evaluation system to estimate players' roles in building up an attack. Another way to identify key players in a team is to use network analysis and to consider pass difficulty [13]. Generally, passing differs according to the context. Sometimes passes create dangerous situations for opponents, while at other times few risks are involved. Cakmak et al. [14] developed a descriptive model to quantify the effectiveness of passes and identify key passes and regular passers in a team. Using player trajectory technology, Lida and Mase [15] studied the ball passing behavior of players by considering their trajectories. Dhar and Singh [16] analyzed video footage to develop a passing strategy. Although all these studies analyze football passing behavior, they do not provide a computational model that can be used for pass prediction, even though pass prediction could have many applications.

The contribution of this paper is to fill this void by proposing a model named FPP (Football Pass Predictor) to predict the player who will receive a pass initiated by another player. This work is done in the context of the Prediction Challenge of the 5th Workshop on Machine Learning and Data Mining for Sports Analytics, collocated with ECML PKDD 2018. The proposed model considers various aspects to generate predictions, such as the distance between players, the proximity of players from the opposite team, and the direction of each pass. Experimental results show that the model achieves a considerably higher prediction accuracy than a random predictor.

The rest of this paper is organized as follows. Section 2 provides a brief description of the provided dataset, and key observations that were made. Section 3 presents the proposed FPP model. Section 4 presents experimental results. Finally, Sect. 5 draws a conclusion.

2 Observations About the Data

The dataset provided for the prediction challenge contains 12,124 records describing passes from 14 football matches of a Belgian team and opposing teams during the 2014/2015 football season. In a football match, two teams face each other, and each team has 14 players (including 3 substitutes). A database record describes a pass. It provides (1) the locations of the 14 football players of each team using 2D coordinates, (2) the times at which the pass started and ended, (3) the player who sent the ball, and (4) the player who received the ball. Coordinates are expressed in the [−5250, 5250] and [−3400, 3400] intervals for the X and Y axes, respectively. Note that neither player names nor team names are indicated in the data. Moreover, it is not indicated whether the positions of the players were recorded when a pass starts or when it ends. Besides, although timestamps are provided in the data, records from all matches were put in a single file and randomly shuffled. Thus, each pass can only be considered individually rather than in the context of a match. The data was collected by the prediction challenge organizers and made available at https://github.com/JanVanHaaren/mlsa18-pass-prediction.

By analyzing the data, the authors of this paper made a few interesting observations. First, out of the 12,124 passes, only 17% are intercepted by the opposite team. Because unsuccessful passes are much less frequent than successful ones, a design principle for the proposed model is to assume that all passes will be successful when making predictions. Second, it was found that the 163rd line of the dataset is an invalid record. In that record, player number 15, who sends the ball, has no coordinates. This record has been ignored. Third, although the dataset provides timestamps, it is difficult to use this information for pass prediction, since consecutive records are often separated by numerous seconds, and the positions of players are given only once for each pass although each record contains two timestamps. For this reason, the trajectories of players are not available, and it is hence difficult to analyze each pass in the context of the overall game. Fourth, a related issue is that records from all matches are stored together in the dataset. Thus, it is unclear which passes belong to which match. Besides, it is not indicated which team is playing on which side of the football field. We have inferred this information by assuming that the left (right) side of the pitch belongs to the team having the leftmost (rightmost) player.
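The side-inference heuristic described in the last observation can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the function name `infer_left_team`, the representation of a team as a list of (x, y) tuples, and the handling of ties are assumptions made only for illustration.

```python
def infer_left_team(team_a, team_b):
    """Guess which team owns the left side of the pitch.

    Each team is a list of (x, y) positions. Following the heuristic in the
    text, the team with the leftmost player (smallest x) is assumed to own
    the left side. Ties go to team A, which is an arbitrary choice.
    """
    min_x_a = min(x for x, _ in team_a)
    min_x_b = min(x for x, _ in team_b)
    return "A" if min_x_a <= min_x_b else "B"


# Hypothetical positions within the [-5250, 5250] x [-3400, 3400] pitch.
team_a = [(-5000, 0), (-1200, 300), (400, -900)]
team_b = [(4900, 50), (1500, -200), (-300, 700)]
print(infer_left_team(team_a, team_b))  # prints "A"
```

In practice the leftmost player is often the goalkeeper, which is presumably why this simple rule is workable, although the text does not state this.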

3 The FPP Model

Based on the observations made on the data, the proposed FPP model was developed. To design the model, an iterative approach was used in which several versions of the model were successively designed, each adding criteria to increase prediction accuracy. Among the multiple versions of the model, four are described in this paper. These versions, in ascending order of complexity, are called M1, M2, M3, and FPP, respectively. They are described in the following paragraphs and illustrated in Fig. 1.

M1. The first model is based on the assumption that the sender will pass the ball to the closest player of his team. Assume that a player X has the ball and that we want to predict who will receive the ball. Let P be the set of players from the same team as X (excluding X). For each player \(Y \in P\), the Euclidean distance between X and Y is calculated, denoted as \(d(X,Y)\). Then, each player \(Y \in P\) is assigned a score defined as \(score(Y) = d(X,Y)\). The player with the smallest score is chosen as the prediction.
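As a rough illustration of M1, the following Python sketch picks the teammate with the smallest Euclidean distance to the sender. The function name `predict_m1` and the dictionary layout (player id mapped to an (x, y) tuple) are hypothetical and are not taken from the authors' Java code.

```python
import math


def predict_m1(sender, teammates):
    """Return the id of the teammate closest to the sender.

    sender: (x, y) position of player X.
    teammates: dict mapping player id -> (x, y), excluding the sender.
    score(Y) = d(X, Y); the smallest score wins.
    """
    return min(teammates, key=lambda pid: math.dist(sender, teammates[pid]))


# Hypothetical example: distances are 1000, 500 and about 2002.
teammates = {2: (1000, 0), 3: (300, 400), 4: (-2000, 100)}
print(predict_m1((0, 0), teammates))  # prints 3
```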

M2. The second model is an improvement of the M1 model. An additional idea is considered, which is that a player may be less likely to receive the ball if a player of the opposite team is close to him. The motivation is that the sender may consider this situation riskier, since the opposite team player may intercept the ball. Formally, let O be the set of players from the opposite team. For each player \(Y \in P\), its score is defined as \(score(Y) = d(X,Y) + penaltyC(Y,O)\). The term penaltyC is defined as \(penaltyC(Y,O) = 900\) if there exists a player \(Z \in O\) such that \(d(Y,Z) < 700\), and otherwise \(penaltyC(Y,O) = 0\). The values 700 and 900 were found empirically (by trial and error) to obtain a high prediction accuracy.
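The M2 score can be sketched in Python as follows. The constants 700 and 900 come from the text; the names `CLOSE_RADIUS`, `PENALTY_C`, `penalty_c`, `score_m2`, and `predict_m2`, as well as the data layout, are assumptions for illustration only.

```python
import math

CLOSE_RADIUS = 700  # threshold on d(Y, Z) reported in the text
PENALTY_C = 900     # penalty value reported in the text


def penalty_c(candidate, opponents):
    """900 if at least one opponent is within 700 units of the candidate."""
    is_marked = any(math.dist(candidate, z) < CLOSE_RADIUS for z in opponents)
    return PENALTY_C if is_marked else 0


def score_m2(sender, candidate, opponents):
    """score(Y) = d(X, Y) + penaltyC(Y, O)."""
    return math.dist(sender, candidate) + penalty_c(candidate, opponents)


def predict_m2(sender, teammates, opponents):
    """Return the id of the teammate with the smallest M2 score."""
    return min(teammates,
               key=lambda pid: score_m2(sender, teammates[pid], opponents))


# Hypothetical example: player 2 is closer but has an opponent nearby.
teammates = {2: (500, 0), 3: (0, 1200)}
opponents = [(600, 100)]
print(predict_m2((0, 0), teammates, opponents))  # prints 3
```

In this example M1 would choose player 2 (distance 500), whereas the penalty raises that score to 1400, so M2 selects player 3 (score 1200).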

M3. The third model is an improvement of M2, which considers that more than one player from the opposite team may be close to a potential receiver and increase risks. For each player \(Y \in P\), its score is defined as \(score(Y) = d(X,Y) + penaltyC(Y,O) + penaltyD(Y,O)\). The term penaltyD is defined as \(penaltyD(Y,O) = 55\) if there exist at least two players \(Z \in O\) such that \(d(Y,Z) < 700\), and otherwise \(penaltyD(Y,O) = 0\). The value 55 was found empirically.
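A possible reading of the M3 score is sketched below in Python. It assumes that penaltyD applies whenever two or more opponents are within the 700-unit radius, and that it is added on top of penaltyC. The function and constant names are hypothetical.

```python
import math

CLOSE_RADIUS = 700  # threshold on d(Y, Z) reported in the text
PENALTY_C = 900     # applied when at least one opponent is close
PENALTY_D = 55      # additionally applied when two or more are close


def count_close_opponents(candidate, opponents):
    """Number of opponents strictly within CLOSE_RADIUS of the candidate."""
    return sum(1 for z in opponents if math.dist(candidate, z) < CLOSE_RADIUS)


def score_m3(sender, candidate, opponents):
    """score(Y) = d(X, Y) + penaltyC(Y, O) + penaltyD(Y, O)."""
    near = count_close_opponents(candidate, opponents)
    penalty_c = PENALTY_C if near >= 1 else 0
    penalty_d = PENALTY_D if near >= 2 else 0
    return math.dist(sender, candidate) + penalty_c + penalty_d


# Hypothetical example: two of the three opponents are close to the candidate.
opponents = [(1100, 0), (900, 100), (4000, 0)]
print(score_m3((0, 0), (1000, 0), opponents))  # 1000 + 900 + 55
```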

FPP. The fourth model is an improvement of the M3 model, which considers the direction of the ball, based on the assumption that a player prefers to send the ball forward. Let Z.x denote the position of a player Z on the x axis. The score of a player \(Y \in P\) is defined as \(score(Y) = d(X,Y) + penaltyC(Y,O) + penaltyD(Y,O) + direction(X,Y)\). The term direction(X, Y) is defined as \(-0.3 \times |X.x - Y.x|\) if the pass is a forward pass (toward the opposite team's goal) or as \(0.1 \times |X.x - Y.x|\) if the pass is a backward pass. The values 0.1 and 0.3 were found empirically as providing the best results.
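Putting the four terms together, the full FPP score might be sketched in Python as below. This is an illustrative reading of the formulas rather than the authors' Java code. The boolean `attacks_right` (whether the sender's team attacks toward positive x) is an assumed input, standing in for the side inferred from the leftmost player, and all names are hypothetical.

```python
import math

CLOSE_RADIUS = 700      # threshold on d(Y, Z) reported in the text
PENALTY_C = 900         # one or more close opponents
PENALTY_D = 55          # two or more close opponents
FORWARD_WEIGHT = -0.3   # bonus (negative cost) for forward passes
BACKWARD_WEIGHT = 0.1   # cost for backward passes


def fpp_score(sender, candidate, opponents, attacks_right):
    """score(Y) = d(X,Y) + penaltyC(Y,O) + penaltyD(Y,O) + direction(X,Y)."""
    near = sum(1 for z in opponents if math.dist(candidate, z) < CLOSE_RADIUS)
    penalty_c = PENALTY_C if near >= 1 else 0
    penalty_d = PENALTY_D if near >= 2 else 0
    dx = candidate[0] - sender[0]
    # A pass is forward if it moves toward the opponents' goal.
    is_forward = (dx > 0) if attacks_right else (dx < 0)
    direction = (FORWARD_WEIGHT if is_forward else BACKWARD_WEIGHT) * abs(dx)
    return math.dist(sender, candidate) + penalty_c + penalty_d + direction


def predict_fpp(sender, teammates, opponents, attacks_right):
    """Return the id of the teammate with the smallest FPP score."""
    return min(
        teammates,
        key=lambda pid: fpp_score(sender, teammates[pid], opponents,
                                  attacks_right),
    )


# Hypothetical example: player 2 is closer but behind the sender.
teammates = {2: (-800, 0), 3: (1000, 0)}
print(predict_fpp((0, 0), teammates, [], attacks_right=True))  # prints 3
```

Here the backward candidate scores 800 + 0.1 × 800 = 880 and the forward candidate 1000 − 0.3 × 1000 = 700, so the forward pass is preferred even though its receiver is farther away.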

4 Experimental Evaluation

An experimental evaluation was performed to evaluate the four versions of the designed FPP model. The models were compared with a random predictor as baseline. Since no performance measure was explicitly specified for the prediction challenge, it was decided to evaluate the models in terms of accuracy (the number of correct predictions divided by the total number of passes to be predicted). Furthermore, the accuracy when two guesses are allowed was measured. In that situation, a model can make two predictions for each pass, and if one of them is right, the pass is considered correctly predicted. The source code of the proposed model and evaluation framework, written in Java, can be downloaded from http://philippe-fournier-viger.com/foot2018/.
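The two measures described above (accuracy with one guess and with two guesses) can be computed with a short Python sketch. It assumes each model outputs a list of candidate receivers ranked from best to worst score. The function name `top_k_accuracy` and the toy data are hypothetical.

```python
def top_k_accuracy(ranked_predictions, actual_receivers, k=1):
    """Fraction of passes whose true receiver is among the first k guesses.

    ranked_predictions: one list of player ids per pass, best guess first.
    actual_receivers: the true receiver id for each pass.
    """
    hits = sum(
        1
        for ranked, actual in zip(ranked_predictions, actual_receivers)
        if actual in ranked[:k]
    )
    return hits / len(actual_receivers)


# Toy data for four passes (not taken from the challenge dataset).
ranked = [[3, 5, 2], [7, 1, 4], [9, 8, 6], [2, 3, 4]]
actual = [3, 1, 6, 5]
print(top_k_accuracy(ranked, actual, k=1))  # 0.25
print(top_k_accuracy(ranked, actual, k=2))  # 0.5
```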

Table 1. Accuracy of the compared models

Results are shown in Table 1. It is first observed that the heuristic of predicting that the ball is passed to the closest player of the same team achieves a high accuracy (27.44%) compared to the baseline random predictor (7.88%). Then, adding a penalty when a player from the opposite team is close to a potential receiver increases accuracy by more than 5 percentage points, from 27.44% (M1) to 32.83% (M2). If this model is further extended for the case of two players from the opposite team, the accuracy increases slightly, from 32.83% (M2) to 32.97% (M3). Moreover, if the direction of the ball is considered, the accuracy increases from 32.97% (M3) to 33.38% (FPP). Finally, if two guesses are allowed, the accuracy of the best model (FPP) increases to 51.81%.

Besides the described models, the authors of this paper have also tried various other ideas, including calculating the angle between players to determine whether a player from the opposite team may intercept a pass. But these ideas did not improve accuracy, or even decreased it. Other models could also be considered. However, what can be developed remains limited by the data. For example, having richer data, such as the real-time locations of players and data where records are not shuffled, would make it possible to obtain player trajectories and develop more complex models. Besides, the FPP model could be improved by using a genetic algorithm to tune its parameters instead of tuning them by hand, and by splitting the data into training and testing sets or using k-fold cross-validation to avoid the potential problem of overfitting.
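The k-fold cross-validation mentioned above as a possible improvement was not part of the reported experiments. As a minimal sketch of how record indices could be split, assuming a plain shuffled partition (the function name `k_fold_indices` and the fixed seed are hypothetical):

```python
import random


def k_fold_indices(n_records, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once, then dealt into k folds. Each fold serves
    as the test set exactly once, with the remaining folds as training set.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Parameters such as the 700 and 900 thresholds would be tuned on `train`
# only, and accuracy would then be measured on the held-out `test` records.
for train, test in k_fold_indices(10, 5):
    print(len(train), len(test))  # prints "8 2" five times
```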

5 Conclusion

This paper has proposed a model called Football Pass Predictor (FPP) to predict the receivers of passes in football matches. To generate predictions, the model considers various aspects such as the distance between players, the proximity of players from the opposite team, and the direction of each pass. The performance of the model was compared with a baseline random predictor and several variations of the proposed model. Experimental results show that FPP can achieve a prediction accuracy of 33.8%, and more than 50% if two guesses are allowed. An interesting perspective for future work is to collect richer data, which would allow more complex models to be developed. We also plan to evaluate the possibility of using pattern mining approaches for football pass prediction [8, 9].

References

1. http://www.uefa.com/uefachampionsleague/season=2018/statistics/round=2000881/matches/kind=passes/index.html. Accessed 15 June 2018
2. Garratt, K., Murphy, A., Bower, R.: Passing and goal scoring characteristics in Australian A-League football. Int. J. Perform. Anal. Sport 17(1–2), 77–85 (2017)
3. Plummer, B.T.: Analysis of attacking possessions leading to a goal attempt, and goal scoring patterns within men’s elite soccer. J. Sports Sci. Med. 1(1), 1–38 (2013)
4. Liu, T.: Systematische Spielbeobachtung im internationalen Leistungsfußball. Ph.D. dissertation. University of Bayreuth (2014)
5. Liu, T., Hohmann, A.: Apriori-based diagnostical analysis of passings in the football game. In: Proceedings of IEEE 2016 International Conference on Big Data Analysis, pp. 1–4. IEEE (2016)
6. Liu, T., Fournier-Viger, P., Hohmann, A.: Using diagnostic analysis to discover offensive patterns in a football game. In: Tavana, M., Patnaik, S. (eds.) Recent Developments in Data Science and Business Analytics. SPBE, pp. 381–386. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-72745-5_43
7. Fournier-Viger, P., Nkambou, R., Tseng, S.M.: RuleGrowth: mining sequential rules common to several sequences by pattern-growth. In: Proceedings of 26th Symposium on Applied Computing, pp. 954–959. ACM Press (2011)
8. Fournier-Viger, P., Lin, J.C.-W., Vo, B., Chi, T.T., Zhang, J., Le, H.B.: A survey of itemset mining. WIREs Data Min. Knowl. Discov. e1207 (2017). https://doi.org/10.1002/widm.1207
9. Fournier-Viger, P., Lin, J.C.-W., Kiran, R.U., Koh, Y.S., Thomas, R.: A survey of sequential pattern mining. Data Sci. Pattern Recogn. (DSPR) 1(1), 54–77 (2017)
10. Stöckl, M., Cruz, D., Duarte, R.: Modelling the tactical difficulty of passes in soccer. In: Chung, P., Soltoggio, A., Dawson, C.W., Meng, Q., Pain, M. (eds.) Proceedings of the 10th International Symposium on Computer Science in Sports (ISCSS). AISC, vol. 392, pp. 139–143. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-24560-7_17
11. Rein, R., Raabe, D., Memmert, D.: “Which pass is better?” Novel approaches to assess passing effectiveness in elite soccer. Hum. Mov. Sci. 55, 172–181 (2017)
12. Gyarmati, L., Stanojevic, R.: QPass: a merit-based evaluation of soccer passes. Preprint on arXiv:1608.03532 (2016)
13. McHale, I.G., Relton, S.D.: Identifying key players in soccer teams using network analysis and pass difficulty. Eur. J. Oper. Res. 268(1), 339–347 (2018)
14. Cakmak, A., Uzun, A., Delibas, E.: Computational modeling of pass effectiveness in soccer. J. Adv. Complex Syst. (2018, in press)
15. Lida, R., Mase, K.: Ball passing course creating behavior in soccer game detection from player trajectory. IEICE Technical report, vol. 113, no. 432, pp. 171–176 (2014)
16. Dhar, J., Singh, A.: Game analysis and prediction of ball positions in a football match from video footages. In: Proceedings of International Conference on Recent Advances and Innovations in Engineering, pp. 1–6. IEEE (2014)

Acknowledgement

This work is partially financed by the Youth 1000 talent funding of Philippe Fournier-Viger.

Author information

Authors and Affiliations

School of Natural Sciences and Humanities, Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger
College of Sports and Physical Education, Beijing Normal University, Beijing, China
Tianbiao Liu
Department of Computing, Mathematics and Physics, Western Norway University of Applied Sciences (HVL), Bergen, Norway
Jerry Chun-Wei Lin

Corresponding author

Correspondence to Philippe Fournier-Viger.

Editor information

Editors and Affiliations

Leuphana University, Lüneburg, Germany
Ulf Brefeld
Katholieke Universiteit Leuven, Heverlee, Belgium
Jesse Davis
SciSports, Enschede, The Netherlands
Jan Van Haaren
Université de Caen Normandie, Caen, France
Albrecht Zimmermann

About this paper

Cite this paper

Fournier-Viger, P., Liu, T., Chun-Wei Lin, J. (2019). Football Pass Prediction Using Player Locations. In: Brefeld, U., Davis, J., Van Haaren, J., Zimmermann, A. (eds) Machine Learning and Data Mining for Sports Analytics. MLSA 2018. Lecture Notes in Computer Science, vol 11330. Springer, Cham. https://doi.org/10.1007/978-3-030-17274-9_13

DOI: https://doi.org/10.1007/978-3-030-17274-9_13
Published: 07 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17273-2
Online ISBN: 978-3-030-17274-9
eBook Packages: Computer Science, Computer Science (R0)