Shot Analysis in Different Levels of German Football Using Expected Goals

  • Conference paper
Machine Learning and Data Mining for Sports Analytics (MLSA 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1783)

Abstract

Shooting is one of the most analyzed and researched parts of association football, as it directly leads to goals, which determine the score of the match. We look at it from a previously unexplored perspective: we analyze whether shooting tendencies and efficiency differ between four levels of German football (Bundesliga, Regionalliga, U19 Bundesliga and U17 Bundesliga) and explore how they change as players get older. To do so, we employ statistical analysis and examine the individual weights of Expected Goals models based on logistic regression. We find that players at higher levels tend to take more risks, aiming for the corners of the goal, and are more predictable in terms of their shot origins. A comparison of headers and kicks shows that the goal likelihood of kicks is influenced much more by whether the shot followed a set piece, whereas the goal likelihood of headers decreases more steeply with increasing distance from goal. The analysis also reveals that, as the level increases, goalkeepers tend to be more reliable at saving shots at medium height but have a harder time with shots aimed at the bottom corners.
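As an illustration of how the weights of a logistic-regression Expected Goals model can be read, the following minimal sketch fits such a model on synthetic shots. It is not the authors' model: the two features (distance to goal in units of 10 m and a header flag), the generating weights, the learning rate and all function names are assumptions made for this example only.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_xg(w, x):
    """Predicted goal probability (xG) for one shot with features x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Synthetic shots. The generating weights below are made up for this sketch.
random.seed(0)
TRUE_W = [0.5, -1.5, -0.8]  # bias, distance (per 10 m), header flag
shots, goals = [], []
for _ in range(1000):
    dist = random.uniform(0.5, 3.5)                   # 5 m to 35 m from goal
    header = 1.0 if random.random() < 0.2 else 0.0
    p_goal = sigmoid(TRUE_W[0] + TRUE_W[1] * dist + TRUE_W[2] * header)
    shots.append([dist, header])
    goals.append(1.0 if random.random() < p_goal else 0.0)

weights = fit_logistic(shots, goals)
xg_close = predict_xg(weights, [0.8, 0.0])  # kick from 8 m
xg_far = predict_xg(weights, [2.5, 0.0])    # kick from 25 m
print("weights (bias, distance, header):", [round(v, 2) for v in weights])
print("xG at 8 m:", round(xg_close, 3), "xG at 25 m:", round(xg_far, 3))
```

In a model of this form, each weight is the change in the log-odds of scoring per unit of its feature, so comparing the same weight across separately fitted models is one way to compare groups of shots.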

Supported by VfB Stuttgart.

Notes

  1. Some matches were missing from the datasets because they were cancelled or not yet recorded, hence the odd number of matches in the leagues.

  2. The column to the left of the penalty mark and the row below the penalty box are considerably less intense than their neighbours. We attribute this to a limitation of the data: shot positions are represented as integers. The column and row in question each have only two possible integer positions associated with them, while their neighbouring rows and columns have three, so fewer shots are recorded in these two.
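The binning effect described in note 2 can be reproduced with a small worked example. The sketch below counts how many integer coordinates fall into each cell of a grid. The cell width of 2.5 units and the coordinate range are hypothetical values chosen for illustration, not figures from the paper.

```python
from collections import Counter

BIN_WIDTH = 2.5  # hypothetical heatmap cell size, not taken from the paper


def integers_per_bin(lo, hi, width):
    """Count how many integer coordinates in [lo, hi) fall into each bin."""
    counts = Counter(int((v - lo) // width) for v in range(lo, hi))
    return [counts[b] for b in range(max(counts) + 1)]


per_bin = integers_per_bin(0, 10, BIN_WIDTH)
print(per_bin)  # [3, 2, 3, 2]
```

With these assumed values, cells alternate between three and two possible integer positions, so even uniformly distributed shots would make every second cell look roughly a third less intense.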

Author information

Corresponding author

Correspondence to Laurynas Raudonius.

A Box plots of significantly different distributions

This appendix shows box plots of the distributions that, according to ANOVA, differ significantly \((p < 0.05)\) between the four leagues.
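For readers unfamiliar with the test, a one-way ANOVA compares the variance between group means with the variance within groups. The sketch below computes the F statistic by hand. The four samples are made-up toy numbers with placeholder league names, not data from the paper, and the function name is an assumption for this example.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy samples for four placeholder leagues (made-up values)
toy_samples = {
    "league_a": [11.0, 12.0, 13.0],
    "league_b": [12.0, 13.0, 14.0],
    "league_c": [13.0, 14.0, 15.0],
    "league_d": [14.0, 15.0, 16.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(toy_samples.values()))
print(f_stat, df_b, df_w)  # 5.0 3 8
```

Turning F into a p-value requires the F distribution with the two degrees of freedom. If SciPy is available, `scipy.stats.f_oneway` returns both the statistic and the p-value directly.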

Fig. 4. Distance to goal distributions in different leagues

Fig. 5. Our xG score distributions for headers in different leagues

Fig. 6. Our xG score distributions for kicks in different leagues

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Raudonius, L., Seidl, T. (2023). Shot Analysis in Different Levels of German Football Using Expected Goals. In: Brefeld, U., Davis, J., Van Haaren, J., Zimmermann, A. (eds) Machine Learning and Data Mining for Sports Analytics. MLSA 2022. Communications in Computer and Information Science, vol 1783. Springer, Cham. https://doi.org/10.1007/978-3-031-27527-2_2

  • DOI: https://doi.org/10.1007/978-3-031-27527-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-27526-5

  • Online ISBN: 978-3-031-27527-2

  • eBook Packages: Computer Science, Computer Science (R0)
