Abstract
It is well known in the sports industry that athletes' performance is strongly influenced by physiological and psychological factors. In recent years, many researchers have analysed whether athlete-generated social media content can serve as a proxy for such performance factors, with some promising results. In this study, we investigated whether such proxies are useful features for a machine learning model that predicts athletes’ performance in subsequent competitions. We extracted millions of tweets that NBA basketball players either posted themselves or were tagged in, and derived features reflecting players’ mood, social media behaviour, and sleep quality before games. Using these features together with other features unrelated to social media, we performed statistical tests to examine whether the social media features significantly improve the accuracy of a random forest model for predicting players’ Box Plus/Minus (BPM) scores in upcoming games. The results show that, in particular, the number of tweets a player is tagged in prior to a game significantly improves the model's predictions. Our findings provide practitioners with insights into the effects of social media on athlete performance that can be used prospectively for mental health awareness training and for optimising pre-game routines.
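To make the idea of testing whether a feature "significantly improves" predictions concrete, the following is a minimal sketch of a one-sided paired sign-flip permutation test on per-game prediction errors. It is a generic illustration and not the test procedure used in the study. The function name `paired_permutation_pvalue`, its parameters, and the error values are all hypothetical and chosen only for demonstration.

```python
import random


def paired_permutation_pvalue(errors_without, errors_with,
                              n_permutations=10_000, seed=0):
    """One-sided paired sign-flip permutation test (illustrative only).

    Null hypothesis: adding the feature does not reduce the per-game error.
    Returns an estimated p-value for the observed mean error reduction.
    """
    if len(errors_without) != len(errors_with):
        raise ValueError("error lists must have the same length")

    # Positive difference = the model with the extra feature did better.
    diffs = [a - b for a, b in zip(errors_without, errors_with)]
    n = len(diffs)
    observed = sum(diffs) / n

    rng = random.Random(seed)  # seeded for reproducibility
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Under the null, the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if sum(flipped) / n >= observed:
            at_least_as_large += 1

    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Hypothetical absolute BPM prediction errors for eight games (made-up data).
errors_without_feature = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7, 4.2, 4.5]
errors_with_feature = [3.6, 3.5, 4.1, 4.0, 3.7, 4.0, 3.9, 4.1]

p_value = paired_permutation_pvalue(errors_without_feature, errors_with_feature)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value here would suggest that the error reduction is unlikely to be due to chance under this simple null model. With time-ordered games, the errors would need to come from an evaluation scheme that respects temporal order, which this sketch does not address.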
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dreyer, F., Greif, J., Günther, K., Spiliopoulou, M., Niemann, U. (2022). Data-Driven Prediction of Athletes’ Performance Based on Their Social Media Presence. In: Pascal, P., Ienco, D. (eds) Discovery Science. DS 2022. Lecture Notes in Computer Science, vol 13601. Springer, Cham. https://doi.org/10.1007/978-3-031-18840-4_15
DOI: https://doi.org/10.1007/978-3-031-18840-4_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18839-8
Online ISBN: 978-3-031-18840-4
eBook Packages: Computer Science, Computer Science (R0)