UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Predictive modelling of football injuries

Kampakis, S; (2016) Predictive modelling of football injuries. Doctoral thesis , UCL (University College London). Green open access

[thumbnail of thesis_final.pdf]
thesis_final.pdf - Accepted Version

Download (4MB) | Preview


The goal of this thesis is to investigate the potential of predictive modelling for football injuries. This work was conducted in close collaboration with Tottenham Hotspurs FC (THFC), the PGA European tour and the participation of Wolverhampton Wanderers (WW). Three investigations were conducted: 1. Predicting the recovery time of football injuries using the UEFA injury recordings: The UEFA recordings is a common standard for recording injuries in professional football. For this investigation, three datasets of UEFA injury recordings were available: one from THFC, one from WW and one that was constructed by merging both. Poisson, negative binomial and ordinal regression were used to model the recovery time after an injury and assess the significance of various injury-related covariates. Then, different machine learning algorithms (support vector machines, Gaussian processes, neural networks, random forests, naïve Bayes and k-nearest neighbours) were used in order to build a predictive model. The performance of the machine learning models is then improved by using feature selection conducted through correlation-based subset feature selection and random forests. 2. Predicting injuries in professional football using exposure records: The relationship between exposure (in training hours and match hours) in professional football athletes and injury incidence was studied. A common problem in football is understanding how the training schedule of an athlete can affect the chance of him getting injured. The task was to predict the number of days a player can train before he gets injured. The dataset consisted of the exposure records of professional footballers in Tottenham Hotspur Football Club from the season 2012-2013. The problem was approached by a Gaussian process model equipped with a dynamic time warping kernel that allowed the calculation of the similarity of exposure records of different lengths. 3. Predicting intrinsic injury incidence using in-training GPS measurements: A significant percentage of football injuries can be attributed to overtraining and fatigue. GPS data collected during training sessions might provide indicators of fatigue, or might be used to detect very intense training sessions which can lead to overtraining. This research used GPS data gathered during training sessions of the first team of THFC, in order to predict whether an injury would take place during a week. The data consisted of 69 variables in total. Two different binary classification approaches were followed and a variety of algorithms were applied (supervised principal component analysis, random forests, naïve Bayes, support vector machines, Gaussian process, neural networks, ridge logistic regression and k-nearest neighbours). Supervised principal component analysis shows the best results, while it also allows the extraction of components that reduce the total number of variables to 3 or 4 components which correlate with injury incidence. The first investigation contributes the following to the field: • It provides models based on the UEFA injury recordings, a standard used by many clubs, which makes it easier to replicate and apply the results. • It investigates which variables seem to be more highly related to the prediction of recovery after an injury. • It provides a comparison of models for predicting the time to return to play after injury. The second investigation contributes the following to the field: • It provides a model that can be used to predict the time when the first injury of the season will take place. • It provides a kernel that can be utilized by a Gaussian process in order to measure the similarity of training and match schedules, even if the time series involved are of different lengths. The third investigation contributes the following to the field: • It provides a model to predict injury on a given week based on GPS data gathered from training sessions. • It provides components, extracted through supervised principal component analysis, that correlate with injury incidence and can be used to summarize the large number of GPS variables in a parsimonious way.

Type: Thesis (Doctoral)
Title: Predictive modelling of football injuries
Event: University College London
Open access status: An open access version is available from UCL Discovery
Language: English
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/1508067
Downloads since deposit
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item