Akhanli, Serhat Emre (2019). Distance construction and clustering of football player performance data. Doctoral thesis (Ph.D), UCL (University College London).
Abstract
I present a new approach to mapping information on football players by multidimensional scaling and to clustering the players. The underlying goal is to define a proper distance measure between players. The data were collected from whoscored.com. The variables are of mixed type, containing nominal, ordinal, count and continuous information. In the pre-processing stage, continuous and count variables pass through four steps: 1) representation (i.e., considerations regarding how the relevant information is most appropriately represented, e.g., relative to minutes played), 2) transformation (football knowledge as well as the skewness of the distribution of some count variables indicate that a transformation should be used to decrease the effective distance between higher values compared to the distances between lower values), 3) standardisation (in order to make within-variable variations comparable), and 4) variable weighting, including variable selection. In a final phase, the distance measures for the different variable types are combined using the principle of the Gower dissimilarity (Gower, 1971).

The second part of the thesis aims to choose a suitable clustering technique and to estimate the best number of clusters for the dissimilarity measure obtained from the football player data set. To this end, several clustering quality indexes are introduced, and a new concept for calibrating them, first proposed by Hennig (2017), is presented. Hennig (2017) proposed two random clustering algorithms, which generate random clustering points from which standardised clustering quality index values can be calculated and aggregated in an appropriate way. In this thesis, two additional random clustering algorithms are proposed, and the aggregation of clustering quality indexes is examined on different types of simulated and real data sets. Finally, this new concept is applied to the dissimilarity measure of the football players.
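The distance construction outlined in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy player data, the helper names `per_90` and `gower_dissimilarity`, the `log(1 + x)` transformation, the range standardisation and the weights are all assumptions chosen for illustration. Only the overall idea, a weighted average of per-variable dissimilarities in the spirit of Gower (1971), comes from the abstract.

```python
import numpy as np

def per_90(counts, minutes):
    """Representation step (assumed form): express counts per 90 minutes played."""
    return 90.0 * np.asarray(counts, dtype=float) / np.asarray(minutes, dtype=float)

def gower_dissimilarity(numeric, nominal, w_numeric, w_nominal):
    """Weighted Gower-style dissimilarity for mixed-type data.

    numeric: (n, p) array of already transformed continuous/count variables.
    nominal: (n, q) array of integer-coded categories.
    Numeric variables contribute |x_i - x_j| / range (assumed standardisation);
    nominal variables contribute 0 for a match and 1 for a mismatch.
    """
    numeric = np.asarray(numeric, dtype=float)
    nominal = np.asarray(nominal)
    w_numeric = np.asarray(w_numeric, dtype=float)
    w_nominal = np.asarray(w_nominal, dtype=float)

    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant variables contribute zero distance

    # Pairwise per-variable dissimilarities, shapes (n, n, p) and (n, n, q).
    d_num = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges
    d_nom = (nominal[:, None, :] != nominal[None, :, :]).astype(float)

    weighted_sum = (d_num * w_numeric).sum(axis=2) + (d_nom * w_nominal).sum(axis=2)
    return weighted_sum / (w_numeric.sum() + w_nominal.sum())

# Hypothetical toy data for three players (not taken from whoscored.com).
goals = [10, 2, 0]
tackles = [5, 40, 30]
minutes = [900, 1800, 900]
position = [[0], [1], [0]]  # integer-coded nominal variable, e.g. 0 = forward

# 1) representation, 2) transformation (log(1 + x) is an assumed choice)
numeric = np.column_stack([
    np.log1p(per_90(goals, minutes)),
    np.log1p(per_90(tackles, minutes)),
])

# 3) standardisation happens inside gower_dissimilarity; 4) weights are assumed
D = gower_dissimilarity(numeric, position, w_numeric=[1.0, 1.0], w_nominal=[0.5])
print(np.round(D, 3))
```

The resulting matrix `D` is symmetric with a zero diagonal and entries in [0, 1], so it could be passed to any multidimensional scaling or dissimilarity-based clustering routine.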
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Distance construction and clustering of football player performance data |
Event: | University College London |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2019. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10065964 |