
A pattern-clustering method for longitudinal data - heroin users receiving methadone

Lin, C; (2014) A pattern-clustering method for longitudinal data - heroin users receiving methadone. Doctoral thesis, UCL (University College London). Green open access


Methadone is used as a substitute for heroin, and users may fall into distinct groups according to methadone dosage. In this work we analyze data for 314 participants in a methadone study over 180 days. The data, which we call category-ordered data throughout this study, consist of seven categories: six on an ordinal scale representing dosages and one for missing dosages. We develop a clustering method involving the so-called p-dissimilarity, a modification of Prediction Strength (PS), a null model test, and two ordering algorithms.

(1) The p-dissimilarity measures the dissimilarity between the 180-day time series of the participants. It accommodates categorical and ordinal scales by using a parameter p as a switch between treating the data as categorical and treating them as ordinal, and it measures dissimilarity between observed dosages and missing dosages by using a parameter β. It could also be applied more widely, for example to survey studies in which questions offer choices on a Likert scale together with a "don't know" category.

(2) PS determines the number of clusters by measuring the stability of clusters, and the Average Silhouette Width (ASW) measures cluster coherence. We propose rules to modify PS so that it can be fully applied to hierarchical clustering methods. Next, instead of preselecting a clustering method, we let the data decide which clustering method to use, based on cluster stability and cluster coherence. The partitioning around medoids (PAM) method is then selected.

(3) We propose the null model test to determine the number of clusters (k). Many methods for determining the number of clusters give a value for each k ≥ 2 based on cluster compactness and separation, and suggest using the k with the highest value. Viewing this question from a different perspective, for a fixed k and a selected clustering method, the null model test uses a null model and parametric bootstrap to explore the distribution of a statistic under the null assumption. A hypothesis test for each k can then be performed. For our data, we construct a Markov null model without cluster structure, in which the distributions of the categories are the same as those of the real data. We apply the null model test to investigate whether the clusters found according to PAM and ASW/PS can be explained by random variation.

(4) We use heatplots to evaluate the quality of clustering. A heatplot is a graph that represents data by colour; it consists of horizontal lines, each representing the data for one object. However, the interpretability of a heatplot strongly depends on the location of the objects along the vertical axis. We propose two algorithms to locate objects on a heatplot. The first, which uses multidimensional scaling (MDS), is for general use. The second, which uses a projection vector, is for the PAM method. The resulting heatplot can then be used for information visualisation: it displays clustering structures, relationships between objects and clusters in terms of their dissimilarities, locations of medoids, and the density of clusters.

Although no significant clustering structure is observed, the sequences of categories for the clusters are clinically useful. The sequences of categories indicate detoxification. Our data show that participants with low heroin addiction attempted to reduce or quit the use of methadone in the third month. Among participants with high addiction, few attempted to reduce the use of methadone in the fifth month, and most required more time to finish the detoxification process. We also find that the heroin onset age might influence the patterns of detoxification.
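The null model test in item (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a first-order Markov chain fitted to all participants pooled together (so the simulated data carry no cluster structure), sequences coded as integers 0 to 6, and a user-supplied `statistic` function standing in for the ASW or PS of a PAM clustering at a fixed k. The function names `fit_markov`, `simulate` and `null_p_value` are hypothetical.

```python
import random


def fit_markov(sequences, n_categories):
    """Count pooled initial states and first-order transitions.

    Assumption: one chain is fitted to all participants together,
    so the fitted model has no cluster structure by construction.
    """
    init = [0] * n_categories
    trans = [[0] * n_categories for _ in range(n_categories)]
    for seq in sequences:
        init[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a][b] += 1
    return init, trans


def simulate(init, trans, length, rng):
    """Draw one sequence of the given length from the fitted chain."""
    state = rng.choices(range(len(init)), weights=init)[0]
    seq = [state]
    for _ in range(length - 1):
        row = trans[state]
        # Assumption: a state with no observed exits is treated as absorbing.
        if sum(row) > 0:
            state = rng.choices(range(len(row)), weights=row)[0]
        seq.append(state)
    return seq


def null_p_value(sequences, n_categories, statistic, n_boot=200, seed=0):
    """Parametric bootstrap p-value for `statistic` under the Markov null.

    `statistic` maps a list of sequences to a number where larger values
    mean stronger clustering (for example ASW at a fixed k).
    """
    rng = random.Random(seed)
    init, trans = fit_markov(sequences, n_categories)
    observed = statistic(sequences)
    length = len(sequences[0])
    exceed = 0
    for _ in range(n_boot):
        sim = [simulate(init, trans, length, rng) for _ in sequences]
        if statistic(sim) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

A large p-value for a given k would suggest that the clustering statistic observed on the real data is within the range produced by random variation under the null model, which is consistent with the abstract's finding of no significant clustering structure.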

Type: Thesis (Doctoral)
Title: A pattern-clustering method for longitudinal data - heroin users receiving methadone
Open access status: An open access version is available from UCL Discovery
Language: English
UCL classification: UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/1452358