
Flexible model-based clustering of mixed binary and continuous data: application to genetic regulation and cancer

Abidin, FNZ; Westhead, DR (2017) Flexible model-based clustering of mixed binary and continuous data: application to genetic regulation and cancer. Nucleic Acids Research, 45 (7), Article e53. 10.1093/nar/gkw1270. Green open access

Flexible model-based clustering of mixed binary and continuous data application to genetic regulation and cancer.pdf - Published Version


Abstract

Clustering is used widely in ‘omics’ studies and is often tackled with standard methods, e.g. hierarchical clustering. However, the increasing need for integration of multiple data sets leads to a requirement for clustering methods applicable to mixed data types, where the straightforward application of standard methods is not necessarily the best approach. A particularly common problem involves clustering entities characterized by a mixture of binary data (e.g. presence/absence of mutations, binding, motifs and epigenetic marks) and continuous data (e.g. gene expression, protein abundance, metabolite levels). Here, we present a generic method based on a probabilistic model for clustering this type of data, and illustrate its application to genetic regulation and the clustering of cancer samples. We show that the resulting clusters lead to useful hypotheses: in the case of genetic regulation these concern regulation of groups of genes by specific sets of transcription factors and in the case of cancer samples combinations of gene mutations are related to patterns of gene expression. The clusters have potential mechanistic significance and in the latter case are significantly linked to survival. The method is available as a stand-alone software package (GNU General Public Licence) from http://github.com/BioToolsLeeds/FlexiCoClusteringPackage.git.
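The abstract does not spell out the form of the probabilistic model, so the following is only a generic illustration of model-based clustering on mixed data, not the authors' method or the API of their package. It is a minimal sketch that assumes each cluster models binary features as independent Bernoulli variables and continuous features as independent Gaussians, fitted by expectation-maximisation. All names (`log_density`, `fit_mixture`, `toy_data`) and the toy data are hypothetical.

```python
import math
import random


def log_density(b, x, p, mu, var):
    """Log-likelihood of one item under one cluster (features assumed independent)."""
    ll = 0.0
    for bj, pj in zip(b, p):            # Bernoulli terms for binary features
        ll += math.log(pj if bj else 1.0 - pj)
    for xd, m, v in zip(x, mu, var):    # Gaussian terms for continuous features
        ll += -0.5 * (math.log(2.0 * math.pi * v) + (xd - m) ** 2 / v)
    return ll


def fit_mixture(binary, cont, k, n_iter=50, seed=0):
    """Fit a k-component Bernoulli/Gaussian mixture by EM; return hard labels."""
    rng = random.Random(seed)
    n, nb, nc = len(binary), len(binary[0]), len(cont[0])
    # Start from random soft assignments (responsibilities).
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])
    for _ in range(n_iter):
        # M-step: re-estimate each cluster's parameters from the responsibilities.
        params = []
        for c in range(k):
            w = [resp[i][c] for i in range(n)]
            nk = sum(w) + 1e-9          # guard against an empty cluster
            # Smoothed Bernoulli probabilities stay strictly inside (0, 1).
            p = [(sum(w[i] * binary[i][j] for i in range(n)) + 1.0) / (nk + 2.0)
                 for j in range(nb)]
            mu = [sum(w[i] * cont[i][d] for i in range(n)) / nk for d in range(nc)]
            # Small variance floor avoids collapse onto a single point.
            var = [sum(w[i] * (cont[i][d] - mu[d]) ** 2 for i in range(n)) / nk + 1e-3
                   for d in range(nc)]
            params.append((math.log(nk / n), p, mu, var))
        # E-step: recompute responsibilities with a log-sum-exp normalisation.
        for i in range(n):
            logs = [lw + log_density(binary[i], cont[i], p, mu, var)
                    for lw, p, mu, var in params]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp[i] = [e / total for e in exps]
    return [max(range(k), key=lambda c: resp[i][c]) for i in range(n)]


def toy_data(seed=1):
    """Hypothetical data: two groups differing in binary marks and expression level."""
    rng = random.Random(seed)
    binary, cont = [], []
    for _ in range(10):
        binary.append([1, 1, 0])
        cont.append([rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)])
    for _ in range(10):
        binary.append([0, 0, 1])
        cont.append([rng.gauss(5.0, 0.3), rng.gauss(5.0, 0.3)])
    return binary, cont


binary, cont = toy_data()
labels = fit_mixture(binary, cont, k=2)
```

In this sketch both data types contribute to a single per-cluster likelihood, so neither has to be discretised or rescaled to fit a distance measure designed for the other. The independence assumption and the fixed number of clusters are simplifications chosen for brevity.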

Type: Article
Title: Flexible model-based clustering of mixed binary and continuous data: application to genetic regulation and cancer
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/nar/gkw1270
Publisher version: https://doi.org/10.1093/nar/gkw1270
Language: English
Additional information: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > The Ear Institute
URI: https://discovery.ucl.ac.uk/id/eprint/10082500