UCL Discovery

The computer program Structure for assigning individuals to populations: easy to use but easier to misuse

Wang, J; (2017) The computer program Structure for assigning individuals to populations: easy to use but easier to misuse. Molecular Ecology Resources , 17 (5) pp. 981-990. 10.1111/1755-0998.12650. Green open access

Wang_Assign_V3_MainText.pdf (Text, 426kB)
Wang_Assign_V2_Figs.pdf (Text, 311kB)
Wang_Assign_V2_Supp.pdf (Text, 381kB)

Abstract

The computer program Structure implements a Bayesian method, based on a population genetics model, to assign individuals to their source populations using genetic marker data. It is widely applied in the fields of ecology, evolutionary biology, human genetics and conservation biology for detecting hidden genetic structures, inferring the most likely number of populations (K), assigning individuals to source populations, and estimating admixture and migration rates. Recently, several simulation studies have concluded that the program yields erroneous inferences when samples from different populations are highly unbalanced in size. Analysing both simulated and empirical datasets, this study confirms that Structure indeed yields poor individual assignments to source populations and frequently gives incorrect estimates of K when sampling is unbalanced. However, this poor performance is mainly caused by the adoption of the default ancestry prior, which assumes all source populations contribute equally to the pooled sample of individuals. When the alternative ancestry prior, which allows for unequal representation of the source populations in the sample, is adopted, accurate individual assignments can be obtained even if sampling is highly unbalanced. The alternative prior also improves the inference of K by two estimators, although the improvement is smaller than that for individual assignments to populations. For the difficult case of many populations and unbalanced sampling, a rarely used parameter combination is required for Structure to yield accurate inferences: the alternative ancestry prior, an initial ALPHA value much smaller than the default, and the uncorrelated allele frequency model.
I conclude that Structure is easy to use but easier to misuse because of its complicated genetic model and its many parameter (prior) options, whose appropriate choice may not be obvious. I suggest using multiple plausible models (parameters) and K estimators to conduct comparative and exploratory Structure analyses.

Type: Article
Title: The computer program Structure for assigning individuals to populations: easy to use but easier to misuse
Open access status: An open access version is available from UCL Discovery
DOI: 10.1111/1755-0998.12650
Publisher version: http://doi.org/10.1111/1755-0998.12650
Language: English
Additional information: This article is protected by copyright. All rights reserved. This is the peer reviewed version of the following article: Wang, J; (2016) The computer program Structure for assigning individuals to populations: easy to use but easier to misuse. Molecular Ecology Resources 10.1111/1755-0998.12650, which has been published in final form at http://doi.org/10.1111/1755-0998.12650. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
Keywords: Genetic structure; Markers; genetic differentiation; admixture; population clustering
UCL classification: UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/1538933