UCL Discovery

A parsimony estimator of the number of populations from a STRUCTURE-like analysis

Wang, J. (2019) A parsimony estimator of the number of populations from a STRUCTURE-like analysis. Molecular Ecology Resources, 19 (4) pp. 970-981. 10.1111/1755-0998.13000. Green open access

Wang MS_V3.pdf - Accepted Version

Abstract

Bayesian methods based on population genetics models have been proposed and widely applied to make unsupervised inferences of population structure from a sample of multilocus genotypes. They usually provide good estimates of the ancestry (or population membership) of sampled individuals by clustering them probabilistically or proportionally into (anonymous) populations. However, they have difficulty accurately estimating the number of populations (K) represented by the sampled individuals. This study proposes a new ad hoc estimator of K, calculable from the output of a population clustering program such as STRUCTURE or ADMIXTURE. The new criterion, called the parsimony index (PI), aims to identify the K that consistently yields the minimal admixture estimates of sampled individuals. Extensive simulated and empirical data were used to compare the accuracy of PI with that of two popular K estimators calculated from STRUCTURE outputs, one based on Pr[X|K] (i.e. the probability of genotype data X given K) and the other on ΔK (i.e. the rate of change of the probability of data as a function of K), and with that of the cross-validation method calculated from ADMIXTURE outputs. PI was consistently more accurate than the other methods across various population structure scenarios (e.g. hierarchical island model, different extents of differentiation) and sampling scenarios (e.g. unbalanced sample sizes, different marker information contents). The ΔK method was more accurate than the Pr[X|K] method only for hierarchically structured or highly inbred populations; the opposite was true in the other scenarios. The PI method is implemented in a computer program, KFinder, which runs on all major computer platforms.
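
For readers unfamiliar with the ΔK comparison method mentioned in the abstract, the following is a minimal Python sketch of how it is commonly computed from replicate STRUCTURE runs, following the usual formulation attributed to Evanno et al. (2005): the absolute second-order difference of the mean ln Pr[X|K] across successive K values, divided by the standard deviation of ln Pr[X|K] at K. It is not the PI estimator and is not taken from KFinder. The function name `delta_k`, the input layout (a dict mapping K to a list of replicate log-probabilities), and the numbers in the usage example are illustrative assumptions.

```python
from statistics import mean, stdev


def delta_k(log_probs):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    log_probs maps each K to a list of ln Pr[X|K] values from replicate
    runs (hypothetical input layout, not a STRUCTURE file format).
    """
    means = {k: mean(values) for k, values in log_probs.items()}
    result = {}
    for k in sorted(log_probs):
        # Delta K is undefined at the smallest and largest K tested.
        if k - 1 not in means or k + 1 not in means:
            continue
        # A sample standard deviation needs at least two replicates.
        if len(log_probs[k]) < 2:
            continue
        sd = stdev(log_probs[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Made-up replicate values, for illustration only.
example = {
    1: [-5000.0, -5002.0],
    2: [-4500.0, -4502.0],
    3: [-4490.0, -4494.0],
    4: [-4488.0, -4492.0],
}
scores = delta_k(example)
best_k = max(scores, key=scores.get)
```

In this sketch the K with the largest ΔK would be taken as the estimate. Because ΔK needs both neighbouring K values, it cannot be evaluated at the smallest K tested, so it cannot return K = 1.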

Type: Article
Title: A parsimony estimator of the number of populations from a STRUCTURE-like analysis
Open access status: An open access version is available from UCL Discovery
DOI: 10.1111/1755-0998.13000
Publisher version: https://doi.org/10.1111/1755-0998.13000
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: STRUCTURE, markers, genetic differentiation, number of populations
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/10066461
Downloads since deposit: 396