UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Phase Resolution of Heterozygous Sites in Diploid Genomes is Important to Phylogenomic Analysis under the Multispecies Coalescent Model

Huang, J; Bennett, J; Flouri, T; Yang, Z; (2021) Phase Resolution of Heterozygous Sites in Diploid Genomes is Important to Phylogenomic Analysis under the Multispecies Coalescent Model. Systematic Biology , Article syab047. 10.1093/sysbio/syab047. (In press). Green open access

[thumbnail of Yang_syab047.pdf]
Preview
Text
Yang_syab047.pdf - Published Version

Download (1MB) | Preview

Abstract

Genome sequencing projects routinely generate haploid consensus sequences from diploid genomes, which are effectively chimeric sequences with the phase at heterozygous sites resolved at random. The impact of phasing errors on phylogenomic analyses under the multispecies coalescent (MSC) model is largely unknown. Here, we conduct a computer simulation to evaluate the performance of four phase-resolution strategies (the true phase resolution, the diploid analytical integration algorithm which averages over all phase resolutions, computational phase resolution using the program PHASE, and random resolution) on estimation of the species tree and evolutionary parameters in analysis of multilocus genomic data under the MSC model. We found that species tree estimation is robust to phasing errors when species divergences were much older than average coalescent times but may be affected by phasing errors when the species tree is shallow. Estimation of parameters under the MSC model with and without introgression is affected by phasing errors. In particular, random phase resolution causes serious overestimation of population sizes for modern species and biased estimation of cross-species introgression probability. In general, the impact of phasing errors is greater when the mutation rate is higher, the data include more samples per species, and the species tree is shallower with recent divergences. Use of phased sequences inferred by the PHASE program produced small biases in parameter estimates. We analyze two real data sets, one of East Asian brown frogs and another of Rocky Mountains chipmunks, to demonstrate that heterozygote phase-resolution strategies have similar impacts on practical data analyses. We suggest that genome sequencing projects should produce unphased diploid genotype sequences if fully phased data are too challenging to generate, and avoid haploid consensus sequences, which have heterozygous sites phased at random. In case the analytical integration algorithm is computationally unfeasible, computational phasing prior to population genomic analyses is an acceptable alternative. [BPP; introgression; multispecies coalescent; phase; species tree.]

Type: Article
Title: Phase Resolution of Heterozygous Sites in Diploid Genomes is Important to Phylogenomic Analysis under the Multispecies Coalescent Model
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/sysbio/syab047
Publisher version: https://doi.org/10.1093/sysbio/syab047
Language: English
Additional information: © The Author(s) 2021. Published by Oxford University Press on behalf of the Society of Systematic Biologists. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/10129707
Downloads since deposit
32Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item