Thawornwattana, Yuttapong; Flouri, Tomas; Mallet, James; Yang, Ziheng (2025) Inference of Gene Flow between Species from Genomic Data When the Mode, Direction, and Lineages are Misspecified. Molecular Biology and Evolution, 42 (6), Article msaf121. 10.1093/molbev/msaf121.
Full text (PDF, 1MB): Accepted Version
Abstract
Thanks to genomic data, interspecific gene flow is increasingly recognized as a major evolutionary force that shapes biodiversity. Two models have been developed in the multispecies coalescent (MSC) framework to infer gene flow from genomic data, assuming either constant-rate continuous migration (MSC-M) or discrete introgression/hybridization (MSC-I). The extreme simplicity of these models raises concerns about their usefulness as they represent misspecified models when applied to real data. Here, we study inference of gene flow under the MSC-M model, considering mis-assignment of gene flow onto incorrect parental or daughter lineages, misspecification of the direction of gene flow, and misspecification of the mode of gene flow. Mis-assignment of gene flow to an incorrect lineage causes large biases in the estimated rates. The Bayesian test has high power for inferring both recent and ancient gene flow, between either sister lineages or nonsister lineages, although misspecification of the direction of gene flow may make it hard to distinguish early divergence with gene flow from recent complete isolation. Misspecification of the mode of gene flow (MSC-I versus MSC-M) has small local effects, and gene flow is detected with high power despite the misspecification. We analyze a genomic dataset from the purple cone spruce (Picea spp., Pinaceae), which putatively arose through homoploid hybrid speciation, to demonstrate practical implications of our theoretical analyses. Overall, we find that the extremely idealized models of gene flow (in particular the discrete MSC-I model) are very effective for extracting information about species divergence and gene flow from genomic data.
Type: | Article |
---|---|
Title: | Inference of Gene Flow between Species from Genomic Data When the Mode, Direction, and Lineages are Misspecified |
Location: | United States |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1093/molbev/msaf121 |
Publisher version: | https://doi.org/10.1093/molbev/msaf121 |
Language: | English |
Additional information: | © The Author(s) 2025. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
Keywords: | Science & Technology, Life Sciences & Biomedicine, Biochemistry & Molecular Biology, Evolutionary Biology, Genetics & Heredity, gene flow, introgression, migration, multispecies coalescent, model misspecification, BPP, MAXIMUM-LIKELIHOOD IMPLEMENTATION, EVOLUTIONARY HISTORY, POPULATION SIZES, MIGRATION, INTROGRESSION, SPECIATION, SEQUENCE, DIVERGENCE |
UCL classification: | UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment |
URI: | https://discovery.ucl.ac.uk/id/eprint/10211258 |