Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions

Su, SY; White, J; Balding, DJ; Coin, LJM; (2008) Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions. BMC Bioinformatics, 9, Article 513. doi:10.1186/1471-2105-9-513. Green open access

Abstract

Background: The power of haplotype-based methods for association studies, identification of regions under selection, and ancestral inference is well established for diploid organisms. For polyploids, however, the difficulty of determining phase has limited such approaches. Polyploidy is common in plants and is also observed in animals. Partial polyploidy is sometimes observed in humans (e.g. trisomy 21; Down's syndrome), and it arises more frequently in some human tissues. Local changes in ploidy, known as copy number variations (CNV), arise throughout the genome. Here we present a method, implemented in the software polyHap, for the inference of haplotype phase and missing observations from polyploid genotypes. PolyHap allows each individual to have a different ploidy, but ploidy cannot vary over the genomic region analysed. It employs a hidden Markov model (HMM) and a sampling algorithm to infer haplotypes jointly in multiple individuals and to obtain a measure of uncertainty in its inferences.

Results: In the simulation study, we combine real haplotype data to create artificial diploid, triploid, and tetraploid genotypes, and use these to demonstrate that polyHap performs well, in terms of both switch error rate in recovering phase and imputation error rate for missing genotypes. To our knowledge, there is no comparable software for phasing a large, densely genotyped region of chromosome from triploids and tetraploids, while for diploids we found polyHap to be more accurate than fastPhase. We also compare the results of polyHap to SATlotyper on an experimentally haplotyped tetraploid dataset of 12 SNPs, and show that polyHap is more accurate.

Conclusion: With the availability of large SNP data in polyploids and CNV regions, we believe that polyHap, our proposed method for inferring haplotypic phase from genotype data, will be useful in enabling researchers analysing such data to exploit the power of haplotype-based analyses.
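
The simulation design mentioned in the abstract (combining known haplotypes into artificial polyploid genotypes, then scoring phase recovery by switch error rate) can be illustrated with a minimal Python sketch. This is not polyHap's code: the function names, the 0/1 allele coding, and the use of allele dosage as the unphased genotype are assumptions made for illustration, and the switch error shown is the common diploid definition, which may differ from the generalisation the paper uses for triploids and tetraploids.

```python
import random


def make_genotypes(haplotypes, ploidy, n_individuals, seed=0):
    """Build artificial individuals by combining known haplotypes.

    Assumption: each haplotype is a list of 0/1 alleles of equal length,
    and an unphased genotype is the per-site allele dosage (0..ploidy).
    Returns a list of (true_haplotypes, dosage) pairs.
    """
    rng = random.Random(seed)
    individuals = []
    for _ in range(n_individuals):
        # Draw `ploidy` distinct haplotypes; these are the true phase.
        chosen = rng.sample(haplotypes, ploidy)
        # Summing alleles site by site discards the phase information.
        dosage = [sum(alleles) for alleles in zip(*chosen)]
        individuals.append((chosen, dosage))
    return individuals


def diploid_switch_error_rate(true_pair, inferred_pair):
    """Switch error rate for one diploid individual.

    Counts how often the inferred phase flips relative to the truth between
    consecutive heterozygous sites. Assumes the inferred haplotypes carry
    the same genotype as the truth at every heterozygous site.
    """
    t0, t1 = true_pair
    het_sites = [i for i in range(len(t0)) if t0[i] != t1[i]]
    if len(het_sites) < 2:
        return 0.0  # no adjacent heterozygous pair, so no switch is possible
    # True where inferred haplotype 0 agrees with true haplotype 0.
    orientation = [inferred_pair[0][i] == t0[i] for i in het_sites]
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(het_sites) - 1)
```

For example, `make_genotypes(haps, ploidy=4, n_individuals=10)` would produce ten artificial tetraploids from a pool `haps` of at least four haplotypes; a phasing method would see only the dosage vectors, and its output would be compared against the retained true haplotypes.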

Type: Article
Title: Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions
Open access status: An open access version is available from UCL Discovery
DOI: 10.1186/1471-2105-9-513
Publisher version: http://dx.doi.org/10.1186/1471-2105-9-513
Language: English
Additional information: © 2008 Su et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Keywords: EXPECTATION-MAXIMIZATION ALGORITHM, SINGLE-NUCLEOTIDE POLYMORPHISMS, HIDDEN MARKOV MODEL, LINKAGE DISEQUILIBRIUM, DISEASE ASSOCIATION, POPULATION
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/101239