UCL Discovery

Identification of breed contributions in crossbred dogs

Doehring, O; (2015) Identification of breed contributions in crossbred dogs. Doctoral thesis, UCL (University College London). Green open access

Available under License : See the attached licence file.

There has recently been strong public interest in investigating canine ancestry with direct-to-consumer (DTC) genetic ancestry inference tools. Our goal is to improve the accuracy of the associated computational tools by developing superior algorithms for identifying the breed composition of mixed-breed dogs. Genetic test data, based on SNP markers, were provided by Mars Veterinary. We approach this ancestry inference problem from two main directions. The first approach is optimized for datasets composed of a small number of ancestry informative markers (AIMs). We first compute haplotype frequencies from purebred ancestral panels; these characterize genetic variation within breeds and are used to predict breed compositions. Because the number of possible breed combinations in admixed dogs is large, we approximately sample this search space with a Metropolis-Hastings algorithm. As the proposal density, we either sample new breeds for the lineage uniformly, or we bias the Markov chain so that breeds in the lineage are more likely to be replaced by similar breeds. The second direction is dominated by hidden Markov model (HMM) approaches, which view genotypes as realizations of latent variable sequences corresponding to breeds. In this approach an admixed canine sample is viewed as a linear combination of segments from dogs in the ancestral panel. Results were evaluated using two performance measures. First, we looked at a generalization of binary ROC curves to multi-class classification problems. Second, to judge breed contribution approximations more accurately, we computed the difference between expected and predicted breed contributions. Experimental results on a synthetic, admixed test dataset using AIMs showed that the MCMC approach successfully predicts breed proportions for a variety of lineage complexities. However, because of exploration in the MCMC algorithm, true breed contributions are underestimated. The HMM approach performed less well, presumably because it uses less of the information in the dataset.
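
As a rough illustration of the sampling idea described in the abstract, the following Python sketch runs a Metropolis-Hastings chain over breed lineages with the uniform proposal (one lineage slot is replaced by a uniformly chosen breed). It is not the thesis implementation. The function names, the four-slot lineage, the uniform prior over lineages, and in particular the per-marker binomial likelihood built from allele frequencies are all assumptions made here for brevity; the abstract describes a model based on haplotype frequencies, which this sketch does not reproduce.

```python
import math
import random
from collections import Counter


def log_likelihood(lineage, genotype, freqs):
    """Toy log-likelihood (an assumption, not the thesis model).

    Each marker's allele frequency is the average of the breed frequencies
    over the lineage slots; genotypes (0, 1 or 2 copies of the reference
    allele) are treated as independent binomial draws.
    """
    total = 0.0
    for m, g in enumerate(genotype):
        q = sum(freqs[b][m] for b in lineage) / len(lineage)
        q = min(max(q, 1e-6), 1.0 - 1e-6)  # keep log() finite
        total += math.log(math.comb(2, g))
        total += g * math.log(q) + (2 - g) * math.log(1.0 - q)
    return total


def propose_uniform(lineage, breeds, rng):
    """Replace one randomly chosen lineage slot with a uniformly drawn breed."""
    candidate = list(lineage)
    candidate[rng.randrange(len(candidate))] = rng.choice(breeds)
    return candidate


def estimate_contributions(genotype, freqs, n_slots=4, n_iter=5000,
                           burn_in=1000, seed=0):
    """Average breed proportions over the post-burn-in chain states."""
    rng = random.Random(seed)
    breeds = sorted(freqs)
    current = [rng.choice(breeds) for _ in range(n_slots)]
    current_ll = log_likelihood(current, genotype, freqs)
    totals = Counter()
    kept = 0
    for step in range(n_iter):
        candidate = propose_uniform(current, breeds, rng)
        candidate_ll = log_likelihood(candidate, genotype, freqs)
        # The proposal is symmetric and the prior is uniform, so the
        # acceptance probability reduces to the likelihood ratio.
        accept_prob = math.exp(min(0.0, candidate_ll - current_ll))
        if rng.random() < accept_prob:
            current, current_ll = candidate, candidate_ll
        if step >= burn_in:
            for breed in current:
                totals[breed] += 1.0 / n_slots
            kept += 1
    return {breed: totals[breed] / kept for breed in breeds}


if __name__ == "__main__":
    # Hypothetical reference-allele frequencies for three made-up breeds.
    n_markers = 30
    freqs = {
        "A": [0.95] * n_markers,
        "B": [0.05] * n_markers,
        "C": [0.50] * n_markers,
    }
    genotype = [2] * n_markers  # a sample resembling breed "A"
    print(estimate_contributions(genotype, freqs))
```

The biased proposal mentioned in the abstract would replace `rng.choice(breeds)` with a draw weighted by breed similarity; because that proposal is generally not symmetric, the acceptance ratio would then also need the proposal-density correction term.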

Type: Thesis (Doctoral)
Title: Identification of breed contributions in crossbred dogs
Open access status: An open access version is available from UCL Discovery
Language: English
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
URI: https://discovery.ucl.ac.uk/id/eprint/1464461