UCL Discovery

Who are you? A framework to identify and report genetic sample mix‐ups

Duntsch, L; Brekke, P; Ewen, JG; Santure, AW (2022) Who are you? A framework to identify and report genetic sample mix‐ups. Molecular Ecology Resources, 22 (5) pp. 1855-1867. 10.1111/1755-0998.13575. Green open access

Abstract

Sample mix-ups occur when samples have accidentally been duplicated, mislabelled or swapped. When samples are subsequently genotyped or sequenced, this can lead to individual IDs being incorrectly linked to genetic data, resulting in incorrect or biased research results, or reduced power to detect true biological patterns. We surveyed the community and found that almost 80% of responding researchers have encountered sample mix-ups. However, many recent studies in the field of molecular ecology do not appear to systematically report individual assignment checks as part of their publications. Although checks may be done, lack of consistent reporting means that it is difficult to assess whether sample mix-ups have occurred or been detected. Here, we present an easy-to-follow sample verification framework that can utilise existing metadata, including species, population structure, sex and pedigree information. We demonstrate its application to a data set representing individuals of a threatened Aotearoa New Zealand bird species, the hihi, genotyped on a 50K SNP array. We detected numerous incorrect genotype-ID associations when comparing observed and genetic sex or comparing to relationships in a verified microsatellite pedigree. The framework proposed here helped to confirm 488 individuals (39%), correct another 20 bird-genotype links, and detect hundreds of incorrect sample IDs, emphasizing the value of routinely checking genetic and genomic data sets for their accuracy. We therefore promote the implementation and reporting of this simple yet effective sample verification framework as a standardized quality control step for studies in the field of molecular ecology.
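
Two of the checks named in the abstract, the duplicate check and the comparison of observed against genetic sex, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy genotypes (coded as 0/1/2 allele counts, with None for missing calls), the sample IDs and the 0.95 concordance threshold are all hypothetical choices made for illustration, and the genetic sex calls are assumed to have been inferred beforehand.

```python
from itertools import combinations


def concordance(g1, g2):
    """Fraction of jointly non-missing SNPs with identical genotype calls."""
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not shared:
        return None  # no SNP was called in both samples
    return sum(a == b for a, b in shared) / len(shared)


def find_duplicates(genotypes, threshold=0.95):
    """Return pairs of sample IDs whose concordance meets the threshold.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    pairs = []
    for id1, id2 in combinations(sorted(genotypes), 2):
        c = concordance(genotypes[id1], genotypes[id2])
        if c is not None and c >= threshold:
            pairs.append((id1, id2))
    return pairs


def sex_mismatches(recorded_sex, genetic_sex):
    """Return sorted sample IDs whose recorded and genetic sex disagree."""
    return sorted(
        sid
        for sid, sex in recorded_sex.items()
        if sid in genetic_sex and genetic_sex[sid] != sex
    )


# Hypothetical toy data: three samples typed at six SNPs.
genotypes = {
    "A": [0, 1, 2, 1, 0, 2],
    "B": [0, 1, 2, 1, 0, 2],
    "C": [2, 0, 1, None, 1, 0],
}
recorded_sex = {"A": "F", "B": "M", "C": "M"}
genetic_sex = {"A": "F", "B": "F", "C": "M"}  # assumed to be inferred upstream

duplicates = find_duplicates(genotypes)
mismatches = sex_mismatches(recorded_sex, genetic_sex)
print("Possible duplicates:", duplicates)
print("Sex mismatches:", mismatches)
```

In this toy data, samples A and B carry identical genotypes under different IDs, and B's recorded sex disagrees with its genetic sex, so both checks point to the same suspect label. A real data set would need a threshold that allows for genotyping error.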

Type: Article
Title: Who are you? A framework to identify and report genetic sample mix‐ups
Open access status: An open access version is available from UCL Discovery
DOI: 10.1111/1755-0998.13575
Publisher version: https://doi.org/10.1111/1755-0998.13575
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: data QC, duplicate check, framework, pedigree verification, sample mix-up, SNP array data
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
URI: https://discovery.ucl.ac.uk/id/eprint/10141642