UCL Discovery

A pan-cancer landscape of somatic mutations in non-unique regions of the human genome

Tarabichi, M; Demeulemeester, J; Verfaillie, A; Flanagan, AM; Van Loo, P; Konopka, T; (2021) A pan-cancer landscape of somatic mutations in non-unique regions of the human genome. Nature Biotechnology, 39, pp. 1589-1596. 10.1038/s41587-021-00971-y. Green open access

Abstract

A substantial fraction of the human genome displays high sequence similarity with at least one other genomic sequence, posing a challenge for the identification of somatic mutations from short-read sequencing data. Here we annotate genomic variants in 2,658 cancers from the Pan-Cancer Analysis of Whole Genomes (PCAWG) cohort with links to similar sites across the human genome. We train a machine learning model to use signals distributed over multiple genomic sites to call somatic events in non-unique regions and validate the data against linked-read sequencing in an independent dataset. Using this approach, we uncover previously hidden mutations in ~1,700 coding sequences and in thousands of regulatory elements, including in known cancer genes, immunoglobulins and highly mutated gene families. Mutations in non-unique regions are consistent with mutations in unique regions in terms of mutation burden and substitution profiles. The analysis provides a systematic summary of the mutation events in non-unique regions at a genome-wide scale across multiple human cancers.

Type: Article
Title: A pan-cancer landscape of somatic mutations in non-unique regions of the human genome
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1038/s41587-021-00971-y
Publisher version: http://dx.doi.org/10.1038/s41587-021-00971-y
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Bioinformatics, Cancer genomics, Genome informatics, Genomics
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Cancer Institute
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Cancer Institute > Research Department of Pathology
URI: https://discovery.ucl.ac.uk/id/eprint/10132482
Downloads since deposit: 78
