UCL Discovery

High-throughput sequencing analysis pipeline

Mozere, M. (2016) High-throughput sequencing analysis pipeline. Doctoral thesis, UCL (University College London).

Full text not available from this repository.

Abstract

High-throughput sequencing methods were developed to increase the productivity of processing genomic DNA data. Sequencing platforms now generate massive amounts of genetic variation data, which makes it difficult to pinpoint the small subset of functionally important variants. The focus has therefore shifted from generating sequences to searching for the critical differences that separate normal variants from disease-causing ones. Our High-throughput Sequencing Analysis Pipeline (HSAP) is a multistep analysis tool designed to annotate and filter variants from Variant Call Format (VCF) files in a top-down fashion in order to find disease-causing variants in patients. It is designed to run on Linux and is composed of a collection of interacting task-specific modules written in different programming languages (such as Python and C++) and shell scripts. Each module performs a specific task, such as annotating variants with their functional characterisation, zygosity status, and population allele frequencies, or filtering variants by inherited disease model, read depth, call quality, physical location, and other criteria. The output is added to a VCF file, which contains the annotated and filtered genomic variants. The pipeline was verified by identifying/confirming a specific disease-causing mutation for a single-gene disorder. HSAP is designed as open-source, locally self-contained, bootable software that uses only information from publicly available databases. It has a user-friendly offline web interface that allows users to select different modules and chain them together to create custom filtering arrangements, adapting the pipeline as needed.
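The idea of chaining task-specific filter modules over VCF records can be illustrated with a minimal Python sketch. This is not HSAP code: the function names (`parse_vcf`, `min_quality`, `min_depth`, `max_allele_frequency`, `run_pipeline`), the thresholds, and the example records are all hypothetical and chosen only for illustration. The sketch assumes single-allele records whose INFO column carries the standard `DP` (read depth) and `AF` (allele frequency) keys.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Module = Callable[[Iterable[Record]], Iterator[Record]]


def parse_vcf(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one dict per data line, using the first eight VCF columns."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, header, and blank lines
        chrom, pos, _id, ref, alt, qual, _flt, info = line.rstrip("\n").split("\t")[:8]
        info_map = {}
        for item in info.split(";"):
            key, _, value = item.partition("=")
            info_map[key] = value
        yield {
            "chrom": chrom,
            "pos": int(pos),
            "ref": ref,
            "alt": alt,
            # Assumption: a missing QUAL (".") is treated as 0.
            "qual": float(qual) if qual != "." else 0.0,
            "info": info_map,
        }


def min_quality(threshold: float) -> Module:
    """Hypothetical module: keep records with QUAL >= threshold."""
    def module(records):
        return (r for r in records if r["qual"] >= threshold)
    return module


def min_depth(threshold: int) -> Module:
    """Hypothetical module: keep records with INFO DP >= threshold."""
    def module(records):
        return (r for r in records if int(r["info"].get("DP", 0)) >= threshold)
    return module


def max_allele_frequency(threshold: float) -> Module:
    """Hypothetical module: keep records with INFO AF <= threshold."""
    def module(records):
        return (r for r in records if float(r["info"].get("AF", 0.0)) <= threshold)
    return module


def run_pipeline(records: Iterable[Record], modules: List[Module]) -> List[Record]:
    """Apply each module in order; each one consumes the previous one's output."""
    for module in modules:
        records = module(records)
    return list(records)


# Made-up example records, for illustration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\tDP=30;AF=0.001",
    "1\t2000\t.\tC\tT\t10\tPASS\tDP=40;AF=0.002",
    "2\t3000\t.\tG\tA\t60\tPASS\tDP=5;AF=0.001",
    "2\t4000\t.\tT\tC\t70\tPASS\tDP=35;AF=0.2",
]

kept = run_pipeline(
    parse_vcf(EXAMPLE_VCF),
    [min_quality(30), min_depth(10), max_allele_frequency(0.01)],
)
```

Because every module has the same shape (an iterable of records in, an iterator of records out), modules can be reordered or dropped without changing the others, which is one plausible way to support user-defined filtering arrangements.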

Type: Thesis (Doctoral)
Title: High-throughput sequencing analysis pipeline
Event: UCL
Language: English
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Div of Medicine
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Div of Medicine > Renal Medicine
URI: https://discovery.ucl.ac.uk/id/eprint/1528797