UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Bayesian mixture models for metagenomic community profiling

Morfopoulou, S; (2015) Bayesian mixture models for metagenomic community profiling. Doctoral thesis , (UCL) University College London. Green open access

[thumbnail of Morfopoulou_final_thesis.pdf]
Preview
Text
Morfopoulou_final_thesis.pdf

Download (3MB) | Preview

Abstract

Metagenomics can be defined as the study of DNA sequences from environmental or community samples. This is a rapidly progressing field and application ideas that seemed outlandish a few years ago are now routine and familiar. Metagenomics’ scope is broad and includes the analysis of a diverse set of samples such as environmental or clinical samples. Human tissues are in essence metagenomic samples due to the presence of microorganisms, such as bacteria, viruses and fungi in both healthy and diseased individuals. Deep sequencing of clinical samples is now an established tool for pathogen detection, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, particularly for viruses. The research presented in this thesis focuses on using Bayesian Mixture Model techniques to produce taxonomic profiles for metagenomic data. A novel Bayesian mixture model framework for resolving complex metagenomic mixtures is introduced, called metaMix. The use of parallel Monte Carlo Markov chains (MCMC) for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture. The improved accuracy of metaMix compared to relevant methods is demonstrated, particularly for profiling complex communities consisting of several related species. metaMix was designed specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection. However, the principles are generally applicable to all types of metagenomic mixtures. metaMix is implemented as a user friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix.

Type: Thesis (Doctoral)
Title: Bayesian mixture models for metagenomic community profiling
Event: University College London
Open access status: An open access version is available from UCL Discovery
Language: English
Keywords: metagenomics, statistical genetics, MCMC, bioinformatics
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health > Infection, Immunity and Inflammation Dept
URI: https://discovery.ucl.ac.uk/id/eprint/1473450
Downloads since deposit
416Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item