ncRNA orthologies in the vertebrate lineage.

Pignatelli, M; Vilella, AJ; Muffato, M; Gordon, L; White, S; Flicek, P; Herrero, J; (2016) ncRNA orthologies in the vertebrate lineage. Database (Oxford) , 2016 , Article bav127. 10.1093/database/bav127. Green open access

Full text: ncRNA orthologies in the vertebrate lineage.pdf (Published Version, 2MB)

Abstract

Annotation of orthologous and paralogous genes is necessary for many aspects of evolutionary analysis. Methods to infer these homology relationships have traditionally focused on protein-coding genes, and the evolutionary models used by these methods normally assume that the positions in the protein evolve independently. However, as our appreciation for the roles of non-coding RNA genes has increased, consistently annotated sets of orthologous and paralogous ncRNA genes are increasingly needed. At the same time, methods such as PHASE or RAxML have implemented substitution models that consider pairs of sites to enable proper modelling of the loops and other features of RNA secondary structure. Here, we present a comprehensive analysis pipeline for the automatic detection of orthologues and paralogues for ncRNA genes. We focus on gene families represented in Rfam and for which a specific covariance model is provided. For each family, ncRNA genes found in all Ensembl species are aligned using Infernal, and several trees are built using different substitution models. In parallel, a genomic alignment that includes the ncRNA genes and their flanking sequence regions is built with PRANK. This alignment is used to create two additional phylogenetic trees using the neighbour-joining (NJ) and maximum-likelihood (ML) methods. The trees arising from both the ncRNA and genomic alignments are merged using TreeBeST, which reconciles them with the species tree in order to identify speciation and duplication events. The final tree is used to infer the orthologues and paralogues following Fitch's definition. We also determine gene gain and loss events for each family using CAFE. All data are accessible through the Ensembl Comparative Genomics ('Compara') API and on our FTP site, and are fully integrated in the Ensembl genome browser, where they can be accessed in a user-friendly manner.

Database URL: http://www.ensembl.org
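
The reconciliation step described in the abstract can be illustrated with a minimal sketch. It is not TreeBeST's implementation: it assumes a plain lowest-common-ancestor (LCA) reconciliation, in which a gene-tree node is labelled a duplication when it maps to the same species-tree clade as one of its children and a speciation otherwise. Gene pairs are then classified by the event at their most recent common ancestor, in the spirit of Fitch's definition. The tree encoding (nested tuples), the "species|gene" leaf labels, the function names and the toy gene family are all assumptions made for illustration.

```python
from itertools import product

def clades(tree):
    """List every clade (as a frozenset of species) in a nested-tuple species tree.
    The last element of the returned list is the clade rooted at `tree` itself."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    found, union = [], frozenset()
    for child in tree:
        sub = clades(child)
        found.extend(sub)
        union |= sub[-1]
    found.append(union)
    return found

def lca(all_clades, species):
    """Smallest species-tree clade that contains every species in `species`."""
    return min((c for c in all_clades if species <= c), key=len)

def reconcile(gene_tree, all_clades, pairs):
    """Map a gene-tree node onto the species tree and classify gene pairs.
    Returns (genes under the node, species clade the node maps to)."""
    if isinstance(gene_tree, str):
        species = gene_tree.split("|")[0]  # hypothetical "species|gene" label
        return [gene_tree], frozenset([species])
    children = [reconcile(c, all_clades, pairs) for c in gene_tree]
    mapped = lca(all_clades, frozenset().union(*(m for _, m in children)))
    # Duplication: the node maps to the same clade as at least one child.
    duplication = any(m == mapped for _, m in children)
    kind = "paralogue" if duplication else "orthologue"
    # Genes in different child subtrees have this node as their common ancestor.
    for i in range(len(children)):
        for j in range(i + 1, len(children)):
            for a, b in product(children[i][0], children[j][0]):
                pairs[frozenset((a, b))] = kind
    return [g for genes, _ in children for g in genes], mapped

# Toy example (hypothetical family): a duplication in the human/mouse ancestor.
species_tree = (("human", "mouse"), "zebrafish")
gene_tree = ((("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")), "zebrafish|A")

pairs = {}
reconcile(gene_tree, clades(species_tree), pairs)
for pair, kind in sorted(pairs.items(), key=lambda kv: sorted(kv[0])):
    print(" / ".join(sorted(pair)), "->", kind)
```

In this toy family, the two A1 genes and the two A2 genes each separate at a speciation node and so are orthologues, any A1 gene paired with any A2 gene is a paralogue pair, and the zebrafish gene is an orthologue of all four.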

Type: Article
Title: ncRNA orthologies in the vertebrate lineage.
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/database/bav127
Publisher version: http://dx.doi.org/10.1093/database/bav127
Language: English
Additional information: © The Author(s) 2016. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Cancer Institute
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Cancer Institute > Research Department of Cancer Bio
URI: https://discovery.ucl.ac.uk/id/eprint/1476886