UCL Discovery

Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology.

Wittwer, LD; Piližota, I; Altenhoff, AM; Dessimoz, C; (2014) Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology. PeerJ, 2, Article e607. 10.7717/peerj.607. Green open access

Abstract

Orthology inference and other sequence analyses across multiple genomes typically start by performing exhaustive pairwise sequence comparisons, a process referred to as "all-against-all". Because this process scales quadratically with the number of sequences analysed, it can become a bottleneck that limits the number of genomes that can be analysed simultaneously. Here, we explored ways of speeding up the all-against-all step while maintaining its sensitivity. By exploiting the transitivity of homology and, crucially, ensuring that homology is defined in terms of consistent protein subsequences, our proof-of-concept achieved a 4× speedup while recovering >99.6% of all homologs identified by the full all-against-all procedure on empirical sequence sets. In comparison, state-of-the-art k-mer approaches are orders of magnitude faster but recover only 3–14% of all homologous pairs. We also outline ideas to further improve the speed and recall of the new approach. An open source implementation is provided as part of the OMA standalone software at http://omabrowser.org/standalone.
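
The idea of combining transitivity with subsequence-level consistency can be illustrated with a toy sketch. This is not the OMA standalone implementation; the function names, the interval representation, and the `min_overlap` threshold are all hypothetical and chosen only for illustration. It assumes each protein has already been aligned to a shared representative sequence, and that each hit is recorded as a half-open interval on that representative. Two proteins are proposed as a candidate homologous pair only if their hits cover a common stretch of the representative, which guards against linking proteins that match different domains.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Half-open interval [start, end) on the representative sequence (assumed layout).
Interval = Tuple[int, int]


def all_against_all_count(n: int) -> int:
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def overlap(a: Interval, b: Interval) -> int:
    """Length of the region shared by two intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def candidate_pairs(hits: Dict[str, Interval], min_overlap: int) -> List[Tuple[str, str]]:
    """Propose pairs whose hits share at least min_overlap positions
    on the same representative (hypothetical threshold)."""
    pairs = []
    for (p, iv_p), (q, iv_q) in combinations(sorted(hits.items()), 2):
        if overlap(iv_p, iv_q) >= min_overlap:
            pairs.append((p, q))
    return pairs


# Hypothetical hits of three proteins against one representative.
hits = {"A": (0, 120), "B": (100, 300), "C": (20, 110)}
print(candidate_pairs(hits, min_overlap=50))
print(all_against_all_count(1000))
```

In this example A and C match the same region of the representative and are proposed as a pair, whereas B matches a mostly different region and is not linked to either by transitivity alone. Any pair proposed this way would still need to be confirmed by a proper alignment.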

Type: Article
Title: Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology.
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.7717/peerj.607
Publisher version: http://dx.doi.org/10.7717/peerj.607
Language: English
Additional information: © 2014 Wittwer et al. Distributed under Creative Commons CC-BY 4.0
Keywords: All-against-all, Homology, Orthology, Sequence alignment, Smith–Waterman
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/1451937