UCL Discovery

Linear functional organization of the omic embedding space

Xenos, A; Malod-Dognin, N; Milinkovic, S; Przulj, N; (2021) Linear functional organization of the omic embedding space. Bioinformatics, 37 (21) pp. 3839-3847. 10.1093/bioinformatics/btab487. Green open access

btab487.pdf - Published Version


Abstract

Motivation: We are increasingly accumulating complex omics data that capture different aspects of cellular functioning. A key challenge is to untangle their complexity and effectively mine them for new biomedical information. To decipher this new information, we introduce algorithms based on network embeddings. Such algorithms represent biological macromolecules as vectors in d-dimensional space, in which topologically similar molecules are embedded close in space and knowledge is extracted directly by vector operations. Recently, it has been shown that neural networks used to obtain vectorial representations (embeddings) are implicitly factorizing a mutual information matrix, called the Positive Pointwise Mutual Information (PPMI) matrix. Thus, we propose the use of the PPMI matrix to represent the human protein-protein interaction (PPI) network and also introduce the graphlet degree vector PPMI matrix of the PPI network to capture different topological (structural) similarities of the nodes in the molecular network.

Results: We generate the embeddings by decomposing these matrices with Nonnegative Matrix Tri-Factorization. We demonstrate that genes that are embedded close in these spaces have similar biological functions, so we can extract new biomedical knowledge directly by doing linear operations on their embedding vector representations. We exploit this property to predict new genes participating in protein complexes and to identify new cancer-related genes based on the cosine similarities between the vector representations of the genes. We validate 80% of our novel cancer-related gene predictions in the literature and also by patient survival curves, which demonstrate that 93.3% of them have potential clinical relevance as biomarkers of cancer.
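
As a rough illustration of two ingredients named in the abstract, the sketch below computes a PPMI matrix and compares nodes by cosine similarity. It is not the authors' pipeline. It assumes the adjacency matrix of a toy graph can stand in for the co-occurrence counts, and it compares rows of the PPMI matrix directly instead of embeddings obtained by Nonnegative Matrix Tri-Factorization. The function names and the four-node graph are hypothetical.

```python
import math


def ppmi_matrix(counts):
    """Positive pointwise mutual information of a square count matrix.

    Assumption: `counts` is a co-occurrence matrix (here simply a graph
    adjacency matrix) given as a list of lists of non-negative numbers.
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    ppmi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if counts[i][j] > 0:
                # PMI = log( p(i, j) / (p(i) * p(j)) ), clipped at zero.
                pmi = math.log(counts[i][j] * total / (row_sums[i] * col_sums[j]))
                ppmi[i][j] = max(pmi, 0.0)
    return ppmi


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy graph: nodes 0 and 1 are both linked to nodes 2 and 3,
# so 0 and 1 have identical neighbourhoods.
toy_adjacency = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]

toy_ppmi = ppmi_matrix(toy_adjacency)
sim_same_neighbourhood = cosine_similarity(toy_ppmi[0], toy_ppmi[1])
sim_different_neighbourhood = cosine_similarity(toy_ppmi[0], toy_ppmi[2])
```

In this toy case the two nodes with identical neighbourhoods get a cosine similarity of approximately 1, while nodes 0 and 2, which share no neighbours, get 0. The same similarity measure could be applied to lower-dimensional embedding vectors in place of PPMI rows.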

Type: Article
Title: Linear functional organization of the omic embedding space
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/bioinformatics/btab487
Publisher version: http://dx.doi.org/10.1093/bioinformatics/btab487
Language: English
Additional information: © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
Keywords: Science & Technology, Life Sciences & Biomedicine, Technology, Physical Sciences, Biochemical Research Methods, Biotechnology & Applied Microbiology, Computer Science, Interdisciplinary Applications, Mathematical & Computational Biology, Statistics & Probability, Biochemistry & Molecular Biology, Computer Science, Mathematics, PREDICTION, NETWORKS, TOPOLOGY
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10187845