Xenos, A; Malod-Dognin, N; Milinkovic, S; Przulj, N; (2021) Linear functional organization of the omic embedding space. Bioinformatics, 37 (21) pp. 3839-3847. 10.1093/bioinformatics/btab487.
Abstract
Motivation: We are increasingly accumulating complex omics data that capture different aspects of cellular functioning. A key challenge is to untangle their complexity and effectively mine them for new biomedical information. To decipher this new information, we introduce algorithms based on network embeddings. Such algorithms represent biological macromolecules as vectors in d-dimensional space, in which topologically similar molecules are embedded close in space and knowledge is extracted directly by vector operations. Recently, it has been shown that neural networks used to obtain vectorial representations (embeddings) are implicitly factorizing a mutual information matrix, called the Positive Pointwise Mutual Information (PPMI) matrix. Thus, we propose the use of the PPMI matrix to represent the human protein-protein interaction (PPI) network and also introduce the graphlet degree vector PPMI matrix of the PPI network to capture different topological (structural) similarities of the nodes in the molecular network.
Results: We generate the embeddings by decomposing these matrices with Nonnegative Matrix Tri-Factorization. We demonstrate that genes that are embedded close in these spaces have similar biological functions, so we can extract new biomedical knowledge directly by doing linear operations on their embedding vector representations. We exploit this property to predict new genes participating in protein complexes and to identify new cancer-related genes based on the cosine similarities between the vector representations of the genes. We validate 80% of our novel cancer-related gene predictions in the literature and also by patient survival curves, which demonstrate that 93.3% of them have potential clinical relevance as biomarkers of cancer.
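As a rough illustration of two ideas named in the abstract, the sketch below builds a PPMI matrix from a small network and compares nodes by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' pipeline: co-occurrence counts are taken directly from the adjacency matrix (one-step neighbourhoods), the function names `ppmi_matrix` and `cosine_similarity` are hypothetical, and rows of the PPMI matrix stand in for embedding vectors, whereas the paper obtains embeddings by Nonnegative Matrix Tri-Factorization, which is not shown here.

```python
import numpy as np


def ppmi_matrix(adj):
    """Positive Pointwise Mutual Information of a co-occurrence matrix.

    Assumption: `adj` (e.g. a PPI adjacency matrix) is used directly as
    the co-occurrence count matrix.
    """
    adj = np.asarray(adj, dtype=float)
    total = adj.sum()
    row = adj.sum(axis=1, keepdims=True)
    col = adj.sum(axis=0, keepdims=True)
    # Expected co-occurrence if rows and columns were independent.
    expected = row * col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(adj / expected)
    # log(0) and 0/0 entries carry no positive association.
    pmi[~np.isfinite(pmi)] = 0.0
    # Keep only positive associations.
    return np.maximum(pmi, 0.0)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0 or nv == 0:
        return 0.0
    return float(np.dot(u, v) / (nu * nv))


# Toy network: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
toy_adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
toy_ppmi = ppmi_matrix(toy_adj)
similarity_0_1 = cosine_similarity(toy_ppmi[0], toy_ppmi[1])
```

In a setting like the one the abstract describes, the vectors passed to `cosine_similarity` would be the gene embedding vectors rather than raw PPMI rows.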
Type: | Article |
---|---|
Title: | Linear functional organization of the omic embedding space |
Location: | England |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1093/bioinformatics/btab487 |
Publisher version: | http://dx.doi.org/10.1093/bioinformatics/btab487 |
Language: | English |
Additional information: | © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
Keywords: | Science & Technology, Life Sciences & Biomedicine, Technology, Physical Sciences, Biochemical Research Methods, Biotechnology & Applied Microbiology, Computer Science, Interdisciplinary Applications, Mathematical & Computational Biology, Statistics & Probability, Biochemistry & Molecular Biology, Computer Science, Mathematics, PREDICTION, NETWORKS, TOPOLOGY |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10187845 |