eprintid: 10192201
rev_number: 9
eprint_status: archive
userid: 699
dir: disk0/10/19/22/01
datestamp: 2024-05-13 11:27:32
lastmod: 2024-05-13 11:27:32
status_changed: 2024-05-13 11:27:32
type: article
metadata_visibility: show
sword_depositor: 699
creators_name: Kashif-Khan, Naail
creators_name: Savva, Renos
creators_name: Frank, Stefanie
title: Mining metagenomics data for novel bacterial nanocompartments
ispublished: pub
divisions: UCL
divisions: B04
divisions: C05
divisions: F47
keywords: Science & Technology, Life Sciences & Biomedicine, Genetics & Heredity, Mathematical & Computational Biology, PROTEIN-STRUCTURE
note: © The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
abstract: Encapsulin nanocompartments are prokaryotic protein-based organelles. They display diverse natural functions, including mineral storage and stress response. Encapsulins also have applications in synthetic biology, drug delivery, vaccines, and metabolic engineering. Discovering novel encapsulins is challenging due to inconsistent annotations, and data contamination due to similarity with phage proteins. Previous studies have discovered thousands of encapsulin sequences from bacteria and archaea, but metagenomics databases were not specifically interrogated. Metagenomics can provide information on a much larger diversity of unculturable organisms and environmental samples than conventional sequencing experiments, and metagenomic protein databases have shed light on previously unexplored regions of the protein universe. This study leverages developments in deep learning for structure and function prediction, to produce a dataset of over 1300 novel putative encapsulin sequences from the MGnify Protein Database. Some well-known encapsulins and their cargo proteins were identified, predominantly peroxidases and ferritin-like proteins. A potentially novel encapsulin-associated biosynthetic gene cluster involved in producing cytotoxic or antimicrobial saccharides was discovered using biosynthetic gene cluster prediction. Finally, a cluster of predicted structures with novel features not seen in experimentally solved encapsulin structures was discovered using large-scale, deep learning-based structure prediction of putative metagenomic encapsulins.
date: 2024-03-07
date_type: published
publisher: OXFORD UNIV PRESS
official_url: https://doi.org/10.1093/nargab/lqae025
oa_status: green
full_text_type: pub
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2262034
doi: 10.1093/nargab/lqae025
lyricists_name: Frank, Stefanie
lyricists_id: SFRAN44
actors_name: Frank, Stefanie
actors_id: SFRAN44
actors_role: owner
funding_acknowledgements: BB/T008709/1 [Biotechnology and Biological Sciences Research Council]; [London Interdisciplinary Biosciences Consortium Doctoral Training Partnership]; [Oracle for Research]; EP/R013756/1 [EPSRC]
full_text_status: public
publication: NAR Genomics and Bioinformatics
volume: 6
number: 1
article_number: lqae025
pages: 13
issn: 2631-9268
citation: Kashif-Khan, Naail; Savva, Renos; Frank, Stefanie; (2024) Mining metagenomics data for novel bacterial nanocompartments. NAR Genomics and Bioinformatics, 6 (1), Article lqae025. 10.1093/nargab/lqae025 <https://doi.org/10.1093/nargab%2Flqae025>. Green open access
 
document_url: https://discovery.ucl.ac.uk/id/eprint/10192201/1/Frank_lqae025.pdf