UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Treemendous: an R package for integrating taxonomic information across backbones

Specker, F; Paz, A; Crowther, TW; Maynard, DS; (2024) Treemendous: an R package for integrating taxonomic information across backbones. PeerJ , 12 , Article e16896. 10.7717/peerj.16896. Green open access

[thumbnail of peerj-16896.pdf]
Preview
Text
peerj-16896.pdf - Published Version

Download (1MB) | Preview

Abstract

Standardizing and translating species names from different databases is key to the successful integration of data sources in biodiversity research. There are numerous taxonomic name-resolution applications that implement increasingly powerful name-cleaning and matching approaches, allowing the user to resolve species relative to multiple backbones simultaneously. Yet there remains no principled approach for combining information across these underlying taxonomic backbones, complicating efforts to combine and merge species lists with inconsistent and conflicting taxonomic information. Here, we present Treemendous, an open-source software package for the R programming environment that integrates taxonomic relationships across four publicly available backbones to improve the name resolution of tree species. By mapping relationships across the backbones, this package can be used to resolve datasets with conflicting and inconsistent taxonomic origins, while ensuring the resulting species are accepted and consistent with a single reference backbone. The user can chain together different functionalities ranging from simple matching to a single backbone, to graph-based iterative matching using synonym-accepted relations across all backbones in the database. In addition, the package allows users to ‘translate’ one tree species list into another, streamlining the assimilation of new data into preexisting datasets or models. The package provides a flexible workflow depending on the use case, and can either be used as a stand-alone name-resolution package or in conjunction with existing packages as a final step in the name-resolution pipeline. The Treemendous package is fast and easy to use, allowing users to quickly merge different data sources by standardizing their species names according to the regularly updated database. By combining taxonomic information across multiple backbones, the package increases matching rates and minimizes data loss, allowing for more efficient translation of tree species datasets to aid research into forest biodiversity and tree ecology.

Type: Article
Title: Treemendous: an R package for integrating taxonomic information across backbones
Open access status: An open access version is available from UCL Discovery
DOI: 10.7717/peerj.16896
Publisher version: https://doi.org/10.7717/peerj.16896
Language: English
Additional information: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
Keywords: Biodiversity research, Forest inventory, Nomenclature, R language, Taxonomic databases
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/10189470
Downloads since deposit
7Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item