
Large language models help facilitate the automated synthesis of information on potential pest controllers

Scheepens, Daan; Millard, Joseph; Farrell, Maxwell; Newbold, Tim (2024) Large language models help facilitate the automated synthesis of information on potential pest controllers. Methods in Ecology and Evolution, 10.1111/2041-210X.14341. (In press). Green open access

Methods Ecol Evol - 2024 - Scheepens - Large language models help facilitate the automated synthesis of information on.pdf - Published Version


Abstract

The body of ecological literature, which informs much of our knowledge of the global loss of biodiversity, has been experiencing rapid growth in recent decades. The increasing difficulty of synthesising this literature manually has simultaneously resulted in a growing demand for automated text mining methods. Within the domain of deep learning, large language models (LLMs) have been the subject of considerable attention in recent years due to great leaps in progress and a wide range of potential applications; however, quantitative investigation into their potential in ecology has so far been lacking. In this work, we analyse the ability of GPT‐4 to extract information about invertebrate pests and pest controllers from abstracts of articles on biological pest control, using a bespoke, zero‐shot prompt. Our results show that the performance of GPT‐4 is highly competitive with other state‐of‐the‐art tools used for taxonomic named entity recognition and geographic location extraction tasks. On a held‐out test set, we show that species and geographic locations are extracted with F1‐scores of 99.8% and 95.3%, respectively, and highlight that the model can effectively distinguish between ecological roles of interest such as predators, parasitoids and pests. Moreover, we demonstrate the model's ability to effectively extract and predict taxonomic information across various taxonomic ranks. However, we do report a small number of cases of fabricated information (confabulations). Due to a lack of specialised, pre‐trained ecological language models, general‐purpose LLMs may provide a promising way forward in ecology. Combined with tailored prompt engineering, such models can be employed for a wide range of text mining tasks in ecology, with the potential to greatly reduce time spent on manual screening and labelling of the literature.
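For readers unfamiliar with the F1-scores quoted in the abstract, the following minimal sketch shows how precision, recall and F1 can be computed for a set-based extraction task. It is an illustration only and not the authors' evaluation procedure: the function name `precision_recall_f1`, the exact-match comparison of names, and the example species lists are all assumptions made for this sketch.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall and F1 for one extraction task.

    Assumption: an extracted name counts as correct only if it matches
    a gold-standard name exactly; real evaluations may match differently.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for a single abstract (illustrative, not from the paper).
gold_species = {"Aphis gossypii", "Harmonia axyridis", "Aphidius colemani"}
predicted_species = {
    "Aphis gossypii",
    "Harmonia axyridis",
    "Coccinella septempunctata",  # a spurious extraction
}

precision, recall, f1 = precision_recall_f1(predicted_species, gold_species)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so a high score requires both few spurious extractions and few missed entities.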

Type: Article
Title: Large language models help facilitate the automated synthesis of information on potential pest controllers
Open access status: An open access version is available from UCL Discovery
DOI: 10.1111/2041-210X.14341
Publisher version: http://dx.doi.org/10.1111/2041-210x.14341
Language: English
Additional information: This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2024 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society.
Keywords: Science & Technology, Life Sciences & Biomedicine, Ecology, Environmental Sciences & Ecology, biocontrol, ChatGPT, GPT-4, large language model, prompt engineering, relation extraction, taxonomic named entity recognition, text mining
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery.ucl.ac.uk/id/eprint/10193462
Downloads since deposit: 15