UCL Discovery

Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity

Amirova, Aliya; Fteropoulli, Theodora; Ahmed, Nafiso; Cowie, Martin R; Leibo, Joel Z; (2024) Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity. PLoS One , 19 (3) , Article e0300024. 10.1371/journal.pone.0300024. Green open access

Text: journal.pone.0300024.pdf - Published Version (1MB)

Abstract

Today, with the advent of large-scale generative Language Models (LLMs), it is possible to simulate free responses to interview questions, such as those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative analysis methods in such a way as to generate insights that could generalize to real human populations. The key concept in our analysis is algorithmic fidelity, a validity concept capturing the degree to which LLM-generated outputs mirror human sub-populations' beliefs and attitudes. By definition, high algorithmic fidelity suggests that latent beliefs elicited from LLMs may generalize to real humans, whereas low algorithmic fidelity renders such research invalid. Here we used an LLM to generate interviews with "silicon participants" matching specific demographic characteristics one-for-one with a set of human participants. Using framework-based qualitative analysis, we showed that the key themes obtained from both human and silicon participants were strikingly similar. However, when we analyzed the structure and tone of the interviews, we found even more striking differences. We also found evidence of a hyper-accuracy distortion. We conclude that the LLM we tested (GPT-3.5) does not have sufficient algorithmic fidelity for in silico research on it to be expected to generalize to real human populations. However, rapid advances in artificial intelligence raise the possibility that algorithmic fidelity may improve in the future. Thus, we stress the need to establish epistemic norms now around how to assess the validity of LLM-based qualitative research, especially concerning the need to ensure the representation of heterogeneous lived experiences.

Type: Article
Title: Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1371/journal.pone.0300024
Publisher version: https://doi.org/10.1371/journal.pone.0300024
Language: English
Additional information: © 2024 Amirova et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords: Humans, Artificial Intelligence, Silicon, Language, Communication, Caffeine
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Division of Psychiatry
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Division of Psychiatry > Epidemiology and Applied Clinical Research
URI: https://discovery.ucl.ac.uk/id/eprint/10189467
