UCL Discovery

Annotating datasets in behavioural and social sciences to promote interoperability: development of the Schema for Ontology-based Dataset Annotation (SODA) version 1.0 [version 1; peer review: awaiting peer review]

West, Robert; Brown, Jamie; Shahab, Lion; Baird, Harriet; Webb, Thomas; Squires, Hazel; Tattan-Birch, Harry; ... Michie, Susan (2025) Annotating datasets in behavioural and social sciences to promote interoperability: development of the Schema for Ontology-based Dataset Annotation (SODA) version 1.0 [version 1; peer review: awaiting peer review]. Wellcome Open Research, 10, Article 455. 10.12688/wellcomeopenres.24234.1. Green open access

Abstract

Background and aims: Ontologies are increasingly employed to help find, use and synthesise information, but methods for using them to annotate documents and datasets remain in their infancy in the behavioural and social sciences. The Behavioural Research UK DEMO-DATA project aimed to develop a prototype schema for annotating datasets in behavioural and social sciences.

Methods: A case-study dataset (the ‘Smoking Toolkit Study’), used to inform an Agent-Based Model of trajectories in cigarette smoking and cessation in England, was chosen for annotation using two ontologies: the Behaviour Change Intervention Ontology (BCIO) and the Addiction Ontology (AddictO). The dataset included 21 variables representing sociodemographic attributes and tobacco and nicotine use attributes of the study population. A preliminary version of the schema for linking variables to ontology classes was developed as a basis for annotating each variable in the dataset. This was applied and revised iteratively until a panel of domain experts and modellers judged that it represented the variables accurately enough to enable searching for and integration of data.

Results: The prototype Schema for Ontology-based Dataset Annotation (SODA) version 1.0 was developed over seven iterations. Variables were represented by an ‘object property’|‘ontology class’ expression (e.g., ‘has characteristic’|‘extent of social smoking’) together with information about the data types (e.g., numbers, ontology subclasses, or Boolean values), measurement source, unit of measurement, any coding or data transformations and whether or not the variable was fully characterised by the annotation. The prototype schema was applied successfully to the smoking dataset, with 15 new ontology classes being created as required.

Conclusions: A prototype schema for annotating behavioural and social science datasets was developed and successfully applied to a dataset on smoking in England using ontology relations and classes. The next step is to further develop and evaluate the schema by applying it to case studies with a range of users and other datasets.
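
The annotation structure described in the Results above can be pictured as one record per dataset variable. The following is only an illustrative sketch, not the published SODA format: the class name `VariableAnnotation`, all field names, the helper `find_by_class`, and the example values for the variable name, data type and measurement source are assumptions made for illustration. Only the ‘has characteristic’|‘extent of social smoking’ expression is taken from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VariableAnnotation:
    """Hypothetical record for one annotated variable (field names are assumed)."""
    variable_name: str                        # column name in the dataset (assumed)
    object_property: str                      # ontology relation, e.g. 'has characteristic'
    ontology_class: str                       # ontology class the variable refers to
    data_type: str                            # e.g. number, ontology subclass, Boolean
    measurement_source: Optional[str] = None  # how the value was obtained
    unit: Optional[str] = None                # unit of measurement, if any
    coding: Optional[str] = None              # coding or data transformation notes
    fully_characterised: bool = True          # does the annotation capture the variable fully?

    def expression(self) -> str:
        # Render the 'object property'|'ontology class' pair as one string.
        return f"'{self.object_property}'|'{self.ontology_class}'"


def find_by_class(annotations: List[VariableAnnotation],
                  ontology_class: str) -> List[VariableAnnotation]:
    # Return every annotation that points at the given ontology class.
    return [a for a in annotations if a.ontology_class == ontology_class]


# Example values other than the expression itself are invented for illustration.
example = VariableAnnotation(
    variable_name="social_smoking",           # hypothetical variable name
    object_property="has characteristic",
    ontology_class="extent of social smoking",
    data_type="number",                       # assumed data type
    measurement_source="self-report",         # assumed source
)
print(example.expression())
```

Keeping the relation and the class as separate fields, rather than as a single string, is one way to support the kind of searching and data integration the abstract mentions, since records from different datasets could then be matched on the ontology class alone.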

Type: Article
Title: Annotating datasets in behavioural and social sciences to promote interoperability: development of the Schema for Ontology-based Dataset Annotation (SODA) version 1.0 [version 1; peer review: awaiting peer review]
Open access status: An open access version is available from UCL Discovery
DOI: 10.12688/wellcomeopenres.24234.1
Publisher version: https://doi.org/10.12688/wellcomeopenres.24234.1
Language: English
Additional information: Copyright © 2025 West R et al. This is an open access work distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Clinical, Edu and Hlth Psychology
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Epidemiology and Health > Behavioural Science and Health
URI: https://discovery.ucl.ac.uk/id/eprint/10212682