UCL Discovery

From Knowledge to Innovation: Using NLP to Identify Innovative Activity

Williams, Jennie (2025) From Knowledge to Innovation: Using NLP to Identify Innovative Activity. Doctoral thesis (Ph.D), UCL (University College London). Green open access.

Abstract

In response to the challenges posed by Brexit and the Covid-19 pandemic, the UK updated its industrial strategy to position itself as ‘a global hub for innovation’ by 2035. However, defining and measuring innovation remains complex, with standard metrics like R&D spending, patent activity, and researcher counts failing to capture the full spectrum of innovative activity, especially outside Science, Technology, Engineering, and Mathematics (STEM) fields. This highlights the need for alternative metrics to better understand and track innovation across a wider range of disciplines, including the arts, humanities, and social sciences. This thesis addresses this gap by using doctoral thesis content as a proxy for innovative activity. PhD theses, representing novel and non-trivial research, capture groundbreaking work across diverse fields. Using data from the British Library’s E-Thesis Online Service (EThOS), this research employs advanced Natural Language Processing (NLP) techniques, including word-to-document embeddings and machine learning algorithms, to process and analyse the unstructured metadata of PhD theses. The goal is to identify clusters of innovation by constructing a semantic space where research outputs are related based on their textual content. The analysis reveals innovative clusters across both STEM and non-STEM disciplines. Clusters in fields such as particle physics and photovoltaic materials highlight innovation in scientific areas, while clusters in archaeology, musicology, and urban planning demonstrate innovation outside STEM. The research also shows geographic spread and thematic cohesion within these clusters, highlighting the interdisciplinary nature of academic innovation. This study confirms that text-based analysis of doctoral research can effectively detect and classify innovative activity, offering a scalable methodology that captures the spectrum of academic innovation. The findings emphasise the potential to uncover hidden patterns and provide a richer understanding of the innovation landscape across all fields.
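
The general idea in the abstract, relating research outputs by their textual content and then grouping them, can be illustrated with a minimal sketch. This is not the thesis's pipeline: it substitutes plain bag-of-words vectors for the word-to-document embeddings, uses a simple average-linkage agglomerative procedure as one possible form of hierarchical cluster analysis, and runs on four invented titles rather than EThOS records. The function names and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter

# Invented example titles (NOT taken from EThOS or from the thesis).
TITLES = [
    "particle physics collider detector measurements",
    "detector calibration for particle physics experiments",
    "medieval music manuscripts and musicology",
    "musicology of medieval chant manuscripts",
]


def vectorize(text):
    """Bag-of-words term counts; a stand-in for a learned document embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(texts, threshold=0.2):
    """Average-linkage agglomerative clustering over document indices.

    Repeatedly merges the two most similar clusters and stops once the
    best available similarity drops below `threshold` (an assumed cut-off).
    """
    vectors = [vectorize(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        sims = [cosine(vectors[i], vectors[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = linkage(clusters[x], clusters[y])
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    for group in cluster(TITLES):
        print([TITLES[i] for i in group])
```

On this toy input the two physics titles and the two musicology titles end up in separate groups, because titles within each pair share vocabulary and titles across pairs share none. A realistic analysis would need dense embeddings and a far larger corpus to capture similarity between texts that do not share exact words.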

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: From Knowledge to Innovation: Using NLP to Identify Innovative Activity
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
Keywords: Knowledge mapping, Innovation, word embedding, Hierarchical Cluster Analysis, Metadata
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment > Centre for Advanced Spatial Analysis
URI: https://discovery.ucl.ac.uk/id/eprint/10207135