
Reflections on Infrastructures for Mining Nineteenth-Century Newspaper Data

Nyhan, J; Hauswedell, T; Tiedau, U; (2020) Reflections on Infrastructures for Mining Nineteenth-Century Newspaper Data. In: DIGITAL SCHOLARSHIP, DIGITAL CLASSROOMS: New International Perspectives in Research and Teaching. Proceedings of the Gale Digital Humanities Day at the British Library. (pp. 27-38). GALE: a Cengage Company: London, UK. Green open access

Nyhan_PDFsam_Nyhan_dsdc__full_text.pdf - Published version



In this study we compare and contrast our experiences (as historians and as digital humanities and information studies researchers) of seeking to mine large-scale historical datasets via university-based, high-performance computing infrastructures versus our experiences of using external, cloud-hosted platforms and tools to mine the same data. In particular, we reflect on our recent experiences in two large transnational digital humanities projects: Asymmetrical Encounters: E-Humanity Approaches to Reference Cultures in Europe, 1815–1992, which was funded by a Humanities in the European Research Area grant (2013–2016), and Oceanic Exchanges: Tracing Global Information Networks in Historical Newspaper Repositories 1840–1914, which was funded through the Transatlantic Partnership for Social Sciences and Humanities 2016 Digging into Data Challenge (2017–2019). As part of the research for both these projects we sought to mine the OCR text of nineteenth-century historical newspapers that had been mounted on UCL’s High Performance Computing Infrastructures from Gale’s TDM drives. We compare and contrast our experiences of this with our subsequent experiences of performing comparable tasks via Gale Digital Scholar Lab. We contextualise our experiences and observations within wider discourses and recommendations about infrastructural support for humanities-led analyses of large datasets and discuss the advantages and drawbacks of both approaches. We situate our discussions in the aforementioned infrastructural scenarios with reflections on the human experiences of undertaking this research, which represents a step change for many of those who work in the (digital) humanities. Finally, we conclude by discussing the public and private sector research investments that are needed to support further developments and to facilitate access to and critical interrogation of large-scale digital archives.

Type: Proceedings paper
Title: Reflections on Infrastructures for Mining Nineteenth-Century Newspaper Data
Event: Gale Digital Humanities Day
Location: British Library
Dates: 02 May 2019
Open access status: An open access version is available from UCL Discovery
Publisher version: https://www.gale.com/binaries/content/assets/gale-...
Language: English
Additional information: © Julianne Nyhan, Tessa Hauswedell, and Ulrich Tiedau. This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of the license, visit http://creativecommons.org/licenses/by/4.0.
Keywords: Digital Infrastructures; Text-Mining; Historical Newspaper Collections; High Performance Computing; Critical Cultural Heritage; Digital Humanities; Times Digital Archive; Times of London.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL SLASH
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > Dept of Information Studies
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > SELCS
URI: https://discovery.ucl.ac.uk/id/eprint/10090119