UCL Discovery

Automatic metadata generation in an archaeological digital library: Semantic annotation of grey literature

Vlachidis, A; Binding, C; May, K; Tudhope, D; (2013) Automatic metadata generation in an archaeological digital library: Semantic annotation of grey literature. Computational Linguistics, 458, pp. 187-202. 10.1007/978-3-642-34399-5_10.

Vlachidis_Automatic Metadata Generation Extended.pdf - Accepted Version
Access restricted to UCL open access staff

Abstract

This paper discusses the automatic generation of rich metadata from excavation reports in the Archaeology Data Service library of grey literature (OASIS). The work is part of the STAR project, in collaboration with English Heritage. An extension of the CIDOC CRM ontology for the archaeological domain acts as a core ontology. Rich metadata is automatically extracted from grey literature, directed by the CRM, via a three-phase process of semantic enrichment employing the GATE toolkit augmented with bespoke rules and knowledge resources. The paper demonstrates the potential of combining knowledge-based resources (ontologies and thesauri) in information extraction, and presents techniques for delivering the automatically extracted metadata both as XML annotations coupled with the grey literature reports and as RDF graphs decoupled from content. Examples from two consuming applications are discussed: the Andronikos web portal, which serves the annotated XML files for visual inspection, and the STAR project research demonstrator, which offers unified search across archaeological excavation data and grey literature via the core ontology CRM-EH.
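
As a rough illustration of the two delivery styles the abstract mentions, the following minimal Python sketch matches terms from a tiny thesaurus and emits the result once as inline XML (coupled with the text) and once as standalone triples (decoupled from it). It is not the paper's pipeline and uses neither GATE nor CRM-EH: the thesaurus entries, the `Context` and `Find` labels, the `ex:` predicates, and the example.org URIs are all assumptions made up for this example.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical mini-thesaurus: term -> (illustrative class label, made-up concept URI).
THESAURUS = {
    "ditch": ("Context", "http://example.org/concept/ditch"),
    "pottery": ("Find", "http://example.org/concept/pottery"),
}

def find_mentions(text):
    """Return (start, end, term) for each whole-word thesaurus match, in text order."""
    alternatives = "|".join(re.escape(term) for term in THESAURUS)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return [(m.start(), m.end(), m.group(0).lower()) for m in pattern.finditer(text)]

def to_inline_xml(text, mentions):
    """Coupled delivery: wrap each mention in an element inside the report text."""
    out, pos = [], 0
    for start, end, term in mentions:
        label, uri = THESAURUS[term]
        out.append(escape(text[pos:start]))
        out.append('<{0} concept="{1}">{2}</{0}>'.format(label, uri, escape(text[start:end])))
        pos = end
    out.append(escape(text[pos:]))
    return "".join(out)

def to_triples(doc_uri, mentions):
    """Decoupled delivery: (subject, predicate, object) statements kept apart from the text."""
    triples = []
    for i, (_start, _end, term) in enumerate(mentions):
        label, uri = THESAURUS[term]
        node = "{}#ann{}".format(doc_uri, i)  # hypothetical annotation identifier
        triples.append((node, "rdf:type", "ex:" + label))
        triples.append((node, "ex:hasConcept", uri))
        triples.append((node, "ex:foundIn", doc_uri))
    return triples

report = "The ditch contained Roman pottery."
mentions = find_mentions(report)
annotated = to_inline_xml(report, mentions)
triples = to_triples("http://example.org/report/1", mentions)
```

Here `annotated` carries the markup together with the report text, whereas `triples` can be stored and queried without the text, which is the distinction the abstract draws between coupled XML annotations and decoupled RDF graphs.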

Type: Article
Title: Automatic metadata generation in an archaeological digital library: Semantic annotation of grey literature
DOI: 10.1007/978-3-642-34399-5_10
Publisher version: https://doi.org/10.1007/978-3-642-34399-5_10
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Automatic Metadata Generation, CIDOC CRM, Digital Archaeology, Digital Library, GATE, Knowledge Organization Systems, Information Extraction, Semantic Annotation, Semantic Search, SKOS
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL SLASH
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > Dept of Information Studies
URI: https://discovery.ucl.ac.uk/id/eprint/1556220