eprintid: 1490319
rev_number: 30
eprint_status: archive
userid: 608
dir: disk0/01/49/03/19
datestamp: 2016-05-07 21:13:13
lastmod: 2021-09-19 23:51:05
status_changed: 2017-07-31 10:41:40
type: article
metadata_visibility: show
creators_name: Vermeesch, P
creators_name: Garzanti, E
title: Making geological sense of 'Big Data' in sedimentary provenance
ispublished: pub
divisions: UCL
divisions: B04
divisions: C06
divisions: F57
keywords: Science & Technology, Physical Sciences, Geochemistry & Geophysics, Provenance, Statistics, Sediments, U-Pb, Zircon, Heavy Minerals, Detrital Age Distributions, Chinese Loess Plateau, Namib Sand Sea, Transport
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: Sedimentary provenance studies increasingly apply multiple chemical, mineralogical and isotopic proxies to many samples. The resulting datasets are often so large (containing thousands of numerical values) and complex (comprising multiple dimensions) that it is warranted to use the Internet-era term ‘Big Data’ to describe them. This paper introduces Multidimensional Scaling (MDS), Generalised Procrustes Analysis (GPA) and Individual Differences Scaling (INDSCAL, a type of ‘3-way MDS’ algorithm) as simple yet powerful tools to extract geological insights from ‘Big Data’ in a provenance context. Using a dataset from the Namib Sand Sea as a test case, we show how MDS can be used to visualise the similarities and differences between 16 fluvial and aeolian sand samples for five different provenance proxies, resulting in five different ‘configurations’. These configurations can be fed into a GPA algorithm, which translates, rotates, scales and reflects them to extract a ‘consensus view’ for all the data considered together. Alternatively, the five proxies can be jointly analysed by INDSCAL, which fits the data with not one but two sets of coordinates: the ‘group configuration’, which strongly resembles the graphical output produced by GPA, and the ‘source weights’, which can be used to attach geological meaning to the group configuration. For the Namib study, the three methods paint a detailed and self-consistent picture of a sediment routing system in which sand composition is determined by the combination of provenance and hydraulic sorting effects.
date: 2015-08-20
date_type: published
publisher: ELSEVIER SCIENCE BV
official_url: https://doi.org/10.1016/j.chemgeo.2015.05.004
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
article_type_text: Article
verified: verified_manual
elements_id: 1040065
doi: 10.1016/j.chemgeo.2015.05.004
lyricists_name: Vermeesch, Pieter
lyricists_id: PVERM09
full_text_status: public
publication: Chemical Geology
volume: 409
pagerange: 20-27
pages: 8
issn: 0009-2541
citation: Vermeesch, P; Garzanti, E; (2015) Making geological sense of 'Big Data' in sedimentary provenance. Chemical Geology, 409 pp. 20-27. 10.1016/j.chemgeo.2015.05.004 <https://doi.org/10.1016/j.chemgeo.2015.05.004>. Green open access
 
document_url: https://discovery.ucl.ac.uk/id/eprint/1490319/1/Vermeesch_making_geological_sense_of_big_data.pdf