
Sideloading - Ingestion of large point clouds into the Apache Spark big data engine

Boehm, J; Liu, K; Alis, C; (2016) Sideloading - Ingestion of large point clouds into the Apache Spark big data engine. In: Proceedings of the XXIII ISPRS Congress. (pp. 343-348). International Society of Photogrammetry and Remote Sensing (ISPRS): Prague, Czech Republic. Green open access

Boehm_isprs-archives-XLI-B2-343-2016.pdf - Published Version

Abstract

In the geospatial domain, the data volumes we handle have clearly grown beyond the capacity of most desktop computers. This is particularly true of point cloud processing. It is therefore attractive to explore established big data frameworks for big geospatial data. The first hurdle is importing geospatial data into these frameworks, commonly referred to as data ingestion. Geospatial data is typically encoded in specialised binary file formats that existing big data frameworks do not support natively. Instead, such file formats are supported by software libraries that are restricted to single-CPU execution. We present an approach that allows existing point cloud file format libraries to be used on the Apache Spark big data framework, and we demonstrate the ingestion of large volumes of point cloud data into a compute cluster. The approach uses a map function to distribute the data ingestion across the nodes of a cluster. We test the capability of the proposed method to load billions of points into a commodity hardware compute cluster and discuss the implications for scalability and performance. Performance is benchmarked against an existing native Apache Spark data import implementation.
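The map-based ingestion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record layout (three little-endian float64 values per point), the file names, and the helper functions `write_points`, `load_points`, `ingest`, and `demo` are all hypothetical, and a local thread pool stands in for the cluster. The point it shows is that only short file paths are distributed, while each worker opens its own file with an ordinary single-threaded reader.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor
from itertools import chain

# Hypothetical record layout: x, y, z as little-endian float64.
# A real point cloud format would be read through its own library instead.
RECORD = struct.Struct("<3d")


def write_points(path, points):
    """Write points in the toy binary layout (stand-in for a real file)."""
    with open(path, "wb") as f:
        for point in points:
            f.write(RECORD.pack(*point))


def load_points(path):
    """Runs on a worker: read one file sequentially and return its points."""
    with open(path, "rb") as f:
        data = f.read()
    return [RECORD.unpack_from(data, offset)
            for offset in range(0, len(data), RECORD.size)]


def ingest(paths, workers=4):
    """Map the loader over the file paths and flatten the per-file results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(chain.from_iterable(pool.map(load_points, paths)))


def demo():
    """Create three small files in a temporary directory and ingest them."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"tile_{i}.bin")
            write_points(path, [(float(i), float(j), 0.5) for j in range(4)])
            paths.append(path)
        return ingest(paths)
```

On a Spark cluster, the same loader could plausibly be applied with `sc.parallelize(paths).flatMap(load_points)`, assuming every worker node can reach the files (for example through a shared file system) and has the reader library installed.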

Type: Proceedings paper
Title: Sideloading - Ingestion of large point clouds into the Apache Spark big data engine
Event: XXIII ISPRS Congress
Location: Prague, Czech Republic
Dates: 12 July 2016 - 19 July 2016
Open access status: An open access version is available from UCL Discovery
DOI: 10.5194/isprsarchives-XLI-B2-343-2016
Publisher version: http://dx.doi.org/10.5194/isprsarchives-XLI-B2-343...
Language: English
Additional information: All site content, except where otherwise noted, is licensed under the Creative Commons Attribution 3.0 License.
Keywords: Big Data, LiDAR, Cloud Computing, Point Cloud, Spark
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Civil, Environ and Geomatic Eng
URI: https://discovery.ucl.ac.uk/id/eprint/1514274