Spatial Data Mining Analytical Environment for Large Scale Geospatial Data

Published Nov. 17, 2015

In this project, we propose a framework for processing and analyzing large-scale geospatial and environmental data on a “Big Data” infrastructure. Existing Big Data solutions do not include mechanisms for analyzing large-scale geospatial data. In this work, we extend HBase and HDFS to support geospatial data, and we demonstrate the framework's analytical use with some common geospatial data types and the data mining techniques provided by the R language. The resulting Hadoop-based framework offers a robust capability for sharing large-scale geospatial data, applying spatial data mining to it, and making the outputs available to end users.

Fig. 1: R and HBase.
Fig. 2: Shark Alert: Test Objects are Multiple Geospatial Data Files.
Fig. 3: Probability of Shark Appearance Calculated by Water Temperature and Salinity.
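The summary does not state which model produces the probability shown in Fig. 3. As a rough illustration only, one common way to turn two environmental covariates into a probability is a logistic model. The sketch below assumes that form; the coefficients, units, and function names are hypothetical and are not taken from the project, which performs its data mining in R rather than Python.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
# They are not fitted to any data and do not come from the project.
INTERCEPT = -12.0
COEF_TEMP = 0.4      # assumed effect per degree Celsius
COEF_SALINITY = 0.1  # assumed effect per practical salinity unit


def shark_probability(temp_c: float, salinity_psu: float) -> float:
    """Logistic model: P = 1 / (1 + exp(-(b0 + b1*T + b2*S)))."""
    z = INTERCEPT + COEF_TEMP * temp_c + COEF_SALINITY * salinity_psu
    return 1.0 / (1.0 + math.exp(-z))


def probability_grid(temps, salinities):
    """Evaluate the model for every pair; one row per temperature value."""
    return [[shark_probability(t, s) for s in salinities] for t in temps]


if __name__ == "__main__":
    # Print a small table of probabilities over an assumed range of conditions.
    temps = [15.0, 20.0, 25.0]
    salinities = [30.0, 35.0]
    for t, row in zip(temps, probability_grid(temps, salinities)):
        print(t, [round(p, 3) for p in row])
```

In a setting like the one described, the coefficients would be estimated from observed sightings, and the function would be evaluated over each grid cell of the temperature and salinity layers to produce a map.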