Spatial Data Mining Analytical Environment for Large Scale Geospatial Data

11/17/2015 2:37:31 PM

In this project, we propose a framework for processing and analyzing large-scale geospatial and environmental data on a “Big Data” infrastructure. Existing Big Data solutions do not include a mechanism for analyzing large-scale geospatial data. In this work, we extend HBase and HDFS to support geospatial data and demonstrate the analytical use of the extended system with some common geospatial data types and the data mining capabilities of the R language. The resulting Hadoop-based framework provides a robust way to share large-scale geospatial data, apply spatial data mining to it, and make the outputs available to end users.
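The text does not say how geospatial records are laid out in HBase. As an illustration only, one widely used approach for key-value stores is to prefix each row key with a geohash, so that nearby points sort close together and a region query becomes a prefix scan. The sketch below, written in Python for brevity, assumes that approach; the function names, the `|` separator, and the key layout are hypothetical and not taken from the project.

```python
# Illustrative sketch: geohash-prefixed row keys for point data.
# Assumption: this is NOT confirmed to be the project's scheme.

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def make_row_key(lat, lon, timestamp, record_id, precision=8):
    """Build a hypothetical row key: geohash, then timestamp, then record id."""
    return "|".join([geohash_encode(lat, lon, precision), timestamp, record_id])


if __name__ == "__main__":
    # Hypothetical observation; the id and timestamp are made up.
    print(make_row_key(57.64911, 10.40744, "20151117T143731", "obs-0001"))
```

With such keys, all observations inside one geohash cell share a key prefix, which is what makes range scans over a spatial region practical in a sorted store.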

Figure 1. R and HBase

Figure 2. Shark Alert: Test Objects are Multiple Geospatial Data Files

Figure 3. Probability of Shark Appearance Calculated by Water Temperature and Salinity.
