1. Automatic Scaling Hadoop in the Cloud for Efficient Process of Big Geospatial Data
Mostafa Abbass
American University of Culture and
Education (AUCE)
Nabatieh Campus
Cloud Computing & Security
CSI549
Dr. Abbass ZEIN EDDINE
2. Outline
1. Challenges
2. Hadoop for Geospatial Data Processing
3. Auto-Scaling Framework
4. CoveringHDFS
5. Auto-Scaling Algorithm
6. Experimental Result and Discussion
7. Conclusion
8. Critique
3. 1. Challenges
While traditional computing infrastructure does not scale well with the rapidly increasing data volume, Hadoop has attracted increasing attention in the geoscience communities for handling big geospatial data.
The massive data volume and the intrinsic complexity and high dimensionality of geospatial datasets pose challenges to the efficient processing of big geospatial data, which is crucial for tackling global and regional challenges. To accelerate geospatial data processing, distributed computing infrastructures are widely used.
4. 2. Hadoop for Geospatial Data Processing
Hadoop is an open-source implementation of the MapReduce framework.
Hadoop has been adapted to create geospatial gazetteers from large volumes of volunteered geospatial data.
Hadoop has been leveraged to store and process bulky remote sensing images while supporting large numbers of concurrent user requests.
Hadoop MapReduce has been utilized to enable parallelization of big climate data processing.
HadoopGIS offers a scalable, high-performance spatial query system over MapReduce to accelerate geospatial data analysis.
5. 3. Auto-Scaling Framework
The goal of the framework is to dynamically adjust computing resources based on the processing workload, handling spikes in demand for computing power while minimizing resource consumption.
6. 4. CoveringHDFS
CoveringHDFS is a mechanism for scaling down the cluster safely and in a timely manner without losing data, implemented without modifying the underlying Hadoop software.
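The slide gives no implementation details, so the following is a hypothetical Python sketch of one way the "covering" idea could be checked: a fixed set of core-slaves is assumed to hold at least one replica of every block, and a slave is treated as removable only when all of its blocks are also covered by the core set. The function name, the block_replicas mapping, and the core/compute split are assumptions for illustration, not the paper's code.

```python
# Hypothetical illustration of the covering idea behind CoveringHDFS (not the paper's code):
# core-slaves together hold at least one replica of every HDFS block, so any other slave
# whose blocks are all covered by the core set can be decommissioned without data loss.

def removable_slaves(block_replicas: dict, core_slaves: set) -> set:
    """block_replicas maps a block id to the set of slaves holding a replica of it."""
    all_slaves = set().union(*block_replicas.values()) if block_replicas else set()
    removable = set()
    for slave in all_slaves - core_slaves:
        blocks_on_slave = [holders for holders in block_replicas.values() if slave in holders]
        # Safe to remove only if every block on this slave also has a replica on a core-slave.
        if all(core_slaves & holders for holders in blocks_on_slave):
            removable.add(slave)
    return removable

# Example: both blocks keep a replica on a core-slave (s1 or s2), so s3 can be removed safely.
blocks = {"b1": {"s1", "s3"}, "b2": {"s2", "s3"}}
print(removable_slaves(blocks, core_slaves={"s1", "s2"}))  # {'s3'}
```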
7. 5. Auto-Scaling Algorithm
a. Scaling up
N = (N_pending − N_finished) / n
b. Scaling down
Removing compute-slaves is straightforward: when the idle time of a compute-slave exceeds a user-specified threshold, that slave is terminated.
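As a comprehension aid, here is a minimal Python sketch of the two rules above. The parameter names (pending and finished task counts, tasks per node, idle threshold) and the rounding-up of N are assumptions for illustration; the paper's actual implementation may differ.

```python
import math
import time

def nodes_to_add(pending_tasks, finished_tasks, tasks_per_node):
    """Scaling up: N = (N_pending - N_finished) / n, rounded up and never negative (assumption)."""
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    return max(0, math.ceil((pending_tasks - finished_tasks) / tasks_per_node))

def slaves_to_remove(idle_since, idle_threshold_s, now=None):
    """Scaling down: terminate any compute-slave idle longer than the user-specified threshold."""
    now = time.time() if now is None else now
    return [slave for slave, since in idle_since.items() if now - since > idle_threshold_s]

# Example with made-up numbers: 50 pending tasks, 10 finished, 4 task slots per node -> add 10 nodes.
print(nodes_to_add(50, 10, 4))  # 10

# slave-1 has been idle for 20 minutes, slave-2 for about 3 minutes; with a 10-minute
# threshold only slave-1 is terminated.
print(slaves_to_remove({"slave-1": 0.0, "slave-2": 1000.0}, 600.0, now=1200.0))  # ['slave-1']
```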
8. 6. Auto-Scaling Prototype and Experimental Results
a. Prototype Implementation
The proposed auto-scaling framework is able to work with cloud platforms that allow users to provision VMs through an API (IaaS).
Testbed: six physical machines, each with an 8-core CPU running at 2.35 GHz and 16 GB of RAM, connected by 1 Gigabit Ethernet (Gbps).
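Because the slide only states that the framework targets IaaS clouds exposing a VM-provisioning API, the sketch below uses a made-up CloudClient interface to show where such API calls would sit in the scaling actions; provision_vm, terminate_vm, and the image/flavor names are hypothetical and do not correspond to any real SDK.

```python
# Hypothetical IaaS client: the class and method names are illustrative only.

class CloudClient:
    """Thin stand-in for an IaaS API (e.g., a provider SDK or REST client)."""

    def provision_vm(self, image_id, flavor):
        """Request a new VM from the cloud and return its identifier (stub)."""
        raise NotImplementedError("replace with a real IaaS API call")

    def terminate_vm(self, vm_id):
        """Release a VM back to the cloud (stub)."""
        raise NotImplementedError("replace with a real IaaS API call")

def scale_up(client, count, image_id="hadoop-compute-slave", flavor="small"):
    """Provision `count` compute-slaves; image_id and flavor are assumed names."""
    return [client.provision_vm(image_id, flavor) for _ in range(count)]

def scale_down(client, idle_vm_ids):
    """Terminate the compute-slaves flagged as idle by the scaling-down check."""
    for vm_id in idle_vm_ids:
        client.terminate_vm(vm_id)
```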
9. b. Experimental Design
Cluster Type | Master | Slaves | HDFS
Auto-scaling cluster | One medium instance | Dynamic: starts with three core-slaves on medium instances; can scale up to 12 compute-slaves on small instances | CoveringHDFS, starting with 3 core-slaves
Seven-slave cluster | One medium instance | Static: 7 slaves (three medium instances and four small instances) | Traditional HDFS with 7 slaves
Fourteen-slave cluster | One medium instance | Static: 14 slaves (three medium instances and 11 small instances) | Traditional HDFS with 14 slaves
Hadoop Cluster Setup
11. 7. Conclusion
Such a cloud-enabled, auto-scalable
computing cluster offers a powerful
tool to process big geoscience data
with optimized performance and
reduced resource consumption.
While DEM interpolation is used as an
example, the proposed framework can
be extended to handle other
geoprocessing applications that run on
Hadoop, such as the climate data
analytical services powered by
Hadoop and cloud computing.
12. 8. Critique
The importance of this research lies in helping to address global and regional challenges, such as climate change and natural disasters, through the effective processing of big geospatial data. The results show that the auto-scaling framework can significantly reduce the use of computing resources by 80% while ensuring that the processing is completed within a reasonable time.
I recommend reading this paper because it is highly relevant to the field of cloud computing and reports good results.
THANK YOU