SlideShare ist ein Scribd-Unternehmen logo
1 von 5
Downloaden Sie, um offline zu lesen
International Journal of Innovative Research in Advanced Engineering (IJIRAE)
Volume 1 Issue 2 (April 2014)
___________________________________________________________________________________________________
ISSN: 2278-2311 IJIRAE | http://ijirae.com
© 2014, IJIRAE All Rights Reserved Page -78
Survey on Load Rebalancing for Distributed File System in
Cloud
Prof. Pranalini S. Ketkar Ankita Bhimrao Patkure
IT Department, DCOER, PG Scholar, Computer Department DCOER,
Pune University Pune university
pranalini.ketkar@gmail.com ankitapatkure@rediffmail.com
Abstract— Distributed file system is used as a key building block of cloud computing. In distributed file system, a
large file is divided into number of chunks and allocates each chunk to separate node to perform MapReduce function
parallel over each node. In cloud, if number of storage nodes, number of files and assesses to that file increases then
the central node (master in MapReduce) becomes bottleneck. The load rebalancing task is used to eliminate the load
on central node. Using load rebalancing algorithm the load of nodes is balanced as well as the movement cost is
reduced. In this survey paper the problem of load imbalancing is overcome.
Keywords— DFS, MapReduce, Load balancing, Distributed Hash Table, cloud
I. INTRODUCTION
In Cloud computing, the number of computers that are connected using communication network. The notation
of cloud indicates that internet is mandatory to perform the various cloud operations i.e. to create delete, append and
replace. It is used in IT-companies to share information and resources with the all users. There are various characteristics
of cloud i.e. Scalable, on demand service, User centric, Powerful, Versatile, Platform independent etc. In cloud three
technologies are included the MapReduce programming, Virtualization and distributed file systems for the data storage
purpose.
Distributed file system is classical model of file system that is used in the form of chunks for cloud computing.
Cloud computing application is based on the MapReduce programming used in distributed file system. MapReduce is the
master-slave architecture in Hadoop. Master act like Namenode and Slave act like Datanode. Master takes large problem,
divides it into sub problem and assigns it to worker node i.e. to multiple slaves to solve problem individually. In
distributed file system, a large file is divided into number of chunks and allocates each chunk to separate node to perform
MapReduce function parallel over each node. For example in word count application it identifies the occurrences of each
distinct word in large file. In this application a large file is divided into fixed-size chunks (parts) and assigns each chunk
to different cloud storage node. Then each storage node calculates the occurrences of each distinct word by scanning and
parsing its own chunk. Then give its result to master to calculate the final result. In distributed file system, the load of
each node is directly proportional to number of file chunks that node consists.
As the increase in storage and network, load balancing is the main issue in the large scale distributed systems.
Load should be balance over multiple nodes to improve system performance, resource utilization, response time and
stability. Load balancing is divided into two categories: static and dynamic. In static load balancing algorithm, it does not
consider the previous behaviour of a node while distribute the load. But in case of dynamic load balancing algorithm, it
checks the previous behaviour of node while distribute the load. In cloud, if number of storage nodes, number of files and
assesses to that file increases then the central node (master in MapReduce) becomes bottleneck. The load rebalancing
task is used to eliminate the load on central node. In load balancing algorithm, storage nodes are structured over network
based on the distributed hash table (DHT); each file chunk having rapid key lookup in DHTs, in that unique identifier is
assign to each file chunk [1]. DHTs enable nodes to self-recognize and repair while it constantly offers lookup
functionality in node. Here aim to reduce the movement cost which is caused by load rebalancing of nodes to maximize
the network bandwidth. Each Chunkserver first find whether it is light node or heavy node without global knowledge of
node. The numbers of file chunks are migrated from heavy node to light node to balance their load. This process repeats
until all heavy nodes becomes the light nodes. To overcome this load balancing problem each node perform load
rebalancing algorithm independently without global knowledge about load of all nodes.
II. LITERATURE SURVEY
MapReduce: Simplified Data Processing On Large Clusters [9]
MapReduce is the programming model used in implementation for processing and generating large scale
datasets. It is used at Google for many different purposes. Here map and reduce functions are used. Map function
generate set of intermediate key pairs and reduce function merges all intermediate key values associated with same
intermediate key. The map and reduce function allows to perform parallelize operation easily and re-execute the
mechanism for fault tolerance. At the run-time, system takes care of detail information of partitioning the input data,
schedule the program execution across number of available machines, handling failures and managing inter-
communication between machines.
International Journal of Innovative Research in Advanced Engineering (IJIRAE)
Volume 1 Issue 2 (April 2014)
___________________________________________________________________________________________________
ISSN: 2278-2311 IJIRAE | http://ijirae.com
© 2014, IJIRAE All Rights Reserved Page - 79
In distributed file system nodes simultaneously perform computing and storage operations. The large file in
partitioned into number of chunks and allocate it to distinct nodes to perform MapReduce task parallel over nodes.
Typically, MapReduce task processes on many terabytes of data on thousands of machines. This model is easy to use; it
hides the details of parallelization, optimization, fault-tolerance and load balancing. MapReduce is used for Google’s
production Web search service, machine learning, data mining, etc. Using this programming model, redundant execution
used to reduce the impact of slow machines, handle machine failure as well as data loss.
Load Balancing Algorithm for DHT based structured Peer to Peer System [10]
Peer to peer system have an emerging application in distributed environment. As compared to client-server
architecture, peer to peer system improved resource utilization by making use of unused resources over network. Peer to
peer system uses Distributed Hash Table (DHTs) as an allocation mechanism. It perform join, leave and update
operations. Here load balancing algorithm uses the concept of virtual server to temporary storage of data. Using the
heterogeneous indexing, peers balanced their loads proportional to their capacities. In this, decentralized load balance
algorithm construct network to manipulate global information and organized in tree shape fashion. Each peer can
independently compute probability distribution capacities of participating peers and reallocate their load in parallel.
Optimized Cloud Partitioning Technique to Simplify Load Balancing [5]
Cloud computing has some issue regarding resource management and load balancing. In this paper, Cloud
environment is divided into number of parts by making use of cloud cluster technique which helps for the process of load
balancing. The cloud consists of number of nodes and it partitioned into n cluster based on cloud cluster technique. In
this, it consists of main controller which maintains all information regarding all load balancer in cluster and index table.
Initial step is to choose the correct cluster and it follows algorithm as
1) In cloud environment, the nodes connected to central controller are initialized as 0 in index table.
2) When controller receives new request, it queries the load balancer of each cluster for job allocation.
3) Then controller pass index table to find next available node having less weight. If found then continue the
processing otherwise index table reinitialized to 0 and in an increment manner then again controller passes table
to find next available node.
4) After completing the process the load balancer update the status in allocation table.
Cloud partitioning method consist 2 steps:-
1) Visit each node randomly to match it with neighbor node. If it having same characteristics and shares similar
data with minimal cost then two nodes are combined into new node with share same details. Repeat until there is
no neighbor node having similar characteristics. Subsequently update the cost between neighbor two nodes and
current neighbor node.
2) After joining two nodes into new node having similar characteristics visited node send the information to new
node instead of sending it twice.
It gives the high performance, stability, minimum response time and optimal resource utilization.
Histogram-Based Global Load Balancing in Structured Peer to Peer System [11]
Peer to peer system having solution for sharing and locating resources over internet. In this paper there are two
key components. First is histogram manager that maintain histogram that reflect global view of distribution of load.
Histogram stores statistical information about average load of no overlapping groups of nodes. It is used to check
whether node is normally loaded, Light or heavily loaded. Second component is load balance manager that take the
action of load redistribution if node becomes light or heavy. Load balancing manager balance the load statically when
new node joins and dynamically when existing node become light or heavily loaded. The cost of constructing histogram
and maintaining it may be expensive in dynamic system. To reduce the maintaining cost two techniques are used.
Constructing and maintaining histogram is expensive if node join and leave system frequently. Every new node in peer to
peer system find its neighbour node and these neighbour nodes need to share its information with new node to setup
connection. Now the cost of histogram is totally based on histogram update message caused by changing the load of
nodes in the system .To reduce the cost approximate value of histogram is taken.
III. ALGORITHMS FOR LOAD REBALANCE [1]
It contains three different modules.
1) File Allocation
International Journal of Innovative Research in Advanced Engineering (IJIRAE)
Volume 1 Issue 2 (April 2014)
___________________________________________________________________________________________________
ISSN: 2278-2311 IJIRAE | http://ijirae.com
© 2014, IJIRAE All Rights Reserved Page - 80
2) DHT Devision
3) Load Rebalancing
4) Advantage of Node heterogeneity
File Allocation
In this, large file is partitioned into number of chunks (C1, C2, C3…..Cn) and it allocates to sub servers
(Chunkserver). Here the files can be added, deleted or appended dynamically to sub server. It will help to avoid the data
loss. Fig. [1] Shows that, given large file is dived into number of parts and that parts are distributed over different
Chunkserver.
Chunk server Chunk server
C1 C2 C3
Fig 1: File allocation
DHT Devising
The storage nodes are structured over network based on the distributed hash table (DHT); each file chunk
having rapid key lookup in DHTs, in that unique identifier is assign to each file chunk. DHT guarantees that if any node
leaves then allocated chunks are migrated to its successor; if node joins then it allocate the chunks which are stored in
successor. DHT network is transparent. It specifies the location of chunks that can be integrated with existing distributed
file system where master node manages namespace of file system and mapping of chunks to storage nodes. Fig. 2 shows
the structure of Distributed Hash Table which is visible to all nodes.
put (key, data) get(key) data
........
Fig 2: Distributed Hash Table
Load Rebalancing
First calculate whether loads are light or heavy in each sub server.
A node is Light if, Number of chunk < (1-∆L) A
And a node is heavy if, Number of chunk > (1-∆U) A
Where ∆L and ∆U are the system parameters. All heavy nodes shared its load with light node. If node ‘i’ is the lightly
loaded then it contact to its successor ‘i+1’and migrate its load to its successor and then join instantly to heavy node.
Name node
Default File Default File Default File
Distributed application
Distributed Hash Table
Chunk server Chunk server Chunk server
International Journal of Innovative Research in Advanced Engineering (IJIRAE)
Volume 1 Issue 2 (April 2014)
___________________________________________________________________________________________________
ISSN: 2278-2311 IJIRAE | http://ijirae.com
© 2014, IJIRAE All Rights Reserved Page - 81
Here node equalization technique is used to reduce latency, overload and resource utilization. The heavily loaded node
transfer requested chunks to lightly loaded node and it traverse through physical network link.
Light Node Heavy Node
Chunkserver1 Chunkserver2 Chunkserver3
Fig 3: Heavy node and light node
Consider example, the capacity of each Chunkserver is 256MB i.e. 3 chunks (each chunk is of 64 MB) then
according to above Fig. 3 and load rebalancing algorithm chunkserver1 is light node having load 192MB (no of chunk <
(1-∆L) A) and Chunkserver 3 is the heavy node having load 320MB (no of chunk > (1-∆U)A). Using the load
equalization technique, heavy node transfers its load to light node i.e. Chunkserver 3 transfer its some load to
Chunkserver 1.
Fig4 (a): Initial load of chunks Fig4 (b): Rebalancing the load
Consider example, in Fig. 4(a) capacity of each node is 5 chunks (A) but here N1 is the light node having 2
chunks and N5 is heavy node consist 10 chunks. Then N1 identifies that N5 is the heavy node then it contact to its
successor i.e. to N2 and transfer its own load to N2. And request to heavy node for some load. Min (Lj-A),A) that much
load is transfer from heavy node to light node. According to formulae 5 chunks are transfer from N5 to N1 (Fig 4(b)).
Advantage of Node Heterogeneity
Nodes on which file is distributed are heterogeneous in nature. According to nodes capacity there is one
bottleneck resource. Consider, capacity of nodes (Cp1 , Cp2, Cp3,...., Cpn). Each node consist approximate number of
file chunks. The load on that node needs to be balanced as follows:
Ai = γ Cpi
where γ is the load per unit capacity of node. And
γ = m/Σ k=1 Cpk
where m is the number of file chunks stored on system.
Central node
K1
K1
K1K2
K3
K2 K3
K4
K4
K2 K4
K3
International Journal of Innovative Research in Advanced Engineering (IJIRAE)
Volume 1 Issue 2 (April 2014)
___________________________________________________________________________________________________
ISSN: 2278-2311 IJIRAE | http://ijirae.com
© 2014, IJIRAE All Rights Reserved Page - 82
IV. CONCLUSIONS
Cloud computing application is based on the MapReduce programming used in distributed file system. Load
imbalancing Problem mostly occur in dynamic, large-scale and distributed file system. Load should be balance over
multiple nodes to improve system performance, resource utilization, response time and stability.
In the load rebalancing algorithm the load of node is balanced as well as the movement cost also reduced. The load
balancing task performs independently without global knowledge of system. This load balancing algorithm has fast
concurrency rate. Load balancing reduce the load and movement cost while it uses physical network locality and node
heterogeneity.
REFERENCES
1. Hung-Chang Hsiao, Member, IEEE Computer Society, Hsueh-Yi Chung, HaiyingShen, Member, IEEE, and
Yu-Chang Chao, proposed a “Load Rebalancing for Distributed File Systems in Clouds” IEEE transaction on
parallel and distributed systems, vol. 24, no. 5, May 2013
2. J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Proc. Sixth Symp.
Operating System Designed Implementation (OSDI ’04), pp. 137-150, Dec. 2004.
3. S. Mohammad, S. Breß, and E. Schallehn, “Cloud Data Management: A Short Overview and Comparison of
Current Approaches,” Proc. 24th GI-Workshop Foundations of Databases (Grundlagen von Datenbanken), 2012.
4. Apache Hadoop, http://hadoop.apache.org/, 2013
5. P.Jamuna and R.Anand Kumar “Optimized Cloud Computing Technique To Simplify Load
Balancing”International Journal of Advanced Research in Computer Science and Software Engineering,
Volume 3, Issue 11,November 2013.
6. Kokilavani .K, Department Of Pervasive Computing Technology, Kings College Of Engineering, Punalkulam,
Tamil nadu “Enhance load balancing algorithm for distributed file system in cloud” International Journal of
Engineering and Innovative Technology (IJEIT) Volume 3, Issue 6, December 2013
7. Hadoop Distributed File System, http://hadoop.apache.org/hdfs/, 2012.
8. Hadoop Distributed File System “Rebalancing Blocks,” http://
developer.yahoo.com/hadoop/tutorial/module2.html#rebalancing,2012
9. J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Proc. Sixth Symp.
Operating System Design and Implementation (OSDI ’04), pp. 137-150, Dec. 2004.
10. ChahitaTanak, Rajesh Bharati “Load Balancing Algorithm for DHT Based Structured Peer to Peer
System”International Journal of Emerging Technology and Advanced Engineering (ISSN 2250-2459, ISO
9001:2008 Certified Journal, Volume 3, Issue 1, January 2013)
11. QuangHieu Vu, Member, IEEE, Beng Chin Ooi, Martin Rinard, and Kian-Lee Tan “Histogram-Based Global
Load Balancing in Structured Peer-to-Peer Systems” IEEE transaction on knowledge and data engineering,vol.
21, no. 4, April2009.

Weitere ähnliche Inhalte

Was ist angesagt?

The Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent NetworkThe Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent Network
IJAEMSJORNAL
 
user_defined_functions_forinterpolation
user_defined_functions_forinterpolationuser_defined_functions_forinterpolation
user_defined_functions_forinterpolation
sushanth tiruvaipati
 
A method for balancing heterogeneous request load in dht based p2 p
A method for balancing heterogeneous request load in dht based p2 pA method for balancing heterogeneous request load in dht based p2 p
A method for balancing heterogeneous request load in dht based p2 p
IAEME Publication
 
Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013
ijcsbi
 

Was ist angesagt? (19)

The Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent NetworkThe Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent Network
 
A load balancing algorithm based on
A load balancing algorithm based onA load balancing algorithm based on
A load balancing algorithm based on
 
user_defined_functions_forinterpolation
user_defined_functions_forinterpolationuser_defined_functions_forinterpolation
user_defined_functions_forinterpolation
 
Geo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduceGeo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduce
 
A study on dynamic load balancing in grid environment
A study on dynamic load balancing in grid environmentA study on dynamic load balancing in grid environment
A study on dynamic load balancing in grid environment
 
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
MapReduce: Ordering and  Large-Scale Indexing on Large ClustersMapReduce: Ordering and  Large-Scale Indexing on Large Clusters
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
 
10. resource management
10. resource management10. resource management
10. resource management
 
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
 
Resource management
Resource managementResource management
Resource management
 
Optimization of workload prediction based on map reduce frame work in a cloud...
Optimization of workload prediction based on map reduce frame work in a cloud...Optimization of workload prediction based on map reduce frame work in a cloud...
Optimization of workload prediction based on map reduce frame work in a cloud...
 
A method for balancing heterogeneous request load in dht based p2 p
A method for balancing heterogeneous request load in dht based p2 pA method for balancing heterogeneous request load in dht based p2 p
A method for balancing heterogeneous request load in dht based p2 p
 
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computa...
 
Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013
 
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
 
J0210053057
J0210053057J0210053057
J0210053057
 
Scheduling Algorithm Based Simulator for Resource Allocation Task in Cloud Co...
Scheduling Algorithm Based Simulator for Resource Allocation Task in Cloud Co...Scheduling Algorithm Based Simulator for Resource Allocation Task in Cloud Co...
Scheduling Algorithm Based Simulator for Resource Allocation Task in Cloud Co...
 
Time Efficient VM Allocation using KD-Tree Approach in Cloud Server Environment
Time Efficient VM Allocation using KD-Tree Approach in Cloud Server EnvironmentTime Efficient VM Allocation using KD-Tree Approach in Cloud Server Environment
Time Efficient VM Allocation using KD-Tree Approach in Cloud Server Environment
 
Enhancing Big Data Analysis by using Map-reduce Technique
Enhancing Big Data Analysis by using Map-reduce TechniqueEnhancing Big Data Analysis by using Map-reduce Technique
Enhancing Big Data Analysis by using Map-reduce Technique
 
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENTDYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
 

Andere mochten auch

Silva_Class 20 Prompt_Analysis Paper Option 2
Silva_Class 20 Prompt_Analysis Paper Option 2Silva_Class 20 Prompt_Analysis Paper Option 2
Silva_Class 20 Prompt_Analysis Paper Option 2
Jeffrey Silva
 
Mural art presentation
Mural art presentationMural art presentation
Mural art presentation
chloeesim
 
Program Evaluation Midterm
Program Evaluation MidtermProgram Evaluation Midterm
Program Evaluation Midterm
Jeffrey Silva
 
Hospital security officer performance appraisal
Hospital security officer performance appraisalHospital security officer performance appraisal
Hospital security officer performance appraisal
BigBang789
 
TVET DWA 2012-3-7_KKedits FINAL to DWA
TVET DWA 2012-3-7_KKedits FINAL to DWATVET DWA 2012-3-7_KKedits FINAL to DWA
TVET DWA 2012-3-7_KKedits FINAL to DWA
Toyin Oshaniwa
 
A COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARES
A COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARESA COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARES
A COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARES
AM Publications
 
dat-TrafficManager-for-Vantage
dat-TrafficManager-for-Vantagedat-TrafficManager-for-Vantage
dat-TrafficManager-for-Vantage
Scott Matics
 

Andere mochten auch (20)

Silva_Class 20 Prompt_Analysis Paper Option 2
Silva_Class 20 Prompt_Analysis Paper Option 2Silva_Class 20 Prompt_Analysis Paper Option 2
Silva_Class 20 Prompt_Analysis Paper Option 2
 
Проект«вбудущее»
Проект«вбудущее»Проект«вбудущее»
Проект«вбудущее»
 
Cv
CvCv
Cv
 
Shaving Myths And Facts
Shaving Myths And FactsShaving Myths And Facts
Shaving Myths And Facts
 
Unidad 3 trabajo Integrales Robinson Duran.
Unidad 3 trabajo Integrales Robinson Duran.Unidad 3 trabajo Integrales Robinson Duran.
Unidad 3 trabajo Integrales Robinson Duran.
 
Privacy Perspectives, Requirements and Design trade-offs of Encounter- based ...
Privacy Perspectives, Requirements and Design trade-offs of Encounter- based ...Privacy Perspectives, Requirements and Design trade-offs of Encounter- based ...
Privacy Perspectives, Requirements and Design trade-offs of Encounter- based ...
 
Data Analysis - SPSS - Survey Analytics
Data Analysis - SPSS - Survey AnalyticsData Analysis - SPSS - Survey Analytics
Data Analysis - SPSS - Survey Analytics
 
Integrated
IntegratedIntegrated
Integrated
 
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
 
Mural art presentation
Mural art presentationMural art presentation
Mural art presentation
 
Program Evaluation Midterm
Program Evaluation MidtermProgram Evaluation Midterm
Program Evaluation Midterm
 
WELCOME TO TFAA
WELCOME TO TFAAWELCOME TO TFAA
WELCOME TO TFAA
 
Thyroid problems in the later years
Thyroid problems in the later yearsThyroid problems in the later years
Thyroid problems in the later years
 
Mga Likas Na Yaman Ng Kanlurang Asya
Mga Likas Na Yaman Ng Kanlurang AsyaMga Likas Na Yaman Ng Kanlurang Asya
Mga Likas Na Yaman Ng Kanlurang Asya
 
Hospital security officer performance appraisal
Hospital security officer performance appraisalHospital security officer performance appraisal
Hospital security officer performance appraisal
 
TVET DWA 2012-3-7_KKedits FINAL to DWA
TVET DWA 2012-3-7_KKedits FINAL to DWATVET DWA 2012-3-7_KKedits FINAL to DWA
TVET DWA 2012-3-7_KKedits FINAL to DWA
 
A COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARES
A COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARESA COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARES
A COMPARATIVE STUDY OF OMRF & SMRF STRUCTURAL SYSTEM USING DIFFERENT SOFTWARES
 
A Study on Data Mining Based Intrusion Detection System
A Study on Data Mining Based Intrusion Detection SystemA Study on Data Mining Based Intrusion Detection System
A Study on Data Mining Based Intrusion Detection System
 
dat-TrafficManager-for-Vantage
dat-TrafficManager-for-Vantagedat-TrafficManager-for-Vantage
dat-TrafficManager-for-Vantage
 
Dt learn
Dt learnDt learn
Dt learn
 

Ähnlich wie Survey on Load Rebalancing for Distributed File System in Cloud

Dynamic Load Calculation in A Distributed System using centralized approach
Dynamic Load Calculation in A Distributed System using centralized approachDynamic Load Calculation in A Distributed System using centralized approach
Dynamic Load Calculation in A Distributed System using centralized approach
IJARIIT
 

Ähnlich wie Survey on Load Rebalancing for Distributed File System in Cloud (20)

Dynamic Load Calculation in A Distributed System using centralized approach
Dynamic Load Calculation in A Distributed System using centralized approachDynamic Load Calculation in A Distributed System using centralized approach
Dynamic Load Calculation in A Distributed System using centralized approach
 
2012an20
2012an202012an20
2012an20
 
Data Analysis In The Cloud
Data Analysis In The CloudData Analysis In The Cloud
Data Analysis In The Cloud
 
Use of genetic algorithm for
Use of genetic algorithm forUse of genetic algorithm for
Use of genetic algorithm for
 
A Novel Switch Mechanism for Load Balancing in Public Cloud
A Novel Switch Mechanism for Load Balancing in Public CloudA Novel Switch Mechanism for Load Balancing in Public Cloud
A Novel Switch Mechanism for Load Balancing in Public Cloud
 
C017311316
C017311316C017311316
C017311316
 
Scalable Distributed Job Processing with Dynamic Load Balancing
Scalable Distributed Job Processing with Dynamic Load BalancingScalable Distributed Job Processing with Dynamic Load Balancing
Scalable Distributed Job Processing with Dynamic Load Balancing
 
A Survey of Job Scheduling Algorithms Whit Hierarchical Structure to Load Ba...
A Survey of Job Scheduling Algorithms Whit  Hierarchical Structure to Load Ba...A Survey of Job Scheduling Algorithms Whit  Hierarchical Structure to Load Ba...
A Survey of Job Scheduling Algorithms Whit Hierarchical Structure to Load Ba...
 
Grid computing for load balancing strategies
Grid computing for load balancing strategiesGrid computing for load balancing strategies
Grid computing for load balancing strategies
 
Load balancing in Distributed Systems
Load balancing in Distributed SystemsLoad balancing in Distributed Systems
Load balancing in Distributed Systems
 
Data Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with CloudData Partitioning in Mongo DB with Cloud
Data Partitioning in Mongo DB with Cloud
 
Simple Load Rebalancing For Distributed Hash Tables In Cloud
Simple Load Rebalancing For Distributed Hash Tables In CloudSimple Load Rebalancing For Distributed Hash Tables In Cloud
Simple Load Rebalancing For Distributed Hash Tables In Cloud
 
Optimization of workload prediction based on map reduce frame work in a cloud...
Optimization of workload prediction based on map reduce frame work in a cloud...Optimization of workload prediction based on map reduce frame work in a cloud...
Optimization of workload prediction based on map reduce frame work in a cloud...
 
Cloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceCloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for Mapreduce
 
CLOAK-REDUCE LOAD BALANCING STRATEGY FOR MAPREDUCE
CLOAK-REDUCE LOAD BALANCING STRATEGY FOR MAPREDUCECLOAK-REDUCE LOAD BALANCING STRATEGY FOR MAPREDUCE
CLOAK-REDUCE LOAD BALANCING STRATEGY FOR MAPREDUCE
 
A novel load balancing model for overloaded cloud
A novel load balancing model for overloaded cloudA novel load balancing model for overloaded cloud
A novel load balancing model for overloaded cloud
 
HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND ...
HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND ...HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND ...
HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND ...
 
Cloud computing – partitioning algorithm
Cloud computing – partitioning algorithmCloud computing – partitioning algorithm
Cloud computing – partitioning algorithm
 
Unit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdfUnit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdf
 
Unit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdfUnit-2 Hadoop Framework.pdf
Unit-2 Hadoop Framework.pdf
 

Mehr von AM Publications

ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
AM Publications
 
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTSIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
AM Publications
 

Mehr von AM Publications (20)

DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
 
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
 
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNTHE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
 
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
 
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
 
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
 
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
 
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
 
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONHMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
 
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
 
INTELLIGENT BLIND STICK
INTELLIGENT BLIND STICKINTELLIGENT BLIND STICK
INTELLIGENT BLIND STICK
 
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
 
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
 
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
 
OPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNOPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNN
 
DETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTDETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECT
 
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTSIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
 
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
 
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
 
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
 

Kürzlich hochgeladen

VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Kürzlich hochgeladen (20)

Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 

Survey on Load Rebalancing for Distributed File System in Cloud

  • 1. International Journal of Innovative Research in Advanced Engineering (IJIRAE) Volume 1 Issue 2 (April 2014) ___________________________________________________________________________________________________ ISSN: 2278-2311 IJIRAE | http://ijirae.com © 2014, IJIRAE All Rights Reserved Page -78 Survey on Load Rebalancing for Distributed File System in Cloud Prof. Pranalini S. Ketkar Ankita Bhimrao Patkure IT Department, DCOER, PG Scholar, Computer Department DCOER, Pune University Pune university pranalini.ketkar@gmail.com ankitapatkure@rediffmail.com Abstract— Distributed file system is used as a key building block of cloud computing. In distributed file system, a large file is divided into number of chunks and allocates each chunk to separate node to perform MapReduce function parallel over each node. In cloud, if number of storage nodes, number of files and assesses to that file increases then the central node (master in MapReduce) becomes bottleneck. The load rebalancing task is used to eliminate the load on central node. Using load rebalancing algorithm the load of nodes is balanced as well as the movement cost is reduced. In this survey paper the problem of load imbalancing is overcome. Keywords— DFS, MapReduce, Load balancing, Distributed Hash Table, cloud I. INTRODUCTION In Cloud computing, the number of computers that are connected using communication network. The notation of cloud indicates that internet is mandatory to perform the various cloud operations i.e. to create delete, append and replace. It is used in IT-companies to share information and resources with the all users. There are various characteristics of cloud i.e. Scalable, on demand service, User centric, Powerful, Versatile, Platform independent etc. In cloud three technologies are included the MapReduce programming, Virtualization and distributed file systems for the data storage purpose. Distributed file system is classical model of file system that is used in the form of chunks for cloud computing. Cloud computing application is based on the MapReduce programming used in distributed file system. MapReduce is the master-slave architecture in Hadoop. Master act like Namenode and Slave act like Datanode. Master takes large problem, divides it into sub problem and assigns it to worker node i.e. to multiple slaves to solve problem individually. In distributed file system, a large file is divided into number of chunks and allocates each chunk to separate node to perform MapReduce function parallel over each node. For example in word count application it identifies the occurrences of each distinct word in large file. In this application a large file is divided into fixed-size chunks (parts) and assigns each chunk to different cloud storage node. Then each storage node calculates the occurrences of each distinct word by scanning and parsing its own chunk. Then give its result to master to calculate the final result. In distributed file system, the load of each node is directly proportional to number of file chunks that node consists. As the increase in storage and network, load balancing is the main issue in the large scale distributed systems. Load should be balance over multiple nodes to improve system performance, resource utilization, response time and stability. Load balancing is divided into two categories: static and dynamic. In static load balancing algorithm, it does not consider the previous behaviour of a node while distribute the load. But in case of dynamic load balancing algorithm, it checks the previous behaviour of node while distribute the load. In cloud, if number of storage nodes, number of files and assesses to that file increases then the central node (master in MapReduce) becomes bottleneck. The load rebalancing task is used to eliminate the load on central node. In load balancing algorithm, storage nodes are structured over network based on the distributed hash table (DHT); each file chunk having rapid key lookup in DHTs, in that unique identifier is assign to each file chunk [1]. DHTs enable nodes to self-recognize and repair while it constantly offers lookup functionality in node. Here aim to reduce the movement cost which is caused by load rebalancing of nodes to maximize the network bandwidth. Each Chunkserver first find whether it is light node or heavy node without global knowledge of node. The numbers of file chunks are migrated from heavy node to light node to balance their load. This process repeats until all heavy nodes becomes the light nodes. To overcome this load balancing problem each node perform load rebalancing algorithm independently without global knowledge about load of all nodes. II. LITERATURE SURVEY MapReduce: Simplified Data Processing On Large Clusters [9] MapReduce is the programming model used in implementation for processing and generating large scale datasets. It is used at Google for many different purposes. Here map and reduce functions are used. Map function generate set of intermediate key pairs and reduce function merges all intermediate key values associated with same intermediate key. The map and reduce function allows to perform parallelize operation easily and re-execute the mechanism for fault tolerance. At the run-time, system takes care of detail information of partitioning the input data, schedule the program execution across number of available machines, handling failures and managing inter- communication between machines.
  • 2. International Journal of Innovative Research in Advanced Engineering (IJIRAE) Volume 1 Issue 2 (April 2014) ___________________________________________________________________________________________________ ISSN: 2278-2311 IJIRAE | http://ijirae.com © 2014, IJIRAE All Rights Reserved Page - 79 In distributed file system nodes simultaneously perform computing and storage operations. The large file in partitioned into number of chunks and allocate it to distinct nodes to perform MapReduce task parallel over nodes. Typically, MapReduce task processes on many terabytes of data on thousands of machines. This model is easy to use; it hides the details of parallelization, optimization, fault-tolerance and load balancing. MapReduce is used for Google’s production Web search service, machine learning, data mining, etc. Using this programming model, redundant execution used to reduce the impact of slow machines, handle machine failure as well as data loss. Load Balancing Algorithm for DHT based structured Peer to Peer System [10] Peer to peer system have an emerging application in distributed environment. As compared to client-server architecture, peer to peer system improved resource utilization by making use of unused resources over network. Peer to peer system uses Distributed Hash Table (DHTs) as an allocation mechanism. It perform join, leave and update operations. Here load balancing algorithm uses the concept of virtual server to temporary storage of data. Using the heterogeneous indexing, peers balanced their loads proportional to their capacities. In this, decentralized load balance algorithm construct network to manipulate global information and organized in tree shape fashion. Each peer can independently compute probability distribution capacities of participating peers and reallocate their load in parallel. Optimized Cloud Partitioning Technique to Simplify Load Balancing [5] Cloud computing has some issue regarding resource management and load balancing. In this paper, Cloud environment is divided into number of parts by making use of cloud cluster technique which helps for the process of load balancing. The cloud consists of number of nodes and it partitioned into n cluster based on cloud cluster technique. In this, it consists of main controller which maintains all information regarding all load balancer in cluster and index table. Initial step is to choose the correct cluster and it follows algorithm as 1) In cloud environment, the nodes connected to central controller are initialized as 0 in index table. 2) When controller receives new request, it queries the load balancer of each cluster for job allocation. 3) Then controller pass index table to find next available node having less weight. If found then continue the processing otherwise index table reinitialized to 0 and in an increment manner then again controller passes table to find next available node. 4) After completing the process the load balancer update the status in allocation table. Cloud partitioning method consist 2 steps:- 1) Visit each node randomly to match it with neighbor node. If it having same characteristics and shares similar data with minimal cost then two nodes are combined into new node with share same details. Repeat until there is no neighbor node having similar characteristics. Subsequently update the cost between neighbor two nodes and current neighbor node. 2) After joining two nodes into new node having similar characteristics visited node send the information to new node instead of sending it twice. It gives the high performance, stability, minimum response time and optimal resource utilization. Histogram-Based Global Load Balancing in Structured Peer to Peer System [11] Peer to peer system having solution for sharing and locating resources over internet. In this paper there are two key components. First is histogram manager that maintain histogram that reflect global view of distribution of load. Histogram stores statistical information about average load of no overlapping groups of nodes. It is used to check whether node is normally loaded, Light or heavily loaded. Second component is load balance manager that take the action of load redistribution if node becomes light or heavy. Load balancing manager balance the load statically when new node joins and dynamically when existing node become light or heavily loaded. The cost of constructing histogram and maintaining it may be expensive in dynamic system. To reduce the maintaining cost two techniques are used. Constructing and maintaining histogram is expensive if node join and leave system frequently. Every new node in peer to peer system find its neighbour node and these neighbour nodes need to share its information with new node to setup connection. Now the cost of histogram is totally based on histogram update message caused by changing the load of nodes in the system .To reduce the cost approximate value of histogram is taken. III. ALGORITHMS FOR LOAD REBALANCE [1] It contains three different modules. 1) File Allocation
  • 3. International Journal of Innovative Research in Advanced Engineering (IJIRAE) Volume 1 Issue 2 (April 2014) ___________________________________________________________________________________________________ ISSN: 2278-2311 IJIRAE | http://ijirae.com © 2014, IJIRAE All Rights Reserved Page - 80 2) DHT Devision 3) Load Rebalancing 4) Advantage of Node heterogeneity File Allocation In this, large file is partitioned into number of chunks (C1, C2, C3…..Cn) and it allocates to sub servers (Chunkserver). Here the files can be added, deleted or appended dynamically to sub server. It will help to avoid the data loss. Fig. [1] Shows that, given large file is dived into number of parts and that parts are distributed over different Chunkserver. Chunk server Chunk server C1 C2 C3 Fig 1: File allocation DHT Devising The storage nodes are structured over network based on the distributed hash table (DHT); each file chunk having rapid key lookup in DHTs, in that unique identifier is assign to each file chunk. DHT guarantees that if any node leaves then allocated chunks are migrated to its successor; if node joins then it allocate the chunks which are stored in successor. DHT network is transparent. It specifies the location of chunks that can be integrated with existing distributed file system where master node manages namespace of file system and mapping of chunks to storage nodes. Fig. 2 shows the structure of Distributed Hash Table which is visible to all nodes. put (key, data) get(key) data ........ Fig 2: Distributed Hash Table Load Rebalancing First calculate whether loads are light or heavy in each sub server. A node is Light if, Number of chunk < (1-∆L) A And a node is heavy if, Number of chunk > (1-∆U) A Where ∆L and ∆U are the system parameters. All heavy nodes shared its load with light node. If node ‘i’ is the lightly loaded then it contact to its successor ‘i+1’and migrate its load to its successor and then join instantly to heavy node. Name node Default File Default File Default File Distributed application Distributed Hash Table Chunk server Chunk server Chunk server
  • 4. International Journal of Innovative Research in Advanced Engineering (IJIRAE) Volume 1 Issue 2 (April 2014) ___________________________________________________________________________________________________ ISSN: 2278-2311 IJIRAE | http://ijirae.com © 2014, IJIRAE All Rights Reserved Page - 81 Here node equalization technique is used to reduce latency, overload and resource utilization. The heavily loaded node transfer requested chunks to lightly loaded node and it traverse through physical network link. Light Node Heavy Node Chunkserver1 Chunkserver2 Chunkserver3 Fig 3: Heavy node and light node Consider example, the capacity of each Chunkserver is 256MB i.e. 3 chunks (each chunk is of 64 MB) then according to above Fig. 3 and load rebalancing algorithm chunkserver1 is light node having load 192MB (no of chunk < (1-∆L) A) and Chunkserver 3 is the heavy node having load 320MB (no of chunk > (1-∆U)A). Using the load equalization technique, heavy node transfers its load to light node i.e. Chunkserver 3 transfer its some load to Chunkserver 1. Fig4 (a): Initial load of chunks Fig4 (b): Rebalancing the load Consider example, in Fig. 4(a) capacity of each node is 5 chunks (A) but here N1 is the light node having 2 chunks and N5 is heavy node consist 10 chunks. Then N1 identifies that N5 is the heavy node then it contact to its successor i.e. to N2 and transfer its own load to N2. And request to heavy node for some load. Min (Lj-A),A) that much load is transfer from heavy node to light node. According to formulae 5 chunks are transfer from N5 to N1 (Fig 4(b)). Advantage of Node Heterogeneity Nodes on which file is distributed are heterogeneous in nature. According to nodes capacity there is one bottleneck resource. Consider, capacity of nodes (Cp1 , Cp2, Cp3,...., Cpn). Each node consist approximate number of file chunks. The load on that node needs to be balanced as follows: Ai = γ Cpi where γ is the load per unit capacity of node. And γ = m/Σ k=1 Cpk where m is the number of file chunks stored on system. Central node K1 K1 K1K2 K3 K2 K3 K4 K4 K2 K4 K3
  • 5. International Journal of Innovative Research in Advanced Engineering (IJIRAE) Volume 1 Issue 2 (April 2014) ___________________________________________________________________________________________________ ISSN: 2278-2311 IJIRAE | http://ijirae.com © 2014, IJIRAE All Rights Reserved Page - 82 IV. CONCLUSIONS Cloud computing application is based on the MapReduce programming used in distributed file system. Load imbalancing Problem mostly occur in dynamic, large-scale and distributed file system. Load should be balance over multiple nodes to improve system performance, resource utilization, response time and stability. In the load rebalancing algorithm the load of node is balanced as well as the movement cost also reduced. The load balancing task performs independently without global knowledge of system. This load balancing algorithm has fast concurrency rate. Load balancing reduce the load and movement cost while it uses physical network locality and node heterogeneity. REFERENCES 1. Hung-Chang Hsiao, Member, IEEE Computer Society, Hsueh-Yi Chung, HaiyingShen, Member, IEEE, and Yu-Chang Chao, proposed a “Load Rebalancing for Distributed File Systems in Clouds” IEEE transaction on parallel and distributed systems, vol. 24, no. 5, May 2013 2. J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Proc. Sixth Symp. Operating System Designed Implementation (OSDI ’04), pp. 137-150, Dec. 2004. 3. S. Mohammad, S. Breß, and E. Schallehn, “Cloud Data Management: A Short Overview and Comparison of Current Approaches,” Proc. 24th GI-Workshop Foundations of Databases (Grundlagen von Datenbanken), 2012. 4. Apache Hadoop, http://hadoop.apache.org/, 2013 5. P.Jamuna and R.Anand Kumar “Optimized Cloud Computing Technique To Simplify Load Balancing”International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 11,November 2013. 6. Kokilavani .K, Department Of Pervasive Computing Technology, Kings College Of Engineering, Punalkulam, Tamil nadu “Enhance load balancing algorithm for distributed file system in cloud” International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 6, December 2013 7. Hadoop Distributed File System, http://hadoop.apache.org/hdfs/, 2012. 8. Hadoop Distributed File System “Rebalancing Blocks,” http:// developer.yahoo.com/hadoop/tutorial/module2.html#rebalancing,2012 9. J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Proc. Sixth Symp. Operating System Design and Implementation (OSDI ’04), pp. 137-150, Dec. 2004. 10. ChahitaTanak, Rajesh Bharati “Load Balancing Algorithm for DHT Based Structured Peer to Peer System”International Journal of Emerging Technology and Advanced Engineering (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 1, January 2013) 11. QuangHieu Vu, Member, IEEE, Beng Chin Ooi, Martin Rinard, and Kian-Lee Tan “Histogram-Based Global Load Balancing in Structured Peer-to-Peer Systems” IEEE transaction on knowledge and data engineering,vol. 21, no. 4, April2009.