SlideShare ist ein Scribd-Unternehmen logo
1 von 7
ORCHESTRATING BULK DATA TRANSFERS ACROSS
GEO-DISTRIBUTED DATACENTERS
Abstract—As it has become the norm for cloud providers to host multiple datacenters around the globe, significant
demands exist for inter-datacenter data transfers in large volumes, e.g., migration of big data. A challenge arises on
how to schedule the bulk data transfers at different urgency levels, in order to fully utilize the available inter-
datacenter bandwidth. The Software Defined Networking (SDN) paradigm has emerged recently which decouples
the control plane from the data paths, enabling potential global optimization of data routing in a network. This paper
aims to design a dynamic, highly efficient bulk data transfer service in a geo-distributed datacenter system, and
engineer its design and solution algorithms closely within an SDN architecture. We model data transfer demands as
delay tolerant migration requests with different finishing deadlines. Thanks to the flexibility provided by SDN, we
enable dynamic, optimal routing of distinct chunks within each bulk data transfer (instead of treating each transfer as
an infinite flow), which can be temporarily stored at intermediate datacenters to mitigate bandwidth contention with
more urgent transfers. An optimal chunk routing optimization model is formulated to solve for the best chunk
transfer schedules over time. To derive the optimal schedules in an online fashion, three algorithms are discussed,
namely a bandwidth-reserving algorithm, a dynamically -adjusting algorithm, and a future-demandfriendly
algorithm, targeting at different levels of optimality and scalability. We build an SDN system based on the Beacon
platformand OpenFlow APIs, and carefully engineer our bulk data transfer algorithms in the system. Extensive real-
world experiments are carried out to compare the three algorithms as well as those from the existing literature, in
terms of routing optimality, computational delay and overhead.
EXISTING SYSTEM:
In the network inside a data center, TCP congestion control and FIFO flow scheduling are currently used for data
flow transport, which are unaware of flow deadlines. A number of proposals have appeared for deadline-aware
congestion and rate control. D3 [12] exploits deadline information to control the rate at which each source host
introduces traffic into the network, and apportions bandwidth at the routers along the paths greedily to satisfy as
many deadlines as possible. D2TCP [13] is a Deadline-Aware Datacenter TCP protocol to handle bursty flows with
deadlines. A congestion avoidance algorithm is employed, which uses ECN feedback from the routers and flow
deadlines to modify the congestion window at the sender.In pFabric [14], switches implement simple prioritybased
scheduling/dropping mechanisms, based on a priority number carried in the packets of each flow, and each flow
starts at the line rate which throttles back only when high and persistent packet loss occurs. Differently, our work
focuses on transportation of bulk flows among datacenters in a geodistributed cloud. Instead of end-to-end
congestion control, we enable store-and-forward in intermediate datacenters, such that a source data center can send
data out as soon as the first-hop connection bandwidth allows, whereas intermediate datacenters can temporarily
store the data if more urgent/important flows need the next-hop link bandwidths.
PROPOSED SYSTEM:
This paper proposes a novel optimization model for dynamic, highly efficient scheduling of bulk data transfers in a
geo-distributed datacenter system, and engineers its design and solution algorithms practically within an OpenFlow-
based SDN architecture. We model data transfer requests as delay tolerant data migration tasks with different
finishing deadlines. Thanks to the flexibility of transmission scheduling provided by SDN, we enable dynamic,
optimal routing of distinct chunks within each bulk data transfer (instead of treating each transfer as an infinite
flow), which can be temporarily stored at intermediate datacenters and transmitted only at carefully scheduled times,
to mitigate bandwidth contention among tasks of different urgency levels. Our contributions are summarized as
follows. First, we formulate the bulk data transfer problem into a novel, optimal chunk routing problem, which
maximizes the aggregate utility gain due to timely transfer completions before the specified deadlines. Such an
optimization model enables flexible, dynamic adjustment of chunk transfer schedules in a systemwith dynamically-
arriving data transfer requests, which is impossible with a popularly-modeled flow-based optimal routing model.
Second, we discuss three dynamic algorithms to solve the optimal chunk routing problem, namely a bandwidth-
reserving algorithm, a dynamically-adjusting algorithm, and a futuredemand- friendly algorithm. These solutions are
targeting at different levels of optimality and computational complexity. Third, we build an SDN system based on
the OpenFlow APIs and Beacon platform [11], and carefully engineer our bulk data transfer algorithms in the
system. Extensive realworld experiments with real network traffic are carried out to compare the three algorithms as
well as those in the existing literature, in terms of routing optimality, computational delay and overhead.
Module 1
Orchestrating in Cloud computing
The goal of cloud orchestration is to, insofar as is possible, automate the configuration, coordination and
management of software and software interactions in such an environment. The process involves automating
workflows required for service delivery. Tasks involved include managing server runtimes, directing the flow of
processes among applications and dealing with exceptions to typical workflows. Vendors of cloud orchestration
products include Eucalyptus, Flexiant, IBM, Microsoft, VMware and V3 Systems. The term “orchestration”
originally comes from the study of music, where it refers to the arrangement and coordination of instruments for a
given piece.
Module 2
SDN-based Architecture
We consider a cloud spanning multiple datacenters located in different geographic locations (Fig. 1). Each
datacenter is connected via a core switch to the other datacenters. The connections among the datacenters are
dedicated, fullduplex links, either through leading tier-1 ISPs or private fiber networks of the cloud provider,
allowing independent and simultaneous two-way data transmissions. Data transfer requests may arise from each
datacenter to move bulk volumes of data to another datacenter. A gateway server is connected to the core switch in
each datacenter, responsible for aggregating cross-datacenter data transfer requests from the same datacenter, as
well as for temporarily storing data from other datacenters and forwarding them via the switch. It also tracks
network topology and bandwidth availability among the datacenters with the help of the switches. Combined closely
with the SDN paradigm, a central controller is deployed to implement the optimal data transfer algorithms,
dynamically configure the flow table on each switch, and instruct the gateway servers to store or to forward each
data chunk. The layered architecture we present realistically resembles B4 [9], which was designed and deployed by
Google for their G-scale inter-datacenter network: the gateway server plays a similar role of the site controller layer,
the controller corresponds well to the global layer, and the core switch at each location can be deemed as the per-site
switch clusters in B4.
Module 3
Dynamic algorithms
We present three practical algorithms which make job acceptance and chunk routing decisions in each time slot, and
achieve different levels of optimality and scalability.
The Bandwidth-Reserving Algorithm The first algorithm honors decisions made in previous time slots, and
reserves bandwidth along the network links for scheduled chunk transmissions of previously accepted jobs in its
routing computation for newly arrived jobs. Let J(_ ) be the set consisting of only the latest data transfer requests
arrived in time slot _ . Define Bm;n(t) as the residual bandwidth on each connection (m; n) in time slot t 2 [_ +1; �],
excluding bandwidth needed for the remaining chunk transfers of accepted jobs arrived before _ . In each time slot _
, the algorithm solves optimization (1) with job set J(_ ) and bandwidth Bm;n(t)’s for duration [_ +1; �], and derives
admission control decisions for jobs arrived in this time slot, as well as their chunk transfer schedules before their
respective deadlines. Theorem 1 states the NP-hardness of optimization problem in (1) (with detailed proof in
AppendixA). Nevertheless, such a linear integer program may still be solved in reasonable time at a typical scale of
the problem (e.g., tens of datacenters in the system), using an optimization tool such as CPLEX [26]. To cater for
larger scale problems, we also propose a highly efficient heuristic. .
The Dynamically-Adjusting Algorithm
The second algorithm retains job acceptance decisions made in previous time slots, but adjusts routing schedules for
chunks of accepted jobs, which have not reached their respective destinations, together with the admission control
and routing computation of newly arrived jobs. Let J(_ ) be the set of data transfer requests arrived in time slot _ ,
and J(_�) represent the set of unfinished, previously accepted jobs by time slot _ . In each time slot _ , the algorithm
solves a modified version of optimization (1)
Module 4
The Future-Demand-Friendly Heuristic
We further propose a simple but efficient heuristic to make job acceptance and chunk routing decisions in each time
slot, with polynomial-time computational complexity, suitable for systems with larger scales. Similar to the first
algorithm, the heuristic retains routing decisions computed earlier for chunks of already accepted jobs, but only
makes decisions for jobs received in this time slot using the remaining bandwidth. On the other hand, it is more
future demand friendly than the first algorithm, by postponing the transmission of accepted jobs as much as possible,
to save bandwidth available in the immediate future in case more urgent transmission jobs may arrive. Let J(_ ) be
the set of latest data transfer requests arrived in time slot _ . The heuristic is given in Alg. 1. At the job level, the
algorithm preferably handles data transfer requests with higher weights and smaller sizes (line 1), i.e., larger weight
per unit bandwidth consumption. For each chunk in job J, the algorithm chooses a transfer path with the fewest
number of hops, that has available bandwidth to forward the chunk from the source to the destination before the
deadline.
CONCLUSION
This paper presents our efforts to tackle an arising challenge in geo-distributed datacenters, i.e., deadline-aware bulk
data transfers. Inspired by the emerging Software Defined Networking (SDN) initiative that is well suited to
deployment of an efficient scheduling algorithm with the global view of the network, we propose a reliable and
efficient underlying bulk data transfer service in an inter-datacenter network, featuring optimal routing for distinct
chunks over time, which can be temporarily stored at intermediate datacenters and forwarded at carefully computed
times. For practical application of the optimization framework, we derive three dynamic algorithms, targeting at
different levels of optimality and scalability. We also present the design and implementation of our Bulk Data
Transfer (BDT) system, based on the Beacon platform and OpenFlow APIs. Experiments with realistic settings
verify the practicality of the design and the efficiency of the three algorithms, based on extensive comparisons with
schemes in the literature.
REFERENCES
[1] Data Center Map, http://www.datacentermap.com/datacenters.html.
[2] K. K. Ramakrishnan, P. Shenoy, and J. Van der Merwe, “Live Data Center Migration across WANs: A Robust
Cooperative Context Aware Approach,” in Proceedings of the 2007 SIGCOMM workshop on Internet network
management, ser. INM ’07, New York, NY, USA, 2007, pp. 262–267.
[3] Y. Wu, C. Wu, B. Li, L. Zhang, Z. Li, and F. C. M. Lau, “Scaling Social Media Applications into Geo-
Distributed Clouds,” in INFOCOM, 2012.
[4] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Commun. ACM, vol.
51, no. 1, pp. 107–113, Jan. 2008.
[5] A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, J. Rexford, G. Xie, H. Yan, J. Zhan, and H. Zhang, “A
Clean Slate 4D Approach to Network Control and Management,” ACM SIGCOMM Co mputer Communication
Review, vol. 35, no. 5, pp. 41–54, 2005.
[6] SDN, https://www.opennetworking.org/sdn-resources/sdn-definition.
[7] N. McKeown, T. Anderson, H. Balakrishnan, G. M. Parulkar, L. L. Peterson, J. Rexford, S. Shenker, and J. S.
Turner, “OpenFlow: Enabling Innovation in Campus Networks,” Computer Communication Review, vol. 38, no. 2,
pp. 69–74, 2008.
[8] U. Hoelzle, “Openflow@ google,” Open Networking Summit, 2012.
[9] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu et al.,
“B4: Experience with a Globally-deployed Software Defined WAN,” in Proceedings of the ACM SIGCOMM 2013
conference on SIGCOMM. ACM, 2013, pp. 3–14.
[10] S. J. Vaughan-Nichols, “OpenFlow: The Next Generation of the Network?” Computer, vol. 44, no. 8, pp. 13–
15, 2011.
[11] Beacon Home, https://openflow.stanford.edu/display/Beacon/Home.
[12] C. Wilson, H. Ballani, T. Karagiannis, and A. Rowtron, “Better Never than Late: Meeting Deadlines in
Datacenter Networks,” in Proceedings of the ACM SIGCOMM, New York, NY, USA, 2011, pp. 50–61.

Weitere ähnliche Inhalte

Was ist angesagt?

SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...
SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...
SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...IJCNCJournal
 
Dual-resource TCPAQM for Processing-constrained Networks
Dual-resource TCPAQM for Processing-constrained NetworksDual-resource TCPAQM for Processing-constrained Networks
Dual-resource TCPAQM for Processing-constrained Networksambitlick
 
Comparative study on priority based qos
Comparative study on priority based qosComparative study on priority based qos
Comparative study on priority based qosijwmn
 
Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...
Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...
Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...1crore projects
 
Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...
Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...
Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...Eswar Publications
 
Load balancing in Distributed Systems
Load balancing in Distributed SystemsLoad balancing in Distributed Systems
Load balancing in Distributed SystemsRicha Singh
 
The Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent NetworkThe Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent NetworkIJAEMSJORNAL
 
JPJ1410 PACK: Prediction-Based Cloud Bandwidth and Cost Reduction System
JPJ1410  PACK: Prediction-Based Cloud Bandwidth and Cost Reduction SystemJPJ1410  PACK: Prediction-Based Cloud Bandwidth and Cost Reduction System
JPJ1410 PACK: Prediction-Based Cloud Bandwidth and Cost Reduction Systemchennaijp
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balmanbalmanme
 
Clustering based Time Slot Assignment Protocol for Improving Performance in U...
Clustering based Time Slot Assignment Protocol for Improving Performance in U...Clustering based Time Slot Assignment Protocol for Improving Performance in U...
Clustering based Time Slot Assignment Protocol for Improving Performance in U...journal ijrtem
 
Cross Layer- Performance Enhancement Architecture (CL-PEA) for MANET
Cross Layer- Performance Enhancement Architecture (CL-PEA) for MANETCross Layer- Performance Enhancement Architecture (CL-PEA) for MANET
Cross Layer- Performance Enhancement Architecture (CL-PEA) for MANETijcncs
 
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...IEEEGLOBALSOFTSTUDENTPROJECTS
 
ENERGY EFFICIENT MULTICAST ROUTING IN MANET
ENERGY EFFICIENT MULTICAST ROUTING IN MANET ENERGY EFFICIENT MULTICAST ROUTING IN MANET
ENERGY EFFICIENT MULTICAST ROUTING IN MANET ijac journal
 
An Adaptive Load Balancing Middleware for Distributed Simulation
An Adaptive Load Balancing Middleware for Distributed SimulationAn Adaptive Load Balancing Middleware for Distributed Simulation
An Adaptive Load Balancing Middleware for Distributed SimulationGabriele D'Angelo
 
M.E Computer Science Mobile Computing Projects
M.E Computer Science Mobile Computing ProjectsM.E Computer Science Mobile Computing Projects
M.E Computer Science Mobile Computing ProjectsVijay Karan
 

Was ist angesagt? (16)

SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...
SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...
SECTOR TREE-BASED CLUSTERING FOR ENERGY EFFICIENT ROUTING PROTOCOL IN HETEROG...
 
Dual-resource TCPAQM for Processing-constrained Networks
Dual-resource TCPAQM for Processing-constrained NetworksDual-resource TCPAQM for Processing-constrained Networks
Dual-resource TCPAQM for Processing-constrained Networks
 
Comparative study on priority based qos
Comparative study on priority based qosComparative study on priority based qos
Comparative study on priority based qos
 
Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...
Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...
Mobile Data Gathering with Load Balanced Clustering and Dual Data Uploading i...
 
Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...
Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...
Higher Throughput Maintenance Using Average Time Standard for Multipath Data ...
 
Load balancing in Distributed Systems
Load balancing in Distributed SystemsLoad balancing in Distributed Systems
Load balancing in Distributed Systems
 
The Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent NetworkThe Concept of Load Balancing Server in Secured and Intelligent Network
The Concept of Load Balancing Server in Secured and Intelligent Network
 
JPJ1410 PACK: Prediction-Based Cloud Bandwidth and Cost Reduction System
JPJ1410  PACK: Prediction-Based Cloud Bandwidth and Cost Reduction SystemJPJ1410  PACK: Prediction-Based Cloud Bandwidth and Cost Reduction System
JPJ1410 PACK: Prediction-Based Cloud Bandwidth and Cost Reduction System
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balman
 
Clustering based Time Slot Assignment Protocol for Improving Performance in U...
Clustering based Time Slot Assignment Protocol for Improving Performance in U...Clustering based Time Slot Assignment Protocol for Improving Performance in U...
Clustering based Time Slot Assignment Protocol for Improving Performance in U...
 
Cross Layer- Performance Enhancement Architecture (CL-PEA) for MANET
Cross Layer- Performance Enhancement Architecture (CL-PEA) for MANETCross Layer- Performance Enhancement Architecture (CL-PEA) for MANET
Cross Layer- Performance Enhancement Architecture (CL-PEA) for MANET
 
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
 
ENERGY EFFICIENT MULTICAST ROUTING IN MANET
ENERGY EFFICIENT MULTICAST ROUTING IN MANET ENERGY EFFICIENT MULTICAST ROUTING IN MANET
ENERGY EFFICIENT MULTICAST ROUTING IN MANET
 
An Adaptive Load Balancing Middleware for Distributed Simulation
An Adaptive Load Balancing Middleware for Distributed SimulationAn Adaptive Load Balancing Middleware for Distributed Simulation
An Adaptive Load Balancing Middleware for Distributed Simulation
 
A046020112
A046020112A046020112
A046020112
 
M.E Computer Science Mobile Computing Projects
M.E Computer Science Mobile Computing ProjectsM.E Computer Science Mobile Computing Projects
M.E Computer Science Mobile Computing Projects
 

Andere mochten auch

Andere mochten auch (15)

Marimar
MarimarMarimar
Marimar
 
Satania
SataniaSatania
Satania
 
10 Ways to Drive Revenue with Sales Enablement
10 Ways to Drive Revenue with Sales Enablement10 Ways to Drive Revenue with Sales Enablement
10 Ways to Drive Revenue with Sales Enablement
 
Salto 2016 - sessão com Guilherme Lito
Salto 2016 - sessão com Guilherme LitoSalto 2016 - sessão com Guilherme Lito
Salto 2016 - sessão com Guilherme Lito
 
RASS HR
RASS HRRASS HR
RASS HR
 
Desenvolupament de negoci: com fer que la botiga on - line incrementi les v...
Desenvolupament de negoci: com fer  que la botiga on - line incrementi les  v...Desenvolupament de negoci: com fer  que la botiga on - line incrementi les  v...
Desenvolupament de negoci: com fer que la botiga on - line incrementi les v...
 
HCMUT
HCMUTHCMUT
HCMUT
 
MSc Group Project Presentation
MSc Group Project PresentationMSc Group Project Presentation
MSc Group Project Presentation
 
Startups are about learning, SW Startup Day at TUT
Startups are about learning, SW Startup Day at TUTStartups are about learning, SW Startup Day at TUT
Startups are about learning, SW Startup Day at TUT
 
Ta1 part1
Ta1 part1Ta1 part1
Ta1 part1
 
Ellwood Consulting Brochure
Ellwood Consulting BrochureEllwood Consulting Brochure
Ellwood Consulting Brochure
 
Η τεχνική Pige
Η τεχνική PigeΗ τεχνική Pige
Η τεχνική Pige
 
PHC Postcard Samples
PHC Postcard SamplesPHC Postcard Samples
PHC Postcard Samples
 
Samir-Senior
Samir-SeniorSamir-Senior
Samir-Senior
 
князева а
князева акнязева а
князева а
 

Ähnlich wie Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters

ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERS
ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERSORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERS
ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERSShakas Technologies
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...eSAT Journals
 
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...eSAT Publishing House
 
Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks IJECEIAES
 
M phil-computer-science-mobile-computing-projects
M phil-computer-science-mobile-computing-projectsM phil-computer-science-mobile-computing-projects
M phil-computer-science-mobile-computing-projectsVijay Karan
 
M.Phil Computer Science Mobile Computing Projects
M.Phil Computer Science Mobile Computing ProjectsM.Phil Computer Science Mobile Computing Projects
M.Phil Computer Science Mobile Computing ProjectsVijay Karan
 
IEEE Networking 2016 Title and Abstract
IEEE Networking 2016 Title and AbstractIEEE Networking 2016 Title and Abstract
IEEE Networking 2016 Title and Abstracttsysglobalsolutions
 
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...Tal Lavian Ph.D.
 
A multi path routing algorithm for ip
A multi path routing algorithm for ipA multi path routing algorithm for ip
A multi path routing algorithm for ipAlvianus Dengen
 
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...M H
 
COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...
 COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR... COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...
COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...Nexgen Technology
 
Cost minimizing dynamic migration of content
Cost minimizing dynamic migration of contentCost minimizing dynamic migration of content
Cost minimizing dynamic migration of contentnexgentech15
 
Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...
Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...
Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...nexgentechnology
 
QoS-aware scheduling in LTE-A networks with SDN control
QoS-aware scheduling in LTE-A networks with SDN controlQoS-aware scheduling in LTE-A networks with SDN control
QoS-aware scheduling in LTE-A networks with SDN controlUniversity of Piraeus
 
Mobile data gathering with load balanced
Mobile data gathering with load balancedMobile data gathering with load balanced
Mobile data gathering with load balancedjpstudcorner
 
An efficient vertical handoff mechanism for future mobile network
An efficient vertical handoff mechanism for  future mobile networkAn efficient vertical handoff mechanism for  future mobile network
An efficient vertical handoff mechanism for future mobile networkBasil John
 

Ähnlich wie Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters (20)

ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERS
ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERSORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERS
ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERS
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
 
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
Enhancement of qos in multihop wireless networks by delivering cbr using lb a...
 
Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks
 
Networking Articles Overview
Networking Articles OverviewNetworking Articles Overview
Networking Articles Overview
 
N1803048386
N1803048386N1803048386
N1803048386
 
M phil-computer-science-mobile-computing-projects
M phil-computer-science-mobile-computing-projectsM phil-computer-science-mobile-computing-projects
M phil-computer-science-mobile-computing-projects
 
M.Phil Computer Science Mobile Computing Projects
M.Phil Computer Science Mobile Computing ProjectsM.Phil Computer Science Mobile Computing Projects
M.Phil Computer Science Mobile Computing Projects
 
IEEE Networking 2016 Title and Abstract
IEEE Networking 2016 Title and AbstractIEEE Networking 2016 Title and Abstract
IEEE Networking 2016 Title and Abstract
 
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
 
A multi path routing algorithm for ip
A multi path routing algorithm for ipA multi path routing algorithm for ip
A multi path routing algorithm for ip
 
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...
 
COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...
 COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR... COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...
COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...
 
Cost minimizing dynamic migration of content
Cost minimizing dynamic migration of contentCost minimizing dynamic migration of content
Cost minimizing dynamic migration of content
 
Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...
Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...
Cost-Minimizing Dynamic Migration of Content Distribution Services into Hybri...
 
QoS-aware scheduling in LTE-A networks with SDN control
QoS-aware scheduling in LTE-A networks with SDN controlQoS-aware scheduling in LTE-A networks with SDN control
QoS-aware scheduling in LTE-A networks with SDN control
 
Mobile data gathering with load balanced
Mobile data gathering with load balancedMobile data gathering with load balanced
Mobile data gathering with load balanced
 
SPROJReport (1)
SPROJReport (1)SPROJReport (1)
SPROJReport (1)
 
An efficient vertical handoff mechanism for future mobile network
An efficient vertical handoff mechanism for  future mobile networkAn efficient vertical handoff mechanism for  future mobile network
An efficient vertical handoff mechanism for future mobile network
 

Kürzlich hochgeladen

MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxleah joy valeriano
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 

Kürzlich hochgeladen (20)

MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 

Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters

  • 1. ORCHESTRATING BULK DATA TRANSFERS ACROSS GEO-DISTRIBUTED DATACENTERS Abstract—As it has become the norm for cloud providers to host multiple datacenters around the globe, significant demands exist for inter-datacenter data transfers in large volumes, e.g., migration of big data. A challenge arises on how to schedule the bulk data transfers at different urgency levels, in order to fully utilize the available inter- datacenter bandwidth. The Software Defined Networking (SDN) paradigm has emerged recently which decouples the control plane from the data paths, enabling potential global optimization of data routing in a network. This paper aims to design a dynamic, highly efficient bulk data transfer service in a geo-distributed datacenter system, and engineer its design and solution algorithms closely within an SDN architecture. We model data transfer demands as delay tolerant migration requests with different finishing deadlines. Thanks to the flexibility provided by SDN, we enable dynamic, optimal routing of distinct chunks within each bulk data transfer (instead of treating each transfer as an infinite flow), which can be temporarily stored at intermediate datacenters to mitigate bandwidth contention with more urgent transfers. An optimal chunk routing optimization model is formulated to solve for the best chunk transfer schedules over time. To derive the optimal schedules in an online fashion, three algorithms are discussed, namely a bandwidth-reserving algorithm, a dynamically -adjusting algorithm, and a future-demandfriendly algorithm, targeting at different levels of optimality and scalability. We build an SDN system based on the Beacon platformand OpenFlow APIs, and carefully engineer our bulk data transfer algorithms in the system. Extensive real- world experiments are carried out to compare the three algorithms as well as those from the existing literature, in terms of routing optimality, computational delay and overhead. EXISTING SYSTEM: In the network inside a data center, TCP congestion control and FIFO flow scheduling are currently used for data flow transport, which are unaware of flow deadlines. A number of proposals have appeared for deadline-aware congestion and rate control. D3 [12] exploits deadline information to control the rate at which each source host
  • 2. introduces traffic into the network, and apportions bandwidth at the routers along the paths greedily to satisfy as many deadlines as possible. D2TCP [13] is a Deadline-Aware Datacenter TCP protocol to handle bursty flows with deadlines. A congestion avoidance algorithm is employed, which uses ECN feedback from the routers and flow deadlines to modify the congestion window at the sender.In pFabric [14], switches implement simple prioritybased scheduling/dropping mechanisms, based on a priority number carried in the packets of each flow, and each flow starts at the line rate which throttles back only when high and persistent packet loss occurs. Differently, our work focuses on transportation of bulk flows among datacenters in a geodistributed cloud. Instead of end-to-end congestion control, we enable store-and-forward in intermediate datacenters, such that a source data center can send data out as soon as the first-hop connection bandwidth allows, whereas intermediate datacenters can temporarily store the data if more urgent/important flows need the next-hop link bandwidths. PROPOSED SYSTEM: This paper proposes a novel optimization model for dynamic, highly efficient scheduling of bulk data transfers in a geo-distributed datacenter system, and engineers its design and solution algorithms practically within an OpenFlow- based SDN architecture. We model data transfer requests as delay tolerant data migration tasks with different finishing deadlines. Thanks to the flexibility of transmission scheduling provided by SDN, we enable dynamic, optimal routing of distinct chunks within each bulk data transfer (instead of treating each transfer as an infinite flow), which can be temporarily stored at intermediate datacenters and transmitted only at carefully scheduled times, to mitigate bandwidth contention among tasks of different urgency levels. Our contributions are summarized as follows. First, we formulate the bulk data transfer problem into a novel, optimal chunk routing problem, which maximizes the aggregate utility gain due to timely transfer completions before the specified deadlines. Such an optimization model enables flexible, dynamic adjustment of chunk transfer schedules in a systemwith dynamically- arriving data transfer requests, which is impossible with a popularly-modeled flow-based optimal routing model. Second, we discuss three dynamic algorithms to solve the optimal chunk routing problem, namely a bandwidth- reserving algorithm, a dynamically-adjusting algorithm, and a futuredemand- friendly algorithm. These solutions are targeting at different levels of optimality and computational complexity. Third, we build an SDN system based on the OpenFlow APIs and Beacon platform [11], and carefully engineer our bulk data transfer algorithms in the
  • 3. system. Extensive realworld experiments with real network traffic are carried out to compare the three algorithms as well as those in the existing literature, in terms of routing optimality, computational delay and overhead. Module 1 Orchestrating in Cloud computing The goal of cloud orchestration is to, insofar as is possible, automate the configuration, coordination and management of software and software interactions in such an environment. The process involves automating workflows required for service delivery. Tasks involved include managing server runtimes, directing the flow of processes among applications and dealing with exceptions to typical workflows. Vendors of cloud orchestration products include Eucalyptus, Flexiant, IBM, Microsoft, VMware and V3 Systems. The term “orchestration” originally comes from the study of music, where it refers to the arrangement and coordination of instruments for a given piece. Module 2 SDN-based Architecture We consider a cloud spanning multiple datacenters located in different geographic locations (Fig. 1). Each datacenter is connected via a core switch to the other datacenters. The connections among the datacenters are dedicated, fullduplex links, either through leading tier-1 ISPs or private fiber networks of the cloud provider, allowing independent and simultaneous two-way data transmissions. Data transfer requests may arise from each
  • 4. datacenter to move bulk volumes of data to another datacenter. A gateway server is connected to the core switch in each datacenter, responsible for aggregating cross-datacenter data transfer requests from the same datacenter, as well as for temporarily storing data from other datacenters and forwarding them via the switch. It also tracks network topology and bandwidth availability among the datacenters with the help of the switches. Combined closely with the SDN paradigm, a central controller is deployed to implement the optimal data transfer algorithms, dynamically configure the flow table on each switch, and instruct the gateway servers to store or to forward each data chunk. The layered architecture we present realistically resembles B4 [9], which was designed and deployed by Google for their G-scale inter-datacenter network: the gateway server plays a similar role of the site controller layer, the controller corresponds well to the global layer, and the core switch at each location can be deemed as the per-site switch clusters in B4. Module 3 Dynamic algorithms We present three practical algorithms which make job acceptance and chunk routing decisions in each time slot, and achieve different levels of optimality and scalability. The Bandwidth-Reserving Algorithm The first algorithm honors decisions made in previous time slots, and reserves bandwidth along the network links for scheduled chunk transmissions of previously accepted jobs in its routing computation for newly arrived jobs. Let J(_ ) be the set consisting of only the latest data transfer requests arrived in time slot _ . Define Bm;n(t) as the residual bandwidth on each connection (m; n) in time slot t 2 [_ +1; �], excluding bandwidth needed for the remaining chunk transfers of accepted jobs arrived before _ . In each time slot _ , the algorithm solves optimization (1) with job set J(_ ) and bandwidth Bm;n(t)’s for duration [_ +1; �], and derives admission control decisions for jobs arrived in this time slot, as well as their chunk transfer schedules before their respective deadlines. Theorem 1 states the NP-hardness of optimization problem in (1) (with detailed proof in AppendixA). Nevertheless, such a linear integer program may still be solved in reasonable time at a typical scale of the problem (e.g., tens of datacenters in the system), using an optimization tool such as CPLEX [26]. To cater for larger scale problems, we also propose a highly efficient heuristic. . The Dynamically-Adjusting Algorithm
  • 5. The second algorithm retains job acceptance decisions made in previous time slots, but adjusts routing schedules for chunks of accepted jobs, which have not reached their respective destinations, together with the admission control and routing computation of newly arrived jobs. Let J(_ ) be the set of data transfer requests arrived in time slot _ , and J(_�) represent the set of unfinished, previously accepted jobs by time slot _ . In each time slot _ , the algorithm solves a modified version of optimization (1) Module 4 The Future-Demand-Friendly Heuristic We further propose a simple but efficient heuristic to make job acceptance and chunk routing decisions in each time slot, with polynomial-time computational complexity, suitable for systems with larger scales. Similar to the first algorithm, the heuristic retains routing decisions computed earlier for chunks of already accepted jobs, but only makes decisions for jobs received in this time slot using the remaining bandwidth. On the other hand, it is more future demand friendly than the first algorithm, by postponing the transmission of accepted jobs as much as possible, to save bandwidth available in the immediate future in case more urgent transmission jobs may arrive. Let J(_ ) be the set of latest data transfer requests arrived in time slot _ . The heuristic is given in Alg. 1. At the job level, the algorithm preferably handles data transfer requests with higher weights and smaller sizes (line 1), i.e., larger weight per unit bandwidth consumption. For each chunk in job J, the algorithm chooses a transfer path with the fewest number of hops, that has available bandwidth to forward the chunk from the source to the destination before the deadline. CONCLUSION This paper presents our efforts to tackle an arising challenge in geo-distributed datacenters, i.e., deadline-aware bulk data transfers. Inspired by the emerging Software Defined Networking (SDN) initiative that is well suited to deployment of an efficient scheduling algorithm with the global view of the network, we propose a reliable and efficient underlying bulk data transfer service in an inter-datacenter network, featuring optimal routing for distinct
  • 6. chunks over time, which can be temporarily stored at intermediate datacenters and forwarded at carefully computed times. For practical application of the optimization framework, we derive three dynamic algorithms, targeting at different levels of optimality and scalability. We also present the design and implementation of our Bulk Data Transfer (BDT) system, based on the Beacon platform and OpenFlow APIs. Experiments with realistic settings verify the practicality of the design and the efficiency of the three algorithms, based on extensive comparisons with schemes in the literature. REFERENCES [1] Data Center Map, http://www.datacentermap.com/datacenters.html. [2] K. K. Ramakrishnan, P. Shenoy, and J. Van der Merwe, “Live Data Center Migration across WANs: A Robust Cooperative Context Aware Approach,” in Proceedings of the 2007 SIGCOMM workshop on Internet network management, ser. INM ’07, New York, NY, USA, 2007, pp. 262–267. [3] Y. Wu, C. Wu, B. Li, L. Zhang, Z. Li, and F. C. M. Lau, “Scaling Social Media Applications into Geo- Distributed Clouds,” in INFOCOM, 2012. [4] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Commun. ACM, vol. 51, no. 1, pp. 107–113, Jan. 2008. [5] A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, J. Rexford, G. Xie, H. Yan, J. Zhan, and H. Zhang, “A Clean Slate 4D Approach to Network Control and Management,” ACM SIGCOMM Co mputer Communication Review, vol. 35, no. 5, pp. 41–54, 2005. [6] SDN, https://www.opennetworking.org/sdn-resources/sdn-definition. [7] N. McKeown, T. Anderson, H. Balakrishnan, G. M. Parulkar, L. L. Peterson, J. Rexford, S. Shenker, and J. S. Turner, “OpenFlow: Enabling Innovation in Campus Networks,” Computer Communication Review, vol. 38, no. 2, pp. 69–74, 2008. [8] U. Hoelzle, “Openflow@ google,” Open Networking Summit, 2012.
  • 7. [9] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu et al., “B4: Experience with a Globally-deployed Software Defined WAN,” in Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM. ACM, 2013, pp. 3–14. [10] S. J. Vaughan-Nichols, “OpenFlow: The Next Generation of the Network?” Computer, vol. 44, no. 8, pp. 13– 15, 2011. [11] Beacon Home, https://openflow.stanford.edu/display/Beacon/Home. [12] C. Wilson, H. Ballani, T. Karagiannis, and A. Rowtron, “Better Never than Late: Meeting Deadlines in Datacenter Networks,” in Proceedings of the ACM SIGCOMM, New York, NY, USA, 2011, pp. 50–61.