1. A Flexible Reservation Algorithm
for Advance Network
Provisioning
Mehmet Balman, Evangelos Chaniotakis,
Arie Shoshani, Alex Sim
Scientific Data Management Research Group (SDM)
Energy Sciences Network (ESNet)
Lawrence Berkeley National Laboratory
SC'10 November 2010, New Orleans, Louisiana, USA
2. Introduction
Next generation research networks such as ESNet
(Energy Sciences Network) provide high-speed on-
demand data access between collaborating
institutions by delivering network-as-a-service
Currently, reservation systems (e.g., OSCARS)
provide only yes/no answers to a reservation request for
(bandwidth, start_time, end_time)
We present a novel approach that improves advance
network reservation systems by presenting clients
with the possible reservation options and
alternatives for earliest completion time and shortest
transfer duration.
3. Motivation
• We are in a new era that offers new opportunities to
conduct scientific research with the help of
computation
• Computationally intensive science: particle physics, climate modelling,
bio-informatics simulations
• Scientific simulations and experimental facilities
generate massive data sets
• Climate modelling data
• 35 terabytes shared by more than 2500 users worldwide
• Next generation archive will be more than 650 terabytes
• Large Hadron Collider
• Expected to generate 100 gigabits per second
• Scientific applications are becoming more data-intensive
(dealing with petabytes of data)
4. Motivation
Large-scale applications necessitate collaboration
Data needs to be transferred to remote sites for
further analysis (validate with simulations)
Need on-demand high-speed data access between
collaborating parties
High performance visualization
Large volume data analysis
Need coordination and management of resources
Complex middleware is required to manage the end-
to-end distribution of data
5. ESNet (Energy Sciences Network)
Provides high bandwidth network interconnect
between more than 40 sites
Connecting experimental facilities, supercomputing
centers and thousands of DOE scientists
Delivering network as a service (OSCARS)
Predictable performance
Efficient resource utilization
Guaranteed bandwidth
6. On-Demand Secure Circuits and Advance Reservation
System
(OSCARS)
Constructs a QoS path for guaranteed bandwidth
End-to-end provisioning between multiple domains
Guaranteed bandwidth (at certain time, for a certain
bandwidth and length of time)
OSCARS components include a reservation manager,
bandwidth scheduler, and path setup system
Needs to have information about current and future
states of the network
Making a reservation requires ensuring availability of the requested
bandwidth from source to destination for the requested time
interval
7. Reservation Request
For every new reservation request
R = {n_source, n_destination, M_bandwidth, t_start, t_end}:
committed reservations between t_start and t_end are examined
a snapshot graph G' of the network topology is generated
by extracting available bandwidth information for each
port in the time period (t_start, t_end)
The shortest path from source to destination is calculated based
on the engineering metric on each link, and a bandwidth-guaranteed
path is set up to commit and eventually complete the
reservation request for the given time period
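A minimal sketch of the snapshot step, using the example topology and reservations from the following slides. The function name snapshot, the tuple layout, and the numeric times (t1..t5 written as 1..5) are assumptions for illustration, not the OSCARS API. It subtracts every committed reservation that overlaps the requested interval, which is conservative when two reservations on the same link do not overlap each other in time.

```python
# Link capacities in Mbps (undirected links, keys sorted alphabetically).
CAPACITY = {
    ("A", "B"): 1000, ("A", "C"): 800, ("B", "C"): 300,
    ("B", "D"): 900, ("C", "D"): 500,
}

# Committed reservations: (start, end, path, bandwidth in Mbps).
COMMITTED = [
    (1, 3, ["A", "B", "D"], 900),
    (2, 3, ["A", "C", "D"], 400),
    (4, 5, ["A", "B", "D"], 800),
]


def snapshot(capacity, reservations, t_start, t_end):
    """Return available bandwidth per link for the interval [t_start, t_end)."""
    free = dict(capacity)
    for res_start, res_end, path, bandwidth in reservations:
        # A reservation matters only if it overlaps the requested interval.
        if res_start < t_end and res_end > t_start:
            for hop in zip(path, path[1:]):
                free[tuple(sorted(hop))] -= bandwidth
    return free


g_t1_t2 = snapshot(CAPACITY, COMMITTED, 1, 2)
g_t1_t3 = snapshot(CAPACITY, COMMITTED, 1, 3)
```

Only Reservation 1 overlaps (t1, t2), so link A - B keeps 100Mbps and B - D keeps none; over (t1, t3) Reservation 2 also reduces A - C and C - D.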
8. Network Reservation / Topology
Components (Graph):
node (router), port, link (connecting two ports)
engineering metric (~latency)
maximum bandwidth (capacity)
Reservation:
source, destination, path, time
Example topology (link capacities):
A - B: 1000Mbps
A - C: 800Mbps
B - C: 300Mbps
B - D: 900Mbps
C - D: 500Mbps
Example reservations:
Reservation 1: (time t1, t3) A -> B -> D (900Mbps)
Reservation 2: (time t2, t3) A -> C -> D (400Mbps)
Reservation 3: (time t4, t5) A -> B -> D (800Mbps)
9. Network Reservation / Example
(time t1, t2):
Link     Available / Reserved (Capacity)
A - B    100 Mbps / 900 Mbps (1000 Mbps)
A - C    800 Mbps / 0 Mbps (800 Mbps)
B - C    300 Mbps / 0 Mbps (300 Mbps)
B - D    0 Mbps / 900 Mbps (900 Mbps)
C - D    500 Mbps / 0 Mbps (500 Mbps)
Requests:
A to D (600Mbps): no
A to D (500Mbps): yes
Active reservations
reservation 1: (time t1, t3) A -> B -> D (900Mbps)
reservation 2: (time t2, t3) A -> C -> D (400Mbps)
reservation 3: (time t4, t5) A -> B -> D (800Mbps)
10. Network Reservation / Example
(time t1, t3):
Link     Available / Reserved (Capacity)
A - B    100 Mbps / 900 Mbps (1000 Mbps)
A - C    400 Mbps / 400 Mbps (800 Mbps)
B - C    300 Mbps / 0 Mbps (300 Mbps)
B - D    0 Mbps / 900 Mbps (900 Mbps)
C - D    100 Mbps / 400 Mbps (500 Mbps)
Requests:
A to D (500Mbps): no
A to C (500Mbps): no
(no splitting, not max-flow)
Active reservations
reservation 1: (time t1, t3) A -> B -> D (900Mbps)
reservation 2: (time t2, t3) A -> C -> D (400Mbps)
reservation 3: (time t4, t5) A -> B -> D (800Mbps)
11. Problem
If the requested bandwidth cannot be guaranteed:
Trial and error until an available reservation is found
Client is not given other possible options
Does not provide an optimal choice for the client
May cause ineffective use of the overall system
Overloads the system with trial-and-error attempts
End-to-end High Performance Data Movement
Bandwidth network reservation
Bandwidth provisioning in client sites
Storage allocation
How can we enhance the OSCARS reservation system?
• Submit constraints and the system suggests possible
reservations satisfying requirements
12. Alternative Approach / Flexible
Reservation
Users provide maximum bandwidth they can use, total size of the
data requested to be transferred, the earliest start time, and the
latest completion time
Users can set criteria such that they would like to reserve a path
for earliest completion time or reserve a path for shortest transfer
duration.
R_s' = {n_source, n_destination, M_MAXbandwidth, D_dataSize, t_EarliestStart, t_LatestEnd}
The reservation engine finds the reservation
R = {n_source, n_destination, M_bandwidth, t_start, t_end}
for the earliest completion or for the shortest duration,
where M_bandwidth ≤ M_MAXbandwidth and t_EarliestStart ≤ t_start < t_end ≤ t_LatestEnd.
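The constraints above can be written as a small check. This is an illustrative sketch: the class and field names are hypothetical, and the final volume test (bandwidth x duration must cover the data size) is an assumption implied by the request definition rather than stated in the formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlexibleRequest:
    """R_s': what the user submits (names are illustrative)."""
    source: str
    destination: str
    max_bandwidth: float   # M_MAXbandwidth
    data_size: float       # D_dataSize, in bandwidth x time units
    earliest_start: float  # t_EarliestStart
    latest_end: float      # t_LatestEnd


@dataclass(frozen=True)
class Reservation:
    """R: what the reservation engine returns."""
    source: str
    destination: str
    bandwidth: float       # M_bandwidth
    start: float           # t_start
    end: float             # t_end


def satisfies(request, reservation):
    """Check a candidate reservation R against the request R_s'."""
    same_endpoints = (request.source == reservation.source
                      and request.destination == reservation.destination)
    bandwidth_ok = 0 < reservation.bandwidth <= request.max_bandwidth
    time_ok = (request.earliest_start <= reservation.start
               < reservation.end <= request.latest_end)
    # Assumption: the reserved bandwidth must move the whole data set in time.
    duration = reservation.end - reservation.start
    volume_ok = reservation.bandwidth * duration >= request.data_size
    return same_endpoints and bandwidth_ok and time_ok and volume_ok


request = FlexibleRequest("A", "D", max_bandwidth=200, data_size=800,
                          earliest_start=1, latest_end=13)
candidate = Reservation("A", "D", bandwidth=100, start=1, end=9)
```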
13. Time-dependent Graphs
We deal with a dynamic network such that the bandwidth
value for every link is time dependent
The most common approach is to use discrete-time algorithms, in
which time is modeled as a set of discrete values and a
static graph is constructed for every time interval.
Flexible Reservation Service
– Source / destination end-points
– Maximum bandwidth that can be used (provisioning in clients)
– Amount of data requested to be transferred (Volume)
– Earliest start time
– Latest completion time
– Criteria
– reserve a path for earliest completion,
– reserve a path for shortest transfer duration
14. Max Bandwidth
The maximum bandwidth available for allocation
from a source node to a destination node
Modified version of Kruskal's and Dijkstra's algorithms
» Shortest path
» Min-cost path
» Minimum spanning tree
» Max bandwidth path
Bottleneck constraint (max-bandwidth): the value of a path is
the minimum bandwidth of its links
Ex: QoS constraint is additive
in shortest path
15. Path Finding
Topology (bandwidth / eng metric):
A - B: 1000Mbps / eng metric 10
A - C: 800Mbps / eng metric 20
B - C: 300Mbps / eng metric 20
B - D: 900Mbps / eng metric 30
C - D: 500Mbps / eng metric 100
Search steps (bandwidth / eng metric / hop count):
(1) Visit A
B (parent A) 1000/10/1 hop
C (parent A) 800/20/1 hop
(2) Visit B
D (parent B) 900/30/2 hops
C (parent A) 800/20/1 hop
(3) Visit D
Max bandwidth from A to D is 900
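The path-finding steps above can be sketched as a Dijkstra variant in which the path value is the bottleneck (minimum link bandwidth) instead of an additive cost, and the node with the largest bottleneck is visited first. This is a generic widest-path sketch, not the authors' library code; it ignores the engineering metric and hop count, and the names are illustrative.

```python
import heapq

LINKS = [("A", "B", 1000), ("A", "C", 800), ("B", "C", 300),
         ("B", "D", 900), ("C", "D", 500)]


def build_graph(links):
    """Build an undirected adjacency map: node -> {neighbour: bandwidth}."""
    graph = {}
    for u, v, bandwidth in links:
        graph.setdefault(u, {})[v] = bandwidth
        graph.setdefault(v, {})[u] = bandwidth
    return graph


def max_bandwidth_path(graph, source, target):
    """Return (bottleneck bandwidth, path); (0, []) if target is unreachable."""
    best = {source: float("inf")}
    parent = {source: None}
    heap = [(-best[source], source)]  # max-heap via negated bandwidth
    visited = set()
    while heap:
        neg_bw, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbour, capacity in graph.get(node, {}).items():
            bottleneck = min(-neg_bw, capacity)
            # Links with zero available bandwidth are never followed.
            if bottleneck > best.get(neighbour, 0):
                best[neighbour] = bottleneck
                parent[neighbour] = node
                heapq.heappush(heap, (-bottleneck, neighbour))
    if target not in visited:
        return 0, []
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]


GRAPH = build_graph(LINKS)
bandwidth, path = max_bandwidth_path(GRAPH, "A", "D")
```

On the example topology this visits A, then B, then D, and reports 900Mbps along A -> B -> D.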
16. Example Problem
A vehicle travelling from city A to city B
There are multiple cities between A and B connected with separate
highways.
Each highway has a specific speed limit
– (maximum bandwidth)
But we need to reduce our speed if there is high traffic load on the road
We know the load on each highway for every time period
– (active reservations)
The first question is which path the vehicle should follow in order to
reach city B from city A as early as possible?
Or, we can delay our journey and start later if the total travel time would
be reduced. Thus, the second question is to find the route along with
the starting time for shortest travel duration.
17. Challenge
But we are dealing with bandwidth reservation,
where the allocation must be set in advance when a
request is received.
We have to set the speed limit before starting and
cannot change it during the journey
Advance Bandwidth Reservation
Therefore, known time-dependent graph algorithms
do not fit our problem domain.
18. Approach
We discretize the time-dependent dynamic network topology by
dividing the search interval into time steps.
Each time step represents a stable status of the topology.
A time window is a combination of consecutive time steps.
Search interval is divided into time windows
Obtain a snapshot of the network topology for each
time window
The algorithm should be fast and scalable.
– Searching the given time interval is accomplished in
polynomial time.
– Number of time windows is bounded by the number of active
reservations
19. Time steps
Topology (link capacities):
A - B: 1000Mbps, A - C: 800Mbps, B - C: 300Mbps, B - D: 900Mbps, C - D: 500Mbps
Reservation 1: (time t1, t6) A -> B -> D (900Mbps)
Reservation 2: (time t4, t7) A -> C -> D (400Mbps)
Reservation 3: (time t9, t12) A -> B -> D (700Mbps)
Timeline: t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13
20. Time steps
Time steps between t1 and t13
Max (2r+1) time steps, where r is the number of reservations
Time step   Interval    Active reservations
ts1         t1 - t4     Res 1
ts2         t4 - t6     Res 1,2
ts3         t6 - t7     Res 2
ts4         t7 - t9     (none)
Remaining intervals: t9 - t12 (Res 3), t12 - t13 (none)
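A short sketch of how the time steps can be derived: every reservation start or end inside the search interval becomes a boundary, so r reservations yield at most 2r + 1 steps. The function name and tuple layout are illustrative, with t1..t13 written as 1..13.

```python
def time_steps(reservations, search_start, search_end):
    """Split [search_start, search_end] at reservation start/end times.

    reservations: list of (start, end, name).
    Returns a list of (step_start, step_end, active reservation names).
    """
    points = {search_start, search_end}
    for start, end, _name in reservations:
        for t in (start, end):
            if search_start < t < search_end:
                points.add(t)
    bounds = sorted(points)
    steps = []
    for lo, hi in zip(bounds, bounds[1:]):
        # A reservation is active in a step if the two intervals overlap.
        active = [name for start, end, name in reservations
                  if start < hi and end > lo]
        steps.append((lo, hi, active))
    return steps


RESERVATIONS = [(1, 6, "Res 1"), (4, 7, "Res 2"), (9, 12, "Res 3")]
steps = time_steps(RESERVATIONS, 1, 13)
```

For the three example reservations this gives six steps, within the bound of 2 x 3 + 1 = 7.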
21. Static Graphs
Available bandwidth per link in each time step:
Link     G(ts1)      G(ts2)      G(ts3)      G(ts4)
         t1 - t4     t4 - t6     t6 - t7     t7 - t9
         Res 1       Res 1,2     Res 2       (none)
A - B    100 Mbps    100 Mbps    1000 Mbps   1000 Mbps
A - C    800 Mbps    400 Mbps    400 Mbps    800 Mbps
B - C    300 Mbps    300 Mbps    300 Mbps    300 Mbps
B - D    0 Mbps      0 Mbps      900 Mbps    900 Mbps
C - D    500 Mbps    100 Mbps    100 Mbps    500 Mbps
22. Time Windows
Max (s × (s + 1))/2 time windows, where s is the number of time steps
Bottleneck constraint: a window graph keeps, for each link, the
minimum available bandwidth over the time steps it combines
Link     tw = ts1 + ts2    tw = ts3 + ts4
         t1 - t6           t6 - t9
         Res 1,2           Res 2
A - B    100 Mbps          1000 Mbps
A - C    400 Mbps          400 Mbps
B - C    300 Mbps          300 Mbps
B - D    0 Mbps            900 Mbps
C - D    100 Mbps          100 Mbps
G(tw)=G(ts1) x G(ts2)      G(tw)=G(ts3) x G(ts4)
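A sketch of the two operations on this slide: enumerating the s(s + 1)/2 windows of consecutive time steps, and combining step graphs link by link with the bottleneck (minimum) rule. The per-step values are copied from the Static Graphs slide; the function names are illustrative, and the enumeration order (by end step, shortest window first) is an assumption.

```python
def time_windows(num_steps):
    """Enumerate windows as (first_step, last_step) index pairs."""
    return [(i, j) for j in range(num_steps) for i in range(j, -1, -1)]


def combine(step_graphs):
    """Window graph: per-link minimum over the given time-step graphs."""
    first = step_graphs[0]
    return {link: min(g[link] for g in step_graphs) for link in first}


# Available bandwidth (Mbps) per link for ts1..ts4.
G_TS = [
    {"A-B": 100, "A-C": 800, "B-C": 300, "B-D": 0, "C-D": 500},
    {"A-B": 100, "A-C": 400, "B-C": 300, "B-D": 0, "C-D": 100},
    {"A-B": 1000, "A-C": 400, "B-C": 300, "B-D": 900, "C-D": 100},
    {"A-B": 1000, "A-C": 800, "B-C": 300, "B-D": 900, "C-D": 500},
]

windows = time_windows(len(G_TS))   # 4 steps -> 10 windows
g_tw_12 = combine(G_TS[0:2])        # tw = ts1 + ts2
g_tw_34 = combine(G_TS[2:4])        # tw = ts3 + ts4
```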
23. Search Time Windows
• Search through these time windows in a sequential order
to check whether we can satisfy the requested allocation
for that time window.
• First, check the duration of the time window
– Can we satisfy the user request in that time window?
(we know the max bandwidth user can support)
• Then, calculate the max bandwidth available in the time
window
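The search described above can be sketched end to end for the earliest-completion criterion. This is a simplified illustration under stated assumptions, not the authors' implementation: times t1..t13 are written as 1..13, volume is expressed in Mbps x time slots, a reservation starts at the beginning of its window, and all function names are hypothetical. Topology and reservations are those of the running example.

```python
import heapq

CAPACITY = {("A", "B"): 1000, ("A", "C"): 800, ("B", "C"): 300,
            ("B", "D"): 900, ("C", "D"): 500}
# Committed reservations: (start, end, links used, Mbps).
COMMITTED = [
    (1, 6, [("A", "B"), ("B", "D")], 900),
    (4, 7, [("A", "C"), ("C", "D")], 400),
    (9, 12, [("A", "B"), ("B", "D")], 700),
]


def widest(free, src, dst):
    """Max-bandwidth (bottleneck) value from src to dst; 0 if none."""
    best = {src: float("inf")}
    heap = [(-best[src], src)]
    done = set()
    while heap:
        neg_bw, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            return -neg_bw
        for (u, v), cap in free.items():
            if node in (u, v):
                nbr = v if node == u else u
                bw = min(-neg_bw, cap)
                if bw > best.get(nbr, 0):
                    best[nbr] = bw
                    heapq.heappush(heap, (-bw, nbr))
    return 0


def earliest_completion(src, dst, max_bw, volume, t_earliest, t_latest):
    """Return (bandwidth, start, end) with the earliest end, or None."""
    cuts = {t for s, e, _, _ in COMMITTED for t in (s, e)
            if t_earliest < t < t_latest}
    bounds = sorted(cuts | {t_earliest, t_latest})
    best = None
    for j in range(1, len(bounds)):
        for i in range(j - 1, -1, -1):
            lo, hi = bounds[i], bounds[j]
            # First check: is the window long enough at full client speed?
            if max_bw * (hi - lo) < volume:
                continue
            # Window graph: per-link minimum over the time steps it covers.
            free = dict(CAPACITY)
            for k in range(i, j):
                for link in CAPACITY:
                    used = sum(bw for s, e, links, bw in COMMITTED
                               if s < bounds[k + 1] and e > bounds[k]
                               and link in links)
                    free[link] = min(free[link], CAPACITY[link] - used)
            bw = min(max_bw, widest(free, src, dst))
            if bw <= 0:
                continue
            end = lo + volume / bw
            if end <= hi and (best is None or end < best[2]):
                best = (bw, lo, end)
    return best


result = earliest_completion("A", "D", max_bw=200, volume=800,
                             t_earliest=1, t_latest=13)
```

For a request of 200Mbps x 4 time slots between t1 and t13, the sketch selects 100Mbps from t1 to t9. A shortest-duration search would instead minimise volume / bandwidth over the same windows.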
24. Performance
max-bandwidth path ~ O(n^2)
n is the number of nodes in the topology graph
In the worst case, we may need to search all time
windows, (s × (s + 1))/2, where s is the number of
time steps.
If there are r committed reservations in the search
period, there can be a maximum of 2r + 1 different
time steps in the worst case.
Overall, the worst-case complexity is bounded
by O(r^2 n^2)
Note: r is typically small compared to the number
of nodes n
25. Example
Topology (link capacities):
A - B: 1000Mbps, A - C: 800Mbps, B - C: 300Mbps, B - D: 900Mbps, C - D: 500Mbps
Reservation 1: (time t1, t6) A -> B -> D (900Mbps)
Reservation 2: (time t4, t7) A -> C -> D (400Mbps)
Reservation 3: (time t9, t12) A -> B -> D (700Mbps)
Timeline: t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13
Request: from A to D (earliest completion)
max bandwidth = 200Mbps, volume = 200Mbps x 4 time slots
earliest start = t1, latest finish = t13
26. Search Order - Time Windows
Time steps: t1 - t4 (Res 1), t4 - t6 (Res 1,2), t6 - t7 (Res 2), t7 - t9, t9 - t12 (Res 3), t12 - t13
Max bandwidth from A to D (window length in time slots in parentheses):
No.   Time window   Active reservations   Max bandwidth
1.    t1 - t4       Res 1                 900Mbps (3)
2.    t4 - t6       Res 1,2               100Mbps (2)
3.    t1 - t6       Res 1,2               100Mbps (5)
4.    t6 - t7       Res 2                 900Mbps (1)
5.    t4 - t7       Res 1,2               100Mbps (3)
6.    t1 - t7       Res 1,2               100Mbps (6)
7.    t7 - t9       (none)                900Mbps (2)
8.    t6 - t9       Res 2                 900Mbps (3)
9.    t4 - t9       Res 1,2               100Mbps (5)
10.   t1 - t9       Res 1,2               100Mbps (8)
Reservation: (A to D) (100Mbps) start=t1 end=t9
27. Search Order - Time Windows
Shortest duration?
Time steps: t1 - t4 (Res 1), t4 - t6 (Res 1,2), t6 - t7 (Res 2), t7 - t9, t9 - t12 (Res 3), t12 - t13
Max bandwidth from A to D (window length in time slots in parentheses):
No.   Time window   Active reservations   Max bandwidth
1.    t9 - t12      Res 3                 200Mbps (3)
2.    t12 - t13     (none)                900Mbps (1)
3.    t9 - t13      Res 3                 200Mbps (4)
Reservation: (A to D) (200Mbps) start=t9 end=t13
Second request:
from A to D, max bandwidth = 200Mbps
volume = 175Mbps x 4 time slots
earliest start = t1, latest finish = t13
earliest completion: (A to D) (100Mbps) start=t1 end=t8
shortest duration: (A to D) (200Mbps) start=t9 end=t12.5
28. Implementation Details
• Query (source, destination, max bandwidth, volume, max hop
count)
– Find reachable set from source to destination
– Search time windows
• If the reservation request cannot fit into the time window, skip it
• Get active reservations for the time window
• Query and obtain a value object for the time window
• Calculate max bandwidth using the value object
• Examine whether the request can be satisfied
– Return a reservation request
– Start time, end time
– Bandwidth to allocate
– Path Value (bandwidth, eng metric, hop count)
Reachable set (hop count?)
35. Summary
● A new methodology in which users submit constraints
and the system suggests possible reservation options
satisfying requirements
● Polynomial-time algorithm, where the user specifies the
total volume that needs to be transferred, a maximum
bandwidth that he/she can use, and a desired time
period within which the transfer should be done.
● Quite practical even when applied to large networks with
thousands of routers and links
● Implemented our algorithm as a new library (not specific
to OSCARS); any reservation system can use it
36. Thanks
Special Thanks to David Robertson, Mary
Thompson, Chin Guok @ ESNet
Scientific Data Management Research Group
http://sdm.lbl.gov
Mehmet Balman
mbalman@lbl.gov
sdm.lbl.gov/~balman
38. Implementation Details
• Value
– bandwidth values used to calculate path in each step
(searching time windows)
– Keeps only related link values
• ValueBucket
– Register reservation list
– Initialize with a reachable set
– Query value object by giving a set of active reservations
• Keeps the status of the topology for a specific time
interval
39. Implementation Details
• Flow
– Register graph object
– Find the reachable set with the given maximum hop count
– Load a value object
– Find maximum bandwidth from source to destination
– No unnecessary memory allocation
• Suggest
– Register graph object
– Register reservation list
– Update time window list if necessary
– Search time windows
– Suggest a reservation request for earliest completion time or
shortest duration
40. Implementation
• Graph object
• Reservation list
– Register graph
– Register reservations
• Query (source, destination, max bandwidth, volume, max hop
count)
– Find reachable set from source to destination
– Search time windows
• If the reservation request cannot fit into the time window, skip it
• Get active reservations for the time window
• Query and obtain a value object for the time window
• Calculate max bandwidth using the value object
• Examine whether the request can be satisfied
– Return a reservation request
– Start time, end time
– Bandwidth to allocate
– Path Value (bandwidth, eng metric, hop count)
41. Modular Design for easy
integration into OSCARS
• Graph object, and Reservation objects already exist in OSCARS
– No need to replace them
• Other objects need to be added to OSCARS, including:
– Time Window object,
– Flow object,
– Value Bucket object,
– Suggest object
• Using “Registration” (reference) method, not “Loading” method
– E.g. in “flow”, a new graph needs to be only registered; no need to
recreate a new object
– This approach supports modularity
43. Demo
Generated graph has 12 nodes (node1 to node12: 800Mbps available;
node1 to node5: 800Mbps available)
Reservations from node1 to node12
1) max bandwidth 500, volume 3600000 (2 hours x 500), start now
2) max bandwidth 300, volume 2160000 (2 hours x 300), start after 1 hour
3) max bandwidth 800, volume 2880000 (1 hour x 800), start after 4 hours
4) max bandwidth 200, volume 1440000 (2 hours x 200), start after 6 hours
5) max bandwidth 300, volume 2160000 (2 hours x 300), start after 7 hours
For each:
Ask for a reservation request for earliest completion time
Apply the reservation
node1 to node12 max bandwidth 700, volume 4320000 (2 hours x 600)
node1 to node5 max bandwidth 700, volume 4320000 (2 hours x 600)
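The volume figures above are consistent with bandwidth in Mbps and volume in megabits (hours x 3600 seconds x Mbps). The units are an assumption inferred from the numbers, not stated on the slide; a quick check:

```python
def volume_megabits(hours, mbps):
    """Volume moved in `hours` at a constant rate of `mbps` (assumed units)."""
    return int(hours * 3600 * mbps)


# (hours, Mbps) pairs as written on the slide.
demo_volumes = [volume_megabits(h, bw)
                for h, bw in [(2, 500), (2, 300), (1, 800), (2, 200), (2, 300)]]
query_volume = volume_megabits(2, 600)
```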
44. Demo
Reservations (max bandwidth): 500, 300, 800, 200, 300
Available bandwidth per one-hour time window:
Time window (hours)                now-1  1-2  2-3  3-4  4-5  5-6  6-7  7-8
node1 to node12                    300    0    500  800  0    800  600  300
node1 to node5 (node1 to node8)    500    200  700  800  200  800  800  500