2. Flow control problem
ss Consider file transferConsider file transfer
ss Sender sends a stream of packets representing fragments of aSender sends a stream of packets representing fragments of a
filefile
ss Sender should try to match rate at which receiver and networkSender should try to match rate at which receiver and network
can process datacan process data
ss Can’t send too slow or too fastCan’t send too slow or too fast
ss Too slowToo slow
xx wastes timewastes time
ss Too fastToo fast
xx can lead to buffer overflowcan lead to buffer overflow
ss How to find the correct rate?How to find the correct rate?
3. Other considerations
ss SimplicitySimplicity
ss OverheadOverhead
ss ScalingScaling
ss FairnessFairness
ss StabilityStability
ss Many interesting tradeoffsMany interesting tradeoffs
xx overhead for stabilityoverhead for stability
xx simplicity for unfairnesssimplicity for unfairness
4. Where?
ss Usually at transport layerUsually at transport layer
ss Also, in some cases, inAlso, in some cases, in datalinkdatalink layerlayer
5. Model
ss Source, sink, server, service rate, bottleneck, round trip timeSource, sink, server, service rate, bottleneck, round trip time
6. Classification
ss Open loopOpen loop
xx Source describes its desired flow rateSource describes its desired flow rate
xx NetworkNetwork admitsadmits callcall
xx Source sends at this rateSource sends at this rate
ss Closed loopClosed loop
xx Source monitors available service rateSource monitors available service rate
33 Explicit or implicitExplicit or implicit
xx Sends at this rateSends at this rate
xx Due to speed of light delay, errors are bound to occurDue to speed of light delay, errors are bound to occur
ss HybridHybrid
xx Source asks for some minimum rateSource asks for some minimum rate
xx But can send more, if availableBut can send more, if available
7. Open loop flow control
ss Two phases to flowTwo phases to flow
xx Call setupCall setup
xx Data transmissionData transmission
ss Call setupCall setup
xx Network prescribes parametersNetwork prescribes parameters
xx User chooses parameter valuesUser chooses parameter values
xx Network admits or denies callNetwork admits or denies call
ss Data transmissionData transmission
xx User sends within parameter rangeUser sends within parameter range
xx NetworkNetwork policespolices usersusers
xx Scheduling policies give userScheduling policies give user QoSQoS
8. Hard problems
s Choosing a descriptor at a source
s Choosing a scheduling discipline at intermediate network
elements
s Admitting calls so that their performance objectives are met (call
admission control).
9. Traffic descriptors
ss Usually anUsually an envelopeenvelope
xx Constrains worst case behaviorConstrains worst case behavior
ss Three usesThree uses
xx Basis for traffic contractBasis for traffic contract
xx Input toInput to regulatorregulator
xx Input toInput to policerpolicer
10. Descriptor requirements
ss RepresentativityRepresentativity
xx adequately describes flow, so that network does not reserveadequately describes flow, so that network does not reserve
too little or too much resourcetoo little or too much resource
ss VerifiabilityVerifiability
xx verify that descriptor holdsverify that descriptor holds
ss PreservabilityPreservability
xx Doesn’t change inside the networkDoesn’t change inside the network
ss UsabilityUsability
xx Easy to describe and use for admission controlEasy to describe and use for admission control
11. Examples
s Representative, verifiable, but not useble
xx Time series ofTime series of interarrivalinterarrival timestimes
ss Verifiable,Verifiable, preservablepreservable, and useable, but not representative, and useable, but not representative
xx peak ratepeak rate
12. Some common descriptors
ss Peak ratePeak rate
ss Average rateAverage rate
ss Linear bounded arrival processLinear bounded arrival process
13. Peak rate
ss Highest ‘rate’ at which a source can send dataHighest ‘rate’ at which a source can send data
ss Two ways to compute itTwo ways to compute it
ss For networks with fixed-size packetsFor networks with fixed-size packets
xx min inter-packet spacingmin inter-packet spacing
ss For networks with variable-size packetsFor networks with variable-size packets
xx highest rate overhighest rate over allall intervals of a particular durationintervals of a particular duration
ss Regulator for fixed-size packetsRegulator for fixed-size packets
xx timer set on packet transmissiontimer set on packet transmission
xx if timer expires, send packet, if anyif timer expires, send packet, if any
ss ProblemProblem
xx sensitive to extremessensitive to extremes
14. Average rate
ss Rate over some time period (Rate over some time period (windowwindow))
ss Less susceptible toLess susceptible to outliersoutliers
ss Parameters:Parameters: tt andand aa
ss Two types: jumping window and moving windowTwo types: jumping window and moving window
ss Jumping windowJumping window
xx over consecutive intervals of lengthover consecutive intervals of length tt, only, only aa bits sentbits sent
xx regulator reinitializes every intervalregulator reinitializes every interval
ss Moving windowMoving window
xx over all intervals of lengthover all intervals of length t,t, onlyonly aa bits sentbits sent
xx regulator forgets packet sent more thanregulator forgets packet sent more than tt seconds agoseconds ago
15. Linear Bounded Arrival Process
ss Source bounds # bits sent in any time interval by a linearSource bounds # bits sent in any time interval by a linear
function of timefunction of time
s the number of bits transmitted in any active interval of length t is
less than rt + s
s r is the long term rate
s s is the burst limit
s insensitive to outliers
16. Leaky bucket
ss A regulator for an LBAPA regulator for an LBAP
ss Token bucket fills up at rateToken bucket fills up at rate rr
ss Largest # tokens <Largest # tokens < ss
17. Variants
ss Token and data bucketsToken and data buckets
xx Sum is what mattersSum is what matters
ss Peak rate regulatorPeak rate regulator
18. Choosing LBAP parameters
ss Tradeoff betweenTradeoff between rr andand ss
ss Minimal descriptorMinimal descriptor
xx doesn’t simultaneously have smallerdoesn’t simultaneously have smaller rr andand ss
xx presumably costs lesspresumably costs less
ss How to choose minimal descriptor?How to choose minimal descriptor?
ss Three way tradeoffThree way tradeoff
xx choice ofchoice of ss (data bucket size)(data bucket size)
xx loss rateloss rate
xx choice ofchoice of rr
19. Choosing minimal parameters
ss Keeping loss rate the sameKeeping loss rate the same
xx ifif ss is more,is more, rr is less (smoothing)is less (smoothing)
xx for eachfor each rr we have leastwe have least ss
ss Choose knee of curveChoose knee of curve
20. LBAP
ss Popular in practice and in academiaPopular in practice and in academia
xx sort of representativesort of representative
xx verifiableverifiable
xx sort ofsort of preservablepreservable
xx sort of usablesort of usable
ss Problems with multiple time scale trafficProblems with multiple time scale traffic
xx large burst messes up thingslarge burst messes up things
21. Open loop vs. closed loop
ss Open loopOpen loop
xx describe trafficdescribe traffic
xx network admits/reserves resourcesnetwork admits/reserves resources
xx regulation/policingregulation/policing
ss Closed loopClosed loop
xx can’t describe traffic or network doesn’t support reservationcan’t describe traffic or network doesn’t support reservation
xx monitor available bandwidthmonitor available bandwidth
33 perhaps allocated using GPS-emulationperhaps allocated using GPS-emulation
xx adapt to itadapt to it
xx if not done properly eitherif not done properly either
33 too much losstoo much loss
33 unnecessary delayunnecessary delay
22. Taxonomy
ss First generationFirst generation
xx ignores network stateignores network state
xx only match receiveronly match receiver
ss Second generationSecond generation
xx responsive to stateresponsive to state
xx three choicesthree choices
33 State measurementState measurement
•• explicit or implicitexplicit or implicit
33 ControlControl
•• flow control window size or rateflow control window size or rate
33 Point of controlPoint of control
•• endpoint or within networkendpoint or within network
23. Explicit vs. Implicit
ss ExplicitExplicit
xx Network tells source its current rateNetwork tells source its current rate
xx Better controlBetter control
xx More overheadMore overhead
ss ImplicitImplicit
xx Endpoint figures out rate by looking at networkEndpoint figures out rate by looking at network
xx Less overheadLess overhead
ss Ideally, want overhead of implicit with effectiveness of explicitIdeally, want overhead of implicit with effectiveness of explicit
24. Flow control window
ss Recall error control windowRecall error control window
ss Largest number of packet outstanding (sent but notLargest number of packet outstanding (sent but not ackedacked))
ss If endpoint has sent all packets in window, it must wait => slowsIf endpoint has sent all packets in window, it must wait => slows
down its ratedown its rate
ss Thus, window providesThus, window provides bothboth error control and flow controlerror control and flow control
ss This is calledThis is called transmissiontransmission windowwindow
ss Coupling can be a problemCoupling can be a problem
xx Few buffers are receiver => slow rate!Few buffers are receiver => slow rate!
25. Window vs. rate
ss In adaptive rate, we directly control rateIn adaptive rate, we directly control rate
ss Needs a timer per connectionNeeds a timer per connection
ss Plusses for windowPlusses for window
xx no need for fine-grained timerno need for fine-grained timer
xx self-limitingself-limiting
ss Plusses for ratePlusses for rate
xx better control (finer grain)better control (finer grain)
xx no coupling of flow control and error controlno coupling of flow control and error control
ss Rate control must be careful to avoid overhead and sending tooRate control must be careful to avoid overhead and sending too
muchmuch
26. Hop-by-hop vs. end-to-end
ss Hop-by-hopHop-by-hop
xx first generation flow control at each linkfirst generation flow control at each link
33 next server = sinknext server = sink
xx easy to implementeasy to implement
ss End-to-endEnd-to-end
xx sender matches all the servers on its pathsender matches all the servers on its path
ss Plusses for hop-by-hopPlusses for hop-by-hop
xx simplersimpler
xx distributes overflowdistributes overflow
xx better controlbetter control
ss Plusses for end-to-endPlusses for end-to-end
xx cheapercheaper
27. On-off
s Receiver gives ON and OFF signals
s If ON, send at full speed
s If OFF, stop
s OK when RTT is small
s What if OFF is lost?
s Bursty
s Used in serial lines or LANs
28. Stop and Wait
ss Send a packetSend a packet
ss Wait forWait for ackack before sending next packetbefore sending next packet
29. Static window
ss Stop and wait can send at most oneStop and wait can send at most one pktpkt per RTTper RTT
ss Here, we allow multiple packets per RTT (= transmissionHere, we allow multiple packets per RTT (= transmission
window)window)
30. What should window size be?
s Let bottleneck service rate along path = b pkts/sec
s Let round trip time = R sec
s Let flow control window = w packet
s Sending rate is w packets in R seconds = w/R
s To use bottleneck w/R > b => w > bR
s This is the bandwidth delay product or optimal window size
31. Static window
ss Works well if b and R are fixedWorks well if b and R are fixed
ss But, bottleneck rate changes with time!But, bottleneck rate changes with time!
ss Static choice of w can lead to problemsStatic choice of w can lead to problems
xx too smalltoo small
xx too largetoo large
ss So, need to adapt windowSo, need to adapt window
ss Always try to get to theAlways try to get to the currentcurrent optimal valueoptimal value
32. DECbit flow control
ss IntuitionIntuition
xx every packet has a bit in headerevery packet has a bit in header
xx intermediate routers set bit if queue has built up => sourceintermediate routers set bit if queue has built up => source
window is too largewindow is too large
xx sink copies bit tosink copies bit to ackack
xx if bits set, source reduces window sizeif bits set, source reduces window size
xx in steady state, oscillate around optimal sizein steady state, oscillate around optimal size
33. DECbit
ss When do bits get set?When do bits get set?
ss How does a source interpret them?How does a source interpret them?
34. DECbit details: router actions
ss MeasureMeasure demanddemand and mean queue length of each source
s Computed over queue regeneration cycles
s Balance between sensitivity and stability
35. Router actions
ss If mean queue length > 1.0If mean queue length > 1.0
xx set bits on sources whose demand exceeds fair shareset bits on sources whose demand exceeds fair share
ss If it exceeds 2.0If it exceeds 2.0
xx set bits on everyoneset bits on everyone
xx panic!panic!
36. Source actions
ss Keep track of bitsKeep track of bits
ss Can’t take control actions too fast!Can’t take control actions too fast!
ss Wait for past change to take effectWait for past change to take effect
ss Measure bits over past + present window sizeMeasure bits over past + present window size
ss If more than 50% set, then decrease window, else increaseIf more than 50% set, then decrease window, else increase
ss Additive increase,Additive increase, multiplicativemultiplicative decreasedecrease
37. Evaluation
ss Works with FIFOWorks with FIFO
xx but requires per-connection state (demand)but requires per-connection state (demand)
ss SoftwareSoftware
ss ButBut
xx assumes cooperation!assumes cooperation!
xx conservative window increase policyconservative window increase policy
39. TCP Flow Control
ss ImplicitImplicit
ss Dynamic windowDynamic window
ss End-to-endEnd-to-end
ss Very similar toVery similar to DECbitDECbit, but, but
xx no support from routersno support from routers
xx increase if no loss (usually detected using timeout)increase if no loss (usually detected using timeout)
xx window decrease on a timeoutwindow decrease on a timeout
xx additive increaseadditive increase multiplicativemultiplicative decreasedecrease
40. TCP details
ss Window starts at 1Window starts at 1
ss Increases exponentially for a while, then linearlyIncreases exponentially for a while, then linearly
ss Exponentially => doubles every RTTExponentially => doubles every RTT
ss Linearly => increases by 1 every RTTLinearly => increases by 1 every RTT
ss During exponential phase, everyDuring exponential phase, every ackack results in window increaseresults in window increase
by 1by 1
ss During linear phase, window increases by 1 when #During linear phase, window increases by 1 when # acksacks ==
window sizewindow size
ss Exponential phase is calledExponential phase is called slow startslow start
ss Linear phase is calledLinear phase is called congestion avoidancecongestion avoidance
41. More TCP details
ss On a loss, current window size is stored in a variable calledOn a loss, current window size is stored in a variable called slowslow
start thresholdstart threshold oror ssthreshssthresh
ss Switch from exponential to linear (slow start to congestionSwitch from exponential to linear (slow start to congestion
avoidance) when window size reaches thresholdavoidance) when window size reaches threshold
ss Loss detected either with timeout orLoss detected either with timeout or fast retransmitfast retransmit (duplicate(duplicate
cumulativecumulative acksacks))
ss Two versions of TCPTwo versions of TCP
xx Tahoe: in both cases, drop window to 1Tahoe: in both cases, drop window to 1
xx Reno: on timeout, drop window to 1, and on fast retransmitReno: on timeout, drop window to 1, and on fast retransmit
drop window to half previous size (also, increase window ondrop window to half previous size (also, increase window on
subsequentsubsequent acksacks))
42. TCP vs. DECbit
ss Both use dynamic window flow control and additive-increaseBoth use dynamic window flow control and additive-increase
multiplicativemultiplicative decreasedecrease
ss TCP uses implicit measurement of congestionTCP uses implicit measurement of congestion
xx probe a black boxprobe a black box
ss Operates at theOperates at the cliffcliff
ss Source does not filter informationSource does not filter information
43. Evaluation
ss Effective over a wide range of bandwidthsEffective over a wide range of bandwidths
ss A lot of operational experienceA lot of operational experience
ss WeaknessesWeaknesses
xx loss => overload? (wireless)loss => overload? (wireless)
xx overload => self-blame, problem with FCFSoverload => self-blame, problem with FCFS
xx ovelroadovelroad detected only on a lossdetected only on a loss
33 in steady state, sourcein steady state, source inducesinduces lossloss
xx needs at leastneeds at least bRbR/3 buffers per connection/3 buffers per connection
45. TCP Vegas
ss Expected throughput =Expected throughput =
transmission_window_size/propagation_delaytransmission_window_size/propagation_delay
ss Numerator: knownNumerator: known
ss Denominator: measureDenominator: measure smallestsmallest RTT
s Also know actual throughput
s Difference = how much to reduce/increase rate
s Algorithm
xx send a special packetsend a special packet
xx onon ackack, compute expected and actual throughput, compute expected and actual throughput
xx (expected - actual)* RTT packets in bottleneck buffer(expected - actual)* RTT packets in bottleneck buffer
xx adjust sending rate if this is too largeadjust sending rate if this is too large
ss Works better than TCP RenoWorks better than TCP Reno
46. NETBLT
s First rate-based flow control scheme
s Separates error control (window) and flow control (no coupling)
s So, losses and retransmissions do not affect the flow rate
s Application data sent as a series of buffers, each at a particular
rate
s Rate = (burst size + burst rate) so granularity of control = burst
s Initially, no adjustment of rates
s Later, if received rate < sending rate, multiplicatively decrease
rate
s Change rate only once per buffer => slow
47. Packet pair
ss Improves basic ideas in NETBLTImproves basic ideas in NETBLT
xx better measurement of bottleneckbetter measurement of bottleneck
xx control based on predictioncontrol based on prediction
xx finer granularityfiner granularity
ss Assume all bottlenecks serve packets in round robin orderAssume all bottlenecks serve packets in round robin order
ss Then, spacing between packets at receiver (=Then, spacing between packets at receiver (= ackack spacing) =spacing) =
1/(rate of slowest server)1/(rate of slowest server)
ss IfIf allall data sent as paired packets, no distinction between datadata sent as paired packets, no distinction between data
and probesand probes
ss Implicitly determine service rates if servers are round-robin-likeImplicitly determine service rates if servers are round-robin-like
49. Packet-pair details
ss AcksAcks give time series of service rates in the pastgive time series of service rates in the past
ss We can use this to predict the next rateWe can use this to predict the next rate
ss ExponentialExponential averageraverager, with fuzzy rules to change the averaging, with fuzzy rules to change the averaging
factorfactor
ss Predicted rate feeds into flow control equationPredicted rate feeds into flow control equation
50. Packet-pair flow control
s Let X = # packets in bottleneck buffer
s S = # outstanding packets
s R = RTT
s b = bottleneck rate
s Then, X = S - Rb (assuming no losses)
s Let l = source rate
s l(k+1) = b(k+1) + (setpoint -X)/R
52. ATM Forum EERC
ss Similar toSimilar to DECbitDECbit, but send a whole cell’s worth of info instead, but send a whole cell’s worth of info instead
of one bitof one bit
ss Sources periodically send a Resource Management (RM) cellSources periodically send a Resource Management (RM) cell
with awith a rate requestrate request
xx typically once every 32 cellstypically once every 32 cells
ss Each server fills in RM cell with current share, if lessEach server fills in RM cell with current share, if less
ss Source sends at this rateSource sends at this rate
53. ATM Forum EERC details
ss Source sends Explicit Rate (ER) in RM cellSource sends Explicit Rate (ER) in RM cell
ss Switches compute source share in an unspecified mannerSwitches compute source share in an unspecified manner
(allows competition)(allows competition)
ss Current rate = allowed cell rate = ACRCurrent rate = allowed cell rate = ACR
ss If ER > ACR then ACR = ACR + RIF * PCR else ACR = ERIf ER > ACR then ACR = ACR + RIF * PCR else ACR = ER
ss If switch does not change ER, then useIf switch does not change ER, then use DECbitDECbit ideaidea
xx If CI bit set, ACR = ACR (1 - RDF)If CI bit set, ACR = ACR (1 - RDF)
ss If ER < AR, AR = ERIf ER < AR, AR = ER
ss Allows interoperability of a sortAllows interoperability of a sort
ss If idle 500 ms, reset rate to Initial cell rateIf idle 500 ms, reset rate to Initial cell rate
ss If no RM cells return for a while, ACRIf no RM cells return for a while, ACR *=*= (1-RDF)(1-RDF)
54. Comparison with DECbit
ss Sources know exact rateSources know exact rate
ss Non-zero Initial cell-rate => conservative increase can beNon-zero Initial cell-rate => conservative increase can be
avoidedavoided
ss Interoperation between ER/CI switchesInteroperation between ER/CI switches
55. Problems
s RM cells in data path a mess
s Updating sending rate based on RM cell can be hard
s Interoperability comes at the cost of reduced efficiency (as bad
as DECbit)
s Computing ER is hard
56. Comparison among closed-loop schemes
ss On-off, stop-and-wait, static window,On-off, stop-and-wait, static window, DECbitDECbit, TCP, NETBLT,, TCP, NETBLT,
Packet-pair, ATM Forum EERCPacket-pair, ATM Forum EERC
ss Which is best? No simple answerWhich is best? No simple answer
ss Some rules of thumbSome rules of thumb
xx flow control easier with RR schedulingflow control easier with RR scheduling
33 otherwise, assume cooperation, or police ratesotherwise, assume cooperation, or police rates
xx explicit schemes are more robustexplicit schemes are more robust
xx hop-by-hop schemes are morehop-by-hop schemes are more resposiveresposive, but more, but more complescomples
xx try to separate error control and flow controltry to separate error control and flow control
xx rate based schemes are inherently unstable unless well-rate based schemes are inherently unstable unless well-
engineeredengineered
57. Hybrid flow control
ss Source gets a minimum rate, but can use moreSource gets a minimum rate, but can use more
ss All problems of both open loop and closed loop flow controlAll problems of both open loop and closed loop flow control
ss Resource partitioning problemResource partitioning problem
xx what fraction can be reserved?what fraction can be reserved?
xx how?how?