TCP Issues in Virtualized Datacenter
Networks
Hemanth Kumar Mantri
Department of Computer Science
1 of 27
Selected Papers
• The TCP Outcast Problem: Exposing
Unfairness in Data Center Networks.
– NSDI’12
• vSnoop: Improving TCP Throughput in
Virtualized Environments via Ack Offload.
– ACM/IEEE SC, 2010
2 of 27
Background and Motivation
• Data center is a shared environment
– Multi Tenancy
• Virtualization: A key enabler of cloud
computing
– Amazon EC2
• Resource sharing
– CPU/Memory are strictly shared
– Network sharing largely laissez-faire
3 of 27
Data Center Networks
• Flows compete via TCP
• Ideally, TCP should achieve true fairness
– All flows get equal share of link capacity
• In practice, TCP exhibits RTT-bias
– Throughput is inversely proportional to RTT
• 2 Major Issues
– Unfairness (in general)
– Low Throughput (in virtualized environments)
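The RTT bias above can be sketched with the well-known Mathis approximation for steady-state TCP throughput. This model is not part of the slides; the segment size, RTTs and loss rate below are assumed values chosen only to show that halving the RTT doubles the estimated throughput.

```python
import math

def mathis_throughput(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput in bytes/s.

    Uses the Mathis et al. approximation: rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(p).
    """
    return (mss_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)

# Assumed illustrative values: same loss rate, one flow has half the RTT.
short_rtt = mathis_throughput(mss_bytes=1460, rtt_s=0.0002, loss_rate=0.01)
long_rtt = mathis_throughput(mss_bytes=1460, rtt_s=0.0004, loss_rate=0.01)
ratio = short_rtt / long_rtt
print(f"short-RTT flow gets {ratio:.1f}x the throughput of the long-RTT flow")
```

Under this model the low-RTT flow should win, which is why the outcast result on the following slides is surprising.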
4 of 27
Datacenter Topology (Hierarchical)
5 of 27
Traffic Pattern: Many to One
6 of 27
Key Finding: Unfairness
Inverse RTT bias?
Low RTT = low throughput (the opposite of the usual RTT bias)
7 of 27
Further Investigation
[Figure: flow throughput, two panels: Instantaneous and Average]
2-hop flow is consistently starved!!
TCP Outcast Problem
• Some Flows are ‘Outcast’ed and receive very low
throughput compared to others
• Almost an order of magnitude reduction in some
cases
8 of 27
Experiments
• Same RTTs
• Same Hop Length
• Unsynchronized Flows
• Introduce Background Traffic
• Vary Switch Buffer Size
• Vary TCP
– RENO, MP-TCP, BIC, Cubic + SACK
• Unfairness Persists!
9 of 27
Observation
Flow differential at input ports is the culprit!
10 of 27
Vary the number of flows competing at the bottleneck
switch
11 of 27
Reason: Port Blackout
1. Packets are roughly same size
2. Similar inter-arrival rates (Predictable Timing)
12 of 27
Port Blackout
• Can occur on any input port
• Happens for small intervals of time
• Has more catastrophic effect on
throughput of fewer flows!!
– Experiments showed that the same number of
packet drops hurts throughput much more when
it falls on a few flows than when it is spread
over several concurrent flows.
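A minimal sketch of why the same blackout hurts a small set of flows more. It assumes the flows on one input port interleave round-robin and that a blackout drops a fixed run of consecutive packets from that port; the function name and the numbers are illustrative, not taken from the paper.

```python
from collections import Counter

def drops_per_flow(num_flows: int, blackout_len: int) -> Counter:
    """Spread `blackout_len` consecutive port-level drops over the flows
    sharing that input port (assumed to interleave round-robin)."""
    losses = Counter()
    for pkt in range(blackout_len):
        losses[pkt % num_flows] += 1
    return losses

# Same blackout length on both ports (assumed: 6 consecutive drops).
few = drops_per_flow(num_flows=1, blackout_len=6)
many = drops_per_flow(num_flows=12, blackout_len=6)
print("worst-hit flow, 1 flow on the port:  ", max(few.values()), "drops")
print("worst-hit flow, 12 flows on the port:", max(many.values()), "drops")
```

A single flow that loses a whole run of consecutive packets is far more likely to hit a retransmission timeout than twelve flows that lose one packet each.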
13 of 27
Conditions for TCP Outcast
14 of 27
Solutions?
• Stochastic Fair Queuing (SFQ)
– Explicitly enforce fairness among flows
– Expensive for commodity switches
• Equal Length Routing
– All flows are forced to go through Core
– Better interleaving of packets, alleviates port blackout
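A simplified sketch of the idea behind Stochastic Fair Queuing: flows are hashed into buckets and the output serves the buckets round-robin, so one busy input port cannot monopolize the queue. The class name, bucket count and per-bucket limit are assumptions for illustration; real SFQ implementations differ in detail (for example, they periodically perturb the hash and handle overflow differently).

```python
from collections import deque

class StochasticFairQueue:
    """Toy SFQ: hash each flow to a bucket, serve buckets round-robin."""

    def __init__(self, num_buckets: int = 8, per_bucket_limit: int = 4):
        self.buckets = [deque() for _ in range(num_buckets)]
        self.limit = per_bucket_limit
        self.next_bucket = 0

    def enqueue(self, flow_id: int, packet) -> bool:
        """Return False if the flow's bucket is full (packet dropped)."""
        bucket = self.buckets[hash(flow_id) % len(self.buckets)]
        if len(bucket) >= self.limit:
            return False
        bucket.append((flow_id, packet))
        return True

    def dequeue(self):
        """Return the next packet, visiting non-empty buckets in turn."""
        n = len(self.buckets)
        for i in range(n):
            idx = (self.next_bucket + i) % n
            if self.buckets[idx]:
                self.next_bucket = (idx + 1) % n
                return self.buckets[idx].popleft()
        return None

q = StochasticFairQueue()
accepted = [q.enqueue(0, i) for i in range(6)]  # aggressive flow 0
q.enqueue(1, "x")                               # light flow 1
order = [q.dequeue() for _ in range(3)]
print(accepted, order)
```

Flow 0 only ever overflows its own bucket, and flow 1 is served after at most one of flow 0's packets.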
15 of 27
VM Consolidation
• Multiple VMs hosted by one physical host
• Multiple VMs sharing the same core
– Flexibility, scalability, and economy
[Figure: VM 1, VM 2, VM 3 and VM 4 on a Virtualization Layer over shared Hardware]
Observation:
VM consolidation negatively
impacts network performance!
16 of 27
Investigating the Problem
[Figure: experiment setup. Labels: Sender, Client, Server; VM 1, VM 2, VM 3; Virtualization Layer; Hardware]
17 of 27
Effect of CPU Sharing
[Figure: RTT (ms) versus Number of VMs (2 to 5); RTT axis from 40 to 180 ms]
RTT increases in proportion to the VM scheduling slice (30 ms)
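A back-of-the-envelope sketch of how CPU sharing adds to the RTT. It assumes simple round-robin scheduling in which every other VM on the core uses its full 30 ms slice before the receiving VM runs again; this is a simplification for illustration, not the measured curve from the slide.

```python
def worst_case_wait_ms(num_vms: int, slice_ms: float = 30.0) -> float:
    """Worst-case time a packet waits for its VM to be scheduled,
    assuming round-robin and that every other VM uses its full slice."""
    return (num_vms - 1) * slice_ms

for n in range(2, 6):
    print(f"{n} VMs sharing a core: up to {worst_case_wait_ms(n):.0f} ms added to the RTT")
```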
18 of 27
Exact Culprit
[Figure: Sender; Hardware; Driver Domain (dom0) with the Device Driver and one buffer (buf) per VM; VM 1, VM 2, VM 3]
19 of 27
Impact on TCP Throughput
[Figure: TCP throughput; legend: + dom0, x VM]
A connection to the VM is much
slower than a connection to dom0!
20 of 27
Solution: vSnoop
• Alleviates the negative effect of VM scheduling on
TCP throughput
• Implemented within the driver domain to
accelerate TCP connections
• Does not require any modifications to the VM
• Does not violate end-to-end TCP semantics
• Applicable across a wide range of VMMs
– Xen, VMware, KVM, etc.
21 of 27
TCP Connection to a VM
[Figure: timeline with columns Sender, Driver Domain, VM1 Buffer and Scheduled VM (VM1, VM2, VM3 in turn); packets shown: SYN and SYN,ACK; annotations: RTT, VM Scheduling Latency]
Sender establishes a TCP
connection to VM1
22 of 27
Key Idea: Acknowledgement Offload
[Figure: timeline with columns Sender, Driver Domain, VM Shared Buffer and Scheduled VM (VM1, VM2, VM3 in turn); packets shown: SYN and SYN,ACK; annotation: w/ vSnoop]
Faster progress during
TCP slow start
23 of 27
Challenges
• Challenge 1: Out-of-order/special packets (SYN, FIN packets)
– Solution: Let the VM handle these packets
• Challenge 2: Packet loss after vSnoop
– Solution: Let vSnoop acknowledge only if room in buffer
• Challenge 3: ACKs generated by the VM
– Solution: Suppress/rewrite ACKs already generated by vSnoop
24 of 27
vSnoop Implementation in Xen
[Figure: Driver Domain (dom0) with Bridge, vSnoop, and one Netback with a buffer (buf) per VM; VM1, VM2, VM3 each with a Netfront; "Tuning" annotation on Netfront]
25 of 27
TCP Throughput Improvement
• 3 VMs consolidated, 1000 transfers of a 100KB file
• Vanilla Xen, Xen+tuning, Xen+tuning+vSnoop
[Figure: TCP throughput; legend: + Vanilla Xen, x Xen+tuning, * Xen+tuning+vSnoop]
Median: 0.192MB/s, 0.778MB/s, 6.003MB/s
30x Improvement
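A quick arithmetic check of the "30x" claim. It assumes the three medians are listed in the same order as the configurations (Vanilla Xen, Xen+tuning, Xen+tuning+vSnoop), which the slide layout suggests but does not state explicitly.

```python
# Median throughputs from the slide, mapped to configurations in listed order
# (the mapping is an assumption).
medians_mb_s = {
    "Vanilla Xen": 0.192,
    "Xen+tuning": 0.778,
    "Xen+tuning+vSnoop": 6.003,
}

baseline = medians_mb_s["Vanilla Xen"]
speedups = {name: value / baseline for name, value in medians_mb_s.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x of vanilla Xen")
```

Under that mapping the ratio of the largest to the smallest median is a little over 31, consistent with the rounded "30x".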
26 of 27
Thank You!
• References
– http://friends.cs.purdue.edu/dokuwiki/doku.php
– https://www.usenix.org/conference/nsdi12/tech-schedule/technical-sessions
• Most animations and pictures are taken from
the authors’ original slides and NSDI’12
conference talk.
27 of 27
BACKUP SLIDES
28
Conditions for Outcast
• Switches use the tail-drop queue
management discipline
• A large set of flows and a small set of
flows arriving at two different input ports
compete for a bottleneck output port at a
switch
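The two conditions above can be illustrated with a toy tail-drop queue. The sketch assumes a full output queue that drains one packet per tick while two input ports each offer one packet per tick, with one port always arriving marginally earlier during the window; the function and parameter names are hypothetical.

```python
def tail_drop_window(ticks: int, capacity: int, first_port: str = "A") -> dict:
    """Count drops per input port over a window of `ticks`.

    Each tick the output link sends one packet, then ports A and B each
    offer one packet. `first_port` always arrives slightly earlier
    (assumed fixed timing for the duration of the window).
    """
    queue_len = capacity  # congested bottleneck: queue starts full
    drops = {"A": 0, "B": 0}
    order = ("A", "B") if first_port == "A" else ("B", "A")
    for _ in range(ticks):
        queue_len -= 1                # one slot frees up
        for port in order:
            if queue_len < capacity:  # room left: enqueue
                queue_len += 1
            else:                     # queue full: tail-drop
                drops[port] += 1
    return drops

print(tail_drop_window(ticks=10, capacity=8))                  # B is blacked out
print(tail_drop_window(ticks=10, capacity=8, first_port="B"))  # A is blacked out
```

Whichever port arrives second loses every packet in the window, which is the consecutive-drop pattern the slides call port blackout.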
29
Why does Unfairness Matter?
• Multi Tenant Clouds
– Some tenants get better performance than
others
• Map Reduce Apps
– Straggler problems
– One delayed flow affects overall job
completion
30
State Machine Maintained Per-Flow
[Figure: per-flow state machine]
• States: Start, Active (online), No buffer (offline), Unexpected Sequence
• Transition labels: Packet recv; In-order pkt, buffer space available; In-order pkt, no buffer; Out-of-order packet
• Active (online): early acknowledgements for in-order packets
• No buffer (offline): don't acknowledge
• Unexpected Sequence: pass out-of-order pkts to VM
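A minimal sketch of such a per-flow state machine, reconstructed from the state and transition labels on the slide. The class, method and field names are hypothetical, and details such as how the expected sequence number is resynchronized are simplified assumptions rather than the actual vSnoop implementation.

```python
from enum import Enum

class State(Enum):
    START = "start"
    ACTIVE = "active (online)"
    NO_BUFFER = "no buffer (offline)"
    UNEXPECTED_SEQ = "unexpected sequence"

class FlowTracker:
    """Tracks one TCP flow and decides whether to send an early ACK."""

    def __init__(self, expected_seq: int):
        self.state = State.START
        self.expected_seq = expected_seq

    def on_packet(self, seq: int, length: int, buffer_free: bool) -> bool:
        """Return True if an early ACK should be sent for this packet."""
        if seq != self.expected_seq:
            # Out-of-order packet: pass it to the VM, do not acknowledge.
            self.state = State.UNEXPECTED_SEQ
            return False
        if not buffer_free:
            # In-order but no room in the VM's buffer: do not acknowledge.
            self.state = State.NO_BUFFER
            return False
        # In-order and buffer space available: acknowledge early.
        self.expected_seq += length
        self.state = State.ACTIVE
        return True

flow = FlowTracker(expected_seq=1000)
decisions = [
    flow.on_packet(1000, 100, buffer_free=True),   # in-order, room
    flow.on_packet(1300, 100, buffer_free=True),   # out-of-order
    flow.on_packet(1100, 100, buffer_free=False),  # in-order, no room
    flow.on_packet(1100, 100, buffer_free=True),   # retransmit, room
]
print(decisions, flow.state)
```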
31
vSnoop’s Impact on TCP Flows
• Slow Start
– Early acknowledgements help connections
progress faster
– Most significant benefit for short transfers that are
more prevalent in data centers
• Congestion Avoidance and Fast Retransmit
– Large flows in the steady state can also benefit
from vSnoop
– Benefit not as much as for Slow Start
32
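A rough sketch of why short transfers benefit most. It counts the slow-start rounds needed for a 100KB transfer and multiplies by the RTT seen by the sender. The initial window, segment size, 1 ms network RTT and 60 ms scheduling delay are assumed values, and losses are ignored.

```python
import math

def slow_start_rounds(total_segments: int, initial_cwnd: int = 2) -> int:
    """Number of RTT rounds to send `total_segments` when the congestion
    window doubles every round (idealized slow start, no losses)."""
    cwnd, sent, rounds = initial_cwnd, 0, 0
    while sent < total_segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds

segments = math.ceil(100_000 / 1460)  # 100KB file, assumed 1460-byte segments
rounds = slow_start_rounds(segments)

network_rtt_ms = 1.0   # assumed
sched_delay_ms = 60.0  # assumed VM scheduling latency per round trip
without_vsnoop_ms = rounds * (network_rtt_ms + sched_delay_ms)
with_vsnoop_ms = rounds * network_rtt_ms
print(rounds, without_vsnoop_ms, with_vsnoop_ms)
```

Because every slow-start round costs one RTT, removing the scheduling latency from each acknowledgement shortens the whole transfer by roughly the same factor.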
