LinkedIn’s Approach to Programmable Data Center
Shawn Zandi
Principal Architect
Infrastructure Engineering
LinkedIn Infrastructure
• Infrastructure architecture based on application’s behavior & requirements
• Pre-planned static topology
• Single operator
• Single tenant with many applications
• As opposed to multi-tenant with different (or unknown) needs
• 34% annual infrastructure growth and close to half a billion users
Edge Network to Eyeballs
Backbone Network
Bare Metal
Operating System
Container
Application
End to End Network Design
Data Center Network
End-to-end control enables us to tackle problems at different parts of the stack:
application code, OS, network or client software, and to solve them by architecture…
Bare Metal
Operating System
Container
Application
Traffic Demands
• High intra and inter-DC bandwidth demand due to organic growth
• Every single byte of member activity creates thousands of bytes of east-west traffic inside
the data center:
• Application Call Graph
• Metrics, Analytics and Tracking via Kafka
• Hadoop and Offline Jobs
• Machine Learning
• Data Replications
• Search and Indexing
• Ads, recruiting solutions, etc.
Scaling Out Data Centers Network - Hardware
• White-box Switches (ODM)
• Vendor-Based Switches (OEM)
• Based on Merchant Silicon
• Big Chassis Switches
• Designed around robustness (NSR, ISSU, etc.)
• Feature-rich but mostly irrelevant to LinkedIn needs
Project Falco
Data centers were designed around redundant chassis at the core,
controlling and forwarding east-west and north-south traffic
Chassis Free Data Center
[Diagram: four pods (W, X, Y, Z), each drawn with leaf and spine switches, interconnected by four spine fabrics (Fabric 1 to Fabric 4). 4,096 x 100G ports, non-blocking, scale-out.]
Why No Chassis?
• Robust-yet-Fragile
• Complex due to NSR, ISSU, feature-sets, etc.
• Larger fault domain, Fail-over/Fail-back
• Non-deterministic boot-up process and long upgrade procedures
• Moved complexity out of big boxes to where we can manage and control it, to our advantage!
• Better control of and visibility into internals by removing black-box abstraction!
• Same Switch SKU on ToR, Leaf and Spine (Entire DC)
• Single chipset uniform IO design (same bandwidth, latency and buffering)
• True 5-stage Clos topology with deterministic latency
• Dedicated control plane, OAM and CPU for each ASIC
Distributed Control Plane Complexity
[Diagram: fabric planes W, X, Y, Z connecting Pod 1 (nodes 1 to 32), Pod 11 (nodes 321 to 352), Pod 21 (nodes 641 to 672) and Pod 31 (nodes 961 to 992) to upper-tier nodes numbered in the 2048 to 2171 and 2304 to 2403 ranges.]
“Fabric wide visibility and telemetry”
The wider the fabric, the more difficult flow tracking and fault isolation become
Problem 1
“Fabric wide traffic distribution and packet scheduling!”
Forwarding is different from routing, and is out of scope for routing protocols.
Problem 2
We need a robust and scalable control protocol designed for a data center fabric
Control Plane :: Routing
• Routing protocols provide destination-based reachability information
• Routing protocols are not traffic aware.
• Best path selection is elementary.
• Network graph is built based on a series of ECMP groups
“Routing protocols are more about the destination than the journey”
ECMP forwarding simply does not cut it!
Problem #3
ECMP is not really equal!
• Elephants and mice issue
• ECMP Hashing is not bandwidth aware. Devices use an algorithm to
distribute traffic amongst links regardless of load.
• Traffic is routed using the shortest path, not all the available paths,
hence not maximizing all the available capacity. Some links may
suffer while others may be underutilized.
• Flows stick to a certain path, as hashing is performed per flow. An
established socket cannot be moved to a different path easily!
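
As a rough illustration of the bullets above, the sketch below hashes each flow's 5-tuple onto one of several equal-cost links, which is how per-flow ECMP typically behaves. The hash function (CRC32), the link count, and the traffic mix are assumptions chosen for illustration; they are not taken from the talk or from any particular switch ASIC.

```python
import zlib
from collections import defaultdict

def ecmp_link(flow, num_links):
    """Pick a link by hashing the flow's 5-tuple; link load is never consulted."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_links

def distribute(flows, num_links):
    """Return the bytes carried per link for a dict of {5-tuple: bytes}."""
    load = defaultdict(int)
    for flow, size in flows.items():
        load[ecmp_link(flow, num_links)] += size
    return [load[i] for i in range(num_links)]

# Hypothetical traffic mix: 100 mice (10 KB each) and 2 elephants (50 MB each).
# Tuple layout: (src IP, dst IP, protocol, src port, dst port).
flows = {("10.0.0.1", "10.0.1.1", 6, 40000 + i, 443): 10_000 for i in range(100)}
flows[("10.0.0.2", "10.0.1.2", 6, 50000, 9092)] = 50_000_000
flows[("10.0.0.3", "10.0.1.3", 6, 50001, 9092)] = 50_000_000

per_link = distribute(flows, num_links=4)
print(per_link)
```

Because the hash ignores flow size and current load, each elephant lands entirely on one link and stays there for its lifetime, so at least one link carries 50 MB or more while others may carry only a few mice.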
“We need a robust and scalable fabric-wide forwarding policy”
Problem #4
Lack of Centralized Policy and Control
• The more parallel links you add, the more random the forwarding decision
becomes.
• Devices were configured and maintained individually
• Routing/Forwarding policy management tasks are performed
individually and hop by hop.
• Know when/where to centralize or distribute to scale out!
“End to End Path Selection & Control”
No application, protocol or packet can dictate a path
Centralized flow based routing does not scale!
Problem #5
“Using the same familiar, robust and well-known solutions brings along the same
restrictions they were originally designed with”
Problem #6
Hardware
Network
Transport
Application
BGP (1990s)
Clos Topology (1950s)
Ethernet & IP (1980s)
IP Routing History
• IP routing is defined hop-by-hop
• BGP is “the” IDR, designed to work between different autonomous
systems, to provide policy and control between different routing
domains to select a best path.
• True: BGP can scale and is extensible. BGP has many policy knobs.
• A data center fabric is operated under a single administrative domain,
not as a series of individual routers with different policies and
decision processes.
Forwarding traffic based on demands & patterns:
• Application
• Latency
• Loss
• Bandwidth (Throughput)
Programmable Data Center
A data center fabric that distributes traffic amongst all available links efficiently and
effectively while maintaining lowest latency and providing the most possible bandwidth to
different applications based on different needs and priorities.
Program forwarding tables individually on all switches from a centralized location
Approach #1
Flow x > Port 1
Flow x > Port 3
Flow x > Port 2
Forwarding and Control Element Separation
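
A minimal sketch of Approach #1, assuming a controller that knows the full path of a flow and writes one table entry per switch along it. The switch names, port numbers, and the `program_flow` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

def program_flow(tables, flow_id, path, ports):
    """Install one forwarding entry per switch along `path` for `flow_id`.

    `ports[(a, b)]` is the egress port on switch `a` that leads to switch `b`.
    The last switch in `path` delivers locally and gets no entry.
    """
    for here, nxt in zip(path, path[1:]):
        tables[here][flow_id] = ports[(here, nxt)]

# Hypothetical topology fragment: leaf1 -> spine2 -> leaf4.
tables = defaultdict(dict)
ports = {("leaf1", "spine2"): 3, ("spine2", "leaf4"): 1}
program_flow(tables, "flow-x", ["leaf1", "spine2", "leaf4"], ports)

entries = sum(len(table) for table in tables.values())
print(entries)
```

Every flow adds state to every switch on its path, so the number of entries grows with flows times hops. That is one way to read the scaling concern raised under Problem #5.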
Encode path information into packet header
Approach #2
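
A minimal sketch of Approach #2, assuming the path is carried as an ordered list of egress ports in the packet header and each switch consumes the first entry. The `Packet` type and `forward` function are hypothetical; a real deployment would use whatever header encoding the data plane supports.

```python
from dataclasses import dataclass, field

@dataclass
class Packet:
    payload: bytes
    hops: list = field(default_factory=list)  # egress ports, first hop first

def forward(packet):
    """Pop the next egress port from the header; no per-flow table lookup."""
    if not packet.hops:
        return None  # path exhausted: deliver locally
    return packet.hops.pop(0)

# The sender (or a controller acting on its behalf) chooses the whole path.
pkt = Packet(b"hello", hops=[3, 1, 7])
trace = []
while (port := forward(pkt)) is not None:
    trace.append(port)
print(trace)
```

The switches hold no flow state here; moving a flow to another path only requires the sender to write a different hop list.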
Distributed control plane for topology discovery and reachability information
+
Use a controller software for forwarding policy and optimizations
Approach #3
Scale: No state or flow information required to be stored on every box
Network can choose and move flows dynamically
Application can choose and move flows dynamically
Works with existing data plane (merchant silicon support)
Supports ECMP with fallback to IP routing
Automatic Local Repair / LFA
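
One way to picture Approach #3, as a sketch under stated assumptions: the distributed control plane supplies a set of equal-cost ports, and a packet may optionally carry a path hint chosen by a controller or application. When no hint is present the switch falls back to ordinary hashing. The function name, the hint format, and the CRC32 hash are all hypothetical.

```python
import zlib

def choose_egress(path_hint, flow_key, ecmp_ports):
    """Return (egress port, remaining hint).

    Follow the hint if one is present; otherwise fall back to hashing the
    flow key over the equal-cost ports learned from the routing protocol.
    """
    if path_hint:
        return path_hint[0], path_hint[1:]
    index = zlib.crc32(flow_key.encode()) % len(ecmp_ports)
    return ecmp_ports[index], []

ecmp_ports = [1, 2, 3, 4]
steered = choose_egress([5, 2], "10.0.0.1>10.0.1.1:443", ecmp_ports)
default = choose_egress([], "10.0.0.1>10.0.1.1:443", ecmp_ports)
print(steered, default)
```

In this sketch the switch keeps no per-flow state: steering information travels with the packet, and unsteered traffic behaves exactly as it would under plain ECMP.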
Hardware
Routing
Policy
Applications
Link Selection and Scheduling
Topology Discovery and Network Graph
Control
Telemetry/Visibility, Machine Learning, Prediction Engine, Self Healing, etc.
Forwarding
Merchant Silicon
Rethinking The Network Stack
Network Element
Management
Plane
SNMP, Syslog, etc.
System &
Environmental
Data
Packet & Flow
Data
Network Operating System
Kafka Network Agent
ASIC
System
Drivers
Reducing Protocols
Network Element (four instances)
Management
Plane
SNMP, Syslog, etc.
System &
Environmental
Data
Packet & Flow
Data
Network Operating System
Kafka Agent
Monitoring and Management System
Kafka Broker
Machine Learning & Data Processing
Alert
Processor
Log Retention
Data Store
Event
Correlation
Kafka Pub/Sub Pipeline
Record, Process and Replay Network State
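
A minimal sketch of the record-and-replay idea, assuming network state changes are published as ordered events and the state at any point can be rebuilt by replaying the log up to that offset. The `EventLog` class is an in-memory stand-in for a pub/sub topic, not the Kafka client API, and the event fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    offset: int
    switch: str
    key: str
    value: str

class EventLog:
    """In-memory stand-in for a pub/sub topic: an append-only list of events."""

    def __init__(self):
        self._events = []

    def publish(self, switch, key, value):
        """Append an event and return its offset."""
        event = Event(len(self._events), switch, key, value)
        self._events.append(event)
        return event.offset

    def replay(self, up_to=None):
        """Rebuild {(switch, key): value} from events with offset <= up_to."""
        state = {}
        for event in self._events:
            if up_to is not None and event.offset > up_to:
                break
            state[(event.switch, event.key)] = event.value
        return state

log = EventLog()
log.publish("leaf1", "eth1", "up")
log.publish("leaf1", "eth1", "down")
log.publish("spine2", "eth4", "up")

print(log.replay())          # latest known state
print(log.replay(up_to=0))   # state as of the first event
```

Keeping the raw events rather than only the latest values is what makes it possible to reprocess history or reconstruct the state seen at the time of an incident.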
Open19
OpenFabric
ASIC ODM
RIB / Forwarding Abstraction Layer
FALCO
Apps
Linux OS
Hardware
Physical Layer
Hardware Abstraction Layer
Metrics & Analytics
Machine Learning
Self Healing
etc. (API to Infrastructure)
Policy & Control
Operating System
Base Networking
LinkedIn Infrastructure Strategy
• Unified Architecture
• We used a single SKU (hardware and software) for all switches while procuring
hardware from multiple ODM channels (multi-homing)
• One Software: Base Networking on Merchant Silicon with minimum required features
• No Overlay - For the infrastructure, the application is stateless
• No Middle-box (Firewall, Load-balancer, etc.): moved to application
• Network is only a set of intermediate boxes running Linux
Simplified Infrastructure to Own
• To control and own your architecture:
• End to end stack (app, operating system, network and architecture.)
• Ultimate sophistication: Simplicity
• In house support as far as possible
• Move complexity to your comfort zone!
Stay in Control
SDN for LinkedIn
• SDN is not a protocol, a tool, or an off-the-shelf product
• SDN is the whole network stack and architecture that enables
applications to meet and interact with infrastructure to:
• Discover and Learn
• Provision
• Manage
• Control
• Monitor
Project Altair: The Evolution of LinkedIn’s Data Center Network
Project Falco: Decoupling Switching Hardware and Software
Open19: A New Vision for the Data Center

LinkedIn's Approach to Programmable Data Center

  • 1. LinkedIn’s Approach to Programmable Data Center Shawn Zandi Principal Architect Infrastructure Engineering
  • 2. LinkedIn Infrastructure • Infrastructure architecture based on application’s behavior & requirements • Pre-planned static topology • Single operator • Single tenant with many applications • As oppose to multi-tenant with different (or unknown) needs • 34% infrastructure growth on annual basis and close to half a billion users
  • 3. Edge Network to Eyeballs Backbone Network Bare Metal Operating System Container Application End to End Network Design Data Center Network End to end control enables us to tackle problems at different parts of the stack From application code, os, network or client software to solve by architecture… Bare Metal Operating System Container Application
  • 4. Traffic Demands • High intra and inter-DC bandwidth demand due to organic growth • Every single byte of member activity, creates thousands bytes of east-west traffic inside data center: • Application Call Graph • Metrics, Analytics and Tracking via Kafka • Hadoop and Offline Jobs • Machine Learning • Data Replications • Search and Indexing • Ads, recruiting solutions, etc.
  • 5. Scaling Out Data Centers Network - Hardware • White-box Switches (ODM) • Vendor Switches Based (OEM) • Based on Merchant Silicon • Big Chassis Switches • Designed around robustness (NSR, ISSU, etc.) • Feature-rich but mostly irrelevant to LinkedIn needs Project Falco
  • 6. Data center were designed by redundant chassis at core controlling and forwarding east-west and north-south traffic
  • 7. Chassis Free Data Center. [Diagram: four pods (W, X, Y, Z), each with spine and leaf switches, interconnected by four spine fabrics (Fabric 1 to Fabric 4); 4,096 x 100G ports, non-blocking, scale-out.]
  • 8. Why No Chassis? • Robust-yet-Fragile • Complex due to NSR, ISSU, feature-sets, etc. • Larger fault domain, Fail-over/Fail-back • Non-deterministic boot-up process and long upgrade procedures • Moved complexity from big boxes to our advantage, where we can manage and control! • Better control and visibility into internals by removing black-box abstraction! • Same Switch SKU on ToR, Leaf and Spine (Entire DC) • Single chipset uniform IO design (same bandwidth, latency and buffering) • True 5-stage Clos topology with deterministic latency! • Dedicated control plane, OAM and CPU for each ASIC
  • 9. Distributed Control Plane Complexity. [Diagram: numbered switches grouped into pods and into W, X, Y, Z planes.]
  • 10. “Fabric wide visibility and telemetry” The wider the fabric, the more difficult flow tracking and fault isolation become. Problem 1
  • 11. “Fabric wide traffic distribution and packet scheduling!” Forwarding is different than routing, and out of scope for routing protocols. Problem 2 We need a robust and scalable control protocol designed for a data center fabric
  • 12. Control Plane :: Routing • Routing protocols provide destination-based reachability information • Routing protocols are not traffic aware. • Best path selection is elementary. • Network graph is built based on a series of ECMP groups. “Routing protocols are more about the destination than the journey”
  • 13. ECMP forwarding simply does not cut it! Problem #3
  • 14. ECMP is not really equal! • Elephants and mice issue • ECMP hashing is not bandwidth aware. Devices use an algorithm to distribute traffic amongst links regardless of load. • Traffic is routed using the shortest path, not all the available paths, hence not maximizing all the available capacity. Some links may suffer while others may be underutilized. • Flows stick to a certain path, as hashing is performed per flow. An established socket cannot be moved to a different path easily!
  • 15. “We need a robust and scalable fabric-wide forwarding policy” Problem #4
  • 16. Lack of Centralized Policy and Control • The more parallel links you add, the more random forwarding decisions become. • Devices were configured and maintained individually • Routing/Forwarding policy management tasks are performed individually and hop by hop. • Know when/where to centralize or distribute to scale out!
  • 17. “End to End Path Selection & Control” No application, protocol or packet can dictate a path Centralized flow based routing does not scale! Problem #5
  • 18. “Using the same familiar, robust and well-known solutions brings along the same restrictions they had when they were originally designed” Problem #6
  • 20. IP Routing History • IP routing is defined hop-by-hop • BGP is “the” IDR designed to work between different autonomous systems, to provide policy and control between different routing domains to select a best path. • True: BGP can scale and is extensible. BGP has many policy knobs. • A data center fabric is operated under a single administrative domain, instead of as a series of individual routers with different policies and decision processes.
  • 21. Forwarding traffic based on demands & patterns: • Application • Latency • Loss • Bandwidth (Throughput) Programmable Data Center A data center fabric that distributes traffic amongst all available links efficiently and effectively while maintaining lowest latency and providing the most possible bandwidth to different applications based on different needs and priorities.
  • 22. Program forwarding tables individually on all switches from a centralized location Approach #1
  • 23. Flow x > Port 1 Flow x > Port 3 Flow x > Port 2 Forwarding and Control Element Separation
  • 24. Encode path information into packet header Approach #2
  • 25. Distributed control plane for topology discovery and reachability information + Use a controller software for forwarding policy and optimizations Approach #3
  • 26. • Scale: No state or flow information required to be stored on every box • Network can choose and move flows dynamically • Application can choose and move flows dynamically • Works with existing data plane (merchant silicon support) • Supports ECMP with fallback to IP routing • Automatic Local Repair / LFA
  • 27. Hardware Routing Policy Applications Link Selection and Scheduling Topology Discovery and Network Graph Control Telemetry/Visibility, Machine Learning, Prediction Engine, Self Healing, etc. Forwarding Merchant Silicon Rethinking The Network Stack
  • 28. Network Element Management Plane SNMP, Syslog, etc. System & Environmental Data Packet & Flow Data Network Operating System Kafka Network Agent ASIC System Drivers Reducing Protocols
  • 29. Network Element (shown four times) Management Plane SNMP, Syslog, etc. System & Environmental Data Packet & Flow Data Network Operating System Kafka Agent Monitoring and Management System Kafka Broker Machine Learning & Data Processing Alert Processor Log Retention Data Store Event Correlation Kafka Pub/Sub Pipeline Record, Process and Replay Network State
  • 30. Open19 OpenFabric ASIC ODM RIB / Forwarding Abstraction Layer FALCO Apps Linux OS Hardware Physical Layer Hardware Abstraction Layer Metrics & Analytics Machine Learning Self Healing etc. (API to Infrastructure) Policy & Control Operating System Base Networking LinkedIn Infrastructure Strategy
  • 31. • Unified Architecture • We used a single SKU (hardware and software) for all switches while procuring hardware from multiple ODM channels (multi-homing) • One Software: Base Networking on Merchant Silicon with minimum required features • No Overlay - For the infrastructure, the application is stateless • No Middle-box (Firewall, Load-balancer, etc.) Moved to application • Network is only a set of intermediate boxes running Linux Simplified Infrastructure to Own
  • 32. • To control and own your architecture: • End to end stack (app, operating system, network and architecture.) • Ultimate sophistication: Simplicity • In house support as far as possible • Move complexity to your comfort zone! Stay in Control
  • 33. SDN for LinkedIn • SDN is not a protocol, a tool, or an off-the-shelf product • SDN is the whole network stack and architecture that enables applications to meet and interact with infrastructure to: • Discover and Learn • Provision • Manage • Control • Monitor
  • 34. Project Altair: The Evolution of LinkedIn’s Data Center Network Project Falco: Decoupling Switching Hardware and Software Open19: A New Vision for the Data Center
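
Slide 14 observes that ECMP hashing is performed per flow and is not bandwidth aware. The sketch below illustrates that point under stated assumptions: the 5-tuples and byte counts are made up, the names `ecmp_link` and `link_load` are hypothetical, and SHA-256 is only a convenient stand-in for whatever hash a switch actually applies.

```python
import hashlib
from collections import Counter


def ecmp_link(flow, num_links):
    """Map a flow's 5-tuple to one of num_links uplinks; link load is never consulted."""
    key = "|".join(str(f) for f in flow).encode()
    digest = hashlib.sha256(key).digest()  # stand-in hash, not a real ASIC function
    return int.from_bytes(digest[:4], "big") % num_links


def link_load(flows_with_sizes, num_links):
    """Sum the bytes that land on each link when every flow is pinned by its hash."""
    load = Counter()
    for flow, size in flows_with_sizes:
        load[ecmp_link(flow, num_links)] += size
    return load


# Hypothetical traffic: (src_ip, dst_ip, protocol, src_port, dst_port)
flows = [("10.0.0.1", "10.0.1.1", 6, 40000 + i, 9092) for i in range(8)]
sizes = [10**9] + [10**4] * 7  # one elephant flow, seven mice

load = link_load(zip(flows, sizes), num_links=4)
for link in range(4):
    print(f"link {link}: {load[link]} bytes")
```

Because the same 5-tuple always hashes to the same link, the elephant flow stays on one link for its whole lifetime, and whichever link it lands on carries far more bytes than the rest regardless of how the mice are spread.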

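Approach #2 on slide 24 is to encode path information into the packet header, and slide 26 lists "no state or flow information required to be stored on every box" as a goal. A minimal sketch of that idea follows. It is a generic illustration and not an implementation of Segment Routing, MPLS, or any LinkedIn software; the `Packet` class, the `forward` function, and the switch names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: str
    path: list  # remaining egress ports; the next switch consumes the first entry
    trace: list = field(default_factory=list)  # (switch, port) hops, for inspection


def forward(switch_name, packet):
    """Pop the next egress port from the header; the switch keeps no per-flow state."""
    if not packet.path:
        return None  # path exhausted: nothing left to forward on
    port = packet.path.pop(0)
    packet.trace.append((switch_name, port))
    return port


# The sender (or a controller) chooses the whole path up front.
packet = Packet(payload="hello", path=[3, 1, 2])
for switch in ["tor-a", "leaf-b", "spine-c"]:
    forward(switch, packet)
print(packet.trace)
```

Moving a flow to a different path only requires the sender to write a different port list into new packets; no switch has to be reprogrammed.
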
Editor's notes

  1. This is the luxury that the public cloud doesn’t offer: we can solve scale or performance issues in many different ways, because we own the end-to-end stack, from the application serving in the DC to the app on the client’s phone! Scale example: one possible approach is to throw more machines at it. We believe that if application performance and user experience are our primary mission, owning the end-to-end infrastructure is not optional. That is why we have no plan to move to any infrastructure outside LinkedIn’s control from top to bottom.
  2. More than anything, the infrastructure growth was in data center networking, as traffic demands grew.
  3. We chose to build our network on top of merchant silicon, a very common strategy for mega-scale data centers.
  4. Let’s look at the history of the data center network: DCNs were built from the same building blocks that construct campus or enterprise networks, namely the chassis architecture and the hierarchical access, distribution and core model of deployment. Vendors: the same set of software features, protocols and architecture was being used for campus, enterprise and data center networks.
  5. Making a big switch out of pizza-box switches that can scale horizontally. In this model each plane can have up to 32 switches; with 4 planes we can have 128 switches in the fabric. That is 4,096 x 100 Gig ports. You cannot buy a chassis switch with 4,096 ports on the market.
  6. There is no control over which line cards boot first, and traffic is black-holed until the software is fully loaded and the control plane is in sync. Chassis switches simplify the management and control plane of multiple line cards (modules) by managing multiple chipsets under the same roof, as the code abstracts the complexity inside the chassis from the rest of the network. However, this translates into code complexity that we did not have control over. Our strategy is to simplify what we don’t have control over, and move complexity to where we can manage and control it. We can manage multiple pizza-box switches via a distributed control plane, software automation and zero-touch provisioning… In our case, we used the same switch code and chipset as the building block for the entire data center, so operations staff no longer need to focus on hardware variables and can shift their attention to the software and automation pieces.
  7. As we grow there will be more switches and links to manage, and the greater the need to build programmability into the network. This wide ECMP structure creates complexity.
  8. Chassis switches provide cell switching, dbd, and virtual output queues (head-of-line blocking) across the backplane. Since there is no switch big enough to serve the whole data center, this task should be performed across the fabric!
  9. None of the routing protocols was (originally) designed for a fully meshed ECMP network. Extensions to support multipath, and features such as mesh groups, were added later, but they do not really solve this use case. Routing protocols say: here is the destination prefix, and here is the next hop; good luck with that.
  10. Folded Clos is not perfect: once the network utilizes more bandwidth, we will see some links carry more traffic than others, because the ECMP algorithm (equal-cost multipath, sharing bandwidth across multiple links) is based on a hashing mechanism.
  11. Once a path (a series of links, instead of a series of nodes) is determined, apply and enforce it end to end. Distributed architectures are usually faster to compute, but depending on the case, there are examples where centralized compute can be faster than the overall distributed form, because it eliminates duplicated tasks and event propagation…
  12. BGP in the DC is not the complete answer. It works for what is expected of BGP, not necessarily for what is expected in a data center. BGP does a perfect job for the intent behind its design: a good-enough path-vector protocol that scales to internet size with policy control…
  13. Although protocols change over time as they gain new functionality, the foundation and fundamental thought process behind them when they were designed does not, or cannot, change.
  14. While we are looking for fabric-based, higher-level, intent-based forwarding, operating BGP brings an illusion of control in the DC. Ask those operators whether they really control what is happening there.
  15. ForCES and OpenFlow provide flow-based routing. We already know that this does not scale!
  16. Examples: Segment Routing, MPLS, or Cisco ACI utilizing VXLAN.
  17. OpenFabric is open and extensible software code that extends to run on different silicon and operating systems. Open19 provides a passive backplane for data without a need for wiring and power rails, to fit servers, storage and switches in standard 19-inch cabinets.
  18. Simplified architecture is desirable to simplify ownership
  19. We believe it is important, as a content provider company, to control our own destiny: to own in order to customize based on our needs, and only our use cases. Priority #1 is to stick to and own our architecture, the simple architecture that is driven by our application. In order to control this, we need to own several pieces of this puzzle.
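
The fabric sizing in note 5 (4 planes of up to 32 switches, 128 switches, 4,096 x 100G ports) can be checked with a few lines of arithmetic. This is only a sketch: the function name `fabric_capacity` is hypothetical, and the figure of 32 fabric-facing 100G ports per switch is an assumption inferred from 4,096 / 128, not a number stated in the slides.

```python
def fabric_capacity(planes, switches_per_plane, ports_per_switch):
    """Return (total switches, total ports) for a fabric built from identical switches."""
    switches = planes * switches_per_plane
    return switches, switches * ports_per_switch


# ports_per_switch=32 is inferred from the note's totals (4,096 / 128), not stated.
switches, ports = fabric_capacity(planes=4, switches_per_plane=32, ports_per_switch=32)
print(switches, ports)  # 128 4096
```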