Data Center Network Research: Opportunities and Challenges

       Chuanxiong Guo

   Microsoft Research Asia (MSRA)
      2011.04.15
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




Background: personal experience
• Bandwidth is a scarce resource
 Network   Memory             Disk     CPU                     Year

 10Mb/s    2MB                10MB     386/20MHz               1994
 100Mb/s   128MB              2GB      Pentium II/233MHz       1998
 100Mb/s   256MB              40GB     Pentium III/800MHz      2002
 1Gb/s     2GB                160GB    Core 2/2GHz             2007
 1Gb/s     4GB                500GB    Core 2 Quad/3GHz        2011

 Growth over 17 years:
 x100      x2000, but slow    x50000   x150 x4 (multi-core), plus
           access                      instruction-level progress
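The growth row can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1MB = 10^6 bytes) and reading "386/20M" as a 20MHz clock; the dictionary keys are illustrative names, not from the slide.

```python
# First (1994) and last (2011) rows of the table, in base units.
# Assumption: decimal prefixes, and "386/20M" means a 20 MHz clock.
first = {"network_bps": 10 * 10**6, "memory_bytes": 2 * 10**6,
         "disk_bytes": 10 * 10**6, "cpu_hz": 20 * 10**6}
last = {"network_bps": 1 * 10**9, "memory_bytes": 4 * 10**9,
        "disk_bytes": 500 * 10**9, "cpu_hz": 3 * 10**9}


def growth_factors(start, end):
    """Ratio end/start for every resource present in `start`."""
    return {name: end[name] / start[name] for name in start}


factors = growth_factors(first, last)
for name, factor in factors.items():
    print(f"{name}: x{factor:,.0f}")
```

The CPU figure covers clock rate only; the additional x4 in the table comes from the four cores of the 2011 machine. Network bandwidth grows the least of the four resources, which is the point of the slide.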

Background: technology trends
– Disk is cheap (TB and PB are common)
   • 500 RMB for 1TB
– Memory is cheap (32GB per PC is not uncommon)
   • 150 RMB for 2GB DRAM
– CPU is powerful yet inexpensive (multi-core)
   • 2000 RMB for an Intel Core i7 with 4 cores
– But network bandwidth is a scarce resource
   • Intra-DC: replication everywhere for fault tolerance
   • Inter-DC: input and output need bandwidth
   • $50 per 1G port, $500 per 10G port
– $0.10 = 1GB bandwidth = 1 CPU hour = 1GB storage per month
DCN building blocks




[Figure: building blocks, from server to rack to container to data center]
DCN reference design
              •   Does not scale
              •   Low bandwidth
              •   Single point of failure
              •   High cost




Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




Right time for DCN research
• It is a real problem
• It is an important problem
  – DCN as the infrastructure for cloud computing
• The assumptions are different
  – Data centers are owned by a single organization
  – We can innovate at both end-hosts and network
    devices
  – Security is easier (closed environment and trusted
    people)

DCN research: opportunities
• Full of research problems
  – Scalability: tens of thousands to millions of servers
  – Performance
  – Fault tolerance
  – Cost saving
  – Feel free to suggest new “TCP” protocols
• You can invent your own DCN!


Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




Research challenges
Applications
•   Search
•   Distributed execution engine
•   Distributed file systems
•   Online social networking
•   HPC applications

Architectures
•   Topology design
•   Network virtualization
•   Electrical/optical switching
•   Commodity vs. special system

Technologies
•   DCN management
•   DCN platform
•   Energy efficiency

Protocols
•   DCN routing
•   TCP incast congestion control
•   Multicast
Architecture design
•   Scaling: from thousands to millions of servers
•   High capacity: support various traffic patterns
•   Fault tolerance
•   Cost efficient
•   Easy to deploy and manage




Fat-tree (ucsd-sigcomm08)




VL2 (msrr-sigcomm09)

[Figure: VL2 topology. Labels: OSPF+ECMP; link speeds 10G, 10G, 1G]
DCell/BCube (msra-sigcomm08,09)

             • Put intelligence at servers
             • Use Ethernet switches as crossbar
             • Innovations in topology design and routing




[Figures: DCell and BCube topologies]
Architecture: optical/electrical switching (ucsd-sigcomm10, rice-sigcomm10)
• A hybrid architecture
   • Optical circuit switching
   • Electrical packet switching
Protocols: TCP incast congestion control

[Figure: senders S1, S2, ..., Sn all sending to one receiver R]


cmu-sigcomm09, msra-conext10


Technologies: research platform
• A DCN research platform
  – High performance: comparable to ASIC
  – Easy to program: comparable to commodity server
  – Rich functions
     • Programmable packet forwarding
     • Experiment with various control/management functions
     • Can implement various routing/congestion control
       designs
• ServerSwitch (msra-nsdi11)
Applications
• A unified network for both data center and
  HPC applications?
                      Data center                HPC
Topology              Tree-based                 Torus/mesh, fat-tree
Routing               Single-path routing:       Deterministic routing;
                      L2 spanning tree,          per-packet adaptive routing
                      L3 shortest-path routing   to exploit path diversity
Flow control          Packets can be dropped;    No packet drop;
                      end-to-end                 hop by hop
Application support   Search, e-commerce,        Scientific applications
                      cloud computing
Programming API       TCP/IP socket              MPI/RDMA
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




Team
• Chuanxiong Guo, Guohan Lu, Haitao Wu,
  Yongqiang Xiong
• Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin
  Jia, Jun Li
• Alumni
  – members: Songwu Lu, Dan Li
  – interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang,
    Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao
    Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce
    Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
Modular, mega-data center networking

BCube       BCube        BCube


BCube      MDCube        BCube


BCube       BCube        BCube
BCube: Server centric network
BCube1
  Level-1 switches: <1,0>  <1,1>  <1,2>  <1,3>
BCube0
  Level-0 switches: <0,0>  <0,1>  <0,2>  <0,3>
  Servers:          00 01 02 03   10 11 12 13   20 21 22 23   30 31 32 33
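The figure's labels follow a regular rule that can be sketched in code. The rule used here comes from the published BCube design rather than from the slide itself, so treat it as an assumption: a server's level-i port attaches to the level-i switch whose index is the server address with digit i removed. Function names are illustrative.

```python
from itertools import product


def bcube_servers(n, k):
    """All server addresses (a_k, ..., a_0) of a BCube_k built from n-port switches."""
    return list(product(range(n), repeat=k + 1))


def attached_switch(server, level, n):
    """Switch (level, index) reached through the server's port at `level`.

    The index is the server address with the digit at `level` removed,
    read as a base-n number (digit 0 is the rightmost digit).
    """
    pos = len(server) - 1 - level
    rest = server[:pos] + server[pos + 1:]
    index = 0
    for digit in rest:
        index = index * n + digit
    return (level, index)


def switch_members(n, k):
    """Map every switch to the list of servers attached to it."""
    members = {}
    for server in bcube_servers(n, k):
        for level in range(k + 1):
            members.setdefault(attached_switch(server, level, n), []).append(server)
    return members


servers = bcube_servers(4, 1)
members = switch_members(4, 1)
print(len(servers), "servers,", len(members), "switches")
print("<0,0> connects", members[(0, 0)])
print("<1,0> connects", members[(1, 0)])
```

With n = 4 and k = 1 this gives the 16 servers and 8 switches drawn in the figure: switch <0,0> serves 00 to 03, and switch <1,0> serves 00, 10, 20, and 30.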




2-D MDCube
             MDCube structure




Problem: servers for packet forwarding?
BCube1
  Level-1 switches: <1,0>  <1,1>  <1,2>  <1,3>
BCube0
  Level-0 switches: <0,0>  <0,1>  <0,2>  <0,3>
  Servers:          00 01 02 03   10 11 12 13   20 21 22 23   30 31 32 33

[Figure label: Forwarding node]
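Why servers end up forwarding packets can be seen from a small routing sketch. It assumes the digit-correcting scheme of the published BCube design, where each hop changes one address digit through the switch shared by the two servers; the correction order chosen here (leftmost digit first) is arbitrary, and the function names are hypothetical.

```python
def bcube_path(src, dst):
    """Servers visited from src to dst, fixing one differing digit per hop.

    Addresses are digit strings such as "00" or "33" (digits below 10 only).
    """
    path = [src]
    current = list(src)
    for pos in range(len(src)):
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]
            path.append("".join(current))
    return path


def hop_switch(a, b):
    """Label of the switch shared by two servers that differ in exactly one digit."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) != 1:
        raise ValueError("servers are not one-hop neighbours")
    pos = diff[0]
    level = len(a) - 1 - pos
    return f"<{level},{a[:pos] + a[pos + 1:]}>"


path = bcube_path("00", "33")
forwarders = path[1:-1]  # intermediate servers that must relay the packet
print("path:", path)
print("forwarding nodes:", forwarders)
for a, b in zip(path, path[1:]):
    print(a, "->", b, "via switch", hop_switch(a, b))
```

For 00 to 33 the path is 00, 30, 33: server 30 receives the packet from a level-1 switch and relays it through a level-0 switch, so forwarding load lands on a server CPU unless it is offloaded.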
Solution: ServerSwitch

Software
• Full programmability at server CPU
   – Kernel module for low latency processing
   – User space for easy-to-use programmability

PCI-E
• Low latency and high throughput interconnection

Hardware
• Packet forwarding in commodity switching ASIC
   – High performance and limited programmability
Testbed
• A BCube testbed
  – 16 servers (Dell Precision 490 workstation with
    Intel 2.00GHz dual-core CPU, 4GB DRAM, 160GB
    disk)
  – Eight 8-port mini-switches (D-Link 8-port Gigabit
    switch DGS-1008D)
• NIC
  – Intel Pro/1000 PT quad-port Ethernet NIC
  – NetFPGA
Summary
• DCN is an area full of opportunities and
  challenges
• The best is yet to come!
• Further information
  • http://research.microsoft.com/en-us/projects/msradcn/default.aspx





Data Center Network Research: Opportunities and Challenges

  • 1. Data Center Network Research: Opportunities and Challenges. Chuanxiong Guo, Microsoft Research Asia (MSRA), 2011.04.15
  • 2. Outline • DCN background • Opportunities • Research challenges • A modular DCN design
  • 5. Background: personal experience
      • Bandwidth is a scarce resource

        Network   Memory       Disk     CPU                 Year
        10Mb/s    2MB          10MB     386/20M             1994
        100Mb/s   128MB        2GB      PentiumII/233       1998
        100Mb/s   256MB        40GB     PentiumIII/800      2002
        1Gb/s     2GB          160GB    Core2/2GHz          2007
        1Gb/s     4GB          500GB    Core2 Quad/3GHz     2011
        x100      x2000, but   x50000   x150 x4, but        17 years
                  slow access           multi-core and
                                        instruction-level
                                        progress
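The growth factors in the last row of the table can be checked with a few lines of arithmetic. This is a minimal sketch: the dictionary keys are illustrative names, "386/20M" is assumed to mean a 20 MHz 386, and 1GB is taken as 1024MB for memory and 1000MB for disk.

```python
# First (1994) and last (2011) rows of the slide's table.
# Assumption: "386/20M" means a 20 MHz clock; key names are illustrative only.
first = {"network_mbps": 10, "memory_mb": 2, "disk_mb": 10, "cpu_mhz": 20}
last = {
    "network_mbps": 1000,       # 1Gb/s
    "memory_mb": 4 * 1024,      # 4GB
    "disk_mb": 500 * 1000,      # 500GB
    "cpu_mhz": 3000,            # 3GHz per core, with 4 cores on the 2011 machine
}

# Ratio of the 2011 value to the 1994 value for each resource.
growth = {name: last[name] / first[name] for name in first}

for name, factor in growth.items():
    print(f"{name}: x{factor:g}")
```

Network bandwidth grows by two orders of magnitude while disk grows by almost five, which is the imbalance the slide points at.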
  • 6. Background: technology trends
      – Disk is cheap (TB and PB are common)
          • 500 RMB for 1TB
      – Memory is cheap (32GB per PC is not uncommon)
          • 150 RMB for 2GB DRAM
      – CPU is powerful yet inexpensive (multi-core)
          • 2000 RMB for an Intel Core i7 with 4 cores
      – But network bandwidth is a scarce resource
          • Intra-DC: replication everywhere for fault tolerance
          • Inter-DC: input and output need bandwidth
          • $50 per 1G port, $500 per 10G port
      – $0.10 = 1GB of bandwidth = 1 CPU hour = 1GB of storage per month
  • 7. DCN building blocks: Server, Rack, Container, Data Center
  • 8. DCN reference design • Does not scale • Low bandwidth • Single point of failure • High cost
  • 9. Outline • DCN background • Opportunities • Research challenges • A modular DCN design
  • 10. Right time for DCN research
      • It is a real problem
      • It is an important problem
          – DCN as the infrastructure for cloud computing
      • The assumptions are different
          – Data centers are owned by a single organization
          – We can innovate at both end hosts and network devices
          – Security is easier (closed environment and trusted people)
  • 11. DCN research: opportunities
      • Full of research problems
          – Scalability: tens of thousands to millions of servers
          – Performance
          – Fault tolerance
          – Cost saving
          – Feel free to suggest new “TCP” protocols
      • You can invent your own DCN!
  • 12. Outline • DCN background • Opportunities • Research challenges • A modular DCN design
  • 13. Research challenges
      • Applications
          – Search
          – Distributed execution engine
          – Distributed file systems
          – Online social networking
          – HPC applications
      • Architectures
          – Topology design
          – Network virtualization
          – Electrical/optical switching
          – Commodity vs. special system
      • Technologies
          – DCN management
          – DCN platform
          – Energy efficiency
      • Protocols
          – DCN routing
          – TCP incast congestion control
          – Multicast
  • 14. Architecture design • Scaling: from thousands to millions of servers • High capacity: support various traffic patterns • Fault tolerance • Cost efficient • Easy to deploy and manage
  • 16. VL2 (msrr-sigcomm09) • OSPF+ECMP (figure link labels: 10G, 10G, 1G)
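The slide names ECMP without showing how a path is picked. As a rough sketch of the general idea only (not VL2's actual implementation), a switch can hash a flow's 5-tuple and use the result to choose among equal-cost next hops, so that all packets of one flow follow the same path. The function name, the uplink names, and the use of SHA-256 are assumptions made for illustration; real switches use hardware hash functions.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow 5-tuple.

    flow is (src_ip, dst_ip, protocol, src_port, dst_port). The same flow
    always maps to the same next hop, which keeps its packets in order.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical flow and uplink names, for illustration only.
flow = ("10.0.0.1", "10.0.1.2", 6, 40000, 80)
uplinks = ["agg-1", "agg-2", "agg-3", "agg-4"]
print(ecmp_next_hop(flow, uplinks))
```

Hashing per flow rather than per packet trades some load-balancing precision for in-order delivery within each TCP connection.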
  • 17. DCell/BCube (msra-sigcomm08,09) • Put intelligence at servers • Use Ethernet switches as crossbars • Innovations in topology design and routing (figures: DCell, BCube)
  • 18. Architecture: optical/electrical switching (ucsd-sigcomm10, rice-sigcomm10) • A hybrid architecture • Optical circuit switching • Electrical packet switching
  • 19. Protocols: TCP incast congestion control (cmu-sigcomm09, msra-conext10) (figure: senders S1, S2, ..., Sn and receiver R)
  • 20. Technologies: research platform
      • A DCN research platform
          – High performance: comparable to ASIC
          – Easy to program: comparable to commodity server
          – Rich functions
              • Programmable packet forwarding
              • Experiment with various control/management functions
              • Can implement various routing/congestion control designs
      • ServerSwitch (msra-nsdi11)
  • 21. Applications
      • A unified network for both data center and HPC applications?

                              Data center                   HPC
        Topology              Tree-based                    Torus/mesh, fat-tree
        Routing               Single path routing;          Deterministic routing;
                              L2 spanning tree,             per-packet adaptive routing
                              L3 shortest path routing      to exploit path diversity
        Flow control          Packets can be dropped;       No packet drop;
                              end-to-end                    hop by hop
        Application support   Search, e-commerce,           Scientific applications
                              cloud computing
        Programming API       TCP/IP socket                 MPI/RDMA
  • 22. Outline • DCN background • Opportunities • Research challenges • A modular DCN design
  • 23. Team
      • Chuanxiong Guo, Guohan Lu, Haitao Wu, Yongqiang Xiong
      • Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin Jia, Jun Li
      • Alumni/alumnae
          – Members: Songwu Lu, Dan Li
          – Interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang, Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
  • 24. Modular, mega-data center networking
  • 25. Modular, mega-data center networking (figure: eight BCube blocks and an MDCube label)
  • 26. BCube: Server centric network (figure: a BCube1 with level-1 switches <1,0> to <1,3>, BCube0 level-0 switches <0,0> to <0,3>, and servers 00 to 33)
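The wiring behind the BCube1 figure can be sketched in code. This follows the construction in the BCube paper as I understand it: a server's address is a string of digits, and at each level it attaches to the switch identified by that level plus the remaining digits. The function name and the tuple encoding of addresses are illustrative choices, not taken from the slides.

```python
from itertools import product


def build_bcube(n, k):
    """Return {switch: [servers]} for a BCube_k built from n-port switches.

    A server is a digit tuple (a_k, ..., a_0). A switch is (level, rest),
    where rest is the server address with the digit at that level removed.
    """
    links = {}
    for server in product(range(n), repeat=k + 1):
        for level in range(k + 1):
            pos = k - level  # tuple index that holds digit a_level
            rest = server[:pos] + server[pos + 1:]
            links.setdefault((level, rest), []).append(server)
    return links


# The slide's example: 4-port switches, two levels (BCube1).
links = build_bcube(n=4, k=1)
servers = {s for attached in links.values() for s in attached}
print(len(servers), "servers,", len(links), "switches")
print(links[(0, (1,))])  # servers 10, 11, 12, 13 share level-0 switch <0,1>
```

With n=4 and k=1 this yields 16 servers and 8 switches, the same counts as the testbed described later in the deck.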
  • 27. MDCube structure (figure: 2-D MDCube)
  • 28. Problem: servers for packet forwarding? (figure: the BCube1 diagram with switches <1,0> to <1,3> and <0,0> to <0,3> and servers 00 to 33, with one server marked as the forwarding node)
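To see why servers end up forwarding packets, it helps to trace a route. In BCube, two servers whose addresses differ in exactly one digit share a switch, so a path can be built by correcting one digit per hop; every intermediate address on that path is a server that must relay the packet. The sketch below uses a fixed left-to-right correction order as a simplifying assumption (the BCube paper allows any digit order), and the function name is illustrative.

```python
def bcube_route(src, dst):
    """Return the servers visited from src to dst, both inclusive.

    Addresses are equal-length digit tuples. Each hop crosses one switch
    and changes exactly one digit, so intermediate servers act as relays.
    """
    path = [src]
    current = list(src)
    for pos in range(len(src)):
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]  # correct one digit per hop
            path.append(tuple(current))
    return path


print(bcube_route((0, 0), (1, 3)))  # [(0, 0), (1, 0), (1, 3)]
```

Here server 10 sits between 00 and 13, so it spends its own CPU and NIC capacity on traffic that is not its own.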
  • 29. Solution: ServerSwitch
      • Software: full programmability at server CPU
          – Kernel module for low-latency processing
          – User space for easy-to-use programmability
      • Low-latency, high-throughput PCI-E interconnection
      • Hardware: packet forwarding in commodity switching ASIC
          – High performance and limited programmability
  • 30. Testbed
      • A BCube testbed
          – 16 servers (Dell Precision 490 workstation with Intel 2.00GHz dual-core CPU, 4GB DRAM, 160GB disk)
          – 8 8-port mini-switches (D-Link 8-port Gigabit switch DGS-1008D)
      • NIC
          – Intel Pro/1000 PT quad-port Ethernet NIC
          – NetFPGA
  • 31. Summary • DCN is an area full of opportunities and challenges • The best is yet to come! • Further information: http://research.microsoft.com/en-us/projects/msradcn/default.aspx