5. Background: personal experience
• Bandwidth is a scarce resource
Network    Memory         Disk      CPU                          Year
10Mb/s     2MB            10MB      386/20M                      1994
100Mb/s    128MB          2GB       PentiumII/233                1998
100Mb/s    256MB          40GB      PentiumIII/800               2002
1Gb/s      2GB            160GB     Core2/2GHz                   2007
1Gb/s      4GB            500GB     Core2 Quad/3GHz              2011
×100       ×2000, but     ×50000    ×150 ×4, but multi-core      17 years
           slow access              and instruction-level
                                    progress
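The growth factors in the last row can be checked from the first and last rows of the table. The following is a minimal sketch, assuming decimal prefixes (1 GB = 1000 MB) and reading "20M" as a 20 MHz clock; the dictionary keys and the function name are illustrative, not from the slides. The extra ×4 for the CPU is the core count of the Core2 Quad and is not computed here.

```python
# Endpoints of the table (1994 row vs. 2011 row), converted to base units.
# Decimal prefixes and the 20 MHz reading of "386/20M" are assumptions.
FIRST = {
    "network_bps": 10 * 10**6,    # 10 Mb/s
    "memory_bytes": 2 * 10**6,    # 2 MB
    "disk_bytes": 10 * 10**6,     # 10 MB
    "cpu_hz": 20 * 10**6,         # 20 MHz
}
LAST = {
    "network_bps": 1 * 10**9,     # 1 Gb/s
    "memory_bytes": 4 * 10**9,    # 4 GB
    "disk_bytes": 500 * 10**9,    # 500 GB
    "cpu_hz": 3 * 10**9,          # 3 GHz
}


def growth_factors(first, last):
    """Return last/first for every resource listed in `first`."""
    return {name: last[name] / first[name] for name in first}


factors = growth_factors(FIRST, LAST)
for name, factor in factors.items():
    print(f"{name}: x{factor:,.0f}")
```

Over the same 17 years the network grew by two orders of magnitude while disk capacity grew by almost five, which is the gap the slide points at.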
6. Background: technology trends
– Disk is cheap (TB and PB are common)
• 500RMB for 1TB
– Memory is cheap (32GB per PC is not uncommon)
• 150RMB for 2GB DRAM
– CPU is powerful yet inexpensive (multi-core)
• 2000RMB for Intel Core i7 with 4 cores
– But network bandwidth is a scarce resource
• Intra-DC: replication everywhere for fault tolerance
• Inter-DC: Input and output need bandwidth
• $50 per 1G port, $500 per 10G port
– $0.1 = 1GB bandwidth = 1 CPU hour = 1GB storage per month
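The price equivalence above can be turned into a rough cost estimate. This is a minimal sketch under the slide's rule of thumb only; the function name and parameters are hypothetical, and real pricing is not modeled.

```python
# Rule of thumb from the slide: $0.1 buys 1 GB of bandwidth,
# 1 CPU hour, or 1 GB of storage for one month.
UNIT_PRICE_USD = 0.1


def job_cost_usd(transfer_gb, cpu_hours, storage_gb_months):
    """Estimate a job's cost by counting every resource unit at the same price.

    All three parameters are illustrative inputs, not taken from the slides.
    """
    units = transfer_gb + cpu_hours + storage_gb_months
    return UNIT_PRICE_USD * units


# Moving 1000 GB costs as much as 1000 CPU hours under this rule.
transfer_only = job_cost_usd(1000, 0, 0)
compute_only = job_cost_usd(0, 1000, 0)
print(transfer_only, compute_only)
```

Under this rule, shipping a gigabyte across the network is as expensive as an hour of computation on it.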
8. DCN reference design
• Does not scale
• Low bandwidth
• Single point of failure
• High cost
9. Outline
• DCN background
• Opportunities
• Research challenges
• A modular DCN design
10. Right time for DCN research
• It is a real problem
• It is an important problem
– DCN as the infrastructure for cloud computing
• The assumptions are different
– Data centers are owned by a single organization
– We can innovate at both end-hosts and network
devices
– Security is easier (closed environment and trusted
people)
11. DCN research: opportunities
• Full of research problems
– Scalability: tens of thousands to millions of servers
– Performance
– Fault tolerance
– Cost saving
– Feel free to suggest new “TCP” protocols
• You can invent your own DCN!
12. Outline
• DCN background
• Opportunities
• Research challenges
• A modular DCN design
13. Research challenges
Applications
• Search
• Distributed execution engine
• Distributed file systems
• Online social networking
• HPC applications
Architectures
• Topology design
• Network virtualization
• Electrical/optical switching
• Commodity vs. special system
Technologies
• DCN management
• DCN platform
• Energy efficiency
Protocols
• DCN routing
• TCP incast congestion control
• Multicast
14. Architecture design
• Scaling: from thousands to millions of servers
• High capacity: support various traffic patterns
• Fault tolerance
• Cost efficient
• Easy to deploy and manage
17. DCell/BCube (MSRA, SIGCOMM'08, '09)
• Put intelligence at servers
• Use Ethernet switches as crossbar
• Innovations in topology design and routing
[Figures: DCell and BCube topologies]
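To give a sense of how these server-centric topologies scale, the sketch below computes server and switch counts. The formulas are recalled from the DCell and BCube papers rather than stated on this slide, so treat them as assumptions: a DCell_0 is n servers on one switch and a DCell_k holds t_k = t_(k-1) × (t_(k-1) + 1) servers; a BCube_k built from n-port switches holds n^(k+1) servers and (k+1) × n^k switches. The function names are illustrative.

```python
def dcell_servers(n, k):
    """Servers in a level-k DCell built from n-port switches.

    Assumed recurrence: t_0 = n, t_k = t_(k-1) * (t_(k-1) + 1).
    """
    servers = n
    for _ in range(k):
        servers = servers * (servers + 1)
    return servers


def bcube_servers(n, k):
    """Servers in a BCube_k built from n-port switches (assumed n^(k+1))."""
    return n ** (k + 1)


def bcube_switches(n, k):
    """Switches in a BCube_k: k+1 levels of n^k switches each (assumed)."""
    return (k + 1) * n ** k


print(dcell_servers(6, 3))    # level-3 DCell with 6-port switches
print(bcube_servers(8, 3))    # BCube_3 with 8-port switches
print(bcube_switches(8, 3))
```

DCell grows doubly exponentially in k, so a few levels of small switches already reach millions of servers, while BCube grows more slowly and targets container-sized deployments.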
20. Technologies: research platform
• A DCN research platform
– High performance: comparable to ASIC
– Easy to program: comparable to commodity server
– Rich functions
• Programmable packet forwarding
• Experiment with various control/management functions
• Can implement various routing/congestion control
designs
• ServerSwitch (MSRA, NSDI'11)
21. Applications
• A unified network for both data center and
HPC applications?
                     Data center                  HPC
Topology             Tree-based                   Torus/mesh, fat-tree
Routing              Single-path routing:         Deterministic routing;
                     L2 spanning tree,            per-packet adaptive routing
                     L3 shortest-path routing     to exploit path diversity
Flow control         Packets can be dropped;      No packet drop;
                     end-to-end                   hop by hop
Application support  Search, e-commerce,          Scientific applications
                     cloud computing
Programming API      TCP/IP socket                MPI/RDMA
22. Outline
• DCN background
• Opportunities
• Research challenges
• A modular DCN design
23. Team
• Chuanxiong Guo, Guohan Lu, Haitao Wu,
Yongqiang Xiong
• Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin
Jia, Jun Li
• Alumni
– members: Songwu Lu, Dan Li
– interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang,
Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao
Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce
Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
29. Solution: ServerSwitch
• Software: full programmability at server CPU
– Kernel module for low latency processing
– User space for easy-to-use programmability
• PCI-E: low latency and high throughput interconnection
• Hardware: packet forwarding in commodity switching ASIC
– High performance and limited programmability
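The division of labor on this slide can be illustrated with a toy model: a fixed-function lookup table stands in for the switching ASIC, and an arbitrary Python callable stands in for the programmable code on the server CPU. This is only a sketch of a generic fast-path/slow-path split; the class, its fields, and the policy of installing CPU decisions into the table are hypothetical and are not taken from the ServerSwitch design.

```python
class ToyServerSwitch:
    """Toy fast-path/slow-path model (hypothetical, not the real ServerSwitch)."""

    def __init__(self, slow_path):
        # dst -> output port; stands in for the ASIC's forwarding table.
        self.asic_table = {}
        # callable(dst) -> port or None; stands in for software on the CPU.
        self.slow_path = slow_path
        self.asic_hits = 0
        self.cpu_hits = 0

    def forward(self, dst):
        """Return the output port for dst, or None if the packet is dropped."""
        port = self.asic_table.get(dst)
        if port is not None:
            # Fast path: a plain table lookup, fast but not programmable.
            self.asic_hits += 1
            return port
        # Slow path: the packet crosses to the CPU, where any logic can run.
        self.cpu_hits += 1
        port = self.slow_path(dst)
        if port is not None:
            # Assumed policy: cache the decision so later packets stay in the table.
            self.asic_table[dst] = port
        return port


# Example routing function: an arbitrary, purely illustrative rule.
switch = ToyServerSwitch(lambda dst: len(dst) % 4)
first = switch.forward("server-a")   # decided on the CPU
second = switch.forward("server-a")  # served from the table
print(first, second, switch.cpu_hits, switch.asic_hits)
```

The point of the split is that only the first packet pays the cost of the programmable path, which is why a low latency link between the two halves matters.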
31. Summary
• DCN is an area full of opportunities and
challenges
• The best is yet to come!
• Further information
• http://research.microsoft.com/en-us/projects/msradcn/default.aspx