Microservers in Big Data
Most of the optimization in Big Data solutions today happens in software because the available hardware choices are too similar. Microservers change this status quo. New and differentiated hardware platforms are emerging, and the number of server vendors is increasing. These varied designs offer an unprecedented opportunity to increase performance and lower costs at the same time. However, seizing that opportunity requires a deeper understanding of the application and the hardware, as well as rigorous analysis tools. In this talk, I will walk through some of the trends in microservers and present a simple framework for choosing the best hardware for your deployment. Disclaimer: The views are my own and do not reflect the views of my current or past employers.
2. BACKGROUND
Hardware designer turned parallel programmer
Core microarchitecture and many core design
Worked on parallel programming, compilers, and task scheduling
Distributed application performance
The views are my own and do not represent those of my current or past employers
3. PERFORMANCE OPTIMIZATION
Optimization can happen at each layer of the stack:
App
Middleware (Hadoop, Sector/Sphere, etc.)
OS/Hypervisor
System hardware (CPU, RAM, disk, NIC)
4. THE BIG DATA NEWS
Big Data news and gossip, world exclusives: Microservers in Big Data
And the microserver world is just around the corner
Shipments of microservers will rise threefold this year
Microservers would change the face of computing
Estimates of adoption between now and 2015 vary, but are as high as 49% compound growth rate for microserver adoption
Going green with microservers
5. MICROSERVERS AVAILABLE TODAY
Intel Atom based servers
Calxeda servers
Marvell, TI, nVidia
6. MICROSERVERS: HOW ARE THEY DIFFERENT?
Power-efficient cores
Disk BW/compute
Network bandwidth/compute
TCO-$/compute
7. It is not that simple …
[Figure: system capacity (y-axis, 0 to 9000) versus bandwidth requirement (x-axis, 16 to 16384) for a traditional server and a micro server, with CB and NB regions marked. Annotation: micro server becomes feasible due to cost.]
* CB – CORE BOUND ; NB – NETWORK BOUND
8. WHEN TO USE MICROSERVERS?
When app is bandwidth bound and not CPU bound
When app scales well
When cost and throughput are more important than latency
9.
CPU/Memory: x Data Size
Network BW: y Data Size
Disk Bandwidth: z Data Size
Capacity = MIN ( x, y, z )
1. Difference between x, y, z represents inefficiency
2. Traditional servers had these fixed
3. Microservers will have more choices
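The MIN rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's tool: the function name `node_capacity`, the dictionary keys, and the sample numbers are hypothetical, and x, y, and z are assumed to be measured in the same unit (data size a node can handle per unit time).

```python
def node_capacity(cpu_mem, network, disk):
    """Return (capacity, bottleneck, slack) for one node.

    cpu_mem, network, and disk stand for x, y, and z on the slide and
    must share one unit (for example, data size processed per second).
    """
    limits = {"cpu_mem": cpu_mem, "network": network, "disk": disk}
    # The slowest subsystem sets the capacity of the whole node.
    bottleneck = min(limits, key=limits.get)
    capacity = limits[bottleneck]
    # Slack is what each subsystem could deliver beyond the bottleneck:
    # provisioned but unused, which is the inefficiency on the slide.
    slack = {name: value - capacity for name, value in limits.items()}
    return capacity, bottleneck, slack


# Hypothetical numbers, for illustration only.
capacity, bottleneck, slack = node_capacity(cpu_mem=400, network=100, disk=250)
print(capacity, bottleneck, slack)
```

A platform whose x, y, and z are close together wastes little; a large slack value points at a subsystem that is over-provisioned for this application.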
10. BENCHMARKS
Porting is not always feasible
Use performance monitoring to characterize app
Architecture-independent benchmarks that test sub-systems in isolation:
SPECint rate for CPU/memory
fio (JBOD configuration) for disk
iperf for network
11. Compare Actual Cost (dollars)
Number of servers = Requirement / Capacity per server
Total cost of ownership = (cost per server) x (number of servers)
Don’t forget to future-proof the analysis
The requirements will change
What looks good today won’t look good tomorrow
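The cost comparison above can be sketched as follows. This is a hedged illustration: the platform names, capacities, prices, and requirement values are made up, and rounding the server count up to a whole number is an assumption added here, not something stated on the slide.

```python
import math


def servers_needed(requirement, capacity_per_server):
    """Number of servers = requirement / capacity, rounded up (assumption)."""
    return math.ceil(requirement / capacity_per_server)


def total_cost(requirement, capacity_per_server, cost_per_server):
    """Total cost of ownership = cost per server x number of servers."""
    return servers_needed(requirement, capacity_per_server) * cost_per_server


# Hypothetical platforms: (capacity per server, cost per server in dollars).
platforms = {"traditional": (800, 6000), "micro": (100, 500)}

# Evaluate today's requirement and a larger future one, so the choice is
# checked against growth and not only against the current load.
for requirement in (4000, 8000):
    costs = {name: total_cost(requirement, cap, cost)
             for name, (cap, cost) in platforms.items()}
    print(requirement, costs, "cheapest:", min(costs, key=costs.get))
```

Re-running the loop with several projected requirements is one simple way to future-proof the analysis; the capacity per server should come from the benchmark-based estimate for the actual application, not from a data sheet.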
12. EXPECT
Lots of differentiated platforms
New approaches
Asymmetric Clusters
Dedicated Networks
Shared local disks with remote cores
Optimized appliances
GPGPUs
Hardware accelerators
13.
14. RECOMMENDATIONS
Keep Microservers on your Big Data roadmap
Keep their strengths and weaknesses in mind while you code
Keep your eyes and ears open for things that can make a good benchmark
15. More on this on my blog: futurechips.org
I am just a click away on
www.linkedin.com/pub/aater-suleman/3/21b/ba4
maater@gmail.com