OpenNebulaConf 2016 - VTastic: Akamai Innovations for Distributed System Testing by Jack Wadden, Akamai
1. Vtastic: Innovations In Distributed Systems Testing
Jack Wadden, Sr. Engineering Manager
Akamai Technologies, Inc.
2. ©2015 AKAMAI | FASTER FORWARD™
AKAMAI CDN OVERVIEW
• We Make the Internet Fast, Reliable and Secure
• Globally-Distributed Network of Servers
• Caching Content Close to End Users
• Scalable Live Media Streaming
• Protocol Optimizations
• DNS-Based Load Balancing System
• Chooses the Best Server to Handle Your Requests
3. MASSIVE SCALE
• 15-30% of All Internet Traffic
• 3+ Trillion Hits/day (2 x 10^12)
• 30+ Tbps
• 215,000+ Servers
• Located in 120+ Countries
• 1000+ Software Components
• 100+ Server Roles
5. TESTNETS: AKAMAI’S SYSTEM TEST ENVIRONMENT
6. HOWEVER, AT AKAMAI, TESTNETS ARE A SCARCE RESOURCE
8. AND REQUIRE A HUGE TEAM TO MAINTAIN
11. CONFLICTING USES NEED TO BE COORDINATED
13. FEATURES OF A BETTER TESTNET
• Low barrier to access
• Eliminate coordination
• No-block debugging
• Automation
• Portable, restorable configuration
• Efficient maintenance
• Permit destructive testing
• Optimal platform utilization
15. TESTNET CLONING
[Diagram: Test Harness, VTASTIC Resource Tracker, OpenNebula, Master Storage, Testnet Clones]
16. VTASTIC MASTER TESTNET
• Supported by SME teams
• Running Production Versions
• Vtastic Team Coordinates Changes
• Custom Clones can be Saved, Shared
[Diagram labels: Master, Candidate, Snapshot, Clone, Old Master]
17. CLONES USE PRIVATE IP SPACE
[Network diagram: VLAN #83]
• 100.80.0.8 (MDT)
• 100.80.0.15 (KDC)
• 100.80.0.21 (UMP)
• GWSH, SOCKS
• 172.26.238.16 (NAT Exit)
• 100.80.0.1 (NAT Gateway)
• IP (Anything)
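The addresses on this slide can be checked against the reserved ranges they fall in. The following is a minimal sketch using Python's standard `ipaddress` module. The role labels and addresses come from the slide; the `classify` helper and the range labels are illustrative assumptions, not part of Vtastic. Strictly speaking, 100.80.0.x lies in the 100.64.0.0/10 shared address space (RFC 6598), while 172.26.238.16 lies in the RFC 1918 block 172.16.0.0/12.

```python
import ipaddress

# Reserved ranges relevant to the slide (labels are illustrative).
RANGES = {
    "shared address space (RFC 6598)": ipaddress.ip_network("100.64.0.0/10"),
    "private (RFC 1918)": ipaddress.ip_network("172.16.0.0/12"),
}

# Addresses and roles exactly as shown on the slide.
SLIDE_HOSTS = {
    "MDT": "100.80.0.8",
    "KDC": "100.80.0.15",
    "UMP": "100.80.0.21",
    "NAT Gateway": "100.80.0.1",
    "NAT Exit": "172.26.238.16",
}


def classify(addr: str) -> str:
    """Return the label of the first reserved range containing addr."""
    ip = ipaddress.ip_address(addr)
    for label, net in RANGES.items():
        if ip in net:
            return label
    return "other"


if __name__ == "__main__":
    for role, addr in SLIDE_HOSTS.items():
        print(f"{role:12} {addr:15} {classify(addr)}")
```

Because none of these ranges is routed on the public Internet, every clone can presumably reuse the same internal addresses as long as each clone sits on its own isolated VLAN behind a NAT gateway.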
18. NAT TUNNELING TOOLS
• vpoint: Testnet-Attached bash Shell
  • LD_PRELOAD for Transparent SOCKS Tunneling (dante-client)
  • Proprietary SSH-proxy client
• chrome-vpoint, firefox-vpoint
  • Dedicated browser session with SOCKS configuration
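Transparent SOCKS tunneling means the client library intercepts outgoing connections and speaks the SOCKS protocol to a proxy on the application's behalf. As a rough illustration of what goes over the wire, the sketch below builds the two client messages of a SOCKS5 session without authentication, following RFC 1928. It assumes SOCKS version 5 and an IPv4 destination, neither of which the slides specify; the function names are hypothetical and nothing here reflects the actual vpoint implementation.

```python
import ipaddress
import struct

SOCKS_VERSION = 5   # assumption: the slides do not say which SOCKS version is used
CMD_CONNECT = 1     # CONNECT command
ATYP_IPV4 = 1       # address type: IPv4


def socks5_greeting() -> bytes:
    """Client greeting offering a single auth method: 0x00 (no authentication)."""
    return bytes([SOCKS_VERSION, 1, 0])


def socks5_connect_request(dest_ip: str, dest_port: int) -> bytes:
    """CONNECT request for an IPv4 destination (RFC 1928, section 4)."""
    addr = ipaddress.IPv4Address(dest_ip).packed      # 4 bytes, network order
    port = struct.pack("!H", dest_port)               # 2 bytes, network order
    return bytes([SOCKS_VERSION, CMD_CONNECT, 0, ATYP_IPV4]) + addr + port


if __name__ == "__main__":
    # Illustrative target: a testnet-internal address and the SSH port.
    print(socks5_greeting().hex())
    print(socks5_connect_request("100.80.0.8", 22).hex())
```

An LD_PRELOAD wrapper would send these messages to the proxy inside its replacement for `connect()`, so the wrapped program never needs SOCKS support of its own.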
19. DESIGN APPROACH
• Centrally-Managed Infrastructure
  • Resources Granted to Users/Groups
• Distributed Storage & Compute Platform
  • Commodity Hardware
• Open Source Technology
  • Virtualization: Qemu/KVM
  • Storage: GlusterFS
  • Orchestration: OpenNebula!!
  • Vtastic VRT: Python, Django, Apache
20. SPECS, SCALE
• 40 VM Hosts
  • 32 Cores
  • 128 GB RAM
  • 2 x 10 Gbps Ethernet
  • Average 35 VMs per Host
• 40-50+ Testnets
  • 30-120 Nodes per Testnet
  • 1500-2000+ Total VMs
• 40 Storage Nodes
  • 8 Cores
  • 32 GB RAM
  • 10 Gbps Ethernet
  • 6 x 384 GB SSD + RAID0 = 2.1 TB
  • Total Usable Space = 42 TB
• Master Testnet
  • 120 Nodes
  • ~1.5 TB (After virt-sparsify)
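The figures on this slide can be cross-checked with a little arithmetic. The sketch below is one reading under which the numbers agree, not a statement of how the cluster is actually configured. It assumes that "384 GB" is decimal gigabytes and "2.1 TB" is binary (TiB). It also computes the ratio of raw to usable space, which comes out at 2 and would be consistent with two-way replication, although the slides do not say so. All function names are hypothetical.

```python
def node_capacity_tib(ssds: int = 6, ssd_gb: int = 384) -> float:
    """Raw RAID0 capacity of one storage node, converted from decimal GB to TiB."""
    return ssds * ssd_gb * 10**9 / 2**40


def raw_to_usable_ratio(nodes: int, per_node_tb: float, usable_tb: float) -> float:
    """Raw cluster capacity divided by the usable capacity quoted on the slide."""
    return nodes * per_node_tb / usable_tb


def average_vm_count(hosts: int = 40, vms_per_host: int = 35) -> int:
    """Total VMs implied by the per-host average."""
    return hosts * vms_per_host


if __name__ == "__main__":
    print(f"per-node capacity: {node_capacity_tib():.2f} TiB")
    print(f"raw/usable ratio:  {raw_to_usable_ratio(40, 2.1, 42):.1f}")
    print(f"average VM count:  {average_vm_count()}")
```

The implied average of 1400 VMs sits just below the quoted 1500-2000+ range, which suggests the per-host average and the total were measured at different times or loads.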
21. 1.0: GLUSTER & FUSE
• Backing Files and Scratch Images on Remote Storage
• Qemu Uses POSIX Path (/glusterclient/foo)
• Problems:
  • Memory Leaks, Hangs in GlusterFS FUSE Mount
  • Occasional Loss of VMs
  • Performance Concerns
22. 1.1: GLUSTER DIRECT
• Qemu uses libgfapi (gluster://SERVER:PORT/foo)
• Backing Files and Scratch Images on Remote Storage
• FUSE Mount Used for Image Management
• Problems:
  • Frequent, Catastrophic Loss of VMs
  • Occasional FUSE Mount Problems (Image Management)
23. 1.2: FUSE + LOCAL SCRATCH
• Qemu Uses POSIX Path (/glusterclient/foo) for Backing Image
• FUSE Mount Used for Image Management
• Scratch Images Stored on Local Disk
• Problems:
  • Increased Snapshot Time
  • No Live Migration
  • Occasional FUSE Mount Problems (Image Management)
  • Lack of Trust (VM Loss Experienced before Re-creating Gluster Volume)
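In the 1.2 layout, each VM reads a shared backing image through the FUSE mount and writes its changes to a scratch image on the host's local disk. A minimal sketch of how such a copy-on-write overlay could be created with `qemu-img` follows. It only builds the command line and does not run it. The backing path is the one shown on the slide; the scratch path and the function name are hypothetical, and the use of qcow2 for both images is an assumption the slides do not confirm.

```python
import shlex
from typing import List


def overlay_create_argv(backing_path: str, scratch_path: str,
                        backing_fmt: str = "qcow2") -> List[str]:
    """Build a qemu-img command that creates a qcow2 overlay on a backing image.

    -f sets the format of the new (scratch) image,
    -b names the backing file, and -F states the backing file's format.
    """
    return [
        "qemu-img", "create",
        "-f", "qcow2",
        "-b", backing_path,
        "-F", backing_fmt,
        scratch_path,
    ]


if __name__ == "__main__":
    argv = overlay_create_argv(
        "/glusterclient/foo",                 # backing image via FUSE mount (from the slide)
        "/var/lib/scratch/vm0001.qcow2",      # hypothetical local scratch location
    )
    print(shlex.join(argv))
```

Keeping the overlay on local disk explains two of the listed problems: the scratch data has to be copied off the host to take a snapshot, and a VM cannot be live-migrated while its writable image is visible to only one host.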
24. IN DEVELOPMENT: CEPH
• Static and Scratch Images on Remote Storage
• Live Migration Possible
• Holy Grail, or New Devil?
• Challenges:
  • Learning Curve
  • Ceph Stability?
  • Need Support for Trees of RBD Clones
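The "trees of RBD clones" requirement follows from the cloning model described earlier: a master testnet is cloned, a custom clone can be saved and shared, and that saved clone can be cloned again. The sketch below models that parent/child relationship as a plain data structure so the resulting chain depth is easy to see. It makes no Ceph API calls, and the class, method, and image names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Image:
    """A disk image that may have been cloned from a snapshot of a parent image."""
    name: str
    parent: Optional["Image"] = None

    def clone(self, name: str) -> "Image":
        """Create a child image whose parent is this image."""
        return Image(name=name, parent=self)

    def ancestry(self) -> List[str]:
        """Names from this image up to the root of the clone tree."""
        chain = []
        node: Optional[Image] = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain


if __name__ == "__main__":
    master = Image("master")
    custom = master.clone("custom-clone")   # a saved, shared custom clone
    user = custom.clone("user-clone")       # a clone taken from the saved clone
    print(" -> ".join(user.ancestry()))
```

Every read of an unmodified block in `user-clone` may have to walk this chain back to `master`, so the storage backend needs to handle multi-level parent chains rather than a single layer of clones.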
25. FUTURE POSSIBILITIES
• Incorporating Physical Hardware (Load/Performance Testing)
• Realistic Network Conditions (Latency, Loss)
• Subnetting / Internetworking