Evolution of Netflix's cloud security strategy. Includes cloud-based key management and hybrid security controls that span traditional datacenter and public cloud.
1. Scaling the Cloud
Bill Burns
Director, Information
Security & Networking
CISO Executive Summit
Nov 27, 2012
Thursday, November 29, 12
2. Agenda
• Netflix Background and Culture
• Why We Moved to the Cloud
• InfoSec Challenges, Solutions in a hybrid DataCenter/IaaS Cloud: C.I.A.
• InfoSec Take-Aways: Running In The Cloud
3. Netflix Business
• 30+ million members globally
• Streaming in 51 countries
• 1B hours streamed/month
• Watched on 1000+ devices
• 33% of US peak evening
Internet traffic
(c) 2011 Sandvine
4. Background and Context
• High Performance Culture
• Fail Fast, Learn Fast ... Get Results
• Some core values:
• “Freedom & Responsibility”
• “Loosely-Coupled, Highly-Aligned”
• “Context not control”
5. Engineering-Centric Culture
• Sought the Cloud for Availability, Capacity
• ...and also found Agility
• DevOps / NoOps means engineering teams own:
• New deployments and upgrades
• Capacity planning & procurement
6. Freedom & Responsibility
7. Demand vs Capacity
[Chart annotations: 37x growth in 13 months; then-current DataCenter capacity]
8. Cloud: On-Demand Capacity
[Figure: three plots over time, labeled (1) Demand, (2) # Servers, (3) Utilization]
1. Demand: Typical pattern of customer requests rise & fall over time
2. Reaction: System automatically adds, removes servers to the application pool
3. Result: Overall utilization stays constant
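The demand / reaction / result loop on this slide can be sketched in a few lines. This is only an illustration, not the scaling policy Netflix ran: the function names, the 60% utilization target, the per-server capacity of 100 requests/second, and the sample demand figures are all assumptions made for the example.

```python
import math


def servers_needed(demand_rps, per_server_rps, target_utilization=0.6, min_servers=1):
    """Return a pool size that keeps average utilization at or below the target.

    All parameters are hypothetical: demand and per-server capacity are in
    requests per second, target_utilization is a fraction in (0, 1].
    """
    if per_server_rps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be positive and target within (0, 1]")
    needed = math.ceil(demand_rps / (per_server_rps * target_utilization))
    return max(min_servers, needed)


def utilization(demand_rps, servers, per_server_rps):
    """Fraction of the pool's total capacity consumed by the current demand."""
    return demand_rps / (servers * per_server_rps)


# 1. Demand rises and falls; 2. the pool is resized; 3. utilization stays near target.
sample_demand = [200, 800, 3000, 6000, 3000, 800]  # made-up request rates
for demand in sample_demand:
    pool = servers_needed(demand, per_server_rps=100)
    print(f"demand={demand:5d} rps  servers={pool:4d}  "
          f"utilization={utilization(demand, pool, 100):.2f}")
```

The point of the sketch is that the server count tracks demand, so utilization stays roughly flat even though demand changes by an order of magnitude.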
9. Running In The Cloud :: InfoSec Perspective
10. InfoSec In The Cloud :: Harder
1. “Your IP address attacked me yesterday. Please stop it!”
2. Dealing with other people’s traffic at your front door
3. Herding ephemeral instances with vendor applications
4. Trusting endpoints, infrastructure
5. Key management
11. InfoSec In our Cloud :: Easier
1. Reacting to business velocity
2. Detecting instance changes
3. Application ownership, management
4. Patching, updating
5. Availability, in an environment you don’t control
6. Embedding security controls
7. Least privilege enforcement
8. Testing/auditing for conformance
9. Consistency, conformity in environment
12. InfoSec DevOps :: Staying Relevant
• “Communication is what the listener does” – Mark
Horstman, Manager Tools podcast / Peter Drucker
• My team’s goal: InfoSec program adds value, deeper
part of the business’ success, not a “bolt-on”
• Pain: Learning a new vocabulary, systems thinking
• End result: We like this model a lot!
13. InfoSec Challenges In An IaaS Cloud
• Confidentiality
• Utility
• Integrity
• Authenticity
• Availability
• Possession
14. InfoSec Challenge in an IaaS Cloud :: Availability
15. Availability :: Assume failures
• You’re only good at what you regularly test for
• If you fear a failure mode, find a way to automate a test for that
• Chaos Monkey/Gorilla induce failures, help us practice recovery
• Include security control systems in your failure testing too!
(c) Courtesy Flickr - Winton
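The idea of inducing failures to practice recovery can be shown with a toy simulation. This is not the Chaos Monkey implementation: `Instance`, `ServicePool`, `chaos_round`, the instance IDs, and the `min_healthy` threshold are all hypothetical names invented for this sketch, and "recovery" here is just an in-memory replacement.

```python
import random


class Instance:
    """Hypothetical server instance with a simple alive/dead flag."""

    def __init__(self, instance_id):
        self.instance_id = instance_id
        self.alive = True


class ServicePool:
    """Hypothetical stand-in for an auto-scaled application pool."""

    def __init__(self, size, min_healthy):
        self.instances = [Instance(f"i-{n:04d}") for n in range(size)]
        self.min_healthy = min_healthy
        self._next_id = size

    def healthy(self):
        return [inst for inst in self.instances if inst.alive]

    def replace_dead(self):
        """Simulate automated recovery: swap each dead instance for a fresh one."""
        replaced = 0
        for idx, inst in enumerate(self.instances):
            if not inst.alive:
                self.instances[idx] = Instance(f"i-{self._next_id:04d}")
                self._next_id += 1
                replaced += 1
        return replaced


def chaos_round(pool, rng):
    """Kill one random healthy instance, check the pool still has enough
    healthy members to serve, then let recovery replace the victim."""
    victim = rng.choice(pool.healthy())
    victim.alive = False
    survived = len(pool.healthy()) >= pool.min_healthy
    pool.replace_dead()
    return victim.instance_id, survived


rng = random.Random(42)  # seeded so repeated runs pick the same victims
pool = ServicePool(size=5, min_healthy=3)
for _ in range(3):
    victim_id, survived = chaos_round(pool, rng)
    print(f"killed {victim_id}; service survived: {survived}")
```

A pool sized exactly at its minimum fails the check on the first kill, which is the kind of weakness this style of testing is meant to surface before a real outage does.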
16. The Netflix Simian Army & other Security Controls
• Striving for continuous testing, monitoring
• Identify and test common failure modes
• Automation everywhere
• Chaos Monkey - Randomly kills instances
• Chaos Gorilla - Evacuates entire data centers
• Janitor Monkey - Ensures a clean inventory
• Security Monkey - Various security checks
• Exploit Monkey - Under development
• Critical Systems - File integrity monitoring, HIDS, WAF baked in as needed
17. InfoSec Challenge in an IaaS Cloud :: Integrity
19. Integrity :: Patching
• Goal: Running instances do not get patched
• Alternative:
• Bake a new AMI for any change
• Launch, test new instances in parallel
• Kill the old instances
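The bake / launch / test / kill sequence above can be sketched as follows. This is an illustration only, not Netflix's deployment tooling: `Instance`, `Cluster`, `replace_by_baking`, the `passes_tests` callback, and the image IDs "ami-old" and "ami-new" are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical instance record: which image it was launched from."""
    image_id: str
    in_service: bool = False


@dataclass
class Cluster:
    """Hypothetical group of instances serving one application."""
    instances: list = field(default_factory=list)

    def images_in_service(self):
        return {inst.image_id for inst in self.instances if inst.in_service}


def replace_by_baking(cluster, new_image_id, count, passes_tests):
    """Roll out a change by launching new instances, never by patching old ones.

    Returns True if the cluster was switched to the new image, False if the
    new instances failed testing and the old ones were left untouched.
    """
    # 1. Launch instances from the newly baked image alongside the old ones.
    candidates = [Instance(new_image_id) for _ in range(count)]

    # 2. Test the new instances before they take any traffic.
    if not all(passes_tests(candidate) for candidate in candidates):
        return False  # old instances keep serving; candidates are discarded

    for candidate in candidates:
        candidate.in_service = True

    # 3. Kill the old instances by dropping them from the cluster.
    cluster.instances = candidates
    return True


cluster = Cluster([Instance("ami-old", in_service=True) for _ in range(3)])
switched = replace_by_baking(cluster, "ami-new", 3, passes_tests=lambda inst: True)
print(switched, cluster.images_in_service())
```

Because the old instances are never modified, a failed test leaves the running service exactly as it was, which is what makes this approach attractive from an integrity standpoint.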
20. Integrity :: Upgrades
• Bake a new AMI for
any change
• Launch new instances
in parallel
• Kill the old instances
Lesson Learned: Make the secure-and-
consistent behavior the easier alternative.
21. Embedding Security Controls
• Controls baked into our templates
• Places controls near the data
• Automation ensures coverage as
machines born, replaced
• Security controls are “Data Center
agnostic”
• Provide a single view of attack
surface
• Evolving, work in progress
22. Security Controls: WAF Example
• Sample Control: Web
Application Firewall
• Software-only, baked-in AMI
• Control spans all
environments, regions
• Consistent control, view
• Zero effort for developer to
add protection
23. Automation = Conformity & Consistency
• All apps, tiers are Highly
Available
• Secure defaults applied
automatically
• Replacement instances
look just like the originals
• Includes security controls
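One way to picture "replacement instances look just like the originals" is an automated conformity check over the running inventory. The sketch below is an assumption-laden illustration, not a Netflix tool: the inventory records, the `nonconforming` function, the image IDs, and the set of required control names are all hypothetical.

```python
# Hypothetical set of controls every instance is expected to carry.
REQUIRED_CONTROLS = {"waf", "file_integrity_monitoring", "hids"}


def nonconforming(instances, approved_images, required=REQUIRED_CONTROLS):
    """Return {instance id: [problems]} for instances that drift from the template.

    Each instance is a dict with hypothetical keys "id", "image_id", "controls".
    """
    findings = {}
    for inst in instances:
        problems = []
        if inst["image_id"] not in approved_images:
            problems.append("unapproved image")
        missing = required - set(inst["controls"])
        if missing:
            problems.append("missing controls: " + ", ".join(sorted(missing)))
        if problems:
            findings[inst["id"]] = problems
    return findings


inventory = [
    {"id": "i-0001", "image_id": "ami-base-7",
     "controls": ["waf", "hids", "file_integrity_monitoring"]},
    {"id": "i-0002", "image_id": "ami-base-7", "controls": ["waf"]},
    {"id": "i-0003", "image_id": "ami-handmade",
     "controls": ["waf", "hids", "file_integrity_monitoring"]},
]
for instance_id, problems in nonconforming(inventory, {"ami-base-7"}).items():
    print(instance_id, problems)
```

When every instance is launched from a baked template, a check like this should normally come back empty, so any finding is a meaningful signal rather than routine noise.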
24. InfoSec Challenge in an IaaS Cloud :: Confidentiality/Possession
25. Key Management :: Cloud Hardware Security Modules (HSMs)
• Problem:
• Need crypto keys near the Cloud
• HSMs are in the data center
• Can’t entirely trust our CSP
• Motivation:
• Want to decouple DC and Cloud
• Want to trust our Cloud more fully
• If we want this, others will probably want
it too.
• Solution:
• A real HSM: FIPS 140-2 certified
hardware
• Keys stay in hardware
• “HSM as a Service”
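The "keys stay in hardware" property can be illustrated with a toy software stand-in: callers receive only a key handle and ask the module to perform operations on their behalf. This is not a real HSM or any vendor's API. `SoftHsm` and its methods are hypothetical, HMAC-SHA256 is used only as a convenient example operation, and a Python attribute offers no real protection for key material.

```python
import hashlib
import hmac
import secrets


class SoftHsm:
    """Toy stand-in for an HSM: keys are generated and used inside the object
    and no method ever returns the raw key bytes."""

    def __init__(self):
        self._keys = {}  # label -> key bytes; a real HSM holds these in hardware

    def generate_key(self, label):
        """Create a new key and return only its handle (the label)."""
        if label in self._keys:
            raise ValueError(f"key {label!r} already exists")
        self._keys[label] = secrets.token_bytes(32)
        return label

    def sign(self, label, message):
        """Compute an HMAC-SHA256 tag over message with the named key."""
        return hmac.new(self._keys[label], message, hashlib.sha256).digest()

    def verify(self, label, message, tag):
        """Check a tag in constant time without exposing the key."""
        return hmac.compare_digest(self.sign(label, message), tag)


hsm = SoftHsm()
handle = hsm.generate_key("session-signing")  # hypothetical key label
tag = hsm.sign(handle, b"member=123;plan=streaming")
print(hsm.verify(handle, b"member=123;plan=streaming", tag))
print(hsm.verify(handle, b"member=123;plan=free", tag))
```

The design point is the interface shape: application code in the cloud works with handles and results, so compromising an application instance does not by itself hand over the key.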
26. InfoSec Cloud Take-Aways
• Our cloud operations and DevOps models were disruptive to:
• Engineering, Auditors, Vendors, and other Operations teams
• Our InfoSec team:
• Learned new cloud operational approaches, techniques, our PaaS
• Wrote/consumed APIs and services, learned a new AWS alphabet soup
• Had to tweak most software to fit this model; easier to start cloud first
• Worked with partners to implement new security controls
27. Thank you!
@x509v3
Bill.Burns@Netflix.com