Slides that accompany our YouTube video of a webcast on AWS and cloud cost management best practices - along with a discussion of how containers can help you change the game on cloud cost efficiency.
Webcast: AWS Sticker Shock? How can containers and automation help?
1. AWS Sticker Shock? How can containers and automation help?
Ed Lee
Mukulika Kapas
VOIP or Dial-in (see chat)
Questions? Hit the GTW chat or @applatix
2. But first, some quick housekeeping
• We have about 40 minutes of content with time for questions
• We will post and email a video and slides on Monday
• Post any questions on the GTM chat for us to answer
• If audio fails, let us know on chat! We will dial in again quickly!
• You might hear a train go by at the 24-minute mark, sorry!
March 26, 2017 2
3. Who are we?
Ed Lee, Founder & CTO
Mukulika Kapas, Product Director
4. Cloud disasters
• Financial – Cloud sprawl
• Operational – Unfamiliarity, steep learning curve, SLA breaches
• Business risks – Using cloud like on-prem and not achieving business agility
• Security – Open ports, no regular vulnerability assessment
5. Cloud management framework
• Monitor, Analyze, Optimize, Govern
• Across 3 dimensions: Account, Cost, Resource
• Automation is key to analyze and optimize
6. Monitor & Analyze
• Manageability is key as cloud usage grows
• To bring “Order to Chaos”, you must
Gain visibility across all clouds, accounts, regions and services
Group (tag) resources to gain granular understanding of usage and costs
Gather and analyze real-time usage data
Track both operational and financial metrics
Analyze trends and investigate anomalies
7. Outline
• Cost and usage monitoring & analysis
• How can containers & automation help?
9. Standardize account hierarchy
[Diagram: account hierarchy with nodes AWS Main Account, LOB 1, Project 1, Dev, Prod, Project 2, LOB 2, Project 3, LOB 3]
Best Practice Recommendations
Create and maintain account hierarchy
1) Track resource usage and billing
2) Centrally manage access using groups, roles and policies
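To illustrate how an account hierarchy supports tracking usage and billing, here is a minimal Python sketch that rolls leaf-level costs up to every parent node. The path names, the nesting, and the dollar amounts are hypothetical and are not taken from the webcast; in practice the leaf costs would come from your billing data.

```python
from collections import defaultdict

# Hypothetical monthly costs keyed by account path (LOB/project/environment).
# The nesting and the amounts are illustrative assumptions only.
LEAF_COSTS = {
    "LOB 1/Project 1/Dev": 1200.0,
    "LOB 1/Project 1/Prod": 4300.0,
    "LOB 1/Project 2": 800.0,
    "LOB 2/Project 3": 2500.0,
}


def roll_up(leaf_costs):
    """Return the total cost for every node in the hierarchy.

    Each leaf's cost is added to the leaf itself and to all of its ancestors.
    """
    totals = defaultdict(float)
    for path, cost in leaf_costs.items():
        parts = path.split("/")
        for depth in range(1, len(parts) + 1):
            totals["/".join(parts[:depth])] += cost
    return dict(totals)


totals = roll_up(LEAF_COSTS)
```

With a consistent hierarchy, the same roll-up answers both "what does LOB 1 cost?" and "what does Project 1 Prod cost?" without separate reports.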
10. Use AWS tags
[Diagram: AWS Main Account, AWS Dev Account, AWS Test Account, AWS Prod Account; example tags: Billing (project/owner), Purpose (perf testing), Expiration (2017/05/05)]
Best Practice Recommendations
1) Be consistent and disciplined in applying and using tags
2) Leverage automation to apply tags
3) Follow naming standards for concatenation
4) Use billing tags to generate granular billing and usage reports
11. Example tags
• Name – Used to identify individual resources
• Project/Owner – Useful for billing and point of contact for the resource
• Purpose – What is this resource being used for?
• Expiration – Date when this resource can be freed
• Cluster – Group resources used by distributed applications
• AllowedPorts – 80, 443
• Backup – daily
• Cost Allocation Tags must be activated to be reflected in billing data
• IAM policies can be conditioned on tags
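A tagging scheme like the one above is easiest to enforce with automation. The following Python sketch checks a resource's tags against a required set and tests whether its Expiration date has passed. The choice of required tags and the function names are assumptions made for illustration; the date format follows the 2017/05/05 example from the slides. It works on plain dictionaries, so fetching tags from AWS is left out.

```python
from datetime import date, datetime

# Assumption: these four tags are mandatory in this hypothetical policy.
REQUIRED_TAGS = {"Name", "Project/Owner", "Purpose", "Expiration"}


def missing_tags(tags):
    """Return the required tag keys that are absent, in sorted order."""
    return sorted(REQUIRED_TAGS - set(tags))


def is_expired(tags, today):
    """Return True if the Expiration tag (YYYY/MM/DD) is before `today`."""
    raw = tags.get("Expiration")
    if raw is None:
        return False  # no expiration tag means we cannot call it expired
    expires = datetime.strptime(raw, "%Y/%m/%d").date()
    return today > expires


example = {"Name": "web-1", "Purpose": "perf testing", "Expiration": "2017/05/05"}
```

A scheduled job could run checks like these across all resources and report untagged or expired ones to the listed owner.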
13. Continuously monitor spending
• Analyze trends
• Investigate anomalies
• No substitute for talking to users
• Use AWS Cost Explorer – It’s free!
Provides useful information related to
Reserved Instances
Does not provide hourly granularity
Does not break out enough items
Not so useful spending categorization
• Third party applications/services
Provide more functionality, but $$
15. Set up resource monitoring (AWS CloudWatch)
• Set up AWS CloudWatch
Track real-time resource usage metrics
Monitor custom metrics
Collect and monitor log files
Set alarms
Automate reaction to resource changes
• Free tier of basic monitoring
• Important for rightsizing resources
16. Combine cost & resource usage metrics
• Monitor cost versus resource utilization
Correlate cost vs utilization
o Billing => costs
o CloudWatch => utilization
Why? To find expensive cost buckets that are underutilized
Optimize resource sizing/usage to reduce costs
• Use 3rd party tools or write your own automation scripts
• Other CloudWatch limitations
No application level monitoring & tracing
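As a rough sketch of what "write your own automation scripts" could look like, the Python function below joins per-bucket cost (from billing) with per-bucket average utilization (from monitoring) and flags buckets that are expensive but underutilized. The bucket names, the thresholds, and the use of average CPU as the utilization measure are all assumptions for illustration.

```python
def underutilized_buckets(costs, utilization, min_cost=1000.0, max_util=0.2):
    """Flag cost buckets that are expensive but lightly used.

    costs:       bucket name -> monthly cost in dollars (from billing data)
    utilization: bucket name -> average utilization as a fraction 0.0-1.0
    Returns (bucket, cost, utilization) tuples, most expensive first.
    """
    flagged = [
        (bucket, cost, utilization[bucket])
        for bucket, cost in costs.items()
        if bucket in utilization
        and cost >= min_cost
        and utilization[bucket] <= max_util
    ]
    return sorted(flagged, key=lambda row: row[1], reverse=True)


# Hypothetical inputs; real values would come from billing and CloudWatch.
sample_costs = {"analytics": 5000.0, "web": 3000.0, "sandbox": 200.0}
sample_util = {"analytics": 0.10, "web": 0.65, "sandbox": 0.05}
```

Here only "analytics" is both above the cost threshold and below the utilization threshold, so it is the first candidate for rightsizing.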
17. Reserved Instance (RI) planning
• Convertible RIs are attractive but require a 3-year term
• Sweet spot in many cases is a partial-upfront one-year RI
• Break-even for most partial-upfront one-year RIs is 7 months
14 months for three-year RIs
• Break-even point is more important than term of contract
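The break-even point can be estimated with simple arithmetic: divide the upfront payment by the monthly saving over on-demand. The sketch below assumes a partial-upfront RI with a discounted hourly rate, 730 hours per month, and a `utilization` parameter for the fraction of hours the instance would actually run. All prices are hypothetical, not AWS list prices.

```python
def break_even_months(upfront, ri_hourly, on_demand_hourly,
                      utilization=1.0, hours_per_month=730):
    """Months until a partial-upfront RI becomes cheaper than on-demand.

    On-demand is paid only for hours used (utilization * hours), while the
    RI hourly rate is paid for every hour of the month. Returns None if the
    RI never pays for itself at this utilization.
    """
    on_demand_monthly = on_demand_hourly * hours_per_month * utilization
    ri_monthly = ri_hourly * hours_per_month
    monthly_saving = on_demand_monthly - ri_monthly
    if monthly_saving <= 0:
        return None
    return upfront / monthly_saving


# Hypothetical prices: $250 upfront, $0.05/hour RI rate, $0.10/hour on-demand.
months = break_even_months(upfront=250.0, ri_hourly=0.05, on_demand_hourly=0.10)
```

If the instance runs only part of the time, the break-even point moves out quickly, which is why it matters more than the contract term.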
18. Important Reserved Instance details
• Standard RIs are always bound to a particular instance family
• In the past, RIs had to be bound to a zone within a specific region
Provides a capacity reservation (i.e. you can always start the RI without delay)
Within the same region, the RI’s zone may be manually changed
Within the same instance family, the RI’s size may be manually changed
• More recently, RIs may be bound to a region rather than a zone
No capacity reservation (i.e. there may be a delay before starting an RI)
Automatically applies to instances in the same region regardless of zone (Sep 2016)
Automatically applies to instances of the same family, regardless of size (Mar 2017)
19. Resource & cost usage optimizations
• Right-size your resources (instances, EBS volumes, etc.)
• Take advantage of new regional RI benefits
• Automate optimization of resources with policies
Power down unused resources, e.g. nights and weekends
Delete EBS volumes not attached to EC2 instances
Check for open ports
• Perform “what-if analysis” to optimize use of RIs
Based on the past three months of usage, would another RI have saved money?
Based on projected usage for the next three months, would another RI save money?
• Use spot instances instead of RIs whenever possible
5-10x cheaper than on-demand, 2-3x cheaper than RIs
More flexible than RIs (no term contracts)
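The "what-if" question about RIs can be sketched as a small calculation over historical usage. The function below assumes you have, for each hour in the period, the number of instances that ran on-demand (that is, not already covered by an RI), and an effective hourly RI cost with any upfront payment amortized in. The function name and the prices are hypothetical.

```python
def extra_ri_savings(on_demand_counts, on_demand_hourly, ri_effective_hourly):
    """Estimate what one additional RI would have saved over a past period.

    on_demand_counts: per-hour count of instances billed at on-demand rates.
    The extra RI replaces one on-demand instance in every hour where at
    least one was running, but its cost is paid for every hour regardless.
    A positive result means the RI would have saved money.
    """
    covered_hours = sum(1 for count in on_demand_counts if count >= 1)
    avoided_on_demand = covered_hours * on_demand_hourly
    ri_cost = len(on_demand_counts) * ri_effective_hourly
    return avoided_on_demand - ri_cost


# Hypothetical history: an on-demand instance ran in 80 of 100 hours.
history = [1] * 80 + [0] * 20
saving = extra_ri_savings(history, on_demand_hourly=0.10, ri_effective_hourly=0.06)
```

The same function can be run on a forecast instead of history to answer the forward-looking version of the question.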
21. The next level of agility and efficiency
• For many use cases, bulk of spending is for EC2 instances
• Containers enable higher compute efficiency and density
• With automation, containers enable
On-demand computing
Auto-scaling
o Power off unused resources
o Burst large jobs
Effective use of spot instances
• If you are not continuously scaling your cloud infrastructure to match demand, you are not getting the full benefits of cloud
Containers + Automation => additional 2-5x improvement in efficiency
22. Lift and shift leads to low utilization
• Many enterprises start with ‘lift and shift’ to move to the cloud
• Result: 1 AWS instance per VM, which leads to low utilization
• Low utilization => high costs
• Right sizing becomes important (time consuming, depends on historical load)
• Typical tools for managing VMs/instances: Chef/Puppet
• Example web app: Apache, Java, MySQL
[Diagram: ‘lift and shift’ from On-Premises (multiple VMs, flexible capacity) to AWS (multiple instances, fixed capacity); both sides run Apache 2.x, Java 8.x, MySQL 5.x]
23. A public cloud instance is not a VM!
• Public cloud instance is more like a server than a VM
• Lift and shift ➜ wasted compute resources
• How do Google and Facebook get 80% utilization? Containers!
• Containers are an ideal virtualization technology for the public cloud
[Diagram: On-Premises VMs, utilization 30-40%; Public Cloud instances, utilization 10-20%; Container]
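To see why packing containers onto shared instances raises utilization, consider this minimal Python sketch. It uses first-fit decreasing, a standard bin-packing heuristic chosen here purely for illustration (real container schedulers use more elaborate placement logic), and compares it with the lift-and-shift layout of one workload per instance. The CPU demands and the 4-vCPU instance size are hypothetical.

```python
def pack_first_fit_decreasing(demands, capacity):
    """Place CPU demands onto instances using first-fit decreasing.

    Returns a list of instances, each a list of the demands placed on it.
    """
    instances = []
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError("demand exceeds instance capacity")
        for placed in instances:
            if sum(placed) + demand <= capacity:
                placed.append(demand)
                break
        else:
            # No existing instance has room, so start a new one.
            instances.append([demand])
    return instances


def utilization(instances, capacity):
    """Fraction of total provisioned capacity that is actually used."""
    used = sum(sum(placed) for placed in instances)
    return used / (len(instances) * capacity)


# Hypothetical vCPU demands for eight workloads on 4-vCPU instances.
demands = [1.5, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5]
one_per_instance = [[d] for d in demands]   # lift-and-shift layout
packed = pack_first_fit_decreasing(demands, capacity=4.0)
```

In this made-up example the lift-and-shift layout needs eight instances, while the packed layout needs two, so utilization rises from under 20% to 75%.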
25. Container orchestration is required
• Managing containerized apps at scale requires orchestration
• Requires a high level of automation to use effectively
• Deploy containers and application stacks
• Drive orchestration with code (e.g. YAML)
• Result: High utilization, low cost, application-level visibility, infrastructure-as-code
• Eliminate configuration management tools (Chef/Puppet, etc.)
26. Automatically use spot instances with orchestration
• Spot instances are 5-10x cheaper than on-demand (2-3x cheaper than RI)
Bid for unused EC2 capacity
Hourly price is set by AWS based on supply and demand
AWS may terminate spot instances with lower bids at any time
Applications must be able to tolerate restart after termination
• Good Spot Instance use cases
Batch processing
Any application that can be quickly and reliably restarted
By leveraging containers and automation, spot instances are suitable for most applications
• Effective use of spot instances requires careful orchestration of spot instances and on-demand instances
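One simple way to think about orchestrating spot and on-demand instances together is to keep a small on-demand baseline and fill the rest with spot, falling back to on-demand when spot capacity is short. The Python sketch below shows only that planning step; the function names, the baseline policy, and the prices are assumptions, and it does not model bidding or termination handling.

```python
def plan_capacity(required, min_on_demand, spot_available):
    """Split `required` instances into (on_demand, spot) counts.

    Keeps up to `min_on_demand` instances on-demand as a stable baseline,
    fills the remainder with spot, and falls back to on-demand for any
    shortfall when fewer spot instances are available than needed.
    """
    on_demand = min(required, min_on_demand)
    spot = min(required - on_demand, spot_available)
    on_demand += required - on_demand - spot  # cover the spot shortfall
    return on_demand, spot


def hourly_cost(on_demand, spot, on_demand_price, spot_price):
    """Blended hourly cost for a given mix of instances."""
    return on_demand * on_demand_price + spot * spot_price


# Hypothetical prices: $0.10/hour on-demand, $0.02/hour spot.
mix = plan_capacity(required=10, min_on_demand=2, spot_available=100)
cost = hourly_cost(*mix, on_demand_price=0.10, spot_price=0.02)
```

An orchestrator would rerun this kind of plan whenever spot instances are terminated, restarting the affected containers on whatever capacity remains.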
27. Key Takeaways
• Proper account management is critical to cloud management
• Enable consolidated billing and reporting
• Be consistent and disciplined in tagging resources
• Correlate billing with resource utilization data
• Automate cost and resource utilization mapping
• Take advantage of new regional RI benefits
• Start investigating how to use containers and automation to improve agility and resource efficiency
28. Follow up
• For more resources see http://applatix.com
• Feedback? Questions? info@applatix.com or @applatix
• Our next Webinar: Using Kubernetes on AWS, April 13
Different personas have different needs
Finance needs to allocate cost to different LOBs
Engineering needs to understand spend by product/team
Operations needs to understand how to reduce cost
Standardize on dimensions across the organization
Provide reporting and analytics based on these dimensions
Choose a scheme for tagging your resources
User, project, application etc.
Enable the tags you want in your billing reports
Allows you to group spending by tags
Very useful for analyzing and allocating costs
You are probably already doing most of these, but in case you are not, we strongly encourage you to do so.
Analytics across AWS accounts – each account can be a logical group like business unit, teams, environment or product
Analytics by region
Analytics by AWS services
Analytics by different resource ids/subtypes for a specific AWS service
EC2 instance types
S3 operations
Analytics by tags
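The analytics dimensions listed above (account, region, service, tags) can all be served by one grouping function over billing line items. The Python sketch below assumes each line item is a dictionary with `account`, `region`, `service`, `cost`, and an optional `tags` mapping; that record shape and the `tag:` prefix convention are hypothetical, not an AWS billing format.

```python
from collections import defaultdict


def spend_by(line_items, dimension):
    """Total cost grouped by a dimension.

    `dimension` is a field name such as "account", "region" or "service",
    or "tag:<key>" to group by the value of a tag. Items lacking the field
    or tag are grouped under a placeholder label.
    """
    totals = defaultdict(float)
    for item in line_items:
        if dimension.startswith("tag:"):
            key = item.get("tags", {}).get(dimension[4:], "(untagged)")
        else:
            key = item.get(dimension, "(unknown)")
        totals[key] += item["cost"]
    return dict(totals)


# Hypothetical line items for illustration.
items = [
    {"account": "dev", "region": "us-east-1", "service": "EC2", "cost": 10.0,
     "tags": {"Project": "alpha"}},
    {"account": "prod", "region": "us-west-2", "service": "EC2", "cost": 5.0,
     "tags": {"Project": "alpha"}},
    {"account": "prod", "region": "us-west-2", "service": "S3", "cost": 2.5},
]
```

Calling `spend_by(items, "service")` or `spend_by(items, "tag:Project")` gives the per-service and per-project views from the same data, and the "(untagged)" group shows how much spend cannot yet be allocated.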