AWS User Group presentation on the AWS Well-Architected Framework. Includes items from the framework and additional best practices I have implemented (identified by slides with (Me) in the title).
2. @joehack3r
Disclaimer
• Any views or opinions represented are my own and
do not necessarily represent those of people,
institutions or organizations that I am or have been
associated with in any professional or personal
capacity.
3. @joehack3r
What is Well-Architected?
My Simplest Definition:
Designing a product or service in a
manner to meet the customer's needs
while balancing trade-offs.
13. @joehack3r
AWS Well-Architected
Framework
• Based on AWS experts working with thousands of
customers
• Learn about new or different ways of thinking
• Evaluate your environment against AWS best
practices
15. @joehack3r
Security Pillar
• Encrypt everything in transit and at rest
• Log everything (CloudTrail, VPC Flow, S3, Config,
etc.)
• Security groups (firewall) and NACL at all layers
• Principle of least privilege
19. @joehack3r
Security Pillar (Me)
• CloudTrail
(this command, once used to create a trail in every region, is now moot
thanks to the “Apply trail to all regions” option in the console)
myS3LogBucket=my-test-bucket-2718
# List every region, pull the name out of the JSON, create one trail per region
aws ec2 describe-regions --output json |
  grep RegionName | awk -F'"' '{print $4}' |
  while read region; do
    aws cloudtrail create-subscription --name "Default" \
      --s3-use-bucket "$myS3LogBucket" --region "$region"
  done
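For comparison, the console option mentioned on this slide has a CLI counterpart. The following is a minimal sketch, not the command from the talk. It assumes the bucket already exists with a bucket policy that lets CloudTrail write to it, and the function name `create_all_region_trail` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: one multi-region trail instead of one trail per region.
# Assumes the S3 bucket exists and its policy allows CloudTrail to write to it.
create_all_region_trail() {
  local trail_name="$1" bucket="$2"
  aws cloudtrail create-trail \
    --name "$trail_name" \
    --s3-bucket-name "$bucket" \
    --is-multi-region-trail &&
  # create-trail does not start logging by itself.
  aws cloudtrail start-logging --name "$trail_name"
}

# Example (names reused from the slide):
# create_all_region_trail Default my-test-bucket-2718
```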
28. @joehack3r
Cost Optimization Pillar (Me)
• Multiple Billing alerts
• Tags in billing report
• Janitor Monkey with Edda
• Made it easy to use Spot instances
29. @joehack3r
My Practices
• Lots of CloudFormation
• Parameterize AMI, Instance Type, AZs, etc.
• CI/CD Application Software and Infrastructure
• VPC
• ELB and ASG everything
30. @joehack3r
My Practices
• Work closely with our Solutions Architect
• Research and demo new AWS services
• Attend DevOpsDays, hackathons, re:Invent
• Follow Netflix Tech Blog and others
31. @joehack3r
Suggested Next Steps
• Read the announcement and PDF
http://bit.ly/aws-well-architected
• Read AWS Architectures and White Papers
https://aws.amazon.com/architecture/
https://aws.amazon.com/whitepapers/
• Review with SA, TAM, consulting partner, etc.
Editor's notes
Blame me for any errors or content you don’t like.
Balancing trade-offs is the key.
Let’s use an example
Consider the three little pigs. They need shelter from the rain and a place to sleep.
For this requirement, best solution is the straw house.
What about a new requirement: protection from the big bad wolf?
This adds an additional cost.
New best solution is the brick house.
Different situations result in different optimal solutions.
Large variety of “costs”:
Costs can come from infrastructure, support, upgrades, development, reliability, reputation, etc.
Some costs are one-time, some are recurring, some change over time.
Some costs we do not know about. Some costs we cannot predict or measure accurately.
“What if” the wolf retired and there are no other predators, natural disasters, etc. to worry about?
Being well-architected is looking at and understanding the big picture - including the unknowns - and making the best decision.
Cloud lets you build what you need when you need it. You don’t have to build the brick house right away. You can also switch back to the straw house easily.
One more thing…
Know the difference between what you want, what you need, and what you can afford.
AWS provides lots of cloud services. They also provide tools to help build and maintain your usage of the cloud services. It is part of our responsibility as engineers and architects to understand and use these services and tools properly. To help us, AWS introduced the Well-Architected Framework.
Highly recommend downloading the PDF.
This presentation is an overview and a subset of the material covered in the PDF.
Notice: “This document is provided for informational purposes only”
“Customers are responsible for making their own independent assessment”
You should know your requirements, limitations, etc. better than anybody.
It is not a recipe that guarantees anything.
If it does not guarantee anything, why should you listen?
Use a sports analogy:
Nobody achieves excellence without a coach.
Every successful athlete has multiple coaches. AWS experts are one of our coaches.
You get better by observing and learning from others. Who knows what the Fosbury flop is?
You get better by testing yourself against others.
While the AWS Well-Architected Framework provides these, you are still responsible for your performance on the field.
We’ll go into a high level summary of the pillars.
Security: Protect information, systems, and assets
Reliability: Recover from failures and acquire resources to meet demand
Performance: Use resources efficiently and maintain that efficiency
Cost: Avoid or eliminate unneeded cost or suboptimal resources
Security: Protect information, systems, and assets
Encrypted boot volumes, EBS volumes, S3, RDS, etc.
Mention Alert Logic for CloudTrail analysis.
Separate security groups for each ELB, ASG, RDS. SG tied to role, not ports.
Principle of least privilege - restrict to single bucket, read-only
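As an illustration of "single bucket, read-only", here is a minimal sketch of an IAM policy generator. The function name, role name, and policy name are hypothetical, and a real policy may need further conditions (for example on prefixes or encryption).

```shell
#!/usr/bin/env bash
# Sketch: print an IAM policy that allows read-only access to one bucket.
read_only_bucket_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Example (hypothetical role and policy names):
# aws iam put-role-policy --role-name my-app-role \
#   --policy-name read-one-bucket \
#   --policy-document "$(read_only_bucket_policy my-test-bucket-2718)"
```

Listing applies to the bucket ARN and reading applies to the objects under it, which is why the policy has two statements with different resources.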
Who has an account old enough for EC2-Classic?
Who is still using EC2-Classic resources?
Who is using Default VPC?
Why?
Who is using roll your own VPC?
What is keeping people from using roll your own VPC?
Reasons for VPC: internal ELB, newer instance types, VPN peering, greater isolation (may be required for some compliance regimes such as PCI and HIPAA)
Mention Alert Logic as being able to help with VPC
Reliability: Recover from failures and acquire resources to meet demand
How many people have run into an AWS limit or requested one to be increased?
You run into a limit when you can least afford it.
Everything should be in ASG
If you use single region, test migration to another region
If you use multiple regions, kill single region (sadly, Chaos Kong is not open source)
Practice restoring production to lower tier or different region.
Change Management: Automate deployments and patching
Message me on twitter and I’ll see about sharing an AWS limit script
September Meetup has CFN template for Chaos Monkey
Automated Recovery: if it’s down, detect it (we use ELB health check), and replace it automatically. Because it uses an ASG, we can detach the instance and save it for triage.
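The detach-for-triage step could look like the sketch below. This is an assumption about how one might do it with the AWS CLI, not the tooling used by the presenter. The function name and the example IDs are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pull an instance out of its ASG so it can be inspected later.
# Keeping the desired capacity unchanged makes the ASG launch a replacement.
detach_for_triage() {
  local instance_id="$1" asg_name="$2"
  aws autoscaling detach-instances \
    --instance-ids "$instance_id" \
    --auto-scaling-group-name "$asg_name" \
    --no-should-decrement-desired-capacity
}

# Example (hypothetical IDs):
# detach_for_triage i-0123456789abcdef0 my-web-asg
```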
Automated Deployment: Lightning talk at DevOps meetup on February 8.
Performance: Use resources efficiently and maintain that efficiency
Benchmark different instance types: general purpose vs. memory optimized vs. compute optimized vs. storage optimized vs. GPU
Who has installed MySQL or PostgreSQL on EC2 instances? RDS better solution?
Use right service - Lambda, ElastiCache, etc.
Can’t beat physics - put the resources close to the customer.
m4.large instance type costs less than m3.large instance type.
Ability to change instance type easily!
Monitor the environment, use different instance families and types
RIs offer up to 75% off on-demand pricing.
Tags: product, environment, cost center (owner)
Everybody should have at least 1 billing alert
Running non-production environments only 40 hours a week saves roughly 75% compared with 24x7 (40 of 168 hours).
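The savings figure comes from simple arithmetic on a 168-hour week. A small sketch, with a made-up function name, that assumes cost scales linearly with running hours:

```shell
#!/usr/bin/env bash
# Sketch: percent saved by running only some hours of a 168-hour week.
# Assumes cost is proportional to hours running (ignores storage, etc.).
savings_percent() {
  local hours_on="$1"
  awk -v h="$hours_on" 'BEGIN { printf "%.1f\n", (1 - h / 168) * 100 }'
}

savings_percent 40   # business hours only; prints 76.2
```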
m4.large instance type costs less than m3.large instance type.
Billing account has two alarms: Mid-month and end-month
Individual accounts have single alarm:
Slightly more than expected monthly spend
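A billing alarm of this kind can be created from the CLI. The sketch below is one possible way, not the presenter's setup. It assumes billing alerts are enabled on the account and that an SNS topic already exists. The function name, thresholds, and ARN are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: alarm when estimated charges pass a threshold (USD).
# Billing metrics are published in us-east-1 only.
create_billing_alarm() {
  local name="$1" threshold="$2" sns_topic_arn="$3"
  aws cloudwatch put-metric-alarm \
    --region us-east-1 \
    --alarm-name "$name" \
    --namespace AWS/Billing \
    --metric-name EstimatedCharges \
    --dimensions Name=Currency,Value=USD \
    --statistic Maximum \
    --period 21600 \
    --evaluation-periods 1 \
    --threshold "$threshold" \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$sns_topic_arn"
}

# Example (hypothetical thresholds and topic):
# topic=arn:aws:sns:us-east-1:123456789012:billing-alerts
# create_billing_alarm billing-mid-month 500 "$topic"
# create_billing_alarm billing-end-month 1000 "$topic"
```

Because EstimatedCharges grows through the month, two thresholds on the same metric give the mid-month and end-month alarms described above.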
Janitor monkey & Edda to clean-up resource sprawl
CloudFormation & Spot price parameter
Nearly everything we have is in CloudFormation. Exceptions: S3 buckets, DNS, DynamoDB?
Parameters also include ELB security policy, TLS certificate…anything we expect needs to change
CI/CD will be discussed at 2/8 DevOps meetup
Even a single instance is put behind an ELB and in an ASG
Some companies have weekly calls with AWS about roadmap (under NDA) and get early access to products.
Even if Netflix tools aren’t applicable, their challenges, solutions, and reasoning provide phenomenal learning opportunities.