2. Who am I?
• Software Engineer at Amazon Web Services (Developer Evangelist)
• Previously of SpaceX and NASA
• Please email me about literally anything… People never want to talk
about anything anymore: randhunt@amazon.com
• Major thanks to:
• Matthias Jung, Peter Dalbhanjan and others who contributed to these slides
3. Agenda
• Why CloudFormation?
• Vocabulary
• How to plan my stacks?
• How to get started?
• How to prevent errors?
• How to safely update stacks?
• How to extend CloudFormation
• SAM
• YAML
• Cross-Stack references
5. Setting Up an Application
Setup Load Balancer
Configure Servers
Setup Database
…
Configure Network & Firewalls
Configure Access Rights
Series of Operational
Tasks
6. Setting Up an Application
Launch ELB
Launch EC2 Instances
Launch RDS Instance
…
Configure VPC
Define IAM Users
Series of API
Calls to AWS
7. Setting Up an Application
Launch ELB
Launch EC2 Instances
Launch RDS Instance
…
Configure VPC
Define IAM Users
Series of API
Calls to AWS
AWS CLI & SDKs
8. Setting Up an Application
ELB
EC2 Instances
RDS Instance
…
VPC
IAM Users
Template of
Resources
20. Think Services & Decouple
Food Catalog
website
Ordering website
Customer DB service
Inventory service
Recommendations
service
Analytics service Fulfillment
service
Payment
service
46. Debugging Tips
• Deactivate Rollback Flag during tests
• Put “breakpoints” via WaitConditions
• Test user data & scripts separately, e.g. Moustache
• Log stack events in DWH or logging service
• Use CloudTrail and AWS Config to track changes
• Redirect local Cfn log files to CloudWatch Logs
51. Review Updates
• What is going to be updated?
• Preview Feature with Change Sets
• Pay attention to impact on Related Resources
• Ref and Fn::GetAtt
• Check for Update Mode
• No Interruption
• Some Interruption
• Replacement
• Check for Drift
CloudFormation allows you to declaratively model your infrastructure’s architecture in a template.
For example, the template for a simple web application could include Amazon EC2 instances, an Elastic Load Balancer, and an Amazon RDS instance.
For more complicated architectures, it can also include much more, such as Lambda functions, SNS topics, DynamoDB tables, or IAM policies.
Once you have finished authoring your template, you upload it to CloudFormation, and we take care of all the fine details of provisioning the infrastructure into what we call a stack.
With CloudFormation, you don’t need to worry about the ins and outs of each service’s API; we take care of that for you.
Once your infrastructure has been provisioned, you can make changes to it by modifying your template, and CloudFormation will work out how to apply those changes to your infrastructure.
As we will discuss in this presentation, this process can be automated in your existing deployment pipelines with tools like Jenkins. The templates can also be included in your existing development processes: stored in source control and code reviewed.
Why CloudFormation?
In the old world with traditional hardware, setting up an application consisted of a series of operational tasks
Executed mostly manually or semi-automatically
You could do the same thing in the cloud: go to the console and configure a VPC, launch an ELB, etc
But as you know: all AWS services are programmable and have APIs
Each of those clicks in the console also triggers an API call to AWS
So it’s much smarter not to do those tasks manually, but to fully automate them
You can write scripts leveraging our CLI or our SDKs
Still there are a couple of things you need to deal with
For example: failure handling; you need to keep track of which resources have already been created and tear them down again
You also need to be able to deal with modifications to your infrastructure and carefully track and test changes and their impact
You also need to manage state and deal with dependencies between your resources, e.g. if the application servers need the database endpoint, the database must be created and running first
With CloudFormation, you don’t need to write code to manage your resources
Instead, you just declare the resources that make up your application in a JSON template
You give that template to CloudFormation, which then instantiates all the resources
When there’s an error during the process, CloudFormation tears down all resources so that you are not left with half an application
When you want to change something, you just modify your template; CloudFormation detects the changes and applies them
CloudFormation also manages the state and the dependencies of the resources for you
This is the basic structure of a CloudFormation template
As I said, it’s in JSON
There’s a section where you define parameters; these can be referenced in the template and thus make it reusable in different contexts, environments, or for different applications
You have a section with mappings, a kind of simple hash that adds some useful logic to the template
You have a section where you can define conditions: for example, only create certain resources if there’s a parameter that indicates that this is a test stack
Then there is this large part where you define all resources of your stack
And finally there are also output values that you can define, which are returned after running CloudFormation and which you can work with
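As a rough illustration of the sections just described, here is a minimal sketch that builds a template as a Python dict and serializes it to JSON. The names Environment, RegionMap, IsTest, and LogBucket are made up for this example and are not from the slides.

```python
import json

# Hypothetical skeleton: one entry per template section described above.
skeleton_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Skeleton showing the main template sections",
    # Parameters: inputs that make the template reusable.
    "Parameters": {
        "Environment": {
            "Type": "String",
            "Default": "test",
            "AllowedValues": ["test", "prod"],
        },
    },
    # Mappings: a simple two-level lookup table.
    "Mappings": {
        "RegionMap": {
            "us-east-1": {"BucketPrefix": "use1"},
            "eu-west-1": {"BucketPrefix": "euw1"},
        },
    },
    # Conditions: here, true only when this is a test stack.
    "Conditions": {
        "IsTest": {"Fn::Equals": [{"Ref": "Environment"}, "test"]},
    },
    # Resources: the bulk of a real template; this bucket exists only in test stacks.
    "Resources": {
        "LogBucket": {"Type": "AWS::S3::Bucket", "Condition": "IsTest"},
    },
    # Outputs: values returned once the stack has been created.
    "Outputs": {
        "LogBucketName": {"Condition": "IsTest", "Value": {"Ref": "LogBucket"}},
    },
}

skeleton_body = json.dumps(skeleton_template, indent=2)
print(skeleton_body)
```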
Here are the key benefits of CloudFormation
Automation is obviously one of the key benefits of CloudFormation: creation, update, and deletion of an application or infrastructure
But it is even more powerful to use it to manage all your infrastructure: commit, version, and roll back just as with application code, to track changes and test them extensively before putting them into production
Creation is atomic, so you get deterministic behavior: either your application started up successfully or it did not, and in the latter case you don’t have any orphaned resources floating around
The templates can be used as blueprints inside or across organizations; you can share or enforce best practices
Some softer advantages are
that Cfn is highly configurable,
closely integrated with all AWS services,
allows you to follow a modular approach to infrastructure management and provisioning
and you can get started quickly and get an application running, compared to selecting the right services and putting something together yourself
There’s a ton of different use cases for Cfn
Many of them we didn’t even think of
The development process that you use for developing business logic can be the same as the one you use when writing CloudFormation templates.
You start off with your favorite IDE or text editor to write the code: Eclipse, Vim, or Visual Studio
You then commit the template to your source code repository using your usual branching strategy
and then have the template reviewed as part of your typical code review process.
The template is then integrated and run as part of your CI and CD pipelines.
Because a template is simply a JSON document, you can even write unit tests for your templates. When developing a CloudFormation template, you can use all of your normal software engineering principles
At the end of the day
It’s all software: a template can be reused across applications, just like code libraries, and a stack can be shared by multiple applications.
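To make the unit-testing idea a bit more concrete, here is a minimal sketch of a check that could run in a CI pipeline. The rule itself (only approved instance types), the template content, and all names are assumptions chosen for illustration; the same function could be wrapped in whatever test framework you already use.

```python
import json

# Hypothetical template under test; in practice you would load it from a file.
TEMPLATE_JSON = """
{
  "Resources": {
    "WebServer": {
      "Type": "AWS::EC2::Instance",
      "Properties": {"InstanceType": "t2.micro", "ImageId": "ami-12345678"}
    },
    "AppBucket": {"Type": "AWS::S3::Bucket"}
  }
}
"""

# Assumed team policy, not an AWS rule.
APPROVED_INSTANCE_TYPES = {"t2.micro", "t2.small"}


def find_violations(template):
    """Return one message per EC2 instance with a literal, unapproved type."""
    violations = []
    for name, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::EC2::Instance":
            continue
        instance_type = resource.get("Properties", {}).get("InstanceType")
        # Only literal strings are checked; values such as {"Ref": ...} are skipped.
        if isinstance(instance_type, str) and instance_type not in APPROVED_INSTANCE_TYPES:
            violations.append(f"{name}: instance type {instance_type!r} is not approved")
    return violations


def test_only_approved_instance_types():
    assert find_violations(json.loads(TEMPLATE_JSON)) == []


test_only_approved_instance_types()
```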
Resources: for example EC2 instances or a VPC.
Parameters: a way to ask for user input when a stack is created from the template. Each parameter has a list of attributes with values and constraints. User inputs can be instance types, key names, VPC IDs, database usernames and passwords, etc.
Notice that KeyName doesn’t have a Default attribute while EC2InstanceType does. CFn fails to create a stack if no value is supplied for a parameter without a default. You will also notice that the key names are presented as a drop-down list to choose from
Another neat feature: we are forcing users to choose from three instance types. So you can restrict your templates to specific values if needed.
Outputs: a way to return values from your CFn stack. This is where results such as website URLs go, along with any resources you created that are useful for other stacks
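The slide’s template is not reproduced in these notes, so the following is only a sketch of what such a template might look like. The three instance types, the AMI ID, and the resolve_parameters helper are assumptions; the helper merely imitates locally how defaults and allowed values behave and is not CloudFormation’s own logic.

```python
web_template = {
    "Parameters": {
        # No Default: a value must be supplied when the stack is created.
        "KeyName": {
            "Type": "AWS::EC2::KeyPair::KeyName",
            "Description": "Existing EC2 key pair for SSH access",
        },
        # Default plus a restricted list of three choices.
        "EC2InstanceType": {
            "Type": "String",
            "Default": "t2.micro",
            "AllowedValues": ["t2.micro", "t2.small", "t2.medium"],
        },
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-12345678",  # placeholder AMI ID
                "InstanceType": {"Ref": "EC2InstanceType"},
                "KeyName": {"Ref": "KeyName"},
            },
        },
    },
    "Outputs": {
        "WebsiteURL": {
            "Value": {"Fn::Join": ["", ["http://", {"Fn::GetAtt": ["WebServer", "PublicDnsName"]}]]},
        },
    },
}


def resolve_parameters(template, supplied):
    """Local imitation: apply defaults and enforce AllowedValues."""
    resolved = {}
    for name, spec in template["Parameters"].items():
        if name in supplied:
            value = supplied[name]
        elif "Default" in spec:
            value = spec["Default"]
        else:
            raise ValueError(f"Parameter {name} has no default and no value was supplied")
        allowed = spec.get("AllowedValues")
        if allowed is not None and value not in allowed:
            raise ValueError(f"Parameter {name} must be one of {allowed}")
        resolved[name] = value
    return resolved


print(resolve_parameters(web_template, {"KeyName": "my-key"}))
```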
When designing the architecture for your business, the first question you might have is: how do you plan your stacks?
Example: one stack per account, per application, or per application layer? What can be reused?
Here are a couple of patterns from our customers
One obvious way to plan for stacks is to look at different application layers
Different layers can have different life cycles: for example, a network stack needs far fewer updates than a front end
Different layers also require different expertise: for a network stack, you need network administrators; for a front-end service, application administrators
Both make layers a good way to organize CloudFormation stacks and templates
You might also ask the question of reusability: can a template be reused in different stacks? When does it make sense to split a template into several? The tradeoffs are similar to design decisions in object-oriented programming
Once you have a layered architecture, you would want to reuse those same templates to replicate it in multiple environments or regions.
One of the benefits of infrastructure-as-code is that you can easily model service-oriented architecture.
i.e. organizing a big business problem into manageable parts. In this example, we are organizing a food ordering business.
Each service is a self-contained unit of functionality, loosely coupled with other services. The services have clearly defined contracts to interact with each other.
We see this working for our customers. When you are using CloudFormation, you map these services onto stacks, and you can create these well defined relationships across stacks.
For example, you might have a food catalog stack that depends on a customer db stack. You would use the stack outputs and parameters to create the relationship between the stacks.
Food catalog needs the customer db endpoint. So, you can publish it in the outputs of the customer DB stack and pass it on as an input parameter when you create a food catalog stack.
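The hand-off from outputs to parameters can be sketched as a small transformation. The sample data below follows the general shape of a DescribeStacks response (an Outputs list of OutputKey/OutputValue pairs) and of CreateStack parameters (ParameterKey/ParameterValue pairs); the stack name, the endpoint value, and the key names are invented for this example.

```python
# Hypothetical response for the customer DB stack, trimmed to the relevant fields.
customer_db_response = {
    "Stacks": [
        {
            "StackName": "customer-db",
            "Outputs": [
                {"OutputKey": "DBEndpoint", "OutputValue": "customers.example.internal"},
                {"OutputKey": "DBPort", "OutputValue": "3306"},
            ],
        }
    ]
}


def outputs_to_parameters(describe_stacks_response, mapping):
    """Turn selected stack outputs into a parameter list for another stack.

    mapping maps an output key of the producing stack to the parameter
    name expected by the consuming stack.
    """
    stack = describe_stacks_response["Stacks"][0]
    outputs = {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
    return [
        {"ParameterKey": parameter_name, "ParameterValue": outputs[output_key]}
        for output_key, parameter_name in mapping.items()
    ]


food_catalog_parameters = outputs_to_parameters(
    customer_db_response, {"DBEndpoint": "CustomerDBEndpoint"}
)
print(food_catalog_parameters)
```

The resulting list is what you would hand to the stack creation call for the food catalog stack, so the relationship between the two stacks lives in your deployment script.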
How can reuse of CloudFormation templates be fostered?
Let’s take the following example
We have two web-applications that have a similar structure
One uses RDS, the other one DynamoDB as backend
So we could put the front-end part into one template
And the backend part in a different template
You could pass information from the output of the backend-stack creation to the creation parameter of the frontend stack
But you can also use the Nested Stack feature of CloudFormation
You would reference the front-end template from the back-end template
When the backend-template is instantiated, it also instantiates a front-end part
You can still customize the ELB & Auto Scaling for each website by using parameters.
Advantage: you explicitly express and maintain the dependencies between different templates
Another big advantage of using nested stacks is that it supports role-specialization
You can have people author templates for their area of expertise and still create a combined stack by nesting the templates.
So this guy is a front-end developer responsible to maintain the front-end stack
And this lady is responsible for the backend part. Using nested stacks, she can create a combined stack including the frontend part without touching this frontend template
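A nested stack is declared as an ordinary resource of type AWS::CloudFormation::Stack. The sketch below shows a back-end template that pulls in a front-end template; the S3 URL, the parameter names, and the database settings are placeholders, and it assumes the front-end template declares an output called WebsiteURL.

```python
import json

backend_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "DBPassword": {"Type": "String", "NoEcho": True},
    },
    "Resources": {
        "Database": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {
                "Engine": "mysql",
                "DBInstanceClass": "db.t2.micro",
                "AllocatedStorage": "20",
                "MasterUsername": "appuser",
                "MasterUserPassword": {"Ref": "DBPassword"},
            },
        },
        # The front-end template is maintained separately and referenced by URL.
        "Frontend": {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {
                "TemplateURL": "https://s3.amazonaws.com/example-bucket/frontend.template",
                # Parameters customize the shared front end for this website.
                "Parameters": {
                    "DBEndpoint": {"Fn::GetAtt": ["Database", "Endpoint.Address"]},
                    "MaxInstances": "4",
                },
            },
        },
    },
    "Outputs": {
        # Outputs of the nested stack are read with Outputs.<name>.
        "WebsiteURL": {"Value": {"Fn::GetAtt": ["Frontend", "Outputs.WebsiteURL"]}},
    },
}

print(json.dumps(backend_template, indent=2))
```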
Two issues: no explicit dependencies + no access to resources within another stack
Addressed by cross stack references
The app stack can import the values without the need to define in parameters
The values exported by the network stack cannot be changed or removed while the app stack still references them
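Here is a minimal sketch of a cross-stack reference: the network template exports values under a name, and the app template reads them with Fn::ImportValue. The export names and resources are hypothetical, and the two helper functions are only a local convenience for seeing which names couple the stacks; they handle literal import names only.

```python
network_template = {
    "Resources": {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": "10.0.0.0/16"}},
        "PublicSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {"VpcId": {"Ref": "Vpc"}, "CidrBlock": "10.0.0.0/24"},
        },
    },
    "Outputs": {
        "VpcId": {"Value": {"Ref": "Vpc"}, "Export": {"Name": "network-VpcId"}},
        "PublicSubnetId": {
            "Value": {"Ref": "PublicSubnet"},
            "Export": {"Name": "network-PublicSubnetId"},
        },
    },
}

app_template = {
    "Resources": {
        "WebSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Web tier",
                # No parameter needed: the value is imported by export name.
                "VpcId": {"Fn::ImportValue": "network-VpcId"},
            },
        },
    },
}


def exported_names(template):
    """Names this template exports through its Outputs section."""
    outputs = template.get("Outputs", {}).values()
    return {o["Export"]["Name"] for o in outputs if "Export" in o}


def imported_names(node):
    """Literal export names referenced via Fn::ImportValue anywhere in node."""
    names = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Fn::ImportValue" and isinstance(value, str):
                names.add(value)
            else:
                names |= imported_names(value)
    elif isinstance(node, list):
        for item in node:
            names |= imported_names(item)
    return names


print(imported_names(app_template) & exported_names(network_template))
```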
Now that you know how to structure your application stacks
What’s the best way to get started with CloudFormation?
When you are using CloudFormation, as with any other software development, you go through the process of coding, testing, hitting errors, debugging, and ultimately getting to a stack that works as expected.
Are there any ways to minimize the errors that you encounter? Are there ways to make that process faster? Sure there are.
Use comments
With JSON it’s not as nice as in a programming language, but you can still add a comment attribute in the Metadata element of a resource to add comments
Make sure you validate your templates using the ValidateTemplate API.
This will help you identify the JSON syntax errors,
make sure the template sections like Parameters and Resources are structured properly and there are no circular dependencies.
If you are using the console, this is done for you automatically.
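To give a feel for what such a validation covers, here is a rough local pre-check. It is not the ValidateTemplate API and does far less: it parses the JSON, requires a Resources section, and looks for circular dependencies formed by Ref, Fn::GetAtt, and DependsOn. The function names and the sample template are made up, and other intrinsic functions such as Fn::Sub are ignored.

```python
import json


def find_refs(node):
    """Collect logical names referenced via Ref or Fn::GetAtt anywhere in node."""
    refs = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str):
                refs.add(value)
            elif key == "Fn::GetAtt" and isinstance(value, list) and value:
                refs.add(value[0])
            else:
                refs |= find_refs(value)
    elif isinstance(node, list):
        for item in node:
            refs |= find_refs(item)
    return refs


def precheck(template_body):
    """Return a list of error messages; an empty list means no problem was found."""
    try:
        template = json.loads(template_body)
    except json.JSONDecodeError as err:
        return [f"JSON syntax error: {err}"]
    resources = template.get("Resources") if isinstance(template, dict) else None
    if not isinstance(resources, dict) or not resources:
        return ["Template must contain a non-empty Resources section"]

    # Build the dependency graph between resources (parameters are ignored).
    graph = {}
    for name, resource in resources.items():
        deps = find_refs(resource.get("Properties", {}))
        depends_on = resource.get("DependsOn", [])
        deps |= {depends_on} if isinstance(depends_on, str) else set(depends_on)
        graph[name] = {dep for dep in deps if dep in resources}

    # Depth-first search; meeting a node that is still being visited means a cycle.
    errors = []
    state = {}

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            cycle = path[path.index(name):] + [name]
            errors.append("Circular dependency: " + " -> ".join(cycle))
            return
        state[name] = "visiting"
        for dep in sorted(graph[name]):
            visit(dep, path + [name])
        state[name] = "done"

    for name in sorted(graph):
        visit(name, [])
    return errors


cyclic_body = json.dumps({
    "Resources": {
        "A": {"Type": "AWS::SNS::Topic", "DependsOn": "B"},
        "B": {"Type": "AWS::SNS::Topic", "DependsOn": ["A"]},
    }
})
print(precheck(cyclic_body))
```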
We found that a large majority of stack creation failures are caused by bad input – invalid parameter values. We launched this new feature to address that challenge.
If you are hosting an application inside a VPC, you are likely passing in the VPC id, subnet ids, etc. as stack parameters. Even if you are not hosting an application in a VPC, you might still be passing in a KeyPair as a parameter so that later you can SSH into the application instances.
When you need to pass in those parameters, use the new parameter types. Logistically, you still pass these values in as simple strings. But, qualifying them with these new parameter types allows CloudFormation to make sure the values are valid.
Using these new parameter types in your templates has two benefits.
Number 1: It allows the CloudFormation console to show you a drop-down list of valid values. So, no more looking up the right VPC ID and typing it in.
Number 2: Even if you are not using the console, these parameter types allow CloudFormation to detect invalid parameters right at the start of the stack creation workflow.
Earlier, if you passed in an invalid key pair, you might have had to wait a few minutes until CloudFormation actually attempted to create the instance using that key pair, after creating all the other resources that the instance depended on.
Now, if you are using these parameter types, CloudFormation can check within a few seconds whether the key pair is one of the valid key pairs in your account for the region you are using, saving you a lot of time and money.
If you are using the console, you even get nice combo boxes and check boxes that present all the resources you can choose from without causing problems
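A sketch of how these parameter types might appear in the Parameters section of a template. The parameter names and the sample values are placeholders; the point is that the values are still passed as plain strings, with lists written as comma-separated strings.

```python
typed_parameters = {
    "KeyName": {
        "Type": "AWS::EC2::KeyPair::KeyName",
        "Description": "Key pair used to SSH into the application instances",
    },
    "VpcId": {
        "Type": "AWS::EC2::VPC::Id",
        "Description": "VPC that hosts the application",
    },
    "SubnetIds": {
        "Type": "List<AWS::EC2::Subnet::Id>",
        "Description": "Subnets for the application instances",
    },
}

# Values are still supplied as simple strings (placeholder IDs).
supplied_values = [
    {"ParameterKey": "KeyName", "ParameterValue": "my-key"},
    {"ParameterKey": "VpcId", "ParameterValue": "vpc-11111111"},
    {"ParameterKey": "SubnetIds", "ParameterValue": "subnet-aaaaaaaa,subnet-bbbbbbbb"},
]

for value in supplied_values:
    declared_type = typed_parameters[value["ParameterKey"]]["Type"]
    print(f'{value["ParameterKey"]} ({declared_type}) = {value["ParameterValue"]}')
```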
While we are on the topic of parameters, here is another way to help your template users to pass in valid parameters. CloudFormation parameters support adding constraints on parameters.
In this example, imagine you are provisioning a Windows server and you want to limit the IP address ranges from which a user can remote desktop into the server.
You can use parameter constraints to make sure that the parameter is a valid CIDR block.
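The slide’s exact constraint is not included in these notes; the sketch below shows one common way to express it with AllowedPattern, plus a local check of the pattern with Python’s re module. The parameter name RDPLocation is an assumption. Note that this pattern only checks the shape of the value, not that each octet is at most 255.

```python
import re

rdp_location_parameter = {
    "RDPLocation": {
        "Type": "String",
        "Description": "IP address range allowed to connect via remote desktop",
        "MinLength": "9",
        "MaxLength": "18",
        "AllowedPattern": r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})",
        "ConstraintDescription": "must be a valid CIDR range of the form x.x.x.x/x",
    }
}


def looks_like_cidr(value):
    """Local approximation of the constraint: the whole value must match."""
    spec = rdp_location_parameter["RDPLocation"]
    if not int(spec["MinLength"]) <= len(value) <= int(spec["MaxLength"]):
        return False
    return re.fullmatch(spec["AllowedPattern"], value) is not None


print(looks_like_cidr("203.0.113.0/24"))
print(looks_like_cidr("anywhere"))
```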
Insufficient IAM permissions are one of the most common causes of stack creation failures, and you can completely eliminate them.
When a user creates CloudFormation stacks, CloudFormation creates the resources in the stack on behalf of the user. What CloudFormation can provision is limited by the permissions the user has to provision resources.
By all means, you should use IAM permissions to control what your users can provision. However, when you intend to grant a user permissions to create stacks, make sure that the user has not only permission to call the CreateStack API, but also the permissions for provisioning the resources needed in the stack.
In the same vein, make sure your CloudFormation stack limit is sufficiently high, and also make sure you have enough quota for the AWS resources you are planning to use in the stack.
You not only want to create stacks, but also want to make sure they keep running as expected.
The first place to look is the stack events generated upon every stack creation, update, or deletion
There you find information about types and names of resources and possible error reasons if something fails
You can also retrieve those events programmatically and move them to whatever analytics system you like
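Once the events have been retrieved, picking out the failures is a simple filter. The sample below follows the general shape of a DescribeStackEvents response; the event contents are invented, and fetching the events from AWS is left out so that the sketch runs on its own.

```python
# Hypothetical events, trimmed to the fields used below.
sample_events_response = {
    "StackEvents": [
        {
            "LogicalResourceId": "WebServer",
            "ResourceType": "AWS::EC2::Instance",
            "ResourceStatus": "CREATE_FAILED",
            "ResourceStatusReason": "The key pair 'missing-key' does not exist",
        },
        {
            "LogicalResourceId": "Vpc",
            "ResourceType": "AWS::EC2::VPC",
            "ResourceStatus": "CREATE_COMPLETE",
        },
    ]
}

FAILED_STATUSES = {"CREATE_FAILED", "UPDATE_FAILED", "DELETE_FAILED"}


def failed_events(response):
    """Return (logical id, resource type, reason) for every failed event."""
    return [
        (
            event["LogicalResourceId"],
            event["ResourceType"],
            event.get("ResourceStatusReason", ""),
        )
        for event in response["StackEvents"]
        if event["ResourceStatus"] in FAILED_STATUSES
    ]


for logical_id, resource_type, reason in failed_events(sample_events_response):
    print(f"{logical_id} ({resource_type}): {reason}")
```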
Deactivate rollback: normally, when the creation of a stack fails, all resources already created are torn down during the rollback process
The problem is that it becomes hard or even impossible to understand why a certain script on an EC2 instance fails, when the EC2 instance is torn down immediately
Therefore, you can deactivate rollback to facilitate debugging
Breakpoints
Cfn doesn’t support breakpoints, but you can simulate that using WaitConditions
WaitConditions are CloudFormation resources that block further creation of the stack until a signal or a timeout
You can tell CloudFormation to wait before creating a certain resource until it is notified
To do that, you create a resource of type “WaitCondition”
CloudFormation stops until it receives a notification for that WaitCondition via a call to a presigned URL on the Cfn endpoint (note: we have a helper script, cfn-signal, for that)
You can also specify a timeout; upon expiry, the stack creation fails
Typically, you want the WaitCondition to start directly after the creation of another resource, e.g. an RDS instance. This is done by adding a DependsOn attribute to the WaitCondition.
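A sketch of such a “breakpoint” as a template fragment: the WaitCondition starts after the database, and the web server is held back until the WaitCondition has been signaled. Resource names, the AMI ID, the timeout, and the database settings are placeholders.

```python
import json

breakpoint_template = {
    "Parameters": {
        "DBPassword": {"Type": "String", "NoEcho": True},
    },
    "Resources": {
        "Database": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {
                "Engine": "mysql",
                "DBInstanceClass": "db.t2.micro",
                "AllocatedStorage": "20",
                "MasterUsername": "appuser",
                "MasterUserPassword": {"Ref": "DBPassword"},
            },
        },
        # The handle provides the presigned URL that receives the signal.
        "BreakpointHandle": {"Type": "AWS::CloudFormation::WaitConditionHandle"},
        "Breakpoint": {
            "Type": "AWS::CloudFormation::WaitCondition",
            # Start waiting only once the database has been created.
            "DependsOn": "Database",
            "Properties": {
                "Handle": {"Ref": "BreakpointHandle"},
                "Timeout": "1800",  # seconds; stack creation fails on expiry
                "Count": 1,
            },
        },
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            # Not created until the breakpoint has been signaled.
            "DependsOn": "Breakpoint",
            "Properties": {"ImageId": "ami-12345678", "InstanceType": "t2.micro"},
        },
    },
    "Outputs": {
        # Referencing the handle returns the presigned URL to call.
        "BreakpointURL": {"Value": {"Ref": "BreakpointHandle"}},
    },
}

print(json.dumps(breakpoint_template["Resources"]["Breakpoint"], indent=2))
```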
How this can be done is described in the blog-post below
There you can easily explore those logs in the CloudWatch Logs console, and search and filter them
First, choose an update style that works for your scenario. Our customers use one of these two main styles.
In-place update is where you update a template and call UpdateStack on an existing stack.
In blue-green style, you use an updated template to create a new stack from scratch, side by side with the existing stack and without touching it, and then switch traffic.
In-place update is incremental and hence typically faster.
In-place update is cost-efficient compared to blue-green, because you are not running double the number of stack resources.
Because it’s all in one stack, carrying forward state and data is simpler.
In fact, in-place is the only option to carry forward unique resources like EIPs.
On the other hand, there is no way you can break a working stack in a blue-green deployment
You can instantly fall back to the old stack if something goes wrong with the new stack
Are there any ways to get the best of both worlds? I think there are: e.g. you could choose blue-green only for major changes to the infrastructure
When you are doing an in-place update, that is, when you are planning to call UpdateStack on an existing stack, there are several steps you can take to make the update go through successfully.
Review the version history of your templates to understand exactly what you are going to update.
This includes looking at Refs and Fn::GetAtts to anticipate how the updates will cascade and affect related resources.
When you update a stack resource, the update might happen without interrupting the resource, with some interruption, or CloudFormation may even have to replace the existing resource with a new one. Refer to our documentation to understand what type of update will be performed and if it works for you.
The last two are very important to avoid getting into UPDATE_ROLLBACK_FAILED state. If an update cannot go through, CloudFormation rolls you back to the last known good state. So, during the update, CloudFormation needs not only the permissions to do a happy path update, but also to do the inverse of the update.
Lastly, during the lifetime of the stack don’t let it drift from its template. If you have changed it intentionally, restore it to its original state and push your changes by changing the template and running an update.
Let’s have a look at this new feature
So you have a LAMP stack running
Go to the stack, choose the action “Create Change Set”, and choose the updated template where you added some resources
You get access to a wizard that displays all changes, the impacted resources, and also what the impact is: are resources replaced?
You can create several change sets
Once you are sure that everything is as you expected, you confirm and execute the changes
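The same review can be scripted. The sketch below works on data shaped like a DescribeChangeSet response (a Changes list with ResourceChange entries) and flags removals and possible replacements; the sample changes are invented, and retrieving the change set from AWS is left out so that the example runs on its own.

```python
# Hypothetical change set, trimmed to the fields used below.
sample_change_set = {
    "Changes": [
        {"Type": "Resource", "ResourceChange": {
            "Action": "Add", "LogicalResourceId": "CacheCluster",
            "ResourceType": "AWS::ElastiCache::CacheCluster"}},
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "Database",
            "ResourceType": "AWS::RDS::DBInstance", "Replacement": "True"}},
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "WebServerGroup",
            "ResourceType": "AWS::AutoScaling::AutoScalingGroup", "Replacement": "False"}},
    ]
}


def risky_changes(change_set):
    """Return descriptions of removals and (possible) replacements."""
    risky = []
    for change in change_set.get("Changes", []):
        resource_change = change.get("ResourceChange", {})
        action = resource_change.get("Action")
        replacement = resource_change.get("Replacement")
        # "Conditional" means the resource may or may not be replaced.
        if action == "Remove" or replacement in ("True", "Conditional"):
            risky.append(
                f'{resource_change["LogicalResourceId"]} '
                f'({resource_change["ResourceType"]}): '
                f"action={action}, replacement={replacement}"
            )
    return risky


for line in risky_changes(sample_change_set):
    print(line)
```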
When you are updating an Auto Scaling group in your stack, and you do not want to have any downtime, use rolling updates.
Rolling updates is a CloudFormation feature that allows you to update an Auto Scaling group in-place, without any downtime.
You can divide the Auto Scaled instances into batches and update only a single batch at a time.
The benefit is that there are always some instances doing the job the Auto Scaling group is supposed to do. That is zero downtime.
You can have CloudFormation wait until a batch update is verified and move on to updating the next batch only if the updated batch is working as expected.
The ELB Health Check is commonly used for this verification, but you can use any tests you want.
If the health check on the updated batch fails, CloudFormation will roll the group back to the original configuration.
Most importantly, you can now automate all of this process in one simple CloudFormation template.
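A sketch of how a rolling update might be declared on an Auto Scaling group through the UpdatePolicy attribute. The sizes, the pause time, the AMI ID, and the resource names are illustrative values, not recommendations.

```python
import json

rolling_update_template = {
    "Resources": {
        "LaunchConfig": {
            "Type": "AWS::AutoScaling::LaunchConfiguration",
            "Properties": {"ImageId": "ami-12345678", "InstanceType": "t2.micro"},
        },
        "WebServerGroup": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {
                "AvailabilityZones": {"Fn::GetAZs": ""},
                "LaunchConfigurationName": {"Ref": "LaunchConfig"},
                "MinSize": "2",
                "MaxSize": "4",
            },
            "UpdatePolicy": {
                "AutoScalingRollingUpdate": {
                    # Keep at least two instances serving while a batch is updated.
                    "MinInstancesInService": "2",
                    # Update one instance per batch.
                    "MaxBatchSize": "1",
                    # Wait up to 10 minutes for each batch to signal success.
                    "PauseTime": "PT10M",
                    "WaitOnResourceSignals": True,
                }
            },
        },
    }
}

print(json.dumps(rolling_update_template["Resources"]["WebServerGroup"]["UpdatePolicy"], indent=2))
```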