Keeping consistent environments across your development, test, and production systems can be a complex task. Docker containers offer a way to develop and test your application in the same environment in which it runs in production. You can use tools such as the ECS CLI and Docker Compose for local testing of applications; Jenkins and AWS CodePipeline for building and workflow orchestration; Amazon EC2 Container Registry to store your container images; and Amazon EC2 Container Service to manage and scale containers. In this session, you will learn how to build containers into your development workflow and orchestrate container deployments using Amazon ECS. You will hear how Okta runs 30,000 tests per developer commit and releases 10,000 new lines of code each week to production with a CI system based on 100% AWS services. We'll also discuss how Okta uses ECS for parallelized testing in CI and for production microservices in a multi-region, always-on cloud service.
2. What to Expect from the Session
• Review the CI/CD Pipeline
• How would you use containers with CI/CD?
• Okta Engineering: How they work and ship code
• CI with Docker and ECS
3. The Continuous Everything… Nirvana
[Diagram: pipeline stages Goal, Design, Develop, Deploy, Test, Run and monitor; spans labeled Continuous integration, Continuous delivery, Continuous deployment, Continuous feedback]
5. Why Use Containers for Continuous Delivery?
• Roll out features as quickly as possible
• Predictable and reproducible environment
• They are immutable! They will run the same in every environment
• Fast feedback
7. Docker and Docker Toolbox
• Docker (Linux kernel 3.10 or later)
• Docker Toolbox or Docker Beta (OS X, Windows)
• Define app environment with Dockerfile
8. Dockerfile
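# Base image with Ruby 2.2.2
FROM ruby:2.2.2
# Build tools and PostgreSQL client headers needed to compile gems
RUN apt-get update -qq && apt-get install -y build-essential libpq-dev
RUN mkdir -p /opt/web
# Install gems first so this layer stays cached until the Gemfile changes
WORKDIR /tmp
ADD Gemfile /tmp/
ADD Gemfile.lock /tmp/
RUN bundle install
# Add the application code last
ADD . /opt/web
WORKDIR /opt/web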
9. Docker Compose
Define and run multi-container applications:
1. Define app environment with Dockerfile
2. Define services that make up your app in docker-compose.yml
3. Run docker-compose up to start and run the entire app
15. Running Tests Inside a Container
Usual Docker commands are available within your test environment
Run the container with the commands necessary to execute your tests, e.g.:
docker run web bundle exec rake test
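A CI job can wrap the command above in a small script so that the container's exit code decides whether the build passes. The following is a minimal sketch, assuming an image tagged web as on the slide. The helper names are hypothetical, and the --rm flag is an addition here so the test container is removed once it exits.

```python
import subprocess


def build_test_command(image, test_command):
    """Return the argv list for running tests in a throwaway container."""
    # --rm removes the container after the tests exit (not on the slide).
    return ["docker", "run", "--rm", image] + list(test_command)


def run_tests(image, test_command):
    """Run the tests and return the container's exit code (0 means success)."""
    completed = subprocess.run(build_test_command(image, test_command))
    return completed.returncode
```

Calling run_tests("web", ["bundle", "exec", "rake", "test"]) requires a local Docker daemon; a build step would typically fail when the returned code is non-zero.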
16. Running Tests Against a Container
Start a container in detached mode with an exposed port serving your app
Run browser tests or other black-box tests against the container, e.g., headless browser tests
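Black-box tests usually need to wait until the detached container is actually serving requests. The sketch below polls a readiness check before the tests start. It is illustrative only: the function names, the timeouts, and the idea of probing a plain HTTP URL are assumptions, not part of the talk.

```python
import http.client
import time
import urllib.request


def http_ok(url):
    """Return True if url currently answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until(probe, timeout_seconds=30.0, interval_seconds=0.5):
    """Call probe() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)
```

For example, after starting the app with docker run -d -p 3000:3000 web (port 3000 is an assumed value), a test runner could call wait_until(lambda: http_ok("http://localhost:3000/")) and only then launch the headless browser tests.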
18. Amazon EC2 Container Service
• Highly scalable container management service
• Easily manage clusters for any scale
• Flexible container placement
• Integrated with other AWS services
• Extensible
• ECS concepts
• Cluster and container instances
• Task definition and task
19. AWS Elastic Beanstalk
• Deploy and manage applications without worrying about
the infrastructure
• Elastic Beanstalk manages your database, Elastic Load
Balancing, ECS cluster, monitoring, and logging
• Docker support
• Single container (on EC2)
• Multi container (on ECS)
20. Amazon ECS CLI
• Easily create ECS clusters & supporting resources
such as EC2 instances
• Run Docker Compose configuration files on ECS
• Available today – http://amzn.to/1jBf45a
22. Continuous Delivery To ECS with Jenkins
1. Code push triggers build
2. Build image from sources
3. Run test on image
4. Push image to Docker registry
5. Update service
6. Pull image
23. Continuous Delivery To ECS with Jenkins
Easy deployment:
• Developers: merge into master, done!
Jenkins build steps:
• Trigger via webhooks, monitoring, Lambda
• Build Docker image via Build and Publish plugin
• Push Docker image into registry
• Register updated job with ECS API
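The final build step can be sketched with the ECS API as exposed by boto3: register a new task definition revision that points at the freshly pushed image, then update the service to use it. This is an illustration under assumptions, not the speakers' Jenkins job. The helper names are hypothetical, and the ecs argument is expected to be a client created with boto3.client("ecs").

```python
import copy


def with_new_image(container_definitions, container_name, image):
    """Return a copy of the container definitions with one image replaced."""
    updated = copy.deepcopy(container_definitions)
    matches = [c for c in updated if c["name"] == container_name]
    if not matches:
        raise ValueError("no container named %r" % container_name)
    for container in matches:
        container["image"] = image
    return updated


def deploy(ecs, cluster, service, family, container_definitions):
    """Register a new task definition revision and point the service at it."""
    response = ecs.register_task_definition(
        family=family, containerDefinitions=container_definitions
    )
    task_definition_arn = response["taskDefinition"]["taskDefinitionArn"]
    ecs.update_service(
        cluster=cluster, service=service, taskDefinition=task_definition_arn
    )
    return task_definition_arn
```

Keeping the image swap as a pure function makes it easy to test without AWS credentials; only deploy touches the API.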
24. Continuous Delivery To ECS with CodePipeline
1. Code push triggers pipeline
2. Lambda function creates EC2 instance
3. Image is built and pushed to ECR
4. Lambda function terminates EC2 instance
5. Lambda function deploys new task revision to ECS
25. Continuous Delivery To ECS with CodePipeline
• Lambda custom actions
• Create and terminate EC2 instance
• Update ECS service
• EC2 instance uses user data to build an image and push
it to ECR
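A Lambda function invoked by CodePipeline receives the job under the "CodePipeline.job" key of the event and must report success or failure back to the pipeline. The sketch below shows the shape of an "update ECS service" action. It is a hypothetical example: the UserParameters keys (cluster, service, taskDefinition) and the make_handler helper are assumptions, and the two clients are expected to come from boto3.client("codepipeline") and boto3.client("ecs").

```python
import json


def make_handler(codepipeline, ecs):
    """Build a Lambda handler; clients are injected so it can be tested offline."""

    def handler(event, context):
        job = event["CodePipeline.job"]
        job_id = job["id"]
        try:
            # UserParameters is a free-form string; here it is assumed to be JSON.
            params = json.loads(
                job["data"]["actionConfiguration"]["configuration"]["UserParameters"]
            )
            ecs.update_service(
                cluster=params["cluster"],
                service=params["service"],
                taskDefinition=params["taskDefinition"],
            )
            codepipeline.put_job_success_result(jobId=job_id)
        except Exception as error:
            # Report the failure so the pipeline stage does not hang until timeout.
            codepipeline.put_job_failure_result(
                jobId=job_id,
                failureDetails={"type": "JobFailed", "message": str(error)},
            )
        return job_id

    return handler
```

In a deployed function, the module would create the two clients once and expose handler = make_handler(codepipeline, ecs) as the Lambda entry point.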
34. The problem
Inspired by: http://dev2ops.org/2010/02/what-is-devops/
[Diagram: Dev ("I want change") and Ops ("I want stability") separated by a "Wall of turmoil"; further labels: Domain boundary, Container frameworks, Cluster scheduler, Continuous integration]
38. Okta Engineering: How Do We Work, How Do We Ship Our Code?
• 200 engineers, split into teams with embedded specialists
• 1-week sprints, with weekly deploys to production
• Capability to do more than one hotfix per day at customers’ request or for bugs found in CI or pre-prod
• Every merge to master is a potential release candidate
39. Okta Engineering: How Do We Test Our Code?
• Every topic branch goes through the same rigor in testing as release candidates.
• Passing automated tests is enforced at commit time.
• Largest repo: 33K tests, takes 60 minutes (22 parallel runs)
• Smallest repo: 100 tests, 5 minutes
• The Developer Productivity team is responsible for supporting engineering.
40. Challenge of Developer Productivity Team
• Developer experience: developers expect fast turnaround time and reliable results
• Quality: we need to run all the tests required to guarantee quality
• Cost: we need to run an infrastructure which is as cost-effective as possible
• Cloud first: we aim to use cloud services first, wherever possible
48. Vision
• Clean testing environments: isolate test environments from others, parallel and serial runs
• Dynamic worker scaling: workers should survive the loss of their build server; the worker pool should scale quickly; the number of workers should not affect the memory footprint of the build server
• Spot Instances for cost: run our services at cheaper rates, as we have many short-lived tasks and could certainly handle a few failures
• Versioned testing: enable testing of infrastructure changes in topic branches
• Improved queuing system: should survive build server reboots; shouldn’t be tied to specific workers or build servers; centralized; should have good visibility; re-queuing of lost tasks
• Less infrastructure flakiness: push testing and creation of test machines to developers
• The correct privileges, to maintain security: launch tasks in secure environments
58. ECS and Docker
• AWS + Java app tailored to Okta process
• Immutable and disposable build workers: created for one-time use, destroyed when the job is done
• Near-zero cost on weekends, scales with load
• ECS allows us to maximize usage of EC2 instances
• Same containers for multiple types and numbers of builds
• Same AMI can run multiple Docker images
59. Amazon ECS
IAM separation per service
• Either one service per cluster, or use the new IAM for ECS functionality
Sharing the Docker daemon to allow running Docker within Docker
Pre-fetching large data blobs and making them available on the hosts is an option
Multiple containers: MySQL, Redis, kinesalite
60. Docker Update
• Update Dockerfile and our CI system builds the new image,
uploading it to our repository
• Update task definition for cluster updates
61. Docker Conventions
• Dockerfiles live with project code, versioned together
• docker-compose used for development, so a clone plus
build will have a full service running locally
• Single repo for library and third-party service definitions
• Secrets or any form of config NEVER baked in
containers
• Start from minimal, audited base OS
• Strict rules around “FROM” clause
• Build owns creating immutable version and publishing
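One common way to follow the "never baked in" rule is to have the application read secrets and configuration from its environment at startup, so the same immutable image runs everywhere. The sketch below is a hypothetical helper, not Okta's code; the variable name DATABASE_URL in the usage note is likewise an assumption.

```python
import os


def required_env(name, environ=None):
    """Return a required setting from the environment, failing fast if absent."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        # Fail at startup rather than later with a confusing connection error.
        raise RuntimeError("missing required environment variable: " + name)
    return value
```

The value is then supplied when the container starts, for example with docker run -e DATABASE_URL=... web or through the environment section of an ECS task definition, rather than in the Dockerfile.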
65. Clean Testing Environments
• Docker images
• Nearly instant machine refresh
• Easy for users to create and upload images that have
been tested to work locally
• Efficient machine use
• ECS with ECR and private repository back end
67. Dynamic Worker Scaling
Lambda allocates jobs using bin packing
This is one of the changes we had to make in order to use ECS for long-running tasks, rather than services spread across many stateless instances
Disconnects unneeded nodes from the cluster, allowing them to self-terminate when they are idle
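To make the idea concrete, here is a minimal bin-packing sketch using the first-fit decreasing heuristic. The talk does not say which algorithm the Lambda function uses, so this is an assumed stand-in; memory in MiB as the only resource, and all names, are hypothetical. Packing jobs onto as few instances as possible leaves the remaining instances idle, and those are the candidates for disconnection.

```python
def pack_jobs(jobs, instances):
    """First-fit decreasing: place each job on the first instance with room.

    jobs: dict of job name -> memory needed (MiB)
    instances: dict of instance id -> free memory (MiB)
    Returns (placements, unplaced, idle).
    """
    free = dict(instances)
    placements = {}
    unplaced = []
    # Largest jobs first, so big jobs are not stranded by fragmentation.
    for name, needed in sorted(jobs.items(), key=lambda item: (-item[1], item[0])):
        # Instances are scanned in a fixed order, so load concentrates on the
        # earliest ones instead of being spread evenly.
        target = next((i for i in sorted(free) if free[i] >= needed), None)
        if target is None:
            unplaced.append(name)
            continue
        free[target] -= needed
        placements[name] = target
    used = set(placements.values())
    idle = sorted(i for i in instances if i not in used)
    return placements, unplaced, idle
```

With four jobs needing 3000, 2000, 1000 and 1000 MiB and three instances of 4000 MiB each, the jobs fit on two instances and the third is reported as idle; a spread strategy would have kept all three busy.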
73. Spot Instances
• We use Spot Instances across all Availability Zones
• Manually switch between On-Demand and Spot Instances 3 times per week during Spot price spikes
• We are planning on moving to Spot Fleet soon
• Bid price is set to the On-Demand price, so we lose build workers whenever the Spot price goes above the On-Demand price
• 4000-6000 instance hours per day, about 1500 Spot losses per week
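The last bullet can be turned into a rough loss rate. Assuming the daily instance-hour range and the weekly loss count describe the same typical week (an assumption; the slide does not say so), the figures work out to roughly one Spot loss for every 19 to 28 instance-hours, or about 3.6% to 5.4% losses per instance-hour.

```python
def spot_loss_rate(instance_hours_per_day, losses_per_week):
    """Return (losses per instance-hour, instance-hours per loss)."""
    weekly_hours = instance_hours_per_day * 7
    return losses_per_week / weekly_hours, weekly_hours / losses_per_week


# Figures from the slide: 4000-6000 instance hours per day, ~1500 losses per week.
low_usage = spot_loss_rate(4000, 1500)   # about (0.054, 18.7)
high_usage = spot_loss_rate(6000, 1500)  # about (0.036, 28.0)
```

A rate this high is only tolerable because the build workers are disposable and lost tasks can be re-queued.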
78. Versioned Jobs with ECS
• Versioned build and test scripts can now be run in
versioned Docker containers, using versioned task
definitions
• Creates extreme flexibility
• CloudFormation allows us to stand up whole new
clusters with all different versions in a matter of minutes
for long term testing
79. ECS + Docker Problems
• Docker containers not launching
• ECS agent failing
• Docker containers stopping
• Incompatibility with certain services
• Docker OS availability
• Cleanup - AWS has made this configurable
• Image size
83. Expand Use
• Use ECS for more services
• Allow developers to control their test suites and Docker
images more directly
• Developer environments
• Use Docker for local long-running services
• Use a VM running the same OS version
• Remote updates to keep it in line with CD system
• Aim to enable running CD containers right out of the box