5. Managing many containers is hard
[Figure: dozens of servers, each running its own guest OS]
11. Task Placement Engine
Name                Example
AMI ID              attribute:ecs.ami-id == ami-eca289fb
Availability Zone   attribute:ecs.availability-zone == us-east-1a
Instance Type       attribute:ecs.instance-type == t2.small
Distinct Instances  type="distinctInstances"
Custom              attribute:stack == prod
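To make the attribute expressions in the table concrete, here is a minimal sketch of how one such expression could be checked against a container instance. It only handles the `attribute:<name> == <value>` form shown above; the `matches` helper and the `instance` dictionary are hypothetical and are not part of any ECS API.

```python
def matches(expression: str, attributes: dict) -> bool:
    """Evaluate one 'attribute:<name> == <value>' expression against an instance."""
    left, operator, right = expression.partition("==")
    if not operator or not left.strip().startswith("attribute:"):
        raise ValueError(f"unsupported expression: {expression!r}")
    name = left.strip()[len("attribute:"):]
    return attributes.get(name) == right.strip()


# Hypothetical container instance, using the attribute values from the table.
instance = {
    "ecs.ami-id": "ami-eca289fb",
    "ecs.availability-zone": "us-east-1a",
    "ecs.instance-type": "t2.small",
    "stack": "prod",
}

print(matches("attribute:ecs.instance-type == t2.small", instance))  # True
print(matches("attribute:stack == staging", instance))               # False
```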
12. Task Placement selection
• Cluster Constraints: satisfy CPU, memory, and port requirements
• Custom Constraints: filter for location, instance-type, AMI, or custom
attribute constraints
• Placement Strategies: identify instances that meet the spread or binpack
placement strategy
• Apply filter: select final container instances for placement
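The selection stages on this slide can be sketched as a small filter pipeline. This is a simplified illustration, not the actual placement engine: the `Instance` class, the `place` function, and the sample `cluster` are hypothetical, port checks are left out, binpack is assumed to rank by remaining memory, and spread is assumed to balance across Availability Zones.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    cpu: int                 # remaining CPU units
    memory: int              # remaining memory in MiB
    zone: str
    running_tasks: int = 0
    attributes: dict = field(default_factory=dict)


def place(instances, cpu, memory, constraints, strategy):
    """Return the chosen Instance, or None if nothing qualifies."""
    # 1. Cluster constraints: enough CPU and memory (port checks omitted).
    fits = [i for i in instances if i.cpu >= cpu and i.memory >= memory]
    # 2. Custom constraints: every required attribute must match exactly.
    allowed = [i for i in fits
               if all(i.attributes.get(k) == v for k, v in constraints.items())]
    if not allowed:
        return None
    # 3. Placement strategy, then 4. final selection.
    if strategy == "binpack":
        # Prefer the instance with the least memory left.
        return min(allowed, key=lambda i: i.memory)
    if strategy == "spread":
        # Prefer the Availability Zone running the fewest tasks.
        per_zone = {}
        for i in allowed:
            per_zone[i.zone] = per_zone.get(i.zone, 0) + i.running_tasks
        return min(allowed, key=lambda i: per_zone[i.zone])
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical three-instance cluster.
cluster = [
    Instance("a", 512, 1024, "us-east-1a", 3, {"stack": "prod"}),
    Instance("b", 1024, 4096, "us-east-1b", 1, {"stack": "prod"}),
    Instance("c", 2048, 8192, "us-east-1b", 0, {"stack": "dev"}),
]

print(place(cluster, 256, 512, {"stack": "prod"}, "binpack").name)  # a
print(place(cluster, 256, 512, {"stack": "prod"}, "spread").name)   # b
print(place(cluster, 256, 16384, {}, "binpack"))                    # None
```

With the same constraints, binpack fills the nearly full instance "a" first, while spread picks "b" because its zone is running fewer tasks.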
20. What are load balancers?
At a high level, all load balancers do the same thing: distribute (balance)
traffic between targets. Targets could be different tasks in a service, IP
addresses, or EC2 instances in a cluster.
21. Different types of load balancers
ELB Classic: the original. Balances traffic between EC2 instances.
Application Load Balancer: request level (layer 7). Great for microservices.
Path-based HTTP/HTTPS routing (/web, /messages), content-based routing, IP
routing. Only in a VPC.
Network Load Balancer: connection level (layer 4). Routes to targets (EC2,
containers, IPs). High throughput, low latency. Great for spiky traffic
patterns. Requires no warming. Can assign an Elastic IP per subnet.
View the entire breakdown here:
https://aws.amazon.com/elasticloadbalancing/details/#details
22. What does this have to do with scheduling?
• First, the ELB is what actually distributes requests, so deployments and
scheduling can be tweaked at that level. For example, lowering the
connection draining timeout can speed up deployments.
• Second, your ELB can influence your resource management, for example
through dynamic port allocation with ALB.
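As a rough illustration of dynamic port allocation, the fragment below shows a container definition whose port mapping leaves the host port unset (0), so several copies of the task can share one instance and the ALB routes to whichever host port each copy received. The names "web" and "nginx" are placeholders, and `uses_dynamic_host_ports` is a hypothetical helper written for this sketch.

```python
# Hypothetical container definition fragment; "web" and "nginx" are placeholders.
container_definition = {
    "name": "web",
    "image": "nginx",
    "memory": 128,
    "portMappings": [
        # hostPort 0 asks for a dynamically assigned host port.
        {"containerPort": 80, "hostPort": 0, "protocol": "tcp"},
    ],
}


def uses_dynamic_host_ports(definition: dict) -> bool:
    """True if every port mapping leaves the host port to be assigned dynamically."""
    return all(m.get("hostPort", 0) == 0 for m in definition["portMappings"])


print(uses_dynamic_host_ports(container_definition))  # True
```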
24. Docker image size
• A major component of resource management is the size of your Docker
images. They add up quickly, with big consequences.
• The more layers you have (in general), and the larger those layers are, the
larger your final image will be. This eats up disk space.
• You don’t always need the recommended packages
(apt-get install --no-install-recommends)
25. OK, so how can I reduce image sizes?
• Sharing is caring.
• Use shared base images where possible
• Limit the data written to the container layer
• Chain RUN statements
• Prevent cache misses at build for as long as
possible
26. Let’s talk cache
• Docker caching is complicated!
• Calling RUN, ADD or COPY will add layers. Other instructions will not
(Docker 1.10 and above)
• How the cache works: starting from a parent image that is already in the
cache, Docker compares the next instruction against all child images derived
from it to see if one was built using the exact same instruction. If so,
the cache is used***
• For ADD and COPY, a checksum of the file contents is used. For every other
instruction, Docker looks only at the string of the command, not at the
contents of the packages (for example, with apt-get update)
27. *** (sometimes footnotes need their own slides)
So what happens if my command string is always the same, but I need to
rerun the command? For example, with git commands.
You can ignore the cache (docker build --no-cache), or, as some people do,
break it by changing the command string each time (for example, with a
timestamp).
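The cache rules from the last two slides can be modeled with a toy cache key. This is a simplified model for illustration, not Docker's actual implementation: the `layer_key` function, the file contents, and the image tag are all assumptions.

```python
import hashlib


def layer_key(parent_key: str, instruction: str, file_contents: bytes = b"") -> str:
    """Return a toy cache key for one build step.

    Most steps are keyed on the parent layer and the literal instruction string.
    ADD and COPY steps also mix in a checksum of the files being copied.
    """
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    if instruction.split()[0] in ("ADD", "COPY"):
        h.update(hashlib.sha256(file_contents).digest())
    return h.hexdigest()


base = layer_key("", "FROM ubuntu:16.04")

# Same RUN string gives the same key, even if the package index changed upstream.
print(layer_key(base, "RUN apt-get update") ==
      layer_key(base, "RUN apt-get update"))            # True: cache hit

# Same COPY string but different file contents gives a different key.
print(layer_key(base, "COPY app.py /app/", b"print('v1')") ==
      layer_key(base, "COPY app.py /app/", b"print('v2')"))  # False: cache miss

# Changing the string (here with a date) breaks the cache on purpose.
print(layer_key(base, "RUN git pull # 2017-01-01") ==
      layer_key(base, "RUN git pull # 2017-01-02"))     # False: cache miss
```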
28. In the image itself, clean as you go:
• If you download and install a package (like with curl and tar), remove the
compressed original in the same layer.
30. Clean up after your images, both in the image,
and on the system
docker image prune:
$ docker image prune -a
Alternatively, go even further with docker system prune:
$ docker system prune -a
31. Garbage collection
• Clean up after your containers! Beyond image and system prune:
• Make sure your orchestration platform (like ECS or K8s) is garbage collecting:
• ECS
• Kubernetes
• 3rd party tools like spotify-gc
36. For tasks, scheduling a task starts that task if
there are available resources
[Diagram: volume definitions and container definitions launch containers,
with a shared data volume, on a container instance]
37. Starting a task
[Diagram: User / Scheduler → StartTask → API]
Container Instance – What set of resources should we subtract from?
Task Definitions – What resources does the application need?
38. Starting a task
[Diagram: User / Scheduler → StartTask → API → Cluster Management Engine]
We take that information, check it against our regional Cluster Management
Engine, and either approve or reject the request.
The Cluster Management Engine has been designed to provide distributed
transactions with Availability Zone isolation. So even if there is an issue
in one Availability Zone, you will continue to be able to schedule to your
cluster.
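The approve-or-reject step can be pictured as simple resource accounting: compare what the task definition needs against what the container instance has left, and subtract only on approval. The sketch below is a single-process illustration with hypothetical names (`try_start_task`, `remaining`, `needs`); it does not attempt the distributed transactions described above.

```python
def try_start_task(instance_remaining: dict, task_needs: dict) -> bool:
    """Approve the task only if every requested resource fits; subtract on approval."""
    if any(instance_remaining.get(name, 0) < amount
           for name, amount in task_needs.items()):
        return False  # rejected: the instance state is left unchanged
    for name, amount in task_needs.items():
        instance_remaining[name] -= amount
    return True


# Hypothetical container instance and task requirements.
remaining = {"cpu": 1024, "memory": 2048}
needs = {"cpu": 512, "memory": 1024}

print(try_start_task(remaining, needs))  # True
print(try_start_task(remaining, needs))  # True
print(try_start_task(remaining, needs))  # False
print(remaining)                         # {'cpu': 0, 'memory': 0}
```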
39. Starting a task
[Diagram: User / Scheduler → StartTask → API → Cluster Management Engine →
Agent Communication]
Once a request is approved, we propagate to the Agent Communication Service
that a node needs to change its state.
40. Starting a task
[Diagram: User / Scheduler → StartTask → API → Cluster Management Engine →
Agent Communication → WebSocket → ECS Agent on the container instance, next
to Docker and the task's containers]
The Agent Communication Service will push this information down over the
WebSocket that the container instance opened.
41. Starting a task
[Diagram: same components as the previous slide, with the ECS Agent sending
SubmitStateChange back to the API and Cluster Management Engine]
We will then acknowledge to the service that we have performed (or failed to
perform) the specified action.
At this point the task is now happily running and tracked, but how do we keep in sync?
58. Run tasks in response to a cron expression, or at a
specific time
59. Time-based task scheduling
• Schedule on fixed time intervals (e.g., a number of minutes, hours, or days)
• Or use cron expressions.
• Set Amazon ECS as a CloudWatch Events target
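Both scheduling styles are expressed as a schedule expression string on a CloudWatch Events rule. The helpers below are a hypothetical sketch that only builds those strings (`rate_expression` and `cron_expression` are names invented here); creating the rule and attaching the ECS task as its target is left out.

```python
def rate_expression(value: int, unit: str) -> str:
    """Build a rate expression such as 'rate(5 minutes)'."""
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour', or 'day'")
    if value < 1:
        raise ValueError("value must be a positive integer")
    # The unit is singular for a value of 1 and plural otherwise.
    return f"rate({value} {unit if value == 1 else unit + 's'})"


def cron_expression(minutes="0", hours="*", day_of_month="*", month="*",
                    day_of_week="?", year="*") -> str:
    """Build a six-field cron expression (evaluated in UTC)."""
    return f"cron({minutes} {hours} {day_of_month} {month} {day_of_week} {year})"


print(rate_expression(5, "minute"))  # rate(5 minutes)
print(rate_expression(1, "hour"))    # rate(1 hour)
print(cron_expression(hours="12"))   # cron(0 12 * * ? *)
```

The last expression would run a task every day at 12:00 UTC.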