Optimize Content Processing in the Cloud with GPU and Spot Instances

Chad Schmutzer | Solutions Architect – EC2 Spot
Amazon Web Services

What are we going to do today?
… build a transcoding pipeline with GPUs
… learn about EC2 Spot
… while saving up to 90% on your EC2 bill
… using AWS CloudFormation, in about 10 minutes

AWS EC2 Consumption Models
On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs.
Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization.
Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads.

Spare capacity at scale
AWS has more than a million active customers in 190 countries.
Amazon EC2 instance usage has increased 93% YoY, comparing Q4 2014 and Q4 2013, not including Amazon use.

With Spot the rules are simple

Capacity pools
[Diagram: SYD total capacity divided between AZ1 and AZ2, each split into Shared and Dedicated capacity, across instance families T2, C4, M4, I2, R3, D2]

Show me the markets!
C3 Spot prices per Availability Zone, compared with On-Demand:
Size   1b      1c      1a      On-Demand
8XL    $0.27   $0.29   $0.50   $1.76
4XL    $0.30   $0.16   $0.21   $0.88
2XL    $0.07   $0.08   $0.08   $0.44
XL     $0.05   $0.04   $0.04   $0.22
L      $0.01   $0.04   $0.01   $0.11
Each instance family
Each instance size
Each Availability Zone
In every region
Is a separate Spot Market
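To make the idea of separate markets concrete, the following sketch picks the cheapest zone for each C3 size and computes its discount against On-Demand. The prices are the ones shown on this slide; the mapping of each price to zone 1a, 1b, or 1c is an assumption read off the slide layout, and the names `SPOT_PRICES`, `ON_DEMAND`, and `cheapest_market` are illustrative only.

```python
# Spot prices (USD per hour) for C3 sizes, as shown on the slide.
# Assumption: zone labels follow the order in which the slide lists them.
SPOT_PRICES = {
    "8XL": {"1b": 0.27, "1c": 0.29, "1a": 0.50},
    "4XL": {"1b": 0.30, "1c": 0.16, "1a": 0.21},
    "2XL": {"1b": 0.07, "1c": 0.08, "1a": 0.08},
    "XL": {"1b": 0.05, "1c": 0.04, "1a": 0.04},
    "L": {"1b": 0.01, "1c": 0.04, "1a": 0.01},
}
ON_DEMAND = {"8XL": 1.76, "4XL": 0.88, "2XL": 0.44, "XL": 0.22, "L": 0.11}


def cheapest_market(size):
    """Return (zone, spot_price, savings_fraction) for the cheapest zone of a size."""
    # On a tie, min() keeps the first zone listed.
    zone, price = min(SPOT_PRICES[size].items(), key=lambda item: item[1])
    savings = 1 - price / ON_DEMAND[size]
    return zone, price, savings


if __name__ == "__main__":
    for size in SPOT_PRICES:
        zone, price, savings = cheapest_market(size)
        print(f"C3 {size}: cheapest in {zone} at ${price:.2f} "
              f"({savings:.0%} below On-Demand)")
```

Because every size and zone is its own market, the cheapest zone differs from one size to the next, which is why being flexible about both pays off.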

Bid Price Vs Market Price
[Chart: bids at 25%, 50%, and 75% plotted against the market price]
You pay the market price
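The rule on this slide can be sketched in a few lines: the bid only sets a ceiling, and a running instance is charged the market price. Treating the bids as percentages of the On-Demand price is an assumption (the slide does not label them), the prices are made up for illustration, and `hourly_charge` is a hypothetical helper, not an AWS API.

```python
def hourly_charge(bid, market_price):
    """Return the price paid for the hour, or None if the bid is below the market.

    Assumption: the instance runs only while market_price <= bid, and the
    charge is the market price rather than the bid itself.
    """
    if market_price > bid:
        return None  # outbid: the instance is interrupted or never launches
    return market_price


if __name__ == "__main__":
    on_demand = 1.00     # illustrative On-Demand price, USD per hour
    market_price = 0.30  # illustrative Spot market price
    for pct in (25, 50, 75):
        bid = on_demand * pct / 100
        print(f"{pct}% bid -> charge: {hourly_charge(bid, market_price)}")
```

In this sketch a higher bid does not raise the bill; it only widens the range of market prices the instance can ride out.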

Amazon EC2 Spot – in the wild
1) We make this easy using the Spot bid advisor
2) With deliberate pool selection and bidding, you will keep your Spot instance as long as you need to.
3) And with new features like Spot fleet diversified we do the heavy lifting for you...

Spot Bid Advisor – aws-spot-labs

EC2 Best practices for Spot
Fault tolerance
Stateless
Multi-AZ
Loosely coupled
Instance Flexibility

Why use Spot – customer examples
39 years of drug research re-processed, using over 80,000 cores, in 9 hours for $4,232
- Approximately 87,000 compute cores at peak
- Estimated 39 years of computational chemistry performed in 9 hours
- Three candidate compounds successfully identified

“By using AWS Spot instances, we've been able to save 75% a month
simply by changing four lines of code. It makes perfect sense for saving
money when you're running continuous integration workloads or
pipeline processing.” - Matthew Leventi, Lead Engineer, Lyft

The $9 Billion Experiment

Scaling up as many as 1000 Spot instances a day to handle real time ad
delivery
Petabyte-Scale Data Pipelines with Docker, Luigi and Elastic Spot
Instances

A large scale POC for animation rendering on AWS:
• Cloud Rendering at Walt Disney Animation Studios (available on SlideShare)
• Automated environment leveraging Spot Fleet
• Launched 40K cores in 20 min at less than $0.02 per core-hour

Spot fleet helps you
Launch Thousands of Spot Instances with one RequestSpotFleet call.
Get Best Price: Find the lowest priced horsepower that works for you.
or
Get Diversified Resources: Diversify your fleet. Grow your availability.
And
Apply Custom Weighting: Create your own capacity unit based on your application needs.

Diversification with EC2 Spot fleet
Multiple EC2 Spot instances selected
Multiple Availability Zones selected
Pick the instances with similar performance characteristics, e.g. c3.large, m3.large, m4.large, r3.large, c4.large.
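A rough sketch of why diversification helps: if the target is spread evenly over every combination of instance type and Availability Zone, losing any one capacity pool removes only a small slice of the fleet. The even split is a simplification of how a diversified fleet might allocate, the zone names are hypothetical, and `diversify` is an illustrative helper rather than part of any AWS SDK.

```python
from itertools import product


def diversify(target, pools):
    """Split `target` instances as evenly as possible across `pools`."""
    base, extra = divmod(target, len(pools))
    # The first `extra` pools take one additional instance each.
    return {pool: base + (1 if i < extra else 0) for i, pool in enumerate(pools)}


if __name__ == "__main__":
    instance_types = ["c3.large", "m3.large", "m4.large", "r3.large", "c4.large"]
    zones = ["az-1", "az-2", "az-3"]  # hypothetical zone names
    pools = list(product(instance_types, zones))  # 5 types x 3 zones = 15 pools
    allocation = diversify(100, pools)
    worst_loss = max(allocation.values())
    print(f"{len(pools)} pools; at most {worst_loss} of 100 instances "
          f"are lost if a single pool is reclaimed")
```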

Results - Grid
Requested 1000 vCores over 30 days
Minimum 960 vCores
Mode 1024 vCores
Average 1012 vCores
Average Price of $0.012 per vCore
Savings of over 80%

Walt Disney Animation Studios
Core Count
./aws_spot_fleet_request -p reinvent --cpu 8 --ram 64 -m 4.7 -c 1500

It is easy!
aws ec2 request-spot-fleet --spot-fleet-request-config file://config.json
config.json:
{
  "IamFleetRole": "arn:aws:iam::781603563322:role/fleet-role",
  "TargetCapacity": 100,
  "SpotPrice": "0.03",
  "ValidFrom": "2015-09-15T00:56:19Z",
  "ValidUntil": "2016-09-14T07:00:00Z",
  "TerminateInstancesWithExpiration": true,
  "LaunchSpecifications": [
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "WeightedCapacity": 2, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "WeightedCapacity": 2, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.large", "WeightedCapacity": 2, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.xlarge", "WeightedCapacity": 4, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.4xlarge", "WeightedCapacity": 16, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.8xlarge", "WeightedCapacity": 32, "SubnetId": "subnet-0b1b8052" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-d0dc51fb" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-64531413" },
    { "ImageId": "ami-0d4cfd66", "InstanceType": "c3.2xlarge", "WeightedCapacity": 8, "SubnetId": "subnet-0b1b8052" }
  ]
}

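As a small worked example of the weighting in the Spot fleet config above: with a target capacity of 100 units, the number of instances needed depends on the weight of whichever type is launched. The sketch assumes the fleet rounds up to a whole instance so it never ends below its target; `WEIGHTS` and `instances_needed` are illustrative names, not AWS APIs.

```python
import math

# Weights copied from the launch specifications in the fleet config.
WEIGHTS = {
    "c3.large": 2,
    "c3.xlarge": 4,
    "c3.2xlarge": 8,
    "c3.4xlarge": 16,
    "c3.8xlarge": 32,
}


def instances_needed(target_capacity, weight):
    """Number of instances of one type needed to reach the target capacity."""
    # Assumption: round up so the fleet never ends below its target.
    return math.ceil(target_capacity / weight)


if __name__ == "__main__":
    target = 100
    for instance_type, weight in WEIGHTS.items():
        count = instances_needed(target, weight)
        print(f"{instance_type}: {count} instances "
              f"({count * weight} units for a target of {target})")
```

Larger sizes overshoot the target slightly (for example, four c3.8xlarge instances provide 128 units), which is the price of expressing mixed sizes in one capacity unit.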
EC2 Spot Console
An easy to use interface that lets you launch spare EC2 instances in seconds
Helps you select and bid on the EC2 instances that meet your application’s requirements
Simple to use dashboard lets you modify and manage your application’s compute capacity

EC2 Spot block
Using a single additional parameter
Run continuously for up to 6 hours
Save up to 50% off On-Demand pricing

What’s in 6 hours?
~21% of instances live less than 1 hour
~35% live less than 2 hours
~40% live less than 3 hours
In total, roughly 50% of all instances live less than 6 hours

Capitalizing on two minute warning
When the Spot price exceeds your bid price, the instance will receive a two-minute warning.
Check for the two-minute Spot instance termination notification every 5 seconds, using a script invoked at instance launch.

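Sample script – two minutes left!
1) Check for 2 minute warning
2) If YES, detach instance from ELB
3) OTHERWISE, do nothing
4) Sleep for 5 seconds
#!/bin/bash
# Poll the instance metadata service every 5 seconds for a Spot termination notice.
while true; do
  # termination-time returns a timestamp only once the instance is marked for termination
  if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q '.*T.*Z'; then
    instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    # Detach this instance from the load balancer
    aws elb deregister-instances-from-load-balancer \
      --load-balancer-name my-load-balancer \
      --instances "$instance_id"
    # Flush session state before the instance goes away
    /env/bin/flushsessiontoDBonterminationscript.sh
    break
  fi
  sleep 5
done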
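The same polling steps can be sketched in Python, which makes the loop easy to test without a real instance. This is a minimal sketch, not the deck's script: it reuses the metadata URL and the loose timestamp pattern from the shell version, while `is_termination_notice`, `fetch_termination_time`, and `wait_for_notice` are hypothetical helper names. What happens in `on_notice` (deregistering from the load balancer, flushing sessions) is left to the caller.

```python
import re
import time
import urllib.error
import urllib.request

TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"
# Same loose pattern as the grep in the shell script: something, 'T', something, 'Z'.
TIMESTAMP = re.compile(r".*T.*Z")


def is_termination_notice(body):
    """True if the metadata response body looks like a termination timestamp."""
    return bool(TIMESTAMP.search(body))


def fetch_termination_time(url=TERMINATION_URL, timeout=2):
    """Return the response body, or '' if the request fails or returns an error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        return ""


def wait_for_notice(on_notice, fetch=fetch_termination_time, interval=5, sleep=time.sleep):
    """Poll every `interval` seconds and call `on_notice` once a notice appears."""
    while not is_termination_notice(fetch()):
        sleep(interval)
    on_notice()
```

Passing `fetch` and `sleep` as parameters is a design choice for testability: a test can feed canned responses and skip the real waiting.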

Batch Processing with Amazon EC2 Spot
Batch oriented applications can leverage on-demand processing using EC2 Spot to save up to 90% cost:
Monte Carlo simulation
Molecular modeling
Media processing
High energy simulations

Common method Batch Processing
EC2 Spot fleet to set up a heterogeneous, scalable “grid” of EC2 Spot instances with multiple capacity pools as worker nodes
Scaling to 50,000 cores
EC2 Spot blocks for less flexible jobs that must run continuously.

Queue based processing
[Architecture diagram. Components shown: AWS cloud Region with Amazon S3, DynamoDB, Amazon SQS, and CloudWatch; a 172.16.0.0/16 network with an Internet gateway and four subnets (region-1a 172.16.0.0/20, region-1b 172.16.16.0/20, region-1c 172.16.32.0/20, region-1d 172.16.48.0/20); a Cluster Controller; c3.4x and r3.4x Spot instances]
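One way to picture why a queue suits Spot workers: a job that is in flight when an instance is interrupted goes back on the queue and another worker picks it up. The sketch below uses an in-memory `deque` as a stand-in for a queue such as Amazon SQS (where, roughly, a message that is not deleted becomes visible again); `run_worker` and its parameters are hypothetical and do not come from the deck.

```python
from collections import deque


def run_worker(jobs, process, interrupted):
    """Drain `jobs`, handing a job back if the worker is interrupted.

    `jobs` is an in-memory stand-in for a real queue service.
    Returns the results produced before the worker stopped.
    """
    done = []
    while jobs:
        job = jobs.popleft()
        if interrupted():
            jobs.appendleft(job)  # hand the job back for another worker
            break
        done.append(process(job))
    return done


if __name__ == "__main__":
    queue = deque(["clip-1", "clip-2", "clip-3"])
    flags = iter([False, False, True])  # third check simulates a Spot interruption
    first = run_worker(queue, str.upper, lambda: next(flags))
    second = run_worker(queue, str.upper, lambda: False)
    print(first, second)
```

No job is lost in this sketch: the interrupted worker returns its unfinished job, and the replacement worker completes it.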

Disney Animation Renderfarm
[Diagram. Sites shown: San Francisco, Los Angeles, Burbank. A WDAS Data Center and two Remote Data Centers, each with a Renderfarm and an Avere FXT cluster; Storage; Artists; Redundant 10Gb links]

Disney Animation Renderfarm
[Diagram. Sites shown: San Francisco, Los Angeles, Burbank. A WDAS Data Center and two Remote Data Centers, each with a Renderfarm and an Avere FXT cluster; Storage; Artists; Redundant 10Gb links. Added: a virtual private cloud in Oregon with Avere vFXT, Spot Instances, and EFS, connected over 10Gb primary, 1Gb backup]

Live/VOD 360 OTT on EC2 Spot
[Architecture diagram. Components shown: Direct Connect; Mez; Source Encoder (G2, Spot or On-Demand); Ingest Bucket; S3 Events; SQS Queue; Diversified Spot Fleet (G2, M4); Egress Bucket; Archival; Primary Origin and Backup Origin with Failover; Edge Cache Fleet; ALB; CloudFront; Viewers]
Multi Tenancy
Multi CDN
Container Encoding
Full OTT CMS / DRM
GPU / CPU

Media Pipeline
Content production and post-production companies are leveraging AWS to accelerate and streamline creative, editing, compositing and streaming delivery workloads with highly scalable cloud computing and storage.
Ingest: PUSH OR PULL MEZ, LIVE & VOD
Store: CREATE A CENTRALIZED CONTENT LAKE ON S3
Transform: MEDIA DELIVERY AND/OR HANDS-ON POSTPRODUCTION
Process: SCALE OUT ON ELASTIC CAPACITY FOR ALL PROCESSING

Reference Links
EC2 Spot Documentation:
http://aws.amazon.com/ec2/spot/
http://aws.amazon.com/ec2/spot/bid-advisor/
http://aws.amazon.com/ec2/spot/getting-started/
http://aws.amazon.com/ec2/spot/faqs/
http://aws.amazon.com/ec2/spot/testimonials/
User Guide
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html
Helpful AWS Blog Posts
https://aws.amazon.com/blogs/aws/focusing-on-spot-instances-lets-talk-about-best-practices/
https://aws.amazon.com/blogs/aws/building-price-aware-applications-using-ec2-spot-instances/
https://aws.amazon.com/blogs/compute/cost-effective-batch-processing-with-amazon-ec2-spot/
https://aws.amazon.com/blogs/compute/dynamic-scaling-with-ec2-spot-fleet/
