© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Scott Miao, SPN, Trend Micro
2016/5/20
Analytic Engine
A common Big Data computation service on the AWS
Who am I
• Scott Miao
• RD, SPN, Trend Micro
• Worked on the Hadoop ecosystem for about 6 years
• Worked on AWS for Big Data for about 3 years
• Expertise in HDFS/MR/HBase
• Speaker at several Hadoop-related conferences
• @takeshi.miao
Agenda
• What problems did we suffer?
• Why AWS?
• Analytic Engine
• The benefits AWS brings to AE
• AE roadmap on AWS
What problems did we suffer?
Hadoop Expansion
Data volume increases 1.5 ~ 2x every year
Data center issues
• network bottleneck
• server depreciation
Growth
becomes 2x
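The growth rate above compounds year over year. A minimal sketch of that arithmetic, assuming a constant yearly factor; the 100 TB starting volume is a made-up number for illustration, not a figure from the talk.

```python
def projected_volume(start_tb: float, yearly_factor: float, years: int) -> float:
    """Volume after `years` of compounding growth at `yearly_factor` per year."""
    return start_tb * yearly_factor ** years


# Hypothetical starting point of 100 TB (not a figure from the talk).
low = projected_volume(100, 1.5, 3)   # growth of 1.5x per year
high = projected_volume(100, 2.0, 3)  # growth of 2x per year
print(f"after 3 years: {low} TB to {high} TB")
```

Under these assumptions the cluster has to hold roughly 3.4x to 8x the original volume after three years, which is what drives the rack-space pressure.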
Why AWS?
Return on Investment
• On traditional infrastructure, we put a lot of effort into service operations
• On the Cloud, we can leverage its elasticity to automate our services
• More focus on innovation!
Time
Money
Revenue
Cost
AWS is a leader of IaaS platform
Source: Gartner (May 2015)
https://www.gartner.com/doc/reprints?id=1-2G2O5FC&ct=150519&st=sb
AWS Evaluation
Cost acceptable
Functionalities satisfied
Performance satisfied
Analytic Engine
A common Big Data computation service on the AWS
High Level Architecture
[Diagram: RDs, researchers and services call two RESTful APIs. Analytic Engine (AE), the common computation service, offers createCluster, submitJob and deleteCluster on top of AWS EMR. CloudStorage (CS), the common storage service, is where jobs take their input from and write their output to.]
Common Cloud Services in Trend
Analytic Engine
• Computation service for Trenders
• Based on AWS EMR
• Simple RESTful API calls
• Computing on demand
• Short-lived
• Long running
• No operation effort
• Pay by computing resources
Cloud Storage
• Storage service for Trenders
• Based on AWS S3
• Simple RESTful API calls
• Share data to all in one place
• Metadata search for files
• No operation effort
• Pay by storage size used
Why do we use AE instead of EMR directly?
• Abstraction
• Avoid lock-in
• Hide implementation details behind the scenes
• AWS EMR was not designed for long-running jobs
• >= AMI-3.1.1 – 256 ACTIVE or PENDING jobs (STEPs)
• < AMI-3.1.1 – 256 jobs in total
• Better integration with other common services
• Keep our hands off AWS-native code
• Centralized Authentication & Authorization
• No AWS/SSH keys for users
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/AddingStepstoaJobFlow.html
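The step limits quoted above are why a long-running cluster needs a layer in front of it. A minimal sketch of that check, using only the numbers and state names from the slide; the function names and the idea of passing in a list of step states are assumptions for illustration, not AE's actual code.

```python
from typing import List, Tuple

STEP_LIMIT = 256  # limit quoted on the slide


def parse_ami(version: str) -> Tuple[int, ...]:
    """Turn '3.1.1' into (3, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def can_accept_step(ami_version: str, step_states: List[str]) -> bool:
    """Return True if one more step fits under the limit quoted above.

    step_states holds the state of every step ever submitted to the
    cluster; the state names follow the slide and are illustrative.
    """
    if parse_ami(ami_version) >= (3, 1, 1):
        # Newer AMIs: only ACTIVE or PENDING steps count against the limit.
        in_flight = sum(1 for s in step_states if s in ("ACTIVE", "PENDING"))
        return in_flight < STEP_LIMIT
    # Older AMIs: every step ever submitted counts.
    return len(step_states) < STEP_LIMIT
```

With 300 finished steps and 2 pending ones, a cluster on AMI 3.1.1 or later can still take work, while an older one cannot.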
Common usecases for AE
• User creates a cluster
• User can create multiple clusters
• User submits a job to the target cluster
• AE helps the user deliver the job to a secondary cluster
• User wants to know their cost
Usecase#1 – User creates a cluster
[Diagram: AE users -> createCluster -> AE -> EMR]
1. User invokes createCluster
2. AE launches an EMR cluster for the user, with tags attached
Cluster tags: 'sched:routine', 'env:prod' (m3.xlarge * 10)
User comment: "It is a RESTful API, so I can use any client I am familiar with!"
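Because createCluster is a plain RESTful call, any HTTP client will do. A minimal sketch using only the Python standard library; the base URL, the `/clusters` path and the JSON field names are hypothetical, since the slides do not show AE's real request format. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical base URL; the real AE endpoint is not shown in the slides.
AE_BASE_URL = "https://ae.example.internal/api"


def build_create_cluster_request(tags, instance_type, instance_count):
    """Build (but do not send) a POST request asking AE for a new cluster."""
    payload = {
        # Field names are assumptions made for this sketch.
        "tags": list(tags),
        "instanceType": instance_type,
        "instanceCount": instance_count,
    }
    return urllib.request.Request(
        url=f"{AE_BASE_URL}/clusters",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The cluster from the slide: 10 m3.xlarge nodes tagged for routine prod work.
req = build_create_cluster_request(["sched:routine", "env:prod"], "m3.xlarge", 10)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` with whatever authentication AE requires.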
Usecase#2 – User can create multiple clusters as he/she needs
[Diagram: AE users -> createCluster -> AE -> EMR]
1. User invokes createCluster
2. AE launches another new EMR cluster for the user, with tags attached
3. User can create as many clusters as he/she needs
New cluster tags: 'sched:adhoc', 'env:prod' (c3.4xlarge * 20)
Existing cluster tags: 'sched:routine', 'env:prod' (m3.xlarge * 10)
Usecase#3 – User submits job to target cluster to run
[Diagram: AE users -> submitJob -> AE -> EMR, with EMR reading from and writing to CS]
1. User invokes submitJob
2. AE matches the job and delivers it to the target cluster
3. AE submits the job
4. EMR pulls data from CS
5. Job runs on the target cluster
6. EMR outputs the result to CS
7. AE sends a message to an SNS topic if the user specified one
clusterCriteria: [['sched:adhoc', 'env:prod'], ['env:prod']]
Cluster tags: 'sched:routine', 'env:prod'
Cluster tags: 'sched:adhoc', 'env:prod'
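The matching in step 2 can be sketched as follows. This assumes the criteria sets in clusterCriteria are tried in order and that a cluster matches a set when it carries every tag in it; the slides show the data but not the exact rule, and the function and cluster names are illustrative.

```python
from typing import Dict, List, Optional, Set


def match_cluster(cluster_criteria: List[List[str]],
                  clusters: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first cluster whose tags satisfy a criteria set.

    Criteria sets are tried in order; a cluster satisfies a set when it
    carries every tag in that set.
    """
    for wanted in cluster_criteria:
        for name, tags in clusters.items():
            if set(wanted) <= tags:
                return name
    return None


# The two clusters and the criteria from the slide (cluster names are made up).
clusters = {
    "routine-cluster": {"sched:routine", "env:prod"},
    "adhoc-cluster": {"sched:adhoc", "env:prod"},
}
criteria = [["sched:adhoc", "env:prod"], ["env:prod"]]
target = match_cluster(criteria, clusters)
```

Here the first criteria set selects the adhoc cluster, so the broader `['env:prod']` set is never consulted.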
Usecase#4 – AE delivers job to secondary cluster if target cluster is down
[Diagram: AE users -> submitJob -> AE -> EMR, with EMR reading from and writing to CS]
1. User invokes submitJob
2. AE matches the job and delivers it to the secondary cluster
3. AE submits the job
4. EMR pulls data from CS
5. Job runs on the secondary cluster
6. EMR outputs the result to CS
clusterCriteria: [['sched:adhoc', 'env:prod'], ['env:prod']]
Cluster tags: 'sched:routine', 'env:prod'
Cluster tags: 'sched:adhoc', 'env:prod'
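The fallback in this use case can be sketched by adding a health flag to tag matching. How AE actually detects a down cluster is not shown in the slides, so the `up` field, the class and the function names below are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    tags: FrozenSet[str]
    up: bool  # hypothetical health flag


def pick_cluster(cluster_criteria: List[List[str]],
                 clusters: List[Cluster]) -> Optional[Cluster]:
    """Try each criteria set in order, skipping clusters that are down."""
    for wanted in cluster_criteria:
        for cluster in clusters:
            if cluster.up and frozenset(wanted) <= cluster.tags:
                return cluster
    return None


clusters = [
    Cluster("routine-cluster", frozenset({"sched:routine", "env:prod"}), up=True),
    Cluster("adhoc-cluster", frozenset({"sched:adhoc", "env:prod"}), up=False),
]
chosen = pick_cluster([["sched:adhoc", "env:prod"], ["env:prod"]], clusters)
```

With the adhoc cluster down, the first criteria set matches nothing, so the second set `['env:prod']` routes the job to the routine cluster.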
Usecase#5 – User wants to know what their current cost is
Billing & Cost Management -> Cost Explorer -> Launch Cost Explorer
Middle Level Architecture
[Diagram: services in the IDC reach AE in Oregon (us-west-2) over HTTPS via peering. AE API servers sit behind an internal ELB in Auto Scaling groups spread over multiple AZs (AZa, AZb, AZc), with Eureka, RDS, worker Auto Scaling groups, EMR clusters, cross-account S3 buckets for input/output and Amazon SNS. Internet access is HTTPS/HTTP with Basic/VPN; Cloud Storage access is HTTPS/HTTP with Basic.]
The benefits AWS brings to AE
Pros & Cons
Aspects | IDC | AWS
Data Capacity | Limited by physical rack space | No limitation within a reasonable amount
Computation Capacity | Limited by physical rack space | No limitation within a reasonable amount
DevOps | Hard, because it runs on a physical machine / VM farm | Easy, because code is everything (Continuous Deployment)
Scalability | Hard, because it runs on a physical machine / VM farm | Easy, relying on ELB and Auto Scaling groups from AWS
Pros & Cons
Aspects | IDC | AWS
Disaster Recovery | Hard, because it runs on a physical machine / VM farm | Easy, because code is everything
Data Location | Limited by the IDC location | Varied and easy, thanks to multiple AWS regions
Cost | Implied in Total Cost of Ownership | Acceptable cost with a Cloud-optimized design
AE roadmap on AWS
Roadmap
Backups
[Diagram: Oregon (us-west-2) with AZa, AZb and AZc, RDS, cross-account S3 buckets for input/output, HTTPS over peering]
1. Pre-built infra. by AWS CF
2. Users' permissions granted
3. Pre-launched RDS
[Diagram: Oregon (us-west-2), multi-AZ (AZa, AZb, AZc): AE API servers and workers in Auto Scaling groups behind an internal ELB, with Eureka, RDS, EMR clusters and cross-account S3 buckets for input/output; HTTPS over peering]
4. Provision AE SaaS by CI/CD
[Diagram: services in the IDC connect over HTTPS via peering to Oregon (us-west-2), where AE API servers and workers run in multi-AZ Auto Scaling groups behind an internal ELB, with Eureka, RDS, EMR clusters, cross-account S3 buckets and Amazon SNS. Internet access is HTTPS/HTTP with Basic/VPN; Cloud Storage access is HTTPS/HTTP with Basic.]
5. Users can access via VPN, FW open for Trend
6. Input from CS or S3
7. Computation in AWS EMR cluster
8. Output to CS or S3
9. Job end msg to AWS SNS (optional)
What is Netflix Genie
• A practice from Netflix
• Hadoop client to submit different kinds of jobs
• Flexible data model design to adopt different kinds of clusters
• Flexible job/cluster matching design (based on tags)
• Cloud characteristics built into the design
• e.g. auto-scaling, load balancing, etc.
• Its goal is plain & simple
• We use it as an internal component
https://github.com/Netflix/genie/wiki
What is Netflix Eureka
• Is a RESTful service
• Built by Netflix
• A critical component for Genie to do load balancing and failover
[Diagram: Genie, API client x 3]
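Load balancing with failover across registered instances can be sketched as a round-robin client that skips unreachable hosts. This is a generic illustration, not Eureka's API: the instance list stands in for whatever a registry lookup returns, and `send` is a hypothetical transport function supplied by the caller.

```python
from typing import Callable, List


class RoundRobinClient:
    """Rotate over registered instances, skipping ones that fail."""

    def __init__(self, instances: List[str]):
        self._instances = list(instances)
        self._next = 0  # index of the instance to try first on the next call

    def call(self, send: Callable[[str], str]) -> str:
        count = len(self._instances)
        last_error = None
        for offset in range(count):
            url = self._instances[(self._next + offset) % count]
            try:
                result = send(url)
            except ConnectionError as exc:
                last_error = exc  # instance unreachable, try the next one
                continue
            # Start the next call at the instance after the one that answered.
            self._next = (self._next + offset + 1) % count
            return result
        raise RuntimeError("no instance reachable") from last_error
```

Each call starts where the previous one left off, so healthy instances share the load and a dead one only costs a failed attempt.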

Weitere ähnliche Inhalte

Was ist angesagt?

Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...Amazon Web Services
 
Data Con LA 2019 - Integrating Kafka with a Real-Time Database by David Anderson
Data Con LA 2019 - Integrating Kafka with a Real-Time Database by David AndersonData Con LA 2019 - Integrating Kafka with a Real-Time Database by David Anderson
Data Con LA 2019 - Integrating Kafka with a Real-Time Database by David AndersonData Con LA
 
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...Amazon Web Services
 
AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...
AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...
AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...Amazon Web Services
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)Amazon Web Services
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSAmazon Web Services
 
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure Redis Labs
 
AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...
AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...
AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...Amazon Web Services
 
Fraud Detection for Israel BigThings Meetup
Fraud Detection  for Israel BigThings MeetupFraud Detection  for Israel BigThings Meetup
Fraud Detection for Israel BigThings MeetupGwen (Chen) Shapira
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)Amazon Web Services
 
Redis Reliability, Performance & Innovation
Redis Reliability, Performance & InnovationRedis Reliability, Performance & Innovation
Redis Reliability, Performance & InnovationRedis Labs
 
AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...
AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...
AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...Amazon Web Services
 
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...Amazon Web Services
 
Architecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash WorkshopArchitecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash WorkshopSudhir Tonse
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on DockerRakesh Saha
 
(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The Cloud(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The CloudAmazon Web Services
 
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)VMware Tanzu
 

Was ist angesagt? (20)

Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
 
Data Con LA 2019 - Integrating Kafka with a Real-Time Database by David Anderson
Data Con LA 2019 - Integrating Kafka with a Real-Time Database by David AndersonData Con LA 2019 - Integrating Kafka with a Real-Time Database by David Anderson
Data Con LA 2019 - Integrating Kafka with a Real-Time Database by David Anderson
 
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
 
AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...
AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...
AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Fl...
 
Svc 202-netflix-open-source
Svc 202-netflix-open-sourceSvc 202-netflix-open-source
Svc 202-netflix-open-source
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECS
 
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
 
AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...
AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...
AWS re:Invent 2016: Get Technically Inspired by Container-Powered Migrations ...
 
Fraud Detection for Israel BigThings Meetup
Fraud Detection  for Israel BigThings MeetupFraud Detection  for Israel BigThings Meetup
Fraud Detection for Israel BigThings Meetup
 
Kafka Security
Kafka SecurityKafka Security
Kafka Security
 
Blue green deployment
Blue green deploymentBlue green deployment
Blue green deployment
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
 
Redis Reliability, Performance & Innovation
Redis Reliability, Performance & InnovationRedis Reliability, Performance & Innovation
Redis Reliability, Performance & Innovation
 
AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...
AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...
AWS re:Invent 2016: From Resilience to Ubiquity - #NetflixEverywhere Global A...
 
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
 
Architecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash WorkshopArchitecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash Workshop
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on Docker
 
(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The Cloud(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The Cloud
 
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
 

Ähnlich wie analytic engine - a common big data computation service on the aws

Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Amazon Web Services
 
Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Amazon Web Services
 
Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...
Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...
Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...Amazon Web Services
 
Raleigh DevDay 2017: Build a serverless web application in one day workshop
Raleigh DevDay 2017: Build a serverless web application in one day workshopRaleigh DevDay 2017: Build a serverless web application in one day workshop
Raleigh DevDay 2017: Build a serverless web application in one day workshopAmazon Web Services
 
Serverless at Lifestage
Serverless at LifestageServerless at Lifestage
Serverless at LifestageBATbern
 
Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017Amazon Web Services
 
AWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWS
AWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWSAWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWS
AWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWSAmazon Web Services
 
Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...
Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...
Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...Amazon Web Services
 
How to Bring Microsoft Apps to AWS - AWS Online Tech Talks
How to Bring Microsoft Apps to AWS - AWS Online Tech TalksHow to Bring Microsoft Apps to AWS - AWS Online Tech Talks
How to Bring Microsoft Apps to AWS - AWS Online Tech TalksAmazon Web Services
 
10 Pro Tips for Scaling Your Startup from 0-10M Users
10 Pro Tips for Scaling Your Startup from 0-10M Users10 Pro Tips for Scaling Your Startup from 0-10M Users
10 Pro Tips for Scaling Your Startup from 0-10M UsersAmazon Web Services
 
Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...
Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...
Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...Amazon Web Services
 
GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...
GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...
GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...Amazon Web Services
 
Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018
Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018
Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018Amazon Web Services
 
Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...
Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...
Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...Amazon Web Services
 
GAM307_Ubisoft How For Honor Runs Using Amazon ECS
GAM307_Ubisoft How For Honor Runs Using Amazon ECSGAM307_Ubisoft How For Honor Runs Using Amazon ECS
GAM307_Ubisoft How For Honor Runs Using Amazon ECSAmazon Web Services
 
Best-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWSBest-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWSAmazon Web Services
 
Building scalable OTT workflows on AWS - Serverless Video Workflows
Building scalable OTT workflows on AWS - Serverless Video WorkflowsBuilding scalable OTT workflows on AWS - Serverless Video Workflows
Building scalable OTT workflows on AWS - Serverless Video WorkflowsAmazon Web Services
 

Ähnlich wie analytic engine - a common big data computation service on the aws (20)

Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017
 
Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017
 
Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...
Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...
Cost Optimization for Microsoft Workloads on AWS - AWS Transformation Day: Sa...
 
Raleigh DevDay 2017: Build a serverless web application in one day workshop
Raleigh DevDay 2017: Build a serverless web application in one day workshopRaleigh DevDay 2017: Build a serverless web application in one day workshop
Raleigh DevDay 2017: Build a serverless web application in one day workshop
 
Serverless at Lifestage
Serverless at LifestageServerless at Lifestage
Serverless at Lifestage
 
Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017
 
AWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWS
AWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWSAWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWS
AWS Summit Stockholm 2014 – B2 – Migrating enterprise applications to AWS
 
Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...
Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...
Realize Value of Your Microsoft Investments - AWS Transformation Days Raleigh...
 
How to Bring Microsoft Apps to AWS - AWS Online Tech Talks
How to Bring Microsoft Apps to AWS - AWS Online Tech TalksHow to Bring Microsoft Apps to AWS - AWS Online Tech Talks
How to Bring Microsoft Apps to AWS - AWS Online Tech Talks
 
ARC205_Born in the Cloud
ARC205_Born in the CloudARC205_Born in the Cloud
ARC205_Born in the Cloud
 
10 Pro Tips for Scaling Your Startup from 0-10M Users
10 Pro Tips for Scaling Your Startup from 0-10M Users10 Pro Tips for Scaling Your Startup from 0-10M Users
10 Pro Tips for Scaling Your Startup from 0-10M Users
 
Managing Your Cloud Assets
Managing Your Cloud AssetsManaging Your Cloud Assets
Managing Your Cloud Assets
 
Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...
Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...
Realize Value of Your Microsoft Investments- Transformation Day Philadelphia ...
 
GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...
GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...
GPSBUS220-Refactor and Replatform .NET Apps to Use the Latest Microsoft SQL S...
 
Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018
Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018
Realize Value of Your Microsoft Investments - AWS Transformation Day Boston 2018
 
Introduction to Serverless
Introduction to ServerlessIntroduction to Serverless
Introduction to Serverless
 
Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...
Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...
Realize Value, Reduce Costs And Optimize the Value of Your Microsoft Investme...
 
GAM307_Ubisoft How For Honor Runs Using Amazon ECS
GAM307_Ubisoft How For Honor Runs Using Amazon ECSGAM307_Ubisoft How For Honor Runs Using Amazon ECS
GAM307_Ubisoft How For Honor Runs Using Amazon ECS
 
Best-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWSBest-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWS
 
Building scalable OTT workflows on AWS - Serverless Video Workflows
Building scalable OTT workflows on AWS - Serverless Video WorkflowsBuilding scalable OTT workflows on AWS - Serverless Video Workflows
Building scalable OTT workflows on AWS - Serverless Video Workflows
 

Mehr von Scott Miao

My thoughts for - Building CI/CD Pipelines for Serverless Applications sharing
My thoughts for - Building CI/CD Pipelines for Serverless Applications sharingMy thoughts for - Building CI/CD Pipelines for Serverless Applications sharing
My thoughts for - Building CI/CD Pipelines for Serverless Applications sharingScott Miao
 
Zero-downtime Hadoop/HBase Cross-datacenter Migration
Zero-downtime Hadoop/HBase Cross-datacenter MigrationZero-downtime Hadoop/HBase Cross-datacenter Migration
Zero-downtime Hadoop/HBase Cross-datacenter MigrationScott Miao
 
Attack on graph
Attack on graphAttack on graph
Attack on graphScott Miao
 
004 architecture andadvanceduse
004 architecture andadvanceduse004 architecture andadvanceduse
004 architecture andadvanceduseScott Miao
 
003 admin featuresandclients
003 admin featuresandclients003 admin featuresandclients
003 admin featuresandclientsScott Miao
 
006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradminScott Miao
 
005 cluster monitoring
005 cluster monitoring005 cluster monitoring
005 cluster monitoringScott Miao
 
002 hbase clientapi
002 hbase clientapi002 hbase clientapi
002 hbase clientapiScott Miao
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introductionScott Miao
 
20121022 tm hbasecanarytool
20121022 tm hbasecanarytool20121022 tm hbasecanarytool
20121022 tm hbasecanarytoolScott Miao
 

Mehr von Scott Miao (10)

My thoughts for - Building CI/CD Pipelines for Serverless Applications sharing
My thoughts for - Building CI/CD Pipelines for Serverless Applications sharingMy thoughts for - Building CI/CD Pipelines for Serverless Applications sharing
My thoughts for - Building CI/CD Pipelines for Serverless Applications sharing
 
Zero-downtime Hadoop/HBase Cross-datacenter Migration
Zero-downtime Hadoop/HBase Cross-datacenter MigrationZero-downtime Hadoop/HBase Cross-datacenter Migration
Zero-downtime Hadoop/HBase Cross-datacenter Migration
 
Attack on graph
Attack on graphAttack on graph
Attack on graph
 
004 architecture andadvanceduse
004 architecture andadvanceduse004 architecture andadvanceduse
004 architecture andadvanceduse
 
003 admin featuresandclients
003 admin featuresandclients003 admin featuresandclients
003 admin featuresandclients
 
006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradmin
 
005 cluster monitoring
005 cluster monitoring005 cluster monitoring
005 cluster monitoring
 
002 hbase clientapi
002 hbase clientapi002 hbase clientapi
002 hbase clientapi
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introduction
 
20121022 tm hbasecanarytool
20121022 tm hbasecanarytool20121022 tm hbasecanarytool
20121022 tm hbasecanarytool
 

Kürzlich hochgeladen

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Analytic Engine - A common Big Data computation service on the AWS

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Scott Miao, SPN, Trend Micro 2016/5/20 Analytic Engine A common Big Data computation service on the AWS
  • 2. Who am I • Scott Miao • RD, SPN, Trend Micro • Worked on the Hadoop ecosystem for about 6 years • Worked on AWS for Big Data for about 3 years • Expertise in HDFS/MR/HBase • Speaker at several Hadoop-related conferences • @takeshi.miao
  • 3. Agenda • What problems did we suffer? • Why AWS? • Analytic Engine • The benefits AWS brings to AE • AE roadmap on AWS
  • 4. What problems did we suffer?
  • 5. Hadoop Expansion • Data volume increases 1.5x to 2x every year • Data center issues: network bottleneck, server depreciation (chart label: growth becomes 2x)
  • 7. Return on Investment • On traditional infrastructure, we put a lot of effort into service operations • On the cloud, we can leverage its elasticity to automate our services • More focus on innovation! (chart labels: Time, Money, Revenue, Cost)
  • 8. AWS is a leader among IaaS platforms • Source: Gartner (May 2015), https://www.gartner.com/doc/reprints?id=1-2G2O5FC&ct=150519&st=sb
  • 9. AWS Evaluation • Cost acceptable • Functionality satisfied • Performance satisfied
  • 10. Analytic Engine A common Big Data computation service on the AWS
  • 11. High Level Architecture (diagram labels: RDs, researchers, and services as users; Analytic Engine (AE), the common computation service, with a RESTful API offering createCluster, submitJob, deleteCluster; AWS EMR; CloudStorage (CS), the common storage service, with a RESTful API; "input from" and "output to" arrows between the computation and storage sides)
  • 12. Common Cloud Services in Trend • Analytic Engine: computation service for Trenders; based on AWS EMR; simple RESTful API calls; computing on demand (short-lived or long-running); no operation effort; pay by computing resources • Cloud Storage: storage service for Trenders; based on AWS S3; simple RESTful API calls; share data with everyone in one place; metadata search for files; no operation effort; pay by storage size used
  • 13. Why do we use AE instead of EMR directly? • Abstraction • Avoid lock-in • Hide implementation details behind the scenes • AWS EMR was not designed for long-running jobs: with AMI >= 3.1.1, at most 256 ACTIVE or PENDING jobs (steps); with AMI < 3.1.1, at most 256 jobs in total • Better integration with other common services • Keep our hands off AWS native code • Centralized authentication and authorization • No AWS/SSH keys for users • Reference: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/AddingStepstoaJobFlow.html
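The step limits on slide 13 can be sketched as a small admission check. This is only an illustration under stated assumptions: the slides do not say how AE works around the limit, `can_accept_step` and `parse_ami_version` are hypothetical names, and the state names follow the wording on the slide rather than any particular EMR API response.

```python
from typing import Iterable, Tuple

STEP_LIMIT = 256                    # limit quoted on slide 13
UNFINISHED = {"PENDING", "ACTIVE"}  # state names as written on the slide


def parse_ami_version(version: str) -> Tuple[int, ...]:
    """Turn '3.1.1' into (3, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def can_accept_step(ami_version: str, step_states: Iterable[str]) -> bool:
    """Return True if one more step would stay within the limit.

    Hypothetical helper: on AMI >= 3.1.1 only unfinished steps count,
    on older AMIs every step ever submitted counts.
    """
    states = list(step_states)
    if parse_ami_version(ami_version) >= (3, 1, 1):
        used = sum(1 for state in states if state in UNFINISHED)
    else:
        used = len(states)
    return used < STEP_LIMIT
```

Under this rule a long-running cluster on an older AMI stops accepting work after 256 steps regardless of how many have finished, which is the behaviour the slide points to as a poor fit for long-running jobs.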
  • 14. Common use cases for AE • User creates a cluster • User can create multiple clusters • User submits a job to a target cluster • AE delivers the job to a secondary cluster • Users want to know their cost
  • 15. Usecase#1 – User creates a cluster • 1. User invokes createCluster • 2. AE launches an EMR cluster for the user, with tags attached • Example cluster: tags 'sched:routine', 'env:prod'; m3.xlarge * 10 • Callout: "It is a RESTful API, so I can use any client I am familiar with!" (diagram labels: users, AE, createCluster, EMR)
  • 16. Usecase#2 – User can create as many clusters as needed • 1. User invokes createCluster • 2. AE launches another new EMR cluster for the user, with tags attached • 3. User can create as many clusters as he/she needs • Existing cluster: tags 'sched:routine', 'env:prod'; m3.xlarge * 10 • New cluster: tags 'sched:adhoc', 'env:prod'; c3.4xlarge * 20 (diagram labels: users, AE, createCluster, EMR)
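As a rough illustration of the two createCluster calls above, the request body could be modelled as follows. The slides show only the tags, instance type, and instance count, so the class name, the field names, and the JSON shape are all assumptions, not the actual AE API.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CreateClusterRequest:
    # All field names are hypothetical; the slides show only tags,
    # an instance type, and an instance count.
    tags: List[str]
    instance_type: str
    instance_count: int

    def to_json(self) -> str:
        """Serialize the request body, rejecting an empty cluster."""
        if self.instance_count < 1:
            raise ValueError("instance_count must be at least 1")
        return json.dumps(asdict(self), sort_keys=True)


# The two clusters shown on slides 15 and 16.
routine = CreateClusterRequest(["sched:routine", "env:prod"], "m3.xlarge", 10)
adhoc = CreateClusterRequest(["sched:adhoc", "env:prod"], "c3.4xlarge", 20)
```

Because the body is plain JSON, any HTTP client can send it, which matches the callout on slide 15.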
  • 17. Usecase#3 – User submits a job to run on a target cluster • 1. User invokes submitJob • 2. AE matches the job and delivers it to the target cluster • 3. AE submits the job • 4. EMR pulls data from CS • 5. Job runs on the target cluster • 6. EMR outputs the result to CS • 7. AE sends a message to an SNS topic if the user specified one • Example: clusterCriteria: [['sched:adhoc', 'env:prod'], ['env:prod']]; clusters tagged 'sched:routine', 'env:prod' and 'sched:adhoc', 'env:prod' (diagram labels: users, AE, submitJob, EMR, CS)
  • 18. Usecase#4 – AE delivers the job to a secondary cluster if the target cluster is down • 1. User invokes submitJob • 2. AE matches the job and delivers it to the secondary cluster • 3. AE submits the job • 4. EMR pulls data from CS • 5. Job runs on the selected cluster • 6. EMR outputs the result to CS • Example: clusterCriteria: [['sched:adhoc', 'env:prod'], ['env:prod']]; clusters tagged 'sched:routine', 'env:prod' and 'sched:adhoc', 'env:prod' (diagram labels: users, AE, submitJob, EMR, CS)
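The matching in use cases 3 and 4 can be sketched as below, assuming clusterCriteria is an ordered list of tag sets that are tried in turn and that a cluster matches when it is up and carries every tag in the set. The slides do not spell out the exact rule, so this is one plausible reading; `Cluster` and `select_cluster` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    name: str             # hypothetical identifier, not from the slides
    tags: FrozenSet[str]
    up: bool = True


def select_cluster(criteria: Sequence[Sequence[str]],
                   clusters: Sequence[Cluster]) -> Optional[Cluster]:
    """Try each tag set in priority order; return the first running
    cluster that carries every tag in the set, or None if none match."""
    for wanted in criteria:
        wanted_set = set(wanted)
        for cluster in clusters:
            if cluster.up and wanted_set <= cluster.tags:
                return cluster
    return None


criteria = [["sched:adhoc", "env:prod"], ["env:prod"]]
routine = Cluster("routine", frozenset({"sched:routine", "env:prod"}))
adhoc = Cluster("adhoc", frozenset({"sched:adhoc", "env:prod"}))
adhoc_down = Cluster("adhoc", frozenset({"sched:adhoc", "env:prod"}), up=False)

target = select_cluster(criteria, [routine, adhoc])         # use case 3
fallback = select_cluster(criteria, [routine, adhoc_down])  # use case 4
```

With both clusters up, the first criteria set selects the ad hoc cluster. With the ad hoc cluster down, the first set matches nothing and the looser ['env:prod'] set falls through to the routine cluster.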
  • 19. Usecase#5 – User wants to know what their current cost is • AWS console: Billing & Cost Management -> Cost Explorer -> Launch Cost Explorer
  • 20. Middle Level Architecture (diagram labels: IDC services; Internet, HTTPS/HTTP Basic/VPN; peering, HTTPS; Oregon (us-west-2); AZa, AZb, AZc (multi-AZ); internal ELB; AE API servers, workers, and Eureka in Auto Scaling groups; RDS; EMR clusters; cross-account S3 buckets for input/output; Cloud Storage over HTTPS/HTTP Basic; Amazon SNS)
  • 21. The benefits AWS brings to AE
  • 22. Pros & Cons (IDC vs. AWS) • Data capacity: IDC is limited by physical rack space; AWS has no limit within a reasonable amount • Computation capacity: IDC is limited by physical rack space; AWS has no limit within a reasonable amount • DevOps: hard in the IDC, because it runs on physical machines or a VM farm; easy on AWS, because everything is code (continuous deployment) • Scalability: hard in the IDC, because it runs on physical machines or a VM farm; easy on AWS, relying on ELB and Auto Scaling groups
  • 23. Pros & Cons (IDC vs. AWS) • Disaster recovery: hard in the IDC, because it runs on physical machines or a VM farm; easy on AWS, because everything is code • Data location: limited in the IDC by its location; varied and easy on AWS thanks to multiple regions • Cost: in the IDC, implied in the total cost of ownership; on AWS, acceptable with a cloud-optimized design
  • 28. 1. Infrastructure pre-built by AWS CF • 2. User permissions granted • 3. RDS pre-launched (diagram labels: Oregon (us-west-2); AZa, AZb, AZc; RDS; peering, HTTPS; cross-account S3 buckets for input/output)
  • 29. 4. Provision AE SaaS by CI/CD (diagram labels: Oregon (us-west-2); AZa, AZb, AZc (multi-AZ); internal ELB; AE API servers, workers, and Eureka in Auto Scaling groups; RDS; EMR clusters; peering, HTTPS; cross-account S3 buckets for input/output)
  • 30. 5. Users can access via VPN; firewall open for Trend • 6. Input from CS or S3 • 7. Computation in an AWS EMR cluster • 8. Output to CS or S3 • 9. Job-end message to AWS SNS (optional) (diagram labels: IDC services; Internet, HTTPS/HTTP Basic/VPN; peering, HTTPS; Oregon (us-west-2); AZa, AZb, AZc (multi-AZ); internal ELB; AE API servers, workers, and Eureka in Auto Scaling groups; RDS; EMR clusters; cross-account S3 buckets; Cloud Storage over HTTPS/HTTP Basic; Amazon SNS)
  • 31. What is Netflix Genie • A practice from Netflix • Hadoop client to submit different kinds of jobs • Flexible data model design to adopt different kinds of clusters • Flexible job/cluster matching design (based on tags) • Cloud characteristics built into the design, e.g. auto-scaling, load balancing, etc. • Its goal is plain and simple • We use it as an internal component • https://github.com/Netflix/genie/wiki
  • 32. What is Netflix Eureka • A RESTful service • Built by Netflix • A critical component for Genie to do load balancing and failover (diagram labels: Genie, API clients)
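The load-balancing and failover role described on slide 32 can be illustrated with a small in-memory registry. This sketch does not use the Eureka client or its REST API; the `Registry` class and the instance addresses are hypothetical stand-ins that show only the idea of rotating over healthy instances and skipping ones marked down.

```python
from typing import Dict


class Registry:
    """In-memory stand-in for a service registry (not the Eureka API)."""

    def __init__(self) -> None:
        self._instances: Dict[str, bool] = {}  # address -> is it up?
        self._next = 0

    def register(self, address: str) -> None:
        self._instances[address] = True

    def mark_down(self, address: str) -> None:
        self._instances[address] = False

    def pick(self) -> str:
        """Round-robin over the instances currently marked up."""
        up = [addr for addr, ok in self._instances.items() if ok]
        if not up:
            raise RuntimeError("no healthy instance registered")
        address = up[self._next % len(up)]
        self._next += 1
        return address


registry = Registry()
for addr in ("genie-a:8080", "genie-b:8080", "genie-c:8080"):  # hypothetical
    registry.register(addr)

first_round = [registry.pick() for _ in range(3)]  # one pick per instance
registry.mark_down("genie-b:8080")                 # simulate a failure
after_failure = {registry.pick() for _ in range(4)}
```

Once an instance is marked down, later picks rotate only over the remaining ones, so clients keep getting a working address without changing their own configuration.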