Lynn Langit
New AWS Services
For bioinformatics pipelines
Feb 2017
New AWS Services
• Useful for scaling bioinformatics pipelines
• Announced at re:Invent (Nov 2016)
• Athena
• Step Functions
• Batch
• Glue
• QuickSight
Starting Point for CSIRO
Serverless AWS Lambda Application
Public Genomic Datasets
About AWS Athena
Serverless SQL queries on S3 data
AWS Athena Information
• Add table (structure) to database via DDL from input file(s)
• Write and execute SQL query
• Optionally save query
• Optionally review query history
• View results
• Optionally download result set to .csv
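The first two steps above (declare a table over files in S3, then query it) can be sketched as two SQL strings assembled in Python. The database name, table name, column names, and S3 location are hypothetical placeholders, and the tab-delimited layout is an assumption about the input files, not something the slides specify.

```python
# Sketch only: database, table, columns, and S3 path are hypothetical.
DATABASE = "genomics_demo"
TABLE = "variants"
S3_LOCATION = "s3://example-bucket/variants/"  # placeholder bucket


def build_ddl(database, table, location):
    """Return Hive-style DDL that maps tab-delimited files in S3 to a table."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        "  chromosome STRING,\n"
        "  pos BIGINT,\n"
        "  ref_allele STRING,\n"
        "  alt_allele STRING\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n"
        f"LOCATION '{location}'"
    )


def build_query(database, table, chromosome):
    """Return a query that counts the variants on one chromosome."""
    return (
        "SELECT chromosome, COUNT(*) AS variant_count\n"
        f"FROM {database}.{table}\n"
        f"WHERE chromosome = '{chromosome}'\n"
        "GROUP BY chromosome"
    )


ddl = build_ddl(DATABASE, TABLE, S3_LOCATION)
query = build_query(DATABASE, TABLE, "chr1")
print(ddl)
print(query)
```

Both strings could be pasted into the Athena query editor: the DDL once to register the table, the SELECT as often as needed.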
Athena - Demo
Athena Genomics Query Example
About AWS Step Functions
Serverless visual workflows for Lambdas
AWS Step Functions
1. Define steps and services (activities or lambdas)
2. Verify step execution(s)
3. Monitor and scale
“Your application as a state machine.”
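A minimal sketch of step 1, assuming a two-step workflow in which each step is a Lambda function. The state names and Lambda ARNs are hypothetical. The dictionary follows the Amazon States Language layout (StartAt, States, Task, Next, End), and the helper only walks a simple linear chain.

```python
import json

# Hypothetical Lambda ARNs; substitute functions from your own account.
ALIGN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:align-reads"
CALL_ARN = "arn:aws:lambda:us-east-1:123456789012:function:call-variants"


def build_state_machine(align_arn, call_arn):
    """Return a two-step Amazon States Language definition as a dict."""
    return {
        "Comment": "Sketch of a two-step genomics workflow",
        "StartAt": "AlignReads",
        "States": {
            "AlignReads": {
                "Type": "Task",
                "Resource": align_arn,
                "Next": "CallVariants",
            },
            "CallVariants": {
                "Type": "Task",
                "Resource": call_arn,
                "End": True,
            },
        },
    }


def execution_order(definition):
    """Follow Next links from StartAt until a state marked End (linear chains only)."""
    order = []
    name = definition["StartAt"]
    while name is not None:
        order.append(name)
        state = definition["States"][name]
        name = None if state.get("End") else state["Next"]
    return order


definition = build_state_machine(ALIGN_ARN, CALL_ARN)
print(json.dumps(definition, indent=2))
print(execution_order(definition))
```

The JSON printed by the first call is the kind of text you would supply as the state machine definition when creating it.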
AWS Step Functions – 1. Define Steps/Services
AWS Step Functions – 2. Verify step execution
Step Functions - Demo
About AWS Batch
Fully managed batch processing at scale
What is batch computing?
Run jobs asynchronously and automatically across one or
more computers.
Jobs may have dependencies, making the sequencing and
scheduling of multiple jobs complex and challenging.
What is AWS Batch?
Fully Managed
No software to install or servers to manage.
Integrated with AWS
Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition.
Cost-optimized Provisioning
Automatically provisions compute resources tailored to the job's needs using EC2 and EC2 Spot.
AWS Batch Concepts
1. Jobs
   1. Job Definitions
   2. Job Queues
   3. Job States
2. Compute Environments
3. Scheduler
Short Video -- here
Jobs
Jobs are the unit of work executed by AWS Batch as containerized
applications running on Amazon EC2.
Containerized jobs can reference a container image, command, and
parameters or users can simply provide a .zip containing their
application and we will run it on a default Amazon Linux container.
$ aws batch submit-job --job-name variant-calling \
    --job-definition gatk --job-queue genomics
Massively parallel jobs
• Now - users can submit a large number of independent “simple jobs.”
• Soon – AWS will add support for “array jobs” that run many copies of an
application against an array of elements.
Array jobs are an efficient way to run:
• Parametric sweeps
• Monte Carlo simulations
• Processing a large collection of objects
NOTE: These use cases are possible today: simply submit more jobs.
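The "simply submit more jobs" approach can be sketched as one request per input. The queue and job definition names reuse the CLI example above. The sample identifiers and the SAMPLE_ID environment variable are hypothetical. The code only builds the request dictionaries; it does not call AWS.

```python
# Sketch only: sample IDs and the SAMPLE_ID variable are hypothetical.
def build_sweep_requests(sample_ids, job_queue="genomics", job_definition="gatk"):
    """Return one SubmitJob-style request per sample (a simple parametric sweep)."""
    requests = []
    for sample_id in sample_ids:
        requests.append({
            "jobName": f"variant-calling-{sample_id}",
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
            # Each job receives its own input through an environment variable.
            "containerOverrides": {
                "environment": [{"name": "SAMPLE_ID", "value": sample_id}],
            },
        })
    return requests


requests = build_sweep_requests(["NA12878", "NA12891", "NA12892"])
for request in requests:
    print(request["jobName"])
```

Assuming an SDK client such as boto3's Batch client, each dictionary could be passed as keyword arguments to its submit_job call.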
Example Genomics
Workflow
Workflows, Pipelines, and Job Dependencies
Jobs can express a dependency on the successful
completion of other jobs or specific elements of an
array job.
Use your preferred workflow engine and language to
submit jobs. Flow-based systems simply submit jobs
serially, while DAG-based systems submit many jobs
at once, identifying inter-job dependencies.
$ aws batch submit-job --depends-on jobId=606b3ad1-aa31-48d8-92ec-f154bfc8215f ...
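A DAG-based submitter has to send each job after the jobs it depends on, because the dependency is expressed using the job ID returned by the earlier submission. The sketch below shows one way to do that with Python's standard graphlib module. The pipeline steps are hypothetical, and fake_submit stands in for a real SubmitJob call.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each key depends on the jobs in its set.
PIPELINE = {
    "align": set(),
    "sort": {"align"},
    "call-variants": {"sort"},
    "annotate": {"call-variants"},
    "qc-report": {"sort"},
}


def plan_submissions(pipeline):
    """Return (job_name, dependency_names) pairs, dependencies always first."""
    order = TopologicalSorter(pipeline).static_order()
    return [(name, sorted(pipeline[name])) for name in order]


def submit_all(pipeline, submit):
    """Submit every job in dependency order, passing earlier job IDs as dependsOn."""
    job_ids = {}
    for name, deps in plan_submissions(pipeline):
        depends_on = [{"jobId": job_ids[dep]} for dep in deps]
        job_ids[name] = submit(name, depends_on)
    return job_ids


submitted = []  # records what the stand-in submitter received


def fake_submit(name, depends_on):
    """Stand-in for a real SubmitJob call: records the request, returns a fake ID."""
    submitted.append((name, depends_on))
    return f"id-{name}"


job_ids = submit_all(PIPELINE, fake_submit)
for name, depends_on in submitted:
    print(name, depends_on)
```

A flow-based system would instead wait for each job to finish before submitting the next, which needs no dependency list but leaves the scheduler with less work to overlap.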
Job Definitions
Batch Job Definitions specify how jobs are to be run. While each job
must reference a job definition, many parameters can be overridden.
Some of the attributes specified in a job definition:
• IAM role associated with the job
• vCPU and memory requirements
• Mount points
• Container properties
• Environment variables
$ aws batch register-job-definition --job-definition-name gatk \
    --container-properties ...
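The attributes listed above map onto a request body roughly like the following sketch. The image URI, role ARN, command, environment variable, and paths are hypothetical placeholders. The apply_overrides helper is a simplified illustration of per-job overrides for vCPUs, memory, and command only; it is not a description of how AWS Batch merges every field.

```python
import json

# Sketch only: image, role ARN, command, and paths are hypothetical.
job_definition = {
    "jobDefinitionName": "gatk",
    "type": "container",
    "containerProperties": {
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/gatk:latest",
        "vcpus": 4,
        "memory": 16000,  # MiB
        "command": ["gatk", "HaplotypeCaller"],
        "jobRoleArn": "arn:aws:iam::123456789012:role/example-batch-job-role",
        "environment": [{"name": "REFERENCE", "value": "hg38"}],
        "volumes": [{"name": "scratch", "host": {"sourcePath": "/mnt/scratch"}}],
        "mountPoints": [{"sourceVolume": "scratch", "containerPath": "/scratch"}],
    },
}

# Settings this simplified model allows a single job to override.
OVERRIDABLE = ("vcpus", "memory", "command")


def apply_overrides(definition, overrides):
    """Return the container settings one job would use, leaving the definition unchanged."""
    effective = dict(definition["containerProperties"])
    for key in OVERRIDABLE:
        if key in overrides:
            effective[key] = overrides[key]
    return effective


effective = apply_overrides(job_definition, {"memory": 32000})
print(json.dumps(effective, indent=2))
```

Keeping defaults in the definition and passing only the differences per job is what lets one definition serve many differently sized inputs.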
Job Queues
Jobs are submitted to a Job Queue, where they reside until they are
able to be scheduled to a compute resource. Information related to
completed jobs persists in the queue for 24 hours.
$ aws batch create-job-queue --job-queue-name genomics \
    --priority 500 --compute-environment-order ...
Compute Environments
Mapped from job queues to run containerized batch jobs.
• Managed CEs - you describe your requirements (instance types,
min/max/desired vCPUs, and EC2 Spot bid as a % of On-Demand), and
AWS launches and scales resources for you. Pick specific instance types,
instance families, or simply choose “optimal”.
• Unmanaged CEs - you can launch and manage your own resources. Your
instances need to include the ECS agent and run supported versions of Linux
and Docker. AWS Batch will then create an Amazon ECS cluster which can
accept the instances you launch. Jobs can be scheduled to your Compute
Environment as soon as your instances are healthy and register with the
ECS Agent.
$ aws batch create-compute-environment \
    --compute-environment-name unmanagedce --type UNMANAGED ...
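For comparison with the unmanaged example, a managed compute environment request might look like the sketch below. The environment name and the numbers are hypothetical. A real request also needs networking and IAM settings (subnets, security groups, instance role, service role), which are omitted here for brevity.

```python
# Sketch only: name and sizes are hypothetical; networking and IAM fields are omitted.
def managed_spot_environment(name, min_vcpus, desired_vcpus, max_vcpus, bid_percentage):
    """Build a managed, Spot-backed compute environment request body."""
    if not 0 <= min_vcpus <= desired_vcpus <= max_vcpus:
        raise ValueError("expected 0 <= min <= desired <= max vCPUs")
    if not 1 <= bid_percentage <= 100:
        raise ValueError("bid percentage must be between 1 and 100")
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",
            "minvCpus": min_vcpus,
            "desiredvCpus": desired_vcpus,
            "maxvCpus": max_vcpus,
            "instanceTypes": ["optimal"],  # let AWS Batch choose instance sizes
            "bidPercentage": bid_percentage,  # Spot bid as a % of On-Demand
        },
    }


environment = managed_spot_environment("genomics-spot", 0, 8, 256, 40)
print(environment["computeResources"])
```

Setting the minimum to zero vCPUs lets the environment scale down completely when the queue is empty.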
AWS Batch Scheduler
The Scheduler evaluates when, where, and
how to run jobs that have been submitted to
a job queue.
Jobs run in approximately the order in which
they are submitted as long as all
dependencies on other jobs have been met.
Queued Job States
• SUBMITTED: Accepted into the queue, but not yet evaluated for execution
• PENDING: Your job has dependencies on other jobs which have not yet
completed
• RUNNABLE: Your job has been evaluated by the scheduler and is ready to run
• STARTING: Your job is in the process of being scheduled to a compute
resource
• RUNNING: Your job is currently running
• SUCCEEDED: Your job has finished with exit code 0
• FAILED: Your job finished with a non-zero exit code or was cancelled or
terminated.
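The states above form a small state machine. The sketch below encodes one reading of it: the transition table is an assumption inferred from the descriptions in this list, not an official specification, and the helper names are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "SUBMITTED"
    PENDING = "PENDING"
    RUNNABLE = "RUNNABLE"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# Assumed transitions: any unfinished job may also move to FAILED
# when it is cancelled or terminated.
TRANSITIONS = {
    JobState.SUBMITTED: {JobState.PENDING, JobState.RUNNABLE, JobState.FAILED},
    JobState.PENDING: {JobState.RUNNABLE, JobState.FAILED},
    JobState.RUNNABLE: {JobState.STARTING, JobState.FAILED},
    JobState.STARTING: {JobState.RUNNING, JobState.FAILED},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED},
    JobState.SUCCEEDED: set(),
    JobState.FAILED: set(),
}


def can_transition(current, target):
    """Return True if the assumed table allows moving from current to target."""
    return target in TRANSITIONS[current]


def final_state(exit_code):
    """Map a container exit code to the terminal job state."""
    return JobState.SUCCEEDED if exit_code == 0 else JobState.FAILED


print(can_transition(JobState.PENDING, JobState.RUNNABLE))
print(final_state(1).value)
```

A polling client can treat SUCCEEDED and FAILED as the only states with no outgoing transitions and stop polling once a job reaches either.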
AWS Batch Actions
• CancelJob: Marks jobs that are not yet STARTING as
FAILED.
• TerminateJob: Cancels jobs that are currently waiting in the
queue. Stops jobs that are in a STARTING or RUNNING state
and transitions them to FAILED.
NOTE: Requires a “reason” which is viewable via DescribeJobs
$ aws batch cancel-job --job-id 8a767ac8-e28a-4c97-875b-e5c0bcf49eb8 \
    --reason "Submitted to wrong queue"
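The difference between the two actions is easiest to see as a function of the job's current state. This sketch models only the behavior described on this slide. The function names are hypothetical, and states are plain strings.

```python
# States a job can be in before it reaches STARTING.
NOT_YET_STARTING = {"SUBMITTED", "PENDING", "RUNNABLE"}
# States in which a job already occupies, or is being placed on, a compute resource.
ACTIVE = {"STARTING", "RUNNING"}


def cancel_job(state):
    """CancelJob: only jobs that have not reached STARTING become FAILED."""
    return "FAILED" if state in NOT_YET_STARTING else state


def terminate_job(state):
    """TerminateJob: queued jobs are cancelled and STARTING or RUNNING jobs are stopped; all end as FAILED."""
    return "FAILED" if state in NOT_YET_STARTING | ACTIVE else state


for state in ["RUNNABLE", "RUNNING", "SUCCEEDED"]:
    print(state, "->", cancel_job(state), "/", terminate_job(state))
```

In this model, cancelling a RUNNING job leaves it running, while terminating it moves it to FAILED; finished jobs are unaffected by either action.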
AWS Batch Data Types
• ComputeEnvironmentDetail
• ComputeEnvironmentOrder
• ComputeResource
• ContainerProperties
• ContainerPropertiesResource
• CounterProperties
• Host
• Job
• JobDefinition
• JobQueueDetail
• MountPoint
• Parameter
• Ulimit
• Volume
Batch - Demo
AWS Batch Pricing and Functionality
There is no charge for AWS Batch; you only pay for the
underlying resources that you consume!
NOTE: Support for Array Jobs, retries, and jobs executed as AWS Lambda
functions coming soon!
Use the Right Tool for the Job
Not all batch workloads are the same…
• ETL and Big Data processing/analytics?
• Consider EMR, Data Pipeline, Redshift, and related services.
• Lots of small Cron jobs? AWS Batch is a great way to execute these jobs, but
you will likely want a workflow or job-scheduling system to orchestrate job
submissions.
• Efficiently run lots of big and small compute jobs on heterogeneous
compute resources? Use AWS Batch
Example: DNA Sequencing
Example: Genomics on Unmanaged Compute Environments
AWS Batch summarized
• Fully Managed
• Integrated with AWS
• Cost-optimized Resource Provisioning
About AWS Glue
Serverless managed, scalable ETL
AWS Glue
1. Build a data catalog
   1. Discover and use your datasets via a Hive-compatible metastore
   2. Store versions, connection and credential info
   3. Use crawlers to auto-generate schema from S3 data & partitions
2. Generate and edit transforms using PySpark
3. Schedule and run your jobs
   1. On schedule, event or lambda
NOTE: Glue is announced, but no beta as of yet…video from re:Invent -- here
An aside…
EC2 Elastic GPUs
About AWS QuickSight
Quick and easy data dashboards
Resources for new AWS Services
• Athena (SQL query on S3) – here
• Batch (Optimized, chained EC2 batches) – here
• Glue (Scaled ETL) -- here
• Step Functions (Lambda workflows) – here
• QuickSight (Data Dashboards) – here
• Full list of AWS services announced at re:Invent 2016 -- here

Teaching Kids Programming for DevelopersTeaching Kids Programming for Developers
Teaching Kids Programming for Developers
 
Cloud-centric Internet of Things
Cloud-centric Internet of ThingsCloud-centric Internet of Things
Cloud-centric Internet of Things
 
Building AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauBuilding AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and Tableau
 
TKPJava Eclipse and Codenvy IDE Keyboard Shortcuts
TKPJava Eclipse and Codenvy IDE Keyboard ShortcutsTKPJava Eclipse and Codenvy IDE Keyboard Shortcuts
TKPJava Eclipse and Codenvy IDE Keyboard Shortcuts
 
Understanding Codenvy - for Containerized Developer Workspaces
Understanding Codenvy - for Containerized Developer WorkspacesUnderstanding Codenvy - for Containerized Developer Workspaces
Understanding Codenvy - for Containerized Developer Workspaces
 

Kürzlich hochgeladen

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

New AWS Services for Bioinformatics

  • 1. Lynn Langit New AWS Services For bioinformatics pipelines Feb 2017
  • 2. New AWS Services • Useful for scaling bioinformatics pipelines • Announced at re:Invent (Nov 2016) • Athena • Step Functions • Batch • Glue • QuickSight
  • 4. Serverless AWS Lambda Application
  • 6. About AWS Athena Serverless SQL queries on S3 data
  • 8. AWS Athena Information • Add table (structure) to database via DDL from input file(s) • Write and execute SQL query • Optionally save query • Optionally review query history • View results • Optionally download result set to .csv
  • 11. About AWS Step Functions Serverless visual workflows for Lambdas
  • 12. AWS Step Functions 1. Define steps and services (activities or lambdas) 2. Verify step execution(s) 3. Monitor and scale “Your application as a state machine.”
  • 13. AWS Step Functions – 1. Define Steps/Services
  • 14. AWS Step Functions – 2. Verify step execution
  • 17. About AWS Batch Fully managed batch processing at scale
  • 18. What is batch computing? Run jobs asynchronously and automatically across one or more computers. Jobs may have dependencies, making the sequencing and scheduling of multiple jobs complex and challenging.
  • 19. What is AWS Batch? Fully Managed No software to install or servers to manage. Integrated with AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition Cost-optimized Provisioning Auto provisions compute resources tailored to the job needs using EC2 & EC2 Spot
  • 20. AWS Batch Concepts 1. Jobs 1. Job Definitions 2. Job Queues 3. Job States 2. Compute Environments 3. Scheduler Short Video -- here
  • 21. Jobs Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2. Containerized jobs can reference a container image, command, and parameters or users can simply provide a .zip containing their application and we will run it on a default Amazon Linux container. $ aws batch submit-job --job-name variant-calling --job-definition gatk --job-queue genomics
  • 22. Massively parallel jobs • Now - users can submit a large number of independent “simple jobs.” • Soon – AWS will add support for “array jobs” that run many copies of an application against an array of elements. Array jobs are an efficient way to run: • Parametric sweeps • Monte Carlo simulations • Processing a large collection of objects NOTE: These use cases are possible today; simply submit more jobs.
  • 24. Workflows, Pipelines, and Job Dependencies Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job. Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies. $ aws batch submit-job --depends-on jobId=606b3ad1-aa31-48d8-92ec-f154bfc8215f ...
  • 25. Job Definitions Batch Job Definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden. Some of the attributes specified in a job definition: • IAM role associated with the job • vCPU and memory requirements • Mount points • Container properties • Environment variables $ aws batch register-job-definition --job-definition-name gatk --container-properties ...
  • 26. Job Queues Jobs are submitted to a Job Queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours. $ aws batch create-job-queue --job-queue-name genomics --priority 500 --compute-environment-order ...
  • 27. Compute Environments Mapped from job queues to run containerized batch jobs. • Managed CEs - you describe your requirements (instance types, min/max/desired vCPUs, and EC2 Spot bid as a % of On-Demand), AWS launches & scales resources for you. Pick specific instance types, instance families or simply choose “optimal” • Unmanaged CEs - you can launch and manage your own resources. Your instances need to include the ECS agent and run supported versions of Linux and Docker. AWS Batch will then create an Amazon ECS cluster which can accept the instances you launch. Jobs can be scheduled to your Compute Environment as soon as your instances are healthy and register with the ECS Agent. $ aws batch create-compute-environment --compute-environment-name unmanagedce --type UNMANAGED ...
  • 28. AWS Batch Scheduler The Scheduler evaluates when, where, and how to run jobs that have been submitted to a job queue. Jobs run in approximately the order in which they are submitted as long as all dependencies on other jobs have been met.
  • 29. Queued Job States • SUBMITTED: Accepted into the queue, but not yet evaluated for execution • PENDING: Your job has dependencies on other jobs which have not yet completed • RUNNABLE: Your job has been evaluated by the scheduler and is ready to run • STARTING: Your job is in the process of being scheduled to a compute resource • RUNNING: Your job is currently running • SUCCEEDED: Your job has finished with exit code 0 • FAILED: Your job finished with a non-zero exit code or was cancelled or terminated.
  • 30. AWS Batch Actions • CancelJob: Marks jobs that are not yet STARTING as FAILED. • TerminateJob: Cancels jobs that are currently waiting in the queue. Stops jobs that are in a STARTING or RUNNING state and transitions them to FAILED. NOTE: Requires a “reason” which is viewable via DescribeJobs $ aws batch cancel-job --reason "Submitted to wrong queue" --job-id 8a767ac8-e28a-4c97-875b-e5c0bcf49eb8
  • 31. AWS Batch Data Types • ComputeEnvironmentDetail • ComputeEnvironmentOrder • ComputeResource • ContainerProperties • ContainerPropertiesResource • CounterProperties • Host • Job • JobDefinition • JobQueueDetail • MountPoint • Parameter • Ulimit • Volume
  • 33. AWS Batch Pricing and Functionality There is no charge for AWS Batch; you only pay for the underlying resources that you consume! NOTE: Support for Array Jobs, retries, and jobs executed as AWS Lambda functions coming soon!
  • 34. Use the Right Tool for the Job Not all batch workloads are the same… • ETL and Big Data processing/analytics? • Consider EMR, Data Pipeline, Redshift, and related services. • Lots of small Cron jobs? AWS Batch is a great way to execute these jobs, but you will likely want a workflow or job-scheduling system to orchestrate job submissions. • Efficiently run lots of big and small compute jobs on heterogeneous compute resources? Use AWS Batch
  • 36. Example: Genomics on Unmanaged Compute Environments
  • 37. AWS Batch summarized • Fully Managed • Integrated with AWS • Cost-optimized Resource Provisioning
  • 38. About AWS Glue Serverless, managed, scalable ETL
  • 39. AWS Glue 1. Build a data catalog 1. Discover and use your datasets via a Hive-compatible metastore 2. Store versions, connection and credential info 3. Use crawlers to auto-generate schema from S3 data & partitions 2. Generate and edit transforms using PySpark 3. Schedule and run your jobs 1. On schedule, event or lambda NOTE: Glue is announced, but no beta as of yet…video from re:Invent -- here
  • 45. About AWS QuickSight Quick and easy data dashboards
  • 48. Resources for new AWS Services • Athena (SQL query on S3) – here • Batch (Optimized, chained EC2 batches) – here • Glue (Scaled ETL) -- here • Step Functions (Lambda workflows) – here • QuickSight (Data Dashboards) – here • Full list of AWS services announced at re:Invent 2016 -- here
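
The job states and dependency rules described on the AWS Batch slides can be illustrated with a small toy model. This is only a sketch in plain Python and does not call AWS. The `Job` class, the `evaluate` and `run` helpers, and the job names are hypothetical, and the rule that a failed dependency fails its dependents is an assumption made for this illustration. The STARTING state is skipped for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a submitted batch job (not an AWS type)."""
    name: str
    depends_on: list = field(default_factory=list)
    state: str = "SUBMITTED"  # every job starts out as SUBMITTED


def evaluate(jobs):
    """One scheduler pass over queued jobs, using the state names from the slides."""
    by_name = {job.name: job for job in jobs}
    for job in jobs:
        if job.state not in ("SUBMITTED", "PENDING"):
            continue  # already runnable, running, or finished
        deps = [by_name[d] for d in job.depends_on]
        if any(d.state == "FAILED" for d in deps):
            job.state = "FAILED"    # assumption: failure propagates to dependents
        elif all(d.state == "SUCCEEDED" for d in deps):
            job.state = "RUNNABLE"  # all dependencies met (or none declared)
        else:
            job.state = "PENDING"   # still waiting on other jobs


def run(job, exit_code=0):
    """Simulate running a RUNNABLE job to completion."""
    if job.state != "RUNNABLE":
        raise ValueError(f"{job.name} is {job.state}, not RUNNABLE")
    job.state = "RUNNING"
    # Exit code 0 means SUCCEEDED; anything else means FAILED.
    job.state = "SUCCEEDED" if exit_code == 0 else "FAILED"


# A three-step pipeline in which each job depends on the previous one.
align = Job("align")
call_variants = Job("call-variants", depends_on=["align"])
annotate = Job("annotate", depends_on=["call-variants"])
pipeline = [align, call_variants, annotate]

evaluate(pipeline)
first_pass = [job.state for job in pipeline]

run(align)
evaluate(pipeline)
second_pass = [job.state for job in pipeline]

run(call_variants, exit_code=1)  # simulate a failing step
evaluate(pipeline)
third_pass = [job.state for job in pipeline]

print(first_pass, second_pass, third_pass)
```

In this model only `align` is RUNNABLE after the first pass, because the other two jobs are PENDING on unfinished dependencies. A DAG-based workflow engine would submit all three jobs up front in the same way and leave the ordering to the scheduler.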

Editor's notes

  1. https://www.csiro.au/en/Locations/NSW/North-Ryde Riverside Life Sciences Centre reception Our North Ryde site is in the heart of Sydney's high-tech hub and co-locates researchers from diverse disciplines. Our NSW science education centre is located on this site.
  2. https://aws.amazon.com/blogs/aws/genome-engineering-applications-early-adopters-of-the-cloud/
  3. https://aws.amazon.com/public-datasets/
  4. https://aws.amazon.com/blogs/big-data/interactive-analysis-of-genomic-datasets-using-amazon-athena/
  5. https://aws.amazon.com/blogs/aws/new-aws-step-functions-build-distributed-applications-using-visual-workflows/
  6. https://aws.amazon.com/blogs/aws/aws-batch-run-batch-computing-jobs-on-aws/ Jamie Kinney, Principal Product Manager, AWS Batch
  7. High-Throughput: Can process as many concurrent genomic workflows as needed (>1000 per day). Flexible: You define your containers, dependencies, and resource requirements. Batch takes care of the rest. Elastic and Scalable: Treat each workflow like a burst compute. Pay only for what you need when you need it. Cost-Optimized: Runs on Spot Fleet to significantly reduce the cost of genomic analysis.
  8. https://aws.amazon.com/glue/
  9. https://aws.amazon.com/ec2/Elastic-GPUs/
  10. https://aws.amazon.com/glue/