© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Radhika Ravirala, Solutions Architect, AWS
August 17, 2017
Serverless Big Data Architectures
Serverless Data Analytics
Agenda
Cloud Architecture Evolution – Why Serverless
Data and Analytics Flow
Key Services Overview
Design Patterns
Call to Action
Cloud Architecture Evolution
• Virtualized: Servers
• Managed: Platforms
• Serverless: Analytics
Serverless characteristics
• No servers to provision or manage
• Scales with usage
• Never pay for idle
• Availability and fault tolerance built in
Data and Analytics Flow
• Ingest/Collect
• Store
• Analyze/Process
• Visualization/Consume
• Orchestrate/Transform
What Is the Temperature of Your Data/Access?
AWS Big Data Services
Stages: Ingest/Collect, Store, Analyze/Process, Visualization/Consume, Orchestration/Transform
• Batch ETL/ELT
• Realtime ETL/ELT
• Transactional/CDC
• B.I. Tools
• Data Science Notebooks
• Bulk Transport
• File/Object Upload
• Streaming Ingest
• Commits
• Transactional
• NoSQL
• Data Lake
• Streaming Storage
• Dashboards
• Batch Analytics
• Interactive Querying
• Machine Learning/Deep Learning
• Realtime Analytics
…
AWS Big Data Services
Stages: Ingest/Collect, Store, Analyze/Process, Visualization/Consume, Orchestration/Transform
• EMR
• EC2
• S3
• Redshift
• DynamoDB
• AWS DMS (CDC)
• AWS Lambda
• Kinesis Analytics
• Amazon Athena
• Amazon QuickSight
• Aurora
• AWS Glue
• AWS Step Functions
• Kinesis Streams
• AWS Snowball
• ISV Connectors
• Kinesis Firehose
• S3 Transfer Acceleration
• Amazon Elasticsearch
Key Services Overview
Big Data Storage for Virtually All AWS Services
Amazon S3
• Store anything
• Object storage
• Scalable
• 99.999999999% durability
• Extremely low cost
Amazon DynamoDB
Fast & Flexible NoSQL Database Service
• NoSQL Database
• Seamless scalability
• Zero admin
• Single digit millisecond latency
Amazon Kinesis
Real-time Streaming Platform
• Streams, Firehose, Analytics
• Real-time processing
• High throughput; elastic
• Easy to use
• Integration with S3, EMR, Redshift, DynamoDB
Amazon Kinesis: Streaming Data Made Easy
Services make it easy to capture, deliver, and process streams on AWS
Amazon Kinesis Streams
• For technical developers
• Build your own custom applications that process or analyze streaming data
Amazon Kinesis Firehose
• For all developers and data scientists
• Easily load massive volumes of streaming data into S3, Amazon Redshift, and Amazon Elasticsearch
Amazon Kinesis Analytics
• For all developers and data scientists
• Easily analyze data streams using standard SQL queries
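The kind of aggregation that Kinesis Analytics expresses in SQL over a stream can be illustrated locally. The sketch below is plain Python, not Kinesis Analytics itself: it assumes events arrive as (timestamp in seconds, key) pairs and counts them per fixed, non-overlapping (tumbling) window. The function name, the sample events, and the 60-second window are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, str]], window_seconds: int = 60
) -> Dict[Tuple[int, str], int]:
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts: Counter = Counter()
    for timestamp, key in events:
        # Align each timestamp to the start of the window that contains it.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample stream: (seconds since start, sensor id).
sample_events = [
    (0.5, "sensor-a"),
    (12.0, "sensor-a"),
    (59.9, "sensor-b"),
    (60.0, "sensor-a"),
    (125.0, "sensor-b"),
]
window_counts = tumbling_window_counts(sample_events, window_seconds=60)
for (window_start, key), count in sorted(window_counts.items()):
    print(window_start, key, count)
```

In a managed streaming service the windowing would run continuously over unbounded input; the batch loop here only shows the grouping logic.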
AWS Lambda
Serverless Compute
• Run your code in the cloud, fully managed and highly available
• Triggered through API calls or state changes in your setup
• Scales automatically to match the incoming event rate
• Node.js (JavaScript), Python, Java, and C#
• Charged per 100 ms of execution time
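The per-100 ms charging model can be worked through as arithmetic. The sketch below assumes the billing granularity stated on the slide (duration rounded up to the next 100 ms) and a charge proportional to configured memory in GB-seconds. The price used in the example call is a placeholder, not a quoted AWS rate, and the function names are hypothetical.

```python
import math


def billed_duration_ms(duration_ms: float, increment_ms: int = 100) -> int:
    """Round an execution time up to the next billing increment."""
    return math.ceil(duration_ms / increment_ms) * increment_ms


def invocation_cost(duration_ms: float, memory_mb: int, price_per_gb_second: float) -> float:
    """Estimate the compute charge for one invocation.

    price_per_gb_second is an assumption supplied by the caller;
    check the current pricing page for real values.
    """
    gb_seconds = (billed_duration_ms(duration_ms) / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


# A 230 ms run is billed as 300 ms under 100 ms granularity.
print(billed_duration_ms(230))
# Placeholder price, for illustration only.
print(invocation_cost(230, memory_mb=512, price_per_gb_second=0.00002))
```

Because no invocations means no billed duration, the same formula also shows the "never pay for idle" property: zero invocations cost zero.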
Amazon Athena
Interactive Query Service
• Query directly from Amazon S3
• Use ANSI SQL
• Serverless
• Multiple Data Formats
• Pay per query
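Submitting a query to Athena and waiting for it to finish might look like the sketch below. It assumes a client object exposing the `start_query_execution` and `get_query_execution` calls of the boto3 Athena client; the table name, database name, and S3 output location are hypothetical. A small stand-in client is included so the sketch runs without AWS credentials.

```python
import time
from typing import Any

TERMINAL_STATES = ("SUCCEEDED", "FAILED", "CANCELLED")


def run_athena_query(
    client: Any, sql: str, database: str, output_location: str, poll_seconds: float = 1.0
) -> str:
    """Start a query, poll until it reaches a terminal state, and return that state."""
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)


class FakeAthenaClient:
    """Local stand-in for a real Athena client; reports RUNNING once, then SUCCEEDED."""

    def __init__(self) -> None:
        self._polls = 0

    def start_query_execution(self, **kwargs: Any) -> dict:
        return {"QueryExecutionId": "example-id"}

    def get_query_execution(self, QueryExecutionId: str) -> dict:
        self._polls += 1
        state = "RUNNING" if self._polls < 2 else "SUCCEEDED"
        return {"QueryExecution": {"Status": {"State": state}}}


# Hypothetical table, database, and bucket names.
final_state = run_athena_query(
    FakeAthenaClient(),
    sql="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    database="weblogs",
    output_location="s3://example-bucket/athena-results/",
    poll_seconds=0.0,
)
print(final_state)
```

With boto3 installed and credentials configured, passing `boto3.client("athena")` in place of the stand-in should exercise the same code path against the real service.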
AWS Glue
Fully Managed ETL Service
• Catalogs data sources
• Identifies data formats and data types
• Handles errors
• Manages and scales resources
• Generates ETL code
• Schedules and executes ETL jobs
New!
AWS Glue: services
Data Catalog
• Hive metastore compatible metadata repository of data sources.
• Crawls data source to infer table, data type, partition format.
Job Execution
• Runs jobs in Spark containers with automatic scaling based on SLA.
• Glue is serverless: only pay for the resources you consume.
Job Authoring
• Generates Python code to move data from source to destination.
• Edit with your favorite IDE; share code snippets using Git.
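What "crawling a data source to infer table, data type, and partition format" means can be illustrated with a toy version. The sketch below is not Glue's crawler logic; it assumes a small CSV sample and a Hive-style object key (`name=value` path segments), and the column names, type labels, and key are hypothetical.

```python
import csv
import io
from typing import Dict, List


def infer_type(values: List[str]) -> str:
    """Return the narrowest of bigint, double, string that fits every value."""
    for type_name, caster in (("bigint", int), ("double", float)):
        try:
            for value in values:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text: str) -> Dict[str, str]:
    """Infer a column-to-type mapping from a CSV sample that has a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {column: infer_type([row[column] for row in rows]) for column in rows[0]}


def partitions_from_key(key: str) -> Dict[str, str]:
    """Extract Hive-style name=value path segments from an object key."""
    partitions = {}
    for segment in key.split("/"):
        if "=" in segment:
            name, value = segment.split("=", 1)
            partitions[name] = value
    return partitions


# Hypothetical sample data and object key.
sample_csv = "user_id,amount,country\n1,9.99,US\n2,15,DE\n"
schema = infer_schema(sample_csv)
partitions = partitions_from_key("logs/year=2017/month=08/part-0000.csv")
print(schema)
print(partitions)
```

A real crawler also handles nested formats, schema drift, and sampling; the sketch only shows the idea of deriving catalog metadata from the data itself.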
Amazon QuickSight
Business Intelligence
• Fast and cloud-powered
• Easy to use, no infrastructure to manage
• Scales to hundreds of thousands of users
• Quick calculations with SPICE
• 1/10th the cost of legacy BI software
Serverless Design Patterns
Real-time Analytics
Stages: Ingest/Collect, Store, Analyze/Process, Visualization/Consume
Legend: Serverless, Managed, Virtualized
• Producer
• Apache Kafka
• SQS
• Kinesis Streams
• DynamoDB Streams
• KCL
• AWS Lambda
• Spark Streaming
• Apache Storm
• Apache Flink
• Kinesis Analytics
• Amazon SNS
• Amazon ElastiCache
• Amazon DynamoDB
• Amazon RDS
• Amazon ES
Outputs: Notifications, Alert, Analytics, KPI
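One serverless path through this pattern is a Lambda function consuming records from Kinesis Streams. The sketch below assumes the standard Kinesis event shape delivered to Lambda, where each record carries a base64-encoded payload under `kinesis.data`. The JSON payload layout, the `type` field, and the helper that builds a sample event are hypothetical.

```python
import base64
import json
from collections import Counter
from typing import Any, Dict, List


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Count incoming messages by their (hypothetical) 'type' field."""
    counts: Counter = Counter()
    for record in event["Records"]:
        # Kinesis delivers each payload to Lambda as base64-encoded bytes.
        payload = base64.b64decode(record["kinesis"]["data"])
        message = json.loads(payload)
        counts[message["type"]] += 1
    return dict(counts)


def make_kinesis_event(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a minimal test event containing only the field the handler reads."""
    return {
        "Records": [
            {
                "kinesis": {
                    "data": base64.b64encode(json.dumps(m).encode("utf-8")).decode("ascii")
                }
            }
            for m in messages
        ]
    }


sample_event = make_kinesis_event([{"type": "click"}, {"type": "view"}, {"type": "click"}])
result = handler(sample_event, None)
print(result)
```

In a deployed function the counts would typically be written to a store such as DynamoDB or published through SNS rather than returned; that step is omitted here to keep the sketch runnable locally.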
Interactive Queries
Stages: Ingest/Collect, Store, Analyze/Process, Visualization/Consume
Legend: Serverless, Managed, Virtualized
• Producer
• Amazon S3
• Amazon Redshift
• Amazon EMR
• Presto
• Impala
• Spark
• Amazon Athena
• QuickSight
Data Lake Reference Architecture
Catalog & Search: access and search metadata
• DynamoDB
• Elasticsearch
Access & User Interface: give your users easy and secure access
• API Gateway
• Identity & Access Management
• Cognito
Processing & Analytics: use of predictive and prescriptive analytics to gain better understanding
• QuickSight
• Amazon AI
• EMR
• Redshift
• Athena
• Kinesis
• RDS
Central Storage: secure, cost-effective storage in Amazon S3
• S3
Data Ingestion: get your data into S3 quickly and securely
• Snowball
• Database Migration Service
• Kinesis Firehose
• Direct Connect
Protect and Secure: use entitlements to ensure data is secure and users' identities are verified
• Security Token Service
• CloudWatch
• CloudTrail
• Key Management Service
Data Lake and Real-time Analytics
• Data Sources
• Transactional Data
• Amazon S3: Data Lake
• Amazon Kinesis Streams & Firehose
• Hadoop / Spark
• Streaming Analytics Tools
• Amazon Redshift: Data Warehouse
• Amazon DynamoDB: NoSQL Database
• AWS Lambda
• Spark Streaming on EMR
• Apache Storm on EMR
• Apache Flink on EMR
• Amazon Kinesis Analytics
• Amazon Elasticsearch Service
• Relational Database
• Amazon EMR
• Amazon Aurora
• Amazon Machine Learning: Predictive Analytics
• Any Open Source Tool of Choice on EC2: Data Science Sandbox
• Visualization / Reporting
• Serving Tier
• Amazon Athena: Clusterless SQL Query
• AWS Glue: Clusterless ETL
• Amazon ElastiCache: Redis
Serverless ETL
Stages: Store, Transform, Store, Analyze/Process, Visualize/Consume
Store (source)
• Amazon S3
• Apache Kafka
• Kinesis Streams
Transform
• Amazon EMR
• Spark
• Flink
• AWS Glue
• AWS Lambda
• ISV
Store (target)
• Amazon S3
• Apache Kafka
• Redshift
• Kinesis Streams
Also shown
• Data Catalog
• AWS Glue
• DynamoDB Streams
• DynamoDB
• Hive M/D
Serverless fits nicely into big data platforms
• AWS Serverless Big Data Services
• Complement existing big data flows
• Focus on the analytics, not on infrastructure or servers
• Stop worrying about scaling, availability, and undifferentiated heavy lifting
• Pay only for what you use
• Easily try out different tools, analytics, and solutions
DEMO
What's New in Teams Calling, Meetings and Devices March 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

Serverless Big Data Architectures: Serverless Data Analytics

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Radhika Ravirala, Solutions Architect, AWS August 17, 2017 Serverless Big Data Architectures Serverless Data Analytics
  • 2. Agenda Cloud Architecture Evolution – Why Serverless Data and Analytics Flow Key Services Overview Design Patterns Call to Action
  • 3. Cloud Architecture Evolution Virtualized Managed Serverless Virtualized Servers Managed Platforms Serverless Analytics
  • 4. No servers to provision or manage Scales with usage Never pay for idle Availability and fault tolerance built in Serverless characteristics
  • 5. Data and Analytics Flow Ingest/ Collect Store Analyze/ Process Visualization/ Consume Orchestrate/Transform
  • 6. What Is the Temperature of Your Data / Access?
  • 7. Orchestration/Transform AWS Big Data Services Ingest/ Collect Store Analyze/ Process Visualization/ Consume Batch ETL/ELT Realtime ETL/ELT Transactional / CDC B.I. Tools Data Science Notebooks Bulk Transport File/Object Upload Streaming Ingest Commits Transactional NoSQL Data Lake Streaming Storage Dashboards Batch Analytics Interactive Querying Machine Learning/ Deep Learning Realtime Analytics …
  • 8. Orchestration/Transform AWS Big Data Services Ingest/ Collect Store Analyze/ Process Visualization/ Consume = Serverless Serverless Managed Virtualized Batch ETL/ELT Realtime ETL/ELT Transactional / CDC B.I. Tools Data Science Notebooks Bulk Transport File/Object Upload Streaming Ingest Commits Transactional NoSQL Data Lake Streaming Storage Dashboards Batch Analytics Interactive Querying Machine Learning/ Deep Learning Realtime Analytics
  • 9. Orchestration/Transform AWS Big Data Services EMR EC2 S3 Redshift DynamoDB AWS DMS (CDC) AWS Lambda Kinesis Analytics Amazon Athena Amazon QuickSight Aurora AWS Glue AWS Step Functions Kinesis Streams Ingest/ Collect Store Analyze/ Process Visualization/ Consume AWS Snowball ISV Connectors Kinesis Firehose S3 Transfer Acceleration = Serverless Amazon Elasticsearch
  • 11. Big Data Storage for Virtually All AWS Services Amazon S3 • Store anything • Object storage • Scalable • 99.999999999% durability • Extremely low cost
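The durability figure on this slide can be turned into a rough expectation. The sketch below assumes the 99.999999999% figure is read as an average per-object durability over one year, so the annual loss probability per object is 1e-11; the function names are illustrative and not part of any AWS API.

```python
from fractions import Fraction

# Assumption: 99.999999999% durability is an average annual, per-object figure,
# which leaves an annual loss probability of 1e-11 per object.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year for a given object count."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    # Ten million stored objects under the assumption above.
    print(years_per_expected_loss(10_000_000))
```

Exact fractions are used instead of floats so that the tiny probability is not distorted by rounding.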
  • 12. Amazon DynamoDB Fast & Flexible NoSQL Database Service • NoSQL Database • Seamless scalability • Zero admin • Single digit millisecond latency
  • 13. Amazon Kinesis Real-time Streaming Platform • Streams, Firehose, Analytics • Real-time processing • High throughput; elastic • Easy to use • Integration with S3, EMR, Redshift, DynamoDB
  • 14. Amazon Kinesis Streams • For Technical Developers • Build your own custom applications that process or analyze streaming data Amazon Kinesis Firehose • For all developers, data scientists • Easily load massive volumes of streaming data into S3, Amazon Redshift and Amazon Elasticsearch Amazon Kinesis Analytics • For all developers, data scientists • Easily analyze data streams using standard SQL queries Amazon Kinesis: Streaming Data Made Easy Services make it easy to capture, deliver and process streams on AWS
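As a minimal sketch of the "build your own custom application" idea for Kinesis Streams, the handler below processes a batch of stream records the way an AWS Lambda consumer receives them: each record carries its payload base64-encoded under `record["kinesis"]["data"]`. The JSON payload shape and its `type` field are hypothetical, chosen only for illustration, and the helper `make_record` exists just to build a local test event.

```python
import base64
import json


def handler(event, context):
    """Count incoming stream messages by their (hypothetical) 'type' field."""
    counts = {}
    for record in event["Records"]:
        # Kinesis delivers the raw payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        message = json.loads(payload)
        kind = message.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def make_record(message: dict) -> dict:
    """Build a local stand-in for one Kinesis record (testing helper only)."""
    data = base64.b64encode(json.dumps(message).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


if __name__ == "__main__":
    sample_event = {
        "Records": [
            make_record({"type": "click"}),
            make_record({"type": "view"}),
            make_record({"type": "click"}),
        ]
    }
    print(handler(sample_event, None))
```

Running the file locally exercises the handler without any AWS resources; in a real deployment the stream would be configured as the function's event source.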
  • 15. AWS Lambda • Run your code in the cloud - fully managed and highly-available • Triggered through API or state changes in your setup • Scales automatically to match the incoming event rate • Node.js (JavaScript), Python, Java, and C# • Charged per 100ms execution time Serverless Compute
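The "charged per 100ms" point amounts to rounding each invocation's duration up to the next billing increment. The sketch below shows only that rounding; the 100 ms increment is taken from the slide (2017) and may not match current pricing, and the function name is illustrative.

```python
# Billing increment as stated on the slide; treat it as an assumption.
BILLING_INCREMENT_MS = 100


def billed_duration_ms(actual_ms: int) -> int:
    """Round an execution time in whole milliseconds up to the next increment."""
    if actual_ms < 0:
        raise ValueError("duration cannot be negative")
    # Ceiling division on integers avoids floating-point rounding surprises.
    increments = -(-actual_ms // BILLING_INCREMENT_MS)
    return increments * BILLING_INCREMENT_MS


if __name__ == "__main__":
    for duration in (1, 100, 101, 230):
        print(duration, "->", billed_duration_ms(duration))
```

Under this model a function that runs for 101 ms is billed the same as one that runs for 200 ms, which is why trimming a handler just below an increment boundary can matter at high invocation volumes.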
  • 16. Amazon Athena Interactive Query Service • Query directly from Amazon S3 • Use ANSI SQL • Serverless • Multiple Data Formats • Pay per query
  • 17. AWS Glue Fully Managed ETL Service • Catalog data sources • Identify data formats & data types • Error Handling • Manage and scale resources • Generate ETL code • Schedules, executes ETL jobs New!
  • 18. AWS Glue: services Data Catalog • Hive metastore compatible metadata repository of data sources. • Crawls data source to infer table, data type, partition format. Job Execution • Runs jobs in Spark containers – automatic scaling based on SLA. • Glue is serverless - only pay for the resources you consume. Job Authoring • Generates Python code to move data from source to destination. • Edit with your favorite IDE; share code snippets using Git.
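To make "infer ... partition format" concrete, the sketch below parses Hive-style `name=value` path segments out of an object key. It illustrates the partition naming convention only; it is not Glue's crawler logic, and the function name and sample key are hypothetical.

```python
def parse_partitions(key: str) -> dict:
    """Extract Hive-style partition columns (name=value) from an object key."""
    partitions = {}
    # The final path segment is treated as the file name, not a partition.
    for segment in key.split("/")[:-1]:
        name, separator, value = segment.partition("=")
        if separator and name:
            partitions[name] = value
    return partitions


if __name__ == "__main__":
    # Hypothetical key layout for illustration.
    print(parse_partitions("logs/year=2017/month=08/day=17/part-0000.json"))
```

Values are kept as strings, since deciding on a column type is a separate inference step.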
  • 19. • Fast and cloud-powered • Easy to use, no infrastructure to manage • Scales to 100s of thousands of users • Quick calculations with SPICE • 1/10th the cost of legacy BI software Business Intelligence Amazon QuickSight
  • 22. Interactive Queries Ingest/ Collect Store Analyze/ Process Visualization/ Consume Producer Amazon S3 Amazon Redshift Amazon EMR Presto Impala Spark Interactive Amazon Athena Serverless Managed Virtualized QuickSight
  • 23. Catalog & Search Access and search metadata Access & User Interface Give your users easy and secure access DynamoDB Elasticsearch API Gateway Identity & Access Management Cognito QuickSight Amazon AI EMR Redshift Athena Kinesis RDS Central Storage Secure, cost-effective Storage in Amazon S3 S3 Snowball Database Migration Service Kinesis Firehose Direct Connect Data Ingestion Get your data into S3 Quickly and securely Protect and Secure Use entitlements to ensure data is secure and users’ identities are verified Processing & Analytics Use of predictive and prescriptive analytics to gain better understanding Security Token Service CloudWatch CloudTrail Key Management Service Data Lake Reference Architecture = Serverless
  • 24. Amazon S3 Data Lake Amazon Kinesis Streams & Firehose Hadoop / Spark Streaming Analytics Tools Amazon Redshift Data Warehouse Amazon DynamoDB NoSQL Database AWS Lambda Spark Streaming on EMR Amazon Elasticsearch Service Relational Database Amazon EMR Amazon Aurora Amazon Machine Learning Predictive Analytics Any Open Source Tool of Choice on EC2 Data Science Sandbox Visualization / Reporting Apache Storm on EMR Apache Flink on EMR Amazon Kinesis Analytics Serving Tier Clusterless SQL Query Amazon Athena Data Sources Transactional Data AWS Glue Clusterless ETL Amazon ElastiCache Redis Data Lake and Real-time Analytics
  • 25. Serverless ETL Store Transform Store Analyze/ Process Visualize/ Consume Amazon S3 Apache Kafka Kinesis Streams Amazon EMR Spark Flink AWS Glue AWS Lambda ISV Amazon S3 Apache Kafka Redshift Kinesis Streams Data Catalog AWS Glue DynamoDB Streams DynamoDB Hive M/D
  • 26. Serverless nicely fits into big data platforms • AWS Serverless Big Data Services • Complements existing big data flows • Focus on the analytics and not on infrastructure or servers • Don’t focus on the scaling, availability, and undifferentiated heavy lifting • Pay only for what you use • Easily try out different tools, analytics, and solutions
  • 27. DEMO