SlideShare ist ein Scribd-Unternehmen logo
1 von 41
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Steve Abraham – Solutions Architect
May 26, 2016
May Webinar Series
Evolving Your Big Data Use Cases from
Batch to Real Time
Agenda
 Common Use Cases for Real-Time Analytics
 Patterns & Practices
 Tools
 Demo
 Q&A
Common Use Cases
Streaming Data Scenarios Across Verticals
Scenarios/
Verticals
Accelerated Ingest-
Transform-Load
Continuous Metrics
Generation
Responsive Data
Analysis
Digital Ad
Tech/Marketing
Publisher, bidder data
aggregation
Advertising metrics like
coverage, yield, and conversion
User engagement with ads,
optimized bid/buy engines
IoT
Sensor, device telemetry
data ingestion
Operational metrics and
dashboards
Device operational
intelligence and alerts
Gaming
Online data aggregation,
e.g., top 10 players
Massively multiplayer online
game (MMOG) live dashboard
Leader board generation,
player-skill match
Consumer
Online
Clickstream analytics
Metrics like impressions and
page views
Recommendation engines,
proactive care
Customer Use Cases
Sonos runs near real-time streaming
analytics on device data logs from their
connected hi-fi audio equipment.
Analyzing 30TB+ clickstream
data enabling real-time insights for
Publishers.
Zillow uses Lambda and Kinesis to
manage a global ingestion pipeline and
produce quality analytics in real-time
without building infrastructure.
Nordstorm recommendation team built
online stylist using Amazon Kinesis
Streams and AWS Lambda.
Patterns & Practices
Big Data : Batch or Stream?
Batch Processing
(Data Lake)
Real-time Processing
(Stream)
Log Analysis What went wrong an hour
ago?
Take corrective action /
notify admin
Billing Analysis How much did you spend
during the last billing
cycle?
Notify user that billing limit
is approaching
Customer Preferences Which ads were most
effective yesterday?
Which ad should the
customer see right now?
Fraud Audit trail / forensic
evidence
Stop fraud now
Data Streaming and Big Data
Kinesis
AWS
Lambda
Velocity
Size
Big Data
Data Streaming
Batch Processing
Two Main Processing Patterns
Stream processing (real time)
 Real-time response to events in data streams
Examples:
 Proactively detect hardware errors in device logs
 Notify when inventory drops below a threshold
 Fraud detection
Micro-batching (near real time)
 Near real-time operations on small batches of events in data streams
Examples:
 Aggregate and archive events
 Monitor performance SLAs
Data Streaming Design Pattern
Collect Process Analyze
Store
Data Collection
and Storage
Data
Processing
Event
Processing
Data
Analysis
Batch Layer
Amazon
Kinesis
data
process
store
Amazon
Kinesis
Firehose
Amazon S3
A
p
p
l
i
c
a
t
i
o
n
s
Amazon
Redshift
Amazon EMR
Presto
Hive
Pig
Spark
answer
Speed Layer
answer
Serving
Layer
Amazon
ES
Amazon
DynamoDB
Amazon
RDS
Amazon
ElastiCache
answer
KCL
AWS Lambda
Spark
Streaming
Storm
Batch /
Stream
Architecture
Primitive: Multi-Stage Decoupled “Data Bus”
 Multiple Stages
 Storage Decouples Processing Stages
Store Process Store ProcessData Answers
process
store
Primitive: Multi-Stream Processing
Amazon
Kinesis
AWS LambdaData
Amazon
DynamoDB
Amazon
Firehose
Amazon S3
 Read from Kinesis
 Write to Multiple Data Stores
process
store
Amazon EMR
Primitive: Analysis Frameworks
Amazon
Kinesis
AWS
Lambda
Amazon S3
Data
Amazon
DynamoDB
Answer
Spark
Streaming
Amazon
Firehose
Spark
SQL
 Can Read from Multiple Inputs
process
store
Real-time Analytics
KCL
AWS Lambda
Spark
Streaming
Apache
Storm
Amazon
SNS
Amazon
ML
Notifications
Amazon
ElastiCache
(Redis)
Amazon
DynamoDB
Amazon
RDS
Amazon
ES
Alert
App state
Real-time Prediction
KPI
Amazon
KinesisData
process
store
Tools
Plethora of Tools
Amazon
Glacier
Amazon S3
Amazon
DynamoDB
Amazon RDS
Amazon EMR
Amazon
Redshift
Amazon
Kinesis
Amazon Kinesis-
enabled app
AWS Lambda Amazon ML
Amazon SQS
Amazon
ElastiCache
DynamoDB
Streams Amazon Elasticsearch
Service
AWS IoT
Amazon Kinesis Firehose
Load massive volumes of streaming data into Amazon S3,
Amazon Redshift and Amazon Elasticsearch
Capture and submit
streaming data to
Firehose
Analyze streaming data using your
favorite BI tools
Firehose loads streaming data
continuously into S3, Amazon Redshift
and Amazon Elasticsearch
Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift and Amazon
Elasticsearch without writing an application or managing infrastructure.
Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data
destinations in as little as 60 secs using simple configurations.
Seamless elasticity: Seamlessly scales to match data throughput w/o intervention
Data Sources
App.4
[Machine
Learning]
AWSEndpoint
App.1
[Aggregate &
De-Duplicate]
App.2
[Metric
Extraction]
Amazon S3
Amazon Redshift
App.3
[Sliding
Window
Analysis]
Availability
Zone
Shard 1
Shard 2
Shard N
AWS Lambda
Amazon EMR
Availability
Zone
Availability
Zone
Amazon Kinesis Streams
Managed service for real-time streaming
Data Sources
Data Sources
Data Sources
 Streams are made of shards
 Each shard ingests up to 1MB/sec,
and 1000 records/sec
 Each shard emits up to 2 MB/sec
 All data is stored for 24 hours by
default; storage can be extended for
up to 7 days
 Scale Kinesis streams using scaling
utility
 Replay data inside of 24-hour window
or extended window
Amazon Kinesis Streams
Managed ability to capture and store data
Amazon Kinesis Firehose vs. Amazon Kinesis
Streams
Amazon Kinesis Streams is for use cases that require custom
processing, per incoming record, with sub-1 second processing
latency, and a choice of stream processing frameworks.
Amazon Kinesis Firehose is for use cases that require zero
administration, ability to use existing analytics tools based on
Amazon S3, Amazon Redshift and Amazon Elasticsearch, and a
data latency of 60 seconds or higher.
Streaming Data Ingestion
Putting Data into Amazon Kinesis Streams
Determine your partition key strategy
 Managed buffer or streaming MapReduce job
 Ensure high cardinality for your shards
Provision adequate shards
 For ingress needs
 Egress needs for all consuming applications: if more
than two simultaneous applications
 Include headroom for catching up with data in stream
Putting Data into Amazon Kinesis
Amazon Kinesis Agent – (supports pre-processing)
 http://docs.aws.amazon.com/firehose/latest/dev/writing-with-agents.html
Pre-batch before Puts for better efficiency
 Consider Flume, Fluentd as collectors/agents
 See https://github.com/awslabs/aws-fluent-plugin-kinesis
Make a tweak to your existing logging
 log4j appender option
 See https://github.com/awslabs/kinesis-log4j-appender
Amazon Kinesis Producer Library
 Writes to one or more Amazon Kinesis streams with automatic,
configurable retry mechanism
 Collects records and uses PutRecords to write multiple records to
multiple shards per request
 Aggregates user records to increase payload size and improve
throughput
 Integrates seamlessly with KCL to de-aggregate batched records
 Use Amazon Kinesis Producer Library with AWS Lambda (New!)
 Submits Amazon CloudWatch metrics on your behalf to provide
visibility into producer performance
Record Order and Multiple Shards
Unordered processing
 Randomize partition key to distribute events over
many shards and use multiple workers
Exact order processing
 Control partition key to ensure events are grouped
into the same shard and read by the same worker
Need both? Use global sequence number
Producer
Get Global
Sequence
Unordered
Stream
Campaign Centric
Stream
Fraud Inspection
Stream
Get Event Metadata
Sample Code for Scaling Shards
java -cp
KinesisScalingUtils.jar-complete.jar
-Dstream-name=MyStream
-Dscaling-action=scaleUp
-Dcount=10
-Dregion=eu-west-1 ScalingClient
Options:
 stream-name - The name of the stream to be scaled
 scaling-action - The action to be taken to scale. Must be one of "scaleUp”,
"scaleDown" or “resize”
 count - Number of shards by which to absolutely scale up or down, or resize
See https://github.com/awslabs/amazon-kinesis-scaling-utils
Amazon Kinesis Stream Processing
Amazon Kinesis Client Library
 Build Kinesis Applications with Kinesis Client Library (KCL)
 Open source client library available for Java, Ruby, Python,
Node.JS dev
 Deploy on your EC2 instances
 KCL Application includes three components:
1. Record Processor Factory – Creates the record processor
2. Record Processor – Processor unit that processes data from a
shard in Amazon Kinesis Streams
3. Worker – Processing unit that maps to each application instance
State Management with Kinesis Client Library
 One record processor maps to one shard and processes data records from
that shard
 One worker maps to one or more record processors
 Balances shard-worker associations when worker / instance counts change
 Balances shard-worker associations when shards split or merge
Other Options
 Third-party connectors(for example, Splunk)
 AWS IoT platform
 AWS Lambda
 Amazon EMR with Apache Spark, Pig or Hive
Apache Spark and Amazon Kinesis Streams
Apache Spark is an in-memory analytics cluster using
RDD for fast processing
Spark Streaming can read directly from an Amazon
Kinesis stream
Amazon software license linking – Add ASL dependency
to SBT/MAVEN project, artifactId = spark-
streaming-kinesis-asl_2.10
Example: Counting tweets on a sliding window
KinesisUtils.createStream(‘twitter-stream’)
.filter(_.getText.contains(”Open-Source"))
.countByWindow(Seconds(5))
Amazon EMR
Amazon Kinesis
StreamsStreaming Input
Tumbling/Fixed Window
Aggregation
Periodic Output
Amazon Redshift
COPY from Amazon EMR
Common Integration Pattern with Amazon EMR
Tumbling Window Reporting
Amazon Kinesis Streams with AWS Lambda
Amazon Kinesis - KCL
Amazon Kinesis - Lambda
Demo
Key Takeaways
• Streaming vs. Batching
• Loosely Coupled
• Managed Services
• Right Tools
Questions?
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

Google Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsGoogle Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsLynn Langit
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologySergey Shelpuk
 
Scaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksScaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksDatabricks
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 DataWorks Summit
 
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...Amazon Web Services
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesIvo Andreev
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time dataAmazon Web Services
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewenconfluent
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterHadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterDataWorks Summit
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScaleSeunghyun Lee
 
Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streamingdatamantra
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshConfluentInc1
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Case Study: Stream Processing on AWS using Kappa Architecture
Case Study: Stream Processing on AWS using Kappa ArchitectureCase Study: Stream Processing on AWS using Kappa Architecture
Case Study: Stream Processing on AWS using Kappa ArchitectureJoey Bolduc-Gilbert
 
Delta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDelta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDatabricks
 

Was ist angesagt? (20)

Google Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsGoogle Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline Patterns
 
Data Product Architectures
Data Product ArchitecturesData Product Architectures
Data Product Architectures
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodology
 
Scaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksScaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with Databricks
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
 
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time data
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterHadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
 
Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streaming
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
 
Apache PIG
Apache PIGApache PIG
Apache PIG
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Case Study: Stream Processing on AWS using Kappa Architecture
Case Study: Stream Processing on AWS using Kappa ArchitectureCase Study: Stream Processing on AWS using Kappa Architecture
Case Study: Stream Processing on AWS using Kappa Architecture
 
Delta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDelta from a Data Engineer's Perspective
Delta from a Data Engineer's Perspective
 

Andere mochten auch

Better Together - Using Spark and Redshift to Combine Your Data with Public D...
Better Together - Using Spark and Redshift to Combine Your Data with Public D...Better Together - Using Spark and Redshift to Combine Your Data with Public D...
Better Together - Using Spark and Redshift to Combine Your Data with Public D...C4Media
 
AWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWSAWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWSAmazon Web Services
 
Big Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our LivesBig Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our LivesRukshan Batuwita
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...inside-BigData.com
 
Strata EU tutorial - Architectural considerations for hadoop applications
Strata EU tutorial - Architectural considerations for hadoop applicationsStrata EU tutorial - Architectural considerations for hadoop applications
Strata EU tutorial - Architectural considerations for hadoop applicationshadooparchbook
 
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming ArchitecturesCloudera, Inc.
 
Application Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User GroupApplication Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User Grouphadooparchbook
 
Machine Learning for Developers - Pop-up Loft Tel Aviv
Machine Learning for Developers - Pop-up Loft Tel AvivMachine Learning for Developers - Pop-up Loft Tel Aviv
Machine Learning for Developers - Pop-up Loft Tel AvivAmazon Web Services
 
Amazon S3 - Masterclass - Pop-up Loft Tel Aviv
Amazon S3 - Masterclass - Pop-up Loft Tel AvivAmazon S3 - Masterclass - Pop-up Loft Tel Aviv
Amazon S3 - Masterclass - Pop-up Loft Tel AvivAmazon Web Services
 
#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...
#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...
#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...Amazon Web Services
 
DevOps as a Pathway to AWS | AWS Public Sector Summit 2016
DevOps as a Pathway to AWS | AWS Public Sector Summit 2016DevOps as a Pathway to AWS | AWS Public Sector Summit 2016
DevOps as a Pathway to AWS | AWS Public Sector Summit 2016Amazon Web Services
 
AWS Mobile Hub + AWS Device Farm
AWS Mobile Hub + AWS Device FarmAWS Mobile Hub + AWS Device Farm
AWS Mobile Hub + AWS Device FarmAmazon Web Services
 
Building Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast SessionBuilding Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast SessionAmazon Web Services
 
Develping mobile services on aws - Pop-up Loft Tel Aviv
Develping mobile services on aws - Pop-up Loft Tel AvivDevelping mobile services on aws - Pop-up Loft Tel Aviv
Develping mobile services on aws - Pop-up Loft Tel AvivAmazon Web Services
 
Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...
Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...
Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...Amazon Web Services
 
Application Delivery Patterns for Developers - Technical 401
Application Delivery Patterns for Developers - Technical 401Application Delivery Patterns for Developers - Technical 401
Application Delivery Patterns for Developers - Technical 401Amazon Web Services
 

Andere mochten auch (20)

Better Together - Using Spark and Redshift to Combine Your Data with Public D...
Better Together - Using Spark and Redshift to Combine Your Data with Public D...Better Together - Using Spark and Redshift to Combine Your Data with Public D...
Better Together - Using Spark and Redshift to Combine Your Data with Public D...
 
AWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWSAWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWS
 
Real-Time Analytics with Apache Cassandra and Apache Spark,
Real-Time Analytics with Apache Cassandra and Apache Spark,Real-Time Analytics with Apache Cassandra and Apache Spark,
Real-Time Analytics with Apache Cassandra and Apache Spark,
 
Big Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our LivesBig Data and Data Science: The Technologies Shaping Our Lives
Big Data and Data Science: The Technologies Shaping Our Lives
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
 
Strata EU tutorial - Architectural considerations for hadoop applications
Strata EU tutorial - Architectural considerations for hadoop applicationsStrata EU tutorial - Architectural considerations for hadoop applications
Strata EU tutorial - Architectural considerations for hadoop applications
 
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming Architectures
 
Application Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User GroupApplication Architectures with Hadoop - UK Hadoop User Group
Application Architectures with Hadoop - UK Hadoop User Group
 
Machine Learning for Developers - Pop-up Loft Tel Aviv
Machine Learning for Developers - Pop-up Loft Tel AvivMachine Learning for Developers - Pop-up Loft Tel Aviv
Machine Learning for Developers - Pop-up Loft Tel Aviv
 
Amazon S3 - Masterclass - Pop-up Loft Tel Aviv
Amazon S3 - Masterclass - Pop-up Loft Tel AvivAmazon S3 - Masterclass - Pop-up Loft Tel Aviv
Amazon S3 - Masterclass - Pop-up Loft Tel Aviv
 
#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...
#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...
#EarthOnAWS: How the Cloud Is Transforming Earth Observation | AWS Public Sec...
 
DevOps as a Pathway to AWS | AWS Public Sector Summit 2016
DevOps as a Pathway to AWS | AWS Public Sector Summit 2016DevOps as a Pathway to AWS | AWS Public Sector Summit 2016
DevOps as a Pathway to AWS | AWS Public Sector Summit 2016
 
AWS Mobile Hub + AWS Device Farm
AWS Mobile Hub + AWS Device FarmAWS Mobile Hub + AWS Device Farm
AWS Mobile Hub + AWS Device Farm
 
Amazon EC2
Amazon EC2Amazon EC2
Amazon EC2
 
Building Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast SessionBuilding Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast Session
 
Develping mobile services on aws - Pop-up Loft Tel Aviv
Develping mobile services on aws - Pop-up Loft Tel AvivDevelping mobile services on aws - Pop-up Loft Tel Aviv
Develping mobile services on aws - Pop-up Loft Tel Aviv
 
Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...
Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...
Account Separation and Mandatory Access Control on AWS | Security Roadshow Du...
 
Application Delivery Patterns for Developers - Technical 401
Application Delivery Patterns for Developers - Technical 401Application Delivery Patterns for Developers - Technical 401
Application Delivery Patterns for Developers - Technical 401
 
Keynote - Currency fair
Keynote - Currency fairKeynote - Currency fair
Keynote - Currency fair
 
Workshop: We love APIs
Workshop: We love APIsWorkshop: We love APIs
Workshop: We love APIs
 

Ähnlich wie Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webinar Series

AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
 
Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...Amazon Web Services
 
AWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAmazon Web Services
 
20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWS20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWSAmazon Web Services Korea
 
Building Big Data Applications with Serverless Architectures - June 2017 AWS...
Building Big Data Applications with Serverless Architectures -  June 2017 AWS...Building Big Data Applications with Serverless Architectures -  June 2017 AWS...
Building Big Data Applications with Serverless Architectures - June 2017 AWS...Amazon Web Services
 
B3 - Business intelligence apps on aws
B3 - Business intelligence apps on awsB3 - Business intelligence apps on aws
B3 - Business intelligence apps on awsAmazon Web Services
 
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Amazon Web Services
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementAmazon Web Services
 
Modern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementModern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementAmazon Web Services
 
Driving Business Insights with a Modern Data Architecture AWS Summit SG 2017
Driving Business Insights with a Modern Data Architecture  AWS Summit SG 2017Driving Business Insights with a Modern Data Architecture  AWS Summit SG 2017
Driving Business Insights with a Modern Data Architecture AWS Summit SG 2017Amazon Web Services
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon KinesisAmazon Web Services
 
Build Data Lakes and Analytics on AWS
Build Data Lakes and Analytics on AWS Build Data Lakes and Analytics on AWS
Build Data Lakes and Analytics on AWS Amazon Web Services
 
BDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesBDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesAmazon Web Services
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Amazon Web Services
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Amazon Web Services
 
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014Amazon Web Services
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)Amazon Web Services
 

Ähnlich wie Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webinar Series (20)

AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
 
Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...
 
AWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache StormAWS Webcast - Amazon Kinesis and Apache Storm
AWS Webcast - Amazon Kinesis and Apache Storm
 
20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWS20141021 AWS Cloud Taekwon - Big Data on AWS
20141021 AWS Cloud Taekwon - Big Data on AWS
 
Building Big Data Applications with Serverless Architectures - June 2017 AWS...
Building Big Data Applications with Serverless Architectures -  June 2017 AWS...Building Big Data Applications with Serverless Architectures -  June 2017 AWS...
Building Big Data Applications with Serverless Architectures - June 2017 AWS...
 
B3 - Business intelligence apps on aws
B3 - Business intelligence apps on awsB3 - Business intelligence apps on aws
B3 - Business intelligence apps on aws
 
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagement
 
Modern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementModern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & Engagement
 
Driving Business Insights with a Modern Data Architecture AWS Summit SG 2017
Driving Business Insights with a Modern Data Architecture  AWS Summit SG 2017Driving Business Insights with a Modern Data Architecture  AWS Summit SG 2017
Driving Business Insights with a Modern Data Architecture AWS Summit SG 2017
 
Real-Time Streaming Data on AWS
Real-Time Streaming Data on AWSReal-Time Streaming Data on AWS
Real-Time Streaming Data on AWS
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon Kinesis
 
Build Data Lakes and Analytics on AWS
Build Data Lakes and Analytics on AWS Build Data Lakes and Analytics on AWS
Build Data Lakes and Analytics on AWS
 
Big Data on AWS
Big Data on AWSBig Data on AWS
Big Data on AWS
 
Big Data Analytics on AWS
Big Data Analytics on AWSBig Data Analytics on AWS
Big Data Analytics on AWS
 
BDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesBDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practices
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
 
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Kürzlich hochgeladen (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webinar Series

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Steve Abraham – Solutions Architect May 26, 2016 May Webinar Series Evolving Your Big Data Use Cases from Batch to Real Time
  • 2. Agenda  Common Use Cases for Real-Time Analytics  Patterns & Practices  Tools  Demo  Q&A
  • 4. Streaming Data Scenarios Across Verticals Scenarios/ Verticals Accelerated Ingest- Transform-Load Continuous Metrics Generation Responsive Data Analysis Digital Ad Tech/Marketing Publisher, bidder data aggregation Advertising metrics like coverage, yield, and conversion User engagement with ads, optimized bid/buy engines IoT Sensor, device telemetry data ingestion Operational metrics and dashboards Device operational intelligence and alerts Gaming Online data aggregation, e.g., top 10 players Massively multiplayer online game (MMOG) live dashboard Leader board generation, player-skill match Consumer Online Clickstream analytics Metrics like impressions and page views Recommendation engines, proactive care
  • 5. Customer Use Cases Sonos runs near real-time streaming analytics on device data logs from their connected hi-fi audio equipment. Analyzing 30TB+ clickstream data enabling real-time insights for Publishers. Zillow uses Lambda and Kinesis to manage a global ingestion pipeline and produce quality analytics in real-time without building infrastructure. Nordstorm recommendation team built online stylist using Amazon Kinesis Streams and AWS Lambda.
  • 7. Big Data : Batch or Stream? Batch Processing (Data Lake) Real-time Processing (Stream) Log Analysis What went wrong an hour ago? Take corrective action / notify admin Billing Analysis How much did you spend during the last billing cycle? Notify user that billing limit is approaching Customer Preferences Which ads were most effective yesterday? Which ad should the customer see right now? Fraud Audit trail / forensic evidence Stop fraud now
  • 8. Data Streaming and Big Data Kinesis AWS Lambda Velocity Size Big Data Data Streaming Batch Processing
  • 9. Two Main Processing Patterns Stream processing (real time)  Real-time response to events in data streams Examples:  Proactively detect hardware errors in device logs  Notify when inventory drops below a threshold  Fraud detection Micro-batching (near real time)  Near real-time operations on small batches of events in data streams Examples:  Aggregate and archive events  Monitor performance SLAs
  • 10. Data Streaming Design Pattern Collect Process Analyze Store Data Collection and Storage Data Processing Event Processing Data Analysis
  • 11. Batch Layer Amazon Kinesis data process store Amazon Kinesis Firehose Amazon S3 A p p l i c a t i o n s Amazon Redshift Amazon EMR Presto Hive Pig Spark answer Speed Layer answer Serving Layer Amazon ES Amazon DynamoDB Amazon RDS Amazon ElastiCache answer KCL AWS Lambda Spark Streaming Storm Batch / Stream Architecture
  • 12. Primitive: Multi-Stage Decoupled “Data Bus”  Multiple Stages  Storage Decouples Processing Stages Store Process Store ProcessData Answers process store
  • 13. Primitive: Multi-Stream Processing Amazon Kinesis AWS LambdaData Amazon DynamoDB Amazon Firehose Amazon S3  Read from Kinesis  Write to Multiple Data Stores process store
  • 14. Amazon EMR Primitive: Analysis Frameworks Amazon Kinesis AWS Lambda Amazon S3 Data Amazon DynamoDB Answer Spark Streaming Amazon Firehose Spark SQL  Can Read from Multiple Inputs process store
  • 16. Tools
  • 17. Plethora of Tools Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS Amazon EMR Amazon Redshift Amazon Kinesis Amazon Kinesis- enabled app AWS Lambda Amazon ML Amazon SQS Amazon ElastiCache DynamoDB Streams Amazon Elasticsearch Service AWS IoT
  • 18.
  • 19. Amazon Kinesis Firehose Load massive volumes of streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch Capture and submit streaming data to Firehose Analyze streaming data using your favorite BI tools Firehose loads streaming data continuously into S3, Amazon Redshift and Amazon Elasticsearch Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch without writing an application or managing infrastructure. Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 secs using simple configurations. Seamless elasticity: Seamlessly scales to match data throughput w/o intervention
  • 20. Data Sources App.4 [Machine Learning] AWSEndpoint App.1 [Aggregate & De-Duplicate] App.2 [Metric Extraction] Amazon S3 Amazon Redshift App.3 [Sliding Window Analysis] Availability Zone Shard 1 Shard 2 Shard N AWS Lambda Amazon EMR Availability Zone Availability Zone Amazon Kinesis Streams Managed service for real-time streaming Data Sources Data Sources Data Sources
  • 21.  Streams are made of shards  Each shard ingests up to 1MB/sec, and 1000 records/sec  Each shard emits up to 2 MB/sec  All data is stored for 24 hours by default; storage can be extended for up to 7 days  Scale Kinesis streams using scaling utility  Replay data inside of 24-hour window or extended window Amazon Kinesis Streams Managed ability to capture and store data
  • 22. Amazon Kinesis Firehose vs. Amazon Kinesis Streams Amazon Kinesis Streams is for use cases that require custom processing, per incoming record, with sub-1 second processing latency, and a choice of stream processing frameworks. Amazon Kinesis Firehose is for use cases that require zero administration, ability to use existing analytics tools based on Amazon S3, Amazon Redshift and Amazon Elasticsearch, and a data latency of 60 seconds or higher.
  • 24. Putting Data into Amazon Kinesis Streams Determine your partition key strategy  Managed buffer or streaming MapReduce job  Ensure high cardinality for your shards Provision adequate shards  For ingress needs  Egress needs for all consuming applications: if more than two simultaneous applications  Include headroom for catching up with data in stream
  • 25. Putting Data into Amazon Kinesis Amazon Kinesis Agent – (supports pre-processing)  http://docs.aws.amazon.com/firehose/latest/dev/writing-with-agents.html Pre-batch before Puts for better efficiency  Consider Flume, Fluentd as collectors/agents  See https://github.com/awslabs/aws-fluent-plugin-kinesis Make a tweak to your existing logging  log4j appender option  See https://github.com/awslabs/kinesis-log4j-appender
  • 26. Amazon Kinesis Producer Library  Writes to one or more Amazon Kinesis streams with automatic, configurable retry mechanism  Collects records and uses PutRecords to write multiple records to multiple shards per request  Aggregates user records to increase payload size and improve throughput  Integrates seamlessly with KCL to de-aggregate batched records  Use Amazon Kinesis Producer Library with AWS Lambda (New!)  Submits Amazon CloudWatch metrics on your behalf to provide visibility into producer performance
  • 27. Record Order and Multiple Shards Unordered processing  Randomize partition key to distribute events over many shards and use multiple workers Exact order processing  Control partition key to ensure events are grouped into the same shard and read by the same worker Need both? Use global sequence number Producer Get Global Sequence Unordered Stream Campaign Centric Stream Fraud Inspection Stream Get Event Metadata
  • 28. Sample Code for Scaling Shards java -cp KinesisScalingUtils.jar-complete.jar -Dstream-name=MyStream -Dscaling-action=scaleUp -Dcount=10 -Dregion=eu-west-1 ScalingClient Options:  stream-name - The name of the stream to be scaled  scaling-action - The action to be taken to scale. Must be one of "scaleUp”, "scaleDown" or “resize”  count - Number of shards by which to absolutely scale up or down, or resize See https://github.com/awslabs/amazon-kinesis-scaling-utils
  • 29. Amazon Kinesis Stream Processing
  • 30. Amazon Kinesis Client Library  Build Kinesis Applications with Kinesis Client Library (KCL)  Open source client library available for Java, Ruby, Python, Node.JS dev  Deploy on your EC2 instances  KCL Application includes three components: 1. Record Processor Factory – Creates the record processor 2. Record Processor – Processor unit that processes data from a shard in Amazon Kinesis Streams 3. Worker – Processing unit that maps to each application instance
  • 31. State Management with Kinesis Client Library  One record processor maps to one shard and processes data records from that shard  One worker maps to one or more record processors  Balances shard-worker associations when worker / instance counts change  Balances shard-worker associations when shards split or merge
  • 32. Other Options  Third-party connectors(for example, Splunk)  AWS IoT platform  AWS Lambda  Amazon EMR with Apache Spark, Pig or Hive
  • 33. Apache Spark and Amazon Kinesis Streams Apache Spark is an in-memory analytics cluster using RDD for fast processing Spark Streaming can read directly from an Amazon Kinesis stream Amazon software license linking – Add ASL dependency to SBT/MAVEN project, artifactId = spark- streaming-kinesis-asl_2.10 Example: Counting tweets on a sliding window KinesisUtils.createStream(‘twitter-stream’) .filter(_.getText.contains(”Open-Source")) .countByWindow(Seconds(5))
  • 34. Amazon EMR Amazon Kinesis StreamsStreaming Input Tumbling/Fixed Window Aggregation Periodic Output Amazon Redshift COPY from Amazon EMR Common Integration Pattern with Amazon EMR Tumbling Window Reporting
  • 35. Amazon Kinesis Streams with AWS Lambda
  • 38. Demo
  • 39. Key Takeaways • Streaming vs. Batching • Loosely Coupled • Managed Services • Right Tools