© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. S U M M I T
Analysing Data in Real-time
Julio Faerman
@faermanj
AWS Technical Evangelist
Timely decisions require new data and fast
Source: Perishable insights, Mike Gualtieri, Forrester
Data loses value quickly over time
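[Figure: value of data to decision-making (vertical axis) against time (horizontal axis: real time, seconds, minutes, hours, days, months). Value falls from preventive/predictive through actionable and reactive to historical. Time-critical decisions sit at the real-time end; traditional “batch” business intelligence sits at the other. Caption: information half-life in decision-making.]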
What is streaming data?
Typical characteristics
Low-latency
Continuous
Ordered, incremental
High volume
Most common uses of streaming
Industrial Automation
Smart Home
Smart City
Data Lakes
IoT Analytics
Log Analytics
CUSTOMER EVENT STORE
A JOURNEY FROM DATA WAREHOUSE
TO STREAMING DATA
CUSTOMER EVENT STORE
INTRODUCTION
Charles van Kints
Product Owner
Customer Event Store
ABN AMRO
Abhishek Choudhary
Lead Development Engineer
Customer Event Store
ABN AMRO
THE PROBLEM: OUR CHALLENGES ON PROCESSING EVENT DATA
Events: interactions (touchpoints) of (potential) customers with ABN AMRO, across devices and channels
• Slow batch-driven processing
• Complexity in connecting to external sources
• Error-prone process: bad records
• Huge increase in volumes of data
• Fast-changing sources
• Limitations in consuming capabilities
• Diversity in data from different sources
CUSTOMER EVENT STORE: WHY
CUSTOMER EVENT STORE: WHY
PROBLEM STATEMENT
“Not being able to handle important events in the life of the customer that impact
their relation with ABN AMRO in an adequate way.”
Bernard Faber
Solution Architect, ABN AMRO
“Strong increase of the digitalized touching points with our customers (called
events), from a growing number of sources.”
Charles van Kints
Product Owner, ABN AMRO
“The continuous growth in event data sources and volume, the increasing
demand towards using event data and the current solution within the Marketing
Intelligence data warehouse.”
Peter Kromhout
Engineering Lead, ABN AMRO
KEY FEATURES
Handle Changes
Instantly
& Metadata Driven
Large Volumes
Consuming Capabilities
Customer Interactions
Real – Time
Future State
Customer
Event Store
Building insights in the customer behaviour, customer journey and customer interactions with ABN AMRO in order
to be able to act Personal and Relevant.
CUSTOMER EVENT STORE : WHAT
JOURNEY SO FAR…
✓ Approval – License to Public
✓ Prepare for Go-Live
March – 2018
Prototype
✓ Develop Prototype
✓ Initiate License to Public
April – 2018
Technical Go-Live
✓ Product Stack deployed
✓ 2 Sources Live
✓ Tune product for Business Go-Live
Business Go-Live
✓ Add new sources
✓ Consuming Capabilities
✓ Enable data usage
Approach
✓ Successful prototype
✓ Co-creation – Business & IT
✓ 2 Event Sources
Go!
August – 2018 September – 2018 December – 2018
CUSTOMER EVENT STORE: WHEN
CUSTOMER EVENT STORE: HOW
CONCEPTUAL DESIGN
Internal
sources
RDS
Landing Zone
External
sources
Access
Mngmt
Metadata
Validation
Streaming
End-Point
REST API
Alerting
Standardisation
ETL
Metadata
ETL
Streaming
Batch
Metadata
Metadata
Real-Time processing
Batch processing
EVENT STORE
Profile Information
Stitching
Pre-Processing
Lineage Orchestration Monitoring Access Management
Stream
Streaming Zone
CUSTOMER EVENT STORE: HOW
TECHNICAL ARCHITECTURE : STREAM & BATCH
[Diagram components: Batch Bucket; Nano-Batch Bucket; EMR; Glue; Step Function; Lambda; SNS; Enterprise Raw Data Store; Snowplow Collector (Fargate, Auto-Scaling Group); Fargate]
CUSTOMER EVENT STORE: HOW
TECHNICAL ARCHITECTURE: ONE PROCESS – STREAM & BATCH
[Diagram components: Snowplow Collector (Fargate, Auto-Scaling Group); Kinesis Data Stream – Raw; Snowplow Enricher (Fargate, Auto-Scaling Group); Schema Bucket; Kinesis Data Stream – Good; Kinesis Data Stream – Bad; Kinesis Data Firehose; Bad Events Bucket; Nano-Batch Bucket; Batch Bucket; Enterprise Raw Data Store]
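The split of the raw stream into a Good and a Bad stream can be pictured with a minimal Python sketch. This is not Snowplow's enrichment logic: the `REQUIRED_FIELDS` schema, the event shape, and the `route_events` helper are all hypothetical, and in-memory lists stand in for the Kinesis Data Streams.

```python
from typing import Any

# Hypothetical schema: field name -> expected Python type (an assumption for illustration).
REQUIRED_FIELDS: dict[str, type] = {"event_id": str, "channel": str, "timestamp": int}


def validate(event: dict[str, Any]) -> list[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        elif not isinstance(event[name], expected):
            errors.append(f"wrong type for {name}")
    return errors


def route_events(raw: list[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Split raw events into a 'good' list and a 'bad' list (bad rows keep their errors)."""
    good, bad = [], []
    for event in raw:
        errors = validate(event)
        if errors:
            bad.append({"event": event, "errors": errors})
        else:
            good.append(event)
    return good, bad


raw_stream = [
    {"event_id": "e1", "channel": "web", "timestamp": 1546300800},
    {"event_id": "e2", "channel": "app"},                       # missing timestamp
    {"event_id": "e3", "channel": "web", "timestamp": "soon"},  # wrong type
]
good_events, bad_events = route_events(raw_stream)
print(len(good_events), "good;", len(bad_events), "bad")
```

Keeping the rejected event together with its error list is one way to make a bad-events store useful for debugging a source.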
CUSTOMER EVENT STORE: HOW
TECHNICAL ARCHITECTURE: STANDARDIZE
[Diagram components: Enterprise Raw Data Store; Snowplow Collector (Fargate, Auto-Scaling Group); Snowplow Enricher (Fargate, Auto-Scaling Group); Kinesis Data Stream – Raw; Kinesis Data Stream – Good; Kinesis Data Stream – Bad; Kinesis Data Stream – Standardized; Kinesis Data Firehose – ORC; Kinesis Data Firehose – JSON; Standard Bucket; Schema Bucket; Glue Crawler; Athena; DynamoDB; CloudWatch (Alarm, Alarm, Rule)]
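One possible reading of metadata-driven standardisation is a per-source field mapping that is applied to every event before it reaches the standardized stream. The sketch below assumes exactly that; the `FIELD_MAPPINGS` table, the field names, and the `standardize` helper are hypothetical and are not taken from the ABN AMRO implementation.

```python
from typing import Any

# Hypothetical metadata: per source, a mapping from source field names to standard names.
FIELD_MAPPINGS: dict[str, dict[str, str]] = {
    "web": {"uid": "customer_id", "ts": "event_time", "page": "detail"},
    "app": {"userId": "customer_id", "time": "event_time", "screen": "detail"},
}


def standardize(source: str, event: dict[str, Any]) -> dict[str, Any]:
    """Rename fields according to the metadata for `source`; unmapped fields are dropped."""
    mapping = FIELD_MAPPINGS[source]
    record = {
        standard: event[original]
        for original, standard in mapping.items()
        if original in event
    }
    record["source"] = source
    return record


print(standardize("web", {"uid": "c-1", "ts": 1546300800, "page": "/home"}))
print(standardize("app", {"userId": "c-2", "time": 1546300900, "screen": "login", "debug": True}))
```

With this shape, onboarding a new source means adding one mapping entry rather than changing code.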
CUSTOMER EVENT STORE: HOW
TECHNICAL ARCHITECTURE: END 2 END
Enterprise Raw Data Store
CUSTOMER EVENT STORE: SUMMARY
AWS STEP FUNCTIONS
2018
Analysis
Complex Workflows: complex workflows involving iteration of Lambda functions can be implemented quickly.
Debug Friendly: clear intermediate results.
Restart-ability: a new state machine can be created for only the failed states.
State Management: preserves state between subsequent API calls.
Serverless Orchestration: Lambda, Glue, ECS, SageMaker.
Error Handling: retries can be triggered for specific errors. Other actions can also be configured.
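The Error Handling point can be illustrated with a small state machine definition. The sketch below builds the definition as a Python dictionary and prints it as JSON; the `Retry` and `Catch` fields follow the Amazon States Language, while the Lambda ARN, the state names, and the retry settings are hypothetical.

```python
import json

# Hypothetical Lambda ARN and state names; Retry/Catch follow Amazon States Language.
state_machine = {
    "Comment": "Sketch: retry a task on specific errors, then route to a failure state",
    "StartAt": "ProcessBatch",
    "States": {
        "ProcessBatch": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:process-batch",
            # Retry only the listed errors, with exponential backoff.
            "Retry": [
                {
                    "ErrorEquals": ["States.Timeout", "Lambda.ServiceException"],
                    "IntervalSeconds": 5,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            # Any error that is still unhandled moves the execution to NotifyFailure.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
            "End": True,
        },
        "NotifyFailure": {"Type": "Fail", "Error": "BatchFailed", "Cause": "ProcessBatch failed"},
    },
}

definition = json.dumps(state_machine, indent=2)
print(definition)
```

The printed JSON is the form a state machine definition takes when it is created; the "other actions" on the slide correspond to whatever state the `Catch` entry points at.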
CUSTOMER EVENT STORE: SUMMARY
WHEN TO USE GLUE AND/OR EMR
AWS Glue
• Serverless & pre-configured
• Fully managed public service
• Horizontally scalable
• Limited customization
• Only Spark
• Spin-up time
AWS EMR
• Define cluster: choose applications & customize as you wish
• Can be placed in custom VPC
• Vertically & horizontally scalable
• More actions than just Spark
KEY TAKE-AWAY
Dev-Ops
Security by Design
Architecture by Evolution
One Process
Serverless & Native Components
Technical vs Business Go-Live
CUSTOMER EVENT STORE: SUMMARY
Thank you!
Streaming with Amazon Kinesis
Easily collect, process, and analyze video and data streams in real-time
Amazon Kinesis Video Streams: capture, process, and store video streams
Amazon Kinesis Data Streams: capture, process, and store data streams
Amazon Kinesis Data Firehose: load data streams into data stores
Amazon Kinesis Data Analytics: analyze data streams with SQL
Amazon Kinesis Data Streams producers and consumers
Producers
• Kinesis Agent
• Apache Kafka
• AWS SDK
• Log4j
• Flume
• Fluentd
• AWS Mobile SDK for iOS
• Amazon Kinesis Producer Library
Consumers
• Get* APIs
• Amazon Kinesis Client Library + Connector Library
• Apache Storm
• Amazon EMR
• AWS Lambda
• Apache Spark
Data ingestion from a variety of sources
Kinesis Data Streams sources: transactions, ERP, web logs/cookies, connected devices
AWS SDKs
• Publish directly from application code via APIs
• AWS Mobile SDK
• Managed AWS sources: CloudWatch Logs, AWS IoT, Kinesis Data Analytics and more
• RDS Aurora via Lambda
Kinesis Agent
• Monitors log files and forwards lines as messages to Kinesis Data Streams
Kinesis Producer Library (KPL)
• Background process aggregates and batches messages
3rd party and open source
• Log4j appender
• Apache Kafka
• Flume, fluentd, and more …
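The batching idea behind the Kinesis Producer Library bullet can be sketched in a few lines of Python. This is not the KPL itself: `batch_records` is a hypothetical helper, and the default caps of 500 records and 5 MB per batch are assumptions chosen for illustration, so check the current service quotas before relying on them.

```python
from typing import Iterable, Iterator


def batch_records(
    records: Iterable[bytes],
    max_records: int = 500,            # illustrative cap on records per batch
    max_bytes: int = 5 * 1024 * 1024,  # illustrative cap on total batch size
) -> Iterator[list[bytes]]:
    """Group records into batches that respect both a count cap and a size cap."""
    batch: list[bytes] = []
    size = 0
    for record in records:
        # Close the current batch if adding this record would break either cap.
        if batch and (len(batch) >= max_records or size + len(record) > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(record)
        size += len(record)
    if batch:
        yield batch


messages = [f"log line {i}".encode() for i in range(1200)]
batches = list(batch_records(messages))
print([len(b) for b in batches])
```

Each batch would then be sent in one request, which is what lets a producer trade a little latency for far fewer calls.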
Data processing from a variety of consumers
Fully managed service for real-time processing of streaming data
Cost-effective: $0.014 per 1,000,000 PUT Payload Units
Millions of sources producing hundreds of terabytes per hour
Front end: authentication and authorization
Durable, highly consistent storage replicates data across three data centers (Availability Zones)
Ordered stream of events supports multiple readers
Readers: Amazon Kinesis Client Library on EC2, Amazon Kinesis Data Firehose, Amazon Kinesis Data Analytics, AWS Lambda
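The PUT Payload Unit price quoted on this slide can be turned into a rough monthly figure. The sketch below assumes a steady workload and counts a PUT Payload Unit in 25 KB chunks (taken here as 25 × 1024 bytes); the helper names and the example workload are hypothetical, and shard-hour charges are deliberately left out because the slide does not quote them.

```python
import math

PRICE_PER_MILLION_UNITS = 0.014  # USD, the figure quoted on the slide
PAYLOAD_UNIT_BYTES = 25 * 1024   # a PUT Payload Unit is counted in 25 KB chunks


def put_payload_units(record_bytes: int) -> int:
    """Number of PUT Payload Units billed for one record (rounded up, minimum 1)."""
    return max(1, math.ceil(record_bytes / PAYLOAD_UNIT_BYTES))


def monthly_put_cost(records_per_second: float, record_bytes: int, days: int = 30) -> float:
    """PUT cost in USD for a steady workload; shard-hour charges are not included."""
    units = records_per_second * put_payload_units(record_bytes) * 86_400 * days
    return units / 1_000_000 * PRICE_PER_MILLION_UNITS


print(put_payload_units(2_000), put_payload_units(30_000))
print(round(monthly_put_cost(1_000, 2_000), 2))
```

Because billing rounds each record up to whole units, many tiny records cost the same per record as one just under 25 KB, which is one reason producers aggregate small messages.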
Amazon Kinesis Data Streams: Standard consumers
[Diagram: a data producer writes to a Kinesis Data Stream with shards 1 to n; consumer application A reads with GetRecords()]
Data producer: up to 1 MB or 1000 records per second, per shard
GetRecords(): five transactions per second, per shard
Data: 2 MB per second, per shard
With only one consumer application, records can be retrieved every 200 ms
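The per-shard limits above translate directly into sizing arithmetic. The sketch below is a back-of-the-envelope calculator using only the figures on the slide; the function names and the example workload are hypothetical, and it assumes every standard consumer application reads the full stream.

```python
import math

WRITE_MB_PER_SHARD = 1.0        # MB/s ingest per shard (from the slide)
WRITE_RECORDS_PER_SHARD = 1000  # records/s ingest per shard (from the slide)
READ_MB_PER_SHARD = 2.0         # MB/s read per shard, shared by standard consumers
GET_RECORDS_PER_SHARD = 5       # GetRecords calls/s per shard, shared by standard consumers


def shards_needed(write_mb_s: float, write_records_s: float, read_mb_s: float) -> int:
    """Smallest shard count that satisfies the write limits and the aggregate read limit."""
    return max(
        1,
        math.ceil(write_mb_s / WRITE_MB_PER_SHARD),
        math.ceil(write_records_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_s / READ_MB_PER_SHARD),
    )


def polling_interval_ms(consumer_apps: int) -> float:
    """How often each standard consumer can call GetRecords on one shard."""
    return 1000 * consumer_apps / GET_RECORDS_PER_SHARD


# Hypothetical workload: 8 MB/s and 5,000 records/s in, read in full by 3 consumer applications.
print(shards_needed(write_mb_s=8, write_records_s=5_000, read_mb_s=8 * 3))
print(polling_interval_ms(1), polling_interval_ms(3))
```

In this example the read side, not the write side, drives the shard count, which is the situation where a different consumer model becomes worth considering.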
Amazon Kinesis Data Streams: Enhanced fan-out consumers
Consumers do not poll. Messages are pushed to the consumer as they arrive
[Diagram: a data producer writes to shard 1 of a Kinesis Data Stream; consumer application A subscribes with SubscribeToShard()]
SubscribeToShard()
• Uses HTTP/2
• Connection persists for up to five minutes
• Data pushed to consumer
Amazon Kinesis Data Streams Consumers
Standard
• Total number of consuming applications is low
• Consumers are not latency-sensitive
• Minimize cost
Enhanced fan-out
• Multiple consumer applications for the same Kinesis Data Stream
• Default limit of five registered consuming applications. More can be supported with a service limit increase request
• Low-latency requirements for data processing
• Messages are typically delivered to a consumer in less than 70 ms
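The choice between standard and enhanced fan-out consumers often comes down to read throughput per application. The sketch below compares the two, relying on a property that is documented for the service but not stated on the slide: standard consumers share 2 MB per second per shard, while each enhanced fan-out consumer gets its own 2 MB per second per shard. The function name and the shard and consumer counts are hypothetical.

```python
READ_MB_PER_SHARD = 2.0  # MB/s per shard


def per_consumer_read_mb_s(shards: int, consumers: int, enhanced_fan_out: bool) -> float:
    """Read throughput available to each consumer application."""
    total = shards * READ_MB_PER_SHARD
    # Standard consumers share the per-shard read limit; enhanced fan-out
    # consumers each get a dedicated allocation of the same size.
    return total if enhanced_fan_out else total / consumers


for n in (1, 2, 5):
    print(n, per_consumer_read_mb_s(4, n, False), per_consumer_read_mb_s(4, n, True))
```

With a single consumer the two models offer the same throughput, so the standard model's lower cost wins; the gap only opens as consumer applications are added.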
Amazon Kinesis Data Firehose—How it works
Ingest Transform Deliver
Amazon S3
Amazon Redshift
Amazon Elasticsearch Service
AWS IoT
Amazon Kinesis Agent
Amazon Kinesis Streams
Amazon CloudWatch Logs
Amazon CloudWatch Events
Apache Kafka
SQL on streaming data?
Aggregations (count, sum, min, … ) take granular real-time
data and turn it into insights
Data is continuously processed so you need to tell the
application when you want results
Aggregation Windows
Window types
Sliding, tumbling, and stagger
Tumbling windows are fixed size and grouped keys do not overlap
[Figure: tumbling windows over a source stream; time axis t0 to t15]
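Tumbling windows can be sketched in plain Python to show why grouped keys never overlap: every event is assigned to exactly one fixed-size window by its timestamp. The `tumbling_counts` helper and the sample events are hypothetical, and timestamps are plain integers in seconds.

```python
from collections import Counter


def tumbling_counts(events: list[tuple[int, str]], window_seconds: int) -> dict[int, Counter]:
    """Count events per key in fixed, non-overlapping windows.

    Each event is (timestamp_in_seconds, key). A window is identified by its start time.
    """
    windows: dict[int, Counter] = {}
    for timestamp, key in events:
        # Integer arithmetic maps every timestamp to exactly one window start.
        window_start = timestamp - timestamp % window_seconds
        windows.setdefault(window_start, Counter())[key] += 1
    return windows


events = [(0, "a"), (3, "b"), (4, "a"), (5, "a"), (9, "b"), (12, "a")]
for start, counts in sorted(tumbling_counts(events, window_seconds=5).items()):
    print(f"[{start}, {start + 5}): {dict(counts)}")
```

The window boundaries here are aligned to the clock (0, 5, 10, and so on), regardless of when the first event for a key arrives.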
Writing streaming SQL
Pump (continuous query) using stagger window
CREATE OR REPLACE PUMP calls_per_ip_pump AS
  INSERT INTO calls_per_ip_stream
    SELECT STREAM source_ip_address,
                  COUNT(*)
    FROM source_sql_stream_001
    WINDOWED BY STAGGER (
      PARTITION BY source_ip_address
      RANGE INTERVAL '1' MINUTE);
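What the stagger window in this pump computes can be approximated offline in Python. In the sketch below a window for a key opens when the first event for that key arrives and covers the following interval, so windows for different IP addresses start at different times. The `stagger_counts` helper and the sample events are hypothetical, and unlike the streaming query, which emits a result when a window expires, this batch version reports all results at the end.

```python
def stagger_counts(events: list[tuple[int, str]], range_seconds: int) -> list[tuple[str, int, int]]:
    """Count events per key in stagger windows.

    A window for a key opens when the first event for that key arrives and stays
    open for `range_seconds`; a later event for the same key opens a new window.
    Events are (timestamp_in_seconds, key) and must be sorted by timestamp.
    Returns (key, window_open_time, count) tuples.
    """
    open_windows: dict[str, list[int]] = {}  # key -> [open_time, count]
    results: list[tuple[str, int, int]] = []
    for timestamp, key in events:
        window = open_windows.get(key)
        if window is not None and timestamp >= window[0] + range_seconds:
            results.append((key, window[0], window[1]))  # window expired: record it
            window = None
        if window is None:
            open_windows[key] = [timestamp, 1]
        else:
            window[1] += 1
    # Flush windows still open at the end of the input.
    results.extend((key, opened, count) for key, (opened, count) in open_windows.items())
    return sorted(results, key=lambda r: (r[1], r[0]))


calls = [(5, "10.0.0.1"), (20, "10.0.0.2"), (50, "10.0.0.1"), (70, "10.0.0.1"), (75, "10.0.0.2")]
for key, opened, count in stagger_counts(calls, range_seconds=60):
    print(f"{key}: window opened at t={opened}s, {count} call(s)")
```

Compared with a tumbling window, events that belong together are less likely to be split across a clock-aligned boundary, because each key's window starts with that key's first event.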
Apache Flink: Stateful Stream Computations
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Traditional methods struggle with real-world forecasting complications
Can’t handle seasonality
DeepAR
Probabilistic Forecasting with Autoregressive Recurrent Networks
https://github.com/awslabs/amazon-sagemaker-examples/
New!
Amazon Forecast
Accurate time-series forecasting service, based on the same technology used at Amazon.com
• Any historical time-series
• Custom forecasts with 3 clicks
• 50% more accurate
• 1/10th the cost
• Integrates with SAP and Oracle Supply Chain
• Integrates with Amazon Timestream
Generate forecasts for: retail demand, travel demand, AWS usage, revenue forecasts, web traffic, advertising demand
Invoking Lambda functions
AWS IoT Greengrass
Thank you!
Julio Faerman
@faermanj

Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Analysing Data in Real-time

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT. Analysing Data in Real-time. Julio Faerman, @faermanj, AWS Technical Evangelist.
  • 2. Timely decisions require fresh data, fast. Data loses value quickly over time. Chart: value of data to decision-making plotted against the age of the data (real time, seconds, minutes, hours, days, months). Value bands from freshest to oldest: preventive/predictive, actionable, reactive, historical. Time-critical decisions sit at the fresh end; traditional “batch” business intelligence sits at the old end. The curve illustrates the information half-life in decision-making. Source: Perishable Insights, Mike Gualtieri, Forrester.
  • 3. What is streaming data? Typical characteristics: low latency; continuous; ordered and incremental; high volume.
  • 4. Most common uses of streaming: industrial automation; smart home; smart city; data lakes; IoT analytics; log analytics.
  • 5. CUSTOMER EVENT STORE: A JOURNEY FROM DATA WAREHOUSE TO STREAMING DATA
  • 6. CUSTOMER EVENT STORE: INTRODUCTION. Charles van Kints, Product Owner, Customer Event Store, ABN AMRO. Abhishek Choudhary, Lead Development Engineer, Customer Event Store, ABN AMRO.
  • 7.
  • 8. CUSTOMER EVENT STORE: WHY. THE PROBLEM: OUR CHALLENGES ON PROCESSING EVENT DATA. Events: interactions (touchpoints) of (potential) customers with ABN AMRO, on any device, across channels. Challenges: slow batch-driven processing; complexity in connecting to external sources; error-prone process (bad records); huge increase in volumes of data; fast-changing sources; limitations in consuming capabilities; diversity in data from different sources.
  • 9. CUSTOMER EVENT STORE: WHY. PROBLEM STATEMENT. “Not being able to handle important events in the life of the customer that impact their relation with ABN AMRO in an adequate way.” (Bernard Faber, Solution Architect, ABN AMRO.) “Strong increase of the digitalized touching points with our customers (called events), from a growing number of sources.” (Charles van Kints, Product Owner, ABN AMRO.) “The continuous growth in event data sources and volume, the increasing demand towards using event data and the current solution within the Marketing Intelligence data warehouse.” (Peter Kromhout, Engineering Lead, ABN AMRO.)
  • 10. CUSTOMER EVENT STORE: WHAT. Future state of the Customer Event Store: building insights into customer behaviour, the customer journey and customer interactions with ABN AMRO in order to be able to act Personal and Relevant. KEY FEATURES: handle changes instantly and metadata-driven; large volumes; consuming capabilities; customer interactions; real-time.
  • 11. CUSTOMER EVENT STORE: WHEN. JOURNEY SO FAR… ✓ Approval – License to Public ✓ Prepare for Go-Live. March 2018. Prototype: ✓ Develop prototype ✓ Initiate License to Public. April 2018. Technical Go-Live: ✓ Product stack deployed ✓ 2 sources live ✓ Tune product for Business Go-Live. Business Go-Live: ✓ Add new sources ✓ Consuming capabilities ✓ Enable data usage. Approach: ✓ Successful prototype ✓ Co-creation – Business & IT ✓ 2 event sources. Go! August 2018. September 2018. December 2018.
  • 12. CUSTOMER EVENT STORE: HOW. CONCEPTUAL DESIGN. Diagram labels: internal sources; external sources; RDS; Landing Zone; Streaming Zone; Access Management; Metadata; Validation; Streaming End-Point; REST API; Alerting; Standardisation; ETL (streaming and batch); Real-Time processing; Batch processing; EVENT STORE; Profile Information; Stitching; Pre-Processing; Lineage; Orchestration; Monitoring; Stream.
  • 13. CUSTOMER EVENT STORE: HOW. TECHNICAL ARCHITECTURE: STREAM & BATCH. Diagram labels: Batch Bucket; Nano-Batch Bucket; EMR; Glue; Step Function; Lambda; SNS; Enterprise Raw DataStore; Auto-Scaling Group; Snowplow Collector; Fargate.
  • 14. CUSTOMER EVENT STORE: HOW. TECHNICAL ARCHITECTURE: ONE PROCESS – STREAM & BATCH. Diagram labels: Nano-Batch Bucket; Batch Bucket; Snowplow Collector (Auto-Scaling Group, Fargate); Snowplow Enricher (Auto-Scaling Group, Fargate); Kinesis Data Stream – Raw; Kinesis Data Stream – Good; Kinesis Data Stream – Bad; Kinesis Data Firehose; Schema Bucket; Bad Events Bucket; Enterprise Raw DataStore.
  • 15. CUSTOMER EVENT STORE: HOW. TECHNICAL ARCHITECTURE: STANDARDIZE. Diagram labels: Enterprise Raw DataStore; Snowplow Collector (Auto-Scaling Group, Fargate); Snowplow Enricher (Auto-Scaling Group, Fargate); Kinesis Data Stream – Raw; Kinesis Data Stream – Good; Kinesis Data Stream – Bad; Kinesis Data Stream – Standardized; Kinesis Data Firehose – ORC; Kinesis Data Firehose – JSON; Standard Bucket; Schema Bucket; Glue Crawler; Athena; DynamoDB; CloudWatch (Alarm, Alarm, Rule).
  • 16. CUSTOMER EVENT STORE: HOW. TECHNICAL ARCHITECTURE: END 2 END. Diagram label: Enterprise Raw DataStore.
  • 17. CUSTOMER EVENT STORE: SUMMARY. AWS STEP FUNCTIONS (2018 analysis). Complex workflows: workflows involving iteration of Lambda functions can be implemented quickly. Debug friendly: clear intermediate results. Restartability: a new state machine can be created for only the failed states. State management: preserves state between subsequent API calls. Serverless orchestration: Lambda, Glue, ECS, SageMaker. Error handling: retries can be triggered for specific errors; other actions can also be configured.
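The error-handling point on slide 17 can be illustrated with a task state written in Amazon States Language. This is a minimal sketch, not the ABN AMRO configuration: the state names, the placeholder Lambda ARN and the retry numbers are all assumptions chosen for illustration. `Retry` re-runs the task for the listed errors with exponential backoff, and `Catch` routes any remaining error to another state.

```python
import json


def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry under exponential backoff."""
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]


# Hypothetical task state: the Resource ARN and the state names
# ("NotifyFailure", "Done") are placeholders, not taken from the slides.
process_batch_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:process-batch",
    "Retry": [{
        "ErrorEquals": ["States.Timeout"],  # retry only this error
        "IntervalSeconds": 2,               # wait before the first retry
        "MaxAttempts": 3,
        "BackoffRate": 2.0,                 # multiply the wait each time
    }],
    # Any error still unhandled after the retries goes to a failure state.
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
    "Next": "Done",
}

retry = process_batch_state["Retry"][0]
delays = retry_delays(retry["IntervalSeconds"],
                      retry["MaxAttempts"],
                      retry["BackoffRate"])
state_json = json.dumps(process_batch_state, indent=2)
print(delays)
print(state_json)
```

The helper only makes the backoff schedule visible; in a real state machine the service applies it, and the dictionary would be one entry under `States` in the definition.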
  • 18. CUSTOMER EVENT STORE: SUMMARY. WHEN TO USE GLUE AND/OR EMR. AWS Glue: serverless and pre-configured; fully managed public service; horizontally scalable; limited customization; only Spark. AWS EMR: define the cluster, choose applications and customize as you wish; can be placed in a custom VPC; vertically and horizontally scalable; more actions than just Spark. Also listed: spin-up time.
  • 19. CUSTOMER EVENT STORE: SUMMARY. KEY TAKEAWAYS: DevOps; Security by Design; Architecture by Evolution; One Process; Serverless & Native Components; Technical vs Business Go-Live.
  • 21. Streaming with Amazon Kinesis: easily collect, process, and analyze video and data streams in real time. Amazon Kinesis Video Streams: capture, process, and store video streams. Amazon Kinesis Data Streams: capture, process, and store data streams. Amazon Kinesis Data Firehose: load data streams into data stores. Amazon Kinesis Data Analytics: analyze data streams with SQL.
  • 22. Amazon Kinesis Data Streams producers and consumers. Producers: Kinesis Agent; Apache Kafka; AWS SDK; Log4j; Flume; Fluentd; AWS Mobile SDK for iOS; Amazon Kinesis Producer Library. Consumers: Get* APIs; Amazon Kinesis Client Library + Connector Library; Apache Storm; Amazon EMR; AWS Lambda; Apache Spark.
  • 24. Data ingestion from a variety of sources. Sources feeding Kinesis Data Streams: transactions; ERP; web logs/cookies; connected devices. AWS SDKs: publish directly from application code via APIs; AWS Mobile SDK; managed AWS sources (CloudWatch Logs, AWS IoT, Kinesis Data Analytics and more); RDS Aurora via Lambda. Kinesis Agent: monitors log files and forwards lines as messages to Kinesis Data Streams. Kinesis Producer Library (KPL): background process aggregates and batches messages. Third party and open source: Log4j appender; Apache Kafka; Flume, Fluentd, and more.
  • 25. Data processing from a variety of consumers. Fully managed service for real-time processing of streaming data. Cost-effective: $0.014 per 1,000,000 PUT Payload Units. Millions of sources producing hundreds of terabytes per hour. Amazon Web Services front end: authentication and authorization. Durable, highly consistent storage replicates data across three data centers (Availability Zones). An ordered stream of events supports multiple readers: Amazon Kinesis Client Library on EC2; Amazon Kinesis Data Firehose; Amazon Kinesis Data Analytics; AWS Lambda.
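The PUT Payload Unit price on slide 25 can be turned into a rough ingestion estimate. This is a sketch under stated assumptions: it takes one payload unit to be a 25 KB chunk of a record, rounded up per record, uses the price quoted on the slide, and ignores shard-hour and other charges. The function name and the sample workload are illustrative only.

```python
import math

PRICE_PER_MILLION_UNITS = 0.014   # USD, figure quoted on the slide
PAYLOAD_UNIT_BYTES = 25 * 1024    # assumption: one PUT Payload Unit = 25 KB


def put_cost_usd(records_per_second, record_bytes, seconds):
    """Estimate the PUT charge for a steady stream of equal-sized records."""
    # Every record consumes at least one unit; larger records consume more.
    units_per_record = max(1, math.ceil(record_bytes / PAYLOAD_UNIT_BYTES))
    total_units = records_per_second * seconds * units_per_record
    return total_units / 1_000_000 * PRICE_PER_MILLION_UNITS


# Illustrative workload: 1,000 records/s of 2 KB each for a 30-day month.
monthly = put_cost_usd(1000, 2048, 30 * 24 * 3600)
print(f"Estimated PUT cost per month: ${monthly:.2f}")
```

Because units are counted per record, many tiny records cost the same as fewer records just under the unit size, which is one reason producers such as the KPL aggregate messages before sending.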
  • 26. Amazon Kinesis Data Streams: Standard consumers. A Kinesis data stream is made up of shards (shard 1, shard 2, shard 3, … shard n). Data producer: up to 1 MB or 1,000 records per second, per shard. Consumer application A calls GetRecords(): five transactions per second, per shard. Data out: 2 MB per second, per shard. With only one consumer application, records can be retrieved every 200 ms.
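The 200 ms figure on slide 26 follows from the per-shard limits, which standard consumers share among themselves. The sketch below works through that arithmetic for more than one consumer application; the function name is illustrative, and it assumes the applications split the shard's budget evenly.

```python
from typing import Tuple

GET_RECORDS_TPS_PER_SHARD = 5     # GetRecords calls per second, per shard
READ_MB_PER_SEC_PER_SHARD = 2.0   # read throughput per shard


def standard_consumer_budget(consumer_apps: int) -> Tuple[float, float]:
    """Return (poll interval in ms, MB/s) available to each application.

    Assumes every standard consumer application reads the same shard
    and that the shard limits are divided evenly between them.
    """
    if consumer_apps < 1:
        raise ValueError("need at least one consumer application")
    poll_interval_ms = 1000 * consumer_apps / GET_RECORDS_TPS_PER_SHARD
    read_mb_per_sec = READ_MB_PER_SEC_PER_SHARD / consumer_apps
    return poll_interval_ms, read_mb_per_sec


for apps in (1, 2, 4):
    interval, mb = standard_consumer_budget(apps)
    print(f"{apps} app(s): poll every {interval:.0f} ms, {mb:.2f} MB/s each")
```

With four applications each one can poll only every 800 ms and read 0.5 MB/s, which is the pressure that enhanced fan-out is meant to relieve.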
  • 27. Amazon Kinesis Data Streams: Enhanced fan-out consumers. Consumers do not poll; messages are pushed to the consumer as they arrive. Consumer application A calls SubscribeToShard(), which uses HTTP/2: the connection persists for up to five minutes, and data is pushed to the consumer over it.
  • 28. Amazon Kinesis Data Streams consumers. Standard: the total number of consuming applications is low; consumers are not latency-sensitive; the goal is to minimize cost. Enhanced fan-out: multiple consumer applications for the same Kinesis data stream; default limit of five registered consuming applications (more can be supported with a service limit increase request); low-latency requirements for data processing; messages are typically delivered to a consumer in less than 70 ms.
  • 30. Amazon Kinesis Data Firehose: how it works. Ingest, transform, deliver. Sources: AWS IoT; Amazon Kinesis Agent; Amazon Kinesis Streams; Amazon CloudWatch Logs; Amazon CloudWatch Events; Apache Kafka. Destinations: Amazon S3; Amazon Redshift; Amazon Elasticsearch Service.
  • 32. SQL on streaming data? Aggregation: aggregations (count, sum, min, …) take granular real-time data and turn it into insights. Windows: data is continuously processed, so you need to tell the application when you want results.
  • 33. Window types: sliding, tumbling, and stagger. Tumbling windows are fixed size and grouped keys do not overlap. Diagram: source events on a time axis from t0 to t15.
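The tumbling-window behaviour described on slide 33 can be sketched in a few lines of plain Python. This is an illustration of the concept, not how Kinesis Data Analytics implements it: events are assumed to be `(timestamp_seconds, key)` pairs, and each event falls into exactly one fixed-size window, so windows never overlap.

```python
from collections import Counter


def tumbling_counts(events, window_seconds):
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, key in events:
        # Integer division snaps each timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Illustrative events: (seconds since start, key).
events = [(0, "a"), (3, "a"), (9, "b"), (10, "a"), (19, "b"), (20, "a")]
result = tumbling_counts(events, 10)
for (window_start, key), count in sorted(result.items()):
    print(f"window [{window_start}, {window_start + 10}) key={key} count={count}")
```

Window boundaries here are aligned to the clock (0, 10, 20, …) regardless of when a key's first event arrives, which is the main contrast with stagger windows.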
  • 34. Writing streaming SQL Pump (continuous query) using stagger window CREATE OR REPLACE PUMP calls_per_ip_pump AS INSERT INTO calls_per_ip_stream SELECT STREAM source_ip_address, COUNT(*) FROM source_sql_stream_001 WINDOWED BY STAGGER( PARTITION BY source_ip_address RANGE INTERVAL '1' MINUTE);
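To make the stagger window in the query above more concrete, here is a simplified simulation in plain Python. It assumes the core idea of a stagger window: a window for a partition key opens when the first event for that key arrives and covers the given range from that moment. Late data, watermarks and the service's actual emission timing are deliberately left out, and the function name and sample events are illustrative.

```python
def stagger_counts(events, window_seconds):
    """Count events per key in windows that open at each key's first event.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns a list of (key, window_start, count), ordered by window start.
    """
    open_windows = {}   # key -> [window_start, count]
    closed = []
    for timestamp, key in sorted(events):
        window = open_windows.get(key)
        # Close the key's window once the event falls outside its range.
        if window is not None and timestamp >= window[0] + window_seconds:
            closed.append((key, window[0], window[1]))
            window = None
        if window is None:
            window = [timestamp, 0]   # new window starts at this event
            open_windows[key] = window
        window[1] += 1
    # Flush the windows that are still open at the end of the input.
    for key, (start, count) in open_windows.items():
        closed.append((key, start, count))
    return sorted(closed, key=lambda w: (w[1], w[0]))


calls = [(5, "10.0.0.1"), (20, "10.0.0.1"), (50, "10.0.0.2"),
         (70, "10.0.0.1"), (100, "10.0.0.2")]
windows = stagger_counts(calls, 60)
for key, start, count in windows:
    print(f"{key}: window starting at t={start}s, {count} call(s)")
```

With a one-minute tumbling window the calls from 10.0.0.2 at t=50 and t=100 would land in different windows; here they are counted together because the window for that address starts at t=50.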
  • 35. Apache Flink: Stateful Stream Computations
  • 36.
  • 37.
  • 39. Traditional methods struggle with real-world forecasting complications: can’t handle seasonality.
  • 40. DeepAR Probabilistic Forecasting with Autoregressive Recurrent Networks
  • 42. NEW! Amazon Forecast: accurate time-series forecasting service, based on the same technology used at Amazon.com. Any historical time series; integrates with SAP and Oracle Supply Chain; custom forecasts with 3 clicks; 50% more accurate; 1/10th the cost; integrates with Amazon Timestream. Generate forecasts for: retail demand; travel demand; AWS usage; revenue forecasts; web traffic; advertising demand.
  • 44. Invoking Lambda functions
  • 46.
  • 47.
  • 49. Thank you! Julio Faerman, @faermanj