© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Roy Ben-Alta, Sr. Business Development Manager, AWS
September 22, 2016
Real-time Streaming Data on AWS
Deep Dive & Best Practices
Carlos Vinicius, Data Engineer @ OLX
Outline
Real-time streaming overview
Use cases and design patterns
Amazon Kinesis deep dive
Streaming data ingestion
Stream processing
Q&A
Big Data Evolution – It is all about the Pace
Batch: Report
Real-time: Alerts
Prediction: Forecast
Streaming Data Scenarios Across Verticals
Scenarios/Verticals | Accelerated Ingest-Transform-Load | Continuous Metrics Generation | Responsive Data Analysis
Digital Ad Tech/Marketing | Publisher, bidder data aggregation | Advertising metrics like coverage, yield, and conversion | User engagement with ads, optimized bid/buy engines
IoT | Sensor, device telemetry data ingestion | Operational metrics and dashboards | Device operational intelligence and alerts
Gaming | Online data aggregation, e.g., top 10 players | Massively multiplayer online game (MMOG) live dashboard | Leader board generation, player-skill match
Consumer Online | Clickstream analytics | Metrics like impressions and page views | Recommendation engines, proactive care
Amazon Kinesis Customer Base Diversity
1 billion events/wk from connected devices | IoT
17 PB of game data per season | Entertainment
80 billion ad impressions/day, 30 ms response time | Ad Tech
100 GB/day click streams from 250+ sites | Enterprise
50 billion ad impressions/day, sub-50 ms responses | Ad Tech
10 million events/day | Retail
Amazon Kinesis as Databus - Migrate from Kafka to Kinesis | Enterprise
Funnel all production events through Amazon Kinesis
Metering Record
{
"payerId": "Joe",
"productCode": "AmazonS3",
"clientProductCode": "AmazonS3",
"usageType": "Bandwidth",
"operation": "PUT",
"value": "22490",
"timestamp": "1216674828"
}
Common Log Entry
{
127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
}
MQTT Record
{
"SeattlePublicWater/Kinesis/123/Realtime" – 412309129140
}
Syslog Entry
{
<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]
}
Streaming Data Challenges: Variety & Velocity
• Streaming data comes in different types and formats
− Metering records, logs and sensor data
− JSON, CSV, TSV
• Can vary in size from a few bytes to kilobytes or megabytes
• High velocity and continuous
Two Main Processing Patterns
Stream processing (real time)
• Real-time response to events in data streams
Examples:
• Proactively detect hardware errors in device logs
• Notify when inventory drops below a threshold
• Fraud detection
Micro-batching (near real time)
• Near real-time operations on small batches of events in data streams
Examples:
• Aggregate and archive events
• Monitor performance SLAs
Amazon Kinesis Deep Dive
Amazon Kinesis Streams
• For technical developers
• Build your own custom applications that process or analyze streaming data
Amazon Kinesis Firehose
• For all developers, data scientists
• Easily load massive volumes of streaming data into S3, Amazon Redshift and Amazon Elasticsearch
Amazon Kinesis Analytics
• For all developers, data scientists
• Easily analyze data streams using standard SQL queries
Amazon Kinesis: Streaming Data Made Easy
Services make it easy to capture, deliver and process streams on AWS
Amazon Kinesis Streams
Build your own data streaming applications
Easy administration: Simply create a new stream, and set the desired level of
capacity with shards. Scale to match your data throughput rate and volume.
Build real-time applications: Perform continual processing on streaming big data
using Kinesis Client Library (KCL), Apache Spark/Storm, AWS Lambda, and more.
Low cost: Cost-efficient for workloads of any scale.
Real-Time Streaming Data Ingestion
Custom-built Streaming Applications (KCL)
Inexpensive: $0.014 per 1,000,000 PUT Payload Units
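As a rough illustration of the pricing quoted above, the sketch below estimates a monthly PUT charge. It assumes a PUT Payload Unit is a 25 KB chunk of a record (so a 60 KB record counts as 3 units) and uses the $0.014 per million units figure from this slide. Shard-hour charges are not included, and the function name and sample workload are hypothetical.

```python
import math

PUT_UNIT_BYTES = 25 * 1024        # assumption: one PUT Payload Unit = 25 KB chunk
PRICE_PER_MILLION_UNITS = 0.014   # USD, figure quoted on the slide


def monthly_put_cost(records_per_day: int, record_bytes: int, days: int = 30) -> float:
    """Estimate the monthly PUT Payload Unit charge in USD (shard hours excluded)."""
    # Every record costs at least one unit; larger records cost one unit per 25 KB.
    units_per_record = max(1, math.ceil(record_bytes / PUT_UNIT_BYTES))
    total_units = records_per_day * days * units_per_record
    return total_units / 1_000_000 * PRICE_PER_MILLION_UNITS


if __name__ == "__main__":
    # Hypothetical workload: 5 million small (100-byte) records per day.
    print(f"{monthly_put_cost(5_000_000, 100):.2f} USD per month")
```

Because billing is per unit rather than per byte, small records are cheap to send individually, while records just over a 25 KB boundary pay for a full extra unit.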
Amazon Kinesis Streams - GA 2013
Fully managed service for real-time processing of streaming data
[Diagram labels: Data Sources (x4), AWS Endpoint, Shard 1, Shard 2, Shard N, Availability Zone (x3), App.1 [Aggregate & De-Duplicate], App.2 [Metric Extraction], App.3 [Sliding Window Analysis], App.4 [Machine Learning], Amazon S3, Amazon Redshift, AWS Lambda, Amazon EMR]
Amazon Kinesis Streams
Managed service for real-time streaming
• Streams are made of shards
• Each shard ingests up to 1 MB/sec and 1,000 records/sec
• Each shard emits up to 2 MB/sec
• All data is stored for 24 hours by default; storage can be extended for up to 7 days
• Scale Kinesis streams using the scaling utility
• Replay data within the retention window (24 hours by default)
Amazon Kinesis Streams
Managed ability to capture and store data
Amazon Kinesis Streams: Year in Review 2016
Lambda and Spark Streaming support
Extended Retention
Shard-Level Metrics
Time-based seek
Amazon Kinesis Firehose
Load massive volumes of streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch
Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch without writing an application or managing infrastructure.
Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 seconds using simple configurations.
Seamless elasticity: Seamlessly scales to match data throughput without intervention.
Capture and submit streaming data to Firehose
Firehose loads streaming data continuously into S3, Amazon Redshift and Amazon Elasticsearch
Analyze streaming data using your favorite BI tools
Amazon Kinesis Firehose: Year in Review & 2016 Roadmap
Kinesis Agent and log transformation
Error Reporting and Troubleshooting
Delivery for S3, Redshift and Elasticsearch
Amazon Kinesis Firehose vs. Amazon Kinesis Streams
Amazon Kinesis Streams is for use cases that require custom
processing, per incoming record, with sub-1 second processing
latency, and a choice of stream processing frameworks.
Amazon Kinesis Firehose is for use cases that require zero
administration, ability to use existing analytics tools based on
Amazon S3, Amazon Redshift and Amazon Elasticsearch, and a
data latency of 60 seconds or higher.
Amazon Kinesis Analytics
Apply SQL on streams: Easily connect to a Kinesis Stream or Firehose Delivery Stream and apply SQL skills.
Build real-time applications: Perform continual processing on streaming big data with sub-second processing latencies.
Easy scalability: Elastically scales to match data throughput.
Connect to Kinesis streams, Firehose delivery streams
Run standard SQL queries against data streams
Kinesis Analytics can send processed data to analytics tools so you can create alerts and respond in real time
Use SQL to build real-time applications
Easily write SQL code to process streaming data
Connect to streaming source
Continuously deliver SQL results
Streaming Data Ingestion
Putting Data into Amazon Kinesis Streams
Determine your partition key strategy
• Managed buffer or streaming MapReduce job
• Ensure high cardinality for your shards
Provision adequate shards
• For ingress needs
• Egress needs for all consuming applications: add shards if more than two applications read simultaneously (each shard emits up to 2 MB/sec)
• Include headroom for catching up with data in stream
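The provisioning checklist above can be turned into a back-of-the-envelope calculation. This is a minimal sketch that uses the per-shard limits quoted earlier in this deck (1 MB/sec and 1,000 records/sec in, 2 MB/sec out); the function name and the 25% default headroom are illustrative assumptions, not AWS guidance.

```python
import math

# Per-shard limits quoted earlier in this deck.
INGRESS_MB_PER_SEC = 1.0
INGRESS_RECORDS_PER_SEC = 1000
EGRESS_MB_PER_SEC = 2.0


def shards_needed(ingress_mb_per_sec, records_per_sec, consumers, headroom=0.25):
    """Return a shard count covering ingress, egress, and catch-up headroom."""
    by_bytes = ingress_mb_per_sec / INGRESS_MB_PER_SEC
    by_records = records_per_sec / INGRESS_RECORDS_PER_SEC
    # Assumption: every consuming application reads the full stream,
    # so required egress scales with the number of consumers.
    by_egress = ingress_mb_per_sec * consumers / EGRESS_MB_PER_SEC
    base = max(by_bytes, by_records, by_egress)
    return max(1, math.ceil(base * (1 + headroom)))


if __name__ == "__main__":
    # Hypothetical workload: 4 MB/sec, 2,000 records/sec, three consuming apps.
    print(shards_needed(4, 2000, consumers=3))
```

With one or two consumers the ingress limit dominates; from the third consumer onward the egress limit becomes the binding constraint, which is why the slide singles out "more than two simultaneous applications".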
Putting Data into Amazon Kinesis
Amazon Kinesis Agent – (supports pre-processing)
• http://docs.aws.amazon.com/firehose/latest/dev/writing-with-agents.html
Pre-batch before Puts for better efficiency
• Consider Flume, Fluentd as collectors/agents
• See https://github.com/awslabs/aws-fluent-plugin-kinesis
Make a tweak to your existing logging
• log4j appender option
• See https://github.com/awslabs/kinesis-log4j-appender
Amazon Kinesis Producer Library
• Writes to one or more Amazon Kinesis streams with automatic,
configurable retry mechanism
• Collects records and uses PutRecords to write multiple records to
multiple shards per request
• Aggregates user records to increase payload size and improve
throughput
• Integrates seamlessly with KCL to de-aggregate batched records
• Use Amazon Kinesis Producer Library with AWS Lambda (New!)
• Submits Amazon CloudWatch metrics on your behalf to provide
visibility into producer performance
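To illustrate the "collect records, then write many per request" idea, here is a small pure-Python batching sketch. It is not the Kinesis Producer Library; it only groups records so that each batch stays within per-request limits. The defaults of 500 records and 5 MB per request are assumed from the PutRecords limits documented around the time of this talk, so check the current API reference before relying on them.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, bytes]  # (partition_key, data)


def batch_records(records: Iterable[Record],
                  max_records: int = 500,
                  max_bytes: int = 5 * 1024 * 1024) -> Iterator[List[Record]]:
    """Group records into batches that respect count and size limits."""
    batch: List[Record] = []
    batch_bytes = 0
    for key, data in records:
        # The partition key counts toward the request size as well.
        size = len(key.encode("utf-8")) + len(data)
        if size > max_bytes:
            raise ValueError("single record exceeds the request size limit")
        # Flush the current batch before it would break either limit.
        if batch and (len(batch) >= max_records or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append((key, data))
        batch_bytes += size
    if batch:
        yield batch


if __name__ == "__main__":
    sample = [(f"key-{i}", b"x" * 10) for i in range(1200)]
    print([len(b) for b in batch_records(sample)])
```

Each batch could then be handed to a PutRecords call (for example boto3's `put_records` with `Data` and `PartitionKey` entries); a production producer would also retry the entries that the response reports as failed.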
Record Order and Multiple Shards
Unordered processing
• Randomize partition key to distribute events over
many shards and use multiple workers
Exact order processing
• Control partition key to ensure events are
grouped into the same shard and read by the
same worker
Need both? Use global sequence number
[Diagram labels: Producer, Get Global Sequence, Unordered Stream, Campaign Centric Stream, Fraud Inspection Stream, Get Event Metadata]
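The effect of the partition key on ordering can be sketched in a few lines. Kinesis maps each partition key to a 128-bit hash key using MD5, and each shard owns a range of hash keys; the code below assumes the shards split the hash space evenly, which is a simplification. The helper names and the "campaign-42" key are hypothetical.

```python
import hashlib
import uuid
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    hash_key = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return hash_key * shard_count // HASH_SPACE


def spread(keys, shard_count):
    """Count how many keys land on each shard."""
    return Counter(shard_for_key(k, shard_count) for k in keys)


if __name__ == "__main__":
    # Unordered processing: random keys spread events over all shards.
    print(spread((str(uuid.uuid4()) for _ in range(10_000)), 4))
    # Exact-order processing: a single key pins every event to one shard.
    print(spread(["campaign-42"] * 10_000, 4))
```

Random keys give roughly equal load per shard but no cross-event ordering; a fixed key preserves order at the cost of concentrating traffic on one shard.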
Sample Code for Scaling Shards
java -cp KinesisScalingUtils.jar-complete.jar \
  -Dstream-name=MyStream \
  -Dscaling-action=scaleUp \
  -Dcount=10 \
  -Dregion=eu-west-1 ScalingClient
Options:
• stream-name - The name of the stream to be scaled
• scaling-action - The action to be taken to scale. Must be one of "scaleUp", "scaleDown" or "resize"
• count - Number of shards by which to absolutely scale up or down, or resize
See https://github.com/awslabs/amazon-kinesis-scaling-utils
Amazon Kinesis Stream Processing
Amazon Kinesis Client Library
• Build Kinesis Applications with Kinesis Client Library (KCL)
• Open source client library available for Java, Ruby, Python, and Node.js developers
• Deploy on your EC2 instances
• KCL Application includes three components:
1. Record Processor Factory – Creates the record processor
2. Record Processor – Processor unit that processes data from a
shard in Amazon Kinesis Streams
3. Worker – Processing unit that maps to each application instance
State Management with Kinesis Client Library
• One record processor maps to one shard and processes data records from
that shard
• One worker maps to one or more record processors
• Balances shard-worker associations when worker / instance counts change
• Balances shard-worker associations when shards split or merge
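The balancing behavior described above can be pictured with a toy model. This is a simple round-robin assignment written for illustration only; it is not how the KCL tracks or negotiates shard ownership, and the shard and worker names are made up.

```python
from typing import Dict, List


def assign_shards(shards: List[str], workers: List[str]) -> Dict[str, List[str]]:
    """Spread shards over workers so that per-worker counts differ by at most one."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment: Dict[str, List[str]] = {w: [] for w in workers}
    # One record processor per shard; each worker hosts one or more of them.
    for i, shard in enumerate(sorted(shards)):
        assignment[workers[i % len(workers)]].append(shard)
    return assignment


if __name__ == "__main__":
    shards = ["shard-1", "shard-2", "shard-3", "shard-4"]
    print(assign_shards(shards, ["worker-a", "worker-b"]))
    # A third worker joins: shards are redistributed.
    print(assign_shards(shards, ["worker-a", "worker-b", "worker-c"]))
    # A shard splits into two children: the new shards are balanced as well.
    print(assign_shards(shards[:3] + ["shard-5", "shard-6"], ["worker-a", "worker-b"]))
```

Rerunning the assignment whenever the worker list or the shard list changes mirrors the two rebalancing triggers named on the slide.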
Other Options
• Third-party connectors(for example, Splunk)
• AWS IoT platform
• Amazon EMR with Apache Spark, Pig or Hive
• AWS Lambda
Apache Spark and Amazon Kinesis Streams
Apache Spark is an in-memory cluster computing engine that uses resilient distributed datasets (RDDs) for fast processing
Spark Streaming can read directly from an Amazon Kinesis stream
Amazon Software License linking – Add the ASL dependency to your SBT/Maven project, artifactId = spark-streaming-kinesis-asl_2.10
Example: Counting tweets on a sliding window (simplified; the real KinesisUtils.createStream call takes additional arguments such as the StreamingContext, endpoint, and region)
KinesisUtils.createStream("twitter-stream")
.filter(_.getText.contains("Open-Source"))
.countByWindow(Seconds(5))
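The windowed count in the example above can be illustrated without Spark. The following is a plain-Python sketch of a sliding-window counter, not Spark Streaming code; the class and function names are hypothetical, and events are assumed to arrive in timestamp order with timestamps no later than `now`.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events whose timestamp falls within the last `window_seconds`."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> None:
        self._timestamps.append(timestamp)

    def count(self, now: float) -> int:
        # Evict events that have slid out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


def count_matching(tweets, keyword, window_seconds, now):
    """tweets: iterable of (timestamp, text) pairs in timestamp order."""
    counter = SlidingWindowCounter(window_seconds)
    for timestamp, text in tweets:
        if keyword in text:  # same role as the filter step in the example
            counter.add(timestamp)
    return counter.count(now)


if __name__ == "__main__":
    tweets = [(0.0, "Open-Source rocks"), (3.0, "unrelated"),
              (4.0, "I love Open-Source"), (6.0, "Open-Source again")]
    print(count_matching(tweets, "Open-Source", window_seconds=5, now=7.0))
```

A sliding window moves continuously with the current time, so the same event can contribute to many successive counts until it ages out.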
Common Integration Pattern with Amazon EMR
Tumbling Window Reporting
[Diagram: streaming input from Amazon Kinesis Streams, tumbling/fixed window aggregation on Amazon EMR, periodic output, COPY from Amazon EMR into Amazon Redshift]
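A tumbling (fixed) window differs from a sliding one in that each event belongs to exactly one window. The sketch below shows the aggregation step in plain Python under that definition; it is not EMR or Spark code, and the function name and event shape are assumptions for illustration.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per fixed, non-overlapping window.

    events: iterable of (epoch_seconds, key) pairs.
    Returns a dict mapping (window_start, key) to the event count.
    """
    counts = Counter()
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    events = [(0, "page-a"), (59, "page-a"), (60, "page-a"),
              (61, "page-b"), (125, "page-a")]
    for (start, key), n in sorted(tumbling_window_counts(events, 60).items()):
        print(start, key, n)
```

Each completed window yields one row per key, which is the kind of periodic output that can then be bulk-loaded into a warehouse table.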
OLX Case Study
Carlos Vinicius, Data Engineer @ OLX
“Data streaming production-ready in no time”
Present in the country since 2010, OLX is Brazil's biggest classifieds website and app, with more than 14 million active ads and 5 million messages exchanged daily via chat.
“Being able to evaluate new ideas quickly and efficiently is, for us, AWS's greatest benefit.”
- Bernardo Carneiro, Director of Technology
The challenge
Build a scalable architecture that supports a growing volume of data.
Be able to develop and evaluate
the results quickly.
Cost.
Solution
Daily load
450 MB per day
5M records per day
Requests per minute: 3,500 average, 5,500 peak
Amazon Kinesis Benefits
4x higher success rate
3 weeks development-to-production time
30 USD monthly cost
Amazon Kinesis Streams with AWS Lambda
AWS Lambda + Amazon Kinesis
In every row, Kinesis captures the stream and Lambda processes the stream.
Data Input | Action | Data Output
IT application activity | Audit | SNS
Metering records | Condense | Redshift
Change logs | Backup | S3
Financial data | Store | RDS
Transaction orders | Process | SQS
Server health metrics | Monitor | EC2
User clickstream | Analyze | EMR
IoT device data | Respond | Backend endpoint
Custom data | Custom action | Custom application
Common Architecture: Lambda + Kinesis
Data Processing for Data Storage/Analysis
Use Lambda to process and “fan out” to other AWS services, e.g., storage, database, and BI/analytics
An Amazon Kinesis stream can continuously capture and store terabytes of data per hour from hundreds of thousands of sources
Grant AWS Lambda permissions for the relevant stream actions via IAM (execution role) during function creation
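For reference, a Lambda function subscribed to a Kinesis stream receives batches of records whose data is base64-encoded under `Records[].kinesis.data`. The handler below is a minimal sketch that decodes such a batch; it assumes the producers wrote JSON payloads, and the fan-out step is left as a comment because the destination depends on the application.

```python
import base64
import json


def handler(event, context):
    """Decode Kinesis records delivered to Lambda and return the parsed payloads."""
    payloads = []
    for record in event["Records"]:
        # Kinesis record data arrives base64-encoded inside the event.
        raw = base64.b64decode(record["kinesis"]["data"])
        # Assumption: producers wrote UTF-8 JSON documents.
        payloads.append(json.loads(raw))
    # A real function would fan these out here (e.g., to S3, Redshift, or SNS).
    return {"processed": len(payloads), "payloads": payloads}


if __name__ == "__main__":
    # Hand-built sample event for local testing.
    data = base64.b64encode(json.dumps({"clicks": 3}).encode("utf-8")).decode("ascii")
    sample_event = {"Records": [{"kinesis": {"partitionKey": "user-1", "data": data}}]}
    print(handler(sample_event, None))
```

The execution role mentioned above must allow the function to read from the stream; without it the event source mapping cannot deliver records to the handler.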
Atom Data Flow Management is a data infrastructure solution that
allows clients to customize their data flow according to their business
needs.
365Scores is a leading sports app that offers users live scores, match
statistics, news, videos, and highlights across 10 sports and over 1K
competitions worldwide.
Background
350B data events every month
15 dedicated data professionals
10M+ total installs
375K five-star reviews
$6.7M funding raised
[Diagram: Lambda and Kinesis]
Hearst
Processing 150 GB/day of clickstream data
Conclusion
• Amazon Kinesis offers: managed service to build applications, streaming
data ingestion, and continuous processing
• Ingest aggregate data using the Amazon Kinesis Producer Library
• Process data using the Amazon Kinesis Connector Library and open source connectors
• Determine your partition key strategy
• Try out Amazon Kinesis at http://aws.amazon.com/kinesis/
Reference
• Technical documentation
− Amazon Kinesis Agent
− Amazon Kinesis Streams and Spark Streaming
− Amazon Kinesis Producer Library Best Practice
− Amazon Kinesis Firehose and AWS Lambda
− Building Near Real-Time Discovery Platform with Amazon Kinesis
• Public case studies
− Comcast Use Case
− Glu mobile – Real-Time Analytics
− Hearst Publishing – Clickstream Analytics
− How Sonos Leverages Amazon Kinesis
− Nordstrom Online Stylist
Thank you!

More Related Content

What's hot

AWS re:Invent 2016 recap (part 2)
AWS re:Invent 2016 recap (part 2) AWS re:Invent 2016 recap (part 2)
AWS re:Invent 2016 recap (part 2) Julien SIMON
 
AWS Summit Seoul 2015 - AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석
AWS Summit Seoul 2015 -  AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석AWS Summit Seoul 2015 -  AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석
AWS Summit Seoul 2015 - AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석Amazon Web Services Korea
 
데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 GamingAmazon Web Services Korea
 
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)Amazon Web Services Korea
 
AWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDSAWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDSAmazon Web Services
 
Getting Started with the Hybrid Cloud: Enterprise Backup and Recovery
Getting Started with the Hybrid Cloud: Enterprise Backup and RecoveryGetting Started with the Hybrid Cloud: Enterprise Backup and Recovery
Getting Started with the Hybrid Cloud: Enterprise Backup and RecoveryAmazon Web Services
 
2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개
2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개 2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개
2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개 Amazon Web Services Korea
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015Amazon Web Services Korea
 
Getting Started with AWS Security
Getting Started with AWS SecurityGetting Started with AWS Security
Getting Started with AWS SecurityAmazon Web Services
 
Deep Dive on Amazon RDS (May 2016)
Deep Dive on Amazon RDS (May 2016)Deep Dive on Amazon RDS (May 2016)
Deep Dive on Amazon RDS (May 2016)Julien SIMON
 
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)
數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)
數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)Amazon Web Services
 
Database migration simple, cross-engine and cross-platform migrations with ...
Database migration   simple, cross-engine and cross-platform migrations with ...Database migration   simple, cross-engine and cross-platform migrations with ...
Database migration simple, cross-engine and cross-platform migrations with ...Amazon Web Services
 
Getting started with amazon aurora - Toronto
Getting started with amazon aurora - TorontoGetting started with amazon aurora - Toronto
Getting started with amazon aurora - TorontoAmazon Web Services
 
How to Scale to Millions of Users with AWS
How to Scale to Millions of Users with AWSHow to Scale to Millions of Users with AWS
How to Scale to Millions of Users with AWSAmazon Web Services
 
AWS Summit Seoul 2015 - AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...
AWS Summit Seoul 2015 -  AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...AWS Summit Seoul 2015 -  AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...
AWS Summit Seoul 2015 - AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...Amazon Web Services Korea
 

What's hot (20)

AWS re:Invent 2016 recap (part 2)
AWS re:Invent 2016 recap (part 2) AWS re:Invent 2016 recap (part 2)
AWS re:Invent 2016 recap (part 2)
 
AWS Summit Seoul 2015 - AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석
AWS Summit Seoul 2015 -  AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석AWS Summit Seoul 2015 -  AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석
AWS Summit Seoul 2015 - AWS 클라우드를 활용한 빅데이터 및 실시간 스트리밍 분석
 
데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
데이터 마이그레이션 AWS와 같이하기 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming
 
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
 
AWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDSAWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDS
 
Getting Started with the Hybrid Cloud: Enterprise Backup and Recovery
Getting Started with the Hybrid Cloud: Enterprise Backup and RecoveryGetting Started with the Hybrid Cloud: Enterprise Backup and Recovery
Getting Started with the Hybrid Cloud: Enterprise Backup and Recovery
 
2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개
2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개 2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개
2017 AWS DB Day | Amazon Athena 서비스 최신 기능 소개
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
 
Getting Started with AWS Security
Getting Started with AWS SecurityGetting Started with AWS Security
Getting Started with AWS Security
 
Deep Dive on Amazon RDS (May 2016)
Deep Dive on Amazon RDS (May 2016)Deep Dive on Amazon RDS (May 2016)
Deep Dive on Amazon RDS (May 2016)
 
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
 
Keynote AWS Experience Day Cali
Keynote AWS Experience Day CaliKeynote AWS Experience Day Cali
Keynote AWS Experience Day Cali
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)
數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)
數位媒體雲端儲存案例和技術分享 (AWS Storage Options for Media Industry)
 
Introduction on Amazon EC2
 Introduction on Amazon EC2 Introduction on Amazon EC2
Introduction on Amazon EC2
 
Database migration simple, cross-engine and cross-platform migrations with ...
Database migration   simple, cross-engine and cross-platform migrations with ...Database migration   simple, cross-engine and cross-platform migrations with ...
Database migration simple, cross-engine and cross-platform migrations with ...
 
Getting started with amazon aurora - Toronto
Getting started with amazon aurora - TorontoGetting started with amazon aurora - Toronto
Getting started with amazon aurora - Toronto
 
How to Scale to Millions of Users with AWS
How to Scale to Millions of Users with AWSHow to Scale to Millions of Users with AWS
How to Scale to Millions of Users with AWS
 
Cost Optimization at Scale
Cost Optimization at ScaleCost Optimization at Scale
Cost Optimization at Scale
 
AWS Summit Seoul 2015 - AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...
AWS Summit Seoul 2015 -  AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...AWS Summit Seoul 2015 -  AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...
AWS Summit Seoul 2015 - AWS 최신 서비스 살펴보기 - Aurora, Lambda, EFS, Machine Learn...
 

Viewers also liked

GP Surgery Revit Renders
GP Surgery Revit RendersGP Surgery Revit Renders
GP Surgery Revit RendersLee Slaughter
 
Invest in health entrepreneurship
Invest in health entrepreneurshipInvest in health entrepreneurship
Invest in health entrepreneurshipMoses Talibita
 
Resultados pre auditoria ingles.
Resultados pre auditoria ingles.Resultados pre auditoria ingles.
Resultados pre auditoria ingles.Foro Abierto
 
風が帆を押すとき 日本企業のCRE推進に関する調査 2013年
風が帆を押すとき 日本企業のCRE推進に関する調査 2013年風が帆を押すとき 日本企業のCRE推進に関する調査 2013年
風が帆を押すとき 日本企業のCRE推進に関する調査 2013年JLL
 
JLL ASU Student Housing Report - 2015
JLL ASU Student Housing Report - 2015JLL ASU Student Housing Report - 2015
JLL ASU Student Housing Report - 2015John Cunningham
 
QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...
QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...
QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...Moses Talibita
 
A teosofia do sinal da cruz h.p. blavatsky
A teosofia do sinal da cruz  h.p. blavatskyA teosofia do sinal da cruz  h.p. blavatsky
A teosofia do sinal da cruz h.p. blavatskyRosana Dalla Piazza
 
SRBench Streaming RDF SPARQL Benchmark
SRBench Streaming  RDF SPARQL BenchmarkSRBench Streaming  RDF SPARQL Benchmark
SRBench Streaming RDF SPARQL BenchmarkJean-Paul Calbimonte
 
AN1431T PSpice Model (Free SPICE Model)
AN1431T PSpice Model  (Free SPICE Model)AN1431T PSpice Model  (Free SPICE Model)
AN1431T PSpice Model (Free SPICE Model)Tsuyoshi Horigome
 
Contrato de compra e venda
Contrato de compra e vendaContrato de compra e venda
Contrato de compra e vendaLeomara Andrade
 
GDC 2016 End-to-End Approach to Physically Based Rendering
GDC 2016 End-to-End Approach to Physically Based RenderingGDC 2016 End-to-End Approach to Physically Based Rendering
GDC 2016 End-to-End Approach to Physically Based RenderingWes McDermott
 
The New Industrial Revolution – Where’s next
The New Industrial Revolution – Where’s nextThe New Industrial Revolution – Where’s next
The New Industrial Revolution – Where’s nextJLL
 

Viewers also liked (17)

GP Surgery Revit Renders
GP Surgery Revit RendersGP Surgery Revit Renders
GP Surgery Revit Renders
 
JONELLE_Resume
JONELLE_ResumeJONELLE_Resume
JONELLE_Resume
 
Invest in health entrepreneurship
Invest in health entrepreneurshipInvest in health entrepreneurship
Invest in health entrepreneurship
 
VookAD-Brochure
VookAD-BrochureVookAD-Brochure
VookAD-Brochure
 
Resultados pre auditoria ingles.
Resultados pre auditoria ingles.Resultados pre auditoria ingles.
Resultados pre auditoria ingles.
 
風が帆を押すとき 日本企業のCRE推進に関する調査 2013年
風が帆を押すとき 日本企業のCRE推進に関する調査 2013年風が帆を押すとき 日本企業のCRE推進に関する調査 2013年
風が帆を押すとき 日本企業のCRE推進に関する調査 2013年
 
JLL ASU Student Housing Report - 2015
JLL ASU Student Housing Report - 2015JLL ASU Student Housing Report - 2015
JLL ASU Student Housing Report - 2015
 
Curriculum Vitae
Curriculum Vitae Curriculum Vitae
Curriculum Vitae
 
QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...
QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...
QUESTIONING 2O YEARS OF THE 1995 CONSTITUTION OF THE REPUBLIC OF UGANDA AND A...
 
A teosofia do sinal da cruz h.p. blavatsky
A teosofia do sinal da cruz  h.p. blavatskyA teosofia do sinal da cruz  h.p. blavatsky
A teosofia do sinal da cruz h.p. blavatsky
 
SRBench Streaming RDF SPARQL Benchmark
SRBench Streaming  RDF SPARQL BenchmarkSRBench Streaming  RDF SPARQL Benchmark
SRBench Streaming RDF SPARQL Benchmark
 
iot contest file
iot contest fileiot contest file
iot contest file
 
AD Audit Plus
AD Audit PlusAD Audit Plus
AD Audit Plus
 
AN1431T PSpice Model (Free SPICE Model)
AN1431T PSpice Model  (Free SPICE Model)AN1431T PSpice Model  (Free SPICE Model)
AN1431T PSpice Model (Free SPICE Model)
 
Contrato de compra e venda
Contrato de compra e vendaContrato de compra e venda
Contrato de compra e venda
 
GDC 2016 End-to-End Approach to Physically Based Rendering
GDC 2016 End-to-End Approach to Physically Based RenderingGDC 2016 End-to-End Approach to Physically Based Rendering
GDC 2016 End-to-End Approach to Physically Based Rendering
 
The New Industrial Revolution – Where’s next
The New Industrial Revolution – Where’s nextThe New Industrial Revolution – Where’s next
The New Industrial Revolution – Where’s next
 

Similar to Path to the future #4 - Ingestão, processamento e análise de dados em tempo real

Analyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAnalyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAmazon Web Services
 
Real-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisReal-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisAmazon Web Services
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Web Services
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon KinesisAmazon Web Services
 
Introduction to Amazon Kinesis Firehose - AWS August Webinar Series
Introduction to Amazon Kinesis Firehose - AWS August Webinar SeriesIntroduction to Amazon Kinesis Firehose - AWS August Webinar Series
Introduction to Amazon Kinesis Firehose - AWS August Webinar SeriesAmazon Web Services
 
Introduction to Real-time, Streaming Data and Amazon Kinesis. Streaming Data ...
Introduction to Real-time, Streaming Data and Amazon Kinesis. Streaming Data ...Introduction to Real-time, Streaming Data and Amazon Kinesis. Streaming Data ...
Introduction to Real-time, Streaming Data and Amazon Kinesis. Streaming Data ...Amazon Web Services
 
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Amazon Web Services
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
Path to the future #4 - Real-time data ingestion, processing, and analysis

  • 1. Real-time Streaming Data on AWS: Deep Dive & Best Practices
      • Roy Ben-Alta, Sr. Business Development Manager, AWS
      • Carlos Vinicius, Data Engineer @ OLX
      • September 22, 2016
      • © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Outline
      • Real-time streaming overview
      • Use cases and design patterns
      • Amazon Kinesis deep dive
      • Streaming data ingestion
      • Stream processing
      • Q&A
  • 3. Big Data Evolution: It Is All About the Pace
      • Batch: report
      • Real-time: alerts
      • Prediction: forecast
  • 4. Streaming Data Scenarios Across Verticals
      • Digital Ad Tech/Marketing
          − Accelerated Ingest-Transform-Load: publisher, bidder data aggregation
          − Continuous Metrics Generation: advertising metrics like coverage, yield, and conversion
          − Responsive Data Analysis: user engagement with ads, optimized bid/buy engines
      • IoT
          − Accelerated Ingest-Transform-Load: sensor, device telemetry data ingestion
          − Continuous Metrics Generation: operational metrics and dashboards
          − Responsive Data Analysis: device operational intelligence and alerts
      • Gaming
          − Accelerated Ingest-Transform-Load: online data aggregation, e.g., top 10 players
          − Continuous Metrics Generation: massively multiplayer online game (MMOG) live dashboard
          − Responsive Data Analysis: leader board generation, player-skill match
      • Consumer Online
          − Accelerated Ingest-Transform-Load: clickstream analytics
          − Continuous Metrics Generation: metrics like impressions and page views
          − Responsive Data Analysis: recommendation engines, proactive care
  • 5. Amazon Kinesis Customer Base Diversity
      • 1 billion events/wk from connected devices | IoT
      • 17 PB of game data per season | Entertainment
      • 80 billion ad impressions/day, 30 ms response time | Ad Tech
      • 100 GB/day click streams from 250+ sites | Enterprise
      • 50 billion ad impressions/day, sub-50 ms responses | Ad Tech
      • 10 million events/day | Retail
      • Amazon Kinesis as Databus - Migrate from Kafka to Kinesis | Enterprise
      • Funnel all production events through Amazon Kinesis
  • 6. Streaming Data Challenges: Variety & Velocity
      • Streaming data comes in different types and formats
          − Metering records, logs and sensor data
          − JSON, CSV, TSV
      • Can vary in size from a few bytes to kilobytes or megabytes
      • High velocity and continuous
      • Metering Record:
          { "payerId": "Joe", "productCode": "AmazonS3", "clientProductCode": "AmazonS3", "usageType": "Bandwidth", "operation": "PUT", "value": "22490", "timestamp": "1216674828" }
      • Common Log Entry:
          { 127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 }
      • MQTT Record:
          { "SeattlePublicWater/Kinesis/123/Realtime" – 412309129140 }
      • Syslog Entry:
          { <165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"] }
  • 7. Two Main Processing Patterns
      • Stream processing (real time)
          − Real-time response to events in data streams
          − Examples: proactively detect hardware errors in device logs; notify when inventory drops below a threshold; fraud detection
      • Micro-batching (near real time)
          − Near real-time operations on small batches of events in data streams
          − Examples: aggregate and archive events; monitor performance SLAs
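The two patterns can be contrasted with a minimal Python sketch. It is not taken from the deck: the `(sku, units_in_stock)` event shape, the function names, and the fixed batch size are assumptions chosen to mirror the "inventory drops below a threshold" and "aggregate events" examples.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (sku, units_in_stock).
Event = Tuple[str, int]


def stream_process(events: Iterable[Event], threshold: int,
                   alert: Callable[[str, int], None]) -> None:
    """Stream processing: inspect each event as it arrives and react at once."""
    for sku, stock in events:
        if stock < threshold:
            alert(sku, stock)


def micro_batches(events: Iterable[Event],
                  batch_size: int) -> Iterator[List[Event]]:
    """Micro-batching: group events into small fixed-size batches."""
    batch: List[Event] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def min_stock_per_sku(batch: List[Event]) -> Dict[str, int]:
    """Aggregate one batch: lowest stock level seen for each SKU."""
    lowest: Dict[str, int] = {}
    for sku, stock in batch:
        lowest[sku] = min(stock, lowest.get(sku, stock))
    return lowest
```

The per-event path gives the lowest reaction latency, while the batched path trades some latency for cheaper aggregation and archiving.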
  • 9. Amazon Kinesis: Streaming Data Made Easy. Services make it easy to capture, deliver and process streams on AWS
      • Amazon Kinesis Streams
          − For technical developers
          − Build your own custom applications that process or analyze streaming data
      • Amazon Kinesis Firehose
          − For all developers, data scientists
          − Easily load massive volumes of streaming data into S3, Amazon Redshift and Amazon Elasticsearch
      • Amazon Kinesis Analytics
          − For all developers, data scientists
          − Easily analyze data streams using standard SQL queries
  • 10. Amazon Kinesis Streams: Build your own data streaming applications
      • Easy administration: Simply create a new stream, and set the desired level of capacity with shards. Scale to match your data throughput rate and volume.
      • Build real-time applications: Perform continual processing on streaming big data using Kinesis Client Library (KCL), Apache Spark/Storm, AWS Lambda, and more.
      • Low cost: Cost-efficient for workloads of any scale.
  • 11. Amazon Kinesis Streams - GA 2013: Fully managed service for real-time processing of streaming data
      • Real-Time Streaming Data Ingestion
      • Custom-built Streaming Applications (KCL)
      • Inexpensive: $0.014 per 1,000,000 PUT Payload Units
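To make the quoted PUT price concrete, the following Python sketch estimates a monthly PUT charge. It uses the slide's 2016 figure of $0.014 per 1,000,000 PUT Payload Units and assumes that one payload unit is a 25 KB chunk of a record (taken here as 25 × 1024 bytes). The function names are hypothetical, and shard-hour charges, which are billed separately, are not included.

```python
import math

PRICE_PER_MILLION_UNITS = 0.014   # USD, figure quoted on the slide (2016)
PAYLOAD_UNIT_BYTES = 25 * 1024    # assumption: one PUT Payload Unit = 25 KB chunk


def put_payload_units(record_bytes: int) -> int:
    """Number of payload units billed for one record (at least one)."""
    return max(1, math.ceil(record_bytes / PAYLOAD_UNIT_BYTES))


def monthly_put_cost(records_per_second: float, record_bytes: int,
                     days: int = 30) -> float:
    """Estimated PUT charge in USD; shard-hour charges are not included."""
    units = put_payload_units(record_bytes) * records_per_second * 86_400 * days
    return units / 1_000_000 * PRICE_PER_MILLION_UNITS
```

Under these assumptions, 1,000 records per second of 5 KB each comes to roughly $36 per month in PUT charges.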
  • 12. Amazon Kinesis Streams: Managed service for real-time streaming
      • [Architecture diagram. Labels: Data Sources; AWS Endpoint; Shard 1, Shard 2, Shard N across three Availability Zones; App.1 [Aggregate & De-Duplicate]; App.2 [Metric Extraction]; App.3 [Sliding Window Analysis]; App.4 [Machine Learning]; Amazon S3; Amazon Redshift; AWS Lambda; Amazon EMR]
  • 13. Amazon Kinesis Streams: Managed ability to capture and store data
      • Streams are made of shards
      • Each shard ingests up to 1 MB/sec and 1,000 records/sec
      • Each shard emits up to 2 MB/sec
      • All data is stored for 24 hours by default; storage can be extended for up to 7 days
      • Scale Kinesis streams using the scaling utility
      • Replay data inside of the 24-hour window
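The per-shard limits above translate directly into a sizing calculation. The Python sketch below is only an illustration: the function name, the `consumers` parameter (each consuming application is assumed to read the full stream), and the `headroom` fraction are assumptions, not part of any AWS API.

```python
import math

# Per-shard limits as stated on the slide.
INGEST_MB_PER_SHARD = 1.0
INGEST_RECORDS_PER_SHARD = 1000
EGRESS_MB_PER_SHARD = 2.0


def shards_needed(ingest_mb_per_sec: float, records_per_sec: float,
                  consumers: int = 1, headroom: float = 0.0) -> int:
    """Smallest shard count that satisfies all three per-shard limits.

    consumers: number of applications that each read the whole stream.
    headroom:  extra fraction of capacity (0.25 means 25% spare).
    """
    by_bytes = ingest_mb_per_sec / INGEST_MB_PER_SHARD
    by_records = records_per_sec / INGEST_RECORDS_PER_SHARD
    by_egress = ingest_mb_per_sec * consumers / EGRESS_MB_PER_SHARD
    required = max(by_bytes, by_records, by_egress) * (1 + headroom)
    return max(1, math.ceil(required))
```

Because a shard emits twice what it ingests, egress becomes the binding limit once more than two applications read the full stream.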
  • 14. Amazon Kinesis Streams: Year in Review 2016
      • Lambda and Spark Streaming support
      • Extended Retention
      • Shard-Level Metrics
      • Time-based seek
  • 16. Amazon Kinesis Firehose: Load massive volumes of streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch
      • Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch without writing an application or managing infrastructure.
      • Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 seconds using simple configurations.
      • Seamless elasticity: Seamlessly scales to match data throughput without intervention.
      • Workflow:
          − Capture and submit streaming data to Firehose
          − Firehose loads streaming data continuously into S3, Amazon Redshift and Amazon Elasticsearch
          − Analyze streaming data using your favorite BI tools
  • 17. Amazon Kinesis Firehose: Year in Review & 2016 Roadmap
      • Kinesis Agent and log transformation
      • Error Reporting and Troubleshooting
      • Delivery for S3, Redshift and Elasticsearch
  • 18. Amazon Kinesis Firehose vs. Amazon Kinesis Streams
      • Amazon Kinesis Streams is for use cases that require custom processing per incoming record, sub-second processing latency, and a choice of stream processing frameworks.
      • Amazon Kinesis Firehose is for use cases that require zero administration, the ability to use existing analytics tools based on Amazon S3, Amazon Redshift and Amazon Elasticsearch, and a data latency of 60 seconds or higher.
  • 19. Amazon Kinesis Analytics
      • Apply SQL on streams: Easily connect to a Kinesis Stream or Firehose Delivery Stream and apply SQL skills.
      • Build real-time applications: Perform continual processing on streaming big data with sub-second processing latencies.
      • Easy scalability: Elastically scales to match data throughput.
      • Workflow:
          − Connect to Kinesis streams, Firehose delivery streams
          − Run standard SQL queries against data streams
          − Kinesis Analytics can send processed data to analytics tools so you can create alerts and respond in real time
  • 20. Use SQL to build real-time applications
      • Connect to streaming source
      • Easily write SQL code to process streaming data
      • Continuously deliver SQL results
  • 23. Putting Data into Amazon Kinesis Streams
      Determine your partition key strategy
        • Managed buffer or streaming MapReduce job
        • Ensure high cardinality of partition keys across your shards
      Provision adequate shards
        • For ingress needs
        • For egress needs of all consuming applications, especially if more than two applications read simultaneously
        • Include headroom for catching up with data in the stream
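The provisioning guidance on this slide can be sketched as a small estimate. The per-shard limits used here (1 MB/s or 1,000 records/s in, 2 MB/s out) are the published Amazon Kinesis Streams limits for shared-throughput consumers; the function name `estimate_shards`, the `headroom` factor, and the sample workload are illustrative assumptions, not part of the talk.

```python
import math

# Per-shard limits for Amazon Kinesis Streams (shared-throughput consumers).
INGRESS_MB_PER_SHARD = 1.0
INGRESS_RECORDS_PER_SHARD = 1000
EGRESS_MB_PER_SHARD = 2.0


def estimate_shards(ingress_mb_s, records_s, consumers, headroom=1.25):
    """Estimate a shard count; `headroom` is a hypothetical catch-up margin."""
    by_bytes = ingress_mb_s / INGRESS_MB_PER_SHARD
    by_records = records_s / INGRESS_RECORDS_PER_SHARD
    # Every consuming application reads the full stream, so egress grows
    # with the number of consumers.
    by_egress = (ingress_mb_s * consumers) / EGRESS_MB_PER_SHARD
    return max(1, math.ceil(max(by_bytes, by_records, by_egress) * headroom))


# Made-up workload: 4 MB/s, 2,500 records/s, three consuming applications.
print(estimate_shards(ingress_mb_s=4.0, records_s=2500, consumers=3))
```

With one or two consumers the ingress limit dominates; from the third consumer on, egress becomes the sizing constraint, which is why the slide singles out "more than two" applications.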
  • 24. Putting Data into Amazon Kinesis
      Amazon Kinesis Agent (supports pre-processing)
        • http://docs.aws.amazon.com/firehose/latest/dev/writing-with-agents.html
      Pre-batch before Puts for better efficiency
        • Consider Flume or Fluentd as collectors/agents
        • See https://github.com/awslabs/aws-fluent-plugin-kinesis
      Make a tweak to your existing logging
        • log4j appender option
        • See https://github.com/awslabs/kinesis-log4j-appender
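A minimal sketch of the "pre-batch before Puts" idea, assuming the documented PutRecords limits of 500 records and 5 MB per request. The helper name `batch_records` and the sample events are hypothetical; no AWS call is made here.

```python
MAX_RECORDS_PER_REQUEST = 500            # PutRecords limit on record count
MAX_BYTES_PER_REQUEST = 5 * 1024 * 1024  # PutRecords limit on total payload size


def batch_records(records):
    """Group (partition_key, data_bytes) pairs into PutRecords-sized batches."""
    batch, batch_bytes = [], 0
    for key, data in records:
        # The partition key counts toward the request size along with the data.
        size = len(key.encode("utf-8")) + len(data)
        if batch and (len(batch) == MAX_RECORDS_PER_REQUEST
                      or batch_bytes + size > MAX_BYTES_PER_REQUEST):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"PartitionKey": key, "Data": data})
        batch_bytes += size
    if batch:
        yield batch


# Made-up click events spread over 50 hypothetical user keys.
events = [(f"user-{i % 50}", b'{"clicks": 1}') for i in range(1200)]
batches = list(batch_records(events))
print([len(b) for b in batches])
```

Each batch has the shape that the `Records` parameter of boto3's `put_records` call expects, so a producer could send one request per batch instead of one request per event.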
  • 25. Amazon Kinesis Producer Library • Writes to one or more Amazon Kinesis streams with automatic, configurable retry mechanism • Collects records and uses PutRecords to write multiple records to multiple shards per request • Aggregates user records to increase payload size and improve throughput • Integrates seamlessly with KCL to de-aggregate batched records • Use Amazon Kinesis Producer Library with AWS Lambda (New!) • Submits Amazon CloudWatch metrics on your behalf to provide visibility into producer performance
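To illustrate what a retry mechanism around PutRecords involves, here is a hand-rolled sketch; it is not how the Amazon Kinesis Producer Library is implemented. It relies only on the documented PutRecords response shape, where result entries line up with the request and failed entries carry an `ErrorCode`. The names `put_with_retry` and `make_flaky_stub` are hypothetical, and the stub stands in for a real client call.

```python
import time


def put_with_retry(put_records, records, max_attempts=3, backoff_seconds=0.1):
    """Send records, resending only the entries reported as failed."""
    pending = list(records)
    for attempt in range(max_attempts):
        response = put_records(Records=pending)
        # Result entries line up with the request; failed ones carry an ErrorCode.
        pending = [record for record, result in zip(pending, response["Records"])
                   if "ErrorCode" in result]
        if not pending:
            break
        time.sleep(backoff_seconds * 2 ** attempt)  # exponential backoff
    return pending  # records still unsent after all attempts


def make_flaky_stub():
    """Hypothetical stand-in for a client call: rejects odd-indexed records once."""
    state = {"calls": 0}

    def stub(Records):
        state["calls"] += 1
        results = []
        for index, _ in enumerate(Records):
            if state["calls"] == 1 and index % 2 == 1:
                results.append({"ErrorCode": "ProvisionedThroughputExceededException"})
            else:
                results.append({"SequenceNumber": str(index),
                                "ShardId": "shardId-000000000000"})
        failed = sum("ErrorCode" in result for result in results)
        return {"FailedRecordCount": failed, "Records": results}

    return stub


records = [{"PartitionKey": f"key-{i}", "Data": b"x"} for i in range(4)]
leftover = put_with_retry(make_flaky_stub(), records, backoff_seconds=0)
print(leftover)
```

With a real client, the `put_records` argument could be something like `functools.partial(client.put_records, StreamName=...)`, since the service call also needs the stream name.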
  • 26. Record Order and Multiple Shards
      Unordered processing
        • Randomize the partition key to distribute events over many shards and use multiple workers
      Exact order processing
        • Control the partition key to ensure events are grouped into the same shard and read by the same worker
      Need both? Use a global sequence number
      (Diagram: Producer; Get Global Sequence; Unordered Stream; Campaign Centric Stream; Fraud Inspection Stream; Get Event Metadata)
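The effect of the partition key on shard placement can be sketched as follows. Kinesis maps each partition key to a 128-bit hash key with MD5 and routes the record to the shard whose hash key range contains it; the sketch assumes the shards split that range evenly, which is a simplification. The function name `shard_for_key` and the key `campaign-42` are hypothetical.

```python
import hashlib
import uuid

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_key(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    hash_key = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return hash_key * shard_count // HASH_SPACE


# Exact order: a fixed key (a hypothetical campaign ID) always lands on one shard.
ordered = {shard_for_key("campaign-42", 4) for _ in range(100)}

# Unordered: random keys spread events over the shards.
spread = {shard_for_key(str(uuid.uuid4()), 4) for _ in range(1000)}

print(len(ordered), sorted(spread))
```

A controlled key therefore buys per-key ordering at the cost of possible hot shards, while random keys give even load but no ordering across records.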
  • 27. Sample Code for Scaling Shards
      java -cp KinesisScalingUtils.jar-complete.jar -Dstream-name=MyStream -Dscaling-action=scaleUp -Dcount=10 -Dregion=eu-west-1 ScalingClient
      Options:
        • stream-name - The name of the stream to be scaled
        • scaling-action - The action to be taken to scale. Must be one of "scaleUp", "scaleDown" or "resize"
        • count - Number of shards by which to absolutely scale up or down, or resize
      See https://github.com/awslabs/amazon-kinesis-scaling-utils
  • 28. Amazon Kinesis Stream Processing
  • 29. Amazon Kinesis Client Library
      • Build Kinesis applications with the Kinesis Client Library (KCL)
      • Open source client library available for Java, Ruby, Python, and Node.js developers
      • Deploy on your EC2 instances
      • A KCL application includes three components:
          1. Record Processor Factory: creates the record processor
          2. Record Processor: processing unit that processes data from a shard in Amazon Kinesis Streams
          3. Worker: processing unit that maps to each application instance
  • 30. State Management with Kinesis Client Library • One record processor maps to one shard and processes data records from that shard • One worker maps to one or more record processors • Balances shard-worker associations when worker / instance counts change • Balances shard-worker associations when shards split or merge
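The shard-to-worker balancing described on this slide can be pictured with a deliberately simplified sketch. KCL itself coordinates workers through leases that it keeps in a DynamoDB table; the round-robin function below is only a stand-in for that behaviour, and `assign_shards` and the worker names are hypothetical.

```python
def assign_shards(shard_ids, worker_ids):
    """Spread shards over workers round-robin; each shard gets exactly one worker."""
    assignment = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(sorted(shard_ids)):
        worker = worker_ids[index % len(worker_ids)]
        assignment[worker].append(shard)
    return assignment


shards = [f"shardId-{n:012d}" for n in range(4)]

# Two workers: each one runs two record processors.
print(assign_shards(shards, ["worker-a", "worker-b"]))

# A third worker joins: the same four shards are redistributed.
print(assign_shards(shards, ["worker-a", "worker-b", "worker-c"]))
```

The invariant to notice is that every shard has exactly one record processor at a time, while a worker may host several.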
  • 31. Other Options • Third-party connectors (for example, Splunk) • AWS IoT platform • Amazon EMR with Apache Spark, Pig or Hive • AWS Lambda
  • 32. Apache Spark and Amazon Kinesis Streams
      • Apache Spark is an in-memory analytics cluster using RDDs for fast processing
      • Spark Streaming can read directly from an Amazon Kinesis stream
      • Amazon Software License linking: add the ASL dependency to your SBT/Maven project, artifactId = spark-streaming-kinesis-asl_2.10
      Example: counting tweets on a sliding window (simplified slide code; the real KinesisUtils.createStream call takes more arguments)
        KinesisUtils.createStream("twitter-stream")
          .filter(_.getText.contains("Open-Source"))
          .countByWindow(Seconds(5))
  • 33. Common Integration Pattern with Amazon EMR: Tumbling Window Reporting
      (Diagram: streaming input from Amazon Kinesis Streams -> tumbling/fixed window aggregation on Amazon EMR -> periodic output -> Amazon Redshift, loaded with COPY from Amazon EMR)
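Tumbling (fixed, non-overlapping) window aggregation can be sketched in a few lines of plain Python. This illustrates the windowing logic only, not the EMR or Spark job from the talk; the function name and the sample events are made up.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) for fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, key in events:
        # Each timestamp belongs to exactly one window.
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return counts


# (epoch_seconds, page) pairs; values are made up for illustration.
events = [(0, "home"), (12, "home"), (59, "search"), (60, "home"), (61, "home")]
counts = tumbling_window_counts(events, 60)
for (window_start, key), n in sorted(counts.items()):
    print(window_start, key, n)
```

Once a window closes, its rows are final, which is what makes a periodic bulk load of each window's output into a warehouse straightforward.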
  • 34. OLX Case Study: Carlos Vinicius, Data Engineer @ OLX
  • 35. “Data streaming production ready in no time”. Present in the country since 2010, OLX is Brazil's biggest classifieds website and app, with more than 14 million active ads and 5 million messages exchanged daily via chat. “Being able to evaluate new ideas fast and efficiently is for us AWS greatest benefit.” - Bernardo Carneiro, Director of Technology
  • 36. The challenge: Build a scalable architecture that supports a growing volume of data. Be able to develop and evaluate results quickly. Cost.
  • 38. Amazon Kinesis Benefits: 450 MB per day; 4x higher success rate; 3 weeks development-to-production time; 30 USD monthly cost; requests per minute: 5,500 peak, 3,500 average; daily load: 5M records per day
  • 39. Amazon Kinesis Streams with AWS Lambda
  • 40. AWS Lambda + Amazon Kinesis: Kinesis captures the stream and Lambda processes the stream
      Data Input | Action | Data Output
      IT application activity | Audit | SNS
      Metering records | Condense | Redshift
      Change logs | Backup | S3
      Financial data | Store | RDS
      Transaction orders | Process | SQS
      Server health metrics | Monitor | EC2
      User clickstream | Analyze | EMR
      IoT device data | Respond | Backend endpoint
      Custom data | Custom action | Custom application
  • 41. Common Architecture: Lambda + Kinesis Data Processing for Data Storage/Analysis
      • Use Lambda to process and “fan out” to other AWS services, such as storage, database, and BI/analytics
      • An Amazon Kinesis stream can continuously capture and store terabytes of data per hour from hundreds of thousands of sources
      • Grant AWS Lambda permissions for the relevant stream actions via IAM (execution role) during function creation
  • 42. Background: Atom Data Flow Management is a data infrastructure solution that allows clients to customize their data flow according to their business needs. 365Scores is a leading sports app that offers users live scores, match statistics, news, videos, and highlights across 10 sports and over 1K competitions worldwide. Key figures: 350B data events every month; 15 dedicated data professionals; 10M+ total installs; 375K five-star reviews; $6.7M funding raised
  • 45. Conclusion • Amazon Kinesis offers a managed service to build applications, streaming data ingestion, and continuous processing • Ingest aggregated data using the Amazon Kinesis Producer Library • Process data using the Amazon Kinesis Connector Library and open source connectors • Determine your partition key strategy • Try out Amazon Kinesis at http://aws.amazon.com/kinesis/
  • 46. Reference
      Technical documentation
        • Amazon Kinesis Agent
        • Amazon Kinesis Streams and Spark Streaming
        • Amazon Kinesis Producer Library Best Practice
        • Amazon Kinesis Firehose and AWS Lambda
        • Building Near Real-Time Discovery Platform with Amazon Kinesis
      Public case studies
        • Comcast Use Case
        • Glu Mobile: Real-Time Analytics
        • Hearst Publishing: Clickstream Analytics
        • How Sonos Leverages Amazon Kinesis
        • Nordstrom Online Stylist