© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
July 13, 2016
Streaming Data Processing
with Amazon Kinesis
Alan Lewis, Principal Architect, Realtor.com
Ray Zhu, Sr. Product Manager, AWS
What to expect from this session
Amazon Kinesis: Getting Started with streaming data on AWS
• Streaming scenarios
• Amazon Kinesis Streams overview
• Amazon Kinesis Firehose overview
• Firehose getting started experience
• Amazon Kinesis at Realtor.com
Need to go a bit faster
Streaming data scenarios across segments
Scenarios: (1) Accelerated Ingest-Transform-Load, (2) Continual Metrics Generation, (3) Responsive Data Analysis
Data types: IT logs, application logs, social media / clickstreams, sensor or device data, market data
Ad / Marketing Tech
• (1) Publisher, bidder data aggregation
• (2) Advertising metrics like coverage, yield, conversion
• (3) Analytics on user engagement with ads, optimized bid / buy engines
IoT
• (1) Sensor, device telemetry data ingestion
• (2) IT operational metrics dashboards
• (3) Sensor operational intelligence, alerts, and notifications
Gaming
• (1) Online customer engagement data aggregation
• (2) Consumer engagement metrics for level success, transition rates, CTR
• (3) Clickstream analytics, leaderboard generation, player-skill match engines
Consumer Engagement
• (1) Online customer engagement data aggregation
• (2) Consumer engagement metrics like page views, CTR
• (3) Clickstream analytics, recommendation engines
Amazon Kinesis
Services make it easy to capture, deliver, and process streams on AWS
• Amazon Kinesis Streams: Stores data as a continuous replayable stream for custom applications
• Amazon Kinesis Firehose: Loads streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service
• Amazon Kinesis Analytics (in preview): Analyzes data streams using standard SQL queries
Amazon Kinesis Streams
Amazon Kinesis Streams
Store data as a continuous stream
Easy administration: Simply create a new stream and set the desired level of capacity
with shards. Scale to match your data throughput rate and volume.
Build real-time applications: Perform continual processing on streaming big data using
Amazon Kinesis Client Library (KCL), Apache Spark/Storm, AWS Lambda, and more.
Low cost: Cost-efficient for workloads of any scale.
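As a rough illustration of sizing capacity with shards, the sketch below estimates a shard count from expected write and read throughput. The per-shard figures it uses (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads) are the published limits for provisioned streams and do not appear on the slide; the function name and the sample workload are hypothetical.

```python
import math

# Per-shard limits for a provisioned Kinesis stream (assumed here; not from the slides).
WRITE_MB_PER_SEC = 1.0
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2.0


def estimate_shards(records_per_sec, avg_record_kb, consumers=1):
    """Return the smallest shard count that covers write and read throughput."""
    write_mb = records_per_sec * avg_record_kb / 1024.0
    read_mb = write_mb * consumers  # every consumer reads the full stream
    needed = max(
        write_mb / WRITE_MB_PER_SEC,
        records_per_sec / WRITE_RECORDS_PER_SEC,
        read_mb / READ_MB_PER_SEC,
    )
    return max(1, math.ceil(needed))


if __name__ == "__main__":
    # Hypothetical workload: 1,500 records/s of 2 KB each, read by two applications.
    print(estimate_shards(1500, 2, consumers=2))
```

Whichever of the three limits is hit first decides the shard count, so a stream of many small records can be bound by the record rate rather than by bytes.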
Amazon Kinesis Firehose
Amazon Kinesis Firehose
Load massive volumes of streaming data into destinations
Zero administration: Capture and deliver streaming data into Amazon S3, Amazon
Redshift, and other destinations without writing an application or managing infrastructure.
Direct-to-data store integration: Batch, compress, and encrypt streaming data for
delivery into data destinations in as little as 60 secs using simple configurations.
Seamless elasticity: Seamlessly scale to match data throughput without intervention.
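To make the batch-and-compress idea concrete, here is a small local sketch of a buffer that flushes when either a size threshold or a time interval is reached and gzip-compresses each batch. It only illustrates the buffering concept and is not Firehose's implementation; the class name, the thresholds, and the injectable clock are assumptions made for the example.

```python
import gzip
import time


class BatchBuffer:
    """Illustrative buffer: flush on size or age, gzip each batch (hypothetical helper)."""

    def __init__(self, max_bytes=1024, max_age_secs=60, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_age_secs = max_age_secs
        self.clock = clock
        self.records = []
        self.size = 0
        self.started = None

    def add(self, record: bytes):
        """Add a record; return a compressed batch if a threshold was hit, else None."""
        if self.started is None:
            self.started = self.clock()
        self.records.append(record)
        self.size += len(record)
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.started >= self.max_age_secs
        return self.flush() if (too_big or too_old) else None

    def flush(self):
        """Compress and return everything buffered so far, then reset."""
        if not self.records:
            return None
        batch = gzip.compress(b"\n".join(self.records) + b"\n")
        self.records, self.size, self.started = [], 0, None
        return batch


if __name__ == "__main__":
    buf = BatchBuffer(max_bytes=30)
    for i in range(5):
        out = buf.add(b'{"n": %d}' % i)
        if out is not None:
            print(len(gzip.decompress(out).splitlines()), "records delivered")
```

In this sketch the age check only runs when a record arrives; a real delivery service would also flush on a timer so that a quiet stream still delivers within the interval.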
• Capture and submit streaming data to Firehose
• Firehose loads streaming data continuously into Amazon S3 and Amazon Redshift
• Analyze streaming data using your favorite BI tools
Amazon Kinesis Firehose
Customer Experience
Amazon Kinesis Firehose console experience
Unified console experience for Firehose and Streams
Amazon Kinesis Firehose console (Amazon S3)
Create fully managed resources for delivery without building an app
Amazon Kinesis Firehose console (Amazon S3)
Configure data delivery options simply using the console
Amazon Kinesis Firehose console (Amazon Redshift)
Configure data delivery to Amazon Redshift simply using the console
Amazon Kinesis Firehose console (Amazon ES)
Configure data delivery to Amazon ES simply using the console
Amazon Kinesis Firehose monitoring
Visibility into and transparency of data delivery
Amazon Kinesis Firehose monitoring
Error logging for troubleshooting delivery failures
Amazon Kinesis Firehose pricing
Simple, pay-as-you-go, and no upfront costs
Dimension Value
Per 1 GB of data ingested $0.035
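A quick worked example of the ingest price above. The workload numbers are hypothetical, 1 GB is taken as 1024 × 1024 KB, and the calculation ignores any record-size rounding and any charges from the destination services.

```python
PRICE_PER_GB_USD = 0.035  # per 1 GB ingested, from the pricing slide


def monthly_ingest_cost(records_per_day, avg_record_kb, days=30):
    """Estimate the Firehose ingest charge for a period of steady traffic."""
    gb_per_day = records_per_day * avg_record_kb / (1024 * 1024)
    return gb_per_day * days * PRICE_PER_GB_USD


if __name__ == "__main__":
    # Hypothetical workload: 5 million records per day at 1 KB each.
    print(f"${monthly_ingest_cost(5_000_000, 1):.2f} per month")
```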
Kinesis at Realtor.com
What I’d like you to take away
Amazon Kinesis is:
• Simple, reliable, and offers high performance
• A transformative building block with broad applicability
• An enabler for “real time everywhere”
About Realtor.com
First national US real estate
search site
Most accurate real estate
content
Gets data from 99% of MLSs
55 million unique users in April
Realtor.com cloud strategy
Going “all in” on cloud, most
on AWS
About ½ done – BI, search,
geo services, photos all in
AWS now
Strong bias towards AWS
managed services
Customer problem
My listings get lots of traffic at
start, but less over time
I only want people searching
for relevant listings
I want to get more brand
exposure in search
Solution: “Turbo listings” product
Native ad product that
provides customers more
exposure in search
Placements are 100% relevant
and are like any other listing
Shows the agent profile photo
in search
Turbo technical requirements
Extreme availability and throughput
Multiple systems, both inside and outside VPCs (and
inside/outside AWS)
Auditable, secure billing database
Why Kinesis?
Great performance
Multiproducer, multiconsumer queues
Worry-free managed service
Turbo architecture
[Architecture diagram. Labels: AWS; mobile native apps; Campaign Manager with Create Campaign, Update Campaign, Delete Campaign, and Decrement impressions APIs; decision checks "Campaign expired?" and "Count reached zero?" with True/False branches.]
Impression data
{
"campaign_id": "01d329aa-9eb2-426c-9b7b-4877a32fb176",
"id": "a34f271f-058d-47ba-9d45-8140261742a0",
"listing_id": 593893632,
"property_id": 1258201259,
"advertiser_id": "8675309",
"event_type": "turbo_search_impression",
"producer": "fesl",
"client_source": "rdc_web",
"client_version": "8.0",
"page_variation": "list_view",
"timestamp": "2016-03-02T00:47:25+00:00",
"user_agent": "..."
}
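Because the event is flat JSON, a consumer can check it with very little code. The sketch below parses an impression event and verifies a handful of fields; which fields are required and how they are typed is an assumption for illustration, not Realtor.com's actual validation rules.

```python
import json
from datetime import datetime

# Assumed required fields and types, based only on the sample event above.
REQUIRED = {
    "campaign_id": str,
    "id": str,
    "listing_id": int,
    "event_type": str,
    "timestamp": str,
}


def validate_event(raw: str):
    """Return (event, errors); errors is empty when the event looks well formed."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return None, ["event must be a JSON object"]
    errors = []
    for field, expected in REQUIRED.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            errors.append(f"wrong type for {field}")
    if isinstance(event.get("timestamp"), str):
        try:
            datetime.fromisoformat(event["timestamp"])
        except ValueError:
            errors.append("timestamp is not ISO 8601")
    return event, errors
```

Returning a list of errors rather than raising on the first one makes it easier to log everything that is wrong with a rejected event.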
Impression tracking flow
[Flow diagram. Components: Amazon EC2, Amazon Kinesis Streams, AWS Lambda, Amazon RDS, Campaign manager. Step labels: pull events; post to web service; decrement in DB.]
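The flow above can be sketched as a Lambda handler that decodes a batch of Kinesis records and decrements a counter per campaign. The event shape (`Records[].kinesis.data`, base64-encoded) is the standard payload Lambda receives from a Kinesis stream; the in-memory `remaining` dictionary and `decrement_impression` are hypothetical stand-ins for the campaign manager web service and its database.

```python
import base64
import json

# Hypothetical stand-in for the campaign manager's database.
remaining = {"01d329aa-9eb2-426c-9b7b-4877a32fb176": 3}


def decrement_impression(campaign_id):
    """Decrement the remaining impressions for a campaign, never going below zero."""
    if remaining.get(campaign_id, 0) > 0:
        remaining[campaign_id] -= 1
    return remaining.get(campaign_id, 0)


def handler(event, context=None):
    """Process one batch of Kinesis records delivered to Lambda."""
    processed = 0
    for record in event.get("Records", []):
        # Kinesis record data reaches Lambda base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        impression = json.loads(payload)
        if impression.get("event_type") == "turbo_search_impression":
            decrement_impression(impression["campaign_id"])
            processed += 1
    return {"processed": processed}
```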
Billing flow
[Flow diagram. Components: Amazon Kinesis Streams, AWS Lambda (several functions), Amazon Kinesis Firehose, Amazon S3, Amazon Redshift, Amazon DynamoDB, AWS KMS, private subnet. Step labels: event source; validate event; Firehose PutRecord; Firehose destination; SSE-KMS encryption on Amazon S3; Amazon S3 notification; status tracking; COPY command; KMS encryption on Amazon Redshift; data transfer in JSON; event data in JSON; Redshift: 15-minute batches.]
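One possible reading of the last hop in this flow: a Lambda function receives the Amazon S3 notification for a delivered file and builds a Redshift COPY statement for it. This is a sketch under assumptions. The table name, the IAM role ARN, and the `JSON 'auto'` and `GZIP` options are placeholders chosen for illustration, and the slides do not show how the statement is built or executed.

```python
from urllib.parse import unquote_plus


def copy_statements(s3_event, table="impressions",
                    iam_role="arn:aws:iam::123456789012:role/example-copy-role"):
    """Build one Redshift COPY statement per object in an S3 notification event."""
    statements = []
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Plain string formatting: only suitable for trusted bucket and key names.
        statements.append(
            f"COPY {table} FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{iam_role}' FORMAT AS JSON 'auto' GZIP;"
        )
    return statements
```

Because the event fields are simple key-value JSON, the `'auto'` option can map them to same-named table columns without a separate JSONPaths file.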
Outcomes: Huge scale
Serving millions of impressions per day on 2 Kinesis
shards
Tested up to 20x current site traffic
Basically, we couldn’t break it
Outcomes: Great performance
Latencies in single or low
double digit milliseconds
Events are processed in small
batches for efficiency
For our purposes, Kinesis
gives us real time data
streaming
Lessons learned
Complexity with Amazon Redshift and private subnets
Must consider what dedupe behavior you need
A simple key-value JSON data structure pays dividends
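On the dedupe point above: a stream consumer can see the same record more than once (for example after a producer retry), so one common approach is to make processing idempotent by tracking event IDs. The sketch below assumes the `id` field of the impression event is unique per event and keeps recently seen IDs in memory; a real system would need a durable store, and the class name and capacity are hypothetical.

```python
from collections import OrderedDict


class Deduper:
    """Remember the most recent event IDs and reject repeats (illustrative only)."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.seen = OrderedDict()

    def is_new(self, event_id):
        """Return True the first time an ID is seen, False for duplicates."""
        if event_id in self.seen:
            self.seen.move_to_end(event_id)
            return False
        self.seen[event_id] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)  # evict the oldest ID
        return True


if __name__ == "__main__":
    dedupe = Deduper()
    events = [{"id": "a"}, {"id": "b"}, {"id": "a"}]
    unique = [e for e in events if dedupe.is_new(e["id"])]
    print(len(unique))
```

Whether a bounded window like this is enough depends on the behavior you need: a billing path may require exact deduplication, while a dashboard can usually tolerate an occasional repeat.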
Future: Real time pipeline
Real time is the pinnacle
Collect data on page 1, and
act on page 2
What we’ve built on Kinesis
with the turbo feature is the
starting point for us
Photo by @snordq on Flickr. Creative Commons License
What I’d like you to take away
Amazon Kinesis is:
• Simple, reliable, and offers high performance
• A transformative building block with broad applicability
• An enabler for “real time everywhere”
One final thing…
Hiring! Search for “realtor.com careers” (careers.move.com)
Software engineers, QA engineers, data scientists, product
managers, and project managers
In Santa Clara, Ventura County, Vancouver, Canada, and
Morgantown, WV
Thank you: Eddy Luten, Viren Nagtode, and Sonal Shirke
Thank you!

More Related Content

Viewers also liked

AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...Amazon Web Services
 
Actividad2guiadeaprendizaje1 140526102933-phpapp02
Actividad2guiadeaprendizaje1 140526102933-phpapp02Actividad2guiadeaprendizaje1 140526102933-phpapp02
Actividad2guiadeaprendizaje1 140526102933-phpapp02harvey rosero
 
Trabajo final
Trabajo finalTrabajo final
Trabajo finalbarrameda
 
Zend Solution Brief 0909 Web
Zend Solution Brief 0909 WebZend Solution Brief 0909 Web
Zend Solution Brief 0909 WebNajeem Illyas
 
clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...
clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...
clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...RebecaCruzPerez
 
Webinar "eMail Marketing en eCommerce: El desafío del Retail"
Webinar "eMail Marketing en eCommerce: El desafío del Retail" Webinar "eMail Marketing en eCommerce: El desafío del Retail"
Webinar "eMail Marketing en eCommerce: El desafío del Retail" eCommerce Institute
 
Doha 2006 sukan asia
Doha 2006 sukan asiaDoha 2006 sukan asia
Doha 2006 sukan asiaOlimpikini
 
Cara de Um "Focinho" de Outro!
Cara de Um "Focinho" de Outro!Cara de Um "Focinho" de Outro!
Cara de Um "Focinho" de Outro!auricola
 
Culegere de teste competenta lingvistica
Culegere de teste competenta lingvisticaCulegere de teste competenta lingvistica
Culegere de teste competenta lingvisticatdaniela2005br
 
Casos reales de Native Advertising en medios españoles: el Huffington Post. B...
Casos reales de Native Advertising en medios españoles: el Huffington Post. B...Casos reales de Native Advertising en medios españoles: el Huffington Post. B...
Casos reales de Native Advertising en medios españoles: el Huffington Post. B...NativeAD
 
Espo neumo tx asma
Espo neumo tx asmaEspo neumo tx asma
Espo neumo tx asmaChava BG
 
Presentación Red de Mujeres Empresarias del Medio Rural
Presentación Red de Mujeres Empresarias del Medio RuralPresentación Red de Mujeres Empresarias del Medio Rural
Presentación Red de Mujeres Empresarias del Medio RuralYolanda Hernández
 

Viewers also liked (16)

AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
 
Ofertas DICIEMBRE 2014
Ofertas DICIEMBRE 2014Ofertas DICIEMBRE 2014
Ofertas DICIEMBRE 2014
 
Actividad2guiadeaprendizaje1 140526102933-phpapp02
Actividad2guiadeaprendizaje1 140526102933-phpapp02Actividad2guiadeaprendizaje1 140526102933-phpapp02
Actividad2guiadeaprendizaje1 140526102933-phpapp02
 
spring
springspring
spring
 
Trabajo final
Trabajo finalTrabajo final
Trabajo final
 
Zend Solution Brief 0909 Web
Zend Solution Brief 0909 WebZend Solution Brief 0909 Web
Zend Solution Brief 0909 Web
 
clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...
clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...
clasificación de las dinámicas de grupos unicrece tercer semestre grupo pedag...
 
Webinar "eMail Marketing en eCommerce: El desafío del Retail"
Webinar "eMail Marketing en eCommerce: El desafío del Retail" Webinar "eMail Marketing en eCommerce: El desafío del Retail"
Webinar "eMail Marketing en eCommerce: El desafío del Retail"
 
Doha 2006 sukan asia
Doha 2006 sukan asiaDoha 2006 sukan asia
Doha 2006 sukan asia
 
Cara de Um "Focinho" de Outro!
Cara de Um "Focinho" de Outro!Cara de Um "Focinho" de Outro!
Cara de Um "Focinho" de Outro!
 
Culegere de teste competenta lingvistica
Culegere de teste competenta lingvisticaCulegere de teste competenta lingvistica
Culegere de teste competenta lingvistica
 
Sugerencias para el collage
Sugerencias para el collageSugerencias para el collage
Sugerencias para el collage
 
Casos reales de Native Advertising en medios españoles: el Huffington Post. B...
Casos reales de Native Advertising en medios españoles: el Huffington Post. B...Casos reales de Native Advertising en medios españoles: el Huffington Post. B...
Casos reales de Native Advertising en medios españoles: el Huffington Post. B...
 
Espo neumo tx asma
Espo neumo tx asmaEspo neumo tx asma
Espo neumo tx asma
 
Presentación Red de Mujeres Empresarias del Medio Rural
Presentación Red de Mujeres Empresarias del Medio RuralPresentación Red de Mujeres Empresarias del Medio Rural
Presentación Red de Mujeres Empresarias del Medio Rural
 
Optica
Optica Optica
Optica
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 

Recently uploaded (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...

Streaming Data Processing with Amazon Kinesis

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. July 13, 2016 Streaming Data Processing with Amazon Kinesis Alan Lewis, Principal Architect, Realtor.com Ray Zhu, Sr. Product Manager, AWS
  • 2. What to expect from this session Amazon Kinesis: Getting Started with streaming data on AWS • Streaming scenarios • Amazon Kinesis Streams overview • Amazon Kinesis Firehose overview • Firehose getting started experience • Amazon Kinesis at Realtor.com
  • 3. Need to go a bit faster
  • 4. Streaming data scenarios across segments
      • Scenarios: (1) Accelerated Ingest-Transform-Load; (2) Continual Metrics Generation; (3) Responsive Data Analysis
      • Data types: IT logs, application logs, social media / clickstreams, sensor or device data, market data
      • Ad / Marketing Tech: (1) Publisher, bidder data aggregation; (2) Advertising metrics like coverage, yield, conversion; (3) Analytics on user engagement with ads, optimized bid / buy engines
      • IoT: (1) Sensor, device telemetry data ingestion; (2) IT operational metrics dashboards; (3) Sensor operational intelligence, alerts, and notifications
      • Gaming: (1) Online customer engagement data aggregation; (2) Consumer engagement metrics for level success, transition rates, CTR; (3) Clickstream analytics, leaderboard generation, player-skill match engines
      • Consumer Engagement: (1) Online customer engagement data aggregation; (2) Consumer engagement metrics like page views, CTR; (3) Clickstream analytics, recommendation engines
  • 5. Amazon Kinesis: Services make it easy to capture, deliver, and process streams on AWS
      • Amazon Kinesis Streams: Stores data as a continuous replayable stream for custom applications
      • Amazon Kinesis Firehose: Load streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service
      • Amazon Kinesis Analytics (in preview): Analyze data streams using standard SQL queries
  • 7. Amazon Kinesis Streams: Store data as a continuous stream
      • Easy administration: Simply create a new stream and set the desired level of capacity with shards. Scale to match your data throughput rate and volume.
      • Build real-time applications: Perform continual processing on streaming big data using Amazon Kinesis Client Library (KCL), Apache Spark/Storm, AWS Lambda, and more.
      • Low cost: Cost-efficient for workloads of any scale.
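
Slide 7 says capacity is set with shards. As a rough illustration of how that sizing works, the sketch below estimates a shard count from the per-shard limits published for Kinesis Streams (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads). The function name and the workload figures are hypothetical and do not come from the deck.

```python
import math

# Per-shard limits published for Amazon Kinesis Streams.
WRITE_MB_PER_SEC = 1.0
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2.0


def shards_needed(records_per_sec, avg_record_kb, consumers=1):
    """Estimate the minimum shard count for a (hypothetical) workload."""
    write_mb = records_per_sec * avg_record_kb / 1024
    # Every consumer application reads the full stream.
    read_mb = write_mb * consumers
    return max(
        1,
        math.ceil(write_mb / WRITE_MB_PER_SEC),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb / READ_MB_PER_SEC),
    )


if __name__ == "__main__":
    # Assumed workload: 1,500 small events per second, two consumers.
    print(shards_needed(records_per_sec=1500, avg_record_kb=0.5, consumers=2))
```

With small records the records-per-second limit tends to be the binding constraint rather than bandwidth, which is why the estimate takes the maximum of the three ratios.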
  • 9. Amazon Kinesis Firehose: Load massive volumes of streaming data into destinations
      • Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift, and other destinations without writing an application or managing infrastructure.
      • Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 secs using simple configurations.
      • Seamless elasticity: Seamlessly scale to match data throughput without intervention.
      • Flow: (1) Capture and submit streaming data to Firehose; (2) Firehose loads streaming data continuously into Amazon S3 and Amazon Redshift; (3) Analyze streaming data using your favorite BI tools
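
Slide 9 describes Firehose batching data and delivering it in as little as 60 seconds. The toy model below shows the general size-or-interval buffering idea only; it is not Firehose's implementation, and the class name, method names, and byte threshold are assumptions made for illustration.

```python
class DeliveryBuffer:
    """Toy size-or-interval buffer (hypothetical; not the Firehose internals)."""

    def __init__(self, max_bytes=5 * 1024 * 1024, max_seconds=60):
        self.max_bytes = max_bytes      # illustrative size threshold
        self.max_seconds = max_seconds  # 60 s, the minimum quoted on the slide
        self.records = []
        self.size = 0
        self.opened_at = None
        self.delivered = []             # each entry is one delivered batch

    def put(self, record, now):
        """Add one record (bytes) at time `now` (seconds)."""
        if self.opened_at is None:
            self.opened_at = now
        self.records.append(record)
        self.size += len(record)
        self._maybe_flush(now)

    def tick(self, now):
        """Let time pass without new data; may trigger an interval flush."""
        self._maybe_flush(now)

    def _maybe_flush(self, now):
        if not self.records:
            return
        too_big = self.size >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            self.delivered.append(self.records)
            self.records, self.size, self.opened_at = [], 0, None


if __name__ == "__main__":
    buf = DeliveryBuffer(max_bytes=10, max_seconds=60)
    buf.put(b"abcd", now=0)
    buf.put(b"efgh", now=1)
    buf.put(b"ij", now=2)      # reaches 10 bytes, so the batch is flushed
    print(len(buf.delivered))
```

Time is passed in explicitly so the behavior can be checked without waiting on a real clock.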
  • 11. Amazon Kinesis Firehose console experience Unified console experience for Firehose and Streams
  • 12. Amazon Kinesis Firehose console (Amazon S3) Create fully managed resources for delivery without building an app
  • 13. Amazon Kinesis Firehose console (Amazon S3) Configure data delivery options simply using the console
  • 14. Amazon Kinesis Firehose console (Amazon Redshift) Configure data delivery to Amazon Redshift simply using the console
  • 15. Amazon Kinesis Firehose console (Amazon ES) Configure data delivery to Amazon ES simply using the console
  • 16. Amazon Kinesis Firehose monitoring Visibility into and transparency of data delivery
  • 17. Amazon Kinesis Firehose monitoring Error logging for troubleshooting delivery failures
  • 18. Amazon Kinesis Firehose pricing: Simple, pay-as-you-go, and no upfront costs
      • Dimension: Per 1 GB of data ingested
      • Value: $0.035
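
To make the pricing slide concrete, here is a back-of-the-envelope estimate using the $0.035 per GB figure quoted above. The workload numbers are hypothetical, and the sketch ignores any record-size rounding and destination charges (Amazon S3, Amazon Redshift), so treat the result as a rough lower bound.

```python
PRICE_PER_GB_USD = 0.035  # per GB ingested, as quoted on the 2016 slide


def monthly_ingest_cost(records_per_day, avg_record_kb, days=30):
    """Rough ingest cost in USD for a (hypothetical) workload."""
    gb_ingested = records_per_day * avg_record_kb * days / (1024 * 1024)
    return gb_ingested * PRICE_PER_GB_USD


if __name__ == "__main__":
    # Assumed workload: 5 million events per day at about 1 KB each.
    print(round(monthly_ingest_cost(5_000_000, 1.0), 2))
```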
  • 20. What I’d like you to take away Amazon Kinesis is: • Simple, reliable, and offers high performance • A transformative building block with broad applicability • An enabler for “real time everywhere”
  • 21. About Realtor.com
      • First national US real estate search site
      • Most accurate real estate content
      • Gets data from 99% of MLSs
      • 55 million unique users in April
  • 22. Realtor.com cloud strategy
      • Going “all in” on cloud, most on AWS
      • About ½ done – BI, search, geo services, photos all in AWS now
      • Strong bias towards AWS managed services
  • 23. Customer problem
      • My listings get lots of traffic at start, but less over time
      • I only want people searching for relevant listings
      • I want to get more brand exposure in search
  • 24. Solution: “Turbo listings” product
      • Native ad product that provides customers more exposure in search
      • 100% relevant placements, and are like any other listing
      • Shows the agent profile photo in search
  • 25. Turbo technical requirements
      • Extreme availability and throughput
      • Multiple systems, both inside and outside VPCs (and inside/outside AWS)
      • Auditable, secure billing database
  • 26. Why Kinesis?
      • Great performance
      • Multiproducer, multiconsumer queues
      • Worry-free managed service
  • 28. Impression data
      {
        "campaign_id": "01d329aa-9eb2-426c-9b7b-4877a32fb176",
        "id": "a34f271f-058d-47ba-9d45-8140261742a0",
        "listing_id": 593893632,
        "property_id": 1258201259,
        "advertiser_id": "8675309",
        "event_type": "turbo_search_impression",
        "producer": "fesl",
        "client_source": "rdc_web",
        "client_version": "8.0",
        "page_variation": "list_view",
        "timestamp": "2016-03-02T00:47:25+00:00",
        "user_agent": "..."
      }
  • 29. Impression tracking flow
      • Diagram labels: AWS Lambda; Pull events; Amazon RDS; Amazon EC2; Amazon Kinesis Streams; Post to web service; Decrement in DB; Campaign manager
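
One plausible reading of the impression tracking flow is that an AWS Lambda function pulls batches of events from the stream and reports them to the campaign manager web service, which decrements the campaign's balance in its database. The sketch below covers only the Lambda side, under that assumption. It relies on the standard shape of a Kinesis event delivered to Lambda (base64-encoded payloads under `Records[].kinesis.data`); the handler body and the per-campaign counting are illustrative, not Realtor.com's code.

```python
import base64
import json
from collections import Counter


def handler(event, context=None):
    """Count turbo impressions per campaign in one Lambda batch."""
    counts = Counter()
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        impression = json.loads(payload)
        if impression.get("event_type") == "turbo_search_impression":
            counts[impression["campaign_id"]] += 1
    # Assumption: a real handler would now post these counts to the
    # campaign manager web service instead of returning them.
    return dict(counts)


def make_record(body):
    """Build a Lambda-style Kinesis record for local testing (helper)."""
    data = base64.b64encode(json.dumps(body).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


if __name__ == "__main__":
    sample = {"Records": [make_record({
        "event_type": "turbo_search_impression",
        "campaign_id": "01d329aa-9eb2-426c-9b7b-4877a32fb176",
    })]}
    print(handler(sample))
```

Aggregating per batch before calling the web service keeps the number of downstream requests proportional to campaigns rather than to impressions.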
  • 30. Billing flow
      • Services shown: Amazon DynamoDB; Amazon Redshift; AWS Lambda; Amazon S3; Amazon Kinesis Streams; Amazon Kinesis Firehose; AWS KMS; Private subnet; AWS Lambda; AWS Lambda
      • Step labels: Validate event; Firehose PutRecord; Firehose destination; SSE-KMS encryption on Amazon S3; Amazon S3 notification; Status tracking; Event source; COPY command; KMS encryption on Amazon Redshift; Data transfer in JSON; Event data in JSON
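
The billing flow names a "Validate event" step followed by "Firehose PutRecord", with event data carried as JSON. A minimal sketch of those two steps might look like the following. The list of required fields is an assumption drawn from the impression sample on slide 28, and the function name is hypothetical. The returned dictionary follows the `Record={"Data": ...}` shape that the Firehose PutRecord API expects; the actual AWS call is left out so the sketch runs offline.

```python
import json

# Assumed minimum field set for a billable impression (hypothetical).
REQUIRED_FIELDS = (
    "id", "campaign_id", "listing_id",
    "advertiser_id", "event_type", "timestamp",
)


def to_firehose_record(event):
    """Validate an impression and serialize it for Firehose PutRecord."""
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # Firehose concatenates records as-is, so a trailing newline keeps
    # each JSON object on its own line for a later Redshift COPY.
    line = json.dumps(event, separators=(",", ":"), sort_keys=True) + "\n"
    return {"Data": line.encode("utf-8")}


if __name__ == "__main__":
    record = to_firehose_record({
        "id": "a34f271f-058d-47ba-9d45-8140261742a0",
        "campaign_id": "01d329aa-9eb2-426c-9b7b-4877a32fb176",
        "listing_id": 593893632,
        "advertiser_id": "8675309",
        "event_type": "turbo_search_impression",
        "timestamp": "2016-03-02T00:47:25+00:00",
    })
    print(record["Data"])
```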
  • 31. Redshift – 15 minute batches
  • 32. Outcomes: Huge scale
      • Serving millions of impressions per day on 2 Kinesis shards
      • Tested up to 20x current site traffic
      • Basically, we couldn’t break it
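
For context on how traffic spreads over those 2 shards: Kinesis Streams takes the MD5 hash of each record's partition key as a 128-bit integer and routes the record to the shard whose hash key range contains it. The sketch below assumes the shards split the range evenly and that the event `id` is used as the partition key; the deck does not say which key Realtor.com chose.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_key(partition_key, shard_count=2):
    """Map a partition key to a shard index, assuming evenly split ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # The last shard absorbs any remainder of the hash space.
    return min(hash_key // range_size, shard_count - 1)


if __name__ == "__main__":
    # Assumption: the event "id" field serves as the partition key.
    print(shard_for_key("a34f271f-058d-47ba-9d45-8140261742a0"))
```

A high-cardinality key such as a per-event UUID spreads writes roughly evenly, whereas keying on something like `campaign_id` would send a busy campaign's traffic to a single shard.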
  • 33. Outcomes: Great performance
      • Latencies in single or low double digit milliseconds
      • Events are processed in small batches for efficiency
      • For our purposes, Kinesis gives us real time data streaming
  • 34. Lessons learned
      • Complexity with Amazon Redshift and private subnets
      • Must consider what dedupe behavior you need
      • Simple key–value data JSON structure pays dividends
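
The dedupe lesson matters because a stream consumer can see the same record more than once, for example after a producer retry or a consumer restart. One common approach, sketched here as an assumption rather than as what Realtor.com built, is to remember recently seen event `id` values in a bounded cache and skip repeats. The class name and capacity are hypothetical.

```python
from collections import OrderedDict


class Deduplicator:
    """Bounded memory of recently seen event IDs (illustrative only)."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.seen = OrderedDict()

    def is_new(self, event_id):
        """Return True the first time an ID is seen, False on repeats."""
        if event_id in self.seen:
            self.seen.move_to_end(event_id)  # refresh recency
            return False
        self.seen[event_id] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)    # evict the oldest ID
        return True


if __name__ == "__main__":
    dedupe = Deduplicator(capacity=3)
    events = ["e1", "e2", "e1", "e3"]
    print([e for e in events if dedupe.is_new(e)])
```

An in-memory cache only covers duplicates that arrive close together on one consumer; billing-grade deduplication would more likely need a durable store or an idempotent write keyed on the event ID.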
  • 35. Future: Real time pipeline
      • Real time is the pinnacle
      • Collect data on page 1, and act on page 2
      • What we’ve built on Kinesis with the turbo feature is the starting point for us
      • Photo by @snordq on Flickr. Creative Commons License
  • 36. What I’d like you to take away Amazon Kinesis is: • Simple, reliable, and offers high performance • A transformative building block with broad applicability • An enabler for “real time everywhere”
  • 37. One final thing… Hiring!
      • Search for “realtor.com careers” (careers.move.com)
      • Software engineers, QA engineers, data scientists, product managers, and project managers
      • In Santa Clara, Ventura County, Vancouver, Canada, and Morgantown, WV
      • Thank you: Eddy Luten, Viren Nagtode, and Sonal Shirke