Realtor.com enables realtors to connect with home buyers. With AWS services, Realtor.com built a new advertising solution that allows realtors to launch marketing campaigns in real time and scales to hundreds of millions of ad impressions every day. In this session, learn how Realtor.com architected their solution using Amazon Kinesis Streams, Amazon Kinesis Firehose, AWS Lambda, and Amazon Redshift to track native ad impressions on their site and mobile app. Realtor.com will share lessons learned and tips for getting the most out of streaming data services on AWS. We will also provide an overview of how to get started with real-time streaming data using Amazon Kinesis services.
2. What to expect from this session
Amazon Kinesis: Getting Started with streaming data on AWS
• Streaming scenarios
• Amazon Kinesis Streams overview
• Amazon Kinesis Firehose overview
• Firehose getting started experience
• Amazon Kinesis at Realtor.com
4. Streaming data scenarios across segments
Scenarios: (1) Accelerated Ingest-Transform-Load, (2) Continual Metrics Generation, (3) Responsive Data Analysis
Data types: IT logs, application logs, social media / clickstreams, sensor or device data, market data
Ad / Marketing Tech
• Accelerated Ingest-Transform-Load: publisher, bidder data aggregation
• Continual Metrics Generation: advertising metrics like coverage, yield, conversion
• Responsive Data Analysis: analytics on user engagement with ads, optimized bid / buy engines
IoT
• Accelerated Ingest-Transform-Load: sensor, device telemetry data ingestion
• Continual Metrics Generation: IT operational metrics dashboards
• Responsive Data Analysis: sensor operational intelligence, alerts, and notifications
Gaming
• Accelerated Ingest-Transform-Load: online customer engagement data aggregation
• Continual Metrics Generation: consumer engagement metrics for level success, transition rates, CTR
• Responsive Data Analysis: clickstream analytics, leaderboard generation, player-skill match engines
Consumer Engagement
• Accelerated Ingest-Transform-Load: online customer engagement data aggregation
• Continual Metrics Generation: consumer engagement metrics like page views, CTR
• Responsive Data Analysis: clickstream analytics, recommendation engines
5. Amazon Kinesis
Services make it easy to capture, deliver, and process streams on AWS
• Amazon Kinesis Streams: stores data as a continuous, replayable stream for custom applications
• Amazon Kinesis Firehose: loads streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service
• Amazon Kinesis Analytics (in preview): analyzes data streams using standard SQL queries
7. Amazon Kinesis Streams
Store data as a continuous stream
Easy administration: Simply create a new stream and set the desired level of capacity with shards. Scale to match your data throughput rate and volume.
Build real-time applications: Perform continual processing on streaming big data using Amazon Kinesis Client Library (KCL), Apache Spark/Storm, AWS Lambda, and more.
Low cost: Cost-efficient for workloads of any scale.
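To make "set the desired level of capacity with shards" concrete, here is a minimal sizing sketch. It assumes the documented per-shard write limits for Amazon Kinesis Streams (1 MB per second or 1,000 records per second, whichever is reached first) and treats 1 MB as 1,000,000 bytes, which is a conservative assumption. The function name `shards_needed` and the `headroom` multiplier are hypothetical and are not part of any AWS API.

```python
import math

# Per-shard write limits for Amazon Kinesis Streams:
# 1 MB/s of data or 1,000 records/s, whichever is reached first.
# Assumption: 1 MB is treated as 1,000,000 bytes to stay conservative.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000
SHARD_WRITE_RECORDS_PER_SEC = 1_000


def shards_needed(records_per_sec: float, avg_record_bytes: int,
                  headroom: float = 1.0) -> int:
    """Return the minimum shard count for a write workload.

    headroom is a hypothetical safety multiplier, e.g. 2.0 to plan
    for traffic peaks at twice the average rate.
    """
    if records_per_sec < 0 or avg_record_bytes < 0 or headroom <= 0:
        raise ValueError("rates must be non-negative and headroom positive")
    peak_records = records_per_sec * headroom
    peak_bytes = peak_records * avg_record_bytes
    by_records = math.ceil(peak_records / SHARD_WRITE_RECORDS_PER_SEC)
    by_bytes = math.ceil(peak_bytes / SHARD_WRITE_BYTES_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_records, by_bytes)
```

For example, `shards_needed(1500, 500)` is limited by record count rather than bytes, while `shards_needed(500, 4000)` is limited by bytes. Read throughput (2 MB per second per shard) would need a separate check.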
9. Amazon Kinesis Firehose
Load massive volumes of streaming data into destinations
Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift, and other destinations without writing an application or managing infrastructure.
Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 seconds using simple configurations.
Seamless elasticity: Seamlessly scale to match data throughput without intervention.
• Capture and submit streaming data to Firehose
• Firehose loads streaming data continuously into Amazon S3 and Amazon Redshift
• Analyze streaming data using your favorite BI tools
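The batching behavior described on this slide can be pictured as a buffer that flushes when it reaches a size limit or when its oldest record reaches an age limit, whichever comes first. The sketch below is a toy model of that idea in plain Python, not the Firehose implementation or its API. The class name `DeliveryBuffer`, the `deliver` callback, and the 5 MB default are assumptions for illustration; the 60-second interval comes from the slide.

```python
import time
from typing import Callable, List, Optional


class DeliveryBuffer:
    """Toy model of size-or-interval batching (not the Firehose service)."""

    def __init__(self,
                 deliver: Callable[[List[bytes]], None],
                 max_bytes: int = 5 * 1024 * 1024,   # assumed default
                 max_seconds: float = 60.0,          # interval from the slide
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._deliver = deliver
        self._max_bytes = max_bytes
        self._max_seconds = max_seconds
        self._clock = clock
        self._records: List[bytes] = []
        self._size = 0
        self._opened_at: Optional[float] = None

    def put(self, record: bytes) -> None:
        """Add one record and flush if a limit has been reached."""
        if self._opened_at is None:
            self._opened_at = self._clock()
        self._records.append(record)
        self._size += len(record)
        self._flush_if_due()

    def tick(self) -> None:
        """Call periodically so a quiet buffer still flushes on the interval."""
        self._flush_if_due()

    def _flush_if_due(self) -> None:
        if not self._records:
            return
        too_big = self._size >= self._max_bytes
        too_old = self._clock() - self._opened_at >= self._max_seconds
        if too_big or too_old:
            batch = self._records
            self._records, self._size, self._opened_at = [], 0, None
            self._deliver(batch)
```

Passing the clock in as a parameter keeps the sketch easy to exercise without waiting for real time to pass.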
20. What I’d like you to take away
Amazon Kinesis is:
• Simple, reliable, and offers high performance
• A transformative building block with broad applicability
• An enabler for “real time everywhere”
21. About Realtor.com
First national US real estate search site
Most accurate real estate content
Gets data from 99% of MLSs
55 million unique users in April
22. Realtor.com cloud strategy
Going “all in” on cloud, mostly on AWS
About ½ done: BI, search, geo services, and photos are all in AWS now
Strong bias towards AWS managed services
23. Customer problem
My listings get lots of traffic at start, but less over time
I only want people searching for relevant listings
I want to get more brand exposure in search
24. Solution: “Turbo listings” product
Native ad product that provides customers more exposure in search
Placements are 100% relevant and look like any other listing
Shows the agent profile photo in search
25. Turbo technical requirements
Extreme availability and throughput
Multiple systems, both inside and outside VPCs (and inside/outside AWS)
Auditable, secure billing database
32. Outcomes: Huge scale
Serving millions of impressions per day on 2 Kinesis shards
Tested up to 20x current site traffic
Basically, we couldn’t break it
33. Outcomes: Great performance
Latencies in single or low double digit milliseconds
Events are processed in small batches for efficiency
For our purposes, Kinesis gives us real time data streaming
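As an illustration of processing events in small batches, here is a minimal sketch of an AWS Lambda handler for a Kinesis event source. Lambda delivers each batch as a list under `Records`, with the payload base64-encoded in `kinesis.data`. The deck does not show Realtor.com's actual handler, so the JSON payload shape and the `listing_id` field are assumptions for illustration.

```python
import base64
import json


def handler(event, context):
    """Decode one batch of Kinesis records and count impressions per listing.

    Assumes each record's payload is a JSON object with a hypothetical
    "listing_id" field; the real event schema is not given in the deck.
    """
    counts = {}
    for record in event["Records"]:
        # Kinesis record data arrives base64-encoded in the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        impression = json.loads(payload)
        listing_id = impression["listing_id"]
        counts[listing_id] = counts.get(listing_id, 0) + 1
    return counts
```

Aggregating per batch like this means one downstream write per listing per invocation rather than one per event, which is one way small batches can pay off.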
34. Lessons learned
Complexity with Amazon Redshift and private subnets
Must consider what dedupe behavior you need
A simple key-value JSON data structure pays dividends
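The last two lessons can be sketched together. Producer retries and consumer restarts can cause the same Kinesis record to be seen more than once, so a consumer that feeds billing may want to drop repeats. The sketch below assumes a flat JSON event carrying a unique `event_id`; the field names, `make_event`, and the `Deduplicator` class are hypothetical and are not taken from the Realtor.com implementation.

```python
import json
from collections import OrderedDict


def make_event(event_id: str, listing_id: str, placement: str) -> str:
    """Serialize an impression as flat key-value JSON (hypothetical fields)."""
    return json.dumps(
        {"event_id": event_id, "listing_id": listing_id, "placement": placement},
        sort_keys=True,
    )


class Deduplicator:
    """Remember recently seen event IDs with bounded memory."""

    def __init__(self, max_ids: int = 100_000) -> None:
        self._seen = OrderedDict()
        self._max_ids = max_ids

    def is_new(self, event_id: str) -> bool:
        if event_id in self._seen:
            # Refresh recency so frequently repeated IDs stay remembered.
            self._seen.move_to_end(event_id)
            return False
        self._seen[event_id] = None
        if len(self._seen) > self._max_ids:
            # Evict the least recently seen ID.
            self._seen.popitem(last=False)
        return True


def unique_events(raw_events, dedup: Deduplicator):
    """Parse raw JSON events and keep only the first copy of each event_id."""
    out = []
    for raw in raw_events:
        event = json.loads(raw)
        if dedup.is_new(event["event_id"]):
            out.append(event)
    return out
```

An in-memory window like this only catches duplicates that arrive close together within one process. If exact-once counts are required, deduplicating in the destination database on `event_id` is a more robust choice.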
35. Future: Real time pipeline
Real time is the pinnacle
Collect data on page 1, and act on page 2
What we’ve built on Kinesis with the turbo feature is the starting point for us
Photo by @snordq on Flickr. Creative Commons License
36. What I’d like you to take away
Amazon Kinesis is:
• Simple, reliable, and offers high performance
• A transformative building block with broad applicability
• An enabler for “real time everywhere”
37. One final thing…
Hiring! Search for “realtor.com careers” (careers.move.com)
Software engineers, QA engineers, data scientists, product managers, and project managers
In Santa Clara; Ventura County; Vancouver, Canada; and Morgantown, WV
Thank you: Eddy Luten, Viren Nagtode, and Sonal Shirke