by Adrian Hornsby, Technical Evangelist, AWS
Amazon Kinesis is a platform for streaming data on AWS, offering powerful services to make it easy to load and analyze streaming data. In this session, you’ll learn about how AWS customers are transitioning from batch to real-time processing using Amazon Kinesis, and how to get started. We will provide an overview of streaming data applications and introduce the Amazon Kinesis platform and its services. We will walk through a production use case to demonstrate how to ingest streaming data, prepare it, and analyze it to gain actionable insights in real time using Amazon Kinesis. We will also provide pointers to tutorials and other resources so you can quickly get started with your streaming data application.
Introduction to Real-time, Streaming Data and Amazon Kinesis. Streaming Data Ingestion with Firehose
1. San Francisco Loft - 2017
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data Ingestion with Firehose
Adrian Hornsby, Technical Evangelist with AWS
2. What to Expect from the Session
• Streaming data overview
• Firehose patterns overview
• Firehose usage patterns
• Streaming data end-to-end example and walk-through
4. Streaming data is generated continuously by thousands of data sources, which typically send data records simultaneously and in small sizes (on the order of kilobytes).
Streaming data includes a wide variety of data, such as log files generated by customers using your mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, or geospatial services, and telemetry from connected devices or instrumentation in data centers.
6. Most data is produced continuously
• Mobile Apps
• Web Clickstream
• Application Logs
• Metering Records
• IoT Sensors
• Smart Buildings
Example log record: [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
7. The diminishing value of data
• Recent data is highly valuable
• Old + Recent data is more valuable
8. Processing real-time, streaming data
What are the key requirements?
• Durable
• Continuous
• Fast
• Correct
• Reactive
• Reliable
Ingest, Transform, Analyze, React, Persist
10. Real-time streaming data made easy
Amazon Kinesis Streams
• For technical developers
• Collect and stream data for ordered, replayable, real-time processing
Amazon Kinesis Firehose
• For all developers and data scientists
• Easily load massive volumes of streaming data into Amazon S3, Redshift, and Elasticsearch
Amazon Kinesis Analytics
• For all developers and data scientists
• Easily analyze data streams using standard SQL queries
11. Amazon Kinesis Streams
• Reliably ingest and durably store streaming data at low cost
• Build custom real-time applications to process streaming data
12. Amazon Kinesis Analytics
• Interact with streaming data in real time using SQL
• Build fully managed and elastic stream processing applications that process data for real-time visualizations and alarms
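To make the idea of a continuous query more concrete, here is a minimal Python sketch of a fixed-window count, the kind of aggregate a streaming SQL query might feed into a visualization or alarm. It is an analogy only and is not Kinesis Analytics SQL; the event shape (timestamp, key) and the 60-second window are assumptions for illustration.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window_start, key) over fixed, non-overlapping windows.

    `events` is an iterable of (epoch_seconds, key) pairs. Both the pair
    shape and the 60-second default are assumptions for illustration.
    """
    counts = Counter()
    for ts, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [(0, "error"), (12, "error"), (59, "ok"), (60, "error"), (125, "ok")]
    for (window_start, key), n in sorted(tumbling_window_counts(sample).items()):
        print(window_start, key, n)
```

An alarm could then be as simple as checking whether the count for a key such as "error" exceeds a threshold in the latest window.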
13. Amazon Kinesis Firehose
• Reliably ingest and deliver batched, compressed, and encrypted data to S3, Redshift, and Elasticsearch
• Point and click setup with zero administration and seamless elasticity
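On the producer side, ingestion into Firehose could be sketched as below. This is a minimal sketch, not the presenter's code: it assumes events are JSON-serializable dictionaries, that newline-delimited JSON is an acceptable format for the destination, and that `client` is a boto3 Firehose client created by the caller. The delivery stream name is hypothetical. The limits of 500 records and 4 MiB per `put_record_batch` call reflect the documented service limits as best I know them; verify them against the current API reference.

```python
import json

MAX_RECORDS_PER_BATCH = 500          # assumed PutRecordBatch record-count limit
MAX_BATCH_BYTES = 4 * 1024 * 1024    # assumed PutRecordBatch request-size limit


def to_record(event):
    """Serialize one event as a newline-terminated JSON record."""
    return {"Data": (json.dumps(event) + "\n").encode("utf-8")}


def chunk_records(records):
    """Group records into batches that respect the count and size limits."""
    batch, size = [], 0
    for record in records:
        n = len(record["Data"])
        if batch and (len(batch) >= MAX_RECORDS_PER_BATCH
                      or size + n > MAX_BATCH_BYTES):
            yield batch
            batch, size = [], 0
        batch.append(record)
        size += n
    if batch:
        yield batch


def send_events(client, stream_name, events):
    """Send events in batches; return the number of records that failed."""
    failed = 0
    for batch in chunk_records(to_record(e) for e in events):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

With credentials configured, usage might look like `send_events(boto3.client("firehose"), "example-delivery-stream", events)`. A production producer would also retry the individual records reported as failed rather than only counting them.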
14. Amazon Kinesis makes it easy to work with real-time streaming data
Amazon Kinesis Firehose
• For all developers and data scientists
• Easily load massive volumes of streaming data into Amazon S3, Redshift, and Elasticsearch
20. Amazon Kinesis Firehose vs. Amazon Kinesis Streams
Amazon Kinesis Streams is for use cases that require custom processing, per incoming record, with sub-1-second processing latency, and a choice of stream processing frameworks.
Amazon Kinesis Firehose is for use cases that require zero administration, the ability to use existing analytics tools based on Amazon S3, Amazon Redshift, and Amazon Elasticsearch, and a data latency of 60 seconds or higher.
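The 60-second figure follows from how a buffering delivery service behaves: records accumulate and are delivered when either a size threshold or a time interval is reached, whichever comes first. The simulation below is a simplified model of that behavior, not Firehose itself; the function name, the thresholds, and the arrival pattern are assumptions for illustration.

```python
MB = 1024 * 1024


def simulate_flushes(arrivals, max_bytes=5 * MB, max_seconds=60):
    """Return a list of (flush_time, bytes_flushed) for a size-or-time buffer.

    `arrivals` is a list of (time_seconds, size_bytes) sorted by time.
    The buffer's timer starts when its first record arrives.
    """
    flushes = []
    buffered, opened_at = 0, None
    for t, size in arrivals:
        # Time-based flush: the interval elapsed before this record arrived.
        if opened_at is not None and t - opened_at >= max_seconds:
            flushes.append((opened_at + max_seconds, buffered))
            buffered, opened_at = 0, None
        if opened_at is None:
            opened_at = t
        buffered += size
        # Size-based flush: the buffer reached its size threshold.
        if buffered >= max_bytes:
            flushes.append((t, buffered))
            buffered, opened_at = 0, None
    if opened_at is not None:
        # Whatever remains is delivered when the interval expires.
        flushes.append((opened_at + max_seconds, buffered))
    return flushes


if __name__ == "__main__":
    slow = [(0, MB), (10, MB), (70, MB)]   # low volume: the timer drives delivery
    fast = [(0, 3 * MB), (5, 3 * MB)]      # high volume: size drives delivery
    print(simulate_flushes(slow))
    print(simulate_flushes(fast))
```

In the low-volume case a record can wait up to the full interval before delivery, which is why a buffered service suits analytics on delivered data rather than per-record, sub-second processing.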
39. Amazon Kinesis Customer Base Diversity
• 1 billion events/wk from connected devices | IoT
• 17 PB of game data per season | Entertainment
• 80 billion ad impressions/day, 30 ms response time | Ad Tech
• 100 GB/day click streams from 250+ sites | Enterprise
• 50 billion ad impressions/day, sub-50 ms responses | Ad Tech
• 10 million events/day | Retail
• Amazon Kinesis as Databus - Migrate from Kafka to Kinesis | Enterprise
• Funnel all production events through Amazon Kinesis