SlideShare ist ein Scribd-Unternehmen logo
1 von 25
@ 2014 Amazon.com, Inc. and Its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc
Streaming Data for Analysis
Brett Francis
Enterprise Solutions Architect
Talk Outline
• Streaming Big Data
• Analytics with Redshift
• Generalizing the Streaming for Analytics design pattern
• Cost Influences on Architecture
You’re likely already “streaming”
• Sensor networks analytics
• Ad network analytics
• Log shipping and centralization
• Click stream analysis
• Gaming status
• Hardware and software appliance metrics
• …more…
Example Streaming Big Data Source
Let’s explore common
challenges of streaming
One common starting point is ingesting records
for analysis
Elastic Beanstalk
foo-analysis.com
Global top-10
foo-analysis.com
Too big to handle on one box
Global top-10Elastic Beanstalk
foo-analysis.com
The solution: needs record sorting and grouping
Local top-10
Local top-10
Local top-10 Global top-10
Elastic Beanstalk
foo-analysis.com
The solution: streaming map/reduce
Global top-10
Elastic Beanstalk
foo-analysis.com
Local top-10
Local top-10
Local top-10
Data Record
Shard:
Sequence Number
14 17 18 21 23
When to use Stream Processing
• “real-time” starts coming onto the radar
• The time to answer can’t wait for batch processing times
• Instead of processing serially as A > B > C it would be
better to have a fan out pattern
• The records are just a means to an end, most records
can be immediately archived after an “answer” is
determined.
How this relates to Kinesis
Global top-10Elastic Beanstalk
foo-analysis.com
Kinesis
Kinesis
Application
Core streaming concepts
Global top-10Elastic Beanstalk
foo-analysis.com
Data
Record
Stream
Shard
Partition Key
Worker
My top-10
Data Record
Shard:
Sequence Number
14 17 18 21 23
Kinesis Managed Stream Processing
• Moved from batch to continuous processing
• Scale shards and time series elastically UP or DOWN
without losing sequencing
• Workers can replay records for up to 24 hours
• Scale up to GB/sec without losing durability
• Records stored across multiple availability zones
• Multiple parallel Kinesis Aps output to anything…
• RDBMS, S3, In-house Data Warehouse, Messaging, another stream,
JavaSDK, PythonSDK, etc.
Amazon Kinesis
AWSEndpoint
S3
DynamoDB
Redshift
Data
Sources
Availability
Zone
Availability
Zone
Data
Sources
Data
Sources
Data
Sources
Data
Sources
Availability
Zone
Shard 1
Shard 2
Shard N
[Aggregate &
De-Duplicate]
[Metric
Extraction]
[Sliding Window
Analysis]
[Machine
Learning]
App. 1
App. 2
App. 3
App. 4
Core Concepts Recapped
• Data Record ~ a single generated record
• Stream ~ all records (aka. The Fire Hose)
• Partition Key ~ all records for specific topic / sensor
• Shard ~ all data records belonging to a set of topics, grouped
together
• Sequence Number ~ generated and assigned to each data record
when ingested
• Worker ~ processes the records of a shard in sequence order
Amazon Redshift
Analysis using Redshift
• Compatible with existing SQL Business Intelligence tools
• Start small and grow massively
• Scalable from 160GB to Petabyte+
• Elastic data warehousing
• Automatically run queries against old cluster while the new one is being
provisioned
• Run it when you need it
Redshift Architecture
• Ingest from S3, EMR,
DynamoDB or API
• Backups to S3
• JDBC / ODBC Access
Generalizing a Streaming for Analytics
design pattern
Example: Kinesis for Clickstream Analytics
Clickstream
processing
applications
Aggregated
clickstream
statistics
Clickstream
archive
Clickstream
Trend analysis
Example: Kinesis for Simple Metering & Billing
Billing
auditors
Incremental
bill
computation
Metering
record
archive
Billing mgmt
service
Kinesis Poster Worker Demo
(aka. The Egg Finder)
• Published at AWSlabs
• h t t p s : / / g i t h u b . c o m / a w s l a b s / k i n e s i s - p o s t e r - w o r k e r
• Poster ~ multi-threaded client that posts random characters in to a stream
• Worker ~ a thread-per-shard client that gets batches of records looking for
the word ‘egg’
Cost Influences on Architecture
Streaming Analysis Cost Dimensions
• Amazon Kinesis priced in shard increments of:
• 1MB/sec ingest 2MB/sec egress
• 1M PUTs
• Amazon EC2 Kinesis Apps priced by instance
• Amazon Redshift prices are hourly and:
• One tenth the cost of alternatives (ex. 3Yr RI)
• Scales from 160GB to >1PB
Thank You.
Please send me feedback on this presentation.
brettf@
Follow-up Links
aws.amazon.com/kinesis
aws.amazon.com/redshift
aws.amazon.com/elasticbeanstalk

Weitere ähnliche Inhalte

Was ist angesagt?

End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks DeltaEnd-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks DeltaDatabricks
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN FrameworkKeymate.AI
 
Apache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentApache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentHostedbyConfluent
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?confluent
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Kafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsKafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsSérgio Nunes
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producerconfluent
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Lecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language TechnologyLecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language TechnologyMarina Santini
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDataWorks Summit
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...StreamNative
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingYasir Khan
 

Was ist angesagt? (20)

End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks DeltaEnd-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
LanGCHAIN Framework
LanGCHAIN FrameworkLanGCHAIN Framework
LanGCHAIN Framework
 
Apache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentApache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, Confluent
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Kafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsKafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notifications
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep Dive
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Modern Data Pipelines
Modern Data PipelinesModern Data Pipelines
Modern Data Pipelines
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Lecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language TechnologyLecture 1: Semantic Analysis in Language Technology
Lecture 1: Semantic Analysis in Language Technology
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 

Andere mochten auch

Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...Spark Summit
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
 
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...Amazon Web Services
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Chris Fregly
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time AnalyticsAmazon Web Services
 
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsDay 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsAmazon Web Services
 
real time big data
real time big data real time big data
real time big data parry prabhu
 
Collecting and analyzing sensor data with hadoop or other no sql databases
Collecting and analyzing sensor data with hadoop or other no sql databasesCollecting and analyzing sensor data with hadoop or other no sql databases
Collecting and analyzing sensor data with hadoop or other no sql databasesMatteo Redaelli
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDataWorks Summit
 
Web log & clickstream
Web log & clickstream Web log & clickstream
Web log & clickstream Michel Bruley
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionRevolution Analytics
 
Fundamental Metrics for Startups
Fundamental Metrics for StartupsFundamental Metrics for Startups
Fundamental Metrics for Startups500 Startups
 
Clickstream Data Warehouse - Turning clicks into customers
Clickstream Data Warehouse - Turning clicks into customersClickstream Data Warehouse - Turning clicks into customers
Clickstream Data Warehouse - Turning clicks into customersAlbert Hui
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceDataWorks Summit
 
Enhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuEnhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuSpark Summit
 
Paper 2 iGCSE Revision Guide
Paper 2 iGCSE Revision GuidePaper 2 iGCSE Revision Guide
Paper 2 iGCSE Revision GuideBradonEnglish
 

Andere mochten auch (20)

Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
 
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time Analytics
 
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsDay 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
 
real time big data
real time big data real time big data
real time big data
 
Collecting and analyzing sensor data with hadoop or other no sql databases
Collecting and analyzing sensor data with hadoop or other no sql databasesCollecting and analyzing sensor data with hadoop or other no sql databases
Collecting and analyzing sensor data with hadoop or other no sql databases
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Writers effects 2
Writers effects 2Writers effects 2
Writers effects 2
 
Web log & clickstream
Web log & clickstream Web log & clickstream
Web log & clickstream
 
Benchmark slideshow
Benchmark slideshowBenchmark slideshow
Benchmark slideshow
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
 
Kat Gordon
Kat Gordon Kat Gordon
Kat Gordon
 
Fundamental Metrics for Startups
Fundamental Metrics for StartupsFundamental Metrics for Startups
Fundamental Metrics for Startups
 
Clickstream Data Warehouse - Turning clicks into customers
Clickstream Data Warehouse - Turning clicks into customersClickstream Data Warehouse - Turning clicks into customers
Clickstream Data Warehouse - Turning clicks into customers
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of Service
 
Enhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuEnhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min Qiu
 
Paper 2 iGCSE Revision Guide
Paper 2 iGCSE Revision GuidePaper 2 iGCSE Revision Guide
Paper 2 iGCSE Revision Guide
 

Ähnlich wie Streaming data for real time analysis

(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014Amazon Web Services
 
Amazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptxAmazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptxRenjithPillai26
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWSSungmin Kim
 
Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSK
Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSKChoose Right Stream Storage: Amazon Kinesis Data Streams vs MSK
Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSKSungmin Kim
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsAmazon Web Services
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
 
Launching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWSLaunching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWSAmazon Web Services
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
Big Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Value Association
 
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...Sungmin Kim
 
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)Amazon Web Services Korea
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
AWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis WebinarAWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis WebinarAmazon Web Services
 
Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300Amazon Web Services
 
Best Practices for Building Open Source Data Layers
Best Practices for Building Open Source Data LayersBest Practices for Building Open Source Data Layers
Best Practices for Building Open Source Data LayersIBMCompose
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015Amazon Web Services Korea
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftAmazon Web Services
 

Ähnlich wie Streaming data for real time analysis (20)

(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
 
Amazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptxAmazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptx
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
 
Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSK
Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSKChoose Right Stream Storage: Amazon Kinesis Data Streams vs MSK
Choose Right Stream Storage: Amazon Kinesis Data Streams vs MSK
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming Applications
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
 
Launching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWSLaunching Your First Big Data Project on AWS
Launching Your First Big Data Project on AWS
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
Big Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICS
 
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
 
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
찾아가는 AWS 세미나(구로,가산,판교) - AWS 기반 빅데이터 활용 방법 (김일호 솔루션즈 아키텍트)
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Amazon Kinesis
Amazon KinesisAmazon Kinesis
Amazon Kinesis
 
AWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis WebinarAWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis Webinar
 
Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300
 
Best Practices for Building Open Source Data Layers
Best Practices for Building Open Source Data LayersBest Practices for Building Open Source Data Layers
Best Practices for Building Open Source Data Layers
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 

Kürzlich hochgeladen (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

Streaming data for real time analysis

  • 1. @ 2014 Amazon.com, Inc. and Its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc Streaming Data for Analysis Brett Francis Enterprise Solutions Architect
  • 2. Talk Outline • Streaming Big Data • Analytics with Redshift • Generalizing the Streaming for Analytics design pattern • Cost Influences on Architecture
  • 3. You’re likely already “streaming” • Sensor networks analytics • Ad network analytics • Log shipping and centralization • Click stream analysis • Gaming status • Hardware and software appliance metrics • …more…
  • 4. Example Streaming Big Data Source
  • 6. One common starting point is ingesting records for analysis Elastic Beanstalk foo-analysis.com Global top-10 foo-analysis.com
  • 7. Too big to handle on one box Global top-10Elastic Beanstalk foo-analysis.com
  • 8. The solution: needs record sorting and grouping Local top-10 Local top-10 Local top-10 Global top-10 Elastic Beanstalk foo-analysis.com
  • 9. The solution: streaming map/reduce Global top-10 Elastic Beanstalk foo-analysis.com Local top-10 Local top-10 Local top-10 Data Record Shard: Sequence Number 14 17 18 21 23
  • 10. When to use Stream Processing • “real-time” starts coming onto the radar • The time to answer can’t wait for batch processing times • Instead of processing serially as A > B > C it would be better to have a fan out pattern • The records are just a means to an end, most records can be immediately archived after an “answer” is determined.
  • 11. How this relates to Kinesis Global top-10Elastic Beanstalk foo-analysis.com Kinesis Kinesis Application
  • 12. Core streaming concepts Global top-10Elastic Beanstalk foo-analysis.com Data Record Stream Shard Partition Key Worker My top-10 Data Record Shard: Sequence Number 14 17 18 21 23
  • 13. Kinesis Managed Stream Processing • Moved from batch to continuous processing • Scale shards and time series elastically UP or DOWN without losing sequencing • Workers can replay records for up to 24 hours • Scale up to GB/sec without losing durability • Records stored across multiple availability zones • Multiple parallel Kinesis Aps output to anything… • RDBMS, S3, In-house Data Warehouse, Messaging, another stream, JavaSDK, PythonSDK, etc.
  • 14. Amazon Kinesis AWSEndpoint S3 DynamoDB Redshift Data Sources Availability Zone Availability Zone Data Sources Data Sources Data Sources Data Sources Availability Zone Shard 1 Shard 2 Shard N [Aggregate & De-Duplicate] [Metric Extraction] [Sliding Window Analysis] [Machine Learning] App. 1 App. 2 App. 3 App. 4
  • 15. Core Concepts Recapped • Data Record ~ a single generated record • Stream ~ all records (aka. The Fire Hose) • Partition Key ~ all records for specific topic / sensor • Shard ~ all data records belonging to a set of topics, grouped together • Sequence Number ~ generated and assigned to each data record when ingested • Worker ~ processes the records of a shard in sequence order
  • 17. Analysis using Redshift • Compatible with existing SQL Business Intelligence tools • Start small and grow massively • Scalable from 160GB to Petabyte+ • Elastic data warehousing • Automatically run queries against old cluster while the new one is being provisioned • Run it when you need it
  • 18. Redshift Architecture • Ingest from S3, EMR, DynamoDB or API • Backups to S3 • JDBC / ODBC Access
  • 19. Generalizing a Streaming for Analytics design pattern
  • 20. Example: Kinesis for Clickstream Analytics Clickstream processing applications Aggregated clickstream statistics Clickstream archive Clickstream Trend analysis
  • 21. Example: Kinesis for Simple Metering & Billing Billing auditors Incremental bill computation Metering record archive Billing mgmt service
  • 22. Kinesis Poster Worker Demo (aka. The Egg Finder) • Published at AWSlabs • h t t p s : / / g i t h u b . c o m / a w s l a b s / k i n e s i s - p o s t e r - w o r k e r • Poster ~ multi-threaded client that posts random characters in to a stream • Worker ~ a thread-per-shard client that gets batches of records looking for the word ‘egg’
  • 23. Cost Influences on Architecture
  • 24. Streaming Analysis Cost Dimensions • Amazon Kinesis priced in shard increments of: • 1MB/sec ingest 2MB/sec egress • 1M PUTs • Amazon EC2 Kinesis Apps priced by instance • Amazon Redshift prices are hourly and: • One tenth the cost of alternatives (ex. 3Yr RI) • Scales from 160GB to >1PB
  • 25. Thank You. Please send me feedback on this presentation. brettf@ Follow-up Links aws.amazon.com/kinesis aws.amazon.com/redshift aws.amazon.com/elasticbeanstalk