SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Introduction to Streaming &
Messaging
Flume ,Kafka,SQS,
Kinesis streams & firehose
Kinesis Analytics
Omid Vahdaty, Big Data Ninja
What is batch Processing?
the execution of a series of programs each on a set or "batch" of inputs, rather than a single input (which
would instead be a custom job
What is Streaming ?
Streaming Data is data that is generated continuously by thousands of data sources, which typically send in
the data records simultaneously, and in small sizes (order of Kilobytes)
Streaming VS. Batch Processing
Batch Stream
Data Scope Query the entire batch, with slight delay Query most recent events
defined in a time window.
Data Size Large data sets A few Individual records
Latency? Minutes ,hours Seconds, Milliseconds
Analysis Complex Analytics Basic: aggregations, metrics etc.
Challenges with Streaming Data
● Processing layer
○ Consuming data
○ Processing data
○ Notifying storage layer what to do.
● Storage layer
○ Ordering mechanism
○ Strong Consistency mechanism
● In general MUST have features:
○ scalability
○ data durability
○ fault tolerance
Messaging VS Streaming?
● Messaging: framed
message based
protocol.
● e.g 3 messages sent will
look like:
○ Hello world
○ Hello world
○ Hello world
● Streaming: unframed
data (bytes) stream
based protocol
● e.g 3 messages sent will
look like:
○ Hell
○ ow wo
○ rld Hel
○ low wor
○ ldHellow wo
○ rld
Messaging
Open Source: Kafka, flume
AWS: SQS
Flume
Flume Pros:
Good documentation with many existing implementation patterns to follow
Easy integration with existing monitoring framework
Integration with Cloudera Manager to monitor Flume processes
Flume Cons:
Event rather that stream centric
Calculating capacity is not an exact science but rather confirmed through trials
Throughput is dependent on the channel backing store.
Flume lacks the clear scaling and resiliency configurations (trivial with Kafka and Kinesis)
Kafka
Kafka Pros:
High achievable ingest rates with clear scaling pattern
High resiliency via distributed replicas with little impact on throughput
Kafka Cons:
No current framework for monitoring and configuring producers
Flume VS. Kafka
Flume Kafka
Choose when you desire No need for customization.
Need out of the box
components such HDFS
sink
Need a custom made high
availability delivery system
Velocity high higher
Event processing
Flume Kafka
Original Motivation distributed, reliable, and
available system for efficiently
collecting, aggregating and
moving large amounts of log
data from many different
sources to a centralized data
store. Built around hadoop
ecosystem
general purpose
distributed publish-
subscribe messaging
system Multi-consumer
ultra-high availability
messaging system.
Data Flow push pull
event availability JDBC Databases
Channel, file Channel.
Loose flume agent =
losing data.
replication of your
events data by design.
Commercial support Cloudera Cloudera
Collectors built in Yes. just the messaging
Use Case: Kafka and Flume combined
● Flume supports: Kafka source, Kafka channel, Kafka sink
● So, take the advantage of both and combine them to your needs.
Use Case: Kafka as a Channel
AWS SQS
● a fast, reliable, scalable, fully managed message queuing service
● decouple the components of a cloud application, move data between diverse, distributed
application components without losing messages and without requiring each component to be
always available.
● high throughput and at-least-once processing, and FIFO queues
● all messages are stored redundantly across multiple servers and data centers.
● Start with three API calls : SendMessage, ReceiveMessage, and DeleteMessage. Additional
APIs are available to provide advanced functionality.
● Queues
○ Standard queues offer maximum throughput, best-effort ordering, and at-least-once
delivery.
○ FIFO queues are designed to ensure strict ordering and exactly-once processing, with
limited throughput.
● scales dynamically
● Authentication mechanisms
AWS SQS use cases
Messaging semantics (such as message-level ack/fail) and visibility timeout. For example, you have a
queue of work items and want to track the successful completion of each item independently. Amazon
SQS tracks the ack/fail, so the application does not have to maintain a persistent checkpoint/cursor.
Amazon SQS will delete acked messages and redeliver failed messages after a configured visibility
timeout.
Individual message delay. For example, you have a job queue and need to schedule individual jobs with
a delay. With Amazon SQS, you can configure individual messages to have a delay of up to 15 minutes.
Dynamically increasing concurrency/throughput at read time. For example, you have a work
queue and want to add more readers until the backlog is cleared. With Amazon Kinesis, you can scale
up to a sufficient number of shards (note, however, that you'll need to provision enough shards ahead
of time).
Leveraging Amazon SQS’s ability to scale transparently. For example, you buffer requests and the
load changes as a result of occasional load spikes or the natural growth of your business. Because each
buffered request can be processed independently.
Typical SQL use case: decoupling APP layers.
Streaming
AWS Kinetics:
Streams,Firehose,Analytics
AWS Kinesis (streams)
● build custom applications that process or analyze streams
● continuously capture and store terabytes of data per hour
● Hundreds sources
● allows for real-time data processing
● Easy to use, get started in minutes
○ Kinesis Client Library
○ Kinesis Producer Library
● allows you to have multiple Applications processing the same stream concurrently.
● The throughput can scale from megabytes to terabytes per hour
● synchronously replicates your streaming data across three AZ
● preserves your data for up to 7 days
AWS Kinesis (streams) use cases
● Log and Event collection
● Mobile Data collection
● Real Time Analytics
○ when loading data from transactional databases into data warehouses.
○ Multi-stage processing using specialized algorithms
○ stream partitioning for finer control over scaling
● Gaming Data feed
AWS Kinesis (streams) use cases
Routing related records to the same record processor (as in streaming MapReduce). For example,
counting and aggregation are simpler when all records for a given key are routed to
the same record processor.
Ordering of records. For example, you want to transfer log data from the application host to the
processing/archival host while maintaining the order of log statements.
Ability for multiple applications to consume the same stream concurrently. For example, you
have one application that updates a real-time dashboard and another that archives data to
Amazon Redshift. You want both applications to consume data from the same stream
concurrently and independently.
Ability to consume records in the same order a few hours later. For example, you have a
billing application and an audit application that runs a few hours behind the billing application.
Because Amazon Kinesis stores data for up to 24 hours, you can run the audit application up to
24 hours behind the billing application.
AWS Kinesis (streams)
Kinesis Pros:
High achievable ingest rates with clear scaling pattern
Similar throughput and resiliency characteristics to Kafka
Integrates with other AWS services like EMR and Data Pipeline.
Kinesis Cons:
No current framework for monitoring and configuring producers
Cloud service only. Possible increase in latency from source to Kinesis.
AWS Kinesis (streams)
AWS Kinesis Firehose
● the easiest way to load streaming data into AWS.
● capture, transform, and load streaming data
○ integrates into Kinesis Analytics, S3, Redshift, Elasticsearch Service
○ Serverless Transformation on RAW data. (lambda function)
■ E.g transform log file into CSV format
● Firehose can back up all untransformed records to your S3 bucket concurrently while delivering transformed records to
the destination. You can enable source record backup
● enabling near real-time analytics
● Easy to use.
● Monitoring options.
● Limits
○ 20 stream per regions, Each stream
■ 2000 transaction per sec
■ 5000 records per sec
■ 5MB/s
■ Support 24 hours replay in cases on downtime
Kinesys Firehose agent
● Java software app that send data to streams/firehose
● monitors a set of files for new data and then sends streams/firehose
● It handles file rotation, checkpointing, and retrial upon failures.
● supports Amazon CloudWatch so that you can closely monitor and troubleshoot the data flow from
the agent.
● Data processing options:
○ SINGLELINE – This option converts a multi-line record to a single line record by removing
newline characters, and leading and trailing spaces.
○ CSVTOJSON – This option converts a record from delimiter separated format to JSON format.
○ LOGTOJSON – This option converts a record from several commonly used log formats to
JSON format. Currently supported log formats are Apache Common Log, Apache Combined
Log, Apache Error Log, and RFC3164 (syslog).
● https://github.com/awslabs/amazon-kinesis-agent
Write a JAVA agent to Firehose
● AWS java SDK
● Firehose API
○ Single record: PutRecord
○ Batch: PutRecordBatch.
● Key concepts:
○ Firehose delivery stream
○ Data producer - i.e web server creating log.
○ Record: The data of interest that your data producer sends to a Firehose delivery stream. A record can be as
large as 1000 KB.
○ buffer size (in MB )
○ buffer interval (seconds)
● Java examples: http://docs.aws.amazon.com/firehose/latest/dev/writing-with-sdk.html
Firehose and Redshift usecase
Kinesis Analytics : in-flight analytics.
● process streaming data in real time with standard SQL
● Amazon Kinesis Analytics enables you to create and run SQL queries on streaming data
● Easy 3 steps
1. Configure Input stream (kinesis stream, kinesis firehose)
a. Automatically created Schema
b. Manually change schema if you like
2. Write SQL query
3. Configure output stream: s3, redshift, elastics search
● Elastic: scale up down
● Managed service
● Standard SQL
Kinesis Analytics : in-flight analytics.
Stay in touch...
● Omid Vahdaty
● +972-54-2384178

Weitere ähnliche Inhalte

Was ist angesagt?

Apache flume by Swapnil Dubey
Apache flume by Swapnil DubeyApache flume by Swapnil Dubey
Apache flume by Swapnil DubeySwapnil Dubey
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...StreamNative
 
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...Michael Stack
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planningconfluent
 
HBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ FlipboardHBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ FlipboardMatthew Blair
 
Near-realtime analytics with Kafka and HBase
Near-realtime analytics with Kafka and HBaseNear-realtime analytics with Kafka and HBase
Near-realtime analytics with Kafka and HBasedave_revell
 
Messaging queue - Kafka
Messaging queue - KafkaMessaging queue - Kafka
Messaging queue - KafkaMayank Bansal
 
Kafka Basic For Beginners
Kafka Basic For BeginnersKafka Basic For Beginners
Kafka Basic For BeginnersRiby Varghese
 
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetHBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetCloudera, Inc.
 
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a FlurryHBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a FlurryHBaseCon
 
HBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at Huawei
HBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at HuaweiHBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at Huawei
HBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at HuaweiMichael Stack
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBaseCon
 
HBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
HBaseCon 2012 | Solbase - Kyungseog Oh, PhotobucketHBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
HBaseCon 2012 | Solbase - Kyungseog Oh, PhotobucketCloudera, Inc.
 
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...Cloudera, Inc.
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand EnvironmentHBaseCon
 

Was ist angesagt? (20)

Apache flume by Swapnil Dubey
Apache flume by Swapnil DubeyApache flume by Swapnil Dubey
Apache flume by Swapnil Dubey
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
HBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ FlipboardHBaseCon 2015- HBase @ Flipboard
HBaseCon 2015- HBase @ Flipboard
 
Near-realtime analytics with Kafka and HBase
Near-realtime analytics with Kafka and HBaseNear-realtime analytics with Kafka and HBase
Near-realtime analytics with Kafka and HBase
 
ApacheCon-HBase-2016
ApacheCon-HBase-2016ApacheCon-HBase-2016
ApacheCon-HBase-2016
 
Messaging queue - Kafka
Messaging queue - KafkaMessaging queue - Kafka
Messaging queue - Kafka
 
Kafka Basic For Beginners
Kafka Basic For BeginnersKafka Basic For Beginners
Kafka Basic For Beginners
 
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring BudgetHBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
HBaseCon 2012 | Building a Large Search Platform on a Shoestring Budget
 
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a FlurryHBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a Flurry
 
HBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at Huawei
HBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at HuaweiHBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at Huawei
HBaseConAsia2018 Track3-4: HBase and OpenTSDB practice at Huawei
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low Latency
 
HBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
HBaseCon 2012 | Solbase - Kyungseog Oh, PhotobucketHBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
HBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket
 
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
HBaseCon 2013: Streaming Data into Apache HBase using Apache Flume: Experienc...
 
Hadoop Introduction
Hadoop IntroductionHadoop Introduction
Hadoop Introduction
 
Kafka vs kinesis
Kafka vs kinesisKafka vs kinesis
Kafka vs kinesis
 
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and  High-Demand EnvironmentHBaseCon 2015: HBase at Scale in an Online and  High-Demand Environment
HBaseCon 2015: HBase at Scale in an Online and High-Demand Environment
 

Andere mochten auch

Data Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeData Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeArvind Prabhakar
 
Moving to a data-centric architecture: Toronto Data Unconference 2015
Moving to a data-centric architecture: Toronto Data Unconference 2015Moving to a data-centric architecture: Toronto Data Unconference 2015
Moving to a data-centric architecture: Toronto Data Unconference 2015Adam Muise
 
Implementing and running a secure datalake from the trenches
Implementing and running a secure datalake from the trenches Implementing and running a secure datalake from the trenches
Implementing and running a secure datalake from the trenches DataWorks Summit
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVROairisData
 
大型电商的数据服务的要点和难点
大型电商的数据服务的要点和难点 大型电商的数据服务的要点和难点
大型电商的数据服务的要点和难点 Chao Zhu
 
Paytm labs soyouwanttodatascience
Paytm labs soyouwanttodatasciencePaytm labs soyouwanttodatascience
Paytm labs soyouwanttodatascienceAdam Muise
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...StampedeCon
 

Andere mochten auch (9)

Data Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeData Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache Flume
 
Moving to a data-centric architecture: Toronto Data Unconference 2015
Moving to a data-centric architecture: Toronto Data Unconference 2015Moving to a data-centric architecture: Toronto Data Unconference 2015
Moving to a data-centric architecture: Toronto Data Unconference 2015
 
Implementing and running a secure datalake from the trenches
Implementing and running a secure datalake from the trenches Implementing and running a secure datalake from the trenches
Implementing and running a secure datalake from the trenches
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVRO
 
大型电商的数据服务的要点和难点
大型电商的数据服务的要点和难点 大型电商的数据服务的要点和难点
大型电商的数据服务的要点和难点
 
Parquet overview
Parquet overviewParquet overview
Parquet overview
 
Paytm labs soyouwanttodatascience
Paytm labs soyouwanttodatasciencePaytm labs soyouwanttodatascience
Paytm labs soyouwanttodatascience
 
File Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & ParquetFile Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & Parquet
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
 

Ähnlich wie Introduction to streaming and messaging flume,kafka,SQS,kinesis

Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Omid Vahdaty
 
Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...Amazon Web Services
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsAmazon Web Services
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015Amazon Web Services Korea
 
Real Time Data Processing Using AWS Lambda - DevDay Austin 2017
Real Time Data Processing Using AWS Lambda - DevDay Austin 2017Real Time Data Processing Using AWS Lambda - DevDay Austin 2017
Real Time Data Processing Using AWS Lambda - DevDay Austin 2017Amazon Web Services
 
Real-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaReal-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaAmazon Web Services
 
Amazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptxAmazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptxRenjithPillai26
 
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...Amazon Web Services
 
1.0 - AWS-DAS-Collection-Kinesis.pdf
1.0 - AWS-DAS-Collection-Kinesis.pdf1.0 - AWS-DAS-Collection-Kinesis.pdf
1.0 - AWS-DAS-Collection-Kinesis.pdfSreeGe1
 
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...Amazon Web Services
 
AWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis WebinarAWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis WebinarAmazon Web Services
 
Real-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaReal-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaAmazon Web Services
 
2017 AWS DB Day | AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
2017 AWS DB Day |  AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?2017 AWS DB Day |  AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
2017 AWS DB Day | AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?Amazon Web Services Korea
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924Amazon Web Services
 
Building realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache KafkaBuilding realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache KafkaNagarajan Selvaraj
 
Introduction to Amazon Relational Database Service
Introduction to Amazon Relational Database ServiceIntroduction to Amazon Relational Database Service
Introduction to Amazon Relational Database ServiceAmazon Web Services
 
CloudCamp Athens presentation: Introduction to cloud computing
CloudCamp Athens presentation: Introduction to cloud computingCloudCamp Athens presentation: Introduction to cloud computing
CloudCamp Athens presentation: Introduction to cloud computingFotis Stamatelopoulos
 
SMC303 Real-time Data Processing Using AWS Lambda
SMC303 Real-time Data Processing Using AWS LambdaSMC303 Real-time Data Processing Using AWS Lambda
SMC303 Real-time Data Processing Using AWS LambdaAmazon Web Services
 

Ähnlich wie Introduction to streaming and messaging flume,kafka,SQS,kinesis (20)

Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...
 
Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...Deep dive and best practices on real time streaming applications nyc-loft_oct...
Deep dive and best practices on real time streaming applications nyc-loft_oct...
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming Applications
 
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015
 
Real Time Data Processing Using AWS Lambda - DevDay Austin 2017
Real Time Data Processing Using AWS Lambda - DevDay Austin 2017Real Time Data Processing Using AWS Lambda - DevDay Austin 2017
Real Time Data Processing Using AWS Lambda - DevDay Austin 2017
 
Real-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaReal-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS Lambda
 
Amazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptxAmazon Kinesis Data Streams Vs Msk (1).pptx
Amazon Kinesis Data Streams Vs Msk (1).pptx
 
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
 
1.0 - AWS-DAS-Collection-Kinesis.pdf
1.0 - AWS-DAS-Collection-Kinesis.pdf1.0 - AWS-DAS-Collection-Kinesis.pdf
1.0 - AWS-DAS-Collection-Kinesis.pdf
 
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...
 
AWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis WebinarAWS Webcast - AWS Kinesis Webinar
AWS Webcast - AWS Kinesis Webinar
 
Real-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaReal-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS Lambda
 
2017 AWS DB Day | AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
2017 AWS DB Day |  AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?2017 AWS DB Day |  AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
2017 AWS DB Day | AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
 
Building realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache KafkaBuilding realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache Kafka
 
Introduction to Amazon Relational Database Service
Introduction to Amazon Relational Database ServiceIntroduction to Amazon Relational Database Service
Introduction to Amazon Relational Database Service
 
CloudCamp Athens presentation: Introduction to cloud computing
CloudCamp Athens presentation: Introduction to cloud computingCloudCamp Athens presentation: Introduction to cloud computing
CloudCamp Athens presentation: Introduction to cloud computing
 
SMC303 Real-time Data Processing Using AWS Lambda
SMC303 Real-time Data Processing Using AWS LambdaSMC303 Real-time Data Processing Using AWS Lambda
SMC303 Real-time Data Processing Using AWS Lambda
 

Mehr von Omid Vahdaty

Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup Omid Vahdaty
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedOmid Vahdaty
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedOmid Vahdaty
 
Machine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data DemystifiedMachine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data DemystifiedOmid Vahdaty
 
The technology of fake news between a new front and a new frontier | Big Dat...
The technology of fake news  between a new front and a new frontier | Big Dat...The technology of fake news  between a new front and a new frontier | Big Dat...
The technology of fake news between a new front and a new frontier | Big Dat...Omid Vahdaty
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3 Omid Vahdaty
 
Making your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data DemystifiedMaking your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data DemystifiedOmid Vahdaty
 
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...Omid Vahdaty
 
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...Omid Vahdaty
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedOmid Vahdaty
 
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...Omid Vahdaty
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned Omid Vahdaty
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | EnglishOmid Vahdaty
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...Omid Vahdaty
 
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...Omid Vahdaty
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive Omid Vahdaty
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned Omid Vahdaty
 
Emr spark tuning demystified
Emr spark tuning demystifiedEmr spark tuning demystified
Emr spark tuning demystifiedOmid Vahdaty
 
Emr zeppelin & Livy demystified
Emr zeppelin & Livy demystifiedEmr zeppelin & Livy demystified
Emr zeppelin & Livy demystifiedOmid Vahdaty
 
Zeppelin and spark sql demystified
Zeppelin and spark sql demystifiedZeppelin and spark sql demystified
Zeppelin and spark sql demystifiedOmid Vahdaty
 

Mehr von Omid Vahdaty (20)

Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
 
Couchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data DemystifiedCouchbase Data Platform | Big Data Demystified
Couchbase Data Platform | Big Data Demystified
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
 
Machine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data DemystifiedMachine Learning Essentials Demystified part1 | Big Data Demystified
Machine Learning Essentials Demystified part1 | Big Data Demystified
 
The technology of fake news between a new front and a new frontier | Big Dat...
The technology of fake news  between a new front and a new frontier | Big Dat...The technology of fake news  between a new front and a new frontier | Big Dat...
The technology of fake news between a new front and a new frontier | Big Dat...
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
Making your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data DemystifiedMaking your analytics talk business | Big Data Demystified
Making your analytics talk business | Big Data Demystified
 
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
BI STRATEGY FROM A BIRD'S EYE VIEW (How to become a trusted advisor) | Omri H...
 
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
AI and Big Data in Health Sector Opportunities and challenges | Big Data Demy...
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data Demystified
 
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
ALIGNING YOUR BI OPERATIONS WITH YOUR CUSTOMERS' UNSPOKEN NEEDS, by Eyal Stei...
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...
 
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Emr spark tuning demystified
Emr spark tuning demystifiedEmr spark tuning demystified
Emr spark tuning demystified
 
Emr zeppelin & Livy demystified
Emr zeppelin & Livy demystifiedEmr zeppelin & Livy demystified
Emr zeppelin & Livy demystified
 
Zeppelin and spark sql demystified
Zeppelin and spark sql demystifiedZeppelin and spark sql demystified
Zeppelin and spark sql demystified
 

Kürzlich hochgeladen

Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 

Kürzlich hochgeladen (20)

Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 

Introduction to streaming and messaging flume,kafka,SQS,kinesis

  • 1. Introduction to Streaming & Messaging Flume ,Kafka,SQS, Kinesis streams & firehose Kinesis Analytics Omid Vahdaty, Big Data Ninja
  • 2. What is batch Processing? the execution of a series of programs each on a set or "batch" of inputs, rather than a single input (which would instead be a custom job
  • 3. What is Streaming ? Streaming Data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of Kilobytes)
  • 4. Streaming VS. Batch Processing Batch Stream Data Scope Query the entire batch, with slight delay Query most recent events defined in a time window. Data Size Large data sets A few Individual records Latency? Minutes ,hours Seconds, Milliseconds Analysis Complex Analytics Basic: aggregations, metrics etc.
  • 5. Challenges with Streaming Data ● Processing layer ○ Consuming data ○ Processing data ○ Notifying storage layer what to do. ● Storage layer ○ Ordering mechanism ○ Strong Consistency mechanism ● In general MUST have features: ○ scalability ○ data durability ○ fault tolerance
  • 6. Messaging VS Streaming? ● Messaging: framed message based protocol. ● e.g 3 messages sent will look like: ○ Hello world ○ Hello world ○ Hello world ● Streaming: unframed data (bytes) stream based protocol ● e.g 3 messages sent will look like: ○ Hell ○ ow wo ○ rld Hel ○ low wor ○ ldHellow wo ○ rld
  • 8. Flume Flume Pros: Good documentation with many existing implementation patterns to follow Easy integration with existing monitoring framework Integration with Cloudera Manager to monitor Flume processes Flume Cons: Event rather that stream centric Calculating capacity is not an exact science but rather confirmed through trials Throughput is dependent on the channel backing store. Flume lacks the clear scaling and resiliency configurations (trivial with Kafka and Kinesis)
  • 9. Kafka Kafka Pros: High achievable ingest rates with clear scaling pattern High resiliency via distributed replicas with little impact on throughput Kafka Cons: No current framework for monitoring and configuring producers
  • 10. Flume VS. Kafka Flume Kafka Choose when you desire No need for customization. Need out of the box components such HDFS sink Need a custom made high availability delivery system Velocity high higher Event processing
  • 11. Flume Kafka Original Motivation distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store. Built around hadoop ecosystem general purpose distributed publish- subscribe messaging system Multi-consumer ultra-high availability messaging system. Data Flow push pull event availability JDBC Databases Channel, file Channel. Loose flume agent = losing data. replication of your events data by design. Commercial support Cloudera Cloudera Collectors built in Yes. just the messaging
  • 12. Use Case: Kafka and Flume combined ● Flume supports: Kafka source, Kafka channel, Kafka sink ● So, take the advantage of both and combine them to your needs.
  • 13. Use Case: Kafka as a Channel
  • 14. AWS SQS ● a fast, reliable, scalable, fully managed message queuing service ● decouple the components of a cloud application, move data between diverse, distributed application components without losing messages and without requiring each component to be always available. ● high throughput and at-least-once processing, and FIFO queues ● all messages are stored redundantly across multiple servers and data centers. ● Start with three API calls : SendMessage, ReceiveMessage, and DeleteMessage. Additional APIs are available to provide advanced functionality. ● Queues ○ Standard queues offer maximum throughput, best-effort ordering, and at-least-once delivery. ○ FIFO queues are designed to ensure strict ordering and exactly-once processing, with limited throughput. ● scales dynamically ● Authentication mechanisms
  • 15. AWS SQS use cases Messaging semantics (such as message-level ack/fail) and visibility timeout. For example, you have a queue of work items and want to track the successful completion of each item independently. Amazon SQS tracks the ack/fail, so the application does not have to maintain a persistent checkpoint/cursor. Amazon SQS will delete acked messages and redeliver failed messages after a configured visibility timeout. Individual message delay. For example, you have a job queue and need to schedule individual jobs with a delay. With Amazon SQS, you can configure individual messages to have a delay of up to 15 minutes. Dynamically increasing concurrency/throughput at read time. For example, you have a work queue and want to add more readers until the backlog is cleared. With Amazon Kinesis, you can scale up to a sufficient number of shards (note, however, that you'll need to provision enough shards ahead of time). Leveraging Amazon SQS’s ability to scale transparently. For example, you buffer requests and the load changes as a result of occasional load spikes or the natural growth of your business. Because each buffered request can be processed independently.
  • 16. Typical SQL use case: decoupling APP layers.
  • 18. AWS Kinesis (streams) ● build custom applications that process or analyze streams ● continuously capture and store terabytes of data per hour ● Hundreds sources ● allows for real-time data processing ● Easy to use, get started in minutes ○ Kinesis Client Library ○ Kinesis Producer Library ● allows you to have multiple Applications processing the same stream concurrently. ● The throughput can scale from megabytes to terabytes per hour ● synchronously replicates your streaming data across three AZ ● preserves your data for up to 7 days
  • 19. AWS Kinesis (streams) use cases ● Log and Event collection ● Mobile Data collection ● Real Time Analytics ○ when loading data from transactional databases into data warehouses. ○ Multi-stage processing using specialized algorithms ○ stream partitioning for finer control over scaling ● Gaming Data feed
  • 20. AWS Kinesis (streams) use cases Routing related records to the same record processor (as in streaming MapReduce). For example, counting and aggregation are simpler when all records for a given key are routed to the same record processor. Ordering of records. For example, you want to transfer log data from the application host to the processing/archival host while maintaining the order of log statements. Ability for multiple applications to consume the same stream concurrently. For example, you have one application that updates a real-time dashboard and another that archives data to Amazon Redshift. You want both applications to consume data from the same stream concurrently and independently. Ability to consume records in the same order a few hours later. For example, you have a billing application and an audit application that runs a few hours behind the billing application. Because Amazon Kinesis stores data for up to 24 hours, you can run the audit application up to 24 hours behind the billing application.
  • 21. AWS Kinesis (streams) Kinesis Pros: High achievable ingest rates with clear scaling pattern Similar throughput and resiliency characteristics to Kafka Integrates with other AWS services like EMR and Data Pipeline. Kinesis Cons: No current framework for monitoring and configuring producers Cloud service only. Possible increase in latency from source to Kinesis.
  • 23. AWS Kinesis Firehose ● the easiest way to load streaming data into AWS. ● capture, transform, and load streaming data ○ integrates into Kinesis Analytics, S3, Redshift, Elasticsearch Service ○ Serverless Transformation on RAW data. (lambda function) ■ E.g transform log file into CSV format ● Firehose can back up all untransformed records to your S3 bucket concurrently while delivering transformed records to the destination. You can enable source record backup ● enabling near real-time analytics ● Easy to use. ● Monitoring options. ● Limits ○ 20 stream per regions, Each stream ■ 2000 transaction per sec ■ 5000 records per sec ■ 5MB/s ■ Support 24 hours replay in cases on downtime
  • 24. Kinesys Firehose agent ● Java software app that send data to streams/firehose ● monitors a set of files for new data and then sends streams/firehose ● It handles file rotation, checkpointing, and retrial upon failures. ● supports Amazon CloudWatch so that you can closely monitor and troubleshoot the data flow from the agent. ● Data processing options: ○ SINGLELINE – This option converts a multi-line record to a single line record by removing newline characters, and leading and trailing spaces. ○ CSVTOJSON – This option converts a record from delimiter separated format to JSON format. ○ LOGTOJSON – This option converts a record from several commonly used log formats to JSON format. Currently supported log formats are Apache Common Log, Apache Combined Log, Apache Error Log, and RFC3164 (syslog). ● https://github.com/awslabs/amazon-kinesis-agent
  • 25. Write a JAVA agent to Firehose ● AWS java SDK ● Firehose API ○ Single record: PutRecord ○ Batch: PutRecordBatch. ● Key concepts: ○ Firehose delivery stream ○ Data producer - i.e web server creating log. ○ Record: The data of interest that your data producer sends to a Firehose delivery stream. A record can be as large as 1000 KB. ○ buffer size (in MB ) ○ buffer interval (seconds) ● Java examples: http://docs.aws.amazon.com/firehose/latest/dev/writing-with-sdk.html
  • 27. Kinesis Analytics : in-flight analytics. ● process streaming data in real time with standard SQL ● Amazon Kinesis Analytics enables you to create and run SQL queries on streaming data ● Easy 3 steps 1. Configure Input stream (kinesis stream, kinesis firehose) a. Automatically created Schema b. Manually change schema if you like 2. Write SQL query 3. Configure output stream: s3, redshift, elastics search ● Elastic: scale up down ● Managed service ● Standard SQL
  • 28. Kinesis Analytics : in-flight analytics.
  • 29. Stay in touch... ● Omid Vahdaty ● +972-54-2384178