Real-Time Processing Using AWS Lambda
Presenter: Sidhartha Chauhan, Solution Architect
Author: Cecilia Deng, SDE
March 2017 – AWS Loft New York
What to Expect from the Session
• What kinds of real-time events can trigger Lambda?
• How does Lambda pull and process streams?
• What are some stream processing behaviors?
• Hear how Thomson Reuters went real time with AWS Lambda
Flavors of real-time event sources
• Asynchronous invoke (push event source): e.g., S3 (async invoke)
• Synchronous invoke (push event source): e.g., Alexa skill (sync invoke)
• Stream (pull event source): e.g., DynamoDB update stream (pull, then sync invoke)
Real-time push
• Who?
• Any integrator that uses AWS Lambda invoke API
• E.g., Amazon S3, Amazon SNS, Amazon Alexa, AWS IoT
• What?
• Event sources sending events to Lambda for processing
• How?
• Real-time triggered events owned by event source
• Real-time processing owned by Lambda invoke methods
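As a rough sketch of the two push-style invoke methods, the Python snippet below builds the arguments for the Lambda Invoke API. The `InvocationType` values `RequestResponse` (synchronous) and `Event` (asynchronous) belong to that API; the function name `my-function`, the payload, and the helper names are hypothetical, and the boto3 call only runs if `invoke_lambda` is called with AWS credentials configured.

```python
import json


def build_invoke_args(function_name, payload, asynchronous=False):
    """Build keyword arguments for the Lambda Invoke API.

    asynchronous=True  -> "Event": Lambda queues the event for processing.
    asynchronous=False -> "RequestResponse": the caller waits for the result.
    """
    return {
        "FunctionName": function_name,
        "InvocationType": "Event" if asynchronous else "RequestResponse",
        "Payload": json.dumps(payload).encode("utf-8"),
    }


def invoke_lambda(function_name, payload, asynchronous=False):
    """Send the request with boto3 (needs AWS credentials; not run here)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("lambda")
    return client.invoke(**build_invoke_args(function_name, payload, asynchronous))


# "my-function" is a hypothetical function name used only for illustration.
sync_args = build_invoke_args("my-function", {"key": "value"})
async_args = build_invoke_args("my-function", {"key": "value"}, asynchronous=True)
print(sync_args["InvocationType"], async_args["InvocationType"])
```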
Real-time push
[Diagram: push event sources reach Lambda through either a synchronous invoke or an asynchronous invoke]
Real-time pull
• Who?
• Amazon Kinesis and DynamoDB update streams
• What?
• Lambda grabbing events from a stream for processing
• How?
• Mapping maintained by Lambda
• Real-time triggered events owned by the DynamoDB or Kinesis producer
• Real-time processing owned by the Lambda stream polling component and invoke methods
Real-time pull
[Diagram: a stream acting as a pull event source for Lambda]
Processing streams
Processing streams: Kinesis setup
• Streams
▪ Made up of shards
▪ Each shard supports writes up to 1 MB/s
▪ Each shard supports reads up to 2 MB/s
▪ Each shard supports 5 reads/s
• Data
▪ All data is stored and replayable for 24 hours by default
▪ Make sure partition key distribution is even to optimize parallel throughput
▪ Pick a key with more groups than shards
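To illustrate why the key should have more groups than shards, here is a minimal Python sketch of how partition keys spread over shards. Kinesis hashes each partition key with MD5 into a 128-bit integer and assigns the record to the shard whose hash key range contains it; the equal-sized ranges, the helper names, and the sample keys below are assumptions of this sketch.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit integer


def shard_for_key(partition_key, num_shards):
    """Map a partition key to a shard index.

    Assumes the shards split the 128-bit hash key space into equal ranges;
    a resharded stream may have uneven ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * num_shards // HASH_SPACE


def shard_load(partition_keys, num_shards):
    """Count how many records land on each shard for a list of keys."""
    return Counter(shard_for_key(key, num_shards) for key in partition_keys)


# Two distinct keys can occupy at most two of the four shards.
few_groups = shard_load(["eu", "us"] * 500, num_shards=4)
# Many distinct keys (hypothetical user IDs) spread over the shards.
many_groups = shard_load([f"user-{i}" for i in range(1000)], num_shards=4)
print(len(few_groups), len(many_groups))
```

With only two key values, at least two of the four shards sit idle however much data arrives, which caps parallel throughput.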
Processing streams: Lambda setup
• Memory
▪ CPU is proportional to the memory configured
▪ More memory means faster execution, if CPU bound
▪ More memory means larger record batches can be processed
• Timeout
▪ Increasing the timeout allows for longer-running functions, but a longer wait in case of errors
• Permission model
▪ The execution role defined for Lambda must have permission to access the stream
Processing streams: event source setup
• Batch size
▪ Max number of records that Lambda will send in one invocation
▪ Not equivalent to how many records Lambda gets from Kinesis
▪ Effective batch size is MIN(records available, batch size, 6 MB)
▪ Increasing batch size allows fewer Lambda function invocations with more data processed per function
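The MIN(...) rule above can be sketched in Python as follows. This is an illustration only: it treats 6 MB as 6 × 1024 × 1024 bytes and counts raw record sizes, ignoring any encoding overhead in the real payload; `next_batch` and its parameters are hypothetical names.

```python
MAX_PAYLOAD_BYTES = 6 * 1024 * 1024  # assumed byte value of the 6 MB limit


def next_batch(record_sizes, batch_size, max_payload_bytes=MAX_PAYLOAD_BYTES):
    """Return how many of the available records go into one invocation.

    record_sizes: sizes in bytes of the records currently available, in order.
    The count is bounded by the records available, the configured batch
    size, and the payload limit, whichever is reached first.
    """
    count = 0
    total_bytes = 0
    for size in record_sizes[:batch_size]:
        if total_bytes + size > max_payload_bytes:
            break
        total_bytes += size
        count += 1
    return count


low_traffic = next_batch([1000] * 50, batch_size=100)            # few records available
high_traffic = next_batch([1000] * 500, batch_size=100)          # batch size is the bound
large_records = next_batch([1024 * 1024] * 100, batch_size=100)  # payload limit is the bound
print(low_traffic, high_traffic, large_records)
```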
Processing streams: event source setup
• Starting Position:
▪ The position in the stream where Lambda starts reading
▪ Set to “Trim Horizon” for reading from start of stream (all data)
▪ Set to “Latest” to start just after the most recent record and read only newly arriving data
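A minimal sketch of how batch size and starting position come together when a mapping is created with boto3's `create_event_source_mapping`. The parameter names `EventSourceArn`, `FunctionName`, `BatchSize`, and `StartingPosition` are part of that API; the helper names and the function name `my-function` are hypothetical, and the AWS call itself is not executed here.

```python
# The two starting positions named on the slide, in their API spelling.
STARTING_POSITIONS = {"TRIM_HORIZON", "LATEST"}


def build_mapping_args(stream_arn, function_name, batch_size=100,
                       starting_position="TRIM_HORIZON"):
    """Build keyword arguments for create_event_source_mapping."""
    if starting_position not in STARTING_POSITIONS:
        raise ValueError(f"unsupported starting position: {starting_position}")
    if batch_size < 1:
        raise ValueError("batch size must be at least 1")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "BatchSize": batch_size,
        "StartingPosition": starting_position,
    }


def create_mapping(stream_arn, function_name, **options):
    """Create the mapping with boto3 (needs AWS credentials; not run here)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("lambda")
    return client.create_event_source_mapping(
        **build_mapping_args(stream_arn, function_name, **options)
    )


mapping_args = build_mapping_args(
    "arn:aws:kinesis:us-west-2:35667example:stream/examplestream",
    "my-function",  # hypothetical function name
    batch_size=200,
    starting_position="LATEST",
)
print(mapping_args["StartingPosition"], mapping_args["BatchSize"])
```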
Processing streams: event source setup
• Multiple functions can be mapped to one stream
• Multiple streams can be mapped to one Lambda function
• Each mapping is a unique pair of Kinesis stream and Lambda function
• Each mapping has its own shard iterators
[Diagram: Amazon Kinesis 1, Amazon Kinesis 2, AWS Lambda 1, AWS Lambda 2, Amazon CloudWatch, Amazon DynamoDB, Amazon S3]
Processing streams: under the hood
• Event received by Lambda function is a collection of records from the stream
{
  "Records": [
    {
      "kinesis": {
        "partitionKey": "partitionKey-3",
        "kinesisSchemaVersion": "1.0",
        "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4=",
        "sequenceNumber": "49545115243490985018280067714973144582180062593244200961"
      },
      "eventSource": "aws:kinesis",
      "eventID": "shardId-000000000000:49545115243490985018280067714973144582180062593244200961",
      "invokeIdentityArn": "arn:aws:iam::account-id:role/testLEBRole",
      "eventVersion": "1.0",
      "eventName": "aws:kinesis:record",
      "eventSourceARN": "arn:aws:kinesis:us-west-2:35667example:stream/examplestream",
      "awsRegion": "us-west-2"
    }
  ]
}
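A function receiving the event above might unpack it roughly as follows. This is a sketch rather than a prescribed handler: the `data` field is base64-encoded, and the code assumes the producer wrote UTF-8 text. The sample event is trimmed to the fields the handler reads.

```python
import base64


def handler(event, context):
    """Decode every Kinesis record in the batch and return the payloads."""
    payloads = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(raw.decode("utf-8"))  # assumes UTF-8 text payloads
    return payloads


sample_event = {
    "Records": [
        {
            "kinesis": {
                "partitionKey": "partitionKey-3",
                "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4=",
            },
            "eventSource": "aws:kinesis",
        }
    ]
}
decoded = handler(sample_event, None)
print(decoded)  # ['Hello, this is a test 123.']
```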
Processing streams: under the hood
• Polling
▪ Concurrent polling and processing per shard
▪ Currently, polls every 1 s for Kinesis streams if no records found
▪ Currently, polls every 250 ms for DDB Streams if no records found
▪ Grabs as much as possible in one GetRecords call
• Batching
▪ Sub-batches in memory for the invocation payload
• Synchronous invocation
▪ Batches invoked as synchronous RequestResponse type
▪ Lambda honors Kinesis at-least-once semantics
▪ Each shard blocks on in-order synchronous invocation
Processing streams: under the hood
• Per Shard:
▪ Lambda calls GetRecords with the max limit from Kinesis (10,000 records or 10 MB)
▪ If no records, waits some time
▪ In memory, sub-batches and formats records into the Lambda payload
▪ Invokes Lambda with a synchronous invoke
[Diagram: source → Kinesis shards → Lambda polling logic (polls) → batch sync invokes. Scale Kinesis by adding shards; Lambda will scale automatically.]
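The per-shard steps above can be simulated in a few lines of Python. This is not Lambda's actual polling code, only a sketch: a list stands in for one shard, a callable stands in for the synchronous invoke, and the wait on an empty shard, the payload size limit, and retries are left out. All names are hypothetical.

```python
def process_shard(shard_records, invoke, batch_size, get_limit=10_000):
    """Drain one shard: fetch, sub-batch in memory, invoke in order.

    shard_records: list standing in for the records of a single shard.
    invoke: callable standing in for a synchronous Lambda invocation.
    get_limit: stand-in for the per-call GetRecords record limit.
    Returns the number of invocations made.
    """
    position = 0
    invocations = 0
    while position < len(shard_records):
        # One "GetRecords" call grabs as much as the limit allows.
        fetched = shard_records[position:position + get_limit]
        position += len(fetched)
        # Sub-batch the fetched records and invoke one batch at a time, so
        # the shard's records are handed to the function strictly in order.
        for start in range(0, len(fetched), batch_size):
            invoke({"Records": fetched[start:start + batch_size]})
            invocations += 1
    return invocations


seen = []
count = process_shard(list(range(25)), lambda event: seen.extend(event["Records"]),
                      batch_size=10)
print(count, seen == list(range(25)))
```

Because each invoke returns before the next batch is sent, one slow batch delays everything behind it on that shard, which is the blocking behavior the slides describe.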
Processing streams: how it works
▪ Lambda blocks on ordered processing for each individual shard
▪ Increasing # of shards with even distribution allows increased concurrency
▪ Batch size may impact duration if the Lambda function takes longer to process more records
Processing streams: under the hood
▪ Retries execution failures until the record expires
▪ Retries with exponential backoff up to 60 s
▪ Throttles and errors increase duration and directly reduce throughput
[Diagram: source → Kinesis shards → Lambda polling logic (polls) → batch sync invokes: invoke fail, invoke fail, invoke success. Scale Kinesis by adding shards; Lambda will scale automatically.]
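The retry behavior can be sketched as below. Only the 60 s cap comes from the slide; the 1 s starting delay, the doubling factor, and the use of an attempt limit as a stand-in for record expiry are assumptions of this example, and all names are hypothetical.

```python
def backoff_delays(attempts, base_seconds=1.0, cap_seconds=60.0):
    """Delay before each retry: doubles every time, never above the cap."""
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(attempts)]


def invoke_until_success(invoke, batch, max_attempts, sleep=lambda seconds: None):
    """Retry the same batch until invoke succeeds or attempts run out.

    max_attempts stands in for record expiry. Returns the number of attempts
    used, or None if every attempt failed.
    """
    for attempt, delay in enumerate(backoff_delays(max_attempts), start=1):
        try:
            invoke(batch)
            return attempt
        except Exception:
            sleep(delay)  # the shard stays blocked while waiting to retry
    return None


calls = {"n": 0}


def flaky_invoke(batch):
    """Fail twice, then succeed, mirroring the diagram's fail/fail/success."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated invoke failure")


slept = []
attempts_used = invoke_until_success(flaky_invoke, ["r1", "r2"], max_attempts=5,
                                     sleep=slept.append)
print(attempts_used, slept)
```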
Processing streams: under the hood
▪ Maximum theoretical throughput:
• # shards * 2 MB/s
▪ Effective theoretical throughput:
• ( # shards * batch size (MB) ) / ( function duration (s) * retries until expiry )
▪ If the put / ingestion rate is greater than the theoretical throughput, consider increasing the number of shards or optimizing function duration to increase throughput
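As a worked example of the two formulas, the Python sketch below plugs in hypothetical numbers. It reads "retries until expiry" as the number of times each batch is invoked, with 1 meaning no failures; that reading is an assumption of this sketch.

```python
def max_throughput_mb_s(num_shards, read_mb_per_shard_s=2.0):
    """Maximum theoretical throughput: # shards * 2 MB/s."""
    return num_shards * read_mb_per_shard_s


def effective_throughput_mb_s(num_shards, batch_mb, duration_s, attempts=1):
    """Effective throughput: (# shards * batch MB) / (duration s * attempts).

    attempts: how many times each batch is invoked (1 = no retries).
    """
    return (num_shards * batch_mb) / (duration_s * attempts)


# Hypothetical stream: 4 shards, 1 MB batches, 2 s per invocation.
ceiling = max_throughput_mb_s(4)
healthy = effective_throughput_mb_s(4, batch_mb=1.0, duration_s=2.0)
retrying = effective_throughput_mb_s(4, batch_mb=1.0, duration_s=2.0, attempts=2)
print(ceiling, healthy, retrying)  # 8.0 2.0 1.0
```

In this example one retry per batch halves the effective throughput, and even the healthy case sits well under the 8 MB/s ceiling because the 2 s function duration dominates.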
Processing streams: how it looks
• GetRecords (effective throughput): bytes, latency, records, etc.
• PutRecord: bytes, latency, records, etc.
• GetRecords.IteratorAgeMilliseconds: how old your last processed records were. If high, processing is falling behind. If close to 24 hours, records are close to being dropped.
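One way to act on the iterator age metric is to compare it with the retention period. The sketch below is illustrative only: the 50% and 90% thresholds and the function name are hypothetical choices, and the 24-hour retention is the default mentioned earlier in the deck.

```python
RETENTION_MS = 24 * 60 * 60 * 1000  # default 24-hour retention


def iterator_age_status(age_ms, retention_ms=RETENTION_MS,
                        warn_fraction=0.5, critical_fraction=0.9):
    """Classify an iterator age reading against the retention window.

    The thresholds are hypothetical and should be tuned per workload.
    """
    fraction = age_ms / retention_ms
    if fraction >= critical_fraction:
        return "critical"  # records are close to being dropped
    if fraction >= warn_fraction:
        return "warning"   # processing is falling behind
    return "ok"


one_minute = iterator_age_status(60 * 1000)
thirteen_hours = iterator_age_status(13 * 60 * 60 * 1000)
twenty_three_hours = iterator_age_status(23 * 60 * 60 * 1000)
print(one_minute, thirteen_hours, twenty_three_hours)
```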
Processing streams: how it looks
Amazon CloudWatch Metrics
• Invocation count
• Duration
• Error count
• Throttle count
Amazon CloudWatch Logs
• All Metrics
• Custom logs
• RAM consumed
Processing streams: how it looks
Common observations:
▪ Effective batch size may be less than configured during low throughput
▪ Effective batch size will increase during higher throughput
▪ Increased Lambda duration -> decreased # of invokes and GetRecords calls
▪ Too many consumers of your stream may compete for Kinesis read limits and induce ReadProvisionedThroughputExceeded errors and metrics
ANALYSING USAGE OF THOMSON REUTERS PRODUCTS WITH AWS
Anders Fritz & Marco Pierleoni
CHALLENGE
• To identify and define a solution for usage analytics tracking that enables product teams to take ownership of the usage data collected. In addition to tracking and visualizing usage data, it had to:
1. Cross-reference usage with business data.
2. Follow TR security & compliance rules.
3. Auto scale as data flow fluctuates.
4. Require limited maintenance.
5. Launch in 5 months.
SOLUTION
OUTCOME
• Product Insight is live; adoption rate is high.
• Tested 4,000 requests per second while targeting 5bn requests / month.
• Since March: very little maintenance required
▪ No outages
▪ No downtime
• CloudWatch monitors everything.
• Latency: data visible on chart within 10 seconds
• Brexit and the US elections tested autoscaling.
▪ US elections ~16m events (normally ~6-8m events / day)
▪ UK EU referendum (Brexit) ~10m events (normally ~5m events / day)
EVENTS CAPTURED
[Chart: UK EU Referendum, June 23rd (Brexit); #events over time]
EVENTS CAPTURED
[Chart: US Elections, November 8th; #events over time]
aws.amazon.com/activate
Everything and Anything Startups Need to Get Started on AWS

Real-time Data Processing using AWS Lambda

  • 1. Real-Time Processing Using AWS Lambda Presenter: Sidhartha Chauhan, Solution Architect Author: Cecilia Deng, SDE March 2017 – AWS Loft New York
  • 2. What to Expect from the Session • What kinds of real-time events can trigger Lambda? • How does Lambda pull and process streams? • What are some stream processing behaviors? • Hear how Thomson Reuters went real time with AWS Lambda
  • 3. Flavors of real-time event sources • Asynchronous invoke, push event source (e.g., S3: async invoke) • Synchronous invoke, push event source (e.g., Alexa skill: sync invoke) • Stream, pull event source (e.g., DynamoDB update stream: pull then sync invoke)
  • 5. Real-time push • Who? • Any integrator that uses AWS Lambda invoke API • E.g., Amazon S3, Amazon SNS, Amazon Alexa, AWS IoT • What? • Event sources sending events to Lambda for processing • How? • Real-time triggered events owned by event source • Real-time processing owned by Lambda invoke methods
  • 6. Real-time push Synchronous Invoke Push Event Source Asynchronous Invoke Push Event Source
  • 8. Real-time pull • Who? • Amazon Kinesis and DynamoDB update streams • What? • Lambda grabbing events from a stream for processing • How? • Mapping maintained by Lambda • Real-time triggered events owned by DDB or Kinesis producer • Real-time processing owned by Lambda stream polling component and invoke methods
  • 11. Processing streams: Kinesis setup • Streams ▪ Made up of shards ▪ Each shard supports writes up to 1 MB/s ▪ Each shard supports reads up to 2 MB/s ▪ Each shard supports 5 reads/s • Data ▪ All data is stored and replayable for 24 hours by default ▪ Make sure partition key distribution is even to optimize parallel throughput ▪ Pick a key with more groups than shards
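The advice on partition keys can be checked offline before data is written. The sketch below relies on the documented Kinesis behavior of hashing each partition key with MD5 into a 128-bit integer. It also assumes the shards split that hash space into equal ranges, which holds for a freshly created stream but not necessarily after resharding. The function names `shard_for_key` and `distribution` are hypothetical.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit hash key


def shard_for_key(partition_key, shard_count):
    """Map a partition key to a shard index, assuming equal hash key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


def distribution(keys, shard_count):
    """Count how many of the given keys land on each shard index."""
    return Counter(shard_for_key(key, shard_count) for key in keys)


if __name__ == "__main__":
    sample_keys = [f"device-{i}" for i in range(1000)]  # hypothetical keys
    print(distribution(sample_keys, 4))
```

A key with only a handful of distinct values (fewer groups than shards) leaves some shards idle, which is what the last bullet of the slide warns against.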
  • 12. Processing streams: Lambda setup Memory ▪ CPU is proportional to the memory configured ▪ More memory means faster execution, if CPU bound ▪ More memory means larger sized record batches can be processed Timeout • Increasing timeout allows for longer functions, but more wait in case of errors Permission model • The execution role defined for Lambda must have permission to access the stream
  • 13. Processing streams: event source setup • Batch size ▪ Max number of records that Lambda will send in one invocation ▪ Not equivalent to how many records Lambda gets from Kinesis ▪ Effective batch size is MIN(records available, batch size, 6 MB) ▪ Increasing batch size allows fewer Lambda function invocations with more data processed per function
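The MIN rule for the effective batch size can be sketched as a small function. This is a simplification, not Lambda's actual implementation: it treats the 6 MB cap as a limit on the summed record sizes and ignores JSON and base64 overhead in the payload. The name `effective_batch` is hypothetical.

```python
PAYLOAD_LIMIT_BYTES = 6 * 1024 * 1024  # 6 MB cap from the slide


def effective_batch(record_sizes, batch_size, payload_limit=PAYLOAD_LIMIT_BYTES):
    """Return how many of the available records go into one invocation.

    record_sizes: sizes in bytes of the records currently available, in order.
    batch_size:   the configured maximum number of records per invocation.
    """
    count = 0
    total = 0
    for size in record_sizes:
        # Stop at whichever limit is hit first: record count or payload size.
        if count >= batch_size or total + size > payload_limit:
            break
        total += size
        count += 1
    return count
```

With few records available the batch is limited by availability, with many small records by the configured batch size, and with large records by the payload cap.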
  • 14. Processing streams: event source setup • Starting Position: ▪ The position in the stream where Lambda starts reading ▪ Set to “Trim Horizon” for reading from start of stream (all data) ▪ Set to “Latest” for reading from just after the most recent record (only newly arriving data)
  • 15. Processing streams: event source setup • Multiple functions can be mapped to one stream • Multiple streams can be mapped to one Lambda function • Each mapping is a unique pair of Kinesis stream and Lambda function • Each mapping has unique shard iterators (Diagram labels: Amazon Kinesis 1, Amazon Kinesis 2, AWS Lambda 1, AWS Lambda 2, Amazon CloudWatch, Amazon DynamoDB, Amazon S3)
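The event source settings from the last three slides (batch size, starting position, one mapping per stream and function pair) can be collected into the keyword arguments that the boto3 Lambda client's `create_event_source_mapping` call accepts. The sketch below only builds and validates the arguments and does not call AWS. The helper name `mapping_params`, the function name `my-stream-processor`, and the default batch size are assumptions for illustration.

```python
VALID_POSITIONS = {"TRIM_HORIZON", "LATEST"}


def mapping_params(stream_arn, function_name, batch_size=100,
                   starting_position="TRIM_HORIZON"):
    """Build keyword arguments for one stream-to-function mapping."""
    if starting_position not in VALID_POSITIONS:
        raise ValueError(f"unsupported starting position: {starting_position}")
    if batch_size < 1:
        raise ValueError("batch size must be at least 1")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "BatchSize": batch_size,
        "StartingPosition": starting_position,
        "Enabled": True,
    }


if __name__ == "__main__":
    params = mapping_params(
        "arn:aws:kinesis:us-west-2:35667example:stream/examplestream",
        "my-stream-processor",  # hypothetical function name
        batch_size=200,
        starting_position="LATEST",
    )
    print(params)
    # With boto3 installed and credentials configured, this could be passed as:
    #   boto3.client("lambda").create_event_source_mapping(**params)
```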
  • 16. Processing streams: under the hood • Event received by Lambda function is a collection of records from the stream
      {
        "Records": [
          {
            "kinesis": {
              "partitionKey": "partitionKey-3",
              "kinesisSchemaVersion": "1.0",
              "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4=",
              "sequenceNumber": "49545115243490985018280067714973144582180062593244200961"
            },
            "eventSource": "aws:kinesis",
            "eventID": "shardId-000000000000:49545115243490985018280067714973144582180062593244200961",
            "invokeIdentityArn": "arn:aws:iam::account-id:role/testLEBRole",
            "eventVersion": "1.0",
            "eventName": "aws:kinesis:record",
            "eventSourceARN": "arn:aws:kinesis:us-west-2:35667example:stream/examplestream",
            "awsRegion": "us-west-2"
          }
        ]
      }
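A function receiving this event has to base64-decode the `data` field of each record. The handler below is a minimal sketch of that step. It assumes the producer wrote UTF-8 text, and the name `handler` is just a conventional choice.

```python
import base64


def handler(event, context):
    """Decode every Kinesis record in the batch and return the payloads as text."""
    payloads = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(raw.decode("utf-8"))  # assumes UTF-8 text payloads
    return payloads


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {
                "kinesis": {
                    "partitionKey": "partitionKey-3",
                    "data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0IDEyMy4=",
                },
                "eventSource": "aws:kinesis",
            }
        ]
    }
    print(handler(sample_event, None))  # ['Hello, this is a test 123.']
```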
  • 17. Processing streams: under the hood • Polling ▪ Concurrent polling and processing per shard ▪ Currently, polls every 1 s for Kinesis if no records found ▪ Currently, polls every 250 ms for DDB Streams if no records found ▪ Grab as much as possible in one GetRecords call • Batching ▪ Sub-batch in memory for invocation payload • Synchronous invocation ▪ Batches invoked as synchronous RequestResponse type ▪ Lambda honors Kinesis at-least-once semantics ▪ Each shard blocks on in-order synchronous invocation
  • 18. Processing streams: under the hood • Per shard: ▪ Lambda calls GetRecords with the max limit from Kinesis (10,000 records or 10 MB) ▪ If no records, wait some time ▪ From memory, sub-batches and formats records into the Lambda payload ▪ Invokes Lambda with a synchronous invoke (Diagram labels: Source; Kinesis shards, scale Kinesis by adding shards; polls; Lambda polling logic; batch sync invokes; Lambda will scale automatically)
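The per-shard steps above can be sketched as a loop that splits one fetched set of records into sub-batches and invokes the function on each in order. This is an illustration under assumptions, not Lambda's internal code: `sub_batches` and `process_shard` are hypothetical names, and `invoke` stands in for a synchronous Lambda invocation.

```python
def sub_batches(records, batch_size):
    """Yield consecutive slices of at most batch_size records, preserving order."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def process_shard(records, batch_size, invoke):
    """Invoke the function once per sub-batch, one after another.

    Each call to invoke() returns before the next sub-batch is sent, which is
    what keeps processing ordered within a shard.
    """
    results = []
    for batch in sub_batches(records, batch_size):
        results.append(invoke({"Records": batch}))
    return results


if __name__ == "__main__":
    fetched = [{"seq": i} for i in range(5)]  # stand-in for one GetRecords result
    sizes = process_shard(fetched, 2, lambda event: len(event["Records"]))
    print(sizes)  # [2, 2, 1]
```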
  • 19. Processing streams: how it works ▪ Lambda blocks on ordered processing for each individual shard ▪ Increasing # of shards with even distribution allows increased concurrency ▪ Batch size may impact duration if the Lambda function takes longer to process more records (Diagram labels: Source; Kinesis shards, scale Kinesis by adding shards; polls; Lambda polling logic; batch sync invokes; Lambda will scale automatically)
  • 20. Processing streams: under the hood ▪ Retry execution failures until the record is expired ▪ Retry with exponential backoff up to 60 s ▪ Throttles and errors impact duration and directly impact throughput (Diagram labels: Source; Kinesis, scale Kinesis by adding shards; polls; Lambda polling logic; batch sync invokes: invoke fail, invoke fail, invoke success; Lambda will scale automatically)
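The retry behavior can be sketched as exponential backoff with a 60 s cap, repeated until the batch succeeds or the record would expire. Only the 60 s cap and the retry-until-expiry rule come from the slide. The 1 s starting delay, the doubling factor, and the names `backoff_delays` and `retry_batch` are assumptions for illustration.

```python
from itertools import islice


def backoff_delays(base=1.0, cap=60.0, factor=2.0):
    """Yield retry delays in seconds that grow exponentially up to the cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def retry_batch(invoke, batch, expiry_s, sleep=lambda seconds: None):
    """Retry invoke(batch) until it succeeds or the wait would pass expiry_s.

    Returns (succeeded, attempts). The shard stays blocked on this batch the
    whole time, which is why failures reduce throughput.
    """
    waited = 0.0
    attempts = 0
    for delay in backoff_delays():
        attempts += 1
        if invoke(batch):
            return True, attempts
        if waited + delay > expiry_s:
            return False, attempts
        sleep(delay)  # no-op by default so the sketch runs instantly
        waited += delay


if __name__ == "__main__":
    print(list(islice(backoff_delays(), 8)))
    outcomes = iter([False, False, True])  # fail, fail, success as in the diagram
    print(retry_batch(lambda batch: next(outcomes), [], expiry_s=24 * 3600))
```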
  • 21. Processing streams: under the hood ▪ Maximum theoretical throughput: # shards * 2 MB/s ▪ Effective theoretical throughput: (# shards * batch size (MB)) / (function duration (s) * retries until expiry) ▪ If the put / ingestion rate is greater than the theoretical throughput, consider increasing the number of shards or optimizing function duration to increase throughput
  • 22. Processing streams: how it looks • GetRecords (effective throughput): bytes, latency, records, etc. • PutRecord: bytes, latency, records, etc. • GetRecords.IteratorAgeMilliseconds: how old your last processed records were. If high, processing is falling behind. If close to 24 hours, records are close to being dropped.
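The two throughput formulas can be turned into a quick capacity check. In this sketch the "retries until expiry" term is interpreted as the average number of invocation attempts per batch (1 when nothing fails), which is an assumption about the slide's intent. The function names and example numbers are hypothetical.

```python
def max_throughput_mb_s(shards):
    """Maximum theoretical read throughput: 2 MB/s per shard."""
    return shards * 2.0


def effective_throughput_mb_s(shards, batch_mb, duration_s, attempts=1.0):
    """Effective throughput: data per batch divided by the time spent on it."""
    return (shards * batch_mb) / (duration_s * attempts)


def is_falling_behind(ingest_mb_s, shards, batch_mb, duration_s, attempts=1.0):
    """True when the ingestion rate exceeds what the function can drain."""
    return ingest_mb_s > effective_throughput_mb_s(
        shards, batch_mb, duration_s, attempts
    )


if __name__ == "__main__":
    # Hypothetical setup: 4 shards, 1 MB batches, 2 s per invocation.
    print(max_throughput_mb_s(4))                  # 8.0
    print(effective_throughput_mb_s(4, 1.0, 2.0))  # 2.0
    print(is_falling_behind(3.0, 4, 1.0, 2.0))     # True
```

In the example, ingestion at 3 MB/s outpaces the 2 MB/s the function drains, so either more shards or a shorter function duration would be needed.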
  • 23. Processing streams: how it looks Amazon CloudWatch Metrics • Invocation count • Duration • Error count • Throttle count Amazon CloudWatch Logs • All Metrics • Custom logs • RAM consumed
  • 24. Processing streams: how it looks Common observations: ▪ Effective batch size may be less than configured during low throughput ▪ Effective batch size will increase during higher throughput ▪ Increased Lambda duration -> decreased # of invokes and GetRecords calls ▪ Too many consumers of your stream may compete for Kinesis read limits and induce ReadProvisionedThroughputExceeded errors and metrics
  • 25. ANALYSING USAGE OF THOMSON REUTERS PRODUCTS WITH AWS Anders Fritz & Marco Pierleoni
  • 26. CHALLENGE • To identify and define a solution for usage analytics tracking that enables product teams to take ownership of the usage data collected. In addition to tracking and visualizing usage data, it had to: 1. Cross-reference usage with business data. 2. Follow TR Security & Compliance rules. 3. Auto scale as data flow fluctuates. 4. Require limited maintenance. 5. Launch in 5 months.
  • 34. OUTCOME • Product Insight is live – adoption rate high. • Tested 4,000 requests per second while targeting 5bn requests / month. • Since March – very little maintenance required • No outages • No downtime • CloudWatch monitors everything. • Latency – data visible on chart within 10 seconds • BrExit and US elections tested autoscaling. • US elections ~16m events – normally ~6-8m events / day. • UK EU referendum (BrExit) ~10m events – normally ~5m events / day
  • 35. EVENTS CAPTURED UK EU Referendum June 23rd (BrExit) time #events
  • 36. EVENTS CAPTURED US Elections November 8th time #events
  • 37. aws.amazon.com/activate Everything and Anything Startups Need to Get Started on AWS