© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Vyom Nagrani, Sr. Product Manager, AWS Lambda
June 16, 2015
Dynamic Data Ingestion with
Amazon S3 and AWS Lambda
Amazon S3 Event Notifications: Integrating
storage and workflows
Delivers notifications to Amazon SNS, Amazon SQS, or AWS
Lambda when events occur in Amazon S3
[Diagram: Amazon S3 events are delivered as notifications to an SNS topic, an SQS queue, or a Lambda function.]
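As a rough illustration of what a Lambda function sees when S3 delivers a notification, the sketch below pulls the bucket, key, and event name out of an event. The sample event is trimmed to the fields used here, and the helper name `extract_s3_objects` and the bucket and key values are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key, event_name) tuples from an S3 notification event."""
    results = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if s3 is None:
            continue  # skip records that did not come from S3
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded; a space is delivered as '+'.
        key = unquote_plus(s3["object"]["key"])
        results.append((bucket, key, record.get("eventName", "")))
    return results


def handler(event, context):
    """Lambda entry point: log every object named in the notification."""
    for bucket, key, event_name in extract_s3_objects(event):
        print(f"{event_name}: s3://{bucket}/{key}")


# Trimmed, illustrative event; real notifications carry more fields.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "videos/my+clip.mp4", "size": 1024},
            },
        }
    ]
}
```

Calling `handler(sample_event, None)` locally is a quick way to exercise the parsing before wiring the function to a bucket.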
Benefits of Amazon S3 Notifications for dynamic
data ingestion
• Integration – a new surface on the Amazon S3 “building block” for event-based computing
• Speed – typical time to send notifications is less than a second
• Simplicity – avoids proxies or polling to detect changes
[Diagram: notifications replace a proxy or a list/diff polling loop.]
AWS Lambda: A compute service that runs
your code in response to events
Lambda functions: Stateless, event-driven code execution
Triggered by events:
• Put to an Amazon S3 bucket
• Record in an Amazon Kinesis stream
• Direct sync and async invocations
Makes it easy to
• Build back-end services that perform at scale
• Perform data-driven auditing, analysis, and notification
Benefits of AWS Lambda for building a serverless data processing engine
“Productivity focused compute platform to build powerful, dynamic,
modular applications in the cloud”
• No infrastructure to manage: Focus on business logic, not infrastructure. You upload code; AWS Lambda handles everything else.
• High performance at any scale; cost-effective and efficient: Pay only for what you use. Lambda automatically matches capacity to your request rate. Purchase compute in 100ms increments.
• Bring your own code: Run code in a choice of standard languages. Use threads, processes, files, and shell scripts normally.
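To make the 100ms purchasing increment concrete, here is a small arithmetic sketch. It assumes the execution time is rounded up to the next increment; the function name is hypothetical and no prices are implied.

```python
def billed_duration_ms(actual_ms, increment_ms=100):
    """Round an execution time up to the next billing increment (integer milliseconds)."""
    if actual_ms < 0:
        raise ValueError("duration cannot be negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return ((actual_ms + increment_ms - 1) // increment_ms) * increment_ms
```

Under this assumption a function that runs for 230 ms is accounted as 300 ms, which is one reason shaving a few milliseconds only pays off when it crosses an increment boundary.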
What you can do with S3+Lambda
Customers have told us about powerful applications …
… and we look forward to seeing what you create.
Today’s demo #1: Workflow of a simple video
transcoding application
[Diagram: a new video uploaded to Amazon S3 sends a notification to AWS Lambda, which writes its output to Amazon S3.]
Walkthrough of setting up S3 event notifications and
Lambda functions through the AWS Console
Code walkthrough for video clip transcode
• Set up variables
• Serialize steps
• Get file from S3
• Write to disk
• Transcode with ffmpeg
• Read from disk
• Upload to S3
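The demo code itself is not reproduced in this transcript, so the following is only a minimal sketch of the same steps (get file from S3, write to disk, transcode, read from disk, upload). It assumes an S3 client with boto3-style `download_file` and `upload_file` methods is passed in, that an `ffmpeg` binary is bundled with the function, and that output goes to a separate destination bucket; the function names and the `.webm` target are hypothetical.

```python
import os
import subprocess
import tempfile


def build_ffmpeg_command(src_path, dst_path, ffmpeg_bin="ffmpeg"):
    """Plain conversion command; ffmpeg infers the formats from the file extensions."""
    return [ffmpeg_bin, "-y", "-i", src_path, dst_path]


def transcode_object(s3_client, src_bucket, key, dst_bucket,
                     target_ext=".webm", run=subprocess.run):
    """Download one object, transcode it on local disk, and upload the result."""
    src_ext = os.path.splitext(key)[1]
    dst_key = os.path.splitext(key)[0] + target_ext
    # Scratch space is created under the system temp directory and removed on exit.
    with tempfile.TemporaryDirectory() as workdir:
        src_path = os.path.join(workdir, "input" + src_ext)
        dst_path = os.path.join(workdir, "output" + target_ext)
        s3_client.download_file(src_bucket, key, src_path)         # get file, write to disk
        run(build_ffmpeg_command(src_path, dst_path), check=True)  # transcode
        s3_client.upload_file(dst_path, dst_bucket, dst_key)       # read from disk, upload
    return dst_key
```

Writing to a different bucket than the one that triggers the function is a deliberate choice in this sketch: uploading the output into the source bucket would raise a new notification and could re-trigger the function.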
Demo #1: Automatic video
transcoding with Amazon S3 and
AWS Lambda
Potential further additions to a production
video transcoding application
• Include custom transcoding/watermarking libraries
• Break longer video files into smaller clips, transcode each clip separately
• Transcode to multiple formats by running multiple Lambda functions in parallel
• Send S3 event notification to an SNS topic
• Subscribe multiple Lambda functions to that SNS topic
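When the S3 notification is routed through an SNS topic, each subscribed Lambda function receives the S3 event wrapped inside the SNS message body. A minimal sketch of unwrapping it, with a hypothetical helper name, might look like this:

```python
import json


def s3_records_from_sns(event):
    """Collect the S3 records carried inside SNS-delivered messages."""
    records = []
    for record in event.get("Records", []):
        message = record.get("Sns", {}).get("Message")
        if not message:
            continue  # not an SNS record
        try:
            payload = json.loads(message)
        except ValueError:
            continue  # message body is not JSON
        if isinstance(payload, dict):
            records.extend(payload.get("Records", []))
    return records
```

Each transcoding function subscribed to the topic can call this first and then process the S3 records exactly as it would for a direct S3 trigger.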
Today’s demo #2: Workflow of infrastructure
monitoring and automation application
[Diagram: AWS CloudTrail, Amazon S3, a notification to AWS Lambda, Amazon SNS, and AWS IAM; part of the flow is marked optional.]
Code walkthrough for infrastructure monitoring
• Get file from S3
• Unzip it
• Parse it
• Check activity
• Find patterns
• Take action
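The unzip, parse, and check steps can be sketched as below. This is not the demo's code: it assumes the CloudTrail log file has already been fetched from S3 as gzip-compressed bytes, and the watched event names and helper names are illustrative choices only.

```python
import gzip
import json

# Hypothetical watch list; pick the API calls that matter for your own policy.
WATCHED_IAM_EVENTS = frozenset({"CreateUser", "DeleteUser", "CreateAccessKey", "AttachUserPolicy"})


def parse_cloudtrail_log(gz_bytes):
    """Unzip a CloudTrail log file and return its list of event records."""
    document = json.loads(gzip.decompress(gz_bytes).decode("utf-8"))
    return document.get("Records", [])


def find_watched_activity(records, source="iam.amazonaws.com", names=WATCHED_IAM_EVENTS):
    """Keep only records from the given service whose event name is on the watch list."""
    return [r for r in records
            if r.get("eventSource") == source and r.get("eventName") in names]


def format_alert(record):
    """One-line summary suitable for publishing to an SNS topic."""
    who = record.get("userIdentity", {}).get("arn", "unknown")
    return f"{record['eventName']} by {who} at {record.get('eventTime', 'unknown time')}"
```

The "take action" step is left out on purpose; publishing `format_alert(...)` to a topic or calling IAM would be a policy decision specific to each account.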
Demo #2: Infrastructure
monitoring and automation using
AWS CloudTrail, Amazon S3 and
AWS Lambda
Potential further additions to a production
infrastructure monitoring and automation
• In addition to monitoring and alarming, create automated actions in response to policy
violations or suspicious activity
• Create .config file with multiple check points
• Each check can have a different SNS topic to alarm against
• Aggregate CloudTrail log files to be delivered to a single admin S3 bucket across all your
AWS accounts
Today’s demo #3: Workflow of automated file
de-duplication on upload
[Diagram: a new file uploaded to Amazon S3 sends a notification to AWS Lambda, which works against Amazon S3 and Amazon DynamoDB; part of the flow is marked optional.]
Code walkthrough for automated file de-duplication
• Get headObject
• List other objects
• Compare eTags
• Take action
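A minimal sketch of the headObject, list, and compare steps follows. It assumes a boto3-style S3 client is passed in (`head_object` and the `list_objects_v2` paginator); the function names are hypothetical and the "take action" step is left to the caller.

```python
def find_duplicates(new_key, new_etag, listing):
    """Return keys in the listing whose ETag matches the new object's ETag."""
    # ETags are returned wrapped in double quotes; strip them before comparing.
    wanted = new_etag.strip('"')
    return [obj["Key"] for obj in listing
            if obj["Key"] != new_key and obj["ETag"].strip('"') == wanted]


def duplicates_in_bucket(s3_client, bucket, new_key):
    """Look up the new object's ETag, list the bucket, and compare."""
    new_etag = s3_client.head_object(Bucket=bucket, Key=new_key)["ETag"]
    listing = []
    for page in s3_client.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        listing.extend(page.get("Contents", []))
    return find_duplicates(new_key, new_etag, listing)
```

Note that an ETag is only a content digest for some upload paths (multipart uploads, for example, produce a different form), so matching ETags are better treated as a hint than as proof of identical content.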
Demo #3: Automatic File De-duplication
using Amazon S3 and AWS Lambda
Potential further additions to a production
automated file de-duplication
• Create and compare a SHA hash for each file instead of using the S3 ETag, to reduce collisions
• Handle collisions by calling another Lambda function to do a full file compare
• Index all hashes in a DynamoDB table, and check against the table instead of reading all files in the
bucket each time a new file is uploaded or edited
• Create a Lambda wrapper around the deleteObject API call to update the index table
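The hash-and-index idea above can be sketched with the standard library alone. SHA-256 is one possible choice of SHA hash, and the in-memory `HashIndex` class is a hypothetical stand-in for the DynamoDB table, used here only to show the lookup logic.

```python
import hashlib


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a file-like object in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


class HashIndex:
    """In-memory stand-in for an index table keyed by content hash."""

    def __init__(self):
        self._key_by_hash = {}

    def register(self, key, digest):
        """Return the existing key with the same digest, or record this one and return None."""
        existing = self._key_by_hash.get(digest)
        if existing is not None and existing != key:
            return existing
        self._key_by_hash[digest] = key
        return None

    def remove(self, key):
        """Drop index entries for a deleted object (the deleteObject wrapper case)."""
        for digest, stored_key in list(self._key_by_hash.items()):
            if stored_key == key:
                del self._key_by_hash[digest]
```

With an index like this, each upload costs one hash computation and one lookup instead of a full bucket listing.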
Things to remember about S3 Notifications
• Amazon S3 event notifications are set up at the bucket level
• Highly reliable – designed for nine ‘9’s, with at-least-once delivery
• Currently supports Put, Post, Copy, MultiPartComplete, and RRSObjectLost events
• Configuration stored as XML in the notification subresource associated with a bucket
• No additional charge for S3 Notifications
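For illustration, the bucket-level configuration can also be expressed as the dictionary that boto3's `put_bucket_notification_configuration` accepts. The sketch below maps the event names listed above to their S3 event type strings; the helper name, the mapping's name, and the example ARN are hypothetical.

```python
# Slide wording on the left, S3 event type strings on the right.
DECK_EVENT_TYPES = {
    "Put": "s3:ObjectCreated:Put",
    "Post": "s3:ObjectCreated:Post",
    "Copy": "s3:ObjectCreated:Copy",
    "MultiPartComplete": "s3:ObjectCreated:CompleteMultipartUpload",
    "RRSObjectLost": "s3:ReducedRedundancyLostObject",
}


def build_notification_config(function_arn,
                              event_names=("Put", "Post", "Copy", "MultiPartComplete")):
    """Build a notification configuration that routes the chosen events to one Lambda function."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": [DECK_EVENT_TYPES[name] for name in event_names],
            }
        ]
    }


# Usage sketch (requires boto3 and credentials, so it is not executed here):
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="example-bucket",
#       NotificationConfiguration=build_notification_config(function_arn),
#   )
```

Because the configuration lives on the bucket, applying a new one replaces the bucket's previous notification settings, so existing entries need to be carried over explicitly.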
Attaching a Lambda function to S3 Notifications
• Automatic Scaling: Both S3 and Lambda scale automatically with higher PUT rates
• Lambda has a default limit of 1000 TPS, which can be increased through the AWS Support Center
• Lambda queues all incoming requests from S3
• Lambda can absorb reasonable bursts of traffic for approximately 15-30 minutes
[Diagram: source S3 bucket, Lambda frontend queue, Lambda functions, destination 1 and destination 2. Captions: “S3 scales automatically” and “Lambda will scale with higher PUT rate”.]
Best practices for creating Lambda functions
• Memory: CPU proportional to the memory configured
• Increasing memory makes your code execute faster (if CPU bound)
• Timeout: Increasing timeout allows for longer functions, but more wait in case of errors
• Retries: For S3, Lambda retries each function at least 3 times
• Events rejected by AWS Lambda may be retained and retried by S3 for 24 hours
• Permission model: S3 pushes events to Lambda, so grant S3 invocation permission
through a resource policy, and give the Lambda function an execution role
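As a sketch of the push-model permission, the helper below builds the arguments for the Lambda `AddPermission` call (`add_permission` in boto3) that lets S3 invoke a function. The helper name and the statement ID scheme are hypothetical; the execution role is configured separately on the function.

```python
import re


def s3_invoke_permission(function_name, bucket_name, account_id=None):
    """Arguments for a resource-policy statement that lets one bucket invoke the function."""
    kwargs = {
        "FunctionName": function_name,
        # Statement IDs allow only letters, digits, '-' and '_'; bucket names may contain dots.
        "StatementId": "s3-invoke-" + re.sub(r"[^A-Za-z0-9_-]", "-", bucket_name),
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        "SourceArn": f"arn:aws:s3:::{bucket_name}",
    }
    if account_id:
        # Restricts the grant to buckets owned by this account.
        kwargs["SourceAccount"] = account_id
    return kwargs


# Usage sketch (requires boto3 and credentials, so it is not executed here):
#   boto3.client("lambda").add_permission(**s3_invoke_permission("example-fn", "example-bucket"))
```

Scoping the statement to a single bucket ARN keeps other buckets from being able to trigger the function.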
Monitoring and Debugging Lambda functions
• Monitoring: available in Amazon CloudWatch Metrics
• Invocation count
• Duration
• Error count
• Throttle count
• Debugging: available in Amazon CloudWatch Logs
• All Metrics
• Custom logs
• RAM consumed
• Search for log events
• Real time feed of log events delivered to an Amazon Kinesis stream
Customers running dynamic data ingestion
and processing using S3+Lambda
“I want to apply custom logic to process
content being uploaded to my data store.”
• Watermarking / thumbnailing
• Transcoding
• Indexing and deduplication
• Aggregation and filtering
• Pre-processing
• Content validation
[Diagram: Amazon S3 bucket events trigger AWS Lambda, which produces transcoded files, indexing tables, or notifications.]
Three Next Steps
1. Enable S3 notification feature on your existing S3 buckets. Amazon
S3 event notifications can be sent in response to actions taken on
objects uploaded or stored in Amazon S3.
2. Create and test your first Lambda function. With AWS Lambda,
there are no new languages, tools, or frameworks to learn. You can
use any third-party library, even native ones.
3. Use AWS Lambda to process Amazon S3 objects … no
infrastructure to manage, and set up a dynamic data ingestion
pipeline in minutes!
Thank you!
Visit http://aws.amazon.com/s3, the
AWS blog, and the S3 forum to learn
more and get started using S3.
Visit http://aws.amazon.com/lambda,
the AWS Compute blog, and the
Lambda forum to learn more and get
started using Lambda.
AWS Summit – Chicago: An exciting, free cloud conference designed to educate and inform new
customers about the AWS platform, best practices and new cloud services.
Details
• July 1, 2015
• Chicago, Illinois
• @ McCormick Place
Featuring
• New product launches
• 36+ sessions, labs, and bootcamps
• Executive and partner networking
Registration is now open
• Come and see what AWS and the cloud can do for you.
• Click here to register: http://amzn.to/1RooPPL

AWS June Webinar Series - Best Practices: Dynamic Data Ingestion with S3 and Lambda

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Vyom Nagrani, Sr. Product Manager, AWS Lambda June 16, 2015 Dynamic Data Ingestion with Amazon S3 and AWS Lambda
  • 2. Amazon S3 Event Notifications: Integrating storage and workflows Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in Amazon S3 (diagram: S3 events flow as notifications to an SNS topic, an SQS queue, or a Lambda function)
  • 3. Benefits of Amazon S3 Notifications for dynamic data ingestion • Integration: A new surface on the Amazon S3 “building block” for event-based computing • Speed: Typical time to send notifications is less than a second • Simplicity: Avoids proxies or polling to detect changes (diagram labels: Proxy, List/Diff, Notifications)
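To make the notification flow concrete, here is a minimal sketch of a Lambda function that reads an S3 event notification. The function names `objects_from_s3_event` and `handler` are illustrative and not from the deck; the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the S3 notification message structure.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event):
    """Return (bucket, key) pairs for every record in an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+'), so decode them.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    # Hypothetical entry point: log each new object and return the count.
    pairs = objects_from_s3_event(event)
    for bucket, key in pairs:
        print(f"New object: s3://{bucket}/{key}")
    return len(pairs)
```

Anything printed by the handler ends up in the function's log output, which is a convenient way to confirm that notifications are arriving.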
  • 4. AWS Lambda: A compute service that runs your code in response to events Lambda functions: Stateless, event-driven code execution Triggered by events: • Put to an Amazon S3 bucket • Record in an Amazon Kinesis stream • Direct sync and async invocations Makes it easy to • Build back-end services that perform at scale • Perform data-driven auditing, analysis, and notification
  • 5. Benefits of AWS Lambda for building a serverless data processing engine “Productivity focused compute platform to build powerful, dynamic, modular applications in the cloud” • No infrastructure to manage: Focus on business logic, not infrastructure. You upload code; AWS Lambda handles everything else. • High performance at any scale; cost-effective and efficient: Pay only for what you use. Lambda automatically matches capacity to your request rate. Purchase compute in 100ms increments. • Bring your own code: Run code in a choice of standard languages. Use threads, processes, files, and shell scripts normally.
  • 6. What you can do with S3+Lambda Customers have told us about powerful applications … … and we look forward to seeing what you create.
  • 7. Today’s demo #1: Workflow of a simple video transcoding application (diagram labels: New video uploaded, Amazon S3, Notification, AWS Lambda, Amazon S3)
  • 8-14. Walkthrough of setting up S3 event notifications and Lambda functions through the AWS Console
  • 15. Code walkthrough for video clip transcode: set up variables, serialize steps, get file from S3
  • 16. Code walkthrough for video clip transcode: write to disk, transcode with ffmpeg, read from disk, upload to S3
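The demo code itself is not in the transcript, so the following is only a sketch of the steps named on these two slides, written in Python. It assumes `s3` is a boto3 S3 client (`download_file` and `upload_file` are real client methods), that an `ffmpeg` binary is available to the function, and that the output format is chosen by file extension; `transcode_object`, `build_ffmpeg_command`, and the `.webm` default are hypothetical.

```python
import os
import subprocess
import tempfile


def build_ffmpeg_command(src_path, dst_path):
    # -y overwrites an existing output file; -i names the input file.
    return ["ffmpeg", "-y", "-i", src_path, dst_path]


def transcode_object(s3, bucket, key, out_bucket, out_ext=".webm", run=subprocess.run):
    """Download one object, transcode it with ffmpeg, and upload the result."""
    base = os.path.basename(key)
    out_key = os.path.splitext(key)[0] + out_ext
    with tempfile.TemporaryDirectory() as workdir:
        src_path = os.path.join(workdir, base)
        dst_path = os.path.join(workdir, os.path.splitext(base)[0] + out_ext)
        # Get file from S3 and write it to local disk.
        s3.download_file(bucket, key, src_path)
        # Transcode; check=True raises if ffmpeg exits with an error.
        run(build_ffmpeg_command(src_path, dst_path), check=True)
        # Read the result from disk and upload it to the output bucket.
        s3.upload_file(dst_path, out_bucket, out_key)
    return out_key
```

Passing `run` as a parameter keeps the steps easy to exercise locally without ffmpeg installed. Writing the output to a different bucket than the input avoids the function re-triggering itself.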
  • 17. Demo #1: Automatic video transcoding with Amazon S3 and AWS Lambda
  • 18. Potential further additions to a production video transcoding application • Include custom transcoding/watermarking libraries • Break longer video files into smaller clips, transcode each clip separately • Transcode to multiple formats by running multiple Lambda functions in parallel • Send S3 event notification to an SNS topic • Subscribe multiple Lambda functions to that SNS topic
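When the S3 notification is sent to an SNS topic and several Lambda functions subscribe to that topic, each function receives the original S3 message wrapped in an SNS envelope. A minimal sketch of unwrapping it, assuming the standard `Records[].Sns.Message` field; the helper name is illustrative:

```python
import json


def s3_records_from_sns_event(event):
    """Unwrap S3 notification records delivered to Lambda through an SNS topic."""
    records = []
    for record in event.get("Records", []):
        # SNS carries the original S3 notification as a JSON string in Sns.Message.
        message = json.loads(record["Sns"]["Message"])
        records.extend(message.get("Records", []))
    return records
```

Each subscribed function can then transcode the same object to its own target format.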
  • 19. Today’s demo #2: Workflow of infrastructure monitoring and automation application (diagram labels: AWS CloudTrail, Amazon S3, Notification, AWS Lambda, Amazon SNS, AWS IAM; one component is marked optional)
  • 20. Code walkthrough for infrastructure monitoring: get file from S3, unzip it, parse it, check activity
  • 21. Code walkthrough for infrastructure monitoring: find patterns, take action
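The steps on these two slides can be sketched as follows. CloudTrail delivers log files to S3 as gzip-compressed JSON with a top-level `Records` list; the function names, the choice of `eventSource` and `eventName` as the fields to match, and the alert wording are assumptions for illustration. `sns` is assumed to be a boto3 SNS client, whose `publish` method accepts `TopicArn`, `Subject`, and `Message`.

```python
import gzip
import json


def load_cloudtrail_records(gz_bytes):
    """Unzip and parse one CloudTrail log file (gzip-compressed JSON)."""
    return json.loads(gzip.decompress(gz_bytes)).get("Records", [])


def find_matching_events(records, event_source, event_names):
    """Return records from one service whose eventName is in the watch list."""
    return [
        r for r in records
        if r.get("eventSource") == event_source and r.get("eventName") in event_names
    ]


def take_action(sns, topic_arn, matches):
    """Publish one alert per matching record; returns the number of alerts sent."""
    for record in matches:
        sns.publish(
            TopicArn=topic_arn,
            Subject="CloudTrail activity detected",
            Message=json.dumps(record),
        )
    return len(matches)
```

For example, watching `iam.amazonaws.com` for `CreateUser` would flag new IAM users as soon as the log file lands in the bucket.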
  • 22. Demo #2: Infrastructure monitoring and automation using AWS CloudTrail, Amazon S3 and AWS Lambda
  • 23. Potential further additions to a production infrastructure monitoring and automation application • In addition to monitoring and alarming, create automated actions in response to policy violations or suspicious activity • Create a .config file with multiple check points • Each check can have a different SNS topic to alarm against • Aggregate CloudTrail log files to be delivered to a single admin S3 bucket across all your AWS accounts
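One way the config-file idea might look, as a sketch only: the JSON layout, check names, and topic ARNs below are hypothetical (the deck does not show the file format), and `123456789012` is a placeholder account ID.

```python
import json

# Hypothetical config file contents: one entry per check, each with its own topic.
CONFIG_TEXT = """
{"checks": [
  {"name": "iam-user-changes",
   "eventSource": "iam.amazonaws.com",
   "eventNames": ["CreateUser", "DeleteUser"],
   "topicArn": "arn:aws:sns:us-east-1:123456789012:iam-alarms"},
  {"name": "trail-tampering",
   "eventSource": "cloudtrail.amazonaws.com",
   "eventNames": ["StopLogging", "DeleteTrail"],
   "topicArn": "arn:aws:sns:us-east-1:123456789012:audit-alarms"}
]}
"""


def alarms_by_topic(records, config):
    """Group (check name, event name) pairs by the SNS topic to alarm against."""
    alarms = {}
    for record in records:
        for check in config["checks"]:
            if (record.get("eventSource") == check["eventSource"]
                    and record.get("eventName") in check["eventNames"]):
                alarms.setdefault(check["topicArn"], []).append(
                    (check["name"], record["eventName"]))
    return alarms
```

Keeping the checks in data rather than code means a new check point only needs a new config entry, not a new deployment of the function.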
  • 24. Today’s demo #3: Workflow of automated file de-duplication on upload (diagram labels: New file uploaded, Amazon S3, Notification, AWS Lambda, Amazon S3, Amazon DynamoDB; one component is marked optional)
  • 25. Code walkthrough for automated file de-duplication: get headObject, list other objects, compare eTags
  • 26. Code walkthrough for automated file de-duplication: take action
  • 27. Demo #3: Automatic File De-duplication using Amazon S3 and AWS Lambda
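A sketch of the headObject / list / compare sequence, assuming `s3` is a boto3 S3 client (`head_object` and `list_objects_v2` are real client methods; the latter postdates this 2015 deck). The function name is illustrative, and what "take action" means (delete, tag, or alert) is left to the caller.

```python
def find_duplicates(s3, bucket, key):
    """Return keys of other objects in the bucket whose ETag matches the new object's."""
    etag = s3.head_object(Bucket=bucket, Key=key)["ETag"]
    duplicates = []
    kwargs = {"Bucket": bucket}
    while True:
        page = s3.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            # Skip the newly uploaded object itself.
            if obj["Key"] != key and obj["ETag"] == etag:
                duplicates.append(obj["Key"])
        if not page.get("IsTruncated"):
            return duplicates
        # Listings are paginated; continue from where the last page stopped.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Note that ETags of multipart uploads are not a plain MD5 of the content, so two identical files uploaded differently can have different ETags.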
  • 28. Potential further additions to a production automated file de-duplication application • Create and compare a SHA hash for each file instead of using the S3 eTag, to reduce collisions • Handle collision situations by calling another Lambda function to do a full file compare • Index all hashes to a DynamoDB table, and check against the table instead of reading all files in the bucket each time a new file is uploaded/edited • Create a Lambda wrapper around the deleteObject API call to update the index table
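A minimal sketch of the hash-and-index idea. SHA-256 is an assumption (the slide only says "SHA hash"), and a plain Python dict stands in for the DynamoDB table so the logic is visible; a real table would need a conditional write to stay correct under concurrent uploads.

```python
import hashlib


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a file-like object in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def register_upload(index, key, content_hash):
    """Record key under its hash; return the earlier key if the content is a duplicate."""
    existing = index.get(content_hash)
    if existing is not None and existing != key:
        return existing
    index[content_hash] = key
    return None
```

With an index lookup, each upload costs one read instead of a listing of the whole bucket.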
  • 29. Things to remember about S3 Notifications • Amazon S3 event notifications are set up at the bucket level • Highly reliable – designed for nine ‘9’s with at least once delivery • Currently supports Put, Post, Copy, MultiPartComplete, and RRSObjectLost events • Configuration stored as XML in the notification subresource associated with a bucket • No additional charge for S3 Notifications
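As an illustration of the bucket-level configuration, the sketch below builds the structure that boto3's `put_bucket_notification_configuration` accepts for a Lambda destination. The helper name is hypothetical; the event type strings are the S3 names that correspond to the Put, Post, Copy, MultiPartComplete, and RRSObjectLost events listed on the slide. The optional suffix filter is a feature of the current API and is not mentioned in the deck.

```python
# S3 event type names matching the events listed on the slide.
EVENTS_ON_SLIDE = [
    "s3:ObjectCreated:Put",
    "s3:ObjectCreated:Post",
    "s3:ObjectCreated:Copy",
    "s3:ObjectCreated:CompleteMultipartUpload",
    "s3:ReducedRedundancyLostObject",
]


def lambda_notification_config(function_arn, events, suffix=None):
    """Build a NotificationConfiguration that sends the given events to one function."""
    config = {"LambdaFunctionArn": function_arn, "Events": list(events)}
    if suffix:
        # Only notify for keys ending in this suffix, for example ".mp4".
        config["Filter"] = {"Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}}
    return {"LambdaFunctionConfigurations": [config]}
```

The result would be passed as `NotificationConfiguration` together with `Bucket` in the `put_bucket_notification_configuration` call; S3 stores it in the bucket's notification subresource.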
  • 30. Attaching a Lambda function to S3 Notifications • Automatic Scaling: Both S3 and Lambda scale automatically with higher PUT rates • Lambda has a default limit of 1000 TPS, which can be increased through the AWS Support Center • Lambda queues all incoming requests from S3 • Lambda can absorb reasonable bursts of traffic for approximately 15-30 minutes (diagram labels: Source S3, Lambda Frontend Queue, Lambda Functions, Destination 1, Destination 2; captions: “S3 scales automatically”, “Lambda will scale with higher PUT rate”)
  • 31. Best practices for creating Lambda functions • Memory: CPU is proportional to the memory configured • Increasing memory makes your code execute faster (if CPU bound) • Timeout: Increasing the timeout allows for longer-running functions, but means a longer wait in case of errors • Retries: For S3, Lambda retries each function at least 3 times • Events rejected by AWS Lambda may be retained and retried by S3 for 24 hours • Permission model: S3 pushes events to Lambda, so grant S3 invocation permission through a resource policy, and give Lambda an execution role
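The resource-policy half of the permission model can be sketched as the arguments to the Lambda `add_permission` API, which boto3 exposes with these parameter names. The helper name and the statement ID scheme are assumptions; the execution role is configured separately on the function.

```python
def s3_invoke_permission(function_name, bucket, account_id):
    """Build add_permission arguments that let one S3 bucket invoke a function."""
    return {
        "FunctionName": function_name,
        # Statement IDs may not contain dots, which bucket names can include.
        "StatementId": "s3-invoke-" + bucket.replace(".", "-"),
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        # Restrict the grant to this bucket, owned by this account.
        "SourceArn": f"arn:aws:s3:::{bucket}",
        "SourceAccount": account_id,
    }
```

With a boto3 Lambda client this would be applied as `client.add_permission(**s3_invoke_permission(...))`. Scoping by both `SourceArn` and `SourceAccount` prevents a bucket with the same name in another account from invoking the function.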
  • 32. Monitoring and Debugging Lambda functions • Monitoring: available in Amazon CloudWatch Metrics • Invocation count • Duration • Error count • Throttle count • Debugging: available in Amazon CloudWatch Logs • All Metrics • Custom logs • RAM consumed • Search for log events • Real time feed of log events delivered to an Amazon Kinesis stream
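The four metrics on this slide are published in the `AWS/Lambda` CloudWatch namespace as `Invocations`, `Duration`, `Errors`, and `Throttles`, with a `FunctionName` dimension. A sketch of building a query for boto3's `get_metric_statistics`; the helper name, the five-minute period, and the choice of statistic per metric are assumptions:

```python
from datetime import timedelta

# Metric name -> the statistic that is usually most informative for it.
LAMBDA_METRICS = {
    "Invocations": "Sum",
    "Duration": "Average",
    "Errors": "Sum",
    "Throttles": "Sum",
}


def metric_query(function_name, metric_name, end, hours=1):
    """Build get_metric_statistics arguments for one Lambda metric."""
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # seconds per data point
        "Statistics": [LAMBDA_METRICS[metric_name]],
    }
```

A rising `Throttles` sum alongside flat `Invocations` is a hint that the function has hit its concurrency or rate limit.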
  • 33. Customers running dynamic data ingestion and processing using S3+Lambda “I want to apply custom logic to process content being uploaded to my data store”. • Watermarking / thumbnailing • Transcoding • Indexing and deduplication • Aggregation and filtering • Pre-processing • Content validation (diagram labels: Amazon S3 Bucket, Events, AWS Lambda, Transcoded files, Indexing tables or notifications)
  • 34. Three Next Steps 1. Enable the S3 notification feature on your existing S3 buckets. Amazon S3 event notifications can be sent in response to actions taken on objects uploaded or stored in Amazon S3. 2. Create and test your first Lambda function. With AWS Lambda, there are no new languages, tools, or frameworks to learn. You can use any third-party library, even native ones. 3. Use AWS Lambda to process Amazon S3 objects … no infrastructure to manage, and set up a dynamic data ingestion pipeline in minutes!
  • 35. Thank you! Visit http://aws.amazon.com/s3, the AWS blog, and the S3 forum to learn more and get started using S3. Visit http://aws.amazon.com/lambda, the AWS Compute blog, and the Lambda forum to learn more and get started using Lambda.
  • 36. AWS Summit – Chicago: An exciting, free cloud conference designed to educate and inform new customers about the AWS platform, best practices and new cloud services. Details • July 1, 2015 • Chicago, Illinois • @ McCormick Place Featuring • New product launches • 36+ sessions, labs, and bootcamps • Executive and partner networking Registration is now open • Come and see what AWS and the cloud can do for you. • Click here to register: http://amzn.to/1RooPPL