Building Your Own Web Analytics Service
with node.js, Amazon DynamoDB, and
Amazon Elastic MapReduce
Jonathan Keebler - Founder, CTO - ScribbleLive
November 13, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Who Am I?
•Jonathan Keebler @keebler
•Built video player for all CTV properties
–Worked on news sites like CTV, TSN, CP24
•CTO, Founder of ScribbleLive
•Bootstrapped a high scalability startup
–Credit card limit wasn’t that high, had to find cheap
ways to handle the load of top tier news sites
What is ScribbleLive?
•Leading provider of real-time engagement
management solutions
•We enable real-time publication and syndication
of digital content
•Our platform is transforming the way the world’s
largest brands and media approach
communication and content creation, creating true
real-time engagement
Some of our customers
Today
•Learn to build your own analytics service
– Seriously, we’re going to do it

•node.js on Amazon EC2: web servers
•Amazon DynamoDB: database
•Hadoop/Hive on Amazon Elastic MapReduce
(EMR): roll-up data
Why would we do this?
•ScribbleLive tracks “engagement minutes” (EMs)
across all customer sites
– e.g., ESPN.com, CNN.com, Reuters.com
– EM = 1 minute of a user watching a webpage
– 2.5B per month, 120M+ per hour

•Big analytics providers couldn’t do it
– Didn’t have the features
– Too inaccurate
How are we going to do this?
[Architecture diagram: Visitors → Elastic Load Balancing → several node.js servers → DynamoDB]
DynamoDB: data structure
•Separate tables by timeframe
– Minute (written by node.js directly)
– Hour (EMR from minute data)
– Day (EMR from hour data)
– Month (EMR from day data)

•Structure
– Hash: Item (page id)
– Range: Time (rounded to min, hour, day)
– { Hits: 1 }
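As a rough sketch of the structure above, the following node.js snippet builds one item for a given timeframe by rounding a timestamp down to the bucket boundary. The function name `makeItem`, the `BUCKET_SECONDS` table, and the use of Unix seconds for `Time` are assumptions for illustration; the slides only specify the hash key, the range key, and the `Hits` attribute.

```javascript
// Bucket sizes in seconds. Month is left out because months vary in length
// and cannot be produced by a fixed-size floor.
const BUCKET_SECONDS = { minute: 60, hour: 3600, day: 86400 };

// Build an item whose hash key is the page id and whose range key is the
// timestamp rounded down to the start of the timeframe.
function makeItem(pageId, unixSeconds, timeframe, hits) {
  const size = BUCKET_SECONDS[timeframe];
  if (!size) throw new Error("unknown timeframe: " + timeframe);
  return {
    Item: pageId,
    Time: Math.floor(unixSeconds / size) * size,
    Hits: hits,
  };
}

// 1388539425 is 2014-01-01 01:23:45 UTC.
console.log(makeItem("abcd", 1388539425, "minute", 1));
```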
Elastic Load Balancing: AMI setup
•Custom AMI
– Loads source from SVN
– Launches node.js
Elastic Load Balancing: Load balancing
•1 load balancer
•Cookies keep unique user on same instance
•Auto-scaling
– CPU > 50% or network-in above 50M bytes triggers new
servers, which come online and are added to Elastic Load
Balancing
node.js: Overview of code
•Accepts GET /?item={ID}&uid={UserID}
•Dictionary/Array of how many GETs per item in this
minute
– Hits[Minute]["{ID}"]++
– Example: Hits["1/1/2014 1:23:00"]["abcd"]++

•Dictionary/Array of Users already counted in
Item:Minute (prevent double-counting)
•At end of minute, write data back to DynamoDB
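The counting logic described above could be sketched roughly as follows. This is not the original ScribbleLive code: the names `recordHit` and `flushMinute`, and the choice of `Map` and `Set` for the two dictionaries, are assumptions. Because the load balancer keeps each user on the same instance, a per-process set of already-counted users is enough to prevent double-counting in this sketch.

```javascript
// hits: minute -> (itemId -> count); seen: minute -> set of item/user pairs.
const hits = new Map();
const seen = new Map();

function minuteOf(unixSeconds) {
  return Math.floor(unixSeconds / 60) * 60;
}

// Count one GET /?item={ID}&uid={UserID}. Returns false if this user was
// already counted for this item in this minute.
function recordHit(itemId, userId, unixSeconds) {
  const minute = minuteOf(unixSeconds);
  if (!hits.has(minute)) {
    hits.set(minute, new Map());
    seen.set(minute, new Set());
  }
  const pair = JSON.stringify([itemId, userId]);
  const seenThisMinute = seen.get(minute);
  if (seenThisMinute.has(pair)) return false;
  seenThisMinute.add(pair);
  const counts = hits.get(minute);
  counts.set(itemId, (counts.get(itemId) || 0) + 1);
  return true;
}

// At the end of a minute, remove its counters and return them as a list
// of pending writes for the database.
function flushMinute(minute) {
  const counts = hits.get(minute) || new Map();
  hits.delete(minute);
  seen.delete(minute);
  return Array.from(counts, ([itemId, count]) => ({ itemId, minute, count }));
}
```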
node.js: Bulk writing to DynamoDB
•Writing all data back immediately in a loop = BAD!
– Throughput would spike in that ~second
– Would have to use higher throughput limit
– More $$$$

•Instead, divide the number of writes that need to happen by
60 seconds to get how many writes per second you should do
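One way to pace the writes is sketched below. The helper names `planWrites` and `drain`, and the callback `writeFn` that performs a single write, are hypothetical. The rate is rounded up so that every write is sent within the window, which means the queue may empty slightly before the 60 seconds are over.

```javascript
// Split the pending writes into per-second batches spread over the window.
function planWrites(pending, windowSeconds = 60) {
  const perSecond = Math.ceil(pending.length / windowSeconds);
  const batches = [];
  for (let i = 0; i < pending.length; i += perSecond) {
    batches.push(pending.slice(i, i + perSecond));
  }
  return { perSecond, batches };
}

// Send one batch per tick instead of everything at once.
// writeFn is a hypothetical callback that issues one write.
function drain(pending, writeFn, tickMs = 1000) {
  const { batches } = planWrites(pending);
  let next = 0;
  const timer = setInterval(() => {
    if (next >= batches.length) return clearInterval(timer);
    batches[next++].forEach(writeFn);
  }, tickMs);
  return timer;
}

// 150 pending writes over 60 seconds: 150 / 60 = 2.5, rounded up to 3.
console.log(planWrites(new Array(150).fill(0)).perSecond);
```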
node.js: Bulk writing to DynamoDB
•Call to DynamoDB per item:
– update: (atomic) add X to {ID}:{Minute}
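A minimal sketch of such a call is shown below. It only builds the request parameters for DynamoDB's UpdateItem operation in the low-level attribute-value format; sending them requires an AWS SDK client, which is left out here. The function name `buildAddHitsParams` and the table name `MetricsMinute` are hypothetical, and the original 2013 code may well have used a different request form. The `ADD` action increments a numeric attribute atomically and creates it if it does not exist yet.

```javascript
// Build UpdateItem parameters that atomically add `count` to the Hits
// attribute of the item identified by {ID}:{Minute}.
function buildAddHitsParams(tableName, itemId, minute, count) {
  return {
    TableName: tableName,
    Key: {
      Item: { S: itemId },         // hash key: page id
      Time: { N: String(minute) }, // range key: minute as Unix seconds
    },
    UpdateExpression: "ADD #hits :n",
    ExpressionAttributeNames: { "#hits": "Hits" },
    ExpressionAttributeValues: { ":n": { N: String(count) } },
  };
}

// "MetricsMinute" is a made-up table name for this example.
console.log(JSON.stringify(buildAddHitsParams("MetricsMinute", "abcd", 1388539380, 5), null, 2));
```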
Hadoop: What we map and reduce
•To go from minute to hourly data
– Round every minute down to the nearest hour
(floor( Minute / 3600 ) * 3600)

– Sum the # of “Hits” from each data point

•Just look at the past 24 hours to save time
•Do the same for hourly to daily, daily to monthly
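The same map and reduce steps can be illustrated in plain node.js, outside Hadoop. This is only a sketch of the idea, not the EMR job itself: the function name `rollUp` is an assumption, and `Time` is taken to be Unix seconds. Passing 3600 rolls minutes up to hours and 86400 rolls hours up to days; months would need calendar-aware rounding because they are not a fixed number of seconds.

```javascript
// Roll data points up into larger time buckets.
function rollUp(points, bucketSeconds) {
  const totals = new Map();
  for (const p of points) {
    // Map step: round the time down to the start of its bucket.
    const bucket = Math.floor(p.Time / bucketSeconds) * bucketSeconds;
    const key = JSON.stringify([p.Item, bucket]);
    // Reduce step: sum the Hits of all points sharing (Item, bucket).
    const entry = totals.get(key) || { Item: p.Item, Time: bucket, Hits: 0 };
    entry.Hits += p.Hits;
    totals.set(key, entry);
  }
  return Array.from(totals.values());
}

const minuteData = [
  { Item: "abcd", Time: 1388538000, Hits: 2 },
  { Item: "abcd", Time: 1388538060, Hits: 3 },
  { Item: "abcd", Time: 1388541600, Hits: 1 },
  { Item: "wxyz", Time: 1388538000, Hits: 4 },
];
console.log(rollUp(minuteData, 3600));
```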
Hadoop: Hive scripts
-- Roll minute data up to hourly data, starting from midnight UTC yesterday.
INSERT OVERWRITE TABLE MetricsHourly
SELECT
    Item,
    (floor( Time / 3600 ) * 3600) AS Time,
    SUM(Hits) AS Hits,
    from_unixtime(floor( Time / 3600 ) * 3600 ) AS TimeFriendly
FROM Metrics
WHERE Time >= floor( unix_timestamp() / 86400 ) * 86400 - ( 86400 * 1 )
GROUP BY Item, floor( Time / 3600 ) * 3600;
Hadoop: Setting Up EMR
• “Start an Interactive Hive Session”
• Run a cron job every 15 minutes to check whether
the Hive job is complete
• If it is complete, download the newest Hive script
and restart the job
• Amazon CloudWatch alarms if a job takes
longer than 12 hours
Hadoop: Cron Job
#!/bin/sh
# Skip this run if a Hadoop job is still in progress.
JOBID=$(hadoop job -list | grep job_ | cut -f1)
if [ -n "$JOBID" ]; then
    echo "Another job already running"
else
    echo "Starting Hive job..."
    echo "$(date) starting" >> /var/log/metricsdaily_starting
    # Fetch the newest roll-up script and run it only if the download succeeded.
    wget -qO- http://DEPLOY/metrics/rollups.sql > /tmp/rollups.sql && hive -f /tmp/rollups.sql
fi
Application API
•RESTful API in the language of your choice
•Calls to DynamoDB:
–query: Hash:{ID} w/ Range:{Time A}-{Time B}

•Since the MapReduce roll-up could take a day to run, you need to reconstruct
hourly data from minute data for the most recent 24 hours
–e.g., if you want hourly data for the last 2 days, take 24 hourly data
points from yesterday and 24*60 minute data points from today
(convert them to hourly data points in code)
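The stitching step might look roughly like this for a single item. It assumes the two DynamoDB queries have already returned their data points as arrays of `{ Time, Hits }` with `Time` in Unix seconds, and that `cutoff` is aligned to an hour boundary; the function name `hourlySeries` is hypothetical. Hourly points before the cutoff are trusted as-is, while everything from the cutoff onward is rebuilt from minute data.

```javascript
// Combine rolled-up hourly points (older than cutoff) with hourly totals
// reconstructed from minute points (cutoff and newer).
function hourlySeries(hourlyPoints, minutePoints, cutoff) {
  const series = new Map();
  for (const p of hourlyPoints) {
    // Ignore hourly rows at or after the cutoff: they may be stale or missing.
    if (p.Time < cutoff) series.set(p.Time, p.Hits);
  }
  for (const p of minutePoints) {
    if (p.Time < cutoff) continue;
    const hour = Math.floor(p.Time / 3600) * 3600;
    series.set(hour, (series.get(hour) || 0) + p.Hits);
  }
  return Array.from(series, ([Time, Hits]) => ({ Time, Hits }))
    .sort((a, b) => a.Time - b.Time);
}

const hourly = [
  { Time: 1388534400, Hits: 10 },
  { Time: 1388538000, Hits: 99 }, // at the cutoff, so it is ignored
];
const minutes = [
  { Time: 1388538000, Hits: 2 },
  { Time: 1388538060, Hits: 3 },
  { Time: 1388541600, Hits: 1 },
];
console.log(hourlySeries(hourly, minutes, 1388538000));
```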
Performance
Please give us your feedback on this
presentation

BDT203
As a thank you, we will select prize
winners daily for completed surveys!

Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 

Kürzlich hochgeladen (20)

Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 

Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203) | AWS re:Invent 2013

  • 1. Building Your Own Web Analytics Service with node.js, Amazon DynamoDB, and Amazon Elastic MapReduce Jonathan Keebler - Founder, CTO - ScribbleLive November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. Who Am I? •Jonathan Keebler @keebler •Built video player for all CTV properties –Worked on news sites like CTV, TSN, CP24 •CTO, Founder of ScribbleLive •Bootstrapped a high scalability startup –Credit card limit wasn’t that high, had to find cheap ways to handle the load of top tier news sites
  • 3. What is ScribbleLive? •Leading provider of real-time engagement management solutions •We enable real-time publication and syndication of digital content •Our platform is transforming the way the world’s largest brands and media approach communication and content creation, creating true real-time engagement
  • 4. Some of our customers
  • 5. Today •Learn to build your own analytics service – Seriously, we’re going to do it •node.js on Amazon EC2: web servers •Amazon DynamoDB: database •Hadoop/Hive on Amazon Elastic MapReduce (EMR): roll-up data
  • 6. Why would we do this? •ScribbleLive tracks “engagement minutes” (EMs) across all customer sites – e.g., ESPN.com, CNN.com, Reuters.com – EM = 1 minute of a user watching a webpage – 2.5B per month, 120M+ per hour •Big analytics providers couldn’t do it – Didn’t have the features – Too inaccurate
  • 7. How are we going to do this? Visitors Elastic Load Balancing node.js node.js node.js DynamoDB node.js
  • 8. DynamoDB: data structure •Separate tables by timeframe – Minute (written by node.js directly) – Hour (EMR from minute data) – Day (EMR from hour data) – Month (EMR from day data) •Structure – Hash: Item (page id) – Range: Time (rounded to min, hour, day) – { Hits: 1 }
  • 9. Elastic Load Balancing: AMI setup •Custom AMI – Loads source from SVN – Launches node.js
  • 10. Elastic Load Balancing: Load balancing •1 load balancer •Cookies keep unique user on same instance •Auto-scaling – CPU >50% or network-in 50M bytes, triggers new servers coming online and added to Elastic Load Balancing
  • 11. node.js: Overview of code •Accepts GET /?item={ID}&uid={UserID} •Dictionary/Array of how many GETs per item in this minute – Hits[Minute][“{ID}”]++ – Example: Hits[“1/1/2014 1:23:00”][“abcd”]++ •Dictionary/Array of Users already counted in Item:Minute (prevent double-counting) •At end of minute, write data back to DynamoDB
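The counting and de-duplication described on slide 11 could look roughly like the sketch below. It is not the code from the talk: the names `recordHit`, `drainMinute`, and `minuteOf` are hypothetical, minutes are keyed by Unix seconds rather than a date string, and the HTTP handler for `GET /?item={ID}&uid={UserID}` is assumed to call `recordHit` with the parsed query values.

```javascript
// Hypothetical in-memory counters (illustrative names, not from the talk).
const hits = new Map(); // minute (Unix seconds) -> Map(itemId -> hit count)
const seen = new Map(); // minute (Unix seconds) -> Set of item/user pairs already counted

// Round a millisecond timestamp down to the start of its minute, in Unix seconds.
function minuteOf(ms) {
  return Math.floor(ms / 60000) * 60;
}

// Count one hit per user per item per minute; returns false for a repeat.
function recordHit(itemId, userId, nowMs = Date.now()) {
  const minute = minuteOf(nowMs);
  if (!seen.has(minute)) {
    seen.set(minute, new Set());
    hits.set(minute, new Map());
  }
  const key = JSON.stringify([itemId, userId]);
  const users = seen.get(minute);
  if (users.has(key)) return false; // already counted in this Item:Minute
  users.add(key);
  const perItem = hits.get(minute);
  perItem.set(itemId, (perItem.get(itemId) || 0) + 1);
  return true;
}

// At the end of a minute, hand back its totals and free the memory.
function drainMinute(minute) {
  const perItem = hits.get(minute) || new Map();
  hits.delete(minute);
  seen.delete(minute);
  return [...perItem].map(([itemId, count]) => ({ itemId, minute, count }));
}
```

Because the load balancer keeps each user on the same instance, a per-process set like `seen` is enough to prevent double-counting within one minute.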
  • 12. node.js: Bulk writing to DynamoDB •Writing all data back immediately in a loop = BAD! – Throughput would spike in that ~second – Would have to use higher throughput limit – More $$$$ •Instead, figure out how many writes need to happen / 60 seconds = how many writes per second you should do
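One way to spread the writes over the following minute, as slide 12 suggests, is sketched below. The helpers `planWrites` and `flushPaced` are hypothetical names, and `writeOne` stands in for whatever function performs a single DynamoDB update; none of these come from the talk.

```javascript
// Split pending updates into per-second batches so that the whole set
// is written over `windowSeconds` instead of in one burst.
function planWrites(updates, windowSeconds = 60) {
  // writes needed / 60 seconds = writes per second (rounded up)
  const perSecond = Math.ceil(updates.length / windowSeconds);
  const batches = [];
  for (let i = 0; i < updates.length; i += perSecond) {
    batches.push(updates.slice(i, i + perSecond));
  }
  return batches;
}

// Send one batch per interval. `writeOne` is an assumed async function
// that performs a single update and returns a promise.
async function flushPaced(updates, writeOne, intervalMs = 1000) {
  for (const batch of planWrites(updates)) {
    await Promise.all(batch.map(writeOne));
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With 150 pending updates this plans 3 writes per second for 50 seconds, which keeps the consumed write throughput flat rather than spiking at the top of the minute.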
  • 13. node.js: Bulk writing to DynamoDB •Call to DynamoDB per item: – update: (atomic) add X to {ID}:{Minute}
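The atomic add on slide 13 maps naturally onto a DynamoDB `UpdateItem` request with an `ADD` action. The sketch below only builds the request parameters; the function name `buildHitUpdate` and the table name passed to it are assumptions, while the attribute names `Item`, `Time`, and `Hits` follow the data structure on slide 8. The resulting object would be handed to the AWS SDK's UpdateItem call.

```javascript
// Build low-level UpdateItem parameters that atomically add `count`
// to the Hits attribute of the {Item, Time} row, creating it if absent.
function buildHitUpdate(tableName, itemId, minute, count) {
  return {
    TableName: tableName,
    Key: {
      Item: { S: String(itemId) }, // hash key: page id
      Time: { N: String(minute) }, // range key: minute in Unix seconds
    },
    // Placeholder name used in case the attribute name clashes with a reserved word.
    UpdateExpression: "ADD #h :n",
    ExpressionAttributeNames: { "#h": "Hits" },
    ExpressionAttributeValues: { ":n": { N: String(count) } },
  };
}
```

Because `ADD` is applied server-side, several node.js instances can report hits for the same item and minute without reading the current value first.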
  • 14. Hadoop: What we map and reduce •To go from minute to hourly data – Round every minute timestamp (Unix seconds) down to the start of its hour (floor( Minute / 3600 ) * 3600) – Sum the # of “Hits” from each data point •Just look at the past 24 hours to save time •Do the same for hourly to daily, daily to monthly
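The roll-up itself is just "floor the timestamp, then sum per item". The talk does this in Hive on EMR; the plain JavaScript sketch below only illustrates the arithmetic on a small array, and the function name `rollUp` and the point shape `{ item, time, hits }` are assumptions.

```javascript
// Group points by (item, bucket start) and sum their hits.
// `time` values are Unix seconds; bucketSeconds is 3600 for hourly data.
function rollUp(points, bucketSeconds) {
  const totals = new Map();
  for (const { item, time, hits } of points) {
    const bucket = Math.floor(time / bucketSeconds) * bucketSeconds;
    const key = `${item}\u0000${bucket}`;
    const entry = totals.get(key) || { item, time: bucket, hits: 0 };
    entry.hits += hits;
    totals.set(key, entry);
  }
  return [...totals.values()];
}
```

Calling `rollUp(minutePoints, 3600)` gives hourly totals, and the same function with `86400` would turn hourly points into daily ones.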
  • 15. Hadoop: Hive scripts
    INSERT OVERWRITE TABLE MetricsHourly
    SELECT Item,
           (floor( Time / 3600 ) * 3600) AS Time,
           SUM(Hits) AS Hits,
           from_unixtime(floor( Time / 3600 ) * 3600 ) AS TimeFriendly
    FROM Metrics
    WHERE Time >= floor( unix_timestamp() / 86400 ) * 86400 - ( 86400 * 1 )
    GROUP BY Item, floor( Time / 3600 ) * 3600;
  • 17. Hadoop: Setting Up EMR • “Start an Interactive Hive Session” • Run a cron job every 15 minutes to check if the Hive job is complete • If complete, downloads newest Hive script and restarts the job • Amazon CloudWatch alarms if jobs taking longer than 12 hours
  • 18. Hadoop: Cron Job
    #!/bin/sh
    JOBID=$(hadoop job -list | grep job_ | cut -f1)
    if [ -n "$JOBID" ]; then
      echo "Another job already running"
    else
      echo "Starting Hive job..."
      echo `date` starting >> /var/log/metricsdaily_starting
      wget -qO- http://DEPLOY/metrics/rollups.sql > /tmp/rollups.sql && hive -f /tmp/rollups.sql
    fi
  • 19. Application API •RESTful API in the language of your choice •Calls to DynamoDB: –query: Hash:{ID} w/ Range:{Time A}-{Time B} •Since M-R could take a day to run, need to reconstruct hourly data from minutes for most recent 24 hours –e.g. if you want hourly data for last 2 days, take 24 hourly data pts from yesterday, and 24*60 minute data pts from today (convert to hourly data pts in code)
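The stitching described on slide 19 could be done along the lines of the sketch below for a single item. It assumes the two DynamoDB queries have already returned arrays of `{ time, hits }` points (times in Unix seconds) and that `cutoff` is an hour-aligned timestamp marking where rolled-up hourly data stops being trusted. The function name `hourlySeries` is hypothetical.

```javascript
// Combine rolled-up hourly points (before cutoff) with minute points
// (at or after cutoff) converted to hourly buckets in code.
function hourlySeries(hourlyPoints, minutePoints, cutoff) {
  const byHour = new Map();
  for (const { time, hits } of hourlyPoints) {
    if (time < cutoff) byHour.set(time, (byHour.get(time) || 0) + hits);
  }
  for (const { time, hits } of minutePoints) {
    if (time < cutoff) continue; // already covered by hourly data
    const hour = Math.floor(time / 3600) * 3600;
    byHour.set(hour, (byHour.get(hour) || 0) + hits);
  }
  return [...byHour]
    .sort((a, b) => a[0] - b[0])
    .map(([time, hits]) => ({ time, hits }));
}
```

Ignoring hourly rows at or after the cutoff avoids counting the same minutes twice if a roll-up job has partially caught up.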
  • 22. Please give us your feedback on this presentation BDT203 As a thank you, we will select prize winners daily for completed surveys!