ARC 306: Lumberjacking on AWS
Cutting Through Logs to Find What Matters
Guy Ernest, Solutions Architecture
November 15, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Progress Is Not Evenly Distributed

1980: $14,000,000/TB, capacity 100 MB, transfer rate 4 MB/s
Today: $30/TB, capacity 3 TB, transfer rate 200 MB/s
Change: cost per TB ÷ 450,000, capacity × 30,000, transfer rate × 50
by Kheel Center, Cornell University

Solution: More Spindles
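
Capacity has grown far faster than transfer rate, so reading one full disk end to end takes much longer today than in 1980. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1,000,000 MB) and an idealized even split of the work across disks; the function name is illustrative:

```python
def full_scan_seconds(capacity_mb: float, rate_mb_per_s: float, spindles: int = 1) -> float:
    """Time to read a dataset once when it is split evenly across `spindles` disks."""
    return capacity_mb / rate_mb_per_s / spindles

scan_1980 = full_scan_seconds(100, 4)                        # 100 MB at 4 MB/s
scan_today = full_scan_seconds(3_000_000, 200)               # 3 TB at 200 MB/s
scan_100 = full_scan_seconds(3_000_000, 200, spindles=100)   # same data on 100 disks

print(scan_1980)   # 25.0
print(scan_today)  # 15000.0
print(scan_100)    # 150.0
```

Under these assumptions a full scan went from 25 seconds to more than four hours on a single disk, and spreading the same data over 100 spindles brings it back to 150 seconds.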
Case Study – Foursquare
The Challenge
“…Foursquare streams hundreds
of millions of application logs
each day. The company relies on
analytics to report on its daily
usage, evaluate new offerings,
and perform long-term trend
analysis—and with millions of
new check-ins each day, the
workload is only growing…”
“Real” Project Requirements Example
Categories: Cost Analysis, Marketing, Operations, Revenue

Data transfer
• By date/time
• By edge location
• By date/time within an edge location
• By top X URLs
• By HTTP vs. HTTPS

Top URLs
• As-is count
• By content type
• By edge location
• By edge location and content type

Error rates
• By top X URLs
• By edge location
• By edge location and content type

Top games
• By revenue
• By edge location and revenue

Top ads
• That lead to a game purchase

Requests served
• By edge location

Revenue
• By edge location

Top games
• By age
• By income
• By gender
Viable Business
[Chart labels: Revenues, Operation Costs; axes: # Users, $ Money]
Available Data Sources
Metric | Sources
Data transfer by date/time | CloudFront logs
Data transfer by edge location | CloudFront logs
Data transfer by date/time within an edge location | CloudFront logs
Data transfer by top X URLs | CloudFront logs, web server logs
Data transfer by HTTP vs. HTTPS | CloudFront logs
Top URLs | CloudFront logs, web server logs
Top URLs by content type | CloudFront logs
Top URLs by edge location | CloudFront logs
Top URLs by edge location and content type | CloudFront logs
Error rates by top X URLs | CloudFront logs, web server logs
Error rate by edge location | CloudFront logs
Error rate by edge location and content type | CloudFront logs
Requests served by edge location | CloudFront logs
Revenue by edge location | CloudFront logs, OrdersDB, app server logs
Top games segmented by age | CloudFront logs, user profile
Top games segmented by income | CloudFront logs, user profile
Top games segmented by gender | CloudFront logs, user profile
Top games by revenue | CloudFront logs, OrdersDB
Top games by edge location and revenue | CloudFront logs, OrdersDB
Top game revenue segmented by age | CloudFront logs, OrdersDB, user profile
CloudFront Access Log Format
#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query
2012-05-25 22:01:30 AMS1 4448 94.212.249.78 GET d1234567890213.cloudfront.net /YT0KthT/F5SOWdDPqNqQF07tiTOXqJMpfDdlb3LMwv3/jP3/CINm/yDSy0MsRcWJN/Simutrans.exe 200 http://AtRJw2kxg0EMW.com/kZetr/YCb6AM9N2xt2 Mozilla/5.0%20(compatible;%20MSIE%209.0;%20Windows%20NT%206.1;%20WOW64;%20Trident/5.0) uid=100&oid=108625181
2012-05-25 22:01:30 AMS1 4952 94.212.249.78 GET d1234567890213.cloudfront.net /66IG584/CPCxY0P44BGb5ZOd3qSUrauL050LOvFwaMj/eH/caw/Blob Wars-Blob And Conquer.exe 200 http://AtRJw2kxg0EMW.com/kZetr/YCb6AM9N2xt2 Mozilla/5.0%20(compatible;%20MSIE%209.0;%20Windows%20NT%206.1;%20WOW64;%20Trident/5.0) uid=100&oid=108625184
2012-05-25 22:01:30 AMS1 4556 78.8.5.135 GET d1234567890213.cloudfront.net /SwlufjC/xEjH3BRbXMXwmFWqzKt7od6tlWR3e13LhmH/V3eF/lo6g/AstroMenace.exe 200 http://AtRJw2kxg0EMW.com/AC1vg/1727EWfb7fPt Opera/9.80%20(Windows%20NT%205.1;%20U;%20pl)%20Presto/2.10.229%20Version/11.60 uid=100&oid=108625189
2012-05-25 22:01:30 AMS1 47172 78.8.5.135 GET d1234567890213.cloudfront.net /Di1cXoN/TskldkSHcgkvZXQEmv5vOVR25X5UTisFkRq/pQa/wCjUXZb/Z1HRuGlo/Kroz.exe 200 http://AtRJw2kxg0EMW.com/AC1vg/1727EWfb7fPt Opera/9.80%20(Windows%20NT%205.1;%20U;%20pl)%20Presto/2.10.229%20Version/11.60 uid=100&oid=108625206
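
The `#Fields:` header names every column, so a log file can be turned into named records without hard-coding positions. A minimal sketch, assuming the records in the real file are tab-separated; the function name and the shortened sample record are illustrative, not taken from an actual log:

```python
def parse_cloudfront_log(lines):
    """Yield one dict per record, keyed by the names in the '#Fields:' header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Field names in the header are separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            # Assumption: record values are separated by tabs.
            yield dict(zip(fields, line.split("\t")))

sample = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status",
    "2012-05-25\t22:01:30\tAMS1\t4448\t94.212.249.78\tGET\t"
    "d1234567890213.cloudfront.net\t/Simutrans.exe\t200",
]
records = list(parse_cloudfront_log(sample))
print(records[0]["sc-status"], records[0]["x-edge-location"])  # 200 AMS1
```

All values stay strings here; converting `sc-bytes` or `sc-status` to integers is left to the caller.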
Sample Your Data with R

> library(ggplot2)
> sample_data <- read.delim("SampleFiles/E123ABCDEF.2012-05-25-22.NEfbhLN3", header=F)
> sample_data <- sample_data[-1:-2,]  # drop the #Version and #Fields header rows
> View(sample_data)
> m <- ggplot(sample_data, aes(x = factor(V9)))  # V9 is the sc-status column
> m + geom_bar() + scale_y_log10() + xlab('Error Codes') + ylab('log(Frequency)')  # geom_bar counts a discrete x
Need a Lot of Memory?
OpenRefine Running on an EC2 Instance
[Diagram labels: Logs, Web, OLTP, CRM, DB, ETL, DATAWAREHOUSE, OLAP, ANALYST]
Swedish public domain photo taken in 1918

Log Shipping
“Poor Man’s Log Shipping”
Embedding Poor-man Invisible Pixel
http://www.poor-mananalytics.com/__track.gif?idt=5.1.5&idc=5&utmn=1532897343&utmhn=www.douban.com&utmcs=UTF-8&utmsr=1440x900&utmsc=24-bit&utmul=enus&utmje=1&utmfl=10.3%20r181&utmdt=%E8%B1%86%E7%93%A3&utmhid=571356425&utmr=-&utmp=%2F&utmac=UA-70197651&utmcc=__utma%3D30149280.1785629903.1314674330.1315290610.1315452707.10%3B%2B__utmz%3D30149280.1315452707.10.7.utmcsr%3Dbiaodianfu.com%7Cutmccn%3D(referral)%7Cutmcmd%3Dreferral%7Cutmcct%3D%2Fpoor-man-analyticsarchitecture.html%3B%2B__utmv%3D30149280.162%3B&utmu=qBM~
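
Everything the pixel reports travels in the query string of the image request, so the server-side log line already contains the full event. A minimal sketch of decoding such a request with the Python standard library; the function name is illustrative and the URL is a shortened version of the one above:

```python
from urllib.parse import urlsplit, parse_qs

def pixel_params(url):
    """Return the query parameters of a tracking-pixel request as a flat dict."""
    query = urlsplit(url).query
    # parse_qs percent-decodes values and returns a list per key; keep the first.
    return {key: values[0] for key, values in parse_qs(query).items()}

url = ("http://www.poor-mananalytics.com/__track.gif"
       "?idt=5.1.5&idc=5&utmcs=UTF-8&utmsr=1440x900&utmp=%2F")
params = pixel_params(url)
print(params["idt"], params["idc"], params["utmp"])  # 5.1.5 5 /
```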
Open Source Frameworks
Fluentd
Flume
Scribe
Chukwa
…

Fluentd ASCII Diagrams

  Input                             Output
+--------------------------------------------+
|                                            |
| Web Apps ---+                 +--> File    |
|             |                 |            |
|             +-->           ---+            |
| /var/log ------> Fluentd ------> Mail      |
|             +-->           ---+            |
|             |                 |            |
| Apache   ---+                 +--> S3      |
|                                            |
+--------------------------------------------+

Web Server
+---------+
| Fluentd -------+
+---------+      |
                 |
Proxy Server     |
+---------+      +--> +---------+
| Fluentd ----------> | Fluentd |
+---------+      +--> +---------+
                 |
Database Server  |
+---------+      |
| Fluentd -------+
+---------+
Use Amazon Kinesis to Ship Your Logs

New
Aggregation with S3Distcp

Aggregated
Even-size
Compressed
S3DistCp on EMR Job Sample
./elastic-mapreduce --jobflow j-3GY8JC4179IOK --jar \
  /home/hadoop/lib/emr-s3distcp-1.0.jar \
  --args '--src,s3://myawsbucket/cf,--dest,s3://myoutputbucket/aggregate,--groupBy,.*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*,--targetSize,128,--outputCodec,lzo,--deleteOnSuccess'
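
The `--groupBy` pattern decides which input files end up in the same aggregated output: files whose names yield the same value for the parenthesized group are combined. The sketch below only illustrates that grouping rule in Python and is not how S3DistCp itself is implemented; the object keys and the function name are hypothetical:

```python
import re
from collections import defaultdict

# Same pattern as the --groupBy argument; the group captures the date and hour.
GROUP_BY = re.compile(r".*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*")

def group_keys(keys):
    """Map each captured value to the list of object keys that share it."""
    groups = defaultdict(list)
    for key in keys:
        match = GROUP_BY.match(key)
        if match:  # keys that do not match are left out of this sketch
            groups[match.group(1)].append(key)
    return dict(groups)

keys = [
    "cf/XABCD12345678.2012-05-25-22.aaaaaaaa.gz",  # hypothetical file names
    "cf/XABCD12345678.2012-05-25-22.bbbbbbbb.gz",
    "cf/XABCD12345678.2012-05-25-23.cccccccc.gz",
]
grouped = group_keys(keys)
print(sorted(grouped))  # ['2012-05-25-22', '2012-05-25-23']
```

With this pattern, many small per-hour files collapse into one group per hour, which is what makes the aggregated outputs fewer and more evenly sized.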
Pig for Access Logs Analysis
-- Load and Filter (cat / grep)
RAW_LOG = LOAD 's3://myoutputbucket/aggregate/' AS (ts:chararray, url:chararray…);
LOGS_BASE_F = FILTER RAW_LOG BY url MATCHES '^GET /__track.*$';

-- Parse (awk)
LOGS_BASE_F_W_PARAM = FOREACH LOGS_BASE_F GENERATE
    url,
    DATE_TIME(ts, 'dd/MMM/yyyy:HH:mm:ss Z') AS dt,
    SUBSTRING(DATE_TIME(ts, 'dd/MMM/yyyy:HH:mm:ss Z'), 0, 10) AS day,
    …
    status,
    REGEX_EXTRACT(url, '^GET /([^?]+)', 1) AS action:chararray,
    REGEX_EXTRACT(url, 'idt=([^&]+)', 1) AS idt:chararray,
    REGEX_EXTRACT(url, 'idc=([^&]+)', 1) AS idc:chararray;
I1 = FILTER LOGS_BASE_F_W_PARAM BY action == 'clic' OR action == 'display';
LOGS_SHORT = FOREACH I1 GENERATE uuid, action, dt, day, ida, idas, act, idp, idcmp, idc;

-- Store (>)
G1 = GROUP LOGS_SHORT BY (uuid, idc);
STORE G1 INTO 's3://mybucket/sessions/';
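
The filter, parse, and group stages of the script above can be tried on a handful of lines before running them at scale. A rough Python sketch of the same three stages, reusing the script's regular expressions; the sample requests and helper names are made up for illustration, and only the `action`, `idt`, and `idc` fields are extracted:

```python
import re
from collections import defaultdict

TRACK = re.compile(r"^GET /__track.*$")
ACTION = re.compile(r"^GET /([^?]+)")
IDT = re.compile(r"idt=([^&]+)")
IDC = re.compile(r"idc=([^&]+)")

def first_group(pattern, text):
    """Return the first capture group of the first match, or None if no match."""
    match = pattern.search(text)
    return match.group(1) if match else None

def parse(requests):
    """Keep tracking requests only and extract the fields encoded in the URL."""
    rows = []
    for url in requests:
        if TRACK.match(url):  # filter stage
            rows.append({     # parse stage
                "action": first_group(ACTION, url),
                "idt": first_group(IDT, url),
                "idc": first_group(IDC, url),
            })
    return rows

requests = [
    "GET /__track.gif?idt=5.1.5&idc=5",
    "GET /index.html",
    "GET /__track.gif?idt=5.1.5&idc=7",
]
by_idc = defaultdict(list)
for row in parse(requests):
    by_idc[row["idc"]].append(row)  # group stage, keyed by idc only
print(sorted(by_idc))  # ['5', '7']
```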
Pig vs. Hive
• Pig is geared toward sequentially transforming data
– ETL
– Shell at scale (from local mode to any scale)

• Hive is for querying data
– Data analysis / HQL
– Some transformation, typically as a means to an end, e.g., temporary tables
Monitoring Pig

https://github.com/netflix/lipstick
Another Monitoring Tool

https://github.com/twitter/ambrose
Optimize Your EMR Cluster
Monitor Your EMR Cluster
Bootstrap Actions
--bootstrap-action s3://elasticmapreduce/bootstrap-actions/install-ganglia
Management Console
Customer Tools
Gathering information about EMR jobs from multiple sources and presenting it in textual and graphical views

github.com/Hi-Media/EmrMonitoring
Completed Job View
Spot Bidding Strategies

Fewer interruptions
Not paying more
Most savings
Jeff Bezos (early Amazon days)
Data Sources

Value

Queries
More Trends to Consider
Transactional Processing | Analytical Processing
Transactional context | Global context
Latency | Throughput
Indexed access | Full table scans
Random IO | Sequential IO
Disk seek times | Disk transfer rate
COPY into Amazon Redshift
create table cf_logs
( d date, t char(8), edge char(4), bytes int, cip varchar(15),
verb char(3), distro varchar(MAX), object varchar(MAX), status int,
Referer varchar(MAX), agent varchar(MAX), qs varchar(MAX) );

copy cf_logs from 's3://big-data/logs/E123ABCDEF/'
credentials
'aws_access_key_id=<key_id>;aws_secret_access_key=<secret_key>'
IGNOREHEADER 2
GZIP
DELIMITER '\t'
DATEFORMAT 'YYYY-MM-DD';
COPY into Amazon Redshift with AWS Data Pipeline
Charles Minard's flow map of Napoleon's March (1869)

Time for Data Visualization
Choose Your Favorite Visualization Tool
Tableau (Windows instance)
R
Jaspersoft
QlikView
MicroStrategy
SiSense
…
Snapshot before Delete
Unload Data from Amazon Redshift
unload ('select * from cf_logs where d between \'2013-11-03\' and \'2013-11-10\'')
to 's3://mybucket/unload_cf_logs_week_46'
credentials 'aws_access_key_id=<key_id>;aws_secret_access_key=<secret_key>'
delimiter as '\t'
GZIP;
Reference Architecture
Partner Services
Loggly
Splunk
Stratalux (Logstash)
…

Loggly AWS Marketplace Page
What Else Can You Do with Log Analysis?
Finally, a Small Warning

Abraham Wald (1902-1950)
Would You Like to Know More?
Further reading
http://aws.amazon.com/architecture

http://aws.amazon.com/articles
http://aws.typepad.com

re:Invent sessions
DAT205 - Amazon Redshift in Action: Enterprise, Big Data, and SaaS
DAT305 - Getting Maximum Performance from Amazon Redshift
BDT301 - Scaling your Analytics with Amazon Elastic MapReduce
Please give us your feedback on this
presentation

ARC306
As a thank you, we will select prize
winners daily for completed surveys!

Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Recently uploaded (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

Lumberjacking on AWS: Cutting Through Logs to Find What Matters (ARC306) | AWS re:Invent 2013

  • 1. ARC 306: Lumberjacking on AWS Cutting Through Logs to Find What Matters Guy Ernest, Solutions Architecture November 15, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 4. Progress Is Not Evenly Distributed
       Metric        | 1980           | Today    | Change
       Cost          | $14,000,000/TB | $30/TB   | 450,000 ÷
       Capacity      | 100 MB         | 3 TB     | 30,000 ×
       Transfer rate | 4 MB/s         | 200 MB/s | 50 ×
  • 5. Solution: More Spindles (photo by Kheel Center, Cornell University)
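The figures on slide 4 imply that reading a whole disk takes far longer today than in 1980, because capacity grew much faster than transfer rate. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1,000,000 MB) and the slide's round numbers; the function name and the choice of 10 spindles are illustrative only:

```python
def full_scan_seconds(capacity_mb: float, rate_mb_per_s: float) -> float:
    """Time to read an entire disk sequentially."""
    return capacity_mb / rate_mb_per_s

MB_PER_TB = 1_000_000  # decimal units (assumption)

scan_1980 = full_scan_seconds(100, 4)               # 100 MB at 4 MB/s
scan_today = full_scan_seconds(3 * MB_PER_TB, 200)  # 3 TB at 200 MB/s

print(f"1980:  {scan_1980:.0f} s")
print(f"Today: {scan_today / 3600:.1f} h")

# Spreading the same 3 TB over N disks read in parallel divides the scan time by N.
spindles = 10  # illustrative value
print(f"{spindles} spindles: {scan_today / spindles / 60:.0f} min")
```

The ratio between the two scan times (seconds versus hours) is the gap that adding spindles is meant to close.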
  • 6. Case Study – Foursquare
  • 7. The Challenge “…Foursquare streams hundreds of millions of application logs each day. The company relies on analytics to report on its daily usage, evaluate new offerings, and perform long-term trend analysis—and with millions of new check-ins each day, the workload is only growing…”
  • 8. “Real” Project Requirements Example
       Cost Analysis
         Data transfer
           • By date/time
           • By edge location
           • By date/time within an edge location
           • By top X URLs
           • By HTTP vs. HTTPS
       Marketing
         Top URLs
           • As-is count
           • By content type
           • By edge location
           • By edge location and content type
         Top ads
           • That lead to a game purchase
         Top games
           • By age
           • By income
           • By gender
       Operations
         Error rates
           • By top X URLs
           • By edge location
           • By edge location and content type
         Requests served
           • By edge location
       Revenue
         Top games
           • By revenue
           • By edge location and revenue
         Revenue
           • By edge location
  • 10. Available Data Sources
       Metric | Sources
       Data transfer by date/time | CloudFront logs
       Data transfer by edge location | CloudFront logs
       Data transfer by date/time within an edge location | CloudFront logs
       Data transfer by top X URLs | CloudFront logs, web server logs
       Data transfer by HTTP vs. HTTPS | CloudFront logs
       Top URLs | CloudFront logs, web server logs
       Top URLs by content type | CloudFront logs
       Top URLs by edge location | CloudFront logs
       Top URLs by edge location and content type | CloudFront logs
       Error rates by top X URLs | CloudFront logs, web server logs
       Error rate by edge location | CloudFront logs
       Error rate by edge location and content type | CloudFront logs
       Requests served by edge location | CloudFront logs
       Revenue by edge location | CloudFront logs, OrdersDB, app server logs
       Top games segmented by age | CloudFront logs, user profile
       Top games segmented by income | CloudFront logs, user profile
       Top games segmented by gender | CloudFront logs, user profile
       Top games by revenue | CloudFront logs, OrdersDB
       Top games by edge location and revenue | CloudFront logs, OrdersDB
       Top game revenue segmented by age | CloudFront logs, OrdersDB, user profile
  • 11. CloudFront Access Log Format
       #Version: 1.0
       #Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query
       2012-05-25 22:01:30 AMS1 4448 94.212.249.78 GET d1234567890213.cloudfront.net /YT0KthT/F5SOWdDPqNqQF07tiTOXqJMpfDdlb3LMwv3/jP3/CINm/yDSy0MsRcWJN/Simutrans.exe 200 http://AtRJw2kxg0EMW.com/kZetr/YCb6AM9N2xt2 Mozilla/5.0%20(compatible;%20MSIE%209.0;%20Windows%20NT%206.1;%20WOW64;%20Trident/5.0) uid=100&oid=108625181
       2012-05-25 22:01:30 AMS1 4952 94.212.249.78 GET d1234567890213.cloudfront.net /66IG584/CPCxY0P44BGb5ZOd3qSUrauL050LOvFwaMj/eH/caw/Blob Wars-Blob And Conquer.exe 200 http://AtRJw2kxg0EMW.com/kZetr/YCb6AM9N2xt2 Mozilla/5.0%20(compatible;%20MSIE%209.0;%20Windows%20NT%206.1;%20WOW64;%20Trident/5.0) uid=100&oid=108625184
       2012-05-25 22:01:30 AMS1 4556 78.8.5.135 GET d1234567890213.cloudfront.net /SwlufjC/xEjH3BRbXMXwmFWqzKt7od6tlWR3e13LhmH/V3eF/lo6g/AstroMenace.exe 200 http://AtRJw2kxg0EMW.com/AC1vg/1727EWfb7fPt Opera/9.80%20(Windows%20NT%205.1;%20U;%20pl)%20Presto/2.10.229%20Version/11.60 uid=100&oid=108625189
       2012-05-25 22:01:30 AMS1 47172 78.8.5.135 GET d1234567890213.cloudfront.net /Di1cXoN/TskldkSHcgkvZXQEmv5vOVR25X5UTisFkRq/pQa/wCjUXZb/Z1HRuGlo/Kroz.exe 200 http://AtRJw2kxg0EMW.com/AC1vg/1727EWfb7fPt Opera/9.80%20(Windows%20NT%205.1;%20U;%20pl)%20Presto/2.10.229%20Version/11.60 uid=100&oid=108625206
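The log format above can be read with a few lines of code. The following is a minimal sketch, not a complete parser: it assumes tab-separated records (as in the actual log files) and takes the column names from the `#Fields` header. The sample record is shortened, its path and referrer are placeholders, and `parse_cloudfront_log` is a hypothetical helper name.

```python
# Shortened sample: two header lines plus one tab-separated record.
SAMPLE = (
    "#Version: 1.0\n"
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
    "cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query\n"
    "2012-05-25\t22:01:30\tAMS1\t4448\t94.212.249.78\tGET\t"
    "d1234567890213.cloudfront.net\t/path/Simutrans.exe\t200\t"
    "http://example.com/ref\tMozilla/5.0%20(compatible)\tuid=100&oid=108625181\n"
)

def parse_cloudfront_log(text):
    """Yield one dict per record, keyed by the names in the #Fields header."""
    fields = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))

records = list(parse_cloudfront_log(SAMPLE))
first = records[0]
print(first["x-edge-location"], first["sc-status"], first["sc-bytes"])
```

Keying each record by header name rather than by position keeps the sketch usable if the field list differs between log versions.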
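  • 12. Sample Your Data with R
       library(ggplot2)
       # Read the tab-delimited sample log; the file has no column header row
       sample_data <- read.delim("SampleFiles/E123ABCDEF.2012-05-25-22.NEfbhLN3", header=F)
       # Drop the "#Version" and "#Fields" lines
       sample_data <- sample_data[-1:-2,]
       View(sample_data)
       # V9 is sc-status; count each status code and plot on a log scale
       m <- ggplot(sample_data, aes(x = factor(V9)))
       m + geom_bar() + scale_y_log10() + xlab('Error Codes') + ylab('log(Frequency)')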
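A similar first look at the status-code distribution can be taken without a plotting library. This is a rough sketch only: the status values are hypothetical stand-ins for the `sc-status` column (V9 in the R session), and the text bars are scaled by log10 to mirror the log-scaled axis.

```python
import math
from collections import Counter

# Hypothetical sc-status values, standing in for column V9 of the sample file.
statuses = ["200"] * 950 + ["304"] * 40 + ["404"] * 9 + ["500"]

counts = Counter(statuses)
for code, n in sorted(counts.items()):
    # One '#' per tenth of a decade, so rare codes stay visible next to 200s.
    bar = "#" * (round(10 * math.log10(n)) + 1)
    print(f"{code} {n:>5} {bar}")
```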
  • 13. Need a Lot of Memory?
  • 14. OpenRefine Running on an EC2 Instance
  • 19. Log Shipping (Swedish public domain photo taken in 1918)
  • 20. “Poor Man’s Log Shipping”
  • 22. Embedding Poor-man Invisible Pixel
       http://www.poor-mananalytics.com/__track.gif?idt=5.1.5&idc=5&utmn=1532897343&utmhn=www.douban.com&utmcs=UTF-8&utmsr=1440x900&utmsc=24-bit&utmul=enus&utmje=1&utmfl=10.3%20r181&utmdt=%E8%B1%86%E7%93%A3&utmhid=571356425&utmr=-&utmp=%2F&utmac=UA-70197651&utmcc=__utma%3D30149280.1785629903.1314674330.1315290610.1315452707.10%3B%2B__utmz%3D30149280.1315452707.10.7.utmcsr%3Dbiaodianfu.com%7Cutmccn%3D(referral)%7Cutmcmd%3Dreferral%7Cutmcct%3D%2Fpoor-man-analyticsarchitecture.html%3B%2B__utmv%3D30149280.162%3B&utmu=qBM~
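Each request for the invisible pixel carries its payload in the query string, so the web server's access log becomes the event log. A minimal sketch of decoding one such request with the Python standard library follows; the URL is a shortened version of the one on the slide, and keeping only the first value per parameter is a simplifying assumption.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened version of the slide's request; only a few parameters are kept.
url = (
    "http://www.poor-mananalytics.com/__track.gif"
    "?idt=5.1.5&idc=5&utmhn=www.douban.com&utmcs=UTF-8"
    "&utmsr=1440x900&utmdt=%E8%B1%86%E7%93%A3&utmp=%2F"
)

parts = urlsplit(url)
# parse_qs returns a list per key and percent-decodes the values.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(parts.path)
print(params["idt"], params["idc"], params["utmsr"])
print(params["utmhn"], params["utmp"])
```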
  • 25. Open Source Frameworks
       Fluentd, Flume, Scribe, Chukwa, …

       Fluentd ASCII diagrams

         Input                                 Output
       +--------------------------------------------+
       |                                            |
       |  Web Apps  ---+                 +--> File  |
       |               |                 |          |
       |               +-->           ---+          |
       |  /var/log  ------>  Fluentd  ------> Mail  |
       |               +-->           ---+          |
       |               |                 |          |
       |  Apache    ---+                 +--> S3    |
       |                                            |
       +--------------------------------------------+

       Web Server
       +---------+
       | Fluentd -------+
       +---------+      |
                        |
       Proxy Server     |
       +---------+      +--> +---------+
       | Fluentd ----------> | Fluentd |
       +---------+      +--> +---------+
                        |
       Database Server  |
       +---------+      |
       | Fluentd -------+
       +---------+
  • 26. Use Amazon Kinesis to Ship Your Logs (New)
  • 29. S3DistCp on EMR Job Sample
       ./elastic-mapreduce --jobflow j-3GY8JC4179IOK \
         --jar /home/hadoop/lib/emr-s3distcp-1.0.jar \
         --args '--src,s3://myawsbucket/cf,--dest,s3://myoutputbucket/aggregate,--groupBy,.*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*,--targetSize,128,--outputCodec,lzo,--deleteOnSuccess'
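The `--groupBy` pattern decides which small log files are concatenated: files whose names yield the same captured group (here the date and hour) end up in the same output file. The sketch below only illustrates that grouping rule with Python's `re` module. It is not how S3DistCp is implemented, the object keys are hypothetical, and skipping non-matching keys is a choice of this sketch.

```python
import re
from collections import defaultdict

# The pattern from the job above; the parentheses capture YYYY-MM-DD-HH.
GROUP_BY = re.compile(r".*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*")

# Hypothetical object keys in the style of CloudFront log file names.
keys = [
    "cf/XABCD12345678.2012-02-23-01.HLUS3JKx.gz",
    "cf/XABCD12345678.2012-02-23-01.I9CNAZrg.gz",
    "cf/XABCD12345678.2012-02-23-02.YRRwERSA.gz",
    "cf/README.txt",
]

groups = defaultdict(list)
for key in keys:
    match = GROUP_BY.fullmatch(key)
    if match:  # keys that do not match are skipped in this sketch
        groups[match.group(1)].append(key)

for hour, members in sorted(groups.items()):
    print(hour, len(members))
```

Grouping by hour turns many small files into a few larger ones, which is what the `--targetSize` option then bounds.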
  • 32. Pig for Access Logs Analysis
       -- Load and Filter (cat / grep)
       RAW_LOG = LOAD 's3://myoutputbucket/aggregate/' AS (ts:chararray, url:chararray…);
       LOGS_BASE_F = FILTER RAW_LOG BY url MATCHES '^GET /__track.*$';
       -- Parse (awk)
       LOGS_BASE_F_W_PARAM = FOREACH LOGS_BASE_F GENERATE
           url,
           DATE_TIME(ts, 'dd/MMM/yyyy:HH:mm:ss Z') as dt,
           SUBSTRING(DATE_TIME(ts, 'dd/MMM/yyyy:HH:mm:ss Z'), 0, 10) as day,
           …
           status,
           REGEX_EXTRACT(url, '^GET /([^?]+)', 1) AS action: chararray,
           REGEX_EXTRACT(url, 'idt=([^&]+)', 1) AS idt: chararray,
           REGEX_EXTRACT(url, 'idc=([^&]+)', 1) AS idc: chararray;
       I1 = FILTER LOGS_BASE_F_W_PARAM by action == 'clic' or action == 'display';
       LOGS_SHORT = FOREACH I1 GENERATE uuid, action, dt, day, ida, idas, act, idp, idcmp, idc;
       -- Store (>)
       G1 = GROUP LOGS_SHORT BY (uuid,idc);
       store G1 into 's3://mybucket/sessions/';
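To see what the filter and the `REGEX_EXTRACT` calls do to a single record, the same regular expressions can be tried on one request string. This is a rough single-machine analogue of the load-and-parse steps, not a replacement for the Pig job; `regex_extract` and `parse_request` are hypothetical helpers and the request strings are made up.

```python
import re

def regex_extract(text, pattern, group=1):
    """Rough analogue of Pig's REGEX_EXTRACT: the captured group, or None."""
    match = re.search(pattern, text)
    return match.group(group) if match else None

def parse_request(url):
    """Return the extracted fields for a tracking request, or None otherwise."""
    # Pig's MATCHES tests the whole string, hence fullmatch.
    if not re.fullmatch(r"GET /__track.*", url):
        return None
    return {
        "action": regex_extract(url, r"^GET /([^?]+)"),
        "idt": regex_extract(url, r"idt=([^&]+)"),
        "idc": regex_extract(url, r"idc=([^&]+)"),
    }

# Hypothetical request strings in the "GET /path?query" form the filter expects.
tracked = parse_request("GET /__track.gif?idt=5.1.5&idc=5&utmn=1532897343")
ignored = parse_request("GET /index.html")
print(tracked)
print(ignored)
```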
  • 33. Pig vs. Hive
       • Pig is geared toward sequentially transforming data
         – ETL
         – Shell at scale (from local mode to any scale)
       • Hive is for querying data
         – Data analysis / HQL
         – Some transformation, typically as a means to a goal, e.g., temporary tables
  • 36. Optimize Your EMR Cluster
  • 37. Monitor Your EMR Cluster
  • 41. Customers' Tools
       Gathers information about EMR jobs from multiple sources and presents it in textual and graphic views.
       github.com/Hi-Media/EmrMonitoring
  • 46. Jeff Bezos (early Amazon days)
  • 48. More Trends to Consider
       Transactional Processing | Analytical Processing
       Transactional context    | Global context
       Latency                  | Throughput
       Indexed access           | Full table scans
       Random IO                | Sequential IO
       Disk seek times          | Disk transfer rate
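The last two rows of the comparison (seek times versus transfer rate) can be made concrete with a back-of-the-envelope calculation. Every number below except the 200 MB/s transfer rate from slide 4 is an assumption chosen for illustration, including the 10 ms per random read.

```python
SEEK_S = 0.010            # assumed seek + rotational delay per random read
TRANSFER_MB_PER_S = 200   # sequential rate quoted on slide 4
RECORDS = 1_000_000       # assumed number of records touched
RECORD_BYTES = 100        # assumed record size

# Indexed access: one seek per record.
random_s = RECORDS * SEEK_S
# Full scan: read the same records back to back.
sequential_s = RECORDS * RECORD_BYTES / 1_000_000 / TRANSFER_MB_PER_S

print(f"random reads:    {random_s:,.0f} s")
print(f"sequential scan: {sequential_s:.1f} s")
```

Under these assumptions, touching every record through random reads is dominated by seek time, which is why analytical workloads favor full sequential scans.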
  • 51. COPY into Amazon Redshift
       create table cf_logs (
         d date,
         t char(8),
         edge char(4),
         bytes int,
         cip varchar(15),
         verb char(3),
         distro varchar(MAX),
         object varchar(MAX),
         status int,
         Referer varchar(MAX),
         agent varchar(MAX),
         qs varchar(MAX)
       );

       copy cf_logs
       from 's3://big-data/logs/E123ABCDEF/'
       credentials 'aws_access_key_id=<key_id>;aws_secret_access_key=<secret_key>'
       IGNOREHEADER 2
       GZIP
       DELIMITER '\t'
       DATEFORMAT 'YYYY-MM-DD';
  • 52. COPY into Amazon Redshift with AWS Data Pipeline
  • 53. Time for Data Visualization (Charles Minard's flow map of Napoleon's March, 1869)
  • 55. Choose Your Favorite Visualization Tool
       Tableau (Windows instance), R, Jaspersoft, QlikView, MicroStrategy, SiSense, …
  • 60. Unload Data from Amazon Redshift
       unload ('select * from cf_logs where d between \'2013-11-03\' and \'2013-11-10\'')
       to 's3://mybucket/unload_cf_logs_week_46'
       credentials 'aws_access_key_id=<key_id>;aws_secret_access_key=<secret_key>'
       delimiter as '\t'
       GZIP;
  • 63. What Else Can You Do with Log Analysis?
  • 65. Finally, a Small Warning (photo: Abraham Wald, 1902-1950)
  • 66. B C A
  • 68. Would You Like to Know More?
       Further reading
         • http://aws.amazon.com/architecture
         • http://aws.amazon.com/articles
         • http://aws.typepad.com
       re:Invent sessions
         • DAT205 - Amazon Redshift in Action: Enterprise, Big Data, and SaaS
         • DAT305 - Getting Maximum Performance from Amazon Redshift
         • BDT301 - Scaling your Analytics with Amazon Elastic MapReduce
  • 69. Please give us your feedback on this presentation ARC306 As a thank you, we will select prize winners daily for completed surveys!