Scalable Application Insights Framework
Yahoo! Architect Development Program, 2013
Rajesh Chandramohan (rajeshc@)
Overview
Build a system/framework to aggregate and visualize user and system insights from production application data sets.
Motivation: Production servers generate huge volumes of logs, and capturing all of that data is now both economical and valuable.
Implementation
• Hadoop & HBase/Hive
• Aggregators: Scribe / Flume
• Visualization: Hive with MySQL or Splunk
Evaluate methodologies like Shark (which is suitable for the million-events-per-hour scale and terabytes of stored data)
Use Cases
• End-user-level metrics and events in production.
• Track outbound/inbound mail, which runs into the billions.
• Consolidate scattered data sets spread across multiple application servers.
• Mail delivery percentiles, based on latency buckets.
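The latency-bucket use case can be sketched as follows. This is a minimal illustration only: the bucket boundaries (in seconds) and the function names are assumptions, not values or code from the production system.

```python
from bisect import bisect_left

# Hypothetical upper bounds of the latency buckets, in seconds (assumption).
BUCKET_BOUNDS = [1, 5, 30, 60, 300]


def bucket_counts(latencies, bounds=BUCKET_BOUNDS):
    """Count deliveries per bucket; the last slot holds latencies above every bound."""
    counts = [0] * (len(bounds) + 1)
    for latency in latencies:
        # bisect_left puts a latency equal to a bound into that bound's bucket.
        counts[bisect_left(bounds, latency)] += 1
    return counts


def cumulative_percent(counts):
    """Percentage of deliveries completed within each bucket's upper bound."""
    total = sum(counts)
    running, result = 0, []
    for count in counts:
        running += count
        result.append(100.0 * running / total if total else 0.0)
    return result
```

Reading the cumulative list gives statements such as "N% of mail was delivered within 30 seconds", which is the kind of figure a delivery-percentile report needs.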
Delivery Pipeline
Outbound Data Events
Requirements
Functional Requirements:
• Data aggregation over a time window
• Pattern matching with Pig scripts and custom scripts
• Create relations between data sets and persistent storage
• Informative visualization
Non-Functional Requirements:
• Review with stakeholders and architects
• Scalable data aggregation
• Easy operability
• Consistency, reliability, durability
• Data quality: accuracy, completeness
• Performance: scalability, latency
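Aggregation over a time window, the first functional requirement, can be sketched as below. The 10-minute window size, the event tuple layout, and the function names are assumptions made for illustration; in the deck this work is done with Hadoop jobs rather than a single Python process.

```python
from collections import Counter

WINDOW_SECONDS = 600  # hypothetical 10-minute window (assumption)


def window_start(epoch_seconds, window=WINDOW_SECONDS):
    """Round a Unix timestamp down to the start of its window."""
    return epoch_seconds - (epoch_seconds % window)


def aggregate(events, window=WINDOW_SECONDS):
    """Count events per (window start, event type).

    `events` is an iterable of (epoch_seconds, event_type) pairs.
    """
    counts = Counter()
    for ts, event_type in events:
        counts[(window_start(ts, window), event_type)] += 1
    return counts
```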
Outbound Data Scale
~400 million outbound mails per day
Components      Size/Hour
1 web farm      ~1 GB
300 farms       300 GB
Total: ~300 GB × 24 ≈ 7 TB per day
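The sizing arithmetic can be checked with a few lines. The inputs are the approximate figures from the slide; the use of decimal units (1 TB = 1000 GB) and the average-rate calculation are assumptions added here.

```python
FARMS = 300                    # from the slide
GB_PER_FARM_PER_HOUR = 1       # approximate, from the slide
MAILS_PER_DAY = 400_000_000    # approximate, from the slide

gb_per_hour = FARMS * GB_PER_FARM_PER_HOUR    # 300 GB per hour
gb_per_day = gb_per_hour * 24                 # 7200 GB per day
tb_per_day = gb_per_day / 1000                # 7.2 TB per day in decimal units
mails_per_second = MAILS_PER_DAY / 86_400     # average rate, ignoring peaks
```

The daily total rounds to the ~7 TB quoted on the slide, and the mail volume averages out to several thousand outbound mails per second.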
System Workflow
Architecture
[Figure: architecture diagram. Labels recoverable from the slide: ProdHosts; App Logs/Unstructured Data; Data logs; Kafka/Flume; Data Push; Aggregator; Fetch Data tool; HDFS Proxy; Grid (Bassnium-Tan); GDM Pull; Oozie Jobs; Pig (Unstructured Data, Structured Data); Semi-formatted Data; HBase; Hive; Shark; MySQL; Apache/PHP; Launcher/UI; Formatted Data/graphs; SSH Keys; YCA Auth; DS Kerberos.]
Technology Stack
Data aggregation framework:
• Started with custom scripts based on logtail and parallel processing. Data is currently collected at 30-60 minute intervals. Scribe, Fluentd, and Flume were also evaluated.
• Hadoop: raw data storage and processing, and relating user events to one another.
• Oozie: data aggregation, relation/processing, and data management are all controlled through scheduled workflows.
• HBase: stores processed data in HTable format so that results can be retrieved efficiently.
• Hive/Shark: evaluating Shark as an in-memory store so that data can be retrieved faster, even at intervals of seconds.
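The logtail-style collection mentioned above amounts to reading only what was appended to a log since the previous run. The sketch below shows that idea under stated assumptions: the function name, the saved byte offset, and the rotation check are hypothetical and are not taken from the original scripts.

```python
import os


def read_new_lines(log_path, offset):
    """Return (lines, new_offset) for data appended since `offset`.

    If the file is now shorter than `offset`, assume it was rotated
    and start again from the beginning.
    """
    if os.path.getsize(log_path) < offset:
        offset = 0
    with open(log_path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Keep only complete lines; a partial trailing line waits for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A scheduler would call this once per collection interval, persist the returned offset, and ship the lines to the aggregator.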
HBase Schema
Mail success delivery; bounced/failed delivery
HBase Table
HBase Scalability
Why Not Splunk
• Splunk is commercial and lacks versatility
• Splunk is not very customizable
• We would have to depend on their tools and systems
We Are Using Splunk
• As Splunk is available at Yahoo, we intend to use it only for data visualization, through the Hadoop Connect interface.
• Hunk is a virtual indexer service and would be the right choice for us.
Challenges & Learning
• Hadoop data access is sequential
• Smaller output files hit the namespace quota
• HBase is good for storing output files
• HBase schema decisions (row key design, reducing the number of region servers)
• Shark was evaluated: good, but not feasible to implement
• Flume: durable, flexible
• The Hive layer helped with data visualization
• Availability of a Splunk license made visualization simpler
• Managing delayed-mail information in the delivery pipeline
• Consistency when plotting 300-500 million events per day
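Row key design matters because HBase keeps rows sorted lexicographically by key. The sketch below illustrates one common consideration, fixed-width fields, so that string order matches numeric and time order. The key layout (farm ID, timestamp, message ID) is entirely hypothetical and is not the schema used in this project.

```python
def make_row_key(farm_id, epoch_seconds, message_id):
    """Build a fixed-width row key: farm, then timestamp, then message ID."""
    return f"{farm_id:04d}|{epoch_seconds:010d}|{message_id}"


def scan_range(farm_id, start_epoch, stop_epoch):
    """Start (inclusive) and stop (exclusive) keys covering one farm's time range."""
    return (f"{farm_id:04d}|{start_epoch:010d}|",
            f"{farm_id:04d}|{stop_epoch:010d}|")
```

Without zero padding, the string "999" would sort after "1000", and a time-range scan would return rows in the wrong order or miss them.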
NEW ARCHITECTURE
Demo
Outputs with the Hadoop system:
http://trackoutbound.mail.yahoo.com:9999/trackoutboundmail/
Demo back-end systems:
HBase analysis:
http://twiki.corp.yahoo.com/view/Mail/SELogsOnHBASE
TechPulse 2013: http://techpulse-submission.corp.yahoo.com/paper?p=7&ls=1
Achievements:
Works on the Hadoop infrastructure
• Data aggregation at intervals of under 10 minutes
• Sequential reads are eliminated by using the HBase store
• Flexible and scalable aggregation framework
• Feature-rich data visualization with Hive or Splunk
rajeshc@yahoo-inc.com
THANKS
Q & A
HBase Overview
Apache HBase is an open-source, Bigtable-like, distributed, scalable, consistent, random-access, columnar key-value store built on Apache Hadoop.
Column family: Info
Rowkey    Email                     Age    Password
Alice     alice@wonderland.com      23
Bob       bob@myworld.com           25     Iambob
Eve       hithere@getintouch.com    30     nice1pass
The table is lexicographically sorted on rowkeys.
Cells: each cell has multiple versions, represented by timestamps, where ts2 > ts1. Example cell versions shown in the figure: trickedyou, newpassword (ts1 = 1, ts2 = 2).
Identify your data (cell value) in the HBase table by
[1] rowkey, [2] column family, [3] column qualifier, [4] timestamp/version
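The four-part cell address can be modeled with a plain dictionary. This is a toy illustration of the data model, not HBase code. Placing the two example values in Alice's Password cell, and pairing "trickedyou" with ts1 and "newpassword" with ts2, are assumptions made for the example.

```python
from collections import defaultdict

# (rowkey, column family, column qualifier) -> {timestamp: value}
cells = defaultdict(dict)


def put(rowkey, family, qualifier, timestamp, value):
    """Store one version of a cell."""
    cells[(rowkey, family, qualifier)][timestamp] = value


def get(rowkey, family, qualifier, timestamp=None):
    """Return the value at `timestamp`, or the newest version if omitted."""
    versions = cells.get((rowkey, family, qualifier))
    if not versions:
        return None
    if timestamp is None:
        timestamp = max(versions)
    return versions.get(timestamp)


# Assumed pairing of the figure's example values with its timestamps.
put("Alice", "Info", "Password", 1, "trickedyou")
put("Alice", "Info", "Password", 2, "newpassword")
```

Asking for the cell without a timestamp returns the newest version; passing a timestamp returns that specific version if it exists.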
HBase Data Model
HBase Operations
• get(<ROW>)
• put(<ROW>, Map<KEY,VALUE>)
• scan(<TABLE>)
• checkAndDelete()
• checkAndPut()
• increment()
... check the HTable class for further details on operations
Caution:
• No queries
• No secondary indexes
• Billions of rows × millions of columns × thousands of versions
• ZooKeeper as the distributed coordination service
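The semantics of the operations listed above can be illustrated with a small in-memory stand-in. This is a toy model written for this summary, not the HBase client API: the class name, method names, and signatures are hypothetical, and versions and column families are left out for brevity.

```python
import threading


class ToyTable:
    """In-memory stand-in for a table; not the HBase client API."""

    def __init__(self):
        self._rows = {}                  # rowkey -> {column: value}
        self._lock = threading.Lock()    # makes each operation atomic

    def put(self, row, values):
        with self._lock:
            self._rows.setdefault(row, {}).update(values)

    def get(self, row):
        with self._lock:
            return dict(self._rows.get(row, {}))

    def scan(self):
        """Return (rowkey, columns) pairs in lexicographic rowkey order."""
        with self._lock:
            return [(key, dict(self._rows[key])) for key in sorted(self._rows)]

    def check_and_put(self, row, column, expected, values):
        """Apply `values` only if `column` currently equals `expected`."""
        with self._lock:
            current = self._rows.get(row, {}).get(column)
            if current != expected:
                return False
            self._rows.setdefault(row, {}).update(values)
            return True

    def increment(self, row, column, amount=1):
        """Add `amount` to a counter column and return the new value."""
        with self._lock:
            columns = self._rows.setdefault(row, {})
            columns[column] = columns.get(column, 0) + amount
            return columns[column]
```

The point of the check-and-put and increment operations is that the read and the write happen as one atomic step, which is what the lock stands in for here.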
HBase Overview
Scalable Application Insight Framework
Scalable Application Insight Framework

Weitere ähnliche Inhalte

Was ist angesagt?

Spark and Bloomberg by Sudarshan Kadambi and Partha Nageswaran
Spark and Bloomberg by  Sudarshan Kadambi and Partha NageswaranSpark and Bloomberg by  Sudarshan Kadambi and Partha Nageswaran
Spark and Bloomberg by Sudarshan Kadambi and Partha NageswaranSpark Summit
 
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...✔ Eric David Benari, PMP
 
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDBHBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDBHBaseCon
 
Scalable And Incremental Data Profiling With Spark
Scalable And Incremental Data Profiling With SparkScalable And Incremental Data Profiling With Spark
Scalable And Incremental Data Profiling With SparkJen Aman
 
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0Databricks
 
Spark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan KesslerSpark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan KesslerSpark Summit
 
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Data Con LA
 
2 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-212 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-21Hadoop User Group
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon
 
Big data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructureBig data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructuredatastack
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)Spark Summit
 
Spark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas GeerdinkSpark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas GeerdinkSpark Summit
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...confluent
 
Scaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and FeastScaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and FeastDatabricks
 
From zero to hero with the actor model - Tamir Dresher - Odessa 2019
From zero to hero with the actor model  - Tamir Dresher - Odessa 2019From zero to hero with the actor model  - Tamir Dresher - Odessa 2019
From zero to hero with the actor model - Tamir Dresher - Odessa 2019Tamir Dresher
 
Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021Mark Kromer
 
Rich Data Graphs for MapReduce
Rich Data Graphs for MapReduceRich Data Graphs for MapReduce
Rich Data Graphs for MapReduceScott Cinnamond
 

Was ist angesagt? (20)

Spark and Bloomberg by Sudarshan Kadambi and Partha Nageswaran
Spark and Bloomberg by  Sudarshan Kadambi and Partha NageswaranSpark and Bloomberg by  Sudarshan Kadambi and Partha Nageswaran
Spark and Bloomberg by Sudarshan Kadambi and Partha Nageswaran
 
Rameez Rangrez(3)
Rameez Rangrez(3)Rameez Rangrez(3)
Rameez Rangrez(3)
 
Hadoop at Ebay
Hadoop at EbayHadoop at Ebay
Hadoop at Ebay
 
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
 
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDBHBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
HBaseCon 2015: Industrial Internet Case Study using HBase and TSDB
 
Scalable And Incremental Data Profiling With Spark
Scalable And Incremental Data Profiling With SparkScalable And Incremental Data Profiling With Spark
Scalable And Incremental Data Profiling With Spark
 
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
 
Spark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan KesslerSpark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan Kessler
 
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
 
2 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-212 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-21
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraph
 
Big data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructureBig data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructure
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
 
Spark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas GeerdinkSpark Summit EU talk by Bas Geerdink
Spark Summit EU talk by Bas Geerdink
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
 
Scaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and FeastScaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and Feast
 
From zero to hero with the actor model - Tamir Dresher - Odessa 2019
From zero to hero with the actor model  - Tamir Dresher - Odessa 2019From zero to hero with the actor model  - Tamir Dresher - Odessa 2019
From zero to hero with the actor model - Tamir Dresher - Odessa 2019
 
Hadoop-2 @ eBay
Hadoop-2 @ eBayHadoop-2 @ eBay
Hadoop-2 @ eBay
 
Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021
 
Rich Data Graphs for MapReduce
Rich Data Graphs for MapReduceRich Data Graphs for MapReduce
Rich Data Graphs for MapReduce
 

Andere mochten auch

Darts tickets prize draw Ts & Cs 19 11 11 Aftenoon
Darts tickets prize draw Ts & Cs 19 11 11 AftenoonDarts tickets prize draw Ts & Cs 19 11 11 Aftenoon
Darts tickets prize draw Ts & Cs 19 11 11 AftenoonVWCV_Terms_Conditions
 
Casamento carlos tufvesson
Casamento carlos tufvessonCasamento carlos tufvesson
Casamento carlos tufvessonSergyo Vitro
 
Azure Application Insights
Azure Application InsightsAzure Application Insights
Azure Application InsightsKlab
 
Panzergrenadierdivision "Großdeutschland"
Panzergrenadierdivision "Großdeutschland"Panzergrenadierdivision "Großdeutschland"
Panzergrenadierdivision "Großdeutschland"Bill Colmes
 
Macro Economics_Chapter 4_supply demnd
Macro Economics_Chapter 4_supply demndMacro Economics_Chapter 4_supply demnd
Macro Economics_Chapter 4_supply demnddjalex035
 
353 2 гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с
353 2  гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с353 2  гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с
353 2 гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256сrobinbad123100
 

Andere mochten auch (11)

Book2015_SH
Book2015_SHBook2015_SH
Book2015_SH
 
Inf i matem_ju
Inf i matem_juInf i matem_ju
Inf i matem_ju
 
Nr – 12
Nr – 12Nr – 12
Nr – 12
 
Darts tickets prize draw Ts & Cs 19 11 11 Aftenoon
Darts tickets prize draw Ts & Cs 19 11 11 AftenoonDarts tickets prize draw Ts & Cs 19 11 11 Aftenoon
Darts tickets prize draw Ts & Cs 19 11 11 Aftenoon
 
Casamento carlos tufvesson
Casamento carlos tufvessonCasamento carlos tufvesson
Casamento carlos tufvesson
 
hassan4
hassan4hassan4
hassan4
 
Azure Application Insights
Azure Application InsightsAzure Application Insights
Azure Application Insights
 
Panzergrenadierdivision "Großdeutschland"
Panzergrenadierdivision "Großdeutschland"Panzergrenadierdivision "Großdeutschland"
Panzergrenadierdivision "Großdeutschland"
 
Macro Economics_Chapter 4_supply demnd
Macro Economics_Chapter 4_supply demndMacro Economics_Chapter 4_supply demnd
Macro Economics_Chapter 4_supply demnd
 
353 2 гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с
353 2  гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с353 2  гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с
353 2 гдз. английский язык. 9кл. к уч. биболетовой м.з. и др.-2012 -256с
 
Bdsadsadasdsadsa
BdsadsadasdsadsaBdsadsadasdsadsa
Bdsadsadasdsadsa
 

Ähnlich wie Scalable Application Insight Framework

How can Hadoop & SAP be integrated
How can Hadoop & SAP be integratedHow can Hadoop & SAP be integrated
How can Hadoop & SAP be integratedDouglas Bernardini
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Cloudera, Inc.
 
Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.Muthu Natarajan
 
Hadoop Frameworks Panel__HadoopSummit2010
Hadoop Frameworks Panel__HadoopSummit2010Hadoop Frameworks Panel__HadoopSummit2010
Hadoop Frameworks Panel__HadoopSummit2010Yahoo Developer Network
 
How to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHow to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHortonworks
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 
Big Data Technology Stack : Nutshell
Big Data Technology Stack : NutshellBig Data Technology Stack : Nutshell
Big Data Technology Stack : NutshellKhalid Imran
 
Google Data Engineering.pdf
Google Data Engineering.pdfGoogle Data Engineering.pdf
Google Data Engineering.pdfavenkatram
 
Data Engineering on GCP
Data Engineering on GCPData Engineering on GCP
Data Engineering on GCPBlibBlobb
 
Data ingestion
Data ingestionData ingestion
Data ingestionnitheeshe2
 
Architecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchArchitecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchHortonworks
 
Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy snehal parikh
 
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune amrutupre
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoopdarugar
 
The other Apache Technologies your Big Data solution needs
The other Apache Technologies your Big Data solution needsThe other Apache Technologies your Big Data solution needs
The other Apache Technologies your Big Data solution needsgagravarr
 

Ähnlich wie Scalable Application Insight Framework (20)

Big Data , Big Problem?
Big Data , Big Problem?Big Data , Big Problem?
Big Data , Big Problem?
 
How can Hadoop & SAP be integrated
How can Hadoop & SAP be integratedHow can Hadoop & SAP be integrated
How can Hadoop & SAP be integrated
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
 
Sureh hadoop 3 years t
Sureh hadoop 3 years tSureh hadoop 3 years t
Sureh hadoop 3 years t
 
Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.
 
Intro to Hadoop
Intro to HadoopIntro to Hadoop
Intro to Hadoop
 
Hadoop Frameworks Panel__HadoopSummit2010
Hadoop Frameworks Panel__HadoopSummit2010Hadoop Frameworks Panel__HadoopSummit2010
Hadoop Frameworks Panel__HadoopSummit2010
 
How to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHow to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDB
 
Hadoop_arunam_ppt
Hadoop_arunam_pptHadoop_arunam_ppt
Hadoop_arunam_ppt
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Big Data Technology Stack : Nutshell
Big Data Technology Stack : NutshellBig Data Technology Stack : Nutshell
Big Data Technology Stack : Nutshell
 
Google Data Engineering.pdf
Google Data Engineering.pdfGoogle Data Engineering.pdf
Google Data Engineering.pdf
 
Data Engineering on GCP
Data Engineering on GCPData Engineering on GCP
Data Engineering on GCP
 
Data ingestion
Data ingestionData ingestion
Data ingestion
 
Architecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchArchitecting the Future of Big Data and Search
Architecting the Future of Big Data and Search
 
Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy
 
In15orlesss hadoop
In15orlesss hadoopIn15orlesss hadoop
In15orlesss hadoop
 
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoop
 
The other Apache Technologies your Big Data solution needs
The other Apache Technologies your Big Data solution needsThe other Apache Technologies your Big Data solution needs
The other Apache Technologies your Big Data solution needs
 

Kürzlich hochgeladen

➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachBoston Institute of Analytics
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...amitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...gajnagarg
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 

Kürzlich hochgeladen (20)

➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 

Scalable Application Insight Framework

  • 1. Scalable Application Insights Framework. Yahoo! Architect Development Program, 2013. Rajesh Chandramohan (rajeshc@)
  • 2. Overview
      Build a system/framework to aggregate and visualize user and system insights from production application data sets.
      Motivation: production servers generate huge volumes of logs, and capturing all of that data is now economical and valuable.
      Implementation:
      • Hadoop & HBase/Hive
      • Aggregators: Scribe / Flume
      • Visualization: Hive with MySQL, or Splunk
      Evaluate methodologies like Shark (which is suitable for a scale of millions of events per hour and terabytes of stored data).
  • 3. Use Cases
      • End-user-level metrics and events in production.
      • Track outbound/inbound mail, which runs into the billions.
      • Consolidate scattered data sets spread across multiple application servers.
      • Mail delivery percentiles, based on latency buckets.
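The last use case, delivery percentiles derived from latency buckets, can be sketched roughly as follows. This is a minimal illustration, not the deck's implementation: the bucket boundaries, function names, and sample latencies are all assumptions made up for the example.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical bucket upper bounds in seconds; the last bucket is open-ended.
BUCKET_UPPER_BOUNDS = [1, 5, 30, 60, 300, float("inf")]


def bucket_index(latency_seconds):
    """Index of the first bucket whose upper bound is >= the latency."""
    return bisect_left(BUCKET_UPPER_BOUNDS, latency_seconds)


def histogram(latencies):
    """Count how many deliveries fall into each latency bucket."""
    counts = [0] * len(BUCKET_UPPER_BOUNDS)
    for latency in latencies:
        counts[bucket_index(latency)] += 1
    return counts


def percentile_bucket(counts, percentile):
    """Upper bound of the bucket in which the given percentile falls."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no deliveries recorded")
    threshold = total * percentile / 100.0
    for upper, cumulative in zip(BUCKET_UPPER_BOUNDS, accumulate(counts)):
        if cumulative >= threshold:
            return upper
    return BUCKET_UPPER_BOUNDS[-1]


# Made-up sample: six deliveries with latencies in seconds.
sample_counts = histogram([0.5, 2, 2, 10, 45, 400])
median_bound = percentile_bucket(sample_counts, 50)
```

Because only bucket counts are kept, the result is an upper bound on the percentile whose resolution is limited by the bucket boundaries; in exchange, the per-bucket counters are cheap to store and to merge across servers.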
  • 6. Requirements
      Functional requirements:
      • Data aggregation over a time window
      • Pattern matching with Pig scripts and custom scripts
      • Create relations between data sets & persistent storage
      • Informative visualization
      Non-functional requirements:
      • Review with stakeholders and architects
      • Scalable data aggregation
      • Easy operability
      • Consistency, reliability, durability
      • Data quality: accuracy, completeness
      • Performance: scalability, latency
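Aggregation over a time window, the first functional requirement, can be illustrated with a small sketch. It assumes events arrive as (epoch seconds, event type) pairs and uses a fixed 10-minute window; the event shape, the window size, and the function names are assumptions for illustration only.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 600  # assumed window size (10 minutes)


def window_start(timestamp, window=WINDOW_SECONDS):
    """Round an epoch timestamp down to the start of its window."""
    return timestamp - (timestamp % window)


def aggregate(events, window=WINDOW_SECONDS):
    """Count events per type within each fixed (tumbling) time window.

    events: iterable of (epoch_seconds, event_type) pairs.
    Returns a dict mapping window start -> Counter of event types.
    """
    buckets = defaultdict(Counter)
    for timestamp, event_type in events:
        buckets[window_start(timestamp, window)][event_type] += 1
    return dict(buckets)


# Made-up events for illustration.
sample = aggregate([
    (0, "delivered"),
    (599, "bounced"),
    (600, "delivered"),
    (1250, "delivered"),
])
```

Per-window counters of this kind can be summed across hosts, which is what makes the aggregation step easy to parallelize.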
  • 7. Outbound Data Scale
      ~400 million outbound mails per day
      Components    Size/Hour
      1 Webfarm     ~1 GB
      300 Farms     300 GB
      Total         ~300 GB * 24 = ~7 TB per day
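The totals on this slide follow from simple arithmetic, worked through below. The inputs are the slide's figures; the per-second rate and the per-mail log volume are derived here and do not appear in the deck, and decimal units (1 TB = 1000 GB) are assumed.

```python
# Figures taken from the slide.
GB_PER_FARM_PER_HOUR = 1
FARMS = 300
MAILS_PER_DAY = 400_000_000

# Log volume across all farms.
gb_per_hour = GB_PER_FARM_PER_HOUR * FARMS   # 300 GB per hour
gb_per_day = gb_per_hour * 24                # 7200 GB per day
tb_per_day = gb_per_day / 1000               # 7.2 TB per day (decimal units assumed)

# Derived figures (not on the slide).
mails_per_second = MAILS_PER_DAY / 86_400          # average rate over a day
kb_per_mail = gb_per_day * 1_000_000 / MAILS_PER_DAY  # average log volume per mail
```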
  • 9. Architecture (diagram; labels only): Data logs; HDFS Proxy; Fetch Data tool; Grid (Bassnium-Tan); Oozie jobs; SSH keys; GDM pull; YCA auth; Data push; Aggregator; Formatted data/graphs; App logs/unstructured data; HBase; Launcher/UI; Prod hosts; Semi-formatted data; DS Kerberos; Pig (unstructured data, structured data); Apache/PHP; Kafka/Flume; Shark; Hive; MySQL
  • 10. Technology Stack
      • Data aggregation framework: started with custom scripts based on logtail and parallel processing. Data is currently collected at 30-60 minute intervals. Evaluated Scribe, Fluentd, and Flume.
      • Hadoop: raw data storage and processing, and relating user events to one another.
      • Oozie: data aggregation, relation/processing, and data management are all controlled in scheduled workflows.
      • HBase: stores processed data in HTable format so that results can be retrieved efficiently.
      • Hive/Shark: evaluate Shark for keeping data in an in-memory store so that it can be retrieved faster, even at intervals of seconds.
  • 11. HBase Schema: Mail Success Delivery; Bounced/Failed Delivery
  • 14. Why Not Splunk
      • Splunk is commercial and lacks versatility
      • Splunk is not very customizable
      • We would have to depend on their tools and systems
      We are using Splunk:
      • As Splunk is available at Yahoo, we intend to use it only for data visualization, through the Hadoop Connect interface.
      • Hunk is a virtual indexer service and would be the right choice for us.
  • 15. Challenges & Learning
      • Hadoop data access is sequential
      • Smaller output files hit the namespace quota
      • HBase is good for storing output files
      • HBase schema decisions (row key design, reducing the number of region servers)
      • Shark evaluated: good, but not feasible to implement
      • Flume: durable, flexible
      • The Hive layer helped with data visualization
      • Using the available Splunk license made visualization simpler
      • Managing delayed-mail information in the delivery pipeline
      • Consistency when plotting 300-500 million events per day
  • 17. Demo
      Outputs:
      • With Hadoop system: http://trackoutbound.mail.yahoo.com:9999/trackoutboundmail/
      Back-end systems:
      • HBase analysis: http://twiki.corp.yahoo.com/view/Mail/SELogsOnHBASE
      • TechPulse2013: http://techpulse-submission.corp.yahoo.com/paper?p=7&ls=1
      Achievements:
      • It works on Hadoop infrastructure
      • Data aggregation at intervals of under 10 minutes
      • Sequential reads eliminated by using the HBase store
      • Flexible and scalable aggregation framework
      • Feature-rich mechanism for data visualization with Hive or Splunk
  • 19. HBase Overview
      Apache HBase is an open source, Bigtable-like, distributed, scalable, consistent, random-access, columnar key-value store built on Apache Hadoop.
      HBase data model (column family: Info):
      Rowkey   Email                    Age   Password
      Alice    alice@wonderland.com     23
      Bob      bob@myworld.com          25    Iambob
      Eve      hithere@getintouch.com   30    nice1pass
      • The table is lexicographically sorted on rowkeys.
      • Each cell has multiple versions, represented by timestamps. The slide's example cell shows the values trickedyou and newpassword with ts1 = 1 and ts2 = 2, where ts2 > ts1.
      • Identify your data (a cell value) in the HBase table by [1] rowkey, [2] column family, [3] column qualifier, [4] timestamp/version.
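The data model on this slide (rows sorted by rowkey, cells addressed by rowkey, column family, column qualifier, and timestamp) can be mimicked with a toy in-memory table. This is only a model of the addressing scheme, not the HBase client API; the class and method names are invented, and attaching the two password versions to Alice's row, with trickedyou as the older value, is an assumption about the slide's diagram.

```python
class ToyTable:
    """In-memory stand-in for the HBase data model (not the real client API)."""

    def __init__(self):
        # rowkey -> {(family, qualifier): {timestamp: value}}
        self._rows = {}

    def put(self, rowkey, family, qualifier, value, timestamp):
        """Store one version of a cell under an explicit timestamp."""
        cells = self._rows.setdefault(rowkey, {})
        cells.setdefault((family, qualifier), {})[timestamp] = value

    def get(self, rowkey, family, qualifier, timestamp=None):
        """Return the requested version, or the newest one if no timestamp is given."""
        versions = self._rows.get(rowkey, {}).get((family, qualifier), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def scan(self):
        """Yield rowkeys in lexicographic order, as a table scan would."""
        for rowkey in sorted(self._rows):
            yield rowkey


table = ToyTable()
table.put("Eve", "Info", "Email", "hithere@getintouch.com", 1)
table.put("Bob", "Info", "Email", "bob@myworld.com", 1)
# Assumption: the two password versions belong to Alice's row.
table.put("Alice", "Info", "Password", "trickedyou", 1)
table.put("Alice", "Info", "Password", "newpassword", 2)
```

Reading without a timestamp returns the newest version, while passing `timestamp=1` retrieves the older one; `scan()` returns the rows in sorted order regardless of insertion order.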
  • 20. HBase Operations
      • get(<ROW>)
      • put(<ROW>, Map<KEY,VALUE>)
      • scan(<TABLE>)
      • checkAndDelete()
      • checkAndPut()
      • increment()
      ...check the HTable class for further details on operations.
      Caution:
      • No queries
      • No secondary indexes
      • Billions of rows * millions of columns * thousands of versions
      • ZooKeeper as distributed coordination service
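The conditional and counter operations in this list (checkAndPut, increment) are atomic per-row updates. The toy class below models only those semantics in memory, with a lock standing in for the row-level atomicity; it is not the HTable API, and the method signatures are invented for illustration.

```python
import threading


class ToyAtomicRow:
    """Models compare-and-set and counter semantics for a single row."""

    def __init__(self):
        self._cells = {}
        self._lock = threading.Lock()

    def check_and_put(self, column, expected, new_value):
        """Write new_value only if the column currently holds expected.

        In this toy model, expected=None means "the column must be absent".
        Returns True if the write was applied, False otherwise.
        """
        with self._lock:
            if self._cells.get(column) != expected:
                return False
            self._cells[column] = new_value
            return True

    def increment(self, column, amount=1):
        """Atomically add to a counter column and return the new value."""
        with self._lock:
            value = self._cells.get(column, 0) + amount
            self._cells[column] = value
            return value

    def get(self, column):
        """Return the current value of a column, or None if it is absent."""
        with self._lock:
            return self._cells.get(column)
```

A conditional write lets concurrent writers avoid overwriting each other's state without a separate read-then-write round trip, and an atomic increment suits per-window event counters.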