You built Cascading/Scalding apps to mine all that data you collected in Hadoop. But just when you were seeing results, something went wrong — the app broke, data flows stopped, and business came to a halt.
So what do you do next? How do you find out what went wrong in the shortest time possible? How do you pinpoint the line of code where the error occurred? How do you know which SLA is going to be impacted? How do you view the lineage of data to adhere to compliance requirements?
In this presentation, we show you how to easily find the answers with Driven, the most comprehensive Big Data App Performance Management Platform.
This presentation also describes how Driven can help you build higher-quality big data apps, run them more reliably, and manage them more effectively.
Who should view this PPT: Any person or organization that is currently involved in planning, deploying or managing a Hadoop application infrastructure.
How To Get Hadoop App Intelligence with Driven
1. How to Get Hadoop Application Intelligence with Driven
2. Confidential
WHY NOW
As Hadoop applications become the engine of your data management strategy, they must meet higher standards of quality, reliability, and manageability.
3. WE’RE THE FORCE BEHIND CASCADING…
Cascading is a proven platform for building and deploying big data applications on Hadoop, with 10,000+ production deployments.
Java, Scala (Scalding), SQL
SIMPLE
Ensure best practices at any scale thanks to easy-to-learn design principles.
FLEXIBLE
Leverage existing Java, Scala, and SQL skills and easily adapt to new systems.
RELIABLE
Always get optimal performance and reliability for big data applications.
4. … POWERING BIG DATA APPS ACROSS INDUSTRIES
Social Media, Consumer & Retail, Business Services, Ad & Marketing, Financial, Telecom
What people are saying…
5. WHO ARE WE
TRUSTED
by over 10,000 companies as their big data app platform
BACKED
by top Silicon Valley investors True Ventures, Rembrandt VP, Bain Capital
FOUNDED
in 2008, with headquarters in San Francisco
7. DEVELOPERS, OPS TEAMS, AND CIOS ASKED US
Can you help us improve the quality, reliability and manageability of all our big data applications?
By visualizing our entire data pipeline
By tracking exactly how our big data apps behave at runtime and pinpointing bottlenecks
By helping us understand how our departments, teams, and other segments consume big data resources and deliver value
8. PERFORMANCE MANAGEMENT FOR HADOOP APPLICATIONS
BUILD higher quality Hadoop apps
RUN Hadoop apps more reliably
MANAGE Hadoop apps more effectively
13. RUN HADOOP APPS MORE RELIABLY
CURRENTLY EXECUTING
Watch your apps execute in real time
Easily detect apps that violate SLAs and policies
Pinpoint bottlenecks and identify causes
14. RUN HADOOP APPS MORE RELIABLY
Watch your apps execute in real time
Easily detect apps that violate SLAs and policies
Pinpoint bottlenecks and identify causes
EXECUTING, WAITING
DETAILED MAPPER/REDUCER STATS
15. RUN HADOOP APPS MORE RELIABLY
Watch your apps execute in real time
Easily detect apps that violate SLAs and policies
Pinpoint bottlenecks and identify causes
View metrics for all apps on the production cluster that failed to execute in under 5 minutes…
…or all applications that use more than their allotment of mappers
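To make the two example views above concrete, here is a minimal sketch of the same filters applied to in-memory run records. It is not Driven's query interface: the `AppRun` record, its field names, and the sample values are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppRun:
    """Hypothetical record of one app run (not a Driven data type)."""
    name: str
    cluster: str
    duration_s: float      # wall-clock run time in seconds
    mappers_used: int
    mapper_allotment: int  # assumed per-app mapper budget


def not_under_limit(runs, cluster, limit_s=300.0):
    """Runs on `cluster` that did not finish in under `limit_s` seconds."""
    return [r for r in runs if r.cluster == cluster and r.duration_s >= limit_s]


def over_allotment(runs):
    """Runs that used more mappers than their allotment."""
    return [r for r in runs if r.mappers_used > r.mapper_allotment]


# Illustrative sample data only.
runs = [
    AppRun("clickstream-etl", "production", 412.0, 40, 64),
    AppRun("ad-rollup", "production", 95.0, 80, 64),
    AppRun("qa-smoke", "qa", 600.0, 8, 16),
]

slow_names = [r.name for r in not_under_limit(runs, "production")]
greedy_names = [r.name for r in over_allotment(runs)]
print(slow_names)    # ['clickstream-etl']
print(greedy_names)  # ['ad-rollup']
```

The 5-minute threshold is passed as `limit_s` so the same filter can express other SLA limits.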
17. MANAGE BIG DATA APPS MORE EFFECTIVELY
See how all apps consume resources as they run
Segment performance by team, by department or custom tags for role-based views, chargeback models, and capacity planning
18. MANAGE HADOOP APPS MORE EFFECTIVELY
See how all apps consume resources as they run
Segment performance by team, by department or custom tags for role-based views, chargeback models, and capacity planning
View the performance of all apps owned by the DevOps team
Marketing
Sales
Compliance
Data science team
QA cluster
Production cluster
19. MANAGE HADOOP APPS FOR COMPLIANCE
Visualize Lineage – See exactly how each app ingests, manipulates and outputs data
Further inspect lineage by detecting apps that write to, or read from, a given dataset
SOURCES
OPERATIONS (functions, filters, joins, and aggregators)
RESULTS
20. MANAGE HADOOP APPS FOR COMPLIANCE
Visualize Lineage – See exactly how each app ingests, manipulates and outputs data
Further inspect lineage by detecting apps that write to, or read from, a given dataset
For example, show all apps that interact with the dataset in “rain.txt”
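The dataset lookup described above can be pictured as a reverse index from dataset path to the apps that read or write it. The sketch below is an illustration under stated assumptions, not how Driven stores lineage: the `apps` dictionary, the app names, and every path other than "rain.txt" are hypothetical.

```python
from collections import defaultdict


def index_by_dataset(apps):
    """Map each dataset path to the apps that read from or write to it."""
    index = defaultdict(lambda: {"readers": set(), "writers": set()})
    for name, io in apps.items():
        for path in io["sources"]:
            index[path]["readers"].add(name)
        for path in io["sinks"]:
            index[path]["writers"].add(name)
    return index


def apps_touching(index, path):
    """Sorted names of all apps that read or write `path`."""
    entry = index.get(path)  # .get() avoids creating an empty entry
    if entry is None:
        return []
    return sorted(entry["readers"] | entry["writers"])


# Hypothetical apps with their sources (inputs) and sinks (outputs).
apps = {
    "weather-report": {"sources": ["rain.txt"], "sinks": ["report.tsv"]},
    "rain-cleaner": {"sources": ["raw/rain.csv"], "sinks": ["rain.txt"]},
    "sales-rollup": {"sources": ["sales.tsv"], "sinks": ["rollup.tsv"]},
}

index = index_by_dataset(apps)
touching = apps_touching(index, "rain.txt")
print(touching)  # ['rain-cleaner', 'weather-report']
```

Keeping readers and writers in separate sets lets the same index answer "who produces this dataset" and "who consumes it" independently.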
21. MANAGE HADOOP APPS WITH COLLABORATION
Create JIRA issues with views and data for quickly collaborating to resolve performance problems
Integrate alerts with popular notification platforms like HipChat, PagerDuty, & Nagios
With one click, create a JIRA issue with a link to this view
22. MANAGE HADOOP APPS WITH INTEGRATION
Create JIRA issues with views and data for quickly collaborating to resolve performance problems
Integrate alerts with popular notification platforms like HipChat, PagerDuty, & Nagios
Automatically send app status notifications via webhooks or JMX
24. DRIVEN ARCHITECTURE
End-to-end operational telemetry metadata for big data applications
Accessible via Web browser, command-line interface (CLI), or simple search queries
Easy integrations through JMX and upcoming Driven SDK
[Architecture diagram. Labels: Hadoop apps and infrastructure (applications, plugin, Hadoop clusters, YARN); telemetry metadata (SSL); WAR files; web app server; server; Web, CLI, JMX]
25. DELIVERING OPERATIONAL EXCELLENCE
“The coolest part about Driven is being able to visualize data pipelines and inspect components in real time for easy troubleshooting and optimization. I don't know of any other tool that's close in functionality.”
- Neville Li, Software Engineer, Spotify
“With Driven, it’s easy to see how our apps use the data. When there’s an exception, Driven shows the history, so we can learn exactly what went wrong. That’s a huge time saver.”
- Niels Boldt, Lead Software Engineer, Mojn
26. PERFORMANCE MANAGEMENT FOR HADOOP APPLICATIONS
BUILD higher quality Hadoop apps
RUN Hadoop apps more reliably
MANAGE Hadoop apps more effectively