This presentation will explore how Hadoop and Big Data are reinventing enterprise workflows, and the pivotal role of the Data Analyst. It will examine the changing face of analytics and the streamlining of iterative queries through evolved user interfaces. The speaker will cut through the hype around "shorter time to insight" and explain how combining Hadoop and SQL-based analytics helps companies discover emergent trends hidden in unstructured data, without having to retrain data miners or restaff. In particular, it will highlight how this paradigm changes Big Data analysis and illustrate, step by step, how analysts can now connect to Big Data platforms, assemble working data sets from disparate sources, analyze and mine that data for actionable insight, publish the results as visualizations and as feeds for reporting tools, and operationalize MapReduce and Big Data outcomes into company workflows – all without touching the command line.
I’m going to talk about what we’re seeing: how forces of change are impacting how IT operates, the growing role of data, and how data professionals are moving front and center to play major roles in empowering businesses.
I’m going to cover four main themes. First, I’ll talk about the traditional requirements-driven business. Then I’ll summarize what we see as the major factors driving change and opportunities for us all. The bulk of the presentation is about how all of us are becoming agents of change, and how to meet the needs of our roles in this new world of Hadoop-enabled Big Data.
So let’s take a look at the traditional business and, in particular, how it deals with data… The result of all this is that business insight is limited – in scope and in time.
The forces of change are not just about technology …
From working with thousands of users, customers and partners, we’re seeing a blueprint emerge for the new data-centric business. It’s about enablement. In particular, it’s about the utilization of all of a business’s data and the enablement of data professionals and analysts. And useful data is not limited to what a business currently owns. Data marketplaces, aggregators and specialist providers across many industries are opening up their data, providing APIs and creating the promise of even more business- and market-relevant insight.
Listening to yesterday’s keynotes, much of what Larry Feinsmith said rings true and is aligned with what we see and hear, not just from financial services but across multiple verticals. Successful IT organizations are enabling data professionals to be self-service. Whether it’s through on-premise or, increasingly, on-demand in-the-cloud services, this has to be the mantra of the successful future business.
And data professionals are at the heart of this change. They can choose to be concrete or catalyst. We are all innovators. Innovation entails risk, and many of us sometimes baulk at it. But our roles in this new industry of data are growing, our potential to impact our businesses is growing, and because of the value we bring and simple supply-and-demand economics, we will get paid more.
I’ve been at all 3 Hadoop World events. It’s interesting to reflect on how people thought about Hadoop two and even one year ago.
But we’re all becoming – and need to be – more sophisticated. Installing Hadoop is only step 1. A devops team bringing up a CDH cluster, or an inspired developer firing up a Hadoop cluster using Elastic MapReduce, is just the beginning. Businesses are getting smarter about understanding the potential of Hadoop, but also about how to plan for success and what a successful Hadoop-based stack looks like – and about who they enable to access that stack, and how.
What we see are three key classes of data professional. IT is clearly key to the infrastructure. But choosing your Hadoop provider and determining whether you’re going on-premise, in-cloud or with a hybrid strategy is just step 1. Businesses are getting smarter about what comes next.
More sophisticated thinking now takes into account the democratization of access. Here is the common fabric we’re seeing.
Hadoop open source projects also fall into these categories, with the data management projects focused on innovating the core platform and the analytics projects creating the core technology for analysis.
Data engineers often implement existing algorithms in MapReduce, or take the insights created by data analysts and turn them into production code. They also build distributed functions that the analysts can use.
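As a minimal sketch of what “implementing an existing algorithm in MapReduce” can look like, here is a word count written in the Hadoop Streaming style: a mapper that emits (word, 1) pairs and a reducer that sums them over sorted input. The task and function names are illustrative, not from the talk; in a real cluster the two functions would run as separate streaming scripts with Hadoop performing the shuffle and sort between them.

```python
import sys
from itertools import groupby


def map_words(lines):
    # Mapper: emit one (word, 1) pair per word, as a Hadoop
    # Streaming mapper would write them to stdout.
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1


def reduce_counts(pairs):
    # Reducer: pairs arrive grouped by key (Hadoop's shuffle/sort
    # guarantees this); sum the counts for each distinct word.
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


if __name__ == "__main__":
    # Locally we simulate the shuffle/sort step with sorted().
    mapped = sorted(map_words(sys.stdin))
    for word, total in reduce_counts(mapped):
        print(f"{word}\t{total}")
```

The point for this audience is that the analyst thinks in terms of “count words”; the data engineer owns the mapper/reducer decomposition and packages it so the analyst never has to.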
So let’s take a look at what we’ve found about what data analysts and engineers need. It’s not just about the command line any more. As you grow the teams of professionals accessing Hadoop, it’s no longer enough to give them a command line, SSH or a rudimentary web interface. People have skills, and skills flourish faster in high-productivity environments.
So what does a workflow optimized for Big Data look like? We think it needs to provide four key stages. It has to enable you to connect to any Hadoop cluster, no matter where it is located or which company or organization it comes from. It needs to provide easy access to data, so you can point, click and automatically understand the data as it’s prepared for analysis. Most importantly, it needs to provide an easy-to-use environment for iterative analysis, with abstraction and visualization capabilities. Finally, it needs to provide the ability to act on any and every insight generated. I’ll walk you through all of these…