1© Cloudera, Inc. All rights reserved.
Creating your center of excellence
Becoming data-driven through cultural change
Frank Vullers
Business Value Strategist Cloudera
2© Cloudera, Inc. All rights reserved.
Imagine a world where we…
3© Cloudera, Inc. All rights reserved.
Imagine a world where we…
use sensors to understand air quality triggers of infant asthmatic events.
4© Cloudera, Inc. All rights reserved.
Imagine a world where we…
track weather and crowds to reduce environmental impact while improving service.
5© Cloudera, Inc. All rights reserved.
Imagine a world where we…
use social media data to fight child sexual exploitation.
6© Cloudera, Inc. All rights reserved.
Imagine a world where we…
use data for early detection to save lives.
7© Cloudera, Inc. All rights reserved.
Imagine a world where we…
use data to simulate human travel to deep space.
8© Cloudera, Inc. All rights reserved.
We live in that world today because our relationship with
data is changing
9© Cloudera, Inc. All rights reserved.
Data is now a strategic asset
Instrumentation § Today, everything that can be measured will be measured.
Consumerization § Today, data is the application.
Experimentation § Today, becoming data-driven is an imperative.
10© Cloudera, Inc. All rights reserved.
Yet the journey requires organizational change
By 2017, 50% or fewer organizations will have made the cultural or business model adjustments to benefit from big data.
By 2018, 50% of business ethics violations will be from improper use of big data analytics.
Source: Gartner, “Predicts 2015: Big Data Challenges Move From Technology to the Organization”, November 2014
11© Cloudera, Inc. All rights reserved.
How do you ensure success?
Our most successful customers
do these five things.
12© Cloudera, Inc. All rights reserved.
1. Build a data-driven culture
2. Develop the right team and skills
3. Adopt an agile/lean approach
4. Efficiently operationalize your insights
5. Right-size data governance
Our most successful customers do these five things
Data-driven
culture
Team and skills Agile
Operationalize
insights
Data
governance
13© Cloudera, Inc. All rights reserved.
Build a data-driven culture
14© Cloudera, Inc. All rights reserved.
The important role of the executive sponsor
Description: Key to success for the overall data-driven mission, including advocacy for creating/collecting data and for individual use cases.
Profile § Focused on change, and willing to take risk
Education § Use every opportunity to brief sponsors and stakeholders.
Advocacy § Build Big Data success stories from within the business.
15© Cloudera, Inc. All rights reserved.
Communications content
Description
• Make communications more programmatic
• Enable many in the organization to become evangelists
Vision § How being data-driven will deliver business results. Align to strategic initiatives.
Insights § The value from the data & use case delivered to the business.
Data
§ Valuing data as (eventually) a balance sheet asset
§ Size and utilization of the asset, specific data sources ingested
§ Governance & maturity
Platform & tooling § Updates on releases and capabilities in the platform / ecosystem of user tools
16© Cloudera, Inc. All rights reserved.
Communications and collaboration
Description: Use different vehicles and forms to enable collaboration
Meetups
§ Bringing together the larger community to share interests, learnings, and wins.
§ Team led
Big Data days
§ Transfer of information through executive-led thought leadership.
§ Include experts from across the business units, vendors, partners.
§ Cross-domain focussed.
Hackathons § Allow developers to build new applications designed to boost business.
17© Cloudera, Inc. All rights reserved.
The power of visualizations
Description: The hardest part of any analytic project is enabling action. Visualizations are a powerful tool to help.
Visualizations § Powerful way to express the importance of insights on the road to action.
Movie trailer § Compelling visualizations can serve as a ‘trailer’ for your movie
How and when § Make them colourful and make them move
Examples shown: Telco roaming crime ring (Argyle data); Point of Interest (POI) density by city; Tweets by GPS coordinate near M&S shopping area
18© Cloudera, Inc. All rights reserved.
The power of visualizations
Description: Visualizations about the data asset itself help your user community understand the size, growth and value of your data
Scorecard category | KPI | Q1 Target | Q1 Actuals
Reach | # data sources under management | 15 | 25
Reach | # business KPIs reported | 10 | 20
Reach | Amount of data under management | .5PB | 1PB
Acquisition | # of platform users | 200 | 400
Acquisition | # of jobs per day | 2000 | 5000
Conversion | # of jobs moved from other clusters | 50 | 30
Churn | Churn of jobs and data | 0 | 0
Capacity/Utilization | Amount of storage - Amount of data under management | 75% | 75%
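The scorecard above compares Q1 targets with Q1 actuals. A minimal Python sketch of how such a scorecard could be checked automatically follows. The `Kpi` class, the `attainment` ratio and the `below_target` helper are illustrative assumptions, not part of the deck. The Churn and Capacity/Utilization rows are left out because "higher is better" does not obviously apply to them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """One scorecard row (hypothetical structure, values taken from the slide)."""
    category: str
    name: str
    target: float
    actual: float

    def attainment(self) -> float:
        """Ratio of actual to target; 1.0 means the target was exactly met."""
        return self.actual / self.target


SCORECARD = [
    Kpi("Reach", "# data sources under management", 15, 25),
    Kpi("Reach", "# business KPIs reported", 10, 20),
    Kpi("Reach", "Amount of data under management (PB)", 0.5, 1.0),
    Kpi("Acquisition", "# of platform users", 200, 400),
    Kpi("Acquisition", "# of jobs per day", 2000, 5000),
    Kpi("Conversion", "# of jobs moved from other clusters", 50, 30),
]


def below_target(kpis):
    """Return the names of KPIs whose actual value is lower than the target."""
    return [k.name for k in kpis if k.attainment() < 1.0]


if __name__ == "__main__":
    for kpi in SCORECARD:
        print(f"{kpi.category:12} {kpi.name:40} {kpi.attainment():.0%}")
    print("Needs attention:", below_target(SCORECARD))
```

In this sketch only the Conversion KPI falls short of its target, which is the kind of signal a scorecard visualization is meant to surface.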
19© Cloudera, Inc. All rights reserved.
Develop the right team and skills
20© Cloudera, Inc. All rights reserved.
Staff for success: the past & the present
Description: A traditional BI and analytics organization consists of three main components.
Analytics § Use data to develop reports, find insights
Data management § Satisfy requests, answer users' questions, load models
Infrastructure § Hardware and software specialists and software components
21© Cloudera, Inc. All rights reserved.
Staff for success: the future
Description: The data engineering team becomes strategic; data can be transformed and used in many different ways.
Components: Analytics, Big Data management (formerly Data management), Infrastructure
Roles: Architects, Data scientists, Data engineers
It is critical that these three roles be tightly aligned.
22© Cloudera, Inc. All rights reserved.
Your infrastructure team & architect
Description: Ecosystems change rapidly. Architects need to balance tactical and strategic needs.
Communication § Collaborate between software and hardware infrastructure
Education § Training is essential: admin, developer.
Leadership § Be the infrastructure expert and advise on new projects/requirements
Team (infrastructure architecture & operations): Enterprise architect, Hadoop admin, Network admin, Systems admin
23© Cloudera, Inc. All rights reserved.
Your data engineering team
Description: Be committed to making data the utmost strategic asset. Employ it in a meaningful way.
Communication § Advocate for new data and for improved data.
Education § Get trained and certified
Leadership § Promote and evangelize value and data governance
Team (data engineering): Data engineer, Data stewards, Data ingest (ETL), Data dev operations, Information security
24© Cloudera, Inc. All rights reserved.
Your data scientist team(s)
The hybrid data scientist
• Subject matter expertise lies in the business
• Hacking skills: existing IT staff or new hires
• Staff at least one true Ph.D. statistician for model oversight across all teams
Important character trait: Curiosity
A luxury is finding one or more data scientists who cross these disciplines.
Figure: Data Science at the intersection of Math & Statistical Knowledge, Hacking skills, and Subject Matter Expertise
25© Cloudera, Inc. All rights reserved.
Staff for success: data science-as-a-service
Description: A (centralized) data science team partners with the business to identify data and explore use cases to solve.
Agility § The team must be able to learn quickly and adapt
Skills § Computer science domain expertise and at least one true statistician.
Teams § Domain expertise in-house; add in MS/Ph.D. and hire that one true statistician
Experts § This team must be the “data experts” for the entire company
Team (analytical development): Data scientist, Application developer, SQL developer
26© Cloudera, Inc. All rights reserved.
Organizing and sizing for success
Centralized
Data science-as-a-service § Scale based upon # of use cases
Data engineers § Scale based upon # of data sets and amount of data under management
Architecture § Scale based on cluster size and # of components used
De-centralized
Business-focused SQL and app developers, analysts and data scientists § Scale based upon # of use cases
27© Cloudera, Inc. All rights reserved.
Adopt an agile/lean approach
28© Cloudera, Inc. All rights reserved.
Leverage agile methodology to reduce risk (1/3)
Description: Provides actionable results more rapidly and measures the value gained at each step, in small iterations.
Lower risk § The risk of funding long-running projects with limited business value is small.
Lower costs § Can run infrastructure, data and insights workstreams in parallel.
Communication § Clear short-term results, continuous communication stream (results / failures)
Team § Can start with a small team and add additional scrum teams
29© Cloudera, Inc. All rights reserved.
Leverage agile methodology to reduce risk (2/3)
Description: A lightweight agile process using 2-3 week sprints. Transparency and ‘fail-fast’ capability are essential.
Epics § Documentation for broad concepts and requirements (ingestion/use case)
Stories § Business description of the work to be done (within a sprint)
Tasks § Manageable units of work with success criteria / clear requirements
Teams § Small co-located (virtual is possible) teams deliver quickly, and sprint exits offer opportunity for demos and transparency into work
30© Cloudera, Inc. All rights reserved.
Leverage agile methodology to reduce risk (3/3)
Description: Agile applies to parallel workstreams: data asset creation and management, and insights / data science / analysis.
Product Owner § Identify the person who owns the lean development and ongoing agile management of each workstream
Backlog § Use agile and lean methods to document the work that needs to get done
Move § Agile means don’t wait until you know everything. Get moving quickly and be able to change as new information becomes available
Roadblocks § The purpose of the scrum master and product owners is to remove roadblocks so the team can continue to move and make progress
31© Cloudera, Inc. All rights reserved.
Data and use case work prioritization process
Description: The list of Big Data needs will be longer than the list of resources. A transparent process for prioritizing the work is essential.
Quarterly § Review of data processing and use case epics. Prioritize backlog.
Data epics § Prioritize: Value of data across business, conformed dimensions, single use case
Use case epics § Prioritize: Value in the business, Availability of the data, Ability to dimensions
32© Cloudera, Inc. All rights reserved.
Agile methodology enables iterative workstreams
[Figure: three parallel workstreams shown alongside the EDH buildout: Agile Use Case Development, Agile Data Ingestion/Management, and Agile Data Governance & Common Profile Development. In each workstream, scrum teams iterate through numbered releases (Release 1 to Release 4) toward Production Ready.]
33© Cloudera, Inc. All rights reserved.
Your Big Data business roadmap
Description: After prioritization, communicate, together with the business, the work roadmap describing the areas.
Business roadmap § Data sets processed, infrastructure installed / use cases/insights expected
Technical roadmap § Use the agile epics, stories, etc. backlog to manage the technical deliverables.
Iterative § Ability to “fail-fast”. Roadmaps change more often than in a waterfall world.
34© Cloudera, Inc. All rights reserved.
Prioritization
Use cases
Availability
• accessible?
• quality?
• steward?
• generate / buy the data?
• shareable?
Complexity
• Simple report
• Machine Learning
• Advanced Analytics
• Automated decision making
Value
• align to objective?
• enable other case(s)?
• increase revenue / cost savings?
• insights effect?
• Executive override
Data*
High
• urgently needed?
• high value?
• Multiple teams?
• short-lived or streaming?
Medium
• Augment existing data
• Reuse existing data processing code.
• Easy to pull down.
• API allows bringing in historical data.
Low
• some data access/workaround.
• Low-quality data.
• Data has to be screen-scraped.
• Low likelihood of data being used
*Source: Carl Anderson – “Creating a Data-Driven Culture”
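The use case criteria above (availability, complexity, value, executive override) can be turned into a transparent ranking. The Python sketch below is one possible way to do that. The 1-5 scales, the additive score and all names (`UseCase`, `priority_score`, `rank_backlog`) are assumptions made for illustration; the slide names the criteria but gives no scores or weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """A backlog candidate scored on hypothetical 1-5 scales."""
    name: str
    availability: int  # 1 = data hard to obtain, 5 = accessible, good quality, has a steward
    complexity: int    # 1 = simple report, 5 = automated decision making
    value: int         # 1 = weak alignment, 5 = clear revenue or cost impact
    executive_override: bool = False


def priority_score(uc: UseCase) -> int:
    """Higher is better: reward availability and value, penalize complexity."""
    return uc.availability + uc.value - uc.complexity


def rank_backlog(use_cases):
    """Executive overrides come first, then descending score."""
    return sorted(
        use_cases,
        key=lambda uc: (not uc.executive_override, -priority_score(uc)),
    )


backlog = [
    UseCase("Churn report", availability=5, complexity=1, value=3),
    UseCase("Real-time offers", availability=2, complexity=5, value=5),
    UseCase("Fraud alerts", availability=3, complexity=4, value=5,
            executive_override=True),
]

if __name__ == "__main__":
    for uc in rank_backlog(backlog):
        print(uc.name, priority_score(uc))
```

Keeping the score as a simple, visible formula is a deliberate choice here: the point of the quarterly review is a prioritization everyone can inspect and challenge.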
35© Cloudera, Inc. All rights reserved.
Efficiently operationalize your insights
36© Cloudera, Inc. All rights reserved.
Cloudera customers: strategic initiatives
• Drive customer insights
• Connect products & services
• Protect business
There are three axes to support these initiatives:
1. Data available
2. Analytical methods
3. Integration of analytical results into report / app / web app, etc.
37© Cloudera, Inc. All rights reserved.
The first axis of analytics: the data
Properties: batch, stream, real-time
Categories: Digital/mobile, Transaction, CRM/call center, Demographics, Network/product, Social
Digital media
• Teradata Aprimo
• IBM Unica
• Oracle Eloqua
• X+1
Web logs
• Microsoft IIS
• Apache
• nginx
• Google GWS
Clickstream/UX
• Adobe Omniture
• IBM Coremetrics
• IBM Tealeaf
• Google Analytics Premium
• Webtrends
Mobile application
• SMS
Retail
Mobile
Web
Channel
Distributor
Bot
Call center
Indirect
Kiosk
Embedded commerce
service
Billing
Customer lifecycle
• Acquisition
• Churn
• Cross-Sell
• Upsell
CRM
• MS Dynamics
• Oracle/Siebel
• Salesforce
• SAP
Online chat
• Oracle RightNow
• Moxie Live Chat
• LivePerson
• Instant Service
• Oracle Live Help
• BoldChat
• Zendesk Zopim
• Kana Live Chat
IVR
• Avaya
• Cisco
• Nortel
• Nuance
Data broker / syndicate
• Acxiom
• CoreLogic
• Datalogix
• eBureau
• ID Analytics
• Intelius
• PeekYou
• Rapleaf
• Recorded Future
• IHS Polk
• Nielsen
• InfoScout
• Symphony IRI
• Gfk
Behavior
Loyalty
• Aimia
• Brierley+Partners
• Comarch
• Epsilon
• Kobie
• ICF Olson 1to1
• Merkle
• Clutch
• CrowdTwist
• DataCandy
• Deluxe
• Inte Q
• ICLP
Survey
• ABA
• Medallia
• ForeSee
• Allegiance
• Walker Information
Direct
• Twitter
• Facebook
• Bazaarvoice
Listening/management
• Sprinklr
• Crimson Hexagon
• Radian6
• Lithium
• Simply Measured
• Curalate
• Datasift
Voice of the community
• CSAT
• NPS
38© Cloudera, Inc. All rights reserved.
The 2nd axis of analytics: analytical processing
Simple aggregation → Customer profile
Unsupervised learning (clustering, topic modeling, time series analysis) → Detect anomalous events (e.g., predictive maintenance)
Classification (gradient boosted trees, SVMs, logistic regression, etc.) → Score entities by behavior (e.g., churn analytics)
Deep learning ("neural nets") and natural language processing → Classify or cluster unstructured data (e.g., images or text for cyber threats)
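To make "detect anomalous events" concrete, here is a small Python sketch. It uses a modified z-score based on the median absolute deviation, which is a simple statistical stand-in and not one of the methods named on the slide. The sensor readings, the function name `find_anomalies` and the 3.5 cutoff are assumptions chosen for illustration.

```python
from statistics import median


def find_anomalies(readings, threshold=3.5):
    """Return indexes of readings whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not mask itself the way it can
    with a mean and standard deviation on a small sample.
    """
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # All values (or at least half of them) are identical; nothing to scale by.
        return []
    return [
        i for i, x in enumerate(readings)
        if abs(0.6745 * (x - med) / mad) > threshold
    ]


# Hypothetical vibration readings from one machine; index 5 is the outlier.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2, 10.1]

if __name__ == "__main__":
    print(find_anomalies(readings))
```

In a predictive maintenance setting the flagged indexes would typically feed an alert or a work order rather than a printout; a production system would more likely use the clustering or time series methods the slide lists.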
39© Cloudera, Inc. All rights reserved.
The 3rd axis of analytics: serving insights
• Integration with web applications via Spark or HBase
• Integration with mobile apps via Spark or HBase
• Integration with enterprise applications, e.g., CRM, sales
• Search applications via Solr
• Serving to standard BI tools (e.g., Tableau, Qlik)
40© Cloudera, Inc. All rights reserved.
Project execution methodologies: the change*
Waterfall (80s-90s): Design, Code, Test, Deploy
Agile (late 90s): Design, then repeated Code/Test iterations, then Deploy
DevOps (00s): Design, then continuous Code/Test/Deploy (C T D) cycles
*https://blog.spec-india.com/from-waterfall-to-agile-to-devops-a-cultural-and-technological-shift/
41© Cloudera, Inc. All rights reserved.
Key role: DevOps for continuous deployment
Description: DevOps links highly fluid and continuous development with more structured infrastructure by ensuring collaboration.
Layers: Insights, Big Data management, Infrastructure
§ Existing IT support handles Infrastructure for L1, L2, L3.
§ Existing IT support passes data and insights through L1 and L2 to a DevOps team for Level 3 support. Key contact with Cloudera support
§ Data and insights teams work with DevOps for production, and DevOps works with L2 to debug and deploy.
**Cloudera administrator, developer, navigator, security training
42© Cloudera, Inc. All rights reserved.
Governance
43© Cloudera, Inc. All rights reserved.
Governance: the foundation of data management
Compliance § Track, understand and protect access to data
• Am I prepared for an audit?
• Who’s accessing what data?
• What are they doing with the data?
• Is sensitive data governed and protected?
Stewardship § Manage and organize data assets at Hadoop scale
• How to efficiently manage data lifecycle, from ingest to purge?
• How do I classify data efficiently?
• How do I make data available to my end users efficiently?
Data Science § Effortlessly find and trust the data that matters most
• How can I explore data on my own?
• Can I trust what I find?
• How do I use what I find?
• How do I find and use related data sets?
Administration § Boost user productivity and cluster performance
• How is data being used today?
• How can I optimize for future workloads?
• How can I quickly take advantage of Hadoop risk-free?
44© Cloudera, Inc. All rights reserved.
Right-size your Big Data governance
Data stewards: owners and/or creators of the data
Responsibilities
§ Knowledge of the data
§ Documenting
Data engineers: implement the data governance policies
Responsibilities
§ Defining the governance
§ Organizing the Council
§ Utilize tools
Data governance council: business owners of the data governance
Responsibilities
§ Communication governance
§ Assigning data steward roles
§ Improving the link-ability
45© Cloudera, Inc. All rights reserved.
Right-size your Big Data governance
Data governance program: responsible for governing the company’s Big Data asset
Data governance council: cross-org; has the authority for governing data
Data stewards: Retention management, Profile management, Quality management, Data privacy
Step 1: Data engineering team proposes policies, data steward roles, data dictionary, master data (for profiles)
Step 2: Employ technology, e.g. Navigator, to implement the policies
Step 3: Data governance exec council – cross company participation
46© Cloudera, Inc. All rights reserved.
Our most successful customers do these five things
1. Build a data-driven culture
2. Develop the right team and skills
3. Adopt an agile/lean approach
4. Efficiently operationalize your insights
5. Right-size data governance
47© Cloudera, Inc. All rights reserved.
Thank you
Frank Vullers
Business Value Strategist Cloudera
fvullers@cloudera.com
@FrankVullers
49© Cloudera, Inc. All rights reserved.
All types of personal data included in GDPR
50© Cloudera, Inc. All rights reserved.
Integrity – Complete Audit and Policy rules
Govern
• Encrypt sensitive data
• Personal data is moved to encrypted zone
• Policy rule in Navigator
• Audit on all data
• All action fully audited
• Denied access
• Access to personal data all audited
51© Cloudera, Inc. All rights reserved.
Right to be Forgotten – Lineage back to source
Cloudera Navigator Lineage
Operate
• Lineage back to Source
• GDPR data tagged
• Source System tagged
• Source Table and Column
• Delete back to Source
• Trigger overnight Delete process
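The "lineage back to source" idea can be illustrated with a short Python sketch: starting from a dataset that holds personal data, walk the lineage graph upstream and collect every GDPR-tagged dataset that a delete process would need to visit. This is not the Cloudera Navigator API. The lineage dictionary, the dataset names, the tag set and the `deletion_plan` function are all hypothetical.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the upstream datasets it was derived from.
LINEAGE = {
    "marketing.customer_360": ["staging.crm_contacts", "staging.web_events"],
    "staging.crm_contacts": ["source.crm.contacts"],
    "staging.web_events": ["source.weblogs.raw"],
    "source.crm.contacts": [],
    "source.weblogs.raw": [],
}

# Hypothetical set of datasets tagged as containing GDPR-relevant personal data.
GDPR_TAGGED = {
    "marketing.customer_360",
    "staging.crm_contacts",
    "source.crm.contacts",
}


def deletion_plan(dataset, lineage, tagged):
    """Breadth-first walk upstream from `dataset`.

    Returns the tagged datasets in the order they are reached, so the list
    runs from the downstream copy back toward the source system.
    """
    plan = []
    seen = {dataset}
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        if current in tagged:
            plan.append(current)
        for upstream in lineage.get(current, []):
            if upstream not in seen:
                seen.add(upstream)
                queue.append(upstream)
    return plan


if __name__ == "__main__":
    for name in deletion_plan("marketing.customer_360", LINEAGE, GDPR_TAGGED):
        print("schedule delete for:", name)
```

An overnight delete job, as mentioned on the slide, could consume such a plan; untagged branches such as the web log source are traversed but not scheduled for deletion.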

 

How to Build a Successful Data-Driven Organization

  • 1. © Cloudera, Inc. All rights reserved. Creating your center of excellence: Becoming data-driven through cultural change. Frank Vullers, Business Value Strategist, Cloudera
  • 2. Imagine a world where we…
  • 3. Imagine a world where we… use sensors to understand air quality triggers of infant asthmatic events.
  • 4. Imagine a world where we… track weather and crowds to reduce environmental impact while improving service.
  • 5. Imagine a world where we… use social media data to fight child sexual exploitation.
  • 6. Imagine a world where we… use data for early detection to save lives.
  • 7. Imagine a world where we… use data to simulate human travel to deep space.
  • 8. We live in that world today because our relationship with data is changing.
  • 9. Data is now a strategic asset. Drivers: instrumentation, consumerization, experimentation. Today, everything that can be measured will be measured. Today, data is the application. Today, becoming data-driven is an imperative.
  • 10. Yet the journey requires organizational change. By 2017, 50% or fewer of organizations will have made the cultural or business model adjustments to benefit from big data. By 2018, 50% of business ethics violations will be from improper use of big data analytics. Source: Gartner, “Predicts 2015: Big Data Challenges Move From Technology to the Organization”, November 2014.
  • 11. How do you ensure success? Our most successful customers do these five things.
  • 12. Our most successful customers do these five things: 1. Build a data-driven culture 2. Develop the right team and skills 3. Adopt an agile/lean approach 4. Efficiently operationalize your insights 5. Right-size data governance
  • 13. Build a data-driven culture
  • 14. The important role of the executive sponsor. The executive sponsor is key to success for the overall data-driven mission, including advocacy for creating and collecting data and for individual use cases. Profile: focused on change and willing to take risk. Education: use every opportunity to brief sponsors and stakeholders. Advocacy: build Big Data success stories from within the business.
  • 15. Communications content. Make communications more programmatic, and enable many in the organization to become evangelists. Vision: how being data-driven will deliver business results; align to strategic initiatives. Insights: the value from the data and use cases delivered to the business. Data: valuing data as (eventually) a balance sheet asset; size and utilization of the asset and the specific data sources ingested; governance and maturity. Platform & tooling: updates on releases and capabilities in the platform and the ecosystem of user tools.
  • 16. Communications and collaboration. Use different vehicles and forms to enable collaboration. Meetups: team-led; bring together the larger community to share interests, learnings, and wins. Big Data days: transfer of information through executive-led thought leadership; include experts from across the business units, vendors, and partners; cross-domain focused. Hackathons: allow developers to build new applications designed to boost business.
  • 17. The power of visualizations. The hardest part of any analytic project is enabling action, and visualizations are a powerful tool to help. Visualizations: a powerful way to express the importance of insights on the road to action. Movie trailer: compelling visualizations can serve as a ‘trailer’ for your movie. How and when: make them colourful and make them move. Examples shown: telco roaming crime ring (Argyle Data); Point of Interest (POI) density by city; tweets by GPS coordinate near M&S shopping area.
  • 18. The power of visualizations. Visualizations about the data asset itself help your user community understand the size, growth and value of your data. Example scorecard (category: KPI, Q1 target, Q1 actuals):
      Reach: # data sources under management, 15, 25
      Reach: # business KPIs reported, 10, 20
      Reach: amount of data under management, .5PB, 1PB
      Acquisition: # of platform users, 200, 400
      Acquisition: # of jobs per day, 2000, 5000
      Conversion: # of jobs moved from other clusters, 50, 30
      Churn: churn of jobs and data, 0, 0
      Capacity/Utilization: amount of storage - amount of data under management, 75%, 75%
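The scorecard on slide 18 compares Q1 actuals with Q1 targets. A minimal Python sketch of that comparison follows, using the figures from the slide. The `Kpi` class, the `attainment` ratio, and the `below_target` helper are illustrative names and not part of the deck. The churn and capacity rows are left out because a higher actual is not better for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """One scorecard row (hypothetical structure, not from the deck)."""
    category: str
    name: str
    target: float
    actual: float

    def attainment(self) -> float:
        # Actual as a fraction of target; assumes target > 0 and "higher is better".
        return self.actual / self.target


# Figures copied from the slide 18 scorecard.
kpis = [
    Kpi("Reach", "# data sources under management", 15, 25),
    Kpi("Reach", "# business KPIs reported", 10, 20),
    Kpi("Reach", "Amount of data under management (PB)", 0.5, 1.0),
    Kpi("Acquisition", "# of platform users", 200, 400),
    Kpi("Acquisition", "# of jobs per day", 2000, 5000),
    Kpi("Conversion", "# of jobs moved from other clusters", 50, 30),
]


def below_target(rows):
    """Return the names of KPIs whose actual fell short of the target."""
    return [k.name for k in rows if k.attainment() < 1.0]


for k in kpis:
    print(f"{k.category:12} {k.name:40} {k.attainment():.0%}")
print("Below target:", below_target(kpis))
```

With the slide's numbers, only the conversion KPI (jobs moved from other clusters) falls short of its target.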
  • 19. Develop the right team and skills
  • 20. Staff for success: the past and the present. A traditional BI and analytics organization consists of three main components. Analytics: use data to develop reports and find insights. Data management: satisfy requests, answer users' questions, load models. Infrastructure: hardware and software specialists and software components.
  • 21. Staff for success: the future. Layers: analytics, data management, infrastructure. Big Data management roles: architects, data scientists, data engineers. The data engineering team becomes strategic, because data can be transformed and used in many different ways. It is critical that these three roles be tightly aligned.
  • 22. Your infrastructure team & architect. Ecosystems change rapidly, and architects need to balance tactical and strategic needs. Communication: collaborate between software and hardware infrastructure. Education: training is essential (admin, developer). Leadership: be the infrastructure expert and advise on new projects and requirements. Infrastructure architecture & operations roles: enterprise architect, Hadoop admin, network admin, systems admin.
  • 23. Your data engineering team. Be committed to making data the utmost strategic asset, and employ it in a meaningful way. Communication: advocate for new data and for improved data. Education: get trained and certified. Leadership: promote and evangelize value and data governance. Data engineering roles: data engineer, data stewards, data ingest (ETL), data dev operations, information security.
  • 24. Your data scientist team(s). Data science sits at the intersection of math & statistical knowledge, hacking skills, and subject matter expertise. The hybrid data scientist: subject matter expertise lies in the business; hacking skills come from existing IT staff or new hires; staff at least one true Ph.D. statistician for model oversight across all teams. Important character trait: curiosity. A luxury is finding one or more data scientists that cross these disciplines.
  • 25. Staff for success: data science-as-a-service. A (centralized) data science team partners with the business to identify data and explore use cases to solve. Agility: the team must be able to learn quickly and adapt. Skills: computer science domain expertise and at least one true statistician. Teams: domain expertise in-house; add in MS/Ph.D. and hire that one true statistician. Experts: this team must be the “data experts” for the entire company. Analytical development roles: data scientist, application developer, SQL developer.
  • 26. Organizing and sizing for success. Centralized: data science-as-a-service (scale based upon # of use cases); data engineers (scale based upon # of data sets and amount of data under management); architecture (scale based on cluster size and # of components used). De-centralized: business-focused SQL and app developers, analysts and data scientists (scale based upon # of use cases).
  • 27. Adopt an agile/lean approach
  • 28. Leverage agile methodology to reduce risk (1/3). Agile provides actionable results more rapidly and measures the value gained at each step, in small iterations. Lower risk: the risk of funding long-running projects with limited business value is small. Lower costs: infrastructure, data and insights work streams can run in parallel. Communication: clear short-term results and a continuous communication stream (results / failures). Team: can start with a small team and add additional scrum teams.
  • 29. Leverage agile methodology to reduce risk (2/3). Transparency and ‘fail-fast’ capability are essential. Use a lightweight agile process with 2-3 week sprints. Epics: documentation for broad concepts and requirements (ingestion/use case). Stories: business description of the work to be done (within a sprint). Tasks: manageable units of work with success criteria and clear requirements. Teams: small co-located (virtual is possible) teams deliver quickly, and sprint exits offer opportunity for demos and transparency into work.
  • 30. Leverage agile methodology to reduce risk (3/3). Agile applies to parallel workstreams: data asset creation and management, and insights / data science / analysis. Product owner: identify the person who owns the lean development and ongoing agile management of each workstream. Backlog: use agile and lean methods to document the work that needs to get done. Move: agile means don’t wait until you know everything; get moving quickly and be able to change as new information becomes available. Roadblocks: the purpose of the scrum master and product owners is to remove roadblocks so the team can continue to move and make progress.
  • 31. Data and use case work prioritization process. The list of Big Data needs will be longer than the list of resources, so a transparent process for prioritizing the work is essential. Quarterly: review data processing and use case epics; prioritize the backlog. Data epics: prioritize by value of data across the business, conformed dimensions, single use case. Use case epics: prioritize by value in the business, availability of the data, ability to dimensions.
  • 32. Agile methodology enables iterative workstreams. [Diagram: three parallel workstreams run alongside the EDH buildout: agile use case development, agile data ingestion/management (data engineering), and agile data governance & common profile development. In each workstream, scrum teams move through Release 1, Release 2, Release 3 and Release 4 to production ready.]
  • 33. Your Big Data business roadmap. After prioritization, communicate, along with the business, the work roadmap describing the areas. Business roadmap: data sets processed, infrastructure installed, use cases and insights expected. Technical roadmap: use the agile backlog of epics, stories, etc. to manage the technical deliverables. Iterative: ability to “fail-fast”; roadmaps change more often than in a waterfall world.
  • 34. Prioritization. Use cases. Availability: accessible? quality? steward? generate or buy the data? shareable? Complexity: simple report, machine learning, advanced analytics, automated decision making. Value: align to objective? enable other case(s)? increase revenue or cost savings? insights effect? executive override. Data*. High: urgently needed? high value? multiple teams? short-lived or streaming? Medium: augment existing data; reuse existing data processing code; easy to pull down; API allows bringing in historical data. Low: some data access/workaround; low-quality data; data has to be screen-scraped; low likelihood of data being used. *Source: Carl Anderson, “Creating a Data-Driven Culture”
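The use case criteria on slide 34 (value, availability, complexity, executive override) can be turned into a ranking. The Python sketch below is one possible reading, not a method given in the deck. The 1-3 scales, the equal weighting, the `UseCase` class, and the example use cases are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """A candidate use case scored on the slide's three criteria (assumed 1-3 scales)."""
    name: str
    value: int          # 1 = low business value, 3 = high
    availability: int   # 1 = data hard to obtain, 3 = accessible and good quality
    complexity: int     # 1 = simple report, 3 = automated decision making
    executive_override: bool = False


def priority(uc: UseCase) -> int:
    # Assumed scoring: reward value and data availability, penalize complexity.
    return uc.value + uc.availability - uc.complexity


def rank(use_cases):
    """Order use cases: executive overrides first, then by descending priority, then by name."""
    return sorted(
        use_cases,
        key=lambda uc: (not uc.executive_override, -priority(uc), uc.name),
    )


# Hypothetical backlog for illustration only.
backlog = [
    UseCase("churn report", value=3, availability=3, complexity=1),
    UseCase("image classifier", value=3, availability=1, complexity=3),
    UseCase("board dashboard", value=1, availability=2, complexity=1,
            executive_override=True),
]

for uc in rank(backlog):
    print(uc.name, priority(uc))
```

Keeping the scoring in one small function makes the prioritization transparent: the weights can be reviewed and changed at the quarterly backlog review without touching the ranking logic.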
  • 35. Efficiently operationalize your insights
  • 36. Cloudera customers: strategic initiatives. Drive customer insights; connect products & services; protect business. There are three axes to support these initiatives: 1. Data available 2. Analytical methods 3. Integration of analytical results into report, app, web app, etc.
  • 37. The first axis of analytics: the data. Properties: batch, stream, real-time. Categories: digital/mobile, transaction, CRM/call center, demographics, network/product, social. Digital media: Teradata Aprimo, IBM Unica, Oracle Eloqua, X+1. Web logs: Microsoft IIS, Apache, nginx, Google GWS. Clickstream/UX: Adobe Omniture, IBM Coremetrics, IBM Tealeaf, Google Analytics Premium, Webtrends. Mobile application: SMS. Retail Mobile Web Channel Distributor Bot Call center Indirect Kiosk Embedded commerce service Billing. Customer lifecycle: Acquisition, Churn, Cross-Sell, Upsell. CRM: MS Dynamics, Oracle/Siebel, Salesforce, SAP. Online chat: Oracle RightNow, Moxie Live Chat, LivePerson, Instant Service, Oracle Live Help, BoldChat, Zendesk Zopim, Kana Live Chat. IVR: Avaya, Cisco, Nortel, Nuance. Data broker / syndicate: Acxiom, CoreLogic, Datalogix, eBureau, ID Analytics, Intelius, PeekYou, Rapleaf, Recorded Future, IHS Polk, Nielsen, InfoScout, Symphony IRI, Gfk Behavior. Loyalty: Aimia, Brierley+Partners, Comarch, Epsilon, Kobie, ICF Olson 1to1, Merkle, Clutch, CrowdTwist, DataCandy, Deluxe, Inte Q, ICLP. Survey: ABA, Medallia, Forsee, Allegiance, Walker Information. Direct: Twitter, Facebook, Bazaarvoice. Listening/management: Sprinklr, Crimson Hexagon, Radian6, Lithium, Simply Measured, Curalate, Datasift. Voice of the community: CSAT, NPS.
  • 38. The 2nd axis of analytics: analytical processing
      Unsupervised learning: clustering, topic modeling, time series analysis
      Classification: gradient boosted trees, SVMs, logistic regression, etc.
      Deep learning ("neural nets") and natural language processing
      Profile customer
      → Detect anomalous events (e.g., predictive maintenance)
      → Score entities by behavior (e.g., churn analytics)
      → Classify or cluster unstructured data (e.g., images or text for cyber threats)
      Simple aggregation
  • 39. The 3rd axis of analytics: serving insights
      Integration with web applications via Spark or HBase
      Integration with mobile apps via Spark or HBase
      Integration with enterprise applications (e.g., CRM, sales)
      Search applications via Solr
      Serving to standard BI tools (e.g., Tableau, Qlik)
  • 40. Project execution methodologies: the change*
      [Figure: three delivery models compared]
      Waterfall (80s-90s): Design → Code → Test → Deploy, once
      Agile (late 90s): Design, then repeated Code → Test iterations, then Deploy
      DevOps (00s): Design, then repeated Code → Test → Deploy (C T D) cycles
      *https://blog.spec-india.com/from-waterfall-to-agile-to-devops-a-cultural-and-technological-shift/
  • 41. Key role: DevOps for continuous deployment
      Layers: Insights, Big Data management, Infrastructure
      DevOps links highly fluid, continuous development with more structured infrastructure by ensuring collaboration.
      Description
      § Existing IT support handles infrastructure for L1, L2, L3.
      § Existing IT support passes data and insights through L1 and L2 to a DevOps team for Level 3 support. Key contact with Cloudera support.
      § Data and insights teams work with DevOps for production, and DevOps works with L2 to debug and deploy.
      **Cloudera administrator, developer, navigator, security training
  • 42. Governance
  • 43. Governance: the foundation of data management
      Compliance: Track, understand and protect access to data
      § Am I prepared for an audit?
      § Who’s accessing what data?
      § What are they doing with the data?
      § Is sensitive data governed and protected?
      Stewardship: Manage and organize data assets at Hadoop scale
      § How to efficiently manage data lifecycle, from ingest to purge?
      § How do I classify data efficiently?
      § How do I make data available to my end users efficiently?
      Data Science: Effortlessly find and trust the data that matters most
      § How can I explore data on my own?
      § Can I trust what I find?
      § How do I use what I find?
      § How do I find and use related data sets?
      Administration: Boost user productivity and cluster performance
      § How is data being used today?
      § How can I optimize for future workloads?
      § How can I quickly take advantage of Hadoop risk-free?
  • 44. Right-size your Big Data governance
      Data stewards: Owners and/or creators of the data
      Responsibilities
      § Knowledge of the data
      § Documenting
      Data engineers: Implement the data governance policies
      Responsibilities
      § Defining the governance
      § Organizing the council
      § Utilizing tools
      Data governance council: Business owners of the data governance
      Responsibilities
      § Communicating governance
      § Assigning data steward roles
      § Improving the link-ability
  • 45. Right-size your Big Data governance
      Data governance program: Responsible for governing the company’s Big Data asset
      Data governance council: Cross-org; has the authority for governing data
      Data stewards: Retention management, Profile management, Quality management, Data privacy
      Step 1: Data engineering team proposes policies, data steward roles, data dictionary, master data (for profiles)
      Step 2: Employ technology, e.g. Navigator, to implement the policies
      Step 3: Data governance exec council, with cross-company participation
  • 46. Our most successful customers do these five things
      1. Build a data-driven culture
      2. Develop the right team and skills
      3. Adopt an agile/lean approach
      4. Efficiently operationalize your insights
      5. Right-size data governance
  • 47. Thank you
      Frank Vullers, Business Value Strategist, Cloudera
      fvullers@cloudera.com
      @FrankVullers
  • 49. All types of personal data included in GDPR
  • 50. Integrity – Complete Audit and Policy rules
      Govern
      • Encrypt sensitive data
      • Personal data is moved to encrypted zone
      • Policy rule in Navigator
      • Audit on all data
      • All actions fully audited
      • Denied access
      • Access to personal data all audited
  • 51. Right to be Forgotten – Lineage back to source
      Cloudera Navigator Lineage
      Operate
      • Lineage back to Source
      • GDPR data tagged
      • Source System tagged
      • Source Table and Column
      • Delete back to Source
      • Trigger overnight Delete process
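
The "lineage back to source" idea on the last slide can be illustrated with a minimal Python sketch. It is not Cloudera Navigator's API: the `LINEAGE` map, the column names, and both helper functions are hypothetical, and the lineage is assumed to be available as a plain mapping from each column to the columns it was derived from. The sketch walks from a GDPR-tagged column back to every upstream column, so that a scheduled delete job would know which tables and columns to purge.

```python
# Hypothetical lineage map: each column -> the columns it was derived from.
# Columns with an empty parent list are treated as source-system columns.
LINEAGE = {
    "reports.churn.email": ["warehouse.customers.email"],
    "warehouse.customers.email": ["crm.contacts.email", "web.signups.email"],
    "crm.contacts.email": [],
    "web.signups.email": [],
}


def upstream_columns(column, lineage):
    """Return the column itself plus every column it was derived from."""
    seen = set()
    stack = [column]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(lineage.get(current, []))
    return seen


def source_columns(column, lineage):
    """Return only the upstream columns that have no parents (the sources)."""
    return {
        col for col in upstream_columns(column, lineage)
        if not lineage.get(col)
    }


if __name__ == "__main__":
    tagged = "reports.churn.email"  # assumed to carry a GDPR tag
    print("Delete from:", sorted(upstream_columns(tagged, LINEAGE)))
    print("Sources:", sorted(source_columns(tagged, LINEAGE)))
```

A real deployment would read lineage and tags from the governance tool rather than a hard-coded dictionary; the traversal logic is the only part this sketch is meant to show.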