SlideShare ist ein Scribd-Unternehmen logo
1 von 36
Downloaden Sie, um offline zu lesen
Building Data
Science Teams
David Dietrich
Advisory Technical Education Consultant
EMC Education Services
@imdaviddietrich

© Copyright 2013 EMC Corporation. All rights reserved.

1
Roadmap Information Disclaimer
 EMC makes no representation and undertakes no obligations with
regard to product planning information, anticipated product
characteristics, performance specifications, or anticipated release
dates (collectively, “Roadmap Information”).
 Roadmap Information is provided by EMC as an accommodation to the
recipient solely for purposes of discussion and without intending to be
bound thereby.
 Roadmap information is EMC Restricted Confidential and is provided
under the terms, conditions and restrictions defined in the EMC NonDisclosure Agreement in place with your organization.

© Copyright 2013 EMC Corporation. All rights reserved.

2
Do You Need A Data Science Team For This?
Creating Reports, Dashboards, and Databases…

© Copyright 2013 EMC Corporation. All rights reserved.

3
How About For This?

Creating a Map of the Internet

© Copyright 2013 EMC Corporation. All rights reserved.

4
Example: Output From a Data Science Team
Mapping The Spread of Innovation Ideas Using Social Graphs

© Copyright 2013 EMC Corporation. All rights reserved.

5
Big Data Requires New Approaches to Analytics
Business Intelligence Versus Data Science

Predictive Analytics and Data Mining
(Data Science)
Typical
Techniques and
Data Types

Common
Questions

Exploratory

Data
Science

Analytical
Approach

•

•
•
•

•

Typical
Techniques and
Data Types

Business
Intelligence

Past

© Copyright 2013 EMC Corporation. All rights reserved.

What if…..?
What’s the optimal scenario for our business?
What will happen next? What if these trends continue?
Why is this happening?

Business Intelligence

TIME

•

Common
Questions

Explanatory

Optimization, predictive modeling, forecasting, statistical
analysis
Structured/unstructured data, many types of sources,
very large data sets

•
•
•

•

Standard and ad hoc reporting, dashboards, alerts,
queries, details on demand
Structured data, traditional sources, manageable data
sets

What happened last quarter?
How many did we sell?
Where is the problem? In which situations?

Future

6
“Companies Are Always Looking To Reinvent
Themselves….But It’s A Mistake To Treat Data Science
Teams Like Any Old Product Group.
To Build Teams That Create Great Data Products, You
Have To Find People With The Skills And The Curiosity To
Ask The Big Questions.”

-

DJ Patil, Data Scientist in Residence at Greylock Partners

© Copyright 2013 EMC Corporation. All rights reserved.

7
Framework for Developing Data Science Teams

Data Science
Team

Data
Scientist

© Copyright 2013 EMC Corporation. All rights reserved.

BI
Analyst

Project
Sponsor

Project
Manager

Business
User

Data
Engineer

DBA

8
Data
Science
Teams
© Copyright 2013 EMC Corporation. All rights reserved.

9
Data Scientist: An Emerging Career

SPOTLIGHT ON BIG DATA

by Thomas H. Davenport and D.J. Patil

© Copyright 2013 EMC Corporation. All rights reserved.

10
Comparing Two Data Analysts
Traditional BI Analyst

Data Scientist

ACME
Healthcare

ACME
Healthcare

John

Janet

Sample Tasks

Sample Tasks

• Report Regional Sales For Last Quarter
• Perform Customer Feedback Surveys
• Identify Average Cost Per Supplier

• Predict Regional Sales For Next Quarter
• Discover Customer Opinions Via Social Media
• Identify Ways to Maximize Sales Campaign ROI

© Copyright 2013 EMC Corporation. All rights reserved.

11
Skills Matrix, Based on Recent Students

Quantitative
Skills

Quantitative Analysts,
Statisticians,
Business and data
analysts
Recent STEM
Grads

Data
Scientists

Business
Intelligence
Professionals, IT

Technical Ability

© Copyright 2013 EMC Corporation. All rights reserved.

12
Profile of a Data Scientist
Quantitative

Technical

Skeptical

© Copyright 2013 EMC Corporation. All rights reserved.

Curious &
Creative

Communicative
& Collaborative

13
Interpreting the Resume of a Senior Data Scientist
Data Scientist Job Description
Responsibilities:
• Work with business owners to map business requirements into technical solutions
• Analyze and extract relevant information from large amounts of data to help identify
key revenue-driven features
• Perform ad-hoc statistical and data mining analyses
• Design and implement scalable and repeatable solutions, and establish scalable,
efficient, automated processes for large scale data analyses
• Work closely with the software engineering team to drive new feature creation
• Design multi-factor experiments and validate hypothesis
Qualifications:
• A proven passion for generating insights from data, with a strong familiarity with the
higher-level trends in data growth, open-source platforms, and public data sets
• Experience with statistical languages and packages, including R, S-Plus, SAS and Matlab,
and/or Mahout
• Experience working with relational databases and/or distributed computing platforms,
and their query interfaces, such as SQL, MapReduce, Hadoop, Cassandra, PIG, and Hive
• Strong communication skills, with ability to communicate at all levels of the organization
• Masters/PhD degree in mathematics, statistics, computer science or a similar
quantitative field
• Experience in designing and implementing scalable data mining solutions
• Preferably experience with additional programming languages, including Python, Java,
and C/C++
• Ability to travel as-needed to meet with customers

Statistics

Programming

Data Mining

Advanced STEM
Degrees

© Copyright 2013 EMC Corporation. All rights reserved.

Sample Data Scientist Resume
John Smith

john.smith@email.com
Skills
R, SAS, Java, data mining, statistics, ontology, bioinformatics, human-computer interaction,
research
Experience
2009—Present, Senior Data Scientist, ABC Analytics
2007—2009, Founder&CEO, Genome
Genome specializes in consumer health information. The main product is
InherithHealth, a tool for acquisition of family medical histories that provides familial
disease risk assessment.
2005—2007, Knowledge Engineer, ScienceExperts.com
Managed technical outsourcing efforts. Developed criterion and evaluated
engineering outsourcing agencies and individuals …
2004—2006, Research Scientist, University of Washington
Developed rigorous statistical and computational models for addressing primary
shortcomings of observational data analysis in the context of disease risk and drug
response.
2000—2004, Research Developer, Nat’l Inst. of Standards and Technology
Designed and implemented prototypes. Evaluated tools for representing rules of
autonomous on-road navigation.
Education
Ph.D, Biomedical Informatics, University of Washington, 2011
Dissertation: Detection of Protein–protein Interaction in Living Cells by Flow
Cytometry
BS, Computer Science, University of Texas at Austin, 2004

14
Successful Analytic Projects Require Breadth of Roles
Business User

Project Sponsor

Database Administrator
(DBA)

© Copyright 2013 EMC Corporation. All rights reserved.

Project Manager

Data Engineer

Business Intelligence Analyst

Data Scientist

15
Framework for Developing Data Science Teams

Developing Data
Science
Capabilities
Data Science
Team

Transforming

Data
Scientist

© Copyright 2013 EMC Corporation. All rights reserved.

BI
Analyst

Creating

Project
Sponsor

As-a-Service

Project
Manager

Business
User

Crowdsourcing

Data
Engineer

DBA

16
Developing
Data
Science
Capabilities
© Copyright 2013 EMC Corporation. All rights reserved.

17
Four Approaches to Developing Data Science
Capabilities

Transforming

© Copyright 2013 EMC Corporation. All rights reserved.

Creating

As a Service

Crowdsourcing

18
Approaches to Developing Data Science Capabilities:
Transforming Teams

Transforming And Realignment With Minimal
Change To The Current Organizational
Structure

• Industries Requiring Deep Domain Knowledge
(Such As Genetics And DNA Sequencing)
• Established Companies Who Wish To Introduce Data Science
Into Their Business
• Companies Who Wish To Enrich The In-house Skill Sets

© Copyright 2013 EMC Corporation. All rights reserved.

19
Approaches to Developing Data Science Capabilities:
Transforming Teams

Developing A New Team From Scratch

• Start-up Companies
• Companies Who Wish To …
– Increase Their Focus On Data Analytics
– Start New Data Science Projects
• Companies Where Data Is The Product
• Deep Domain Knowledge Is Less Critical For The Analytics

© Copyright 2013 EMC Corporation. All rights reserved.

20
Approaches to Developing Data Science Capabilities:
Data Science as a Service

Engaging Data Science as a Service
(DSaaS)
• When To Engage DSaaS Providers
– Prefer Not To Change Existing Organizational Structure
– When Creating Or Transforming Are Not Viable Options
• Consider Service-level Agreements (SLAs) When Determining Whether
To Engage Internal Resources Or External Providers

© Copyright 2013 EMC Corporation. All rights reserved.

21
Approaches to Developing Data Science Capabilities:
Crowdsourcing Data Science

Outsource Data Science Project To
Distributed Groups Of People
• When To Crowdsource
– The Problem Is “Open” In Nature
– Willing To Accept Opinions From Distributed And Diverse Groups Of
People
– There’s A Back-up Plan In Case Of “Crowd Failures”
• Examples: Wikipedia, Netflix’s $1,000,000 Prize

© Copyright 2013 EMC Corporation. All rights reserved.

22
Approaches to Developing Data Science Capabilities:
Crowdsourcing Data Science (Cont’d)

Outsource Data Science Projects To
Distributed Groups Of People
• Different Crowdsourcing Models
– Wisdom Of Crowds
– Swarm Creativity (Collective Intelligence)
• Crowdsourcing Platforms
– Kaggle.com, Innocentive.com
– Amazon Mechanical Turk
• Crowd Failures: When The Turnout Of Crowdsourcing Is Unsatisfactory

© Copyright 2013 EMC Corporation. All rights reserved.

23
Benefits and Drawbacks of the Four Approaches
Transforming

Creating

DSaaS

Crowdsourcing

• Strong Domain

• Control Over Skill-

• Able to Scale on

• Leverage Wisdom

• Risk of homogeneous

• Hiring and

• Provider May Not

• No SLA; value not

Knowledge
• Knowledge of Business
Processes
• New Talent Raises Level
of Team Performance
• Gradually Increases the
Quality of Service

thinking
• May Struggle With
Quality of Service
• Some Team Members
May Resist Change

© Copyright 2013 EMC Corporation. All rights reserved.

sets
• More Flexibility
• High Quality of
Service

Knowledge
Transfer Are Timeconsuming
• Time Required to Find
and Hire Right Team
Members

Demand
• May Get Better
Service Levels Than
In-house
• Learn From Outside
Experts

Understand
Company’s Unique
Processes
• Difficult to Bring
Expertise Back Inhouse
• Decreasing Quality of
Service Over Time

of the Crowds
• Diverse Perspectives
• Lower Cost
• Fast Results

guaranteed
• Difficult to design the
“Open” Problem
• Difficult For Domain
Intensive Tasks
• Crowd Failure May
Happen (Adds Cost)

24
Framework for Developing Data Science Teams

Organizational
Model
Developing Data
Science
Capabilities
Data Science
Team

Centralized

Transforming

Data
Scientist

© Copyright 2013 EMC Corporation. All rights reserved.

BI
Analyst

Decentralized

Creating

Project
Sponsor

Hybrid

As-a-Service

Project
Manager

Business
User

Crowdsourcing

Data
Engineer

DBA

25
Organizational
Model

© Copyright 2013 EMC Corporation. All rights reserved.

26
Organizational Models for Data Science Teams
Centralized
BU

BU

BU

BU

DS Team

Hybrid
DS Team

BU

The Data Science Team Functions As A Hub
And Spoke Model, In Which They Are A
Central Provider Of Analytics To Multiple
Business Units

Decentralized
BU

BU

BU

BU

Each Business Unit Has Its Own
Data Science Capabilities

© Copyright 2013 EMC Corporation. All rights reserved.

There Is A Centralized Data Science
Team, But Business Units Also Have
Data Science Capabilities

Regardless Of Which Approach, They All
Need Executive Sponsorship To Succeed

27
Framework for Developing Data Science Teams
Executive
Engagement

Data-driven CEO

Organizational
Model
Developing Data
Science
Capabilities
Data Science
Team

Centralized

Transforming

Data
Scientist

© Copyright 2013 EMC Corporation. All rights reserved.

BI
Analyst

Chief Data Officer

Decentralized

Creating

Project
Sponsor

Hybrid

As-a-Service

Project
Manager

Business
User

Crowdsourcing

Data
Engineer

DBA

28
Executive
Engagement

© Copyright 2013 EMC Corporation. All rights reserved.

29
Analytics Requires Executive Level Engagement
“Executive Sponsorship Is So Vital To Analytical Competition…”
-- Tom Davenport (Competing on Analytics)
Chief Strategy Officer
Simulate Outcomes for
Acquiring Our Top 3
Competitors

CEO

Chief Finance Officer
Use Time Series Analysis
Over Historical Data to
Predict KPIs to Project
Earnings

Chief Product Officer
Conduct Social Media
Analyses to Identify
Customer Opinions

Chief Security Officer
Collect and Mine Log Data
Within and Outside of the
Company to Detect
Unknown Threats

Chief Marketing Officer
Conduct Behavior Analyses
to Predict If Customers Are
Going to Churn

Chief Operating Officer
Mine Customer Opinions
and Competitor Behaviors
to Predict Inventory
Demands

© Copyright 2013 EMC Corporation. All rights reserved.

Executive Boardroom

30
Executive Engagement: Data-Driven CEO
“… If Your Organization Can Arrange It … Have Someone In A Key Operational Role -Business Unit Head, Chief Operations Officer, Even CEO -- To Be An Enthusiastic Advocate Of
Matters Quantitative.”

-- Tom Davenport (HBR Blog Network)

Key Focus Areas of a
Data-driven CEO:
• Strategic Data
Planning
• Analytic
Understanding
• Technology Awareness

© Copyright 2013 EMC Corporation. All rights reserved.

Procter & Gamble Business Sphere

31
Executive Engagement: Chief Data Officer (CDO)
“… It's Time For Corporations To Embrace A New Functional Member Of
The C-suite: The Chief Data Officer (CDO).”

-- Anthony Goldbloom and Merav Bloch, Kaggle

• Promote Data-driven Decision
Making To Support Company’s Key
Initiatives
• Ensure The Company Collects The
Right Data
• Oversee And Drive Analytics
Company-wide
25% of organizations will have a Chief Data
Officer by 2015.

Executive-level Advisor On Data Analytics

Executive Boardroom

-- Gartner Blog Network

© Copyright 2013 EMC Corporation. All rights reserved.

32
Two New EMC Data Science Courses for Business Transformation

Business Leaders

New

90 min

Introducing Data Science and Big Data Analytics for Business Transformation

Heads of Data
Science Teams

New

1 day
Data Science and Big Data Analytics for Business Transformation

Aspiring Data
Scientists

5 days
Data Science and Big Data Analytics

© Copyright 2013 EMC Corporation. All rights reserved.

33
Closing Thoughts….
SPOTLIGHT ON BIG DATA

Now You Know How To Develop Data Science Teams…What Next?
•
•
•
•
•

Determine How You Would Like To Develop Data Science Capabilities
Hire People To Fill Out Your Data Science Team
Consider Which Organizational Model Will Work Best For Your Situation
Assess How Much Executive Engagement You Have Or Need
Map Out Potential Projects -- Balance Quick Wins With Longer-term Wins

© Copyright 2013 EMC Corporation. All rights reserved.

34
Questions?
Visit the EMC Global Services Booth, #110
Additional Resources:
1. EMC Education Services curriculum on Data Science and Big Data Analytics
for Business Transformation:
http://education.emc.com/guest/campaign/data_science.aspx

2. My Blog on Data Science & Big Data Analytics:
http://infocus.emc.com/author/david_dietrich/

3. Blog on applying Data Analytics Lifecycle to measuring innovation data:
http://stevetodd.typepad.com/my_weblog/data-science-and-big-data-curriculum/

© Copyright 2013 EMC Corporation. All rights reserved.

35
Building Data Science Teams

Weitere ähnliche Inhalte

Was ist angesagt?

Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfPremNaraindas1
 
Exploring Opportunities in the Generative AI Value Chain.pdf
Exploring Opportunities in the Generative AI Value Chain.pdfExploring Opportunities in the Generative AI Value Chain.pdf
Exploring Opportunities in the Generative AI Value Chain.pdfDung Hoang
 
BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyShivam Dhawan
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models BootcampData Science Dojo
 
AIDRC_Generative_AI_TL_v5.pdf
AIDRC_Generative_AI_TL_v5.pdfAIDRC_Generative_AI_TL_v5.pdf
AIDRC_Generative_AI_TL_v5.pdfThierry Lestable
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
data-analytics-strategy-ebook.pptx
data-analytics-strategy-ebook.pptxdata-analytics-strategy-ebook.pptx
data-analytics-strategy-ebook.pptxMohamedHendawy17
 
The art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptx
The art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptxThe art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptx
The art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptxNeo4j
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Knowledge Graphs and Generative AI
Knowledge Graphs and Generative AIKnowledge Graphs and Generative AI
Knowledge Graphs and Generative AINeo4j
 
Neo4j Webinar: Graphs in banking
Neo4j Webinar:  Graphs in banking Neo4j Webinar:  Graphs in banking
Neo4j Webinar: Graphs in banking Neo4j
 
Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025Ikhwan115951
 
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...DATAVERSITY
 
Data Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data QualityData Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data QualityDATAVERSITY
 
AI: Built to Scale
AI: Built to ScaleAI: Built to Scale
AI: Built to Scaleaccenture
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningDatabricks
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data ScienceJason Geng
 

Was ist angesagt? (20)

Unlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdfUnlocking the Power of Generative AI An Executive's Guide.pdf
Unlocking the Power of Generative AI An Executive's Guide.pdf
 
Exploring Opportunities in the Generative AI Value Chain.pdf
Exploring Opportunities in the Generative AI Value Chain.pdfExploring Opportunities in the Generative AI Value Chain.pdf
Exploring Opportunities in the Generative AI Value Chain.pdf
 
BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and Strategy
 
Large Language Models Bootcamp
Large Language Models BootcampLarge Language Models Bootcamp
Large Language Models Bootcamp
 
AIDRC_Generative_AI_TL_v5.pdf
AIDRC_Generative_AI_TL_v5.pdfAIDRC_Generative_AI_TL_v5.pdf
AIDRC_Generative_AI_TL_v5.pdf
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
data-analytics-strategy-ebook.pptx
data-analytics-strategy-ebook.pptxdata-analytics-strategy-ebook.pptx
data-analytics-strategy-ebook.pptx
 
The art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptx
The art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptxThe art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptx
The art of the possible with graph technology_Neo4j GraphSummit Dublin 2023.pptx
 
8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Determine Your Data Strategy
Determine Your Data StrategyDetermine Your Data Strategy
Determine Your Data Strategy
 
Knowledge Graphs and Generative AI
Knowledge Graphs and Generative AIKnowledge Graphs and Generative AI
Knowledge Graphs and Generative AI
 
UTILITY OF AI
UTILITY OF AIUTILITY OF AI
UTILITY OF AI
 
Neo4j Webinar: Graphs in banking
Neo4j Webinar:  Graphs in banking Neo4j Webinar:  Graphs in banking
Neo4j Webinar: Graphs in banking
 
Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025Artificial Intelligence Roadmap 2021-2025
Artificial Intelligence Roadmap 2021-2025
 
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
 
Data Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data QualityData Modeling, Data Governance, & Data Quality
Data Modeling, Data Governance, & Data Quality
 
AI: Built to Scale
AI: Built to ScaleAI: Built to Scale
AI: Built to Scale
 
What’s New with Databricks Machine Learning
What’s New with Databricks Machine LearningWhat’s New with Databricks Machine Learning
What’s New with Databricks Machine Learning
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data Science
 

Ähnlich wie Building Data Science Teams

Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...Denodo
 
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION Elvis Muyanja
 
Data Science Highlights
Data Science Highlights Data Science Highlights
Data Science Highlights Joe Lamantia
 
L3 Big Data and Application.pptx
L3  Big Data and Application.pptxL3  Big Data and Application.pptx
L3 Big Data and Application.pptxShambhavi Vats
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Cloudera, Inc.
 
Big data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makersBig data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makersRuhollah Farchtchi
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US InformationJulian Tong
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoopDr. Wilfred Lin (Ph.D.)
 
Tips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the EnterpriseTips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the EnterpriseLisa Cohen
 
Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...
Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...
Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...Precisely
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big dataRaul Chong
 
Matt McIlwain opening keynote
Matt McIlwain opening keynoteMatt McIlwain opening keynote
Matt McIlwain opening keynoteSeattleSIM
 
Ch1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxCh1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxAbderrahmanABID2
 
How Can Analytics Improve Business?
How Can Analytics Improve Business?How Can Analytics Improve Business?
How Can Analytics Improve Business?Inside Analysis
 
Introduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfIntroduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfAbdulrahimShaibuIssa
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 

Ähnlich wie Building Data Science Teams (20)

Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
 
Introduction to BigData
Introduction to BigData Introduction to BigData
Introduction to BigData
 
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
 
Data Science Highlights
Data Science Highlights Data Science Highlights
Data Science Highlights
 
L3 Big Data and Application.pptx
L3  Big Data and Application.pptxL3  Big Data and Application.pptx
L3 Big Data and Application.pptx
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

 
Big data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makersBig data analytics presented at meetup big data for decision makers
Big data analytics presented at meetup big data for decision makers
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 
semana1.pptx
semana1.pptxsemana1.pptx
semana1.pptx
 
Tips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the EnterpriseTips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the Enterprise
 
Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...
Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...
Engineering Machine Learning Data Pipelines Series: Big Data Quality - Cleans...
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big data
 
Matt McIlwain opening keynote
Matt McIlwain opening keynoteMatt McIlwain opening keynote
Matt McIlwain opening keynote
 
Ch1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxCh1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptx
 
How Can Analytics Improve Business?
How Can Analytics Improve Business?How Can Analytics Improve Business?
How Can Analytics Improve Business?
 
Introduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfIntroduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdf
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 

Mehr von EMC

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDEMC
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote EMC
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOEMC
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremioEMC
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lakeEMC
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereEMC
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History EMC
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewEMC
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeEMC
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic EMC
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityEMC
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeEMC
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015EMC
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesEMC
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsEMC
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookEMC
 

Mehr von EMC (20)

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremio
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis Openstack
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lake
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop Elsewhere
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical Review
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or Foe
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for Security
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure Age
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education Services
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere Environments
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBook
 

Kürzlich hochgeladen

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 

Kürzlich hochgeladen (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 

Building Data Science Teams

  • 1. Building Data Science Teams David Dietrich Advisory Technical Education Consultant EMC Education Services @imdaviddietrich © Copyright 2013 EMC Corporation. All rights reserved. 1
  • 2. Roadmap Information Disclaimer  EMC makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”).  Roadmap Information is provided by EMC as an accommodation to the recipient solely for purposes of discussion and without intending to be bound thereby.  Roadmap information is EMC Restricted Confidential and is provided under the terms, conditions and restrictions defined in the EMC NonDisclosure Agreement in place with your organization. © Copyright 2013 EMC Corporation. All rights reserved. 2
  • 3. Do You Need A Data Science Team For This? Creating Reports, Dashboards, and Databases… © Copyright 2013 EMC Corporation. All rights reserved. 3
  • 4. How About For This? Creating a Map of the Internet © Copyright 2013 EMC Corporation. All rights reserved. 4
  • 5. Example: Output From a Data Science Team Mapping The Spread of Innovation Ideas Using Social Graphs © Copyright 2013 EMC Corporation. All rights reserved. 5
  • 6. Big Data Requires New Approaches to Analytics Business Intelligence Versus Data Science Predictive Analytics and Data Mining (Data Science) Typical Techniques and Data Types Common Questions Exploratory Data Science Analytical Approach • • • • • Typical Techniques and Data Types Business Intelligence Past © Copyright 2013 EMC Corporation. All rights reserved. What if…..? What’s the optimal scenario for our business? What will happen next? What if these trends continue? Why is this happening? Business Intelligence TIME • Common Questions Explanatory Optimization, predictive modeling, forecasting, statistical analysis Structured/unstructured data, many types of sources, very large data sets • • • • Standard and ad hoc reporting, dashboards, alerts, queries, details on demand Structured data, traditional sources, manageable data sets What happened last quarter? How many did we sell? Where is the problem? In which situations? Future 6
  • 7. “Companies Are Always Looking To Reinvent Themselves….But It’s A Mistake To Treat Data Science Teams Like Any Old Product Group. To Build Teams That Create Great Data Products, You Have To Find People With The Skills And The Curiosity To Ask The Big Questions.” - DJ Patil, Data Scientist in Residence at Greylock Partners © Copyright 2013 EMC Corporation. All rights reserved. 7
  • 8. Framework for Developing Data Science Teams Data Science Team Data Scientist © Copyright 2013 EMC Corporation. All rights reserved. BI Analyst Project Sponsor Project Manager Business User Data Engineer DBA 8
  • 9. Data Science Teams © Copyright 2013 EMC Corporation. All rights reserved. 9
  • 10. Data Scientist: An Emerging Career SPOTLIGHT ON BIG DATA by Thomas H. Davenport and D.J. Patil © Copyright 2013 EMC Corporation. All rights reserved. 10
  • 11. Comparing Two Data Analysts Traditional BI Analyst Data Scientist ACME Healthcare ACME Healthcare John Janet Sample Tasks Sample Tasks • Report Regional Sales For Last Quarter • Perform Customer Feedback Surveys • Identify Average Cost Per Supplier • Predict Regional Sales For Next Quarter • Discover Customer Opinions Via Social Media • Identify Ways to Maximize Sales Campaign ROI © Copyright 2013 EMC Corporation. All rights reserved. 11
  • 12. Skills Matrix, Based on Recent Students Quantitative Skills Quantitative Analysts, Statisticians, Business and data analysts Recent STEM Grads Data Scientists Business Intelligence Professionals, IT Technical Ability © Copyright 2013 EMC Corporation. All rights reserved. 12
  • 13. Profile of a Data Scientist Quantitative Technical Skeptical © Copyright 2013 EMC Corporation. All rights reserved. Curious & Creative Communicative & Collaborative 13
  • 14. Interpreting the Resume of a Senior Data Scientist Data Scientist Job Description Responsibilities: • Work with business owners to map business requirements into technical solutions • Analyze and extract relevant information from large amounts of data to help identify key revenue-driven features • Perform ad-hoc statistical and data mining analyses • Design and implement scalable and repeatable solutions, and establish scalable, efficient, automated processes for large scale data analyses • Work closely with the software engineering team to drive new feature creation • Design multi-factor experiments and validate hypothesis Qualifications: • A proven passion for generating insights from data, with a strong familiarity with the higher-level trends in data growth, open-source platforms, and public data sets • Experience with statistical languages and packages, including R, S-Plus, SAS and Matlab, and/or Mahout • Experience working with relational databases and/or distributed computing platforms, and their query interfaces, such as SQL, MapReduce, Hadoop, Cassandra, PIG, and Hive • Strong communication skills, with ability to communicate at all levels of the organization • Masters/PhD degree in mathematics, statistics, computer science or a similar quantitative field • Experience in designing and implementing scalable data mining solutions • Preferably experience with additional programming languages, including Python, Java, and C/C++ • Ability to travel as-needed to meet with customers Statistics Programming Data Mining Advanced STEM Degrees © Copyright 2013 EMC Corporation. All rights reserved. Sample Data Scientist Resume John Smith john.smith@email.com Skills R, SAS, Java, data mining, statistics, ontology, bioinformatics, human-computer interaction, research Experience 2009—Present, Senior Data Scientist, ABC Analytics 2007—2009, Founder&CEO, Genome Genome specializes in consumer health information. The main product is InherithHealth, a tool for acquisition of family medical histories that provides familial disease risk assessment. 2005—2007, Knowledge Engineer, ScienceExperts.com Managed technical outsourcing efforts. Developed criterion and evaluated engineering outsourcing agencies and individuals … 2004—2006, Research Scientist, University of Washington Developed rigorous statistical and computational models for addressing primary shortcomings of observational data analysis in the context of disease risk and drug response. 2000—2004, Research Developer, Nat’l Inst. of Standards and Technology Designed and implemented prototypes. Evaluated tools for representing rules of autonomous on-road navigation. Education Ph.D, Biomedical Informatics, University of Washington, 2011 Dissertation: Detection of Protein–protein Interaction in Living Cells by Flow Cytometry BS, Computer Science, University of Texas at Austin, 2004 14
  • 15. Successful Analytic Projects Require Breadth of Roles Business User Project Sponsor Database Administrator (DBA) © Copyright 2013 EMC Corporation. All rights reserved. Project Manager Data Engineer Business Intelligence Analyst Data Scientist 15
  • 16. Framework for Developing Data Science Teams Developing Data Science Capabilities Data Science Team Transforming Data Scientist © Copyright 2013 EMC Corporation. All rights reserved. BI Analyst Creating Project Sponsor As-a-Service Project Manager Business User Crowdsourcing Data Engineer DBA 16
  • 17. Developing Data Science Capabilities © Copyright 2013 EMC Corporation. All rights reserved. 17
  • 18. Four Approaches to Developing Data Science Capabilities Transforming © Copyright 2013 EMC Corporation. All rights reserved. Creating As a Service Crowdsourcing 18
  • 19. Approaches to Developing Data Science Capabilities: Transforming Teams Transforming And Realignment With Minimal Change To The Current Organizational Structure • Industries Requiring Deep Domain Knowledge (Such As Genetics And DNA Sequencing) • Established Companies Who Wish To Introduce Data Science Into Their Business • Companies Who Wish To Enrich The In-house Skill Sets © Copyright 2013 EMC Corporation. All rights reserved. 19
  • 20. Approaches to Developing Data Science Capabilities: Transforming Teams Developing A New Team From Scratch • Start-up Companies • Companies Who Wish To … – Increase Their Focus On Data Analytics – Start New Data Science Projects • Companies Where Data Is The Product • Deep Domain Knowledge Is Less Critical For The Analytics © Copyright 2013 EMC Corporation. All rights reserved. 20
  • 21. Approaches to Developing Data Science Capabilities: Data Science as a Service Engaging Data Science as a Service (DSaaS) • When To Engage DSaaS Providers – Prefer Not To Change Existing Organizational Structure – When Creating Or Transforming Are Not Viable Options • Consider Service-level Agreements (SLAs) When Determining Whether To Engage Internal Resources Or External Providers © Copyright 2013 EMC Corporation. All rights reserved. 21
  • 22. Approaches to Developing Data Science Capabilities: Crowdsourcing Data Science Outsource Data Science Project To Distributed Groups Of People • When To Crowdsource – The Problem Is “Open” In Nature – Willing To Accept Opinions From Distributed And Diverse Groups Of People – There’s A Back-up Plan In Case Of “Crowd Failures” • Examples: Wikipedia, Netflix’s $1,000,000 Prize © Copyright 2013 EMC Corporation. All rights reserved. 22
  • 23. Approaches to Developing Data Science Capabilities: Crowdsourcing Data Science (Cont’d) Outsource Data Science Projects To Distributed Groups Of People • Different Crowdsourcing Models – Wisdom Of Crowds – Swarm Creativity (Collective Intelligence) • Crowdsourcing Platforms – Kaggle.com, Innocentive.com – Amazon Mechanical Turk • Crowd Failures: When The Turnout Of Crowdsourcing Is Unsatisfactory © Copyright 2013 EMC Corporation. All rights reserved. 23
  • 24. Benefits and Drawbacks of the Four Approaches Transforming Creating DSaaS Crowdsourcing • Strong Domain • Control Over Skill- • Able to Scale on • Leverage Wisdom • Risk of homogeneous • Hiring and • Provider May Not • No SLA; value not Knowledge • Knowledge of Business Processes • New Talent Raises Level of Team Performance • Gradually Increases the Quality of Service thinking • May Struggle With Quality of Service • Some Team Members May Resist Change © Copyright 2013 EMC Corporation. All rights reserved. sets • More Flexibility • High Quality of Service Knowledge Transfer Are Timeconsuming • Time Required to Find and Hire Right Team Members Demand • May Get Better Service Levels Than In-house • Learn From Outside Experts Understand Company’s Unique Processes • Difficult to Bring Expertise Back Inhouse • Decreasing Quality of Service Over Time of the Crowds • Diverse Perspectives • Lower Cost • Fast Results guaranteed • Difficult to design the “Open” Problem • Difficult For Domain Intensive Tasks • Crowd Failure May Happen (Adds Cost) 24
  • 25. Framework for Developing Data Science Teams Organizational Model Developing Data Science Capabilities Data Science Team Centralized Transforming Data Scientist © Copyright 2013 EMC Corporation. All rights reserved. BI Analyst Decentralized Creating Project Sponsor Hybrid As-a-Service Project Manager Business User Crowdsourcing Data Engineer DBA 25
  • 26. Organizational Model © Copyright 2013 EMC Corporation. All rights reserved. 26
  • 27. Organizational Models for Data Science Teams Centralized BU BU BU BU DS Team Hybrid DS Team BU The Data Science Team Functions As A Hub And Spoke Model, In Which They Are A Central Provider Of Analytics To Multiple Business Units Decentralized BU BU BU BU Each Business Unit Has Its Own Data Science Capabilities © Copyright 2013 EMC Corporation. All rights reserved. There Is A Centralized Data Science Team, But Business Units Also Have Data Science Capabilities Regardless Of Which Approach, They All Need Executive Sponsorship To Succeed 27
  • 28. Framework for Developing Data Science Teams Executive Engagement Data-driven CEO Organizational Model Developing Data Science Capabilities Data Science Team Centralized Transforming Data Scientist © Copyright 2013 EMC Corporation. All rights reserved. BI Analyst Chief Data Officer Decentralized Creating Project Sponsor Hybrid As-a-Service Project Manager Business User Crowdsourcing Data Engineer DBA 28
  • 29. Executive Engagement © Copyright 2013 EMC Corporation. All rights reserved. 29
  • 30. Analytics Requires Executive Level Engagement “Executive Sponsorship Is So Vital To Analytical Competition…” -- Tom Davenport (Competing on Analytics) Chief Strategy Officer Simulate Outcomes for Acquiring Our Top 3 Competitors CEO Chief Finance Officer Use Time Series Analysis Over Historical Data to Predict KPIs to Project Earnings Chief Product Officer Conduct Social Media Analyses to Identify Customer Opinions Chief Security Officer Collect and Mine Log Data Within and Outside of the Company to Detect Unknown Threats Chief Marketing Officer Conduct Behavior Analyses to Predict If Customers Are Going to Churn Chief Operating Officer Mine Customer Opinions and Competitor Behaviors to Predict Inventory Demands © Copyright 2013 EMC Corporation. All rights reserved. Executive Boardroom 30
  • 31. Executive Engagement: Data-Driven CEO “… If Your Organization Can Arrange It … Have Someone In A Key Operational Role -Business Unit Head, Chief Operations Officer, Even CEO -- To Be An Enthusiastic Advocate Of Matters Quantitative.” -- Tom Davenport (HBR Blog Network) Key Focus Areas of a Data-driven CEO: • Strategic Data Planning • Analytic Understanding • Technology Awareness © Copyright 2013 EMC Corporation. All rights reserved. Procter & Gamble Business Sphere 31
  • 32. Executive Engagement: Chief Data Officer (CDO) “… It's Time For Corporations To Embrace A New Functional Member Of The C-suite: The Chief Data Officer (CDO).” -- Anthony Goldbloom and Merav Bloch, Kaggle • Promote Data-driven Decision Making To Support Company’s Key Initiatives • Ensure The Company Collects The Right Data • Oversee And Drive Analytics Company-wide 25% of organizations will have a Chief Data Officer by 2015. Executive-level Advisor On Data Analytics Executive Boardroom -- Gartner Blog Network © Copyright 2013 EMC Corporation. All rights reserved. 32
  • 33. Two New EMC Data Science Courses for Business Transformation Business Leaders New 90 min Introducing Data Science and Big Data Analytics for Business Transformation Heads of Data Science Teams New 1 day Data Science and Big Data Analytics for Business Transformation Aspiring Data Scientists 5 days Data Science and Big Data Analytics © Copyright 2013 EMC Corporation. All rights reserved. 33
  • 34. Closing Thoughts…. SPOTLIGHT ON BIG DATA Now You Know How To Develop Data Science Teams…What Next? • • • • • Determine How You Would Like To Develop Data Science Capabilities Hire People To Fill Out Your Data Science Team Consider Which Organizational Model Will Work Best For Your Situation Assess How Much Executive Engagement You Have Or Need Map Out Potential Projects -- Balance Quick Wins With Longer-term Wins © Copyright 2013 EMC Corporation. All rights reserved. 34
  • 35. Questions? Visit the EMC Global Services Booth, #110 Additional Resources: 1. EMC Education Services curriculum on Data Science and Big Data Analytics for Business Transformation: http://education.emc.com/guest/campaign/data_science.aspx 2. My Blog on Data Science & Big Data Analytics: http://infocus.emc.com/author/david_dietrich/ 3. Blog on applying Data Analytics Lifecycle to measuring innovation data: http://stevetodd.typepad.com/my_weblog/data-science-and-big-data-curriculum/ © Copyright 2013 EMC Corporation. All rights reserved. 35