Moving Data Science from an Event to a Program
Wayne Applebaum, Ph.D.
What Gartner Sees
“… by 2017, 33 percent of Fortune 100 organizations will experience an information crisis, due to their inability to effectively value, govern and trust their enterprise information.”
Gartner Press Release, February 27, 2014
How it should work
Business processes and business decisions bracket a robust infrastructure of data tools and processes.
[Diagram labels: Transactional Information/Other Data; Load Quality Processes; Target data store; Measures Analytics Tools; Business Decisions; Business Processes]
How it usually works
Silos of data that are difficult to put together
“Those who don't know history are destined to repeat it.” - Edmund Burke
Here’s a Data Scientist Viewpoint
• Identifying Data Sources
• Data Correctness
• Data Quality
• Business Involvement
• Multiple Sources
• Data Governance
• Flexibility
• Takes up 80% of their time
Why the Problem is Getting Worse
• The use of data and the value placed on it are increasing
• More decisions are being made in the same amount of time
• Answers aren’t in the silos; you need to cross the silos to get them
• Business demand for information-based decisions is not discussed in the popular media
Pressure and opportunity for data and analytics are rising
Reuse is becoming a business necessity
Emergence of Business Decision Data
[Diagram labels: Data; Business Decision Data; Master; Transactional]
While the basic rules of Data Governance remain the same, the scope is expanding.
Transactions Vs. Decisions
Transactions: Process each transaction as quickly as possible
Decisions: Consolidate information to make the correct decision as quickly as possible
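The contrast can be pictured with a minimal sketch. The record layout and the function names `process_transaction` and `consolidate_for_decision` are hypothetical, chosen only for illustration; they are not from the talk.

```python
from collections import defaultdict

# Hypothetical transaction records; the field names are illustrative only.
transactions = [
    {"customer": "C1", "amount": 120.0},
    {"customer": "C2", "amount": 40.0},
    {"customer": "C1", "amount": 60.0},
]


def process_transaction(txn, ledger):
    """Transactional view: handle one record at a time, as quickly as possible."""
    ledger.append(txn)
    return len(ledger)


def consolidate_for_decision(ledger):
    """Decision view: consolidate many records into one total per customer."""
    totals = defaultdict(float)
    for txn in ledger:
        totals[txn["customer"]] += txn["amount"]
    return dict(totals)


ledger = []
for txn in transactions:
    process_transaction(txn, ledger)

summary = consolidate_for_decision(ledger)
print(summary)
```

The first function never looks beyond the record in hand; the second is only useful once the records have been brought together.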
The Data Governance No-Free-Lunch Rule
When it comes to integrating data sources, there is no free lunch.
You have to understand the data and context to be able to make decisions.
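A small sketch of why context matters when combining sources. Everything here is an assumption made for illustration: two hypothetical silos both expose a field named `revenue`, but one reports dollars and the other thousands of dollars.

```python
# Two hypothetical silos that both expose a field called "revenue".
# Assumption for illustration: sales reports dollars, finance reports thousands of dollars.
sales = {"region": "East", "revenue": 250_000}
finance = {"region": "West", "revenue": 300}

# Context that someone has to supply; it cannot be recovered from the values alone.
UNIT_MULTIPLIER = {"sales": 1, "finance": 1_000}


def to_dollars(record, source):
    """Convert a record's revenue to dollars using the documented unit for its source."""
    return record["revenue"] * UNIT_MULTIPLIER[source]


# Adding the raw fields mixes units and silently produces a meaningless number.
naive_total = sales["revenue"] + finance["revenue"]

# Applying the known context first gives a total that can support a decision.
governed_total = to_dollars(sales, "sales") + to_dollars(finance, "finance")

print(naive_total, governed_total)
```

The naive sum runs without error, which is the point: nothing in the data itself signals that the two fields mean different things.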
Creating the Data Hub: Overview
• Scope
• Identifying Key Objects/Values
• Creating the Controlled Vocabulary
• Object Mapping
• Creating the Canonical/Target Model
• Creating Rules and Standards
• Implementing Data Retrieval
• Creating the User Interface
• Developing Load Procedures
• Architecture Decisions: Ingestion, Database, Data Governance, Retrieval
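Several of the steps above (controlled vocabulary, object mapping, canonical model, load procedure, retrieval) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the approach described in the talk: the source names `crm` and `billing`, the field names, and the `Customer` model are all hypothetical.

```python
from dataclasses import dataclass

# Controlled vocabulary: one agreed term per concept (hypothetical terms).
CONTROLLED_VOCABULARY = {"customer_id", "customer_name"}

# Object mapping: each silo's field names mapped onto the controlled vocabulary.
OBJECT_MAPPING = {
    "crm": {"cust_no": "customer_id", "full_name": "customer_name"},
    "billing": {"acct": "customer_id", "name": "customer_name"},
}


@dataclass(frozen=True)
class Customer:
    """Canonical/target model shared by every source."""
    customer_id: str
    customer_name: str


def load(source, record):
    """Load procedure: rename fields, enforce one rule, return a canonical object."""
    mapping = OBJECT_MAPPING[source]
    # Keep only mapped fields; unmapped fields such as "balance" are dropped.
    canonical = {mapping[k]: v for k, v in record.items() if k in mapping}
    missing = CONTROLLED_VOCABULARY - set(canonical)
    if missing:
        raise ValueError(f"{source} record is missing {sorted(missing)}")
    return Customer(**canonical)


# The hub itself: canonical objects keyed by the agreed identifier.
hub = {}
for source, record in [
    ("crm", {"cust_no": "42", "full_name": "Ada Lovelace"}),
    ("billing", {"acct": "43", "name": "Alan Turing", "balance": 10}),
]:
    customer = load(source, record)
    hub[customer.customer_id] = customer


def retrieve(customer_id):
    """Data retrieval: callers see only the canonical model, never silo field names."""
    return hub[customer_id]


print(retrieve("42"))
```

The mapping table is where the understanding of each source is recorded; the load rule rejects records that cannot be expressed in the agreed vocabulary rather than loading them partially.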
Where do we go from here?
• Implement Data Governance early
• Integrate Data Governance across silos
• Recognize that Data Governance doesn’t end with Master Data
• Big Data represents a new challenge because the meaning of a transaction is no longer defined on entry
• Create the governance and structures to support both transactions and decisions
• Consider Data Hubs for cross-silo integrations
Governance is essential for reuse, and reuse is essential to maximize value

Moving Data Science from an Event to A Program: Considerations in Creating Sustainable and Reusable Data Sources
