SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Data Scientist – The New Data Analyst
    Josh Wills, Senior Director of Data Science




1
About Me




2
What Do Data Scientists Do?




3
What I Think I Do




4
What Other People Think I Do




5
What I Actually Do




6
Think Like a Data Scientist




7
Solving Problems vs. Finding Insights




8
Parallelize Everything




9
Abundance vs. Scarcity




10
Building Data Products




11
Create a Data Science Team




12
Choose Good Problems




13
Design the Model




14
Mind the Gap




15
Amortize Costs




16
Measure Everything




17
Rinse and Repeat




18
Work Like a Data Scientist




19
Introduction to Data Science:
     Building Recommender Systems
       December 12-14, New York, NY
       http://university.cloudera.com/




20
Thank you!
Josh Wills, Director of Data Science, Cloudera   @josh_wills

Weitere ähnliche Inhalte

Was ist angesagt?

5. Workshop Responsible Data Science - Discussion on Transparency in data sci...
5. Workshop Responsible Data Science - Discussion on Transparency in data sci...5. Workshop Responsible Data Science - Discussion on Transparency in data sci...
5. Workshop Responsible Data Science - Discussion on Transparency in data sci...Jheronimus Academy of Data Science
 
Building an Insight Machine - Strata DDBD 2015
Building an Insight Machine - Strata DDBD 2015Building an Insight Machine - Strata DDBD 2015
Building an Insight Machine - Strata DDBD 2015Domino Data Lab
 
otto_presentation
otto_presentationotto_presentation
otto_presentationTyler Otto
 
2018 02 converged it
2018 02 converged it2018 02 converged it
2018 02 converged itChris Dwan
 
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...Data Con LA
 
Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...
Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...
Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...Data Con LA
 
Tableau - Make your SEO data work for you!
Tableau - Make your SEO data work for you!Tableau - Make your SEO data work for you!
Tableau - Make your SEO data work for you!Renco Smeding
 
How to Become a Data Science Company instead of a company with Data Scientist...
How to Become a Data Science Company instead of a company with Data Scientist...How to Become a Data Science Company instead of a company with Data Scientist...
How to Become a Data Science Company instead of a company with Data Scientist...Ruth Kearney
 
Data Science towards the Digital Enterprise
Data Science towards the Digital EnterpriseData Science towards the Digital Enterprise
Data Science towards the Digital EnterpriseJake Bouma
 
Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science teamAshish Bansal
 
Steeping into the shoes of data scientist
Steeping into the shoes of data scientistSteeping into the shoes of data scientist
Steeping into the shoes of data scientistPratyush Sunandan
 
Security Administration Vii 2 Statistical Analysis
Security Administration Vii 2 Statistical AnalysisSecurity Administration Vii 2 Statistical Analysis
Security Administration Vii 2 Statistical AnalysisCarter F. Smith, J.D., Ph.D.
 
Decoding Data Science
Decoding Data ScienceDecoding Data Science
Decoding Data ScienceMatt Fornito
 
Data driven omaha_12112014
Data driven omaha_12112014Data driven omaha_12112014
Data driven omaha_12112014Daniel Murray
 

Was ist angesagt? (20)

5. Workshop Responsible Data Science - Discussion on Transparency in data sci...
5. Workshop Responsible Data Science - Discussion on Transparency in data sci...5. Workshop Responsible Data Science - Discussion on Transparency in data sci...
5. Workshop Responsible Data Science - Discussion on Transparency in data sci...
 
Building an Insight Machine - Strata DDBD 2015
Building an Insight Machine - Strata DDBD 2015Building an Insight Machine - Strata DDBD 2015
Building an Insight Machine - Strata DDBD 2015
 
otto_presentation
otto_presentationotto_presentation
otto_presentation
 
2018 02 converged it
2018 02 converged it2018 02 converged it
2018 02 converged it
 
Wingu -- Pharma-in-a-Box
Wingu -- Pharma-in-a-BoxWingu -- Pharma-in-a-Box
Wingu -- Pharma-in-a-Box
 
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...
Big Data Day LA 2016/ Data Science Track - The Right Tool for the Job: Guidel...
 
Decision Analysis
Decision AnalysisDecision Analysis
Decision Analysis
 
Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...
Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...
Big Data Day LA 2015 - Data Science at Whisper - From content quality to pers...
 
Tableau - Make your SEO data work for you!
Tableau - Make your SEO data work for you!Tableau - Make your SEO data work for you!
Tableau - Make your SEO data work for you!
 
How to Become a Data Science Company instead of a company with Data Scientist...
How to Become a Data Science Company instead of a company with Data Scientist...How to Become a Data Science Company instead of a company with Data Scientist...
How to Become a Data Science Company instead of a company with Data Scientist...
 
Data Science towards the Digital Enterprise
Data Science towards the Digital EnterpriseData Science towards the Digital Enterprise
Data Science towards the Digital Enterprise
 
Publications
PublicationsPublications
Publications
 
Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science team
 
Steeping into the shoes of data scientist
Steeping into the shoes of data scientistSteeping into the shoes of data scientist
Steeping into the shoes of data scientist
 
Security Administration Vii 2 Statistical Analysis
Security Administration Vii 2 Statistical AnalysisSecurity Administration Vii 2 Statistical Analysis
Security Administration Vii 2 Statistical Analysis
 
Decoding Data Science
Decoding Data ScienceDecoding Data Science
Decoding Data Science
 
Data driven omaha_12112014
Data driven omaha_12112014Data driven omaha_12112014
Data driven omaha_12112014
 
S1240226 assigment c
S1240226 assigment cS1240226 assigment c
S1240226 assigment c
 
Beauty of data visualization
Beauty of data visualizationBeauty of data visualization
Beauty of data visualization
 
Digital Economics
Digital EconomicsDigital Economics
Digital Economics
 

Andere mochten auch

Design Thinking for Data Science #StrataHadoop
Design Thinking for Data Science #StrataHadoopDesign Thinking for Data Science #StrataHadoop
Design Thinking for Data Science #StrataHadoopIntuit Inc.
 
Cost of Quality How to Save Money
Cost of Quality How to Save MoneyCost of Quality How to Save Money
Cost of Quality How to Save MoneyIosif Itkin
 
5 Key Hacks for Breakthrough Innovation
5 Key Hacks for Breakthrough Innovation 5 Key Hacks for Breakthrough Innovation
5 Key Hacks for Breakthrough Innovation Amy Jo Kim
 
Be Data Informed Without Being a Data Scientist
Be Data Informed Without Being a Data ScientistBe Data Informed Without Being a Data Scientist
Be Data Informed Without Being a Data ScientistPamela Pavliscak
 
Impact Hiring: How Data Will Transform Youth Employment
Impact Hiring: How Data Will Transform Youth EmploymentImpact Hiring: How Data Will Transform Youth Employment
Impact Hiring: How Data Will Transform Youth EmploymentThe Rockefeller Foundation
 
Statistical quality control
Statistical quality controlStatistical quality control
Statistical quality controlAnubhav Grover
 
The Personality of a Scientist
The Personality of a Scientist The Personality of a Scientist
The Personality of a Scientist Kelly Services
 
Project quality management - PMI PMBOK Knowledge Area
Project quality management - PMI PMBOK Knowledge AreaProject quality management - PMI PMBOK Knowledge Area
Project quality management - PMI PMBOK Knowledge AreaImran Jamil
 
Statistical Quality Control.
Statistical Quality Control.Statistical Quality Control.
Statistical Quality Control.Raviraj Jadeja
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientistryanorban
 
SXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsSXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsOgilvy Consulting
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerLuminary Labs
 
Visual Design with Data
Visual Design with DataVisual Design with Data
Visual Design with DataSeth Familian
 

Andere mochten auch (19)

Datascience
DatascienceDatascience
Datascience
 
Design Thinking for Data Science #StrataHadoop
Design Thinking for Data Science #StrataHadoopDesign Thinking for Data Science #StrataHadoop
Design Thinking for Data Science #StrataHadoop
 
Data Analyst Role
Data Analyst RoleData Analyst Role
Data Analyst Role
 
Cost of Quality How to Save Money
Cost of Quality How to Save MoneyCost of Quality How to Save Money
Cost of Quality How to Save Money
 
5 Key Hacks for Breakthrough Innovation
5 Key Hacks for Breakthrough Innovation 5 Key Hacks for Breakthrough Innovation
5 Key Hacks for Breakthrough Innovation
 
Strategic Cost Management
Strategic Cost ManagementStrategic Cost Management
Strategic Cost Management
 
Be Data Informed Without Being a Data Scientist
Be Data Informed Without Being a Data ScientistBe Data Informed Without Being a Data Scientist
Be Data Informed Without Being a Data Scientist
 
Impact Hiring: How Data Will Transform Youth Employment
Impact Hiring: How Data Will Transform Youth EmploymentImpact Hiring: How Data Will Transform Youth Employment
Impact Hiring: How Data Will Transform Youth Employment
 
Statistical quality control
Statistical quality controlStatistical quality control
Statistical quality control
 
Quality by Design
Quality by DesignQuality by Design
Quality by Design
 
Quality
QualityQuality
Quality
 
The Personality of a Scientist
The Personality of a Scientist The Personality of a Scientist
The Personality of a Scientist
 
Project quality management - PMI PMBOK Knowledge Area
Project quality management - PMI PMBOK Knowledge AreaProject quality management - PMI PMBOK Knowledge Area
Project quality management - PMI PMBOK Knowledge Area
 
Statistical Quality Control.
Statistical Quality Control.Statistical Quality Control.
Statistical Quality Control.
 
Creativity & innovation
Creativity & innovationCreativity & innovation
Creativity & innovation
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientist
 
SXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsSXSW 2016: The Need To Knows
SXSW 2016: The Need To Knows
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 
Visual Design with Data
Visual Design with DataVisual Design with Data
Visual Design with Data
 

Ähnlich wie Data Scientist - The New Data Analyst Role

Big Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DBig Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DUniversity of Washington
 
Innovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringerInnovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringerMicrosoft
 
Data_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfData_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfvishal choudhary
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxssuser1a4f0f
 
How Your Data Can Predict The Future
How Your Data Can Predict The FutureHow Your Data Can Predict The Future
How Your Data Can Predict The FutureBecky Wang
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxwahiba ben abdessalem
 
Leland Lockhart - SXSW Intro to Data Science
Leland Lockhart -  SXSW Intro to Data ScienceLeland Lockhart -  SXSW Intro to Data Science
Leland Lockhart - SXSW Intro to Data ScienceLeland Lockhart, PhD
 
Design and Data Processes  Unified -  3rd Corner View
Design and Data Processes  Unified -  3rd Corner ViewDesign and Data Processes  Unified -  3rd Corner View
Design and Data Processes  Unified -  3rd Corner ViewJulian Jordan
 
Making an impact with data science
Making an impact  with data scienceMaking an impact  with data science
Making an impact with data scienceJordan Engbers
 
Accretive Health - Quality Management in Health Care
Accretive Health - Quality Management in Health CareAccretive Health - Quality Management in Health Care
Accretive Health - Quality Management in Health CareAccretiveHealth
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptxshalini s
 
intro to data science Clustering and visualization of data science subfields ...
intro to data science Clustering and visualization of data science subfields ...intro to data science Clustering and visualization of data science subfields ...
intro to data science Clustering and visualization of data science subfields ...jybufgofasfbkpoovh
 
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...Anita de Waard
 
What's the profile of a data scientist?
What's the profile of a data scientist? What's the profile of a data scientist?
What's the profile of a data scientist? BICC Thomas More
 

Ähnlich wie Data Scientist - The New Data Analyst Role (20)

50 Years of Data Science
50 Years of Data Science50 Years of Data Science
50 Years of Data Science
 
Data Science
Data ScienceData Science
Data Science
 
Big Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&DBig Data Talent in Academic and Industry R&D
Big Data Talent in Academic and Industry R&D
 
Innovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringerInnovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringer
 
Vikrant data scientist
Vikrant data scientistVikrant data scientist
Vikrant data scientist
 
Data_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfData_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdf
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
How Your Data Can Predict The Future
How Your Data Can Predict The FutureHow Your Data Can Predict The Future
How Your Data Can Predict The Future
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
Leland Lockhart - SXSW Intro to Data Science
Leland Lockhart -  SXSW Intro to Data ScienceLeland Lockhart -  SXSW Intro to Data Science
Leland Lockhart - SXSW Intro to Data Science
 
Design and Data Processes  Unified -  3rd Corner View
Design and Data Processes  Unified -  3rd Corner ViewDesign and Data Processes  Unified -  3rd Corner View
Design and Data Processes  Unified -  3rd Corner View
 
Making an impact with data science
Making an impact  with data scienceMaking an impact  with data science
Making an impact with data science
 
Accretive Health - Quality Management in Health Care
Accretive Health - Quality Management in Health CareAccretive Health - Quality Management in Health Care
Accretive Health - Quality Management in Health Care
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptx
 
intro to data science Clustering and visualization of data science subfields ...
intro to data science Clustering and visualization of data science subfields ...intro to data science Clustering and visualization of data science subfields ...
intro to data science Clustering and visualization of data science subfields ...
 
Intro to Data Science Concepts
Intro to Data Science ConceptsIntro to Data Science Concepts
Intro to Data Science Concepts
 
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
 
Lecture_1_Intro_toDS&AI.pptx
Lecture_1_Intro_toDS&AI.pptxLecture_1_Intro_toDS&AI.pptx
Lecture_1_Intro_toDS&AI.pptx
 
2011 ATE Conference PreConference Workshop C
2011 ATE Conference PreConference Workshop C2011 ATE Conference PreConference Workshop C
2011 ATE Conference PreConference Workshop C
 
What's the profile of a data scientist?
What's the profile of a data scientist? What's the profile of a data scientist?
What's the profile of a data scientist?
 

Mehr von Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Mehr von Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Kürzlich hochgeladen

Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 

Kürzlich hochgeladen (20)

Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 

Data Scientist - The New Data Analyst Role

Hinweis der Redaktion

  1. 90% of industrial machine learning is feature extraction. What I really do is ETL.
  2. Drive revenue and new business. Data is too important to be left to business people. The data warehouse is where data goes to die-- to be used in operational reporting or diagnosing problems. When we’re talking about data products, we’re talking about creating new revenue streams, optimizing existing ones, and solving problems for customers and for the business.
  3. 1) Means I collect everything– I don’t want to waste time getting data from operational systems every time I need something new. 2) Means I keep all of the phases of data available to me– from the raw stuff, to cleansed stuff, to joined stuff. 3) Implications for denormalization-- 1) we go beyond dimensional modeling to full on denormalization, usually along the lines of one of our conformed dimensions (product, customer, etc.)
  4. Similar to the EDW team. For a small datawarehouse/datamart, the DW architect is the ETL developer, the DBA, the dashboard builder, and the business analyst all rolled in to one. When we are talking about data products– classifiers, recommenders, interactive or real-time data tools, we need to bring in the ability to take things to production.
  5. Most important decision: the metrics you’re going to use to measure performance. It is an anti-pattern to solve a problem exactly once. You should either solve a problem 0 times or N times.
  6. Time is money. Your time costs a lot more than the cost of data storage. Data acquisition, data processing, reuse code. All things you do to save money over the long term.