Accelerating Machine Learning Strategy
Shioulin Sam
Cloudera Fast Forward Labs
© Cloudera, Inc. All rights reserved.
Machine Learning
Data Science
Analytics
Big Data
AI
There is a generic formulation of the problem, and then there is your problem.
There is no software that can solve your problem.
We believe technology and strategy are part of the same problem.
Cloudera Fast Forward Labs is your partner to create and execute on an excellent data strategy.
We accelerate your data and machine learning strategy with expert research and advising.
Academic Research
Startups
Enterprise
Research
Stay on top of emerging
ML technologies
Advising
Define data strategy
Evaluate ML capabilities
Application
Development
Feasibility studies:
Build an ML product
Fast Forward Lab Services
Gentle Introduction
Algorithm
Prototype
Commercial and Open Source Landscape
Ethics
Sci-fi short story
A Fast Forward Labs Report
Sample Use Cases
• Retail bank parsing customer service transcripts to better recommend actions
• Investment bank parsing news effectively for commodities traders
Sample Use Cases
• Telecom modeling reasons and remedies for customer churn
• Regulatory compliance and bias testing
Uncertainty
Fast Forward Labs cuts through the hype
Data Silos
Enterprise Data Hub provides a unified
foundation
Productivity
Data Science Workbench enables
collaborative self-serve
Machine Learning at Cloudera
THANK YOU

The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 

Accelerating Your Machine Learning & Data Strategy with Expert Advice

  • 1. Accelerating Machine Learning Strategy Shioulin Sam Cloudera Fast Forward Labs
  • 2. © Cloudera, Inc. All rights reserved. Machine Learning, Data Science, Analytics, Big Data, AI
  • 3. There is a generic formulation of the problem, and then there is your problem. There is no software that can solve your problem.
  • 4. We believe technology and strategy are part of the same problem.
  • 5. Cloudera Fast Forward Labs is your partner to create and execute on an excellent data strategy. We accelerate your data and machine learning strategy with expert research and advising.
  • 6. Academic Research, Startups, Enterprise
  • 7. Fast Forward Lab Services. Research: stay on top of emerging ML technologies. Advising: define data strategy, evaluate ML capabilities. Application Development: feasibility studies, build an ML product.
  • 8.
  • 9. A Fast Forward Labs Report: Gentle Introduction, Algorithm, Prototype, Commercial and Open Source Landscape, Ethics, Sci-fi short story
  • 10.
  • 11.
  • 12. Sample Use Cases: retail bank parsing customer service transcripts to better recommend actions; investment bank parsing news effectively for commodities traders
  • 13.
  • 14.
  • 15.
  • 16.
  • 17. Sample Use Cases: telecom modeling reasons and remedies for customer churn; regulatory compliance and bias testing
  • 18. Uncertainty: Fast Forward Labs cuts through the hype. Data Silos: Enterprise Data Hub provides a unified foundation. Productivity: Data Science Workbench enables collaborative self-serve Machine Learning at Cloudera
  • 19. THANK YOU

Editor's Notes

  1. Welcome to Cloudera Sessions! My name is JJ Sakey and I have the honor of taking us through today’s jam-packed program. By way of introduction, {tell us about what you do JJ}. We are all here today because we believe that data can make what is impossible today possible tomorrow. Certainly many of us in this room have already created board-level impacts to our business. We are going to have two customers speak to you today about their data journeys: Sentier, Amazon Web Services, and Altisource. There is no question that big data is improving insights into customers, it’s connecting products and services via IoT, and it’s protecting your business from cyber attacks and regulatory fines. But it’s also quietly having a major social impact that affects every one of us in this room. For example, 4 out of the 5 cancer research centers are using Cloudera to find a cure, and when we do, the odds are good that it will be done with our software and big data. Cloudera has partnered with many hospitals and saved hundreds of lives already by early detection of sepsis. Lastly, a topic near to my heart: we have partnered with several non-profit groups to detect early signs of suicide, especially for veterans, which is one of the leading causes of death. So, thank you for spending the day to talk about big data, which is having a profound effect on business and our lives. Let’s jump in and cover a couple of important logistics items first…
  2. What is artificial intelligence? What is machine learning? Popular media suggests that AI is all about recognizing pictures of cats and dogs, or machines beating humans in the game of Go. But when we peel away the hype, the machine learning of today is really a very smart pattern recognizer. How do we leverage this capability and turn it into a competitive advantage? The tech stack looks like this. First, there is data. Then we need the capability to build a basic understanding of the data (this is the analytics layer). At this layer, we are able to say things like, “the average age of my customer is 40”. Naturally, we would like our data to tell us more, and this is when we move into the data science layer. Both analytics and data science are built on top of the big data layer, but the data requirements for data science are tighter than those for analytics. At this layer, we start to focus more on data cleaning and prepping. Here, we are able to answer questions like “How much in sales do I expect to generate next year?” To answer more sophisticated questions, we move into the ML layer. The ML layer puts a lot of focus on algorithms and can only work if the organization has the know-how of the lower layers. Analytics: descriptive stats, visualizing data. Data Science: data cleaning, prep, analysis (forecasting). ML: algorithms.
  3. Machine learning will transform businesses. It is a huge opportunity for every company, but it is hard to execute on. Companies often do not know what questions to ask or what problems to focus on. Even if they have converged on a problem, they soon realize that … … and that no software can solve their problem. Successful data products are often a clever combination of known components, machine learning tools and algorithms, applied to a well-understood problem.
  4. Building the right data product requires both strategy and technology to be properly aligned. You cannot build a data product independently because business strategy dictates data availability. In many cases data opportunities require optimizing over both business needs and technological capability and can also require organizational transformation.
  5. And this is where we come in. Combined with Cloudera’s Enterprise Data Hub and Data Science Workbench, our goal is to accelerate machine learning in the enterprise, from research to production.
  6. We sit at the intersection of three entities. What we try to do is build a bridge between academic research and the enterprise: in a way, we extract and present information from academic research such that businesses can make use of it. We also intersect with startups because we find that they are a helpful window into what businesses are looking for. Our team lives at, and has experience at, the intersection of startup culture (agility, novelty, speed), academic research (where new algorithmic ideas come from), and the enterprise (opportunity to execute at scale, unique data). We’ve been doing this for 5 years (cf. Amazon, MSFT, Google, etc.). Academic research doesn’t focus on valuable business problems. Startups generally don’t invent new technology. Corporate R&D struggles to align with business priorities and execute effectively.
  7. What makes us unique? We engage in 3 ways: research subscription, advising and application development. We use research to help clients stay on top of emerging ML technologies. Every quarter we release a research report focusing on a new capability or breakthrough that we believe will become important in the next 6 months to 2 years. The second way we engage is through advising, where we help define data strategy and evaluate ML capabilities. Our research subscription comes with 4 hours of advising per month. This is tailored for each client, and every client utilizes this time differently. As an example, clients have used the time to do a deeper dive into our reports, to help identify data assets, to help guide their ML product development and to develop strategic and technical roadmaps. Lastly, for some clients with very specific projects in mind, but who are unsure of whether they have the resources to succeed, we help transform these science experiments into actual products by performing feasibility studies. The deliverables are proof-of-concept code and extensive documentation of what worked and what didn’t. In the end, clients get a piece of working code, tailored to their problem and data, that they own and can build on.
  8. Here are all the reports we have done in the past. In choosing a breakthrough topic, we use answers to the following questions as a guide: 1) Is it useful? 2) Can we build a prototype? 3) Is it timely? Purely algorithmic breakthroughs are not interesting to the business community unless they have specific applications. One way to ensure that is to filter for ones where a product prototype that depends on the algorithm can be built. Finally, the breakthrough has to be timely. It has to be more possible now than it was 1-2 years ago, and we expect it to be even more possible in another 1-2 years. We predict timeliness using two gauges: i) economic constraints and ii) commoditization of tools. A sudden lifting of economic constraints can make what were previously nice ideas practical, while commoditization of tooling makes it quicker to build things that were possible but difficult to get right and time-consuming. Deep learning’s acceleration by GPUs and by Keras/TensorFlow clearly illustrates both aspects.
     In our latest report on semantic recommendations, we look at the state of recommendation systems and their common pitfalls. Recommendation systems have been around for many years and businesses rely on them to surface interesting items for end-users. Unfortunately, classical recommendation systems do not understand what they are recommending. Things are recommended to you because others similar to you have liked them. In our report, we look at ways to inject the content of items into the system. When we do this, we are building a recommendation system that understands user preferences as they relate to item content. It turns out this technique also solves the cold start problem, a common problem in classical recommendation systems where the system doesn’t know how to generate recommendations for new items.
     In the interpretability report, we look at ways to understand and explain how a model makes decisions. Interpretability is important not just for regulatory reasons. Being able to explain why and how a model works can help us improve models and build a better product. Black box techniques like deep learning deliver breakthrough capabilities at the cost of interpretability; in this report, we show how to make models interpretable without sacrificing their capability or accuracy. If your model is accurate, but you have no idea how it works, what are you missing? Turns out quite a lot! It’s easier to improve an interpretable model. The ability to explain individual decisions to their subjects is intrinsically useful. People like to know why a model has treated them a certain way. And in many cases there’s an ethical and/or legal duty to ensure models are safe and non-discriminatory, which can only be done if they are interpretable. A paper published in 2016 made this report possible by releasing an algorithm called LIME to probe the inner workings of a black box model.
     Text summarization. This report looks at a specific and very practical problem: summarizing documents. We show how to do that using the latest and greatest ideas from deep learning and topic modeling. But because text summarization is just a special case of a much broader set of problems (how can we help computers work with natural language?), it’s a report with much wider implications for any of us who work with text, either consuming or generating it.
     Next, probabilistic programming. The conclusions you draw from imperfect or incomplete data are uncertain. This report is all about how you work with that. Academic statisticians have known how to deal with this uncertainty for a long time, but it’s only in the past few years that the algorithms have caught up with the scale of big data, and only very recently that tools have made these algorithms accessible.
     In our deep learning report, we look at how neural networks enable us to analyze images. We explain what neural networks are, and how we can apply deep learning today.
  9. In all our reports, we begin with a gentle introduction to the capability. We then move on to a rigorous but conceptual discussion of the state-of-the-art algorithm. We also describe the prototype and the process of building it. For clients who are interested in implementing the new capability, we dedicate a chapter to the commercial and open source landscape that will hopefully help with the buy-or-build decision. Because the focus is on business applications, each report also has a chapter on ethics. We close with a sci-fi short story, mostly to get readers to imagine, in a very unconstrained way, what the capability can do for their businesses.
  10. With all that in mind, let’s take a look at a couple of the reports. First, text summarization. This report looks at a specific and very practical problem: summarizing documents. We show how to do that using the latest and greatest ideas from deep learning and topic modeling. But because text summarization is just a special case of a much broader set of problems (how can we help computers work with natural language?), it’s a report with much wider implications for any of us who work with text, either consuming or generating it. How do you take a long document and make it shorter? More generally, how do you make language computable? We describe single and multiple document summarization using topic models (a mature, accessible approach) and language embeddings and recurrent neural networks (a cutting-edge deep learning approach).
  11. Next, let’s look at our interpretability report. In the interpretability report, we look at ways to understand and explain how a model makes decisions. Interpretability is important not just for regulatory reasons. Being able to explain why and how a model works can help us improve models and build a better product. Black box techniques like deep learning deliver breakthrough capabilities at the cost of interpretability; in this report, we show how to make models interpretable without sacrificing their capability or accuracy. If your model is accurate, but you have no idea how it works, what are you missing? Turns out quite a lot! It’s easier to improve an interpretable model. The ability to explain individual decisions to their subjects is intrinsically useful. People like to know why a model has treated them a certain way. And in many cases there’s an ethical and/or legal duty to ensure models are safe and non-discriminatory, which can only be done if they are interpretable. A paper published in 2016 made this report possible by releasing an algorithm called LIME to probe the inner workings of a black box model. Interpretable models are easier to improve. Regulators and society can better trust them to be safe and nondiscriminatory. They offer insights that can be used to change real-world outcomes for the better. We describe the Local Interpretable Model-Agnostic Explanation (LIME) algorithm.
  12. To illustrate the capability, we built a prototype where we model the likelihood of a customer churning. Without interpretability, all the model gives us is the probability that a customer will churn. As an example, we see here that customer ID 3676 has a 79% chance of churning.
  13. When we add interpretability to the model by using LIME, we are now able to see why a customer is assigned a particular churn probability. The factors are color coded: redder means that LIME has assigned higher importance to this factor. Using LIME, we are able to say that the 79% churn probability is mostly attributed to three factors: the customer has fiber, their contract is month-to-month, and the customer is new.
  14. Cloudera helps scale data science and ML: Cloudera accelerates machine learning in the enterprise, from research to production. We address uncertainty with FFL research and advising that cuts through the hype. We address data silo issues with our Enterprise Data Hub, which unifies collection, access and deployment with shared security and governance. Lastly, our Data Science Workbench makes collective, secure data science at scale a reality for the enterprise. SDX: shared data services (ALTUS). Cloudera Altus lets you automate massive-scale data engineering and analytic database compute workloads in your public cloud, without the headache of managing the infrastructure yourself. At the core of Altus is Cloudera's Shared Data Experience (SDX), which eliminates data silos with persistent metadata, security, and governance.
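
The explanation step described in notes 11 to 13 can be sketched roughly as follows. This is a minimal LIME-style local surrogate in plain Python. It is not the prototype from the report and not the API of any LIME library. The churn model `black_box_churn`, its weights, the feature names, and the kernel and ridge settings are all hypothetical choices made for illustration. The idea it shows: perturb one customer's features, query the opaque model, weight each perturbed sample by how close it stays to the original customer, and fit a weighted linear model whose coefficients serve as the explanation.

```python
import math
import random

# Hypothetical binary features for one customer (assumed, not from the report).
FEATURES = ["has_fiber", "month_to_month", "new_customer", "paperless_billing"]


def black_box_churn(x):
    """Hypothetical stand-in for an opaque churn model; returns P(churn)."""
    score = (-2.0 + 1.5 * x[0] + 1.2 * x[1] + 1.0 * x[2] + 0.1 * x[3]
             + 0.8 * x[0] * x[1])  # interaction term the explainer cannot see
    return 1.0 / (1.0 + math.exp(-score))


def solve(a, b):
    """Solve the linear system a @ w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w


def explain(instance, predict, n_samples=2000, kernel_width=1.0,
            ridge=1e-3, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = random.Random(seed)
    d = len(instance)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # keep[j] = 1 keeps the customer's value for feature j, 0 flips it.
        keep = [rng.randint(0, 1) for _ in range(d)]
        x = [v if k else 1 - v for v, k in zip(instance, keep)]
        dist_sq = sum(1 - k for k in keep)  # number of flipped features
        rows.append([1.0] + [float(k) for k in keep])  # intercept + indicators
        targets.append(predict(x))
        weights.append(math.exp(-dist_sq / kernel_width ** 2))
    # Weighted ridge regression via the normal equations (intercept unpenalized).
    p = d + 1
    a = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
          + (ridge if i == j and i > 0 else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * t for r, t, w in zip(rows, targets, weights))
         for i in range(p)]
    coef = solve(a, b)
    return dict(zip(FEATURES, coef[1:]))


if __name__ == "__main__":
    customer = [1, 1, 1, 1]
    print("P(churn) =", round(black_box_churn(customer), 3))
    for name, weight in sorted(explain(customer, black_box_churn).items(),
                               key=lambda kv: -abs(kv[1])):
        print(f"{name:18s} {weight:+.3f}")
```

Each returned weight estimates how much keeping that feature at the customer's own value moves the predicted churn probability near this customer. Ranking the weights by magnitude gives the kind of per-factor importance that the color coding in the prototype conveys. The attributions are local and descriptive: they say what the model responds to around this customer, not what causes churn.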