SlideShare a Scribd company logo
1 of 16
Data Science with Teams
Improve the efficiency of your data science teams
with platforms that enhance collaboration and
flexibility
Agenda
● Some Background
● Goals
● Data Science Project Teams
● Challenges
● Some Solutions
● Conclusions
Background
Integration experience with Oil & Gas, Financial, Insurance and Retail industries in
multiple geographies
What did these customers have in common? All had data science teams that
worked in Silos
Difficulties when taking a data science
course
Source: Wikipedia
Background (cont)
What’s going on here?
We started to do some digging!
Source: http://cesarsway.com
Data Teams - The Old Way
Department Teams Data Scientist
Data Analyst IT Manager
The Analytics Deliverable
A dashboard! An interactive dashboard is even cooler.
Conway’s Data Scientist Venn Diagram
Data Science Teams - The New Way
Data ScientistFinance Manager
Accountant
Tax and Compliance
Treasury
Data Gurus:
- Analytics
- Data Engineers
- Business Intelligence
- Compliance
IT Manager
The Data Science Deliverable
A machine or deep learning model!
I Want GPUs - And I Just Want Them to Work
Work around for NVidia Docker Wrapper:
- nvidia-docker -d -p 8888:8888 tensorflow/tensorflow:latest-gpu
OR
- docker run -ti --rm `curl -s http://localhost:3476/docker/cli`
tensorflow/tensorflow:latest-gpu
OR
- docker run -ti --rm --volume-driver=nvidia-docker --
volume=nvidia_driver_375.82:/usr/local/nvidia:ro --device=/dev/nvidiactl --
device=/dev/nvidia-uvm --device=/dev/nvidia0 nvidia/cuda nvidia-smi
The Need for DevOps Chops
Registrator
Docker
Container
EC2 Instance
Reverse Proxy with consul-template
The old way... The new way...
Docker
Container
EC2 Instance
Reverse Proxy with static upstream location
name
$$$ $
Infrastructure Management
Uh-oh, someone has to manage this stuff!
Our Architecture - API First and Microservices
3Blades Hub for Data Scientists
Solutions
Provide flexibility with the tools that data scientists use for exploratory
data analysis and visualizations
One central source for project files with support for version control
Share visualizations from EDA
Train and save Machine Learning and Deep Learning models with
multiple frameworks, from within the same project
Streamline deployment pipelines
Thank You!!
Email: hello@3blades.io
Web: https://3blades.io
Twitter: @3bladesio
GitHub: https://github.com/3blades
Email: gwerner@3blades.io
Twitter: @gwerner
LinkedIn: https://www.linkedin.com/in/wernergreg
GitHub: https://github.com/jgwerner

More Related Content

Viewers also liked

Ashfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataAshfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataMLconf
 
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017MLconf
 
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017MLconf
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017MLconf
 
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...MLconf
 
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017MLconf
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017MLconf
 
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017MLconf
 
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...MLconf
 
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...MLconf
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017MLconf
 

Viewers also liked (11)

Ashfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, PepperdataAshfaq Munshi, ML7 Fellow, Pepperdata
Ashfaq Munshi, ML7 Fellow, Pepperdata
 
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
 
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
 
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
 
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
 
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
 
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
Dr. June Andrews, Principal Data Scientist, Wise.io, From GE Digital at MLcon...
 
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
Anima Anadkumar, Principal Scientist, Amazon Web Services, Endowed Professor,...
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
 

More from MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Recently uploaded

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 

Recently uploaded (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Greg Werner, CEO & Founder, 3Blades.io at MLconf ATL 2017

  • 1. Data Science with Teams Improve the efficiency of your data science teams with platforms that enhance collaboration and flexibility
  • 2. Agenda ● Some Background ● Goals ● Data Science Project Teams ● Challenges ● Some Solutions ● Conclusions
  • 3. Background Integration experience with Oil & Gas, Financial, Insurance and Retail industries in multiple geographies What did these customers have in common? All had data science teams that worked in Silos Difficulties when taking a data science course Source: Wikipedia
  • 4. Background (cont) What’s going on here? We started to do some digging! Source: http://cesarsway.com
  • 5. Data Teams - The Old Way Department Teams Data Scientist Data Analyst IT Manager
  • 6. The Analytics Deliverable A dashboard! An interactive dashboard is even cooler.
  • 8. Data Science Teams - The New Way Data ScientistFinance Manager Accountant Tax and Compliance Treasury Data Gurus: - Analytics - Data Engineers - Business Intelligence - Compliance IT Manager
  • 9. The Data Science Deliverable A machine or deep learning model!
  • 10. I Want GPUs - And I Just Want Them to Work Work around for NVidia Docker Wrapper: - nvidia-docker -d -p 8888:8888 tensorflow/tensorflow:latest-gpu OR - docker run -ti --rm `curl -s http://localhost:3476/docker/cli` tensorflow/tensorflow:latest-gpu OR - docker run -ti --rm --volume-driver=nvidia-docker -- volume=nvidia_driver_375.82:/usr/local/nvidia:ro --device=/dev/nvidiactl -- device=/dev/nvidia-uvm --device=/dev/nvidia0 nvidia/cuda nvidia-smi
  • 11. The Need for DevOps Chops Registrator Docker Container EC2 Instance Reverse Proxy with consul-template The old way... The new way... Docker Container EC2 Instance Reverse Proxy with static upstream location name $$$ $
  • 12. Infrastructure Management Uh-oh, someone has to manage this stuff!
  • 13. Our Architecture - API First and Microservices
  • 14. 3Blades Hub for Data Scientists
  • 15. Solutions Provide flexibility with the tools that data scientists use for exploratory data analysis and visualizations One central source for project files with support for version control Share visualizations from EDA Train and save Machine Learning and Deep Learning models with multiple frameworks, from within the same project Streamline deployment pipelines
  • 16. Thank You!! Email: hello@3blades.io Web: https://3blades.io Twitter: @3bladesio GitHub: https://github.com/3blades Email: gwerner@3blades.io Twitter: @gwerner LinkedIn: https://www.linkedin.com/in/wernergreg GitHub: https://github.com/jgwerner

Editor's Notes

  1. Talking points: Siloed data initiatives is a common denominator Data scientists were segregated from the rest of the organization Tooling was disparate Initially, the need to streamline Jupyter Notebook deployments with a class of students came up after many students were complaining about the time and effort involved in using specific dependencies to complete their tasks. Package managers were not enough: users also needed an integrated solution to access a consistent and reliable solution to complete their assignments using Jupyter Notebooks. We also noticed that companies, in general, did not provide a homogeneous environment for their data science teams. This led to many headaches but was considered business as usual.
  2. Talking points: Issues encountered with the education vertical were common across industry, i.e. too much time spent con configuration Basic ROI calculations justified the implementation and support of a data science hub Data science platform “a ha” moment came when pitching a solution to consolidate project workspace environments for different people across different organizations, in particular for Exploratory Data Analysis (EDA). Educational institutions are usually constrained by budget requirements, however, after providing ROI numbers on how much time and effort Teachers Assistants (TA’s) spent on providing technical support for their users, the decision to implement a data science platform was a no brainer. Nevertheless, we had the suspicion that the enterprise (SMB’s and large enterprises alike) were encountering the same challenges but were exacerbated due to the fact that more personas were involved within data science and analytics teams.
  3. Talking points: Disparate teams Data scientists siloed from the rest of the organization Ultimate goal is to automate certain processes within the organization Automation helps improve the top and bottom lines, improves competitiveness Organizations struggle to become ‘data driven’. What does that mean? Data driven organizations are those that wish to use the data they have available to improve insights and allow their business to become more competitive. Assuming the organization has successfully consolidated their data into central data warehouses or data lakes, and assuming this data is defined with standard schemas, data science and data analytics teams have the power to analyze the data, obtain valuable insights and start improving the agility of their organizations with ‘prescriptive analytics’ and ‘predictive analytics’. Prescriptive analytics involves creating machine learning and deep learning models that automate certain business processes, such as: Automatically tag images with classification types (cat or not a cat) Automatically classifying a customer with the probability that the customer will churn Recommendations for value added products to improve the checkout dollar amount at an e-commerce site Spam or not spam However, organizations have struggled to integrate data scientists into their organizations. Data science teams that just ‘do the math’ and create visualizations on an organization's data sets to not provide much value in and of itself. Creating a machine learning model that automatically recommends a product that is not strategic to the organization does not provide much value.
  4. Talking points: Dashboards democratize data so that team members can quickly absorb meaningful insights and key performance indicators. Exploratory data analysis (EDA) and model creation/deployment not really a part of the picture. Traditional Business Intelligence tools have been around for years. Some tools offer specific integrations into a variety of data sources and allow users to quickly create rich and interactive visualizations of their data. SQL, a language made popular by relational databases, is a very popular language for analytics. New developments help accelerate the time from data source to dashboard with in memory calculations, GPU powered databases, among others. Big data tools, such as Hadoop and Spark allow users to create dashboards from large data sets. However, BI tools rely traditionally on structured data. Also, traditional dashboards don’t take into account how to create machine learning and deep learning models.
  5. Just a review of a Data Scientist’s skill set.
  6. Talking points: Organizations realize they need to automate their processes and that automation must come from real time analysis of data points The deliverable is not just a BI dashboard anymore, the deliverable is a deployable machine learning and deep learning model Embedding a data science team member into the group increases value As mentioned, historically data science teams have been isolated from the rest of the organization. Successful data driven organizations embed their data scientists into various business groups. For example: data extraction and loading into a warehouse table are done by engineering teams, however, a data science liaison, embedded within a certain department or relevant company wide project, can help data engineers improve the schema definition for the data being exposed which could save valuable time during the exploratory data analysis phase. Data engineers can create tables using their favorite Extract Transform and Load (ETL) tools to remove not-a-number (NaN) rows, remove columns that are irrelevant such as data base PK/FK’s, etc. Inversely, the data scientist could help the person telling the data story (could be anyone in the group, including herself) what features are relevant, how the certain normalizations were completed without delving into the technical details, etc. “This was the only customer that bought a widget in Atlanta so the attributes for this person were adjusted to not skew the dataset in their favor”.
  7. Talking points: Move from prescriptive to predictive analytics Deliver a machine learning or deep learning model that will allow organizations to automate processes Visualizations are still important, but used for telling a data story for EDA and also for visualizing how models are behaving in real time Predictive analytics looks at the historical trends in data to provide insights. Organization members are then tasked to optimize processes to improve organizational results based on trends. However, companies need to automate tasks (remove the human from the actual task execution) based on certain indicators. In this case, visualizations are used in EDA to better understand the data with the goal of creating and deploying machine learning and deep learning models that can automate certain organization processes.
  8. Talking points: Support data source imports from multiple sources EDA needed as first step to build and deliver artifacts to automate business processes. Artifacts in this context are machine learning and deep learning models. Data engineers and DevOps need access to data science hub to streamline their own processes Traditional teams use Excel spread sheets, among other tools, and are flying back and forth with emails, chat applications or external project management solutions. Even if all users work within shared environments such as Google Docs or Office 365, teams had no way of sharing all files and tools within one common environment, particularly for exploratory data analysis, since viewing and editing files within these environments are constrained to a certain set of file formats. Nevertheless, certain organizations and individuals prefer one language over the other. For example, a data science team involved with the Finance department may be more involved with using the R programming language, and the data science team involved with the marketing department may be more involved with the Python programming environment. In both cases, users may use multiple tools for one language. For example, some individuals may prefer RStudio for R, and others amy prefer using R with Jupyter Notebooks. Server management is important to optimize compute resources.
  9. Traditional teams use Excel spread sheets, among other tools, and are flying back and forth with emails, chat applications or external project management solutions. Even if all users work within shared environments such as Google Docs or Office 365, teams had no way of sharing all files and tools within one common environment, particularly for exploratory data analysis, since viewing and editing files within these environments are constrained to a certain set of file formats. Nevertheless, certain organizations and individuals prefer one language over the other. For example, a data science team involved with the Finance department may be more involved with using the R programming language, and the data science team involved with the marketing department may be more involved with the Python programming environment. In both cases, users may use multiple tools for one language. For example, some individuals may prefer RStudio for R, and others amy prefer using R with Jupyter Notebooks. A central source for project files alleviates compliance requirements. Usually, data engineers (either due to security requirements or simply that they don’t want to surface multiple schemas/formats for different users) would rather deliver the data product to a ‘clean’ table, so data scientists can do their work using self service approaches. Having a centrally managed set of files for specific projects also helps keeps things organized when different users are accessing project files, so version control becomes important as well.