Scikit-Learn
(or why I joined an open source software project)

Gilles Louppe
Dept. of EE & CS, & GIGA-R
Université de Liège, Belgium

October 30, 2013
Publishing scientific software matters 1

Software is a central part of modern scientific discovery.
Software developed in one field can often be applied to
advance a different field if the underlying mathematics is
common.
The public availability of code is a cornerstone of the
scientific method.

1. Pradal C. et al, Publishing scientific software matters, 2013.
if it’s not open and verifiable by others, it’s not science, or
engineering, or whatever it is you call what we do 2

2. V. Stodden, The scientific method in practice.
As a young PhD student full of illusions...

I wanted to write useful scientific software, for me and others
Leverage existing software

... but I didn’t want to reinvent the wheel!
... and then I joined an OSS project

An open source Machine Learning library in Python
Classical and well-established algorithms
- Supervised and unsupervised algorithms
- Model evaluation and selection
- Data processing and feature engineering
Collaborative development
Software quality matters

Peer-reviewed and well-tested code
Simple and consistent API

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
Simple and consistent API

from sklearn.svm import SVC
clf = SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
Simple and consistent API

from sklearn.linear_model import LassoCV
clf = LassoCV()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
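
Why a shared fit/predict convention pays off can be sketched without scikit-learn at all. The two toy classes below (MajorityClassifier and NearestNeighborClassifier) and the accuracy helper are hypothetical names invented for this illustration, not part of the library; the only thing they borrow from the slides is the convention that every estimator exposes fit(X, y) and predict(X). Under that assumption, one evaluation function works for any of them:

```python
# Toy estimators (hypothetical, NOT part of scikit-learn) that follow the
# same fit/predict convention shown on the slides.
from collections import Counter


class MajorityClassifier:
    """Always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestNeighborClassifier:
    """Predicts the label of the closest training point (1-nearest neighbor)."""

    def fit(self, X, y):
        self.X_ = [tuple(row) for row in X]
        self.y_ = list(y)
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

        predictions = []
        for row in X:
            # Index of the training point with the smallest squared distance.
            nearest = min(range(len(self.X_)), key=lambda i: sq_dist(self.X_[i], row))
            predictions.append(self.y_[nearest])
        return predictions


def accuracy(estimator, X_train, y_train, X_test, y_test):
    """Fit any object exposing fit/predict and score it on held-out data."""
    y_pred = estimator.fit(X_train, y_train).predict(X_test)
    return sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)


# Tiny made-up dataset, for illustration only.
X_train = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.2, 0.8]]
y_train = [0, 0, 1, 1, 1]
X_test = [[0.0, 0.1], [1.1, 0.9]]
y_test = [0, 1]

# The same evaluation code runs unchanged for every estimator.
results = {
    type(est).__name__: accuracy(est, X_train, y_train, X_test, y_test)
    for est in (MajorityClassifier(), NearestNeighborClassifier())
}
print(results)
```

Because accuracy only relies on fit and predict, swapping in another estimator requires changing a single line, which is the same property the three slides above show for RandomForestClassifier, SVC and LassoCV.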
Side effect 1: Learn and improve your skills

Strict programming practices
Software management (release cycle, Git, etc.)
Teamwork
Side effect 2: People might start using your software
In research

In industry
Side effect 3: You get to meet interesting people

(and eat pizzas!)
Start with small contributions...
Publish and share your research code

Join an open source software project
Questions?
