We have open-sourced rampwf, the toolkit behind www.ramp.studio, along with all the starting kits. You can use rampwf to build your own predictive workflows, organize workflow building and optimization within an internal data science team, and submit your kit to us if you want to run a code-submission data challenge.
https://www.ramp.studio
https://github.com/paris-saclay-cds/ramp-workflow
https://github.com/ramp-kits
https://ramp-studio.slack.com
7. Center for Data Science
Paris-Saclay
B. Kégl (CNRS)
[Figure labels: what you achieved with a well-tuned deep net; the diversity gap; the human blender gap; competitive phase; collaborative phase]
THE POWER OF THE (COLLABORATING) CROWD
8.
OPEN PHASE LETS PARTICIPANTS CATCH UP
THE GOAL OF TEACHING
9.
COMMUNICATION AND REUSE
14. • toolkit: https://github.com/paris-saclay-cds/ramp-workflow
• for designing workflows
• set of ready-made metrics, workflows, CV schemes, data readers
• a single command-line test script
• examples: https://github.com/ramp-kits
• a zoo of problems, experiments, workflows
• (at least) one initial solution
RAMP-WORKFLOW & RAMP-KITS
15.
CLASSIFYING AND REGRESSING ON MOLECULAR SPECTRA
[Diagram: chemotherapy drug in elastic pocket → laser → spectrometer → molecular spectra → feature extractors 1 and 2 → classifier (drug type) and regressor (concentration)]
16.
FORECASTING EL NINO SIX MONTHS AHEAD
[Diagram: time series (… 300.14 299.83 298.76 299.87 299.82 300.15 300.10 299.50 …) → time series feature extractor → x (a fixed length feature vector) → regressor]
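The feature extraction step on this slide can be sketched in plain Python. This is a minimal illustration, not the starting kit's code: the function name `extract_features`, the window length, and the choice of features (the most recent values plus their mean) are assumptions made for the example.

```python
from statistics import mean


def extract_features(series, t, window=4):
    """Turn the history series[:t + 1] into a fixed-length feature vector.

    Hypothetical helper: keeps the last `window` values and appends their
    mean, so the output always has window + 1 entries, whatever t is.
    """
    if t + 1 < window:
        raise ValueError("not enough history for the requested window")
    recent = series[t + 1 - window : t + 1]
    return list(recent) + [mean(recent)]


# Values copied from the slide.
temperatures = [300.14, 299.83, 298.76, 299.87, 299.82, 300.15, 300.10, 299.50]

x = extract_features(temperatures, t=5, window=4)
print(len(x))  # 5, i.e. window + 1
```

Because the vector length does not depend on how much history is available, any standard regressor can consume it to predict the value six months ahead.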
17.
A SINGLE SCRIPT TO DEFINE THE BUNDLE
[Diagram labels: data connectors; X; workflow (FE, CLF); y_pred; score type; score; cross-validation scheme]
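As a rough illustration of what such a bundle contains, the sketch below wires together the pieces named on the slide in plain Python. It is not the rampwf API: `read_data`, `cv_splits`, `accuracy`, `feature_extractor`, `fit_classifier`, `run_bundle`, and the toy data are all hypothetical stand-ins chosen for the example.

```python
def read_data():
    """Data connector: return toy inputs X and labels y (made-up data)."""
    X = [0.1, 0.4, 0.35, 0.8, 0.9, 0.75, 0.2, 0.95]
    y = [0, 0, 0, 1, 1, 1, 0, 1]
    return X, y


def cv_splits(n_samples, n_folds=4):
    """Cross-validation scheme: yield (train_idx, valid_idx) index lists."""
    for fold in range(n_folds):
        valid = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, valid


def accuracy(y_true, y_pred):
    """Score type: fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def feature_extractor(X):
    """FE step: here it simply wraps each raw value in a one-element vector."""
    return [[x] for x in X]


def fit_classifier(features, y):
    """CLF step: learn a threshold halfway between the two class means."""
    mean0 = sum(f[0] for f, label in zip(features, y) if label == 0) / y.count(0)
    mean1 = sum(f[0] for f, label in zip(features, y) if label == 1) / y.count(1)
    threshold = (mean0 + mean1) / 2
    return lambda feats: [int(f[0] > threshold) for f in feats]


def run_bundle():
    """Train and score the FE + CLF workflow on every CV fold."""
    X, y = read_data()
    scores = []
    for train, valid in cv_splits(len(X)):
        predict = fit_classifier(feature_extractor([X[i] for i in train]),
                                 [y[i] for i in train])
        y_pred = predict(feature_extractor([X[i] for i in valid]))
        scores.append(accuracy([y[i] for i in valid], y_pred))
    return scores


print(run_bundle())  # one accuracy value per fold
```

Keeping the data connectors, the cross-validation scheme, and the score type in one place means every submission is evaluated under the same protocol; only the FE and CLF steps change from one submission to the next.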
18.
A SINGLE EXECUTABLE TO TEST THE SUBMISSIONS
• Keep your different submissions in a simple file structure
• Share them through git
• Execute them from the notebook as well
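A simple file structure of this kind might look like one folder per submission. The sketch below builds a throwaway example and lists what it finds. The `submissions/<name>/` layout, the file names, and the helper `list_submissions` are assumptions made for illustration, not a statement of what the rampwf test script requires.

```python
import tempfile
from pathlib import Path


def list_submissions(kit_dir):
    """Return the names of all submission folders under kit_dir/submissions."""
    root = Path(kit_dir) / "submissions"
    return sorted(p.name for p in root.iterdir() if p.is_dir())


# Build a throwaway kit with two submissions to demonstrate the layout.
with tempfile.TemporaryDirectory() as kit_dir:
    for name in ("starting_kit", "my_first_idea"):
        folder = Path(kit_dir) / "submissions" / name
        folder.mkdir(parents=True)
        (folder / "feature_extractor.py").write_text("# FE code goes here\n")
        (folder / "classifier.py").write_text("# CLF code goes here\n")
    found = list_submissions(kit_dir)

print(found)  # ['my_first_idea', 'starting_kit']
```

With one folder per idea, each submission is a small, self-contained diff that is easy to review and share through git.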
19.
You can:
1. Use rampwf for your own workflows
2. Use rampwf to organize workflow building and optimization in an internal data science team
3. Submit your kit to us if you want to run a data challenge