How to get into Kaggle?
Philipp Singer & Dmitry Gordeev
Vienna Data Science Meetup, Vienna
Dec 5th 2019
Who we are
● Philipp
○ Data scientist at UNIQA
○ PhD in CS at TU Graz
○ Profound experience in ML research and applications
○ Kaggle competition master currently ranked 36th
● Dmitry
○ Data scientist at UNIQA
○ Master’s degree in data mining
○ In-depth experience with ML applications in financial institutions
○ Kaggle competition grandmaster currently ranked 34th
● Competing successfully together on Kaggle for 1 year: The Zoo
What is Kaggle?
● “Your home for Data Science”
○ Online community of data scientists and machine learners
○ Founded in 2010
○ Acquired by Google in 2017
● Data science competitions
● Share notebooks, datasets, and discussions
● Courses and tutorials
● Free notebook infrastructure with CPUs and GPUs
How big is Kaggle
● The most popular ML competition platform
● The largest ML community
125 000+ users
350 completed competitions
up to 10 000 users per competition
Usually a prize fund of $20,000 to $100,000
Kaggle survey results
Competitions on Kaggle
● Usually hosted by companies or research institutes
● Main goal: prediction
● Wide range of different types of competitions
○ Different types of domains (e.g., financial, medical, sports, …)
○ Different types of data (e.g., tabular, NLP, image, video, time series, …)
○ Different types of objectives (e.g., classification, regression, segmentation, …)
○ Different goals of competitions (featured, research, playground, in-class)
● Built-in progression system with medals and ranks
● Top spots usually receive prize money
Competition medals
User ranking + titles
How competitions usually work
https://mc.ai/pseudo-labeling/
The Zoo
● Started competing under the team name “The Zoo” exactly one year ago
● Little prior experience on Kaggle
● Participated in 7 competitions
● Strategy: diversify types of competitions for learning purposes
Our Journey
Quora
Develop models that identify and flag insincere questions.
1 306 122 labelled questions
6.2% insincere questions
4 037 teams
2 hours to fit and predict
Quora - sincere/insincere
How can I become a data scientist?
How come Trump is so stupid?
Is it possible for a vegan who does crossfit to go 10 minutes without telling someone about it?
Everytime I slap myself in the face, it hurts. How can I prevent this?
Quora - solution
Quora - final standings
Santander
Identify which customers will make a specific transaction in the future
200 000 transactions
8 802 teams
2 months duration
Santander - the mysterious data
Santander - solution
Santander - final standings
LANL Earthquake Prediction
Predict the time remaining before laboratory earthquakes occur from real-time seismic data.
629 145 480 data points
4 200 training segments
4 540 teams
30 minutes to fit and predict
LANL - the physics
LANL - solution
● Derived a handful of features from the data that capture peaks and volatility of the acoustic signal
● Combination (ensemble) of two state-of-the-art modeling approaches
○ Gradient Boosting Regression Trees
○ Neural Network (Deep Learning)
● Novel statistical data adjustment to account for different earthquake cycles
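The feature step above can be illustrated with a minimal sketch. The three statistics and their names (`max_abs`, `std`, `mean_abs_diff`) are assumptions chosen for illustration; the slides do not say which features the team actually used. The sketch sticks to the Python standard library so it runs anywhere.

```python
import statistics


def segment_features(signal):
    """Summarise one acoustic segment with a few peak and volatility statistics.

    The choice of statistics is hypothetical, not the authors' feature set.
    """
    abs_signal = [abs(x) for x in signal]
    # Differences between consecutive samples describe how rough the signal is.
    abs_diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return {
        "max_abs": max(abs_signal),                     # largest peak
        "std": statistics.pstdev(signal),               # overall volatility
        "mean_abs_diff": statistics.fmean(abs_diffs),   # sample-to-sample change
    }
```

Each training segment would be reduced to one such row of features, and the resulting table fed to the tree and neural network models.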
LANL - final standings
APTOS Blindness Detection
Detect diabetic retinopathy to stop blindness before it's too late!
3 662 retina images
0 - 4 retinopathy levels
2 943 teams
15 000 evaluation images
APTOS
Diabetic retinopathy is the leading cause of blindness in the working-age population of the developed world. It is estimated to affect over 93 million people.
https://www.eyeops.com/contents/our-services/eye-diseases/diabetic-retinopathy; https://www.vequill.com/how-to-cure-temporary-blindness/
APTOS - solution
● Careful image pre-processing to remove any kind of bias (e.g., device)
● Combination of several of the current best deep neural networks
● Models are pre-trained on a large collection of image data (ImageNet + extra retina images)
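One common way to combine several networks is to average their per-class probabilities and pick the most likely class. The sketch below shows that idea for the five retinopathy levels; it is an assumption about how such a combination could look, not the team's actual blending scheme, and the function names are hypothetical.

```python
def average_probabilities(model_probs):
    """Average class probabilities across models.

    model_probs[m][i][c] is model m's probability that image i has level c.
    """
    n_models = len(model_probs)
    n_images = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_images)
    ]


def predict_levels(avg_probs):
    """Pick the level (0 to 4) with the highest averaged probability per image."""
    return [max(range(len(p)), key=p.__getitem__) for p in avg_probs]


# Two hypothetical models scoring a single image.
model_a = [[0.1, 0.6, 0.1, 0.1, 0.1]]
model_b = [[0.1, 0.2, 0.5, 0.1, 0.1]]
levels = predict_levels(average_probabilities([model_a, model_b]))
```

Averaging tends to help most when the individual models make different kinds of errors.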
APTOS - final standings
Quiz
● Did I have relevant experience to enter this competition?
Data: Atomic elements (H for hydrogen, C for carbon, etc.) and their X, Y, Z Cartesian coordinates.
Task: Develop an algorithm that can predict the magnetic interaction between two atoms in a molecule.
Why should you start on Kaggle?
● Doing is the best way to learn
● Get in touch with data and use cases outside your main domain
● Keep up-to-date with state-of-the-art methods
● Learn from others
● Measure yourself and know where you stand
● Hardware and software are provided by Kaggle
Easy start
How can you start on Kaggle?
● Don’t be afraid! Just do it!
● Overcome self-handicapping behavior
● You gain points regardless of the result
● “Getting started” competitions
● Pick a competition that sounds exciting to you; don’t be afraid to pick one where you have no prior experience
● Research similar previous competitions and read solutions
● Follow published notebooks and discussions
Learn from the community
How to approach a competition?
● Choose a programming language (usually Python or R)
● Understand the problem setting, get a feeling for the data and the metric
● Exploratory Data Analysis (EDA)
● Implement a basic script / notebook from scratch that does training and prediction OR just fork someone’s model ;-)
● Think hard about a robust cross-validation (CV) setup
● Keep up-to-date on discussions and developments in the competition
● Experiment a lot and iterate quickly
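The "basic script" and "robust CV setup" points can be sketched together: a shuffled k-fold split plus a trivial baseline that predicts the training mean. Everything here is an illustrative assumption, including the function names, the mean-absolute-error metric and the five folds. In practice one would likely use scikit-learn's `KFold` and a real model; the standard-library version below only shows the mechanics.

```python
import random
import statistics


def kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, valid_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps folds reproducible
    folds = [indices[k::n_splits] for k in range(n_splits)]
    for k, valid in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid


def cross_validate_mean_baseline(y, n_splits=5, seed=0):
    """Return mean and spread of per-fold MAE for a predict-the-mean baseline."""
    fold_scores = []
    for train, valid in kfold_indices(len(y), n_splits, seed):
        prediction = statistics.fmean([y[i] for i in train])  # "fit" on train only
        errors = [abs(y[i] - prediction) for i in valid]      # score on held-out fold
        fold_scores.append(statistics.fmean(errors))
    return statistics.fmean(fold_scores), statistics.pstdev(fold_scores)
```

Reporting the spread across folds alongside the mean gives a rough sense of whether a later improvement is larger than the fold-to-fold noise.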
Try more, fail fast
Baseline model
Final model
Thanks!
Get in touch with us! We are open to any inquiries.
me@philippsinger.com
dott1718@gmail.com
@ph_singer @dott1718