3. Class-conditional independence assumption
Often called simple, naive, or even idiot*
* Idiot's Bayes - not so stupid after all? Hand, D.J. & Yu, K. (2001).
International Statistical Review, 69(3):385-399. ISSN 0306-7734.
\( \arg\max_y \, p(y \mid x_1, \dots, x_K) \;=\; \arg\max_y \, p(y) \prod_k p(x_k \mid y) \)
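A minimal sketch of the naive Bayes decision rule (argmax over classes of the prior times the product of per-feature likelihoods), computed in log space to avoid underflow; all names and the toy probability tables are hypothetical:

```python
import math

def predict(priors, cond, x):
    """Naive Bayes decision rule: argmax_y p(y) * prod_k p(x_k | y).

    priors: dict y -> p(y)
    cond:   dict y -> list of dicts, cond[y][k][v] = p(x_k = v | y)
    x:      observed feature values (x_1, ..., x_K)
    """
    best_y, best_score = None, -math.inf
    for y, p_y in priors.items():
        # Log space: sums instead of products, no underflow with many features.
        score = math.log(p_y) + sum(math.log(cond[y][k][v]) for k, v in enumerate(x))
        if score > best_score:
            best_y, best_score = y, score
    return best_y

# Toy example: one binary feature, two classes (made-up numbers).
priors = {"spam": 0.4, "ham": 0.6}
cond = {"spam": [{"$": 0.8, "no$": 0.2}], "ham": [{"$": 0.1, "no$": 0.9}]}
print(predict(priors, cond, ["$"]))  # "spam": 0.4 * 0.8 beats 0.6 * 0.1
```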
5. Winning binning?
Outliers
Missing values
Stability
* MODL: a Bayes optimal discretization method for continuous attributes. Boullé, M., (2006).
Machine Learning, 65(1):131-165.
No parameter to validate
O(n log n)
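MODL searches for a Bayes-optimal discretization; as a rough illustration of the mechanics only (not the MODL criterion), here is an equal-frequency stand-in showing why binning is robust to outliers, accommodates missing values, and costs O(n log n) (the sort dominates). All names are hypothetical:

```python
def equal_frequency_bins(values, n_bins=4):
    """Equal-frequency binning: a simple stand-in, NOT the MODL criterion.

    Sorting dominates the cost, hence O(n log n). Missing values (None)
    get a dedicated extra bin, and an extreme outlier merely lands in
    the last bin instead of distorting the other bin boundaries.
    """
    observed = sorted(v for v in values if v is not None)
    # Bin edges taken at the empirical quantiles of the observed values.
    edges = [observed[(i * len(observed)) // n_bins] for i in range(1, n_bins)]
    def bin_of(v):
        if v is None:
            return n_bins  # dedicated "missing" bin
        return sum(v >= e for e in edges)
    return [bin_of(v) for v in values]

# The outlier 1000.0 falls in the last bin; None gets its own bin.
print(equal_frequency_bins([1.0, 2.0, 3.0, 4.0, 1000.0, None], n_bins=2))
# [0, 0, 1, 1, 1, 2]
```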
6. Selective Naive Bayes
On predictive distributions and Bayesian networks. Kontkanen, P., Myllymäki, P., Silander, T., Tirri, H. & Grünwald, P. (2000).
Statistics and Computing, 10:39-54.
\( s_k \in \{0, 1\} \)
\( \arg\max_y \, p(y \mid x_1, \dots, x_K) \;=\; \arg\max_y \, p(y) \prod_k p(x_k \mid y)^{s_k} \)
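With hard selection, a feature with s_k = 0 contributes p(x_k | y)^0 = 1, i.e. nothing in log space, so it drops out of the product entirely. A minimal sketch (names and numbers hypothetical):

```python
import math

def snb_score(p_y, cond_probs, s):
    """Score one class: log p(y) + sum_k s_k * log p(x_k | y).

    s_k in {0, 1}: a deselected feature (s_k = 0) contributes
    p(x_k | y)^0 = 1, i.e. nothing in log space.
    """
    return math.log(p_y) + sum(sk * math.log(p) for sk, p in zip(s, cond_probs))

# Deselecting feature 1 removes its (very unfavorable) contribution.
print(snb_score(0.5, [0.9, 0.01], [1, 0]))  # only feature 0 counts
print(snb_score(0.5, [0.9, 0.01], [1, 1]))  # feature 1 drags the score down
```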
7. Select features
An introduction to variable and feature selection. Guyon, I. & Elisseeff, A. (2003).
Journal of Machine Learning Research, 3:1157-1182.
Wrapper approach?
→ Greedy optimization
→ Nested subsets
Embedded approach?
→ Direct objective optimization
Filter approach?
→ Mutual information
→ Weak learner
→ Cross-validation
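The mutual-information filter criterion can be computed directly from empirical counts, I(X;Y) = Σ p(x,y) log( p(x,y) / (p(x)p(y)) ); a self-contained sketch with toy data:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical I(X;Y) = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) p(y)) )."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))           # joint counts
    px, py = Counter(xs), Counter(ys)    # marginal counts
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# A feature identical to the class carries maximal information (log 2 nats)...
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
# ...an independent one carries none.
print(mutual_information([0, 1, 0, 1], [0, 0, 1, 1]))  # 0.0
```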
8. Forward Feature Selection
[Diagram: candidate features (A, B, C, D, E) are drawn independently from the pool of future candidates into the pool of actual candidates; a drawn feature is included in the model iff the AUROCC is improved, and kept safe otherwise.]
10. Soft selection
\( w_k \in [0, 1] \)
\( \arg\max_y \, p(y \mid x_1, \dots, x_K) \;=\; \arg\max_y \, p(y) \prod_k p(x_k \mid y)^{w_k} \)
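In log space the soft exponent w_k ∈ [0, 1] simply scales each feature's contribution, interpolating between dropping the feature (w_k = 0) and using it at full strength (w_k = 1). A minimal sketch (names hypothetical):

```python
import math

def soft_nb_score(p_y, cond_probs, w):
    """log p(y) + sum_k w_k * log p(x_k | y), with w_k in [0, 1].

    w_k interpolates between dropping feature k (w_k = 0) and using
    it at full strength (w_k = 1).
    """
    return math.log(p_y) + sum(wk * math.log(p) for wk, p in zip(w, cond_probs))

# w_k = 0.5 is exactly halfway between "dropped" and "full strength".
full = soft_nb_score(0.5, [0.9], [1.0])
half = soft_nb_score(0.5, [0.9], [0.5])
none = soft_nb_score(0.5, [0.9], [0.0])
print(full, half, none)
```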
11. The averaging trick
\( w_k = \dfrac{\sum_{s \in S} s_k \, p(s \mid d)}{\sum_{s \in S} p(s \mid d)} \)
* A Parameter-Free Classification Method for Large Scale Learning. Boullé, M., (2009).
Journal of Machine Learning Research, 10:1367-1385.
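Given a set S of explored selection vectors s and their (unnormalized) posteriors p(s | d), the averaged weights follow directly from the formula above; a minimal sketch (names and numbers hypothetical):

```python
def averaged_weights(models):
    """w_k = sum_{s in S} s_k * p(s|d) / sum_{s in S} p(s|d).

    models: list of (s, posterior) pairs, where s is a 0/1 selection
    vector and posterior is the (unnormalized) p(s | d) of that model.
    """
    total = sum(p for _, p in models)
    n_features = len(models[0][0])
    return [sum(s[k] * p for s, p in models) / total for k in range(n_features)]

# Two explored models: feature 0 appears in both, feature 1 only in
# the less probable one, so its averaged weight is small.
models = [([1, 0], 0.75), ([1, 1], 0.25)]
print(averaged_weights(models))  # [1.0, 0.25]
```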
12. The averaging trick
Explored models only
\( w_k = \dfrac{\sum_{s \in S} s_k \, p(s \mid d)}{\sum_{s \in S} p(s \mid d)} \)
13. The averaging trick
Nonparametric prior
\( w_k = \dfrac{\sum_{s \in S} s_k \, p(s \mid d)}{\sum_{s \in S} p(s \mid d)} \)
14. + / -
+ Performance
+ Low algorithm complexity
+ Nonparametric and stable (bye bye cross-validation!)
+ Numeric / categorical (bye bye dummy-encoding!)
+ Interpretable*
- It's up to the user to find 'composite' features and capture correlational relationships, but... it's where the fun is, ain't it?
* https://www.quora.com/What-makes-a-model-interpretable/answer/Claudia-Perlich