Introduction to Some Ensemble Learning Methods
Random Forest, Boosted Trees and so on
Honglin Yu
March 24, 2014
Outline
1 Big Picture: Ensemble Learning
2 Random Forest
3 Boosting
Big Picture
In statistics and machine learning, ensemble methods use
multiple models to obtain better predictive performance than
could be obtained from any of the constituent models.
— Wikipedia
Build a house (strong classifier) from bricks (weak classifiers)
Two questions:
1 What are the bricks? How do we make them?
2 How do we build the house from the bricks?
Classification and Regression Tree
Classification and Regression Tree (CART): Example and
Inference
(Figure: an example tree. The root splits on Smell (good vs. bad); one branch then splits on Color (bright vs. dark), the other on Ingredient (spicy vs. tasty), with leaf values 0.9, 0.6, 0.6 and 0.8.)
Inference:
1 Regression: the predicted response for an observation is given
by the mean response of the training observations that belong
to the same terminal node.
2 Classification: each observation belongs to the most
commonly occurring class of training observations in the
region to which it belongs.
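To make the two rules concrete, here is a minimal scikit-learn sketch (the toy data, feature values and max_depth choice are invented for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # two toy features
y_reg = X[:, 0] + 0.1 * rng.normal(size=200)   # continuous response
y_clf = (X[:, 1] > 0).astype(int)              # binary class labels

# Regression: the prediction is the mean response of the terminal node
# that the query point falls into.
reg = DecisionTreeRegressor(max_depth=2).fit(X, y_reg)
print(reg.predict([[0.5, -0.5]]))

# Classification: the prediction is the majority class of that terminal node.
clf = DecisionTreeClassifier(max_depth=2).fit(X, y_clf)
print(clf.predict([[0.5, -0.5]]))
```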
Classification and Regression Tree (CART): Learning
How to learn or build or grow a tree?
Finding the optimal partitioning of the data is NP-complete
Practical solution: greedy method
Famous Example: C4.5 algorithm
Growing the Tree (1)
Main procedure (similar to building a k-d tree): recursively split a node's data $\{x_i\}$ into two child nodes $(N_{X_j < s}, N_{X_j \ge s})$ using one feature dimension $X_j$ and a threshold $s$ (greedy)
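A minimal sketch of one greedy split for the regression case, assuming NumPy; the exhaustive threshold scan and the function name best_split are illustrative only:

```python
import numpy as np

def best_split(X, y):
    """Greedy step: scan every feature j and threshold s, return the pair
    that minimizes the residual sum of squares of the two child nodes."""
    best_j, best_s, best_rss = None, None, np.inf
    for j in range(X.shape[1]):
        for s in np.unique(X[:, j]):
            left, right = y[X[:, j] < s], y[X[:, j] >= s]
            if len(left) == 0 or len(right) == 0:
                continue  # skip degenerate splits
            rss = ((left - left.mean()) ** 2).sum() \
                + ((right - right.mean()) ** 2).sum()
            if rss < best_rss:
                best_j, best_s, best_rss = j, s, rss
    return best_j, best_s, best_rss

# Growing a tree = applying best_split recursively to each resulting node.
```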
Growing the Tree (2)
Split criterion (selecting Xj and s):
Regression: “minimizing RSS”

$$\min_{j,s}\left[\sum_{i:\,(x_i)_j < s} (y_i - \bar{y}_1)^2 + \sum_{i:\,(x_i)_j \ge s} (y_i - \bar{y}_2)^2\right]$$
Classification: minimizing the weighted (by the number of data points in each node) sum of one of:
Classification error rate of 1 node: $1 - \max_k (p_{mk})$
Gini index of 1 node: $\sum_{k=1}^{K} p_{mk}(1 - p_{mk})$
Cross-entropy of 1 node: $-\sum_{k=1}^{K} p_{mk} \log p_{mk}$
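The three node criteria, sketched in NumPy (here p stands for the vector of class proportions p_mk of one node):

```python
import numpy as np

def node_criteria(p):
    """p: class proportions of one node (non-negative, sums to 1)."""
    error = 1.0 - p.max()                       # classification error rate
    gini = float((p * (1.0 - p)).sum())         # Gini index
    p_nz = p[p > 0]                             # skip 0·log(0) terms
    xent = float(-(p_nz * np.log(p_nz)).sum())  # cross-entropy
    return error, gini, xent

print(node_criteria(np.array([0.5, 0.5])))      # most impure 2-class node
print(node_criteria(np.array([0.9, 0.1])))      # purer node: all three drop
```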
Growing the Tree (3)
“When building a classification tree, either the Gini index or
the cross-entropy are typically used to evaluate the quality of a
particular split, since these two approaches are more sensitive
to node purity than is the classification error rate. Any of
these three approaches might be used when pruning the tree,
but the classification error rate is preferable if prediction
accuracy of the final pruned tree is the goal.”
— ISLR book
Stop criterion: for example, until “each terminal node has
fewer than some minimum number of observations”
Pruning the Tree (1)
Why: The grown tree is often too large
Overfitting
Sensitive to data fluctuations
How: Cost complexity pruning or weakest link pruning
$$\min_{T \subseteq T_0}\ \sum_{m=1}^{|T|} \sum_{i:\,x_i \in R_m} (y_i - \bar{y}_{R_m})^2 + \alpha |T|$$

where $T$ is a subtree of the full tree $T_0$, the $R_m$ are its terminal regions, and $|T|$ is the number of terminal nodes of $T$.
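scikit-learn exposes exactly this cost-complexity criterion through the ccp_alpha parameter; a minimal sketch on synthetic data (in practice α is chosen by cross-validation):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=300)

# Compute the sequence of candidate alpha values (the pruning path).
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]  # pick one; use CV in practice

pruned = DecisionTreeRegressor(random_state=0, ccp_alpha=alpha).fit(X, y)
print(pruned.get_n_leaves())                        # |T| of the pruned subtree
```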
Pruning the Tree (2)
(Figure from the ISLR book illustrating pruning; not reproduced in the extraction.)
Example
(Thirteen figure-only example slides followed here; the figures did not survive the text extraction.)
Pros and Cons of CART
Pros:
1 easier to interpret (easy to convert to logic rules)
2 automatic variable selection
3 scale well
4 insensitive to monotone transformations of the inputs (based
on ranking the data points), no need to scale the data
Cons:
1 performance is not good (greedy)
2 unstable
You cannot expect too much from a brick ...
Question: what makes a good brick?
Question: Why Binary Split
No concrete answer
1 Binary splits map naturally onto logical rules
2 Stability (multiway splits create many more possibilities, and errors near the top affect all the splitting below)
Let’s begin to build the house!
Bootstrap Aggregation (Bagging): Just average it!
Bagging:
1 Build M decision trees on bootstrap data (randomly
resampled with replacement)
2 Average them:

$$f(x) = \frac{1}{M} \sum_{i=1}^{M} f_i(x)$$
Note that the CARTs here do not need pruning (why?)
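A hand-rolled sketch of the procedure with unpruned scikit-learn trees (the data and names are synthetic and illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=n)

M = 50
trees = []
for _ in range(M):
    idx = rng.integers(0, n, size=n)   # bootstrap: n draws with replacement
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))  # no pruning

def f(X_query):
    """Bagged prediction: the plain average of the M trees."""
    return np.mean([t.predict(X_query) for t in trees], axis=0)

print(f(X[:3]))
```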
Two things
1 How many trees to use (M)? The more the better; increase until the error converges (CART is very fast, and the trees can be trained in parallel)
2 What is the size of the bootstrap learning set? Breiman chose the same size as the original data; then every bootstrap set leaves out a fraction $(1 - \frac{1}{n})^n$ of the data (about 37% when n is large)
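A quick numeric check of that limit, since $(1 - 1/n)^n \to e^{-1} \approx 0.368$:

```python
import math

for n in (10, 100, 10_000):
    print(n, (1 - 1 / n) ** n)    # 0.3487..., 0.3660..., 0.3679...
print("e^-1 =", math.exp(-1))     # 0.36787944...
```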
Question 1: Why Bagging Works?
1 Bias-variance trade-off:

$$E[(F(x) - f_i(x))^2] = (F(x) - E[f_i(x)])^2 + E\left[(f_i(x) - E[f_i(x)])^2\right] = \mathrm{Bias}[f_i(x)]^2 + \mathrm{Var}[f_i(x)]$$

2 $|\mathrm{Bias}(f_i(x))|$ is small but $\mathrm{Var}(f_i(x))$ is big
3 Then for the average $f(x)$, if the $\{f_i(x)\}$ are mutually independent:

$$\mathrm{Bias}(f(x)) = \mathrm{Bias}(f_i(x)), \qquad \mathrm{Var}(f(x)) = \frac{1}{M}\,\mathrm{Var}(f_i(x))$$

Variance is reduced by averaging
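A small simulation of this variance reduction, with independent Gaussian estimators standing in for the trees (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
M, trials = 25, 100_000
# Each column plays the role of one unbiased, high-variance estimator f_i(x).
preds = rng.normal(loc=0.0, scale=1.0, size=(trials, M))
avg = preds.mean(axis=1)       # the bagged prediction f(x)

print(preds[:, 0].var())       # single-estimator variance, about 1.0
print(avg.var())               # about 1/M = 0.04: averaging shrinks it
```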
Question 2: Why Bagging works NOT quite well?
The trees are strongly correlated (bootstrap samples repeat much of the same data)
To improve: random forest (add more randomness!)
Random Forest
Random forests are a combination of tree predictors such that
each tree depends on the values of a random vector sampled
independently and with the same distribution for all trees in
the forest
— LEO BREIMAN
Where randomness helps!
Random Forest: Learning
Where is the “extra” randomness:
As in bagging, we build a number of decision trees on
bootstrapped training samples. But when building these
decision trees, each time a split in a tree is considered, a
random sample of m predictors is chosen as split candidates
from the full set of p predictors. The split is allowed to use
only one of those m predictors.
— ISLR book
Gives each individually weak predictor (feature) more chance to be used
Decorrelates the trees → reduces the variance of the final hypothesis
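In scikit-learn this per-split feature subsampling is the max_features parameter of RandomForestClassifier; a minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# max_features = m: only m of the p features are split candidates at each node.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            random_state=0).fit(X, y)
print(rf.score(X, y))
```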
Random Forest
How many sub-features m to choose? (not very sensitive; try 1, half of p, or int(log2 p + 1))
Internal test of Random Forest (OOB error)
Interpretation
Out of Bag Error (OOB error)
“By-product of bootstrapping”: out-of-bag data
For every training example (xi, yi), about 37% of the trees were not trained on it.
In testing, use those ~37% of trees to predict (xi, yi); from these predictions we get an RSS or a classification error rate.
“Proven to be unbiased in many tests.”
No need to do cross-validation.
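scikit-learn computes this internal test when oob_score=True; a sketch on the same kind of synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200, oob_score=True,
                            random_state=0).fit(X, y)
# Each point is scored only by the ~37% of trees that never saw it.
print(rf.oob_score_)
```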
Random Forest: Interpretation (Variable Importance)
By averaging, a random forest loses CART's easy interpretability. But it is still interpretable.
Permutation Test (1)
Test correlation by permutation
Figure: does not correlate (plot not reproduced in the extraction)
— Ken Rice, Thomas Lumley, 2008
Permutation Test (2)
Test correlation by permutation
Figure: correlated (plot not reproduced in the extraction)
— Ken Rice, Thomas Lumley, 2008
Interpret Random Forest
https://stat.ethz.ch/education/semesters/ss2012/ams/slides/v10.2.pdf
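A hand-rolled sketch of permutation-based variable importance (the helper below is illustrative; sklearn.inspection.permutation_importance provides the same idea ready-made):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 2 * X[:, 0] + 0.1 * rng.normal(size=400)   # only feature 0 matters

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def permutation_importance(model, X, y):
    """Importance of feature j = drop in R^2 after shuffling column j."""
    base = model.score(X, y)
    drops = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])                  # break the feature-target link
        drops.append(base - model.score(Xp, y))
    return np.array(drops)

print(permutation_importance(rf, X, y))        # feature 0 should dominate
```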
Example
Feature importance: x : y = 0.34038216 : 0.65961784
OOB error: 0.947563375391
Pros and Cons
Pros:
1 Performance is good, a lot of applications
2 not hard to interpret
3 Easy to train in parallel
Cons:
1 Easy to overfit
2 (?) possibly worse than SVMs in high-dimensional learning
Boosting
Main difference between Boosting and Bagging?
1 “Bricks” (CART here) are trained sequentially.
2 Every “brick” is trained on the whole data (with modification
based on training former “bricks”)
Note that boosting actually predates the invention of random forests, even though boosting often performs better than random forests and is also more complex.
Boosting a Regression Tree
New trees can be seen (shrinkage aside) as being built on the “residuals” of the former trees.
— ISLR book
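A minimal sketch of that idea, in the spirit of ISLR's boosting algorithm for regression trees: B shallow trees, each fit to the current residuals, combined with a shrinkage factor lam (all names here are mine):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_regression(X, y, B=200, lam=0.1, depth=2):
    """Each new tree is fit to the residuals left by the trees before it."""
    f = np.zeros(len(y))              # current ensemble prediction
    trees = []
    for _ in range(B):
        r = y - f                     # residuals of the model so far
        t = DecisionTreeRegressor(max_depth=depth).fit(X, r)
        f += lam * t.predict(X)       # shrunken update
        trees.append(t)
    return trees

def boost_predict(trees, X, lam=0.1):
    return lam * sum(t.predict(X) for t in trees)
```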
Boosting a Classification Tree
Train trees sequentially: for each new tree, the “hard” examples from the old trees are given more weight.
Reduces bias (early stages) and variance (late stages)
Compared with Random Forest
1 Performance is often better
2 Slower to train (sequential) and to test
3 Easier to overfit
Questions
Is Neural Network also Ensemble Learning?
Questions
What is the relationship between Linear Regression and Random
Forest etc.?
Big Picture again
Adaptive basis function models