Course Calendar
Class  Date          Contents
1      Sep. 26       Course information & course overview
2      Oct. 4        Bayes estimation
3      Oct. 11       Classical Bayes estimation: Kalman filter
4      Oct. 18       Simulation-based Bayesian methods
5      Oct. 25       Modern Bayesian estimation: particle filter
6      Nov. 1        HMM (Hidden Markov Model)
-      Nov. 8        No class
7      Nov. 15       Bayesian decision
8      Nov. 29       Nonparametric approaches
9      Dec. 6        PCA (Principal Component Analysis)
10     Dec. 13       ICA (Independent Component Analysis)
11     Dec. 20       Applications of PCA and ICA
12     Dec. 27       Clustering, k-means, et al.
13     Jan. 17       Other topics 1: kernel machines
14     Jan. 22 (Tue) Other topics 2
Lecture Plan
Nonparametric Approaches
1. Introduction
 1.1 An Example
 1.2 Nonparametric Density Estimation Problems
 1.3 Histogram Density Estimation
2. Kernel Density Estimation
3. K-Nearest-Neighbor Density Estimation
4. Cross-Validation
1. Introduction
1.1 An Example: Fish-Sorting Problem

Automatic fish-sorting process:
[Figure (Duda, Hart, & Stork, 2004): fish on a belt conveyor are measured, features x_1 and x_2 are extracted, and classification/decision theory selects an action ("sea bass" or "salmon"); in the feature space, a decision boundary separates the regions R_1 and R_2. The system has a training phase and a test phase.]
The First Step: Training Process
The first task is a supervised learning process: each observed sample carries its own label, one of the two states of nature ω_1 or ω_2.

Training (learning) data
For a given set of N data samples, suppose the following labeled data (the lightness of a fish) are observed:

  ω_1 (sea bass): x_1, x_2, ..., x_{N1};   ω_2 (salmon): y_1, y_2, ..., y_{N2},

where x_i and y_j are the lightness of the i-th sample of ω_1 (sea bass) and the j-th sample of ω_2 (salmon), respectively. We assume x_i, y_j are discrete data, as illustrated in Fig. 1(a).
This joint probability distribution gives the histograms of the other probabilities and densities shown in Fig. 1. These histograms can be used for Bayes decision classification.

Fig. 1: (a) samples drawn from the joint probability over x and ω (classes ω_1 and ω_2, with N_i samples per class out of N); (b) p(x); (c) the priors P(ω_1) and P(ω_2); (d), (e) the class-conditional densities p(x|ω_1) and p(x|ω_2).
Density Estimation
The Bayes decision rules discussed in the last lecture were developed on the assumption that the relevant probability density functions and prior probabilities are known. In practice this is rarely the case, so we need to estimate the PDFs from a given set of observed data. The approach here attempts to estimate the density directly from the observed data.

[Figure: given data → modeling of density → density distribution]
http://www.unt.edu/benchmarks/archives/2003/february03/rss.htm
1.2 Nonparametric Approaches
Two approaches to density estimation:
Parametric approach: assume the form of the density function is known, and estimate its parameters.
Nonparametric approach: make no assumption about the form of the density function, so it can be used with arbitrary distributions.

Why nonparametric?*
Classical parametric densities are unimodal, whereas practical problems often involve multimodal densities. Nonparametric approaches are applicable to arbitrary densities with few assumptions.
*Another idea: a mixed (hybrid) model combining parametric and nonparametric densities.
1.3 Histogram Density Estimation
A single-variable (x) case:
Partition x into distinct intervals (called bins) of width Δ_i (often chosen uniform, Δ_i = Δ), and count the number n_i of data points falling in the i-th bin.
To turn the counts into a normalized probability density, we put

  p(x) = n_i / (N Δ_i)  over the i-th bin.   (1)

The density p(x) is approximated by a stepwise function, like a bar graph. In the multi-dimensional case,

  p(x) = n_i / (N V),

where V is the volume of the bin.

Fig. 2 (RIGHT): Histogram density estimation (from Bishop [3] web site); 50 data points are generated from the distribution shown by the green curve.
9
The feature of the histogram estimation depends on the width (Δi
) of bins as shown in Fig.2.
Small Δ → density tends to have many spikes
Large Δ → density tends to be over-smoothed
Merit: Convenient visualization tool
Problems:
Discontinuities at the bin’s edges
Computational burden in high dimensional space (MD)
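Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration only; the bin placement and the sample data are assumptions, not taken from the slides:

```python
import numpy as np

def histogram_density(data, bin_width):
    """Histogram density estimate p(x) = n_i / (N * bin_width), as in Eq. (1)."""
    data = np.asarray(data, dtype=float)
    lo = data.min()
    # Enough uniform bins to cover the data range (one extra as a safety margin)
    n_bins = int(np.ceil((data.max() - lo) / bin_width)) + 1
    edges = lo + bin_width * np.arange(n_bins + 1)
    counts, edges = np.histogram(data, bins=edges)
    # Normalize the counts so the stepwise estimate integrates to 1
    return counts / (len(data) * bin_width), edges

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=50)   # 50 points, as in Fig. 2 (distribution assumed)
density, edges = histogram_density(sample, bin_width=0.5)
```

Summing `density * bin_width` over all bins recovers 1, which is exactly the normalization Eq. (1) enforces.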
2. Kernel Density Estimation

Basic idea of density estimation:
An unknown density p(x), x ∈ R^D, generates a set of N observations {x_1, x_2, ..., x_N}; we wish to estimate p(x) from them.

Consider a small region R surrounding the point x, and define the probability

  P = ∫_R p(x) dx,   (2)

which is the probability of a sample falling into R. The expected number K of the N observed data falling within R is then

  K ≈ P N.   (3)

Suppose p(x) can be approximated by a constant over R. Then

  P ≈ p(x) V,   (4)

where V is the volume of R. Eqs. (2)-(4) give the following density estimation form:

  p(x) ≈ K / (N V).   (5)

Two ways of exploiting this:
1) Fix V, then estimate K → kernel density estimation.
2) Fix K, then estimate V → k-nearest-neighbor estimation.
Kernel Density Estimation
A point x: we wish to determine the density at this point.
Region R: a small hypercube centered on x, with volume V.
Find the number of samples K that fall within the region R. For counting, we introduce the kernel function.

[One-dimensional kernel function]

  k(u) = 1 if |u| ≤ 1/2,
         0 elsewhere.   (6)

For a given observation x_n, consider

  k((x − x_n) / h).   (7)
  k((x − x_n) / h) = 1 if |x − x_n| ≤ h/2,
                     0 elsewhere,   (8)

where h is called the bandwidth or smoothing parameter.
For a set of observations x_n (n = 1, ..., N),

  K = Σ_{n=1}^{N} k((x − x_n) / h)   (9)

gives the number of data points located within [x − h/2, x + h/2]. Substituting (9) into (5), with V = h,

  p(x) = (1/(N h)) Σ_{n=1}^{N} k((x − x_n) / h).   (10)

An example graph of p(x) is illustrated in Fig. 3.
Example: data set {x_n}, n = 1, ..., 4.
Each observation x_i contributes a box (1/h) k((x − x_i) / h), and the estimate is their average:

  p(x) = (1/(4h)) Σ_{i=1}^{4} k((x − x_i) / h).

Fig. 3: the four box kernels centered at x_1, ..., x_4, and their sum.
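Eq. (10) with the box kernel of Eqs. (6)-(8) can be sketched as follows. The four data values are illustrative, not the ones in Fig. 3:

```python
import numpy as np

def box_kernel(u):
    """One-dimensional box kernel k(u): 1 if |u| <= 1/2, else 0 (Eq. (6))."""
    return (np.abs(u) <= 0.5).astype(float)

def parzen_box_density(x, data, h):
    """Parzen-window estimate p(x) = (1/(N h)) * sum_n k((x - x_n)/h), Eq. (10)."""
    data = np.asarray(data, dtype=float)
    u = (x - data) / h                       # one value of u per observation
    return box_kernel(u).sum() / (len(data) * h)

data = [1.0, 2.0, 2.5, 4.0]                  # N = 4 observations (illustrative)
h = 1.0
# At x = 2.2 only the points 2.0 and 2.5 lie within h/2 = 0.5, so K = 2
# and p(x) = K/(N h) = 2/4 = 0.5:
p = parzen_box_density(2.2, data, h)
```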
15
Discontinuity and Smooth Kernel Function
Kernel Density Estimator will suffer from discontinuities in
estimated density. Smooth kernel function such as Gaussian is used.
This general method is referred to as the kernel density estimator or
Parzen estimator.
Example: Gaussian and 1-D case,
where h is the standard deviation.
Determination of bandwidth h
Small h → spiky p(x)
Large h → over-smoothed p(x)
Defects:
High computational cost
Because of fixed V there may be too few samples in some regions.
 
2
2
1
( )1 1
exp
22
N
n
n
x x
p x
N hh
 
 
 
 (11)
Example: kernel density estimation.
Fig. 4: Kernel density estimation (from Bishop [3] web site); the KDE method applied to the same 50 data points used in Fig. 2.

Example: Bayesian decision by the Parzen kernel estimate.
Fig. 5: The decision boundaries; LEFT: small h, RIGHT: large h (Duda et al. [1]).
3. K-Nearest-Neighbor Density Estimation

KDE approaches use a fixed h throughout the data space, but we would rather use a small h in densely populated regions and a larger h in sparse regions. This leads to the idea of the K-nearest-neighbor (K-NN) approach:
Expand the region (radius) surrounding the estimation point x until it encloses K data points.

  p(x) = K / (N V),  with K fixed:   (12)

determine the minimum volume V containing K points in R. Taking R as a hypersphere with radius r(x), its volume is V = c_D r^D(x), so

  p(x) = K / (N c_D r^D(x)),

where c_D is the volume of the D-dimensional unit hypersphere.
  c_1 = 2,  c_2 = π,  c_3 = 4π/3,  ...

Fig. 6: the K-NN algorithm; the radius r(x) extends to the K-th closest neighbor point of x.
Fig. 7: K-nearest-neighbor density estimation (from Bishop [3] web site), applied to the same 50 data points used in Fig. 2. K is the free parameter of the K-NN method.

Problems: 1) the integral of p(x) is not bounded; 2) discontinuities; 3) large computation time and storage.
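In one dimension (D = 1, so c_1 = 2 and V = 2 r(x)), Eq. (12) can be sketched as follows; the data values are illustrative:

```python
def knn_density_1d(x, data, k):
    """1-D K-NN density estimate, Eq. (12): p(x) = K/(N V), where
    V = 2*r(x) is the length of the smallest interval around x
    containing the K nearest observations (c_1 = 2)."""
    dists = sorted(abs(xn - x) for xn in data)
    r = dists[k - 1]                 # distance to the K-th closest neighbor
    volume = 2.0 * r                 # interval [x - r, x + r]
    return k / (len(data) * volume)

data = [1.0, 2.0, 2.5, 4.0]
# At x = 2.25 the two nearest points are 2.0 and 2.5, so r(x) = 0.25,
# V = 0.5, and p(x) = 2 / (4 * 0.5) = 1.0:
p = knn_density_1d(2.25, data, k=2)
```

Note how the volume adapts to the data: in a sparse region r(x) grows, which is exactly the behavior fixed-h KDE lacks.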
K-NN Estimation as a Bayesian Classifier
A method to generate the decision boundary directly from a set of data.
− N training data with class labels (ω_1, ..., ω_c), with N_l points for the l-th class, such that N = Σ_{l=1}^{c} N_l.
− To classify a test sample x:
− Take the sphere with minimum radius r(x) that encircles K samples; the volume of the sphere is V = c_D r^D(x).
− K_l of the K points belong to the l-th class ω_l.
− Class-conditional density at x:  p(x | ω_l) = K_l / (N_l V).   (13)
− Evidence:  p(x) = K / (N V).
− Prior probabilities:  P(ω_l) = N_l / N.
− Posterior probabilities (Bayes' theorem):

  P(ω_l | x) = p(x | ω_l) P(ω_l) / p(x) = K_l / K.   (14)
K-NN Classifier
− Find the class maximizing the posterior probability (Bayes decision):

  l_0 = argmax_l P(ω_l | x) = argmax_l K_l / K.

− The point x is classified into ω_{l_0}.

Summary (Fig. 8):
1) Select the K data points surrounding the estimation point x.
2) Assign x to the majority class among those K neighbors.

Nearest-Neighbor Classifier
Consider K = 1 in the K-NN classification: the point x is then classified into the class of the point nearest to x → the nearest-neighbor classifier.

Classification boundary of the K-NN classifier with respect to K:
small K → tends to make many small class regions (Fig. 9(b));
large K → few class regions (Fig. 9(a)).
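The two-step summary above can be sketched as a 1-D K-NN classifier; the training values and labels are illustrative:

```python
from collections import Counter

def knn_classify(x, data, labels, k):
    """K-NN classifier: the majority class among the K nearest training
    points, i.e. the class maximizing the posterior estimate K_l / K."""
    order = sorted(range(len(data)), key=lambda i: abs(data[i] - x))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Illustrative 1-D training data with two classes
data = [1.0, 1.2, 2.8, 3.0, 3.2]
labels = ["w1", "w1", "w2", "w2", "w2"]

# For x = 1.4 the 3 nearest neighbors are 1.2, 1.0, 2.8,
# so K_l/K is 2/3 for w1 and 1/3 for w2 -> class w1:
c = knn_classify(1.4, data, labels, k=3)
```

With k=1 this reduces to the nearest-neighbor classifier described above.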
Fig. 8 (Bishop [3]): the K-nearest-neighbor classifier algorithm (K = 3) and the nearest-neighbor classifier algorithm (K = 1).
Fig. 9: K-nearest-neighbor classifiers for K = 3, 1, 31, panels (a)-(c) (Bishop [3]).
4. Cross-Validation

Parameter-determination problem:
- Since classifiers have free parameters, such as K in the K-NN classifier and h in kernel-density-based classifiers, we need to select the optimal parameters by evaluating classification performance.
- Over-fitting problem: the classifier's parameters (decision boundary) obtained from the whole training data will overfit to them → an appropriate, new test set is needed.

Cross-Validation
- The given data are split into S parts (Fig. 10 illustrates S = 4).
- Use S − 1 parts for training, and the remaining part for testing.
- Rotate so that every part serves once as the test set, as shown in Fig. 11 (S = 4).
Fig. 11: Experiments 1-4, each holding out a different part as test data and training on the rest; the resulting scores* are averaged:

  (Score 1 + Score 2 + Score 3 + Score 4) × (1/4) = averaged score.

If we want to determine the best K for the K-NN classifier, we choose the K providing the highest averaged score under the cross-validation procedure.
*Score = error rate, conditional risk, etc.
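The procedure can be sketched for selecting K of a 1-D K-NN classifier. The fold assignment, the scoring by accuracy, and the toy data are assumptions for illustration:

```python
from collections import Counter

def knn_classify(x, data, labels, k):
    """Majority vote among the K nearest 1-D training points."""
    order = sorted(range(len(data)), key=lambda i: abs(data[i] - x))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]

def cross_validate_k(data, labels, k, s=4):
    """S-fold cross-validation: averaged accuracy of K-NN over S held-out parts."""
    n = len(data)
    folds = [list(range(i, n, s)) for i in range(s)]   # S disjoint parts
    scores = []
    for test_idx in folds:
        train_idx = [i for i in range(n) if i not in test_idx]
        train_x = [data[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(knn_classify(data[i], train_x, train_y, k) == labels[i]
                      for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / s                             # averaged score

# Two well-separated 1-D classes; pick the K with the best averaged score
data = [1.0, 1.1, 1.2, 1.3, 3.0, 3.1, 3.2, 3.3]
labels = ["w1"] * 4 + ["w2"] * 4
best_k = max([1, 3], key=lambda k: cross_validate_k(data, labels, k))
```

Each candidate K is trained on S − 1 parts and scored on the held-out part, and the K with the highest averaged score is selected, exactly as described above.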
References:
[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2004.
[2] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[3] All data files of Bishop's book are available at http://research.microsoft.com/~cmbishop/PRML
08448380779 Call Girls In Friends Colony Women Seeking Men
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

2012 mdsp pr08 nonparametric approach

  • 1. Course Calendar
Class | Date | Contents
1 | Sep. 26 | Course information & course overview
2 | Oct. 4 | Bayes Estimation
3 | Oct. 11 | Classical Bayes Estimation - Kalman Filter -
4 | Oct. 18 | Simulation-based Bayesian Methods
5 | Oct. 25 | Modern Bayesian Estimation: Particle Filter
6 | Nov. 1 | HMM (Hidden Markov Model)
- | Nov. 8 | No class
7 | Nov. 15 | Bayesian Decision
8 | Nov. 29 | Nonparametric Approaches
9 | Dec. 6 | PCA (Principal Component Analysis)
10 | Dec. 13 | ICA (Independent Component Analysis)
11 | Dec. 20 | Applications of PCA and ICA
12 | Dec. 27 | Clustering, k-means, et al.
13 | Jan. 17 | Other Topics 1: Kernel Machines
14 | Jan. 22 (Tue) | Other Topics 2
  • 2. Lecture Plan: Nonparametric Approaches
1. Introduction
   1.1 An Example
   1.2 Nonparametric Density Estimation Problems
   1.3 Histogram Density Estimation
2. Kernel Density Estimation
3. K-Nearest Neighbor Density Estimation
4. Cross-Validation
  • 3. 1. Introduction
Automatic fish-sorting process: fish on a belt conveyer are classified by classification/decision theory using the joint densities p(x, "sea bass") and p(x, "salmon"), in a training phase followed by a test phase.
[Figure: feature space (x1, x2) with a decision boundary separating regions R1 and R2 (Duda, Hart, & Stork 2004).]
  • 4. 1.1 An Example - Fish Sorting Problem -
The first step - training process -
The first task is a supervised learning process: each observed sample carries a label, one of the states of nature ω1 or ω2.
Training (learning) data: for a given set of N data samples, suppose the following labeled data (the lightness of a fish) are observed:
  x: {x^(1), x^(2), ..., x^(N1)},   y: {y^(1), y^(2), ..., y^(N2)}   (a)
where x^(i) and y^(j) are the lightness of the i-th sample of ω1 (sea bass) and the j-th sample of ω2 (salmon), respectively. We assume x^(i), y^(j) are discrete data, as illustrated in Fig. 1.
This joint probability distribution gives the histograms of the other probabilities and densities shown in Fig. 1.
  • 5. These histograms can be used for Bayes decision classification.
Fig. 1: (a) samples drawn from a joint probability over x and ω; (b)-(e) the histograms derived from them: p(x), the priors P(ω1) = N1/N and P(ω2) = N2/N, and the class-conditional densities p(x|ω1), p(x|ω2).
  • 6. Density Estimation
This approach attempts to estimate the density directly from the observed data. The Bayes decision rules discussed in the last lecture were developed on the assumption that the relevant probability density functions and prior probabilities are known. In practice this is rarely the case, so we need to estimate the PDFs from a given set of observed data.
[Figure: given data → modeling of density → density distribution; http://www.unt.edu/benchmarks/archives/2003/february03/rss.htm]
  • 7. 1.2 Nonparametric Approaches
Two approaches for density estimation:
- Parametric approach: assume that the form of the density function is known, and estimate its parameters.
- Nonparametric approach: can be used with arbitrary distributions and does not assume a form for the density function.
Why nonparametric?* Classical parametric densities are unimodal, whereas practical problems often involve multimodal densities. Nonparametric approaches apply to arbitrary densities with few assumptions.
*Another idea: a hybrid of parametric and nonparametric densities.
  • 8. 1.3 Histogram Density Estimation
Single-variable (x) case: partition x into distinct intervals (called bins) of width Δi (often chosen uniform, Δi = Δ), and count ni, the number of data points falling in the i-th bin. To turn the counts into a normalized probability density we put
  p(x) = ni / (N Δi)   over the i-th bin   (1)
The density p(x) is approximated by a stepwise function, like a bar graph. In the multi-dimensional case, p(x) = ni / (N V), where V is the volume of the bin.
Fig. 2 (right): histogram density estimation (from Bishop [3] web site); 50 data points are generated from the distribution shown by the green curve.
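Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the lecture material; the bin count (16) and the synthetic Gaussian data are arbitrary choices.

```python
import numpy as np

def histogram_density(data, n_bins):
    counts, edges = np.histogram(data, bins=n_bins)  # bins span the data range
    widths = np.diff(edges)                          # Delta_i for each bin
    p = counts / (len(data) * widths)                # Eq. (1): n_i / (N * Delta_i)
    return p, edges

rng = np.random.default_rng(0)
data = rng.normal(size=1000)                         # samples from N(0, 1)
p, edges = histogram_density(data, 16)
# The stepwise estimate integrates to 1 by construction:
# sum_i p_i * Delta_i = sum_i n_i / N = 1
```

Because the bins here span the data range, the normalization holds exactly; with fixed, user-chosen bin edges, points falling outside the bins would make the integral smaller than 1.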
  • 9. The behavior of the histogram estimate depends on the bin width Δ, as shown in Fig. 2.
Small Δ → the density tends to have many spikes.
Large Δ → the density tends to be over-smoothed.
Merit: a convenient visualization tool.
Problems: discontinuities at the bin edges; computational burden in high-dimensional spaces (M^D bins for M bins per dimension in D dimensions).
  • 10. 2. Kernel Density Estimation - basic idea of density estimation -
An unknown density p(x), x ∈ R^D, generates a set of observations x_1, x_2, ..., x_N; we wish to estimate p(x).
Consider a small region R surrounding x, and let P = ∫_R p(x') dx' be the probability of falling into R. The expected number K of the N observed data falling within R is then
  K ≈ P N   (2)
Suppose p(x) can be approximated by a constant over R; then
  P ≈ p(x) V   (3)
  • 11. where V means the volume of R. Eqs. (2), (3) give the following density estimation form:
  p(x) = P / V   (4)
  p(x) = K / (N V)   (5)
Two ways of exploiting this:
1) Fix V, then estimate K → Kernel Density Estimation
2) Fix K, then estimate V → K-Nearest Neighbor Estimation
  • 12. Kernel Density Estimation
A point x: we wish to determine the density at this point.
Region R: a small hypercube centered on x; V is the volume of R.
Find the number of samples K that fall within the region R. For the counting we introduce the kernel function.
[One-dimensional kernel function]
  K(u) = 1  (|u| ≤ 1/2),  0  elsewhere   (6)
For a given observation x_n, consider
  K((x - x_n) / h)   (7)
  • 13.
  K((x - x_n)/h) = 1  (|x - x_n| ≤ h/2),  0  elsewhere   (8)
where h is called the bandwidth or smoothing parameter. For a set of observations {x_n : n = 1, ..., N},
  K = Σ_{n=1}^{N} K((x - x_n)/h)   (9)
gives the number of data points located within [x - h/2, x + h/2]. Substituting (9) into (5) with V = h:
  p(x) = (1/(N h)) Σ_{n=1}^{N} K((x - x_n)/h)   (10)
An example graph of p(x) is illustrated in Fig. 3.
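The box-kernel estimator of Eq. (10) can be sketched directly; the sample values and h = 0.5 below are illustrative choices, not data from the lecture.

```python
import numpy as np

def box_kernel(u):
    # Eq. (8) in normalized form: K(u) = 1 for |u| <= 1/2, 0 elsewhere
    return (np.abs(u) <= 0.5).astype(float)

def kde_box(x, samples, h):
    # Eq. (10): p(x) = (1 / (N h)) * sum_n K((x - x_n) / h)
    x = np.atleast_1d(x)
    u = (x[:, None] - samples[None, :]) / h
    return box_kernel(u).sum(axis=1) / (len(samples) * h)

samples = np.array([0.0, 0.5, 1.0, 1.2])
p0 = kde_box(0.0, samples, h=0.5)[0]   # only x_1 = 0.0 lies within +/- h/2 of x = 0
# p0 = 1 / (4 * 0.5) = 0.5
```

Each sample contributes a box of width h and area 1/N, so the estimate integrates to 1; the result is the stepwise, discontinuous density discussed on the next slides.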
  • 14. Example: data set {x_n}, n = 1 ~ 4.
Fig. 3: the four kernels K((x - x_1)/h), ..., K((x - x_4)/h) are summed and scaled to give p(x) = (1/(4h)) Σ_{i=1}^{4} K((x - x_i)/h).
  • 15. Discontinuity and Smooth Kernel Functions
The box-kernel density estimator suffers from discontinuities in the estimated density, so a smooth kernel function such as a Gaussian is used instead. This general method is referred to as the kernel density estimator or Parzen estimator.
Example (Gaussian kernel, 1-D case, with h the standard deviation):
  p(x) = (1/N) Σ_{n=1}^{N} (1/√(2π h²)) exp(-(x - x_n)²/(2h²))   (11)
Determination of the bandwidth h:
small h → spiky p(x); large h → over-smoothed p(x).
Defects: high computational cost; because V is fixed, there may be too few samples in some regions.
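A minimal sketch of the Gaussian Parzen estimator of Eq. (11); the sample values, grid, and h = 0.4 are arbitrary illustrative choices.

```python
import numpy as np

def parzen_gaussian(x, samples, h):
    # Eq. (11): p(x) = (1/N) * sum_n Gaussian(x - x_n; std dev h)
    x = np.atleast_1d(x)
    diff = x[:, None] - samples[None, :]
    k = np.exp(-diff**2 / (2 * h**2)) / np.sqrt(2 * np.pi * h**2)
    return k.mean(axis=1)

samples = np.array([-1.0, 0.2, 1.0, 1.3])
grid = np.linspace(-6.0, 7.0, 2001)
p = parzen_gaussian(grid, samples, h=0.4)
# p is smooth (no bin-edge discontinuities) and integrates to about 1
```

Unlike the box kernel, the resulting estimate is smooth everywhere, which is exactly the motivation given on this slide.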
  • 17. Example: kernel density estimation.
Fig. 4: kernel density estimation (from Bishop [3] web site); the KDE method applied to the same 50 data points used in Fig. 2.
Example: Bayesian decision by the Parzen kernel estimate.
Fig. 5: the decision boundaries; left: small h, right: large h (Duda et al. [1]).
  • 18. 3. K-Nearest Neighbor Density Estimation
KDE approaches use a fixed h throughout the data space. But we would like a small h where the data are dense and a larger h where the data are sparse. This leads to the K-Nearest Neighbor (K-NN) approach: expand the region (radius) surrounding the estimation point x until it encloses K data points.
In p(x) = K/(NV), K is fixed and we determine the minimum volume V containing K points in R:
  V = c_D r(x)^D   (volume of a hypersphere with radius r(x))
  p(x) = K / (N V) = K / (N c_D r(x)^D)   (12)
  • 19. Fig. 6: the K-NN algorithm; the sphere around x reaches to its K-th closest neighbor at radius r(x) (c_1 = 2, c_2 = π, c_3 = 4π/3, ...).
Fig. 7: K-nearest neighbor density estimation (from Bishop [3] web site), applied to the same 50 data points used in Fig. 2; K is the free parameter of the K-NN method.
Problems: 1) the integral of p(x) is not bounded; 2) discontinuities; 3) huge computation time and storage.
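For D = 1 the "hypersphere" in Eq. (12) is an interval and c_1 = 2, so the estimate reduces to p(x) = K / (N · 2 r(x)). A sketch with toy data (the sample values and K = 2 are made up for illustration):

```python
import numpy as np

def knn_density(x, samples, K):
    # Eq. (12) with D = 1: p(x) = K / (N * 2 * r(x)),
    # r(x) = distance from x to its K-th nearest sample
    x = np.atleast_1d(x)
    dists = np.sort(np.abs(x[:, None] - samples[None, :]), axis=1)
    r = dists[:, K - 1]                     # radius enclosing K points
    return K / (len(samples) * 2.0 * r)

samples = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
p = knn_density(0.0, samples, K=2)[0]       # r(0) = 1 -> p = 2 / (5 * 2 * 1) = 0.2
# As noted above, this estimate is not a proper density:
# it decays like 1/r(x), so its integral over x diverges.
```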
  • 20. K-NN estimation as a Bayesian classifier
A method to generate the decision boundary directly from a set of data.
- N training data with class labels (ω1 ~ ωc); N_l points for the l-th class, such that N = Σ_{l=1}^{c} N_l.
- To classify a test sample x, get the sphere with minimum radius r(x) that encircles K samples; the volume of the sphere is V = c_D r(x)^D, with K_l points of the l-th class (ω_l) inside.
- Class-conditional density at x: p(x|ω_l) = K_l / (N_l V)
- Prior probabilities: P(ω_l) = N_l / N
- Evidence: p(x) = K / (N V)
- Posterior probabilities (Bayes' theorem):
  P(ω_l|x) = p(x|ω_l) P(ω_l) / p(x) = K_l / K   (14)
  • 21. K-NN classifier
- Find the class maximizing the posterior probability (Bayes decision):
  l_0 = argmax_l P(ω_l|x) = argmax_l K_l / K
- The point x is classified into ω_{l0}.
Summary (Fig. 8):
1) Select the K data points surrounding the estimation point x.
2) Assign x the majority class among those K neighbors.
Nearest Neighbor classifier: consider K = 1 in the K-NN classification; the point x is classified into the class of the point nearest to x → Nearest Neighbor classifier.
Classification boundary of the K-NN with respect to K:
small K → tends to make many small class regions (Fig. 9(b));
large K → few class regions (Fig. 9(a)).
  • 22. Fig. 8 (Bishop [3]): the K-Nearest Neighbor classifier algorithm (K = 3) and the Nearest Neighbor classifier algorithm (K = 1).
Fig. 9: K-Nearest Neighbor classifiers for K = 3, 1, 31 (panels (a), (b), (c)) (Bishop [3]).
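The two-step summary above (pick the K nearest points, take the majority class, i.e. argmax_l K_l per Eq. (14)) can be sketched as follows; the 2-D points and labels are made up for illustration.

```python
import numpy as np

def knn_classify(x, X_train, y_train, K=3):
    # Eq. (14): P(w_l | x) ~ K_l / K, so pick the majority class
    # among the K nearest training points
    d = np.linalg.norm(X_train - x, axis=1)    # distances to all samples
    nearest = y_train[np.argsort(d)[:K]]       # labels of the K nearest
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]           # argmax_l K_l

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.2, 0.9]])
y = np.array([0, 0, 1, 1, 1])
pred = knn_classify(np.array([1.0, 0.9]), X, y, K=3)   # -> 1
```

With K = 1 the same function is the Nearest Neighbor classifier described on the previous slide.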
  • 23. 4. Cross-Validation
Parameter determination problem:
- Since classifiers have free parameters, such as K in the K-NN classifier and h in kernel-density-based classifiers, we need to select the optimal parameters by evaluating classification performance.
- Over-fitting problem: a classifier's parameters (decision boundary) obtained using all of the training data will overfit to them → we need appropriate new test data.
Cross-validation:
- The given data are split into S parts (Fig. 10 shows S = 4).
- Use S-1 parts for training; the remaining part is used for testing.
- Use each different part in turn as the test set, as shown in Fig. 11 (S = 4).
  • 24. Fig. 11: Experiments 1-4; each of the S = 4 parts serves once as the test data while the rest are training data, giving Score 1, ..., Score 4, and the averaged score is (Score 1 + Score 2 + Score 3 + Score 4) × (1/4).
If we want to determine the best K for the K-NN classifier, we choose the K providing the best averaged score (e.g., the lowest error rate) under this cross-validation procedure.
*Score = error rate, conditional risk, etc.
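The S-fold procedure of Fig. 11, applied to choosing K for the K-NN classifier, can be sketched as below. This is an assumed minimal implementation: the two synthetic clusters, S = 4, and the candidate set {1, 3, 5} are all illustrative choices, and the error rate is used as the score.

```python
import numpy as np

def knn_classify(x, X_train, y_train, K):
    # majority class among the K nearest training points
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:K]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

def cross_validate_knn(X, y, K_values, S=4, seed=0):
    # split into S folds, train on S-1, score on the held-out fold, average
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, S)
    scores = {}                                    # K -> averaged error rate
    for K in K_values:
        errs = []
        for s in range(S):
            test = folds[s]
            train = np.concatenate([folds[t] for t in range(S) if t != s])
            preds = np.array([knn_classify(X[i], X[train], y[train], K)
                              for i in test])
            errs.append(np.mean(preds != y[test]))
        scores[K] = float(np.mean(errs))
    best_K = min(scores, key=scores.get)           # lowest averaged error
    return best_K, scores

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], 0.5, (20, 2)),   # class 0 cluster
               rng.normal([5, 5], 0.5, (20, 2))])  # class 1 cluster
y = np.repeat([0, 1], 20)
best_K, scores = cross_validate_knn(X, y, K_values=[1, 3, 5])
```

Because every point is used once for testing and S-1 times for training, the averaged score is a far less optimistic estimate of performance than scoring on the training data itself.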
  • 25. References:
[1] R. O. Duda, P. E. Hart, and D. G. Stork, "Pattern Classification", 2nd ed., John Wiley & Sons, 2004.
[2] C. M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
[3] The data files of Bishop's book are available at http://research.microsoft.com/~cmbishop/PRML