2. 1-Nearest Neighbor
One of the simplest of all machine learning classifiers
Simple idea: label a new point the same as the closest known point
(In the figure: the new point is labeled red, matching its nearest neighbor.)
3. 1-NN's Aspects as an Instance-Based Learner:
A distance metric
– Euclidean
– When different units are used for each dimension
  → normalize each dimension by its standard deviation
– For discrete data, can use Hamming distance
  → D(x1, x2) = number of features on which x1 and x2 differ
– Others (e.g., normal, cosine)
How many nearby neighbors to look at?
– One
How to fit with the local points?
– Just predict the same output as the nearest neighbor.
Adapted from "Instance-Based Learning" lecture slides by Andrew Moore, CMU.
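
A minimal sketch of the pieces above in Python (names are illustrative, not from the slides): Euclidean and Hamming distances, per-dimension normalization by standard deviation, and the 1-NN prediction rule.

import math

def euclidean(a, b):
    # Straight-line distance between two numeric feature vectors.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def hamming(a, b):
    # D(x1, x2) = number of features on which the two examples differ.
    return sum(ai != bi for ai, bi in zip(a, b))

def normalize(points):
    # Divide each dimension by its standard deviation so units are comparable.
    dims = len(points[0])
    per_dim = []
    for d in range(dims):
        vals = [p[d] for p in points]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
        per_dim.append([v / std for v in vals])
    return [list(row) for row in zip(*per_dim)]  # back to one row per point

def nn1(train_x, train_y, query):
    # 1-NN: predict the label of the single closest training point.
    dists = [euclidean(x, query) for x in train_x]
    return train_y[dists.index(min(dists))]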
4. k-Nearest Neighbor
Generalizes 1-NN to smooth away noise in the labels
A new point is now assigned the most frequent label of its k nearest neighbors.
(In the figure: the point is labeled red when k = 3, but blue when k = 7.)
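
The rule on this slide fits in a few lines; a sketch, assuming any distance function dist (Euclidean, Hamming, ...):

from collections import Counter

def knn_predict(train_x, train_y, query, k, dist):
    # Rank training points by distance to the query, take the k closest,
    # and return the most frequent label among them.
    ranked = sorted(zip(train_x, train_y), key=lambda xy: dist(xy[0], query))
    return Counter(y for _, y in ranked[:k]).most_common(1)[0][0]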
5. KNN Example
New examples:
– Example 1: (great, no, no, normal, no)
– Example 2: (mediocre, yes, no, normal, no)

Training data (number of possible values per attribute in parentheses):

   Food (3)   Chat (2)   Fast (2)   Price (3)   Bar (2)   BigTip
1  great      yes        yes        normal      no        yes
2  great      no         yes        normal      no        yes
3  mediocre   yes        no         high        no        no
4  great      yes        yes        normal      yes       yes

Similarity metric: number of matching attributes (k = 2)

Example 1:
→ Most similar: number 2 (1 mismatch, 4 matches) → yes
→ Second most similar: number 1 (2 mismatches, 3 matches) → yes
→ Prediction: Yes

Example 2:
→ Most similar: number 3 (1 mismatch, 4 matches) → no
→ Second most similar: number 1 (2 mismatches, 3 matches) → yes
→ Prediction: Yes/No (tie)
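
The table can be replayed in code: counting matching attributes is the same as ranking by Hamming distance (fewest mismatches first). A sketch that reproduces both outcomes above:

train_x = [("great",    "yes", "yes", "normal", "no"),
           ("great",    "no",  "yes", "normal", "no"),
           ("mediocre", "yes", "no",  "high",   "no"),
           ("great",    "yes", "yes", "normal", "yes")]
train_y = ["yes", "yes", "no", "yes"]

def mismatches(a, b):
    # Hamming distance: number of attributes on which a and b differ.
    return sum(ai != bi for ai, bi in zip(a, b))

for query in [("great",    "no",  "no", "normal", "no"),    # example 1
              ("mediocre", "yes", "no", "normal", "no")]:   # example 2
    ranked = sorted(range(len(train_x)), key=lambda i: mismatches(train_x[i], query))
    print([train_y[i] for i in ranked[:2]])   # k = 2 nearest labels
# prints ['yes', 'yes'] for example 1 and ['no', 'yes'] (a tie) for example 2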
6. Selecting the Number of Neighbors
Increase k:
– Makes KNN less sensitive to noise
Decrease k:
– Allows capturing finer structure of the space
→ Pick k not too large, but not too small (depends on the data)
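
The slide leaves the choice of k open; one standard recipe (an assumption here, not from the slide) is leave-one-out cross-validation. A sketch, reusing knn_predict from the k-Nearest Neighbor slide:

def choose_k(train_x, train_y, candidate_ks, dist):
    # Leave-one-out cross-validation: classify each training point from
    # all the others and keep the k that makes the fewest mistakes.
    best_k, best_errors = None, None
    for k in candidate_ks:
        errors = sum(
            knn_predict(train_x[:i] + train_x[i+1:],
                        train_y[:i] + train_y[i+1:],
                        train_x[i], k, dist) != train_y[i]
            for i in range(len(train_x)))
        if best_errors is None or errors < best_errors:
            best_k, best_errors = k, errors
    return best_k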
7. Curse of Dimensionality
Prediction accuracy can quickly degrade when the number of attributes grows.
– Irrelevant attributes easily "swamp" information from relevant attributes
– With many irrelevant attributes, the similarity/distance measure becomes less reliable
Remedies:
– Try to remove irrelevant attributes in a pre-processing step
– Weight attributes differently (see the sketch below)
– Increase k (but not too much)
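
The second remedy can be a per-dimension weight inside the distance itself; the weights below are illustrative (a weight of 0 removes an attribute entirely):

import math

def weighted_euclidean(a, b, weights):
    # Down-weight dimensions believed to carry little information.
    return math.sqrt(sum(w * (ai - bi) ** 2 for ai, bi, w in zip(a, b, weights)))

# e.g. four attributes where the last two are suspected to be irrelevant:
dist = lambda a, b: weighted_euclidean(a, b, [1.0, 1.0, 0.1, 0.0])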
8. Advantages and Disadvantages of KNN
Disadvantages:
– Needs a distance/similarity measure and attributes that "match" the target function.
– For large training sets, must make a pass through the entire dataset for each classification; this can be prohibitive.
– Prediction accuracy can quickly degrade when the number of attributes grows.
Advantages:
– Simple to implement;
– Requires little tuning;
– Often performs quite well!
(Try it first on a new learning problem.)
9. Example
Query: x = (Maths = 6, CS = 8), k = 3
Euclidean distance: d(x, a) = sqrt((x1 - a1)^2 + (x2 - a2)^2)
   Maths   CS   Result
1  4       3    F
2  6       7    P
3  7       8    P
4  5       5    F
5  8       8    P
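
Working it through: the distances from x to the five examples are sqrt(29) ≈ 5.39, 1, 1, sqrt(10) ≈ 3.16, and 2, so the k = 3 nearest are examples 2, 3, and 5, all labeled P, and the query is classified P. A short sketch that reproduces this:

import math

train = [((4, 3), "F"), ((6, 7), "P"), ((7, 8), "P"), ((5, 5), "F"), ((8, 8), "P")]
query = (6, 8)   # Maths = 6, CS = 8

ranked = sorted(train, key=lambda e: math.dist(e[0], query))
labels = [y for _, y in ranked[:3]]                      # k = 3 nearest labels
print(labels, "->", max(set(labels), key=labels.count))  # ['P', 'P', 'P'] -> P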