knn classification

Akhilesh Joshi
Akhilesh JoshiGraduate Student
K-NEAREST NEIGHBORS
(K-NN)
knn classification
knn classification
PYTHON
READING DATASET DYNAMICALLY
from tkinter import *
from tkinter.filedialog import askopenfilename
root = Tk()
root.withdraw()
root.update()
file_path = askopenfilename()
root.destroy()
IMPORTING LIBRARIES
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
IMPORTING DATASET
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:,2:4]
y= dataset.iloc[:,-1]
SPLITTING THE DATASET INTO THE
TRAINING SET AND TEST SET
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25,
random_state = 0)
FEATURE SCALING
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
FEATURE SELECTION …
Recursive Feature Elimination (RFE) is based on the idea to
repeatedly construct a model and choose either the best or
worst performing feature, setting the feature aside and then
repeating the process with the rest of the features.
This process is applied until all features in the dataset are
exhausted. The goal of RFE is to select features.
FEATURE SELECTION …
from sklearn import datasets
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
rfe = RFE(logreg, 2)
rfe = rfe.fit(X, y )
print(rfe.support_)
FITTING CLASSIFIER TO THE
TRAINING SET
from sklearn.neighbors import KNeighborsClassifier
classifier =
KNeighborsClassifier(n_neighbors=5,metric='minkowski',p=2)
classifier.fit(X_train,y_train)
CONFUSION MATRIX
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
K FOLD
from sklearn import model_selection
from sklearn.model_selection import cross_val_score
kfold = model_selection.KFold(n_splits=10, random_state=7)
modelCV = KNeighborsClassifier()
scoring = 'accuracy'
results = model_selection.cross_val_score(modelCV, X_train, y_train, cv=kfold,
scoring=scoring)
print("10-fold cross validation average accuracy: %.3f" % (results.mean()))
PREDICTION
y_pred = classifier.predict(X_test)
EVALUATING CLASSIFICATION
REPORT
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
R
READ DATASET
library(readr)
dataset <- read_csv("D:/machine learning AZ/Machine Learning A-Z
Template Folder/Part 3 - Classification/Section 14 - Logistic
Regression/Logistic_Regression/Social_Network_Ads.csv")
dataset = dataset[3:5]
ENCODING THE TARGET FEATURE
AS FACTOR
dataset$Purchased = factor(dataset$Purchased, levels = c(0, 1))
SPLITTING THE DATASET INTO THE
TRAINING SET AND TEST SET
# install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$Purchased, SplitRatio = 0.75)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)
FEATURE SCALING
training_set[-3] = scale(training_set[-3])
test_set[-3] = scale(test_set[-3])
FITTING K-NN TO THE TRAINING SET &
PREDICTION
library(class)
y_pred = knn(train = training_set[, c(1,2)],
test = test_set[, c(1,2)],
cl = training_set$Purchased,
k = 5,
prob = TRUE)
CONFUSION MATRIX
cm = table(test_set[, 3], y_pred)
1 von 23

Recomendados

KNN von
KNN KNN
KNN West Virginia University
23.8K views23 Folien
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM von
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHMHEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHMamiteshg
13.3K views15 Folien
K - Nearest neighbor ( KNN ) von
K - Nearest neighbor  ( KNN )K - Nearest neighbor  ( KNN )
K - Nearest neighbor ( KNN )Mohammad Junaid Khan
5K views19 Folien
K nearest neighbor von
K nearest neighborK nearest neighbor
K nearest neighborUjjawal
12.1K views34 Folien
Density Based Clustering von
Density Based ClusteringDensity Based Clustering
Density Based ClusteringSSA KPI
20.2K views30 Folien
DBSCAN (2014_11_25 06_21_12 UTC) von
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)Cory Cook
1.5K views22 Folien

Más contenido relacionado

Was ist angesagt?

Birch Algorithm With Solved Example von
Birch Algorithm With Solved ExampleBirch Algorithm With Solved Example
Birch Algorithm With Solved Examplekailash shaw
3.5K views13 Folien
K mean-clustering von
K mean-clusteringK mean-clustering
K mean-clusteringAfzaal Subhani
605 views25 Folien
Classification Based Machine Learning Algorithms von
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsMd. Main Uddin Rony
9.9K views41 Folien
Dbscan von
DbscanDbscan
DbscanRohitPaul52
99 views12 Folien
Lecture10 - Naïve Bayes von
Lecture10 - Naïve BayesLecture10 - Naïve Bayes
Lecture10 - Naïve BayesAlbert Orriols-Puig
12.9K views18 Folien

Was ist angesagt?(20)

Birch Algorithm With Solved Example von kailash shaw
Birch Algorithm With Solved ExampleBirch Algorithm With Solved Example
Birch Algorithm With Solved Example
kailash shaw3.5K views
Classification Based Machine Learning Algorithms von Md. Main Uddin Rony
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
Md. Main Uddin Rony9.9K views
k Nearest Neighbor von butest
k Nearest Neighbork Nearest Neighbor
k Nearest Neighbor
butest9.5K views
K MEANS CLUSTERING von singh7599
K MEANS CLUSTERINGK MEANS CLUSTERING
K MEANS CLUSTERING
singh7599502 views
Anomaly/Novelty detection with scikit-learn von agramfort
Anomaly/Novelty detection with scikit-learnAnomaly/Novelty detection with scikit-learn
Anomaly/Novelty detection with scikit-learn
agramfort17.2K views
Naive bayes von umeskath
Naive bayesNaive bayes
Naive bayes
umeskath983 views
Curse of dimensionality von Nikhil Sharma
Curse of dimensionalityCurse of dimensionality
Curse of dimensionality
Nikhil Sharma11.2K views
K Nearest Neighbor Presentation von Dessy Amirudin
K Nearest Neighbor PresentationK Nearest Neighbor Presentation
K Nearest Neighbor Presentation
Dessy Amirudin2.8K views
Nonlinear component analysis as a kernel eigenvalue problem von Michele Filannino
Nonlinear component analysis as a kernel eigenvalue problemNonlinear component analysis as a kernel eigenvalue problem
Nonlinear component analysis as a kernel eigenvalue problem
Michele Filannino4.2K views
K- Nearest Neighbor Approach von Kumud Arora
K- Nearest Neighbor ApproachK- Nearest Neighbor Approach
K- Nearest Neighbor Approach
Kumud Arora256 views
Expectation Maximization and Gaussian Mixture Models von petitegeek
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Models
petitegeek34.8K views

Similar a knn classification

svm classification von
svm classificationsvm classification
svm classificationAkhilesh Joshi
849 views43 Folien
logistic regression with python and R von
logistic regression with python and Rlogistic regression with python and R
logistic regression with python and RAkhilesh Joshi
1.1K views27 Folien
X_train y_trainX_test y_testX_valid y_validtrainin.docx von
X_train y_trainX_test y_testX_valid y_validtrainin.docxX_train y_trainX_test y_testX_valid y_validtrainin.docx
X_train y_trainX_test y_testX_valid y_validtrainin.docxtroutmanboris
2 views12 Folien
Grid search (parameter tuning) von
Grid search (parameter tuning)Grid search (parameter tuning)
Grid search (parameter tuning)Akhilesh Joshi
524 views13 Folien
K fold von
K foldK fold
K foldAkhilesh Joshi
225 views10 Folien
simple linear regression von
simple linear regressionsimple linear regression
simple linear regressionAkhilesh Joshi
355 views16 Folien

Similar a knn classification(20)

logistic regression with python and R von Akhilesh Joshi
logistic regression with python and Rlogistic regression with python and R
logistic regression with python and R
Akhilesh Joshi1.1K views
X_train y_trainX_test y_testX_valid y_validtrainin.docx von troutmanboris
X_train y_trainX_test y_testX_valid y_validtrainin.docxX_train y_trainX_test y_testX_valid y_validtrainin.docx
X_train y_trainX_test y_testX_valid y_validtrainin.docx
troutmanboris2 views
Grid search (parameter tuning) von Akhilesh Joshi
Grid search (parameter tuning)Grid search (parameter tuning)
Grid search (parameter tuning)
Akhilesh Joshi524 views
My code from sklearn-datasets import load_diabetes from sklearn-model.pdf von KOCHHARHOSY
My code  from sklearn-datasets import load_diabetes from sklearn-model.pdfMy code  from sklearn-datasets import load_diabetes from sklearn-model.pdf
My code from sklearn-datasets import load_diabetes from sklearn-model.pdf
KOCHHARHOSY7 views
Pydata DC 2018 (Skorch - A Union of Scikit-learn and PyTorch) von Thomas Fan
Pydata DC 2018 (Skorch - A Union of Scikit-learn and PyTorch)Pydata DC 2018 (Skorch - A Union of Scikit-learn and PyTorch)
Pydata DC 2018 (Skorch - A Union of Scikit-learn and PyTorch)
Thomas Fan575 views
Nyc open-data-2015-andvanced-sklearn-expanded von Vivian S. Zhang
Nyc open-data-2015-andvanced-sklearn-expandedNyc open-data-2015-andvanced-sklearn-expanded
Nyc open-data-2015-andvanced-sklearn-expanded
Vivian S. Zhang2.6K views
I am working on this code for my project- but the accuracy is 0-951601.docx von RyanEAcTuckern
I am working on this code for my project- but the accuracy is 0-951601.docxI am working on this code for my project- but the accuracy is 0-951601.docx
I am working on this code for my project- but the accuracy is 0-951601.docx
RyanEAcTuckern8 views
Data preprocessing for Machine Learning with R and Python von Akhilesh Joshi
Data preprocessing for Machine Learning with R and PythonData preprocessing for Machine Learning with R and Python
Data preprocessing for Machine Learning with R and Python
Akhilesh Joshi646 views
Converting Scikit-Learn to PMML von Villu Ruusmann
Converting Scikit-Learn to PMMLConverting Scikit-Learn to PMML
Converting Scikit-Learn to PMML
Villu Ruusmann27.8K views
# Produce the features of a testing data instance X_new = np. arr.pdf von info893569
# Produce the features of a testing data instance X_new = np. arr.pdf# Produce the features of a testing data instance X_new = np. arr.pdf
# Produce the features of a testing data instance X_new = np. arr.pdf
info8935692 views
import numpy as np import pandas as pd import matplotlib.pyplot as pl.pdf von alankritaecobags
 import numpy as np import pandas as pd import matplotlib.pyplot as pl.pdf import numpy as np import pandas as pd import matplotlib.pyplot as pl.pdf
import numpy as np import pandas as pd import matplotlib.pyplot as pl.pdf
alankritaecobags13 views
Machine Learning - Introduction von Empatika
Machine Learning - IntroductionMachine Learning - Introduction
Machine Learning - Introduction
Empatika771 views
Visualizing the Model Selection Process von Benjamin Bengfort
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection Process
Benjamin Bengfort3.2K views

Más de Akhilesh Joshi

PCA and LDA in machine learning von
PCA and LDA in machine learningPCA and LDA in machine learning
PCA and LDA in machine learningAkhilesh Joshi
4K views188 Folien
random forest regression von
random forest regressionrandom forest regression
random forest regressionAkhilesh Joshi
1.6K views14 Folien
decision tree regression von
decision tree regressiondecision tree regression
decision tree regressionAkhilesh Joshi
1.7K views20 Folien
polynomial linear regression von
polynomial linear regressionpolynomial linear regression
polynomial linear regressionAkhilesh Joshi
1.1K views21 Folien
multiple linear regression von
multiple linear regressionmultiple linear regression
multiple linear regressionAkhilesh Joshi
654 views25 Folien
R square vs adjusted r square von
R square vs adjusted r squareR square vs adjusted r square
R square vs adjusted r squareAkhilesh Joshi
4.5K views9 Folien

Más de Akhilesh Joshi(13)

Último

CRM stick or twist workshop von
CRM stick or twist workshopCRM stick or twist workshop
CRM stick or twist workshopinfo828217
11 views16 Folien
Amy slides.pdf von
Amy slides.pdfAmy slides.pdf
Amy slides.pdfStatsCommunications
5 views13 Folien
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an... von
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...StatsCommunications
5 views26 Folien
Data about the sector workshop von
Data about the sector workshopData about the sector workshop
Data about the sector workshopinfo828217
15 views27 Folien
AvizoImageSegmentation.pptx von
AvizoImageSegmentation.pptxAvizoImageSegmentation.pptx
AvizoImageSegmentation.pptxnathanielbutterworth1
6 views14 Folien
Ukraine Infographic_22NOV2023_v2.pdf von
Ukraine Infographic_22NOV2023_v2.pdfUkraine Infographic_22NOV2023_v2.pdf
Ukraine Infographic_22NOV2023_v2.pdfAnastosiyaGurin
1.4K views3 Folien

Último(20)

CRM stick or twist workshop von info828217
CRM stick or twist workshopCRM stick or twist workshop
CRM stick or twist workshop
info82821711 views
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an... von StatsCommunications
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...
OECD-Persol Holdings Workshop on Advancing Employee Well-being in Business an...
Data about the sector workshop von info828217
Data about the sector workshopData about the sector workshop
Data about the sector workshop
info82821715 views
Ukraine Infographic_22NOV2023_v2.pdf von AnastosiyaGurin
Ukraine Infographic_22NOV2023_v2.pdfUkraine Infographic_22NOV2023_v2.pdf
Ukraine Infographic_22NOV2023_v2.pdf
AnastosiyaGurin1.4K views
CRM stick or twist.pptx von info828217
CRM stick or twist.pptxCRM stick or twist.pptx
CRM stick or twist.pptx
info82821711 views
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ... von DataScienceConferenc1
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
Short Story Assignment by Kelly Nguyen von kellynguyen01
Short Story Assignment by Kelly NguyenShort Story Assignment by Kelly Nguyen
Short Story Assignment by Kelly Nguyen
kellynguyen0119 views
CRIJ4385_Death Penalty_F23.pptx von yvettemm100
CRIJ4385_Death Penalty_F23.pptxCRIJ4385_Death Penalty_F23.pptx
CRIJ4385_Death Penalty_F23.pptx
yvettemm1007 views
[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ... von DataScienceConferenc1
[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ...[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ...
[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P... von DataScienceConferenc1
[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...[DSC Europe 23][AI:CSI]  Dragan Pleskonjic - AI Impact on Cybersecurity and P...
[DSC Europe 23][AI:CSI] Dragan Pleskonjic - AI Impact on Cybersecurity and P...
Chapter 3b- Process Communication (1) (1)(1) (1).pptx von ayeshabaig2004
Chapter 3b- Process Communication (1) (1)(1) (1).pptxChapter 3b- Process Communication (1) (1)(1) (1).pptx
Chapter 3b- Process Communication (1) (1)(1) (1).pptx
ayeshabaig20047 views
Cross-network in Google Analytics 4.pdf von GA4 Tutorials
Cross-network in Google Analytics 4.pdfCross-network in Google Analytics 4.pdf
Cross-network in Google Analytics 4.pdf
GA4 Tutorials6 views

knn classification

  • 5. READING DATASET DYNAMICALLY from tkinter import * from tkinter.filedialog import askopenfilename root = Tk() root.withdraw() root.update() file_path = askopenfilename() root.destroy()
  • 6. IMPORTING LIBRARIES import pandas as pd import numpy as np import matplotlib.pyplot as plt
  • 7. IMPORTING DATASET dataset = pd.read_csv('Social_Network_Ads.csv') X = dataset.iloc[:,2:4] y= dataset.iloc[:,-1]
  • 8. SPLITTING THE DATASET INTO THE TRAINING SET AND TEST SET from sklearn.cross_validation import train_test_split X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
  • 9. FEATURE SCALING from sklearn.preprocessing import StandardScaler sc = StandardScaler() X_train = sc.fit_transform(X_train) X_test = sc.transform(X_test)
  • 10. FEATURE SELECTION … Recursive Feature Elimination (RFE) is based on the idea to repeatedly construct a model and choose either the best or worst performing feature, setting the feature aside and then repeating the process with the rest of the features. This process is applied until all features in the dataset are exhausted. The goal of RFE is to select features.
  • 11. FEATURE SELECTION … from sklearn import datasets from sklearn.feature_selection import RFE from sklearn.linear_model import LogisticRegression logreg = LogisticRegression() rfe = RFE(logreg, 2) rfe = rfe.fit(X, y ) print(rfe.support_)
  • 12. FITTING CLASSIFIER TO THE TRAINING SET from sklearn.neighbors import KNeighborsClassifier classifier = KNeighborsClassifier(n_neighbors=5,metric='minkowski',p=2) classifier.fit(X_train,y_train)
  • 13. CONFUSION MATRIX from sklearn.metrics import confusion_matrix cm = confusion_matrix(y_test, y_pred)
  • 14. K FOLD from sklearn import model_selection from sklearn.model_selection import cross_val_score kfold = model_selection.KFold(n_splits=10, random_state=7) modelCV = KNeighborsClassifier() scoring = 'accuracy' results = model_selection.cross_val_score(modelCV, X_train, y_train, cv=kfold, scoring=scoring) print("10-fold cross validation average accuracy: %.3f" % (results.mean()))
  • 16. EVALUATING CLASSIFICATION REPORT from sklearn.metrics import classification_report print(classification_report(y_test, y_pred))
  • 17. R
  • 18. READ DATASET library(readr) dataset <- read_csv("D:/machine learning AZ/Machine Learning A-Z Template Folder/Part 3 - Classification/Section 14 - Logistic Regression/Logistic_Regression/Social_Network_Ads.csv") dataset = dataset[3:5]
  • 19. ENCODING THE TARGET FEATURE AS FACTOR dataset$Purchased = factor(dataset$Purchased, levels = c(0, 1))
  • 20. SPLITTING THE DATASET INTO THE TRAINING SET AND TEST SET # install.packages('caTools') library(caTools) set.seed(123) split = sample.split(dataset$Purchased, SplitRatio = 0.75) training_set = subset(dataset, split == TRUE) test_set = subset(dataset, split == FALSE)
  • 21. FEATURE SCALING training_set[-3] = scale(training_set[-3]) test_set[-3] = scale(test_set[-3])
  • 22. FITTING K-NN TO THE TRAINING SET & PREDICTION library(class) y_pred = knn(train = training_set[, c(1,2)], test = test_set[, c(1,2)], cl = training_set$Purchased, k = 5, prob = TRUE)
  • 23. CONFUSION MATRIX cm = table(test_set[, 3], y_pred)