SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Machine Learning in R
Suja A. Alex,
Assistant Professor,
Dept. of Information Technology,
St.Xavier’s Catholic College of Engineering
Data Science
• Multidisciplinary field
• Data Science is the science
which uses computer science,
statistics and machine learning,
visualization to collect, clean,
integrate, analyze, visualize,
interact with data to create data
products.
• Data science principles apply to
all data – big and small
2
5 Vs of Big Data:
• Raw Data: Volume
• Change over time: Velocity
• Data types: Variety
• Data Quality: Veracity
• Information for Decision Making: Value
3
Machine Learning Algorithms:
4
Input: Datasets in R
• https://vincentarelbundock.github.io/Rdatasets/datasets.html
• http://archive.ics.uci.edu/ml/datasets.php
• https://www.kaggle.com/datasets
5
Output: Data Visualization Packages in R
• graphics - plot(), barplot(), boxplot()
• ggplot2 - Scatterplot
• lattice - tiled plots
• plotly - Line plot, Time series chart, interactive 3D plots
6
1. Cluster Analysis
• Finding groups of objects
• Objects in a group will be similar (or related) to one another and
different from (or unrelated to) the objects in other groups.
7
Minimize Similarity between Clusters
8
Taxonomy of Clustering Algorithms
9
K-means clustering - Example
Data: S={2,3,4,10,11,12,20,25,30}
If we choose K=2
Find first set of Means (choose randomly):
M1=4, M2=12
Assign elements to two clusters K1 and K2:
K1={2,3,4} K2={10,11,12,20,25,30}
Find second set of Means:
M1=(2+3+4)/3=3 M2=(108/6)=18
Re-assign elements to two clusters K1 and K2:
K1={2,3,4,10} K2={11,12,20,25,30}
Now M1=(19/4)=4.75 = 5 M2=19.6 = 20
K1={2,3,4,10,11,12} K2={20,25,30}
M1=7 M2=25
K1={2,3,4,10,11,12} K2={20,25,30}
M1=7 M2=25
If we get same means, the k-means algorithm stops…We got final two clusters…
10
K-means clustering
• Simple unsupervised machine learning algorithm
• Partitional clustering approach
• Each cluster is associated with a centroid or mean (center point)
• Each point is assigned to the cluster with the closest centroid.
• Number of clusters K must be specified.
K-means Algorithm:
11
Clustering packages in R
1. Cluster
2. ClusterR
3. NbClust
Function for k-means in R:
kmeans(x, centers, nstart)
where x  numeric dataset (matrix or data frame)
centers  number of clusters to extract
nstart  generate number of initial configurations
12
K-means-clustering in R
// Before Clustering
// Explore data
library(datasets)
head(iris)
library(ggplot2)
ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) + geom_point()
// After K-means Clustering
set.seed(20)
irisCluster <- kmeans(iris[, 3:4], 3)
irisCluster
irisCluster$cluster <- as.factor(irisCluster$cluster)
ggplot(iris, aes(Petal.Length, Petal.Width, color = irisCluster$cluster)) + geom_point()
13
2. Classification
• Categorize our data into a desired and distinct number of classes.
14
Example:
15
Classification Algorithm
• Decision Tree
• Bayes Classifier
• Nearest Neighbor
• Support Vector Machines
• Naive Linear Classifiers (or Logistic Regression)
16
1. Decision Tree
17
Example Decision Tree:
18
Example
19
KNN classification:
• Supervised learning algorithm
• Lazy learning algorithm.
• Based on similarity measure (distance function)
Steps for KNN:
1. Calculate distance (e.g. Euclidean distance, Hamming distance, etc.)
2. Find k closest neighbors
3. Vote for labels or calculate the mean
20
Example:
21
KNN classification in R
df <- data(iris) ##load data
head(iris) ## see the stucture
##Generate a random number that is 90% of the total number of rows in dataset.
ran <- sample(1:nrow(iris), 0.9 * nrow(iris))
##the normalization function is created
nor <-function(x) { (x -min(x))/(max(x)-min(x)) }
##Run nomalization on first 4 coulumns of dataset because they are the predictors
iris_norm <- as.data.frame(lapply(iris[,c(1,2,3,4)], nor))
summary(iris_norm)
##extract training set
iris_train <- iris_norm[ran,]
##extract testing set
iris_test <- iris_norm[-ran,]
##extract 5th column of train dataset because it will be used as 'cl' argument in knn function.
iris_target_category <- iris[ran,5]
##extract 5th column if test dataset to measure the accuracy
iris_test_category <- iris[-ran,5]
##load the package class
library(class)
##run knn function
pr <- knn(iris_train,iris_test,cl=iris_target_category,k=13)
##create confusion matrix
tab <- table(pr,iris_test_category)
##this function divides the correct predictions by total number of predictions that tell us how accurate teh
model is.
accuracy <- function(x){sum(diag(x)/(sum(rowSums(x)))) * 100}
accuracy(tab)
22
3. Regression Analysis
1. Linear Regression:
• linear relationship between the input variable (x) and the soutput variable (y).
• Fitting a straight line to data.
2. Multiple linear regression:
When there are multiple input variables, literature from statistics often refers to
the method as multiple linear regression.
23
Simple Linear Regression:
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
print(relation)
summary(relation)
24
Multiple Linear Regression:
input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))
// Create Relationship Model & get the Coefficients
input <- mtcars[,c("mpg","disp","hp","wt")]
# Create the relationship model.
model <- lm(mpg~disp+hp+wt, data = input)
# Show the model.
print(model)
# Get the Intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ","n")
a <- coef(model)[1]
print(a)
Xdisp <- coef(model)[2]
Xhp <- coef(model)[3]
Xwt <- coef(model)[4]
print(Xdisp)
print(Xhp)
print(Xwt)
25

Weitere ähnliche Inhalte

Was ist angesagt?

Bca ii dfs u-1 introduction to data structure
Bca ii dfs u-1 introduction to data structureBca ii dfs u-1 introduction to data structure
Bca ii dfs u-1 introduction to data structureRai University
 
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler..."Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...Dataconomy Media
 
Introduction To R Language
Introduction To R LanguageIntroduction To R Language
Introduction To R LanguageGaurang Dobariya
 
Elementary data structure
Elementary data structureElementary data structure
Elementary data structureBiswajit Mandal
 
Tech Talk - JPA and Query Optimization - publish
Tech Talk  -  JPA and Query Optimization - publishTech Talk  -  JPA and Query Optimization - publish
Tech Talk - JPA and Query Optimization - publishGleydson Lima
 
Photon Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think VectorizedPhoton Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think VectorizedDatabricks
 
Bsc cs ii dfs u-1 introduction to data structure
Bsc cs ii dfs u-1 introduction to data structureBsc cs ii dfs u-1 introduction to data structure
Bsc cs ii dfs u-1 introduction to data structureRai University
 
Presentation on Heap Sort
Presentation on Heap Sort Presentation on Heap Sort
Presentation on Heap Sort Amit Kundu
 
ArrayList in JAVA
ArrayList in JAVAArrayList in JAVA
ArrayList in JAVASAGARDAVE29
 
Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School Flink Forward
 
Data Structure
Data StructureData Structure
Data Structuresheraz1
 
Python data structures - best in class for data analysis
Python data structures -   best in class for data analysisPython data structures -   best in class for data analysis
Python data structures - best in class for data analysisRajesh M
 
Java ArrayList Tutorial | Edureka
Java ArrayList Tutorial | EdurekaJava ArrayList Tutorial | Edureka
Java ArrayList Tutorial | EdurekaEdureka!
 
"Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ..."Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ...Dataconomy Media
 
Advanced data structures vol. 1
Advanced data structures   vol. 1Advanced data structures   vol. 1
Advanced data structures vol. 1Christalin Nelson
 

Was ist angesagt? (20)

Bca ii dfs u-1 introduction to data structure
Bca ii dfs u-1 introduction to data structureBca ii dfs u-1 introduction to data structure
Bca ii dfs u-1 introduction to data structure
 
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler..."Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
 
Introduction To R Language
Introduction To R LanguageIntroduction To R Language
Introduction To R Language
 
Elementary data structure
Elementary data structureElementary data structure
Elementary data structure
 
Tech Talk - JPA and Query Optimization - publish
Tech Talk  -  JPA and Query Optimization - publishTech Talk  -  JPA and Query Optimization - publish
Tech Talk - JPA and Query Optimization - publish
 
Photon Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think VectorizedPhoton Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think Vectorized
 
Bsc cs ii dfs u-1 introduction to data structure
Bsc cs ii dfs u-1 introduction to data structureBsc cs ii dfs u-1 introduction to data structure
Bsc cs ii dfs u-1 introduction to data structure
 
arrays
arraysarrays
arrays
 
Data structures
Data structuresData structures
Data structures
 
Presentation on Heap Sort
Presentation on Heap Sort Presentation on Heap Sort
Presentation on Heap Sort
 
ArrayList in JAVA
ArrayList in JAVAArrayList in JAVA
ArrayList in JAVA
 
Stack Data structure
Stack Data structureStack Data structure
Stack Data structure
 
Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School Vasia Kalavri – Training: Gelly School
Vasia Kalavri – Training: Gelly School
 
Data Structure
Data StructureData Structure
Data Structure
 
Hadoop
HadoopHadoop
Hadoop
 
Python data structures - best in class for data analysis
Python data structures -   best in class for data analysisPython data structures -   best in class for data analysis
Python data structures - best in class for data analysis
 
Java ArrayList Tutorial | Edureka
Java ArrayList Tutorial | EdurekaJava ArrayList Tutorial | Edureka
Java ArrayList Tutorial | Edureka
 
"Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ..."Machine Learning and Internet of Things, the future of medical prevention", ...
"Machine Learning and Internet of Things, the future of medical prevention", ...
 
Heaps
HeapsHeaps
Heaps
 
Advanced data structures vol. 1
Advanced data structures   vol. 1Advanced data structures   vol. 1
Advanced data structures vol. 1
 

Ähnlich wie Machine Learning in R

An Introduction to Data Mining with R
An Introduction to Data Mining with RAn Introduction to Data Mining with R
An Introduction to Data Mining with RYanchang Zhao
 
RDataMining slides-clustering-with-r
RDataMining slides-clustering-with-rRDataMining slides-clustering-with-r
RDataMining slides-clustering-with-rYanchang Zhao
 
Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Daniel Chan
 
R language tutorial
R language tutorialR language tutorial
R language tutorialDavid Chiu
 
R Introduction
R IntroductionR Introduction
R IntroductionSangeetha S
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine LearningAmanBhalla14
 
ClusterAnalysis
ClusterAnalysisClusterAnalysis
ClusterAnalysisAnbarasan S
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Yao Yao
 
Introduction to R for data science
Introduction to R for data scienceIntroduction to R for data science
Introduction to R for data scienceLong Nguyen
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache SparkCloudera, Inc.
 
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...KamleshKumar394
 
Document clustering for forensic analysis an approach for improving compute...
Document clustering for forensic   analysis an approach for improving compute...Document clustering for forensic   analysis an approach for improving compute...
Document clustering for forensic analysis an approach for improving compute...Madan Golla
 
Clustering and Visualisation using R programming
Clustering and Visualisation using R programmingClustering and Visualisation using R programming
Clustering and Visualisation using R programmingNixon Mendez
 
Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016Mark Smith
 
TAO Fayan_Report on Top 10 data mining algorithms applications with R
TAO Fayan_Report on Top 10 data mining algorithms applications with RTAO Fayan_Report on Top 10 data mining algorithms applications with R
TAO Fayan_Report on Top 10 data mining algorithms applications with RFayan TAO
 
Data science with R - Clustering and Classification
Data science with R - Clustering and ClassificationData science with R - Clustering and Classification
Data science with R - Clustering and ClassificationBrigitte Mueller
 
ensembles_emptytemplate_v2
ensembles_emptytemplate_v2ensembles_emptytemplate_v2
ensembles_emptytemplate_v2Shrayes Ramesh
 

Ähnlich wie Machine Learning in R (20)

An Introduction to Data Mining with R
An Introduction to Data Mining with RAn Introduction to Data Mining with R
An Introduction to Data Mining with R
 
RDataMining slides-clustering-with-r
RDataMining slides-clustering-with-rRDataMining slides-clustering-with-r
RDataMining slides-clustering-with-r
 
Decision Tree.pptx
Decision Tree.pptxDecision Tree.pptx
Decision Tree.pptx
 
Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)Machine Learning: Classification Concepts (Part 1)
Machine Learning: Classification Concepts (Part 1)
 
R language tutorial
R language tutorialR language tutorial
R language tutorial
 
R Introduction
R IntroductionR Introduction
R Introduction
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
 
ClusterAnalysis
ClusterAnalysisClusterAnalysis
ClusterAnalysis
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
 
Introduction to R for data science
Introduction to R for data scienceIntroduction to R for data science
Introduction to R for data science
 
Clustering.pptx
Clustering.pptxClustering.pptx
Clustering.pptx
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache Spark
 
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
A Comprehensive Study of Clustering Algorithms for Big Data Mining with MapRe...
 
Document clustering for forensic analysis an approach for improving compute...
Document clustering for forensic   analysis an approach for improving compute...Document clustering for forensic   analysis an approach for improving compute...
Document clustering for forensic analysis an approach for improving compute...
 
Clustering and Visualisation using R programming
Clustering and Visualisation using R programmingClustering and Visualisation using R programming
Clustering and Visualisation using R programming
 
Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016
 
TAO Fayan_Report on Top 10 data mining algorithms applications with R
TAO Fayan_Report on Top 10 data mining algorithms applications with RTAO Fayan_Report on Top 10 data mining algorithms applications with R
TAO Fayan_Report on Top 10 data mining algorithms applications with R
 
Data science with R - Clustering and Classification
Data science with R - Clustering and ClassificationData science with R - Clustering and Classification
Data science with R - Clustering and Classification
 
cluster(python)
cluster(python)cluster(python)
cluster(python)
 
ensembles_emptytemplate_v2
ensembles_emptytemplate_v2ensembles_emptytemplate_v2
ensembles_emptytemplate_v2
 

KĂźrzlich hochgeladen

Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 

KĂźrzlich hochgeladen (20)

Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 

Machine Learning in R

  • 1. Machine Learning in R Suja A. Alex, Assistant Professor, Dept. of Information Technology, St.Xavier’s Catholic College of Engineering
  • 2. Data Science • Multidisciplinary field • Data Science is the science which uses computer science, statistics and machine learning, visualization to collect, clean, integrate, analyze, visualize, interact with data to create data products. • Data science principles apply to all data – big and small 2
  • 3. 5 Vs of Big Data: • Raw Data: Volume • Change over time: Velocity • Data types: Variety • Data Quality: Veracity • Information for Decision Making: Value 3
  • 5. Input: Datasets in R • https://vincentarelbundock.github.io/Rdatasets/datasets.html • http://archive.ics.uci.edu/ml/datasets.php • https://www.kaggle.com/datasets 5
  • 6. Output: Data Visualization Packages in R • graphics - plot(), barplot(), boxplot() • ggplot2 - Scatterplot • lattice - tiled plots • plotly - Line plot, Time series chart, interactive 3D plots 6
  • 7. 1. Cluster Analysis • Finding groups of objects • Objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups. 7
  • 9. Taxonomy of Clustering Algorithms 9
  • 10. K-means clustering - Example Data: S={2,3,4,10,11,12,20,25,30} If we choose K=2 Find first set of Means (choose randomly): M1=4, M2=12 Assign elements to two clusters K1 and K2: K1={2,3,4} K2={10,11,12,20,25,30} Find second set of Means: M1=(2+3+4)/3=3 M2=(108/6)=18 Re-assign elements to two clusters K1 and K2: K1={2,3,4,10} K2={11,12,20,25,30} Now M1=(19/4)=4.75 = 5 M2=19.6 = 20 K1={2,3,4,10,11,12} K2={20,25,30} M1=7 M2=25 K1={2,3,4,10,11,12} K2={20,25,30} M1=7 M2=25 If we get same means, the k-means algorithm stops…We got final two clusters… 10
  • 11. K-means clustering • Simple unsupervised machine learning algorithm • Partitional clustering approach • Each cluster is associated with a centroid or mean (center point) • Each point is assigned to the cluster with the closest centroid. • Number of clusters K must be specified. K-means Algorithm: 11
  • 12. Clustering packages in R 1. Cluster 2. ClusterR 3. NbClust Function for k-means in R: kmeans(x, centers, nstart) where x  numeric dataset (matrix or data frame) centers  number of clusters to extract nstart  generate number of initial configurations 12
  • 13. K-means-clustering in R // Before Clustering // Explore data library(datasets) head(iris) library(ggplot2) ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) + geom_point() // After K-means Clustering set.seed(20) irisCluster <- kmeans(iris[, 3:4], 3) irisCluster irisCluster$cluster <- as.factor(irisCluster$cluster) ggplot(iris, aes(Petal.Length, Petal.Width, color = irisCluster$cluster)) + geom_point() 13
  • 14. 2. Classification • Categorize our data into a desired and distinct number of classes. 14
  • 16. Classification Algorithm • Decision Tree • Bayes Classifier • Nearest Neighbor • Support Vector Machines • Naive Linear Classifiers (or Logistic Regression) 16
  • 20. KNN classification: • Supervised learning algorithm • Lazy learning algorithm. • Based on similarity measure (distance function) Steps for KNN: 1. Calculate distance (e.g. Euclidean distance, Hamming distance, etc.) 2. Find k closest neighbors 3. Vote for labels or calculate the mean 20
  • 22. KNN classification in R df <- data(iris) ##load data head(iris) ## see the stucture ##Generate a random number that is 90% of the total number of rows in dataset. ran <- sample(1:nrow(iris), 0.9 * nrow(iris)) ##the normalization function is created nor <-function(x) { (x -min(x))/(max(x)-min(x)) } ##Run nomalization on first 4 coulumns of dataset because they are the predictors iris_norm <- as.data.frame(lapply(iris[,c(1,2,3,4)], nor)) summary(iris_norm) ##extract training set iris_train <- iris_norm[ran,] ##extract testing set iris_test <- iris_norm[-ran,] ##extract 5th column of train dataset because it will be used as 'cl' argument in knn function. iris_target_category <- iris[ran,5] ##extract 5th column if test dataset to measure the accuracy iris_test_category <- iris[-ran,5] ##load the package class library(class) ##run knn function pr <- knn(iris_train,iris_test,cl=iris_target_category,k=13) ##create confusion matrix tab <- table(pr,iris_test_category) ##this function divides the correct predictions by total number of predictions that tell us how accurate teh model is. accuracy <- function(x){sum(diag(x)/(sum(rowSums(x)))) * 100} accuracy(tab) 22
  • 23. 3. Regression Analysis 1. Linear Regression: • linear relationship between the input variable (x) and the soutput variable (y). • Fitting a straight line to data. 2. Multiple linear regression: When there are multiple input variables, literature from statistics often refers to the method as multiple linear regression. 23
  • 24. Simple Linear Regression: x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131) y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48) # Apply the lm() function. relation <- lm(y~x) print(relation) summary(relation) 24
  • 25. Multiple Linear Regression: input <- mtcars[,c("mpg","disp","hp","wt")] print(head(input)) // Create Relationship Model & get the Coefficients input <- mtcars[,c("mpg","disp","hp","wt")] # Create the relationship model. model <- lm(mpg~disp+hp+wt, data = input) # Show the model. print(model) # Get the Intercept and coefficients as vector elements. cat("# # # # The Coefficient Values # # # ","n") a <- coef(model)[1] print(a) Xdisp <- coef(model)[2] Xhp <- coef(model)[3] Xwt <- coef(model)[4] print(Xdisp) print(Xhp) print(Xwt) 25