SlideShare ist ein Scribd-Unternehmen logo
1 von 20
Downloaden Sie, um offline zu lesen
DECISION TREE
REGRESSION
Akhilesh Joshi
decision tree regression
decision tree regression
The value in
green box
represents the
average of data
points in that
split
decision tree regression
decision tree regression
PYTHON
READING FILE DYNAMICALLY
from tkinter import *
from tkinter.filedialog import askopenfilename
root = Tk()
root.withdraw()
root.update()
file_path = askopenfilename()
root.destroy()
IMPORTING LIBRARIES
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
IMPORTING DATASET
dataset = pd.read_csv(file_path)
X= dataset.iloc[:,1:2].values
y= dataset.iloc[:,2:3].values
DECISION TREE REGRESSOR
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(random_state=42)
model = regressor.fit(X,y)
PREDICTION
model.predict(6.5)
SIMPLE PLOT
plt.scatter(X,y,color="red")
plt.plot(X,model.predict(X),color="blue")
plt.title('Truth or Bluff (Decision Tree
Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
NOTE : Whats wrong here ? Well in the simple plot the Decision Tree Regressor model is treated as a c
But it is not a continuous model. Decision Tree Regressor is a discrete model hence it should be treate
FIX : plotting the same graph with grid with small step size say 0.01 will help us visualize better
UPDATED PLOT
X_grid = np.arange(min(X), max(X), 0.001)
X_grid = X_grid.reshape((len(X_grid), 1))
plt.scatter(X, y, color = 'red')
plt.plot(X_grid, regressor.predict(X_grid),
color = 'blue')
plt.title('Truth or Bluff (Decision Tree
Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
Note : Here the graph that is plotted gives us the clear discrete structure
R
READ DATASET
library(readr)
dataset <- read_csv("D:/machine learning AZ/Machine Learning A-Z
Template Folder/Part 2 - Regression/Section 7 - Support Vector
Regression (SVR)/SVR/Position_Salaries.csv")
dataset= dataset[2:3]
LIBRARY REQUIRED - RPART
library('rpart')
regressor = rpart(Salary ~ . , data= dataset , control =
rpart.control(minsplit = 1))
NOTE : do not forget to include control parameter as it decides the
number of splits in your model.
PLOT
# Visualising the Regression Model results
# install.packages('ggplot2')
library(ggplot2)
ggplot() +
geom_point(aes(x = dataset$Level, y = dataset$Salary),
colour = 'red') +
geom_line(aes(x = dataset$Level, y = predict(regressor, newdata =
dataset)),
colour = 'blue') +
ggtitle('Truth or Bluff (Regression Model)') +
xlab('Level') +
ylab('Salary')
Here model is not treated as
discrete hence plot simply joins
the prediction points since we
don’t have any predictions for
this interval.
Solution : plot Level as grid with
step size as small as 0.01 or
whatever you want it to be
SMOOTHER PLOT
# install.packages('ggplot2')
library(ggplot2)
x_grid = seq(min(dataset$Level), max(dataset$Level), 0.001)
ggplot() +
geom_point(aes(x = dataset$Level, y = dataset$Salary),
colour = 'red') +
geom_line(aes(x = x_grid, y = predict(regressor, newdata =
data.frame(Level = x_grid))),
colour = 'blue') +
ggtitle('Truth or Bluff (Regression Model)') +
xlab('Level') +
ylab('Salary')
PERFECT PLOT
PREDICTIONS
prediction = predict(regressor,data.frame(Level=6.5))

Weitere ähnliche Inhalte

Was ist angesagt?

support vector regression
support vector regressionsupport vector regression
support vector regressionAkhilesh Joshi
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CARTXueping Peng
 
ML - Multiple Linear Regression
ML - Multiple Linear RegressionML - Multiple Linear Regression
ML - Multiple Linear RegressionAndrew Ferlitsch
 
Naive bayesian classification
Naive bayesian classificationNaive bayesian classification
Naive bayesian classificationDr-Dipali Meher
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)Sharayu Patil
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentationAyanaRukasar
 
Linear regression
Linear regressionLinear regression
Linear regressionMartinHogg9
 
Decision trees in Machine Learning
Decision trees in Machine Learning Decision trees in Machine Learning
Decision trees in Machine Learning Mohammad Junaid Khan
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Derek Kane
 
Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...
Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...
Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...Edureka!
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision treesKnoldus Inc.
 
Machine Learning With Logistic Regression
Machine Learning  With Logistic RegressionMachine Learning  With Logistic Regression
Machine Learning With Logistic RegressionKnoldus Inc.
 

Was ist angesagt? (20)

Principal Component Analysis
Principal Component AnalysisPrincipal Component Analysis
Principal Component Analysis
 
K - Nearest neighbor ( KNN )
K - Nearest neighbor  ( KNN )K - Nearest neighbor  ( KNN )
K - Nearest neighbor ( KNN )
 
Random forest
Random forestRandom forest
Random forest
 
support vector regression
support vector regressionsupport vector regression
support vector regression
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CART
 
ML - Multiple Linear Regression
ML - Multiple Linear RegressionML - Multiple Linear Regression
ML - Multiple Linear Regression
 
Naive bayesian classification
Naive bayesian classificationNaive bayesian classification
Naive bayesian classification
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)
 
Decision tree
Decision treeDecision tree
Decision tree
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentation
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Decision trees in Machine Learning
Decision trees in Machine Learning Decision trees in Machine Learning
Decision trees in Machine Learning
 
Presentation on K-Means Clustering
Presentation on K-Means ClusteringPresentation on K-Means Clustering
Presentation on K-Means Clustering
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...
Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...
Decision Tree Algorithm | Decision Tree in Python | Machine Learning Algorith...
 
Decision Tree Learning
Decision Tree LearningDecision Tree Learning
Decision Tree Learning
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 
Machine Learning With Logistic Regression
Machine Learning  With Logistic RegressionMachine Learning  With Logistic Regression
Machine Learning With Logistic Regression
 

Ähnlich wie decision tree regression

polynomial linear regression
polynomial linear regressionpolynomial linear regression
polynomial linear regressionAkhilesh Joshi
 
logistic regression with python and R
logistic regression with python and Rlogistic regression with python and R
logistic regression with python and RAkhilesh Joshi
 
R Cheat Sheet for Data Analysts and Statisticians.pdf
R Cheat Sheet for Data Analysts and Statisticians.pdfR Cheat Sheet for Data Analysts and Statisticians.pdf
R Cheat Sheet for Data Analysts and Statisticians.pdfTimothy McBush Hiele
 
Linear Logistic regession_Practical.pptx
Linear Logistic regession_Practical.pptxLinear Logistic regession_Practical.pptx
Linear Logistic regession_Practical.pptxDr. Amanpreet Kaur
 
random forest regression
random forest regressionrandom forest regression
random forest regressionAkhilesh Joshi
 
Basic R Data Manipulation
Basic R Data ManipulationBasic R Data Manipulation
Basic R Data ManipulationChu An
 
maXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VIImaXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VIIMax Kleiner
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptAnishaJ7
 
ggtimeseries-->ggplot2 extensions
ggtimeseries-->ggplot2 extensions ggtimeseries-->ggplot2 extensions
ggtimeseries-->ggplot2 extensions Dr. Volkan OBAN
 
The input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docx
The input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docxThe input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docx
The input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docxGordonB0fPaigey
 
Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...
Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...
Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...The1 Uploader
 
Grid search (parameter tuning)
Grid search (parameter tuning)Grid search (parameter tuning)
Grid search (parameter tuning)Akhilesh Joshi
 

Ähnlich wie decision tree regression (20)

polynomial linear regression
polynomial linear regressionpolynomial linear regression
polynomial linear regression
 
logistic regression with python and R
logistic regression with python and Rlogistic regression with python and R
logistic regression with python and R
 
R Cheat Sheet for Data Analysts and Statisticians.pdf
R Cheat Sheet for Data Analysts and Statisticians.pdfR Cheat Sheet for Data Analysts and Statisticians.pdf
R Cheat Sheet for Data Analysts and Statisticians.pdf
 
Linear Logistic regession_Practical.pptx
Linear Logistic regession_Practical.pptxLinear Logistic regession_Practical.pptx
Linear Logistic regession_Practical.pptx
 
random forest regression
random forest regressionrandom forest regression
random forest regression
 
Basic R Data Manipulation
Basic R Data ManipulationBasic R Data Manipulation
Basic R Data Manipulation
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
maXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VIImaXbox starter69 Machine Learning VII
maXbox starter69 Machine Learning VII
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.ppt
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
 
Xgboost
XgboostXgboost
Xgboost
 
Py lecture5 python plots
Py lecture5 python plotsPy lecture5 python plots
Py lecture5 python plots
 
ggtimeseries-->ggplot2 extensions
ggtimeseries-->ggplot2 extensions ggtimeseries-->ggplot2 extensions
ggtimeseries-->ggplot2 extensions
 
The input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docx
The input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docxThe input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docx
The input cata x1-x2-y can be loaded from fie- -x1_x2_y_circle2-csv- W.docx
 
Seminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mmeSeminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mme
 
Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...
Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...
Pengolahan Data Panel Logit di Stata: Penilaian Goodness of Fit, Uji Model, d...
 
K fold
K foldK fold
K fold
 
Grid search (parameter tuning)
Grid search (parameter tuning)Grid search (parameter tuning)
Grid search (parameter tuning)
 
knn classification
knn classificationknn classification
knn classification
 

Mehr von Akhilesh Joshi

PCA and LDA in machine learning
PCA and LDA in machine learningPCA and LDA in machine learning
PCA and LDA in machine learningAkhilesh Joshi
 
multiple linear regression
multiple linear regressionmultiple linear regression
multiple linear regressionAkhilesh Joshi
 
simple linear regression
simple linear regressionsimple linear regression
simple linear regressionAkhilesh Joshi
 
R square vs adjusted r square
R square vs adjusted r squareR square vs adjusted r square
R square vs adjusted r squareAkhilesh Joshi
 
Data preprocessing for Machine Learning with R and Python
Data preprocessing for Machine Learning with R and PythonData preprocessing for Machine Learning with R and Python
Data preprocessing for Machine Learning with R and PythonAkhilesh Joshi
 
Bastion Host : Amazon Web Services
Bastion Host : Amazon Web ServicesBastion Host : Amazon Web Services
Bastion Host : Amazon Web ServicesAkhilesh Joshi
 
Design patterns in MapReduce
Design patterns in MapReduceDesign patterns in MapReduce
Design patterns in MapReduceAkhilesh Joshi
 
Google knowledge graph
Google knowledge graphGoogle knowledge graph
Google knowledge graphAkhilesh Joshi
 
Machine learning (domingo's paper)
Machine learning (domingo's paper)Machine learning (domingo's paper)
Machine learning (domingo's paper)Akhilesh Joshi
 
SoLoMo - Future of Marketing
SoLoMo - Future of MarketingSoLoMo - Future of Marketing
SoLoMo - Future of MarketingAkhilesh Joshi
 

Mehr von Akhilesh Joshi (13)

PCA and LDA in machine learning
PCA and LDA in machine learningPCA and LDA in machine learning
PCA and LDA in machine learning
 
multiple linear regression
multiple linear regressionmultiple linear regression
multiple linear regression
 
simple linear regression
simple linear regressionsimple linear regression
simple linear regression
 
R square vs adjusted r square
R square vs adjusted r squareR square vs adjusted r square
R square vs adjusted r square
 
svm classification
svm classificationsvm classification
svm classification
 
Data preprocessing for Machine Learning with R and Python
Data preprocessing for Machine Learning with R and PythonData preprocessing for Machine Learning with R and Python
Data preprocessing for Machine Learning with R and Python
 
Design patterns
Design patternsDesign patterns
Design patterns
 
Bastion Host : Amazon Web Services
Bastion Host : Amazon Web ServicesBastion Host : Amazon Web Services
Bastion Host : Amazon Web Services
 
Design patterns in MapReduce
Design patterns in MapReduceDesign patterns in MapReduce
Design patterns in MapReduce
 
Google knowledge graph
Google knowledge graphGoogle knowledge graph
Google knowledge graph
 
Machine learning (domingo's paper)
Machine learning (domingo's paper)Machine learning (domingo's paper)
Machine learning (domingo's paper)
 
SoLoMo - Future of Marketing
SoLoMo - Future of MarketingSoLoMo - Future of Marketing
SoLoMo - Future of Marketing
 
Webcrawler
WebcrawlerWebcrawler
Webcrawler
 

Kürzlich hochgeladen

Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsThinkInnovation
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxFinatron037
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxdhiyaneswaranv1
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 

Kürzlich hochgeladen (16)

Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in Logistics
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptx
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 

decision tree regression

  • 4. The value in green box represents the average of data points in that split
  • 8. READING FILE DYNAMICALLY from tkinter import * from tkinter.filedialog import askopenfilename root = Tk() root.withdraw() root.update() file_path = askopenfilename() root.destroy()
  • 9. IMPORTING LIBRARIES import pandas as pd import numpy as np import matplotlib.pyplot as plt
  • 10. IMPORTING DATASET dataset = pd.read_csv(file_path) X= dataset.iloc[:,1:2].values y= dataset.iloc[:,2:3].values
  • 11. DECISION TREE REGRESSOR from sklearn.tree import DecisionTreeRegressor regressor = DecisionTreeRegressor(random_state=42) model = regressor.fit(X,y)
  • 13. SIMPLE PLOT plt.scatter(X,y,color="red") plt.plot(X,model.predict(X),color="blue") plt.title('Truth or Bluff (Decision Tree Regression)') plt.xlabel('Position level') plt.ylabel('Salary') plt.show() NOTE : Whats wrong here ? Well in the simple plot the Decision Tree Regressor model is treated as a c But it is not a continuous model. Decision Tree Regressor is a discrete model hence it should be treate FIX : plotting the same graph with grid with small step size say 0.01 will help us visualize better
  • 14. UPDATED PLOT X_grid = np.arange(min(X), max(X), 0.001) X_grid = X_grid.reshape((len(X_grid), 1)) plt.scatter(X, y, color = 'red') plt.plot(X_grid, regressor.predict(X_grid), color = 'blue') plt.title('Truth or Bluff (Decision Tree Regression)') plt.xlabel('Position level') plt.ylabel('Salary') plt.show() Note : Here the graph that is plotted gives us the clear discrete structure
  • 15. R
  • 16. READ DATASET library(readr) dataset <- read_csv("D:/machine learning AZ/Machine Learning A-Z Template Folder/Part 2 - Regression/Section 7 - Support Vector Regression (SVR)/SVR/Position_Salaries.csv") dataset= dataset[2:3]
  • 17. LIBRARY REQUIRED - RPART library('rpart') regressor = rpart(Salary ~ . , data= dataset , control = rpart.control(minsplit = 1)) NOTE : do not forget to include control parameter as it decides the number of splits in your model.
  • 18. PLOT # Visualising the Regression Model results # install.packages('ggplot2') library(ggplot2) ggplot() + geom_point(aes(x = dataset$Level, y = dataset$Salary), colour = 'red') + geom_line(aes(x = dataset$Level, y = predict(regressor, newdata = dataset)), colour = 'blue') + ggtitle('Truth or Bluff (Regression Model)') + xlab('Level') + ylab('Salary') Here model is not treated as discrete hence plot simply joins the prediction points since we don’t have any predictions for this interval. Solution : plot Level as grid with step size as small as 0.01 or whatever you want it to be
  • 19. SMOOTHER PLOT # install.packages('ggplot2') library(ggplot2) x_grid = seq(min(dataset$Level), max(dataset$Level), 0.001) ggplot() + geom_point(aes(x = dataset$Level, y = dataset$Salary), colour = 'red') + geom_line(aes(x = x_grid, y = predict(regressor, newdata = data.frame(Level = x_grid))), colour = 'blue') + ggtitle('Truth or Bluff (Regression Model)') + xlab('Level') + ylab('Salary') PERFECT PLOT