What is Machine Learning?
Soumya Mukherjee
Md Shamimuddin
https://www.linkedin.com/in/soumyarmukherjee/
https://www.linkedin.com/in/mdshamimuddin/
Agenda
• Overview of AI and ML
• Terminology awareness
• Applications in real world
• Use cases within Nokia
• Types of Learning
• Regression
• Classification
• Clustering
• Linear Regression Single Variable with Python
Machine Learning Definition
• Arthur Samuel (1959)
Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
• Tom Mitchell (1998)
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Artificial Intelligence Vs Machine Learning Vs Deep Learning
Terminology Awareness
Big Data
Implies huge data volumes that cannot be processed effectively with traditional applications. Big Data processing begins with raw data that is not aggregated, and it is often impossible to store such data in the memory of a single computer.
Data Mining
Is about using statistics as well as other programming methods to find patterns hidden in the data so that you can explain some phenomenon. Machine Learning uses Data Mining techniques and other learning algorithms to build models of what is happening behind some data.
Machine Learning
Is an artificial intelligence technique that is broadly used in Data Mining. ML uses a training dataset to build a model that can predict values of target variables. Data Mining uses the predictive power of Machine Learning by applying various ML algorithms to Big Data.
WHAT IS ARTIFICIAL INTELLIGENCE
• Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent
machines that work and react like humans. Some of the activities computers with artificial intelligence
are designed for include:
• Knowledge gain
• Reasoning
• Problem solving
• Learning
Artificial Intelligence → Machine Learning → Supervised, Unsupervised, Reinforcement
Types of Learning
Supervised Learning
Target/outcome variable to be predicted from a set of predictors is known at the training phase.
E.g. Regression, Decision Tree, Random Forest, KNN
Unsupervised Learning
Target/outcome variable to be predicted from a set of predictors is unknown at the training phase.
E.g. Clustering (K-means), association rule mining (Apriori)
Reinforcement Learning
Machine is trained to take specific decisions. It is exposed to an environment where it trains itself continually using trial and error.
E.g. Markov Decision Process
Applications in real world
• Google search engine
• Self-driving cars
• Facebook auto tagging
• Netflix movie recommendation
• Amazon product recommendation
• Healthcare diagnosis
• Speech recognition
• StackOverflow QA tagging
• Chatbot
Pipeline in solving ML Problem
1. Data as input (text files, spreadsheet, SQL database)
2. Feature Engineering (removing unwanted data, handling missing values, normalization or standardization)
3. Algorithm
4. Output/Model
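The normalization and standardization named under Feature Engineering can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `min_max_normalize` and `standardize` and the sample values are assumptions made for this example, not part of the original slides.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1] (assumes they are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

areas = [2600, 3000, 3200, 3600, 4000]  # illustrative feature values
normalized = min_max_normalize(areas)
standardized = standardize(areas)
print(normalized)
print(standardized)
```

Normalization keeps the original shape of the data but bounds it to [0, 1]; standardization centers it on zero, which is often preferred when features have very different scales.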
Data Exploration/Feature Engineering
1. Variable Identification
• Predictor(s) and Target
• Type and Category of variable
2. Univariate Analysis
• Central tendency
• Measure of Dispersion
• Visualization Method
• Frequency table(categorical)
3. Bivariate Analysis
• Relation between 2 variables
• Correlation
• Chi-square test
• Z-test
4. Missing Value Treatment
• Deletion
• Imputation
• Prediction Model
• KNN Imputation
5. Outlier Handling
Detection
• Handling outliers is very important
• Visualization techniques such as box plot, scatter plot, histogram
• Any value below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is an outlier
Treatment
• Remove
• Scale or Normalize
• Transform
• Impute
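The IQR rule for outlier detection can be tried out with a short sketch. It assumes the usual form of the rule (fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR); the helper name `iqr_outliers` and the sample data are made up for illustration.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k*IQR rule."""
    # quantiles() sorts the data itself; n=4 yields the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

data = [12, 13, 14, 15, 16, 17, 18, 95]  # made-up sample; 95 is a planted outlier
lower, upper, outliers = iqr_outliers(data)
print(lower, upper, outliers)  # 8.5 22.5 [95]
```

Different quartile conventions (for example `method="exclusive"`) give slightly different fences, so the exact cut-offs depend on the tool used.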
SUPERVISED LEARNING
• Supervised learning is used whenever we want to predict a certain outcome from
a given input, and we have examples of input/output pairs.
• We build a machine learning model from these input/output pairs, which
comprise our training set.
• Our goal is to make accurate predictions for new, never-before-seen data.
• Supervised learning often requires human effort to build the training set, but
afterward automates and often speeds up an otherwise laborious or infeasible
task.
TYPES OF SUPERVISED MODEL
• Regression
• Predict a continuous value.
• Classification
• Predict a class label, which is a choice from a predefined list of possibilities.
CLASSIFICATION
• Binary Classification : Distinguishing between exactly two classes
• Multiclass classification : Classification between more than two classes.
Types of regression
1. Simple Linear Regression
Single predictor + single target
y = m*x + c
2. Multiple Linear Regression
Multiple predictors + single target
y = m1*x1 + m2*x2 + c
3. Polynomial Regression
One or many predictors + single target
y = mn*x^n + … + m2*x^2 + m1*x + c
4. Stepwise Regression
Useful in case of multiple predictors
Add or Remove predictors as needed
Forward selection
Backward elimination
5. Lasso Regression
6. Ridge Regression
7. ElasticNet Regression
Simple Linear Regression
• Single predictor and single target
• Y = b0 + b1*X
• Minimizes the sum of squared errors
• Standard packages are already available
• Formula
• Programming example
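The slide's own formula did not survive extraction. Assuming the "Formula" bullet refers to the standard closed-form least-squares estimates for Y = b0 + b1*X, they can be written as follows, for n observations (x_i, y_i) with sample means \bar{x} and \bar{y}:

```latex
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```

These are the values of b0 and b1 that minimize the sum of squared errors between the observed targets and the fitted line.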
Classification
• Type of supervised learning
• Output or target is a categorical outcome
Example
• Mail: spam or no spam
• Weather: rainy, sunny, humid
• Stock price: up or down
Predictor(s) → Algorithm → Categorical Target
Types of Classification
1. K-nearest Neighbor Classifier
2. Logistic Regression
3. Naïve Bayes Classifier
4. Decision Tree Classifier
5. Random Forest Classifier
6. Support Vector Machine
Clustering (Unsupervised learning)
[Figure: Cluster 1, Cluster 2, Cluster 3]
Unsupervised learning
• Unsupervised learning is the training of a machine using information that is neither classified nor labelled.
For instance, given an image containing both dogs and cats that the machine has never seen before, the machine tries to find patterns based on the shape of the head, ears, body structure, etc.
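To make the idea of finding groups without labels concrete, here is a minimal k-means sketch in plain Python. K-means is only one possible clustering method; the points, the starting centroids, and the function name `k_means` are illustrative assumptions.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Unlabelled 2-D points that happen to form two groups (made-up data).
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
# Starting centroids are chosen by hand here; real implementations usually pick them randomly.
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the grouping emerges only from the distances between points.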
Reinforcement Learning
• Reinforcement learning (RL) is an area of machine learning concerned with
how software agents ought to take actions in an environment so as to maximize some
notion of cumulative reward. (source : Wikipedia)
E.g.: you go near a fire and it is warm: positive reinforcement.
You touch the fire and it burns your hand: negative reinforcement → learn not to touch fire.
• Algorithms and frameworks for RL include Monte Carlo methods, Markov decision processes, Q-learning, etc.
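The Q-learning update rule can be sketched on a toy problem. Everything here is an assumption made for illustration: a hypothetical five-state corridor with a reward of 1 for reaching the last state, hand-picked values for the learning rate and discount factor, and systematic sweeps over all state-action pairs in place of real trial-and-error exploration, so that the result is deterministic.

```python
# Tiny deterministic "corridor" world: states 0..4, state 4 is the goal.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)

def step(state, action):
    """Environment model: returns (next_state, reward)."""
    nxt = max(state - 1, 0) if action == "left" else min(state + 1, GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward

# Q-table: estimated cumulative reward for each (state, action) pair.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(100):
    for s in range(GOAL):  # the goal state is terminal, so it is never updated
        for a in ACTIONS:
            nxt, r = step(s, a)
            best_next = 0.0 if nxt == GOAL else max(Q[(nxt, b)] for b in ACTIONS)
            # Q-learning update: move Q(s, a) toward r + gamma * max_b Q(s', b).
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# Greedy policy: in each non-terminal state pick the action with the highest Q-value.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)]
print(policy)  # ['right', 'right', 'right', 'right']
```

The learned policy walks toward the goal because the discounted reward propagates backward from the goal state through the Q-table.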
Tools And Packages
ML in Python:
• Numpy
• Pandas
• Scikit-learn
• Matplotlib
• Seaborn
Non-Programming:
• Weka
• Orange
• RapidMiner
• Qlik Sense
• xls
Deep Learning:
• Tensorflow
• Keras
• PyTorch
• Theano
LINEAR REGRESSION: SINGLE VARIABLE
LINEAR REGRESSION
• Linear regression, or ordinary least squares (OLS), is the simplest and most classic
linear method for regression. Linear regression finds the parameters m and b that
minimize the mean squared error between predictions and the true regression
targets, y, on the training set.
HOME PRICES
area price
2600 550000
3000 565000
3200 610000
3600 680000
4000 725000
Given these home prices, find out the price of homes whose area is:
• 3300 square feet
• 5000 square feet
SCATTER PLOT
BEST FIT LINE.
PREDICT HOME PRICES FOR A GIVEN AREA
SLOPE-INTERCEPT EQUATION OF A STRAIGHT LINE
y = m*x + b (m: slope, b: intercept)
PROGRAM IN PYTHON
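The program shown on the original slide is not preserved in this transcript. As a stand-in, here is a minimal sketch that fits the home-price data with the closed-form least-squares solution, using only the Python standard library; the helper name `fit_line` is an assumption. With scikit-learn, listed under Tools And Packages, the `LinearRegression` class would be the more usual choice.

```python
from statistics import mean

areas = [2600, 3000, 3200, 3600, 4000]             # square feet
prices = [550000, 565000, 610000, 680000, 725000]

def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope m, intercept b)."""
    x_bar, y_bar = mean(x), mean(y)
    m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b = y_bar - m * x_bar
    return m, b

m, b = fit_line(areas, prices)
print(f"slope m = {m:.4f}, intercept b = {b:.2f}")
for area in (3300, 5000):
    print(f"area = {area} sq ft -> predicted price = {m * area + b:.2f}")
```

On this data the fitted slope is about 135.79 and the intercept about 180616.44, which gives predicted prices of roughly 628,716 for 3300 square feet and 859,555 for 5000 square feet.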
EVALUATING MODEL PERFORMANCE
• The performance of a regression model can be understood by knowing the error
rate of the predictions made by the model. You can also measure the performance
by knowing how well your regression line fit the dataset.
• Let’s try to understand how to measure the performance of regression models.
• A good regression model is one where the difference between the actual or
observed values and predicted values for the selected model is small and unbiased
for train, validation and test data sets.
EVALUATING MODEL PERFORMANCE
• To measure the performance of your regression model, the following statistical metrics are used:
• Mean Absolute Error (MAE)
• Root Mean Square Error (RMSE)
• Coefficient of determination or R^2
• Adjusted R^2
MEAN ABSOLUTE ERROR(MAE)
• This is the simplest of all the metrics. It is measured by taking the average of the absolute
difference between actual values and the predictions.
MEAN ABSOLUTE ERROR (MAE)
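Written out, MAE = (1/n) * sum(|actual_i - predicted_i|). A minimal sketch follows; the function name and the sample numbers are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [10, 20, 30, 40]     # made-up observed values
predicted = [12, 18, 32, 38]  # made-up model predictions
mae = mean_absolute_error(actual, predicted)
print(mae)  # 2.0
```

Because every error counts in proportion to its size, MAE is expressed in the same unit as the target and is less sensitive to a few large errors than RMSE.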
ROOT MEAN SQUARE ERROR(RMSE)
• The Root Mean Square Error is measured by taking the square root of the average of the squared differences between the predictions and the actual values.
• It represents the sample standard deviation of the differences between predicted values and observed values (these differences are called residuals). It is calculated using the following formula:
ROOT MEAN SQUARE ERROR(RMSE)
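Written out, RMSE = sqrt((1/n) * sum((actual_i - predicted_i)^2)). A minimal sketch follows; the function name and the sample numbers are made up for illustration.

```python
from math import sqrt

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

actual = [10, 20, 30, 40]     # made-up observed values
predicted = [12, 18, 32, 38]  # made-up model predictions
rmse = root_mean_squared_error(actual, predicted)
print(rmse)  # 2.0
```

Squaring the residuals before averaging means that large errors are penalized more heavily than in MAE.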
COEFFICIENT OF DETERMINATION OR R^2
• It measures how well the actual outcomes are replicated by the regression line.
• It tells you how much of the variance in the dependent variable is explained by the independent variable in your model.
• In other words, it indicates how good your model is for a dataset.
• The mathematical representation for R^2 is
R^2 = 1 - SSR / SST
Here, SSR = Sum of Squared Residuals (the squared differences between the actual and the predicted values)
SST = Total Sum of Squares (the squared differences between the actual and the average value)
COEFFICIENT OF DETERMINATION OR R^2 (CONT.)
• Here the green line represents the regression line
and the red line represents the average line. The
differences in data points from these lines are
taken in the equation.
• Usually, the value of R^2 lies between 0 and 1 (it can be negative if the regression line somehow has a worse fit than the average!). The closer its value is to one, the better your model is. This is because the regression line then fits the dataset well, so the data points lie close to the line, which lessens the sum of squared residuals. Hence, the value of the equation gets closer to one.
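The behaviour of R^2 at its extremes can be checked with a short sketch. It assumes the common definition R^2 = 1 - SS_res / SS_tot, where SS_res sums the squared differences between actual and predicted values and SS_tot sums the squared differences between actual values and their mean; the function name and sample values are illustrative.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [1, 2, 3, 4, 5]
perfect = r_squared(actual, [1, 2, 3, 4, 5])    # predictions match exactly
baseline = r_squared(actual, [3, 3, 3, 3, 3])   # always predict the mean
reversed_fit = r_squared(actual, [5, 4, 3, 2, 1])  # worse than predicting the mean
print(perfect, baseline, reversed_fit)  # 1.0 0.0 -3.0
```

A perfect fit gives 1, always predicting the average gives 0, and a fit worse than the average gives a negative value.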
THANK YOU

Machine learning and linear regression programming

  • 2. Agenda • Overview of AI and ML • Terminology awareness • Applications in real world • Use cases within Nokia • Types of Learning • Regression • Classification • Clustering • Linear Regression Single Variable with Python
  • 3. Machine Learning Definition • Arthur Samuel (1959): Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed. • Tom Mitchell (1998): A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
  • 4. Artificial Intelligence Vs Machine Learning Vs Deep Learning
  • 6. Big Data: Implies huge data volumes that cannot be processed effectively with traditional applications. Big Data processing begins with raw data that is not aggregated, and it is often impossible to store such data in the memory of a single computer. Data Mining: Is about using statistics as well as other programming methods to find patterns hidden in the data so that you can explain some phenomenon. Machine Learning uses Data Mining techniques and other learning algorithms to build models of what is happening behind some data. Machine Learning: Is an artificial intelligence technique that is broadly used in Data Mining. ML uses a training dataset to build a model that can predict values of target variables. Data Mining uses the predictive force of Machine Learning by applying various ML algorithms to Big Data.
  • 7. WHAT IS ARTIFICIAL INTELLIGENCE • Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. Some of the activities computers with artificial intelligence are designed for include: Knowledge Gain Reasoning Problem Solving Learning
  • 9. Types of Learning • Supervised Learning: The target/outcome variable to be predicted from a set of predictors is known at the training phase. E.g. Regression, Decision Tree, Random Forest, KNN • Unsupervised Learning: The target/outcome variable is unknown at the training phase. E.g. clustering (K-means), association rule mining (Apriori) • Reinforcement Learning: The machine is trained to take specific decisions. It is exposed to an environment where it trains itself continually using trial and error. E.g. Markov Decision Process
  • 10. Applications in real world • Google search engine • Self driving cars • Facebook auto tagging • Netflix movie recommendation • Amazon product recommendation • Healthcare diagnosis • Speech recognition • StackOverflow QA tagging • Chatbot
  • 11. Pipeline for solving an ML problem: Data as input (text files, spreadsheets, SQL databases) → Feature engineering (removing unwanted data, handling missing values, normalization or standardization) → Algorithm → Output/model
  • 12. Pipeline in solving ML Problem
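The normalization and standardization step named in the pipeline can be sketched as follows. This is a minimal illustration using only the Python standard library, not the presenters' code; the function names are hypothetical, and the sample values are the areas from the home-price slide later in the deck. Both functions assume the input is not constant (otherwise they would divide by zero).

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

areas = [2600, 3000, 3200, 3600, 4000]
normalized = min_max_normalize(areas)    # smallest area maps to 0.0, largest to 1.0
standardized = standardize(areas)        # values now have mean 0 and std 1
print(normalized)
print(standardized)
```

Normalization keeps the shape of the data but bounds it, while standardization centres it; which one suits a given algorithm is a modelling choice.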
  • 13. Data Exploration/Feature Engineering 1. Variable Identification • Predictor(s) and target • Type and category of variable 2. Univariate Analysis • Central tendency • Measure of dispersion • Visualization method • Frequency table (categorical) 3. Bivariate Analysis • Relation between two variables • Correlation • Chi-square test • Z-test 4. Missing Value Treatment • Deletion • Imputation • Prediction model • KNN imputation 5. Outlier Handling. Detection: • Handling outliers is very important • Visualization techniques such as box plot, scatter plot, histogram • Any value below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is an outlier (Q1 and Q3 are the first and third quartiles, IQR = Q3 - Q1). Treatment: • Remove • Scale or normalize • Transform • Impute
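The IQR rule for outlier detection can be sketched as below. This is an illustrative example, assuming the usual fences Q1 - 1.5*IQR and Q3 + 1.5*IQR; the function name `iqr_outliers` and the extreme value 9500 are hypothetical and added only to trigger the rule. Quartiles are computed with `statistics.quantiles` using its "inclusive" method; other quartile conventions can give slightly different fences.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Areas from the home-price slide plus one hypothetical extreme value (9500).
areas = [2600, 3000, 3200, 3600, 4000, 9500]
outliers = iqr_outliers(areas)
print(outliers)
```

Whether a flagged value is then removed, transformed, or imputed is the separate treatment decision listed on the slide.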
  • 14. SUPERVISED LEARNING • Supervised learning is used whenever we want to predict a certain outcome from a given input, and we have examples of input/output pairs. • We build a machine learning model from these input/output pairs, which comprise our training set. • Our goal is to make accurate predictions for new, never-before-seen data. • Supervised learning often requires human effort to build the training set, but afterward automates and often speeds up an otherwise laborious or infeasible task.
  • 15. TYPES OF SUPERVISED MODEL • Regression: the process of predicting a continuous value. • Classification: predicting a class label, which is a choice from a predefined list of possibilities.
  • 16. CLASSIFICATION • Binary Classification : Distinguishing between exactly two classes • Multiclass classification : Classification between more than two classes.
  • 17. Types of regression 1. Simple Linear Regression: single predictor + single target, y = m*x + c 2. Multiple Linear Regression: multiple predictors + single target, y = m1*x1 + m2*x2 + c 3. Polynomial Regression: one or many predictors + single target, y = mn*x^n + … + m2*x^2 + m1*x + c 4. Stepwise Regression: useful in case of multiple predictors; add or remove predictors as needed (forward selection, backward elimination) 5. Lasso Regression 6. Ridge Regression 7. ElasticNet Regression
  • 18. Simple Linear Regression • Single predictor and single target • Y = b0 + b1*X • Minimum sum squared error • Standard packages are already available • Formula • Programming example
  • 19. Classification • Type of supervised learning • Output or target is a categorical outcome. Examples: • Mail: spam or not spam • Weather: rainy, sunny, humid • Stock price: up or down. Predictor(s) → Algorithm → Categorical target
  • 20. Types of Classification 1. K-nearest Neighbor Classifier 2. Logistic Regression 3. Naïve Bayes 4. Decision Tree Classifier 5. Random Forest Classifier 6. Support Vector Machine Classifier
  • 22. Unsupervised learning • Unsupervised learning is the training of a machine using information that is neither classified nor labelled. For instance, given an image containing both dogs and cats that the machine has never seen before, it tries to find patterns based on the shape of the head, ears, body structure, etc.
  • 23. Reinforcement Learning • Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. (Source: Wikipedia) E.g.: you go near a fire and it's warm: positive reinforcement; you touch the fire and it burns your hand: negative reinforcement → you learn not to touch fire. • Algorithms and frameworks for RL include Monte Carlo methods, Markov Decision Processes, Q-learning, etc.
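As an illustration of the first classifier in the list, here is a minimal K-nearest-neighbour sketch in plain Python. It is not the presenters' code: the function name `knn_predict`, the two-dimensional points, and the labels are all hypothetical, and a real project would more likely rely on a library implementation.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two well-separated classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 6.0), (6.5, 7.0)]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]

print(knn_predict(points, labels, (1.2, 1.4)))
print(knn_predict(points, labels, (6.4, 6.4)))
```

An odd `k` avoids tied votes in the binary case; the choice of `k` and of the distance measure are the main tuning decisions.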
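To make the idea of finding structure without labels concrete, the sketch below runs K-means (named on the Types of Learning slide) on one-dimensional data. It is a simplified illustration under stated assumptions: the function name `k_means_1d`, the sample values, and the starting centroids are hypothetical, and the number of iterations is fixed rather than checked for convergence.

```python
from statistics import mean

def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centroids, clusters = k_means_1d(values, centroids=[0.0, 6.0])
print(centroids)
print(clusters)
```

No labels are supplied anywhere; the two groups emerge only from the distances between the values.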
  • 24. Tools and Packages • ML in Python: NumPy, pandas, scikit-learn, Matplotlib, seaborn • Non-programming: Weka, Orange, RapidMiner, Qlik Sense, xls • Deep learning: TensorFlow, Keras, PyTorch, Theano
  • 26. LINEAR REGRESSION • Linear regression, or ordinary least squares (OLS), is the simplest and most classic linear method for regression. Linear regression finds the parameters m and b that minimize the mean squared error between predictions and the true regression targets, y, on the training set.
  • 27. HOME PRICES area price 2600 550000 3000 565000 3200 610000 3600 680000 4000 725000
  • 28. HOME PRICES area price 2600 550000 3000 565000 3200 610000 3600 680000 4000 725000 Given these home prices, find out the price of homes whose area is 3300 square feet 5000 square feet
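The question on this slide can be answered with a least-squares line through the five data points. The sketch below uses the closed-form formulas for slope and intercept in plain Python rather than scikit-learn, which is a substitution made here to keep the example dependency-free; the function name `fit_line` is hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    m = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    b = y_bar - m * x_bar
    return m, b

# Data from the HOME PRICES slide.
areas = [2600, 3000, 3200, 3600, 4000]
prices = [550000, 565000, 610000, 680000, 725000]

m, b = fit_line(areas, prices)
for area in (3300, 5000):
    print(f"area={area} sq ft -> predicted price {m * area + b:,.0f}")
```

With this data the slope is about 135.79 per square foot and the intercept about 180,616, giving predictions of roughly 628,716 for 3300 square feet and 859,555 for 5000 square feet. The second prediction extrapolates beyond the observed range of areas, so it deserves more caution than the first.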
  • 31. PREDICT HOME PRICES FOR A GIVEN AREA
  • 32. PREDICT HOME PRICES FOR A GIVEN AREA (CONT.)
  • 33. PREDICT HOME PRICES FOR A GIVEN AREA (CONT.)
  • 34. SLOPE INTERSECTION EQUATION OF A STRAIGHT LINE
  • 36. EVALUATING MODEL PERFORMANCE • The performance of a regression model can be understood by knowing the error rate of the predictions made by the model. You can also measure the performance by knowing how well your regression line fits the dataset. • Let's try to understand how to measure the performance of regression models. • A good regression model is one where the difference between the actual (observed) values and the predicted values is small and unbiased for the train, validation and test data sets.
  • 37. EVALUATING MODEL PERFORMANCE • To measure the performance of your regression model, some statistical metrics are used. They are: • Mean Absolute Error (MAE) • Root Mean Square Error (RMSE) • Coefficient of determination or R^2 • Adjusted R^2
  • 38. MEAN ABSOLUTE ERROR(MAE) • This is the simplest of all the metrics. It is measured by taking the average of the absolute difference between actual values and the predictions.
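A minimal sketch of the MAE calculation described above. The function name is hypothetical and the predicted values are made up for illustration; only the actual prices come from the home-price slide.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [550000, 565000, 610000]
predicted = [540000, 570000, 615000]   # hypothetical model output

mae = mean_absolute_error(actual, predicted)
print(f"MAE = {mae:.2f}")
```

MAE is expressed in the same unit as the target (here, price), which makes it easy to interpret.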
  • 40. ROOT MEAN SQUARE ERROR(RMSE) • The Root Mean Square Error is measured by taking the square root of the average of the squared difference between the prediction and the actual value. • It represents the sample standard deviation of the differences between predicted values and observed values (also called residuals). It is calculated using the following formula: RMSE = sqrt((1/n) * Σ (predicted_i - actual_i)^2), where n is the number of observations.
  • 41. ROOT MEAN SQUARE ERROR(RMSE)
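The RMSE definition can be sketched in a few lines. As with the other metric examples, the function name and the predicted values are hypothetical and chosen only for illustration.

```python
from math import sqrt

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference between predictions and actual values."""
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

actual = [550000, 565000, 610000]
predicted = [540000, 570000, 615000]   # hypothetical model output

rmse = root_mean_squared_error(actual, predicted)
print(f"RMSE = {rmse:.2f}")
```

Because the errors are squared before averaging, RMSE penalizes large errors more heavily than MAE and is never smaller than MAE on the same data.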
  • 42. COEFFICIENT OF DETERMINATION OR R^2 • It measures how well the actual outcomes are replicated by the regression line. • It helps you to understand how much of the variance in the target is explained by the independent variable(s) in your model. • In other words, it tells you how good your model is for a dataset. • The mathematical representation for R^2 is R^2 = 1 - SSR/SST. Here, SSR = Sum of Squared Residuals (the squared differences between the actual and the predicted values) and SST = Total Sum of Squares (the squared differences between the actual values and their average).
  • 43. COEFFICIENT OF DETERMINATION OR R^2 (CONT.) • Here the green line represents the regression line and the red line represents the average line. The differences between the data points and these lines are used in the equation. • Usually, the value of R^2 lies between 0 and 1 (it can be negative if the regression line somehow has a worse fit than the average!). The closer its value is to one, the better your model is. This is because the regression line fits the dataset well, which lowers the sum of squared residuals relative to the total sum of squares, so the value of the equation gets closer to one.
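The behaviour of R^2 described above can be checked with a small sketch. It assumes the common definition R^2 = 1 - SSR/SST, with SSR as the sum of squared differences between actual and predicted values and SST as the sum of squared differences between actual values and their mean; the function name and the toy data are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSR / SST."""
    mean_actual = sum(actual) / len(actual)
    ssr = sum((a - p) ** 2 for a, p in zip(actual, predicted))   # residual sum of squares
    sst = sum((a - mean_actual) ** 2 for a in actual)            # total sum of squares
    return 1 - ssr / sst

actual = [1, 2, 3, 4]
perfect = r_squared(actual, [1, 2, 3, 4])            # predictions match exactly
average = r_squared(actual, [2.5, 2.5, 2.5, 2.5])    # always predict the mean
worse = r_squared(actual, [4, 3, 2, 1])              # worse than predicting the mean
print(perfect, average, worse)
```

A perfect fit gives 1, predicting the average for every point gives 0, and a model that fits worse than the average line gives a negative value.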

Editor's notes

  1. list of possibilities. The classification approach can be thought of as a means of categorizing, or "classifying", unknown items into a discrete set of "classes".
  2. plt.scatter(df['area'], df['price'], marker='*', color='red')
  3. plt.xlabel('area')
     plt.ylabel('price')
     plt.scatter(df['area'], df['price'], marker='*', color='red')
     plt.plot(df['area'], model.predict(df[['area']]))