Module 12: Machine Learning

Lesson 39: Neural Networks - III
12.4.4 Multi-Layer Perceptrons

In contrast to perceptrons, multilayer networks can learn multiple decision
boundaries, and those boundaries may be nonlinear. The typical architecture of
a multi-layer perceptron (MLP) is shown below.


[Figure: MLP architecture, a layered feed-forward network of input nodes,
internal (hidden) nodes, and output nodes.]
To make nonlinear partitions of the space we need to define each unit as a
nonlinear function (unlike the perceptron). One solution is to use the sigmoid
unit. Another reason for using sigmoids is that, unlike linear threshold
units, they are continuous and thus differentiable at all points.

[Figure: a sigmoid unit. Inputs x1, ..., xn with weights w1, ..., wn, plus a
bias weight w0 on the fixed input x0 = 1, are summed into net = Σ_i w_i x_i;
the unit outputs O = σ(net) = 1 / (1 + e^(−net)).]

   O(x1, x2, …, xn) = σ(W·X)

   where: σ(W·X) = 1 / (1 + e^(−W·X))

Function σ is called the sigmoid or logistic function. It has the following property:

   dσ(y)/dy = σ(y) (1 − σ(y))
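
As a quick sketch (not part of the original lesson; the function names and the
finite-difference check are our own additions), this property is easy to
verify in Python:

    import numpy as np

    def sigmoid(y):
        # Logistic function: sigma(y) = 1 / (1 + e^(-y)).
        return 1.0 / (1.0 + np.exp(-y))

    def sigmoid_prime(y):
        # Derivative via the property d sigma(y)/dy = sigma(y) * (1 - sigma(y)).
        s = sigmoid(y)
        return s * (1.0 - s)

    # Compare against a central finite difference at an arbitrary point.
    y, h = 0.3, 1e-6
    numeric = (sigmoid(y + h) - sigmoid(y - h)) / (2 * h)
    assert abs(numeric - sigmoid_prime(y)) < 1e-8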




12.4.4.1 Back-Propagation Algorithm

Multi-layered perceptrons can be trained using the back-propagation algorithm described
next.

Goal: To learn the weights for all links in an interconnected multilayer network.

We begin by defining our measure of error:

   E(W) = ½ Σ_d Σ_k (t_kd − o_kd)²

where k ranges over the output nodes and d over the training examples.

The idea is again to use gradient descent over the weight space to find a
minimum of this error; there is no guarantee of reaching the global minimum.
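
For concreteness, a minimal sketch of this error measure in Python (the array
shapes are our assumption):

    import numpy as np

    def squared_error(t, o):
        # t, o: arrays of shape (num_examples, num_outputs) holding the
        # targets t_kd and network outputs o_kd for the whole training set.
        return 0.5 * np.sum((t - o) ** 2)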

Algorithm:

   1. Create a network with n_in input nodes, n_hidden internal nodes, and
      n_out output nodes.
   2. Initialize all weights to small random numbers.
   3. Until the error is small, do:
         For each example X do
            • Propagate example X forward through the network
            • Propagate errors backward through the network


Forward Propagation

Given example X, compute the output of every node until we reach the output nodes:

[Figure: forward propagation. The example is presented at the input nodes;
each internal and output node computes the sigmoid function of its weighted
inputs, layer by layer, until the output nodes are reached.]



Backward Propagation

A. For each output node k compute the error:

   δ_k = O_k (1 − O_k)(t_k − O_k)

B. For each hidden unit h, calculate the error:

   δ_h = O_h (1 − O_h) Σ_k W_kh δ_k

C. Update each network weight:

   W_ji = W_ji + ΔW_ji
   where ΔW_ji = η δ_j X_ji     (W_ji and X_ji are the weight and input from node i to node j)
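
The rules above fit in a short numpy sketch for a network with one hidden
layer. This is our illustration, not code from the lesson: the shapes, names,
and the choice of numpy are assumptions.

    import numpy as np

    def sigmoid(y):
        return 1.0 / (1.0 + np.exp(-y))

    def train_step(x, t, W_hidden, W_out, eta=0.1):
        # x: input vector (n_in,); t: target vector (n_out,)
        # W_hidden: (n_hidden, n_in + 1); W_out: (n_out, n_hidden + 1)
        # The extra column in each weight matrix is the bias weight w0 on x0 = 1.

        # Forward propagation: compute the output of every node.
        x_b = np.append(x, 1.0)
        o_hidden = sigmoid(W_hidden @ x_b)
        h_b = np.append(o_hidden, 1.0)
        o_out = sigmoid(W_out @ h_b)

        # A. Output-node errors: delta_k = O_k (1 - O_k)(t_k - O_k)
        delta_out = o_out * (1.0 - o_out) * (t - o_out)

        # B. Hidden-unit errors: delta_h = O_h (1 - O_h) sum_k W_kh delta_k
        # (computed before the output weights are changed).
        delta_hidden = o_hidden * (1.0 - o_hidden) * (W_out[:, :-1].T @ delta_out)

        # C. Weight updates: W_ji += eta * delta_j * X_ji
        W_out += eta * np.outer(delta_out, h_b)
        W_hidden += eta * np.outer(delta_hidden, x_b)
        return o_out

Looping train_step over the examples until the summed error is small realizes
step 3 of the algorithm above.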

A momentum term, which depends on the weight update at the previous iteration,
may also be added to the update rule. At iteration n we then have:

   ΔW_ji(n) = η δ_j X_ji + α ΔW_ji(n − 1)

where α (0 ≤ α ≤ 1) is a constant called the momentum.

   1. It can carry the search through small local minima.
   2. It increases the step size along flat regions of the error surface.
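
In code, momentum just means remembering the previous update (a sketch;
grad_term stands in for δ_j X_ji and all names here are ours):

    import numpy as np

    eta, alpha = 0.1, 0.9                 # learning rate and momentum constant
    W = np.random.randn(3, 4) * 0.01      # some weight matrix
    prev_delta = np.zeros_like(W)         # Delta W(n-1)

    grad_term = np.random.randn(3, 4)     # stands in for delta_j * X_ji
    delta_W = eta * grad_term + alpha * prev_delta   # Delta W(n)
    W += delta_W
    prev_delta = delta_W                  # remembered for iteration n + 1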

Remarks on Back-propagation

   1.   It implements a gradient descent search over the weight space.
   2.   It may become trapped in local minima.
   3.   In practice, it is very effective.
   4.   How to avoid local minima?
            a) Add momentum.
            b) Use stochastic gradient descent.
            c) Use different networks with different initial values for the weights.


Multi-layered perceptrons have high representational power. They can represent the
following:

   1. Boolean functions. Every Boolean function can be represented with a
      network having two layers of units.

   2. Continuous functions. All bounded continuous functions can also be
      approximated with a network having two layers of units.

   3. Arbitrary functions. Any arbitrary function can be approximated with a network
      with three layers of units.



Generalization and overfitting:

   One obvious stopping criterion for backpropagation is to continue iterating
   until the error falls below some threshold; however, this can lead to overfitting.


[Figure: error versus number of weight updates. The training-set error
decreases steadily, while the validation-set error eventually starts to rise,
signalling overfitting.]
   Overfitting can be avoided using the following strategies.

        •   Use a validation set and stop training when the error on this set
            reaches its minimum, i.e., when it starts to rise (see the sketch
            after this list).

        •   Use 10-fold cross-validation.

        •   Use weight decay; the weights are decreased slowly on each iteration.
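
   A minimal sketch of validation-based early stopping (the patience logic and
   all names here are our additions, not part of the original lesson):

    def train_with_early_stopping(train_one_epoch, validation_error,
                                  max_epochs=1000, patience=10):
        # Stop once the validation error has not improved for `patience` epochs.
        best_err, best_epoch = float("inf"), 0
        for epoch in range(max_epochs):
            train_one_epoch()              # one pass over the training set
            err = validation_error()       # error on the held-out set
            if err < best_err:
                best_err, best_epoch = err, epoch
            elif epoch - best_epoch >= patience:
                break                      # validation error stopped improving
        return best_err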




   Applications of Neural Networks

   Neural networks have broad applicability to real-world business problems
   and have already been successfully applied in many industries.

   Since neural networks are best at identifying patterns or trends in data, they are well
   suited for prediction or forecasting needs including:

        •   sales forecasting
        •   industrial process control
        •   customer research
        •   data validation
        •   risk management
        •   target marketing




Because of their adaptive and nonlinear nature, they have also been used in a
number of control-system application domains, such as:

   •    process control in chemical plants
   •    unmanned vehicles
   •    robotics
   •    consumer electronics

Neural networks are also used in a number of other applications that are too
hard to model using classical techniques, including computer vision, path
planning, and user modeling.

Questions
1. Assume a learning problem where each example is represented by four attributes. Each
attribute can take two values in the set {a,b}. Run the candidate elimination algorithm on
the following examples and indicate the resulting version space. What is the size of the
space?
                  ((a, b, b, a), +)
                  ((b, b, b, b), +)
                  ((a, b, b, b), +)
                  ((a, b, a, a), -)


2. Decision Trees

The following table contains training examples that help a robot janitor predict whether
or not an office contains a recycling bin.

          STATUS DEPT. OFFICE SIZE RECYCLING BIN?
       1. faculty ee   large       no
       2. staff   ee   small       no
       3. faculty cs   medium      yes
       4. student ee   large       yes
       5. staff   cs   medium      no
       6. faculty cs   large       yes
       7. student ee   small       yes
       8. staff   cs   medium      no

Use the entropy splitting function to construct a minimal decision tree that
predicts whether or not an office contains a recycling bin. Show each step of
the computation.

3. Translate the above decision tree into a collection of decision rules.

4. Neural networks: Consider the following function of two variables:

           A   B   Desired Output
           0   0         1
           1   0         0
           0   1         0
           1   1         1

Prove that this function cannot be learned by a single perceptron that uses the step
function as its activation function.

   5. Construct by hand a perceptron that can calculate the logic function
      implies (=>). Assume that 1 = true and 0 = false for all inputs and outputs.

Solution
1. The general and specific boundaries of the version space are as follows:

S: {(?,b,b,?)}
G: {(?,?,b,?)}

The size of the version space is 2.

2. The solution steps are as follows:

1. Selecting the attribute for the root node:

       Status:

                     Faculty:    (3/8)*[ -(2/3)*log_2(2/3) - (1/3)*log_2(1/3)]
                     Staff:    + (3/8)*[ -(0/3)*log_2(0/3) - (3/3)*log_2(3/3)]
                     Student:  + (2/8)*[ -(2/2)*log_2(2/2) - (0/2)*log_2(0/2)]
                     TOTAL:    = 0.35 + 0 + 0 = 0.35

       Size:

                     Large:      (3/8)*[ -(2/3)*log_2(2/3) - (1/3)*log_2(1/3)]
                     Medium:   + (3/8)*[ -(1/3)*log_2(1/3) - (2/3)*log_2(2/3)]
                     Small:    + (2/8)*[ -(1/2)*log_2(1/2) - (1/2)*log_2(1/2)]
                     TOTAL:    = 0.35 + 0.35 + 0.25 = 0.95




       Dept:

                     ee:         (4/8)*[ -(2/4)*log_2(2/4) - (2/4)*log_2(2/4)]
                     cs:       + (4/8)*[ -(2/4)*log_2(2/4) - (2/4)*log_2(2/4)]
                     TOTAL:    = 0.5 + 0.5 = 1
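
These weighted entropies are easy to check with a few lines of Python (a
sketch; the helper names are ours, and the slightly different decimals come
from rounding in the totals above):

    from math import log2

    def entropy(pos, neg):
        # Entropy of a set with pos positive and neg negative examples;
        # 0 * log2(0) is taken to be 0.
        total = pos + neg
        return -sum(c / total * log2(c / total) for c in (pos, neg) if c)

    def split_entropy(partitions, n):
        # Weighted entropy of a split; partitions = [(pos, neg), ...].
        return sum((p + q) / n * entropy(p, q) for p, q in partitions)

    print(split_entropy([(2, 1), (0, 3), (2, 0)], 8))  # Status: ~0.34
    print(split_entropy([(2, 1), (1, 2), (1, 1)], 8))  # Size:   ~0.94
    print(split_entropy([(2, 2), (2, 2)], 8))          # Dept:    1.0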

Since status is the attribute with the lowest entropy, it is selected
as the root node:




                          Status
                   /        |        \
           Faculty        staff       student
                 /          |          \
          1-,3+,6+       2-,5-,8-     4+,7+
            BIN=?         BIN=NO     BIN=YES

Only the branch corresponding to Status=Faculty needs further
processing.

Selecting the attribute to split the branch Status=Faculty:

       Dept:

        ee:        (1/3)*[ -(0/1)*log_2(0/1) - (1/1)*log_2(1/1)]
        cs:      + (2/3)*[ -(2/2)*log_2(2/2) - (0/2)*log_2(0/2)]
        TOTAL:   = 0 + 0 = 0




Since the minimum possible entropy is 0 and dept has that minimum
possible entropy, it is selected as the best attribute to split the
branch Status=Faculty. Note that we don't even have to calculate the
entropy of the attribute size since it cannot possibly be better than
0.

                          Status
                   /        |        \
           Faculty        staff       student
                 /          |          \
          1-,3+,6+       2-,5-,8-     4+,7+
             Dept         BIN=NO     BIN=YES
            /    \
        ee /      \ cs
          /        \
         1-       3+,6+
      BIN=NO     BIN=YES

Each branch ends in a homogeneous set of examples so the construction
of the decision tree ends here.

3. The rules are:

   IF status=faculty AND dept=ee THEN recycling-bin=no

   IF status=faculty AND dept=cs THEN recycling-bin=yes

   IF status=staff THEN recycling-bin=no

   IF status=student THEN recycling-bin=yes




4. This function cannot be computed by a single perceptron because it is not
linearly separable. Plotting the desired outputs in the (A, B) plane:

           B
            ^
            |
            0    1
            |
            |
         ---1----0-------->
            |              A

The positive examples (0,0) and (1,1) and the negative examples (1,0) and
(0,1) sit at opposite corners of the unit square, so no single straight line,
and hence no perceptron with a step activation, can separate the two classes.
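
This can also be checked mechanically: the four threshold constraints on
(w1, w2, b) form an infeasible linear program. A sketch using scipy (our own
code; the margin eps is our device for encoding the strict inequalities):

    from scipy.optimize import linprog

    # Seek w1, w2, b with b > 0 and w1 + w2 + b > 0 (inputs mapping to 1)
    # while w1 + b <= 0 and w2 + b <= 0 (inputs mapping to 0).
    eps = 1e-3
    A_ub = [[ 0,  0, -1],    # -b             <= -eps
            [-1, -1, -1],    # -(w1 + w2 + b) <= -eps
            [ 1,  0,  1],    #  w1 + b        <= 0
            [ 0,  1,  1]]    #  w2 + b        <= 0
    b_ub = [-eps, -eps, 0, 0]
    res = linprog(c=[0, 0, 0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * 3)
    print(res.status)        # 2 = infeasible: no separating line exists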

5. Let the perceptron compute

       O = g(W1·X1 + W2·X2 − W0)

(the equations lost from the original scan are reconstructed here from the
constraints that survive in the text). Choose g to be the threshold function
(output 1 on input > 0, output 0 otherwise). There are two inputs to the
implies function, X1 and X2. Therefore, the four rows of the truth table of
X1 => X2 require:

       g(−W0) = 1              (X1 = 0, X2 = 0: 0 => 0 is true)
       g(W1 − W0) = 0          (X1 = 1, X2 = 0: 1 => 0 is false)
       g(W2 − W0) = 1          (X1 = 0, X2 = 1: 0 => 1 is true)
       g(W1 + W2 − W0) = 1     (X1 = 1, X2 = 1: 1 => 1 is true)

Arbitrarily choose W2 = 1, and substitute into the above for all values of X1
and X2:

       −W0 > 0,    W1 − W0 ≤ 0,    1 − W0 > 0,    W1 + 1 − W0 > 0

Note that the greater/less-than signs are decided by the threshold function
and the truth value of X1 => X2. For example, when X1 = 0 and X2 = 0, the
resulting expression must be greater than 0, since g is the threshold function
and 0 => 0 is true. Similarly, when X1 = 1 and X2 = 0, the resulting
expression must not be greater than 0, since g is the threshold function and
1 => 0 is false.

Since W0 < 0, and W0 > W1, and W0 − W1 < 1, choose W1 = −1 and substitute
where possible.

Since W0 < 0, and W0 > -1, choose W0 = -0.5.

Thus… W0 = -0.5, W1 = -1, W2 = 1.

Finally, the resulting perceptron computes

       O = g(−X1 + X2 + 0.5),

which outputs 1 on exactly the rows of the truth table where X1 => X2 is true,
as checking all four input combinations confirms.
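A quick check of the constructed weights under the convention above (a sketch;
our own code, not part of the original solution):

    def g(net):                        # threshold function
        return 1 if net > 0 else 0

    W0, W1, W2 = -0.5, -1.0, 1.0       # weights derived above

    for x1 in (0, 1):
        for x2 in (0, 1):
            implies = int((not x1) or x2)
            assert g(W1 * x1 + W2 * x2 - W0) == implies
    print("perceptron computes X1 => X2 on all four inputs")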

