INTRODUCTION AND ARTIFICIAL NEURAL
NETWORKS
UNIT I
Mr. S. VINOD
ASSISTANT PROFESSOR
EEE DEPARTMENT
• computing techniques
• applications of soft computing
• Neuron
• Nerve structure and synapse
• Artificial Neuron and its model
• Activation functions
• Neural network architecture
• single layer and multilayer feed forward networks
• McCulloch-Pitts neuron model
• perceptron model- Adaline and Madaline
• multilayer perceptron model
• back propagation learning methods
• effect of learning rule coefficient
• back propagation algorithm
• factors affecting back propagation training applications.
Neural Networks
● An artificial neural network (ANN) is a machine learning
approach that models the human brain and consists of a number
of artificial neurons.
● Neurons in ANNs tend to have fewer connections than
biological neurons.
● Each neuron in an ANN receives a number of inputs.
● An activation function is applied to these inputs, which results
in the activation level of the neuron (its output value).
● Knowledge about the learning task is given in the form of
examples called training examples.
Contd..
● An Artificial Neural Network is specified by:
− neuron model: the information processing unit of the NN,
− an architecture: a set of neurons and links connecting neurons.
Each link has a weight,
− a learning algorithm: used for training the NN by modifying the
weights in order to model a particular learning task correctly on
the training examples.
● The aim is to obtain a NN that is trained and generalizes
well.
● It should behave correctly on new instances of the
learning task.
Neuron
● The neuron is the basic information processing unit of a
NN. It consists of:
1 A set of links, describing the neuron inputs, with weights W1, W2,
…, Wm
2 An adder function (linear combiner) for computing the weighted
sum of the inputs (real numbers):
      u = Σ (j = 1 to m) wj · xj
3 Activation function for limiting the amplitude of the neuron
output. Here ‘b’ denotes the bias:
      y = φ(u + b)
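As a rough sketch of this model in Python (names and numbers here are illustrative, not from the slides):

```python
import numpy as np

def neuron_output(x, w, b, activation):
    """Single artificial neuron: weighted sum of inputs plus bias, passed through an activation."""
    u = np.dot(w, x)           # adder / linear combiner
    v = u + b                  # induced field
    return activation(v)

step = lambda v: 1 if v >= 0 else 0
y = neuron_output(x=np.array([1.0, 0.5]),
                  w=np.array([0.4, -0.2]),
                  b=0.1,
                  activation=step)
print(y)   # 1, since 0.4*1.0 + (-0.2)*0.5 + 0.1 = 0.4 >= 0
```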
The Neuron Diagram
[Figure: input values x1, x2, …, xm are multiplied by weights w1, w2, …, wm and fed to a summing function Σ together with the bias b; the result is the induced field v, which passes through the activation function φ(·) to give the output y.]
Bias of a Neuron
● The bias b has the effect of applying a transformation to
the weighted sum u
v = u + b
● The bias is an external parameter of the neuron. It can be
modeled by adding an extra input.
● v is called the induced field of the neuron
      v = Σ (j = 0 to m) wj · xj,   where w0 = b and x0 = 1
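A quick numeric check (illustrative values) that treating the bias as an extra input x0 = 1 with weight w0 = b gives the same induced field:

```python
import numpy as np

x = np.array([1.0, 0.5])
w = np.array([0.4, -0.2])
b = 0.1

v_explicit  = np.dot(w, x) + b                      # v = u + b
v_augmented = np.dot(np.r_[b, w], np.r_[1.0, x])    # extra input x0 = 1 with weight w0 = b
assert np.isclose(v_explicit, v_augmented)          # both equal 0.4
```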
Neuron Models
● The choice of activation function determines the
neuron model.
Examples:
● step function:
      φ(v) = a    if v < c
      φ(v) = b    if v ≥ c
● ramp function:
      φ(v) = a                                if v ≤ c
      φ(v) = b                                if v ≥ d
      φ(v) = a + (v − c)·(b − a)/(d − c)      otherwise
● sigmoid function with parameters z, x, y:
      φ(v) = z + 1 / (1 + exp(−x·v + y))
● Gaussian function:
      φ(v) = (1 / (√(2π)·σ)) · exp(−(1/2)·((v − μ)/σ)²)
[Figure: graphs of the step function (a jump from a to b at v = c), the ramp function (a linear rise from a at v = c to b at v = d), and the sigmoid function.]
• The Gaussian function is the probability density
function of the normal distribution. It is sometimes
also called the frequency curve.
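The four activation functions above can be sketched in Python as follows (parameter names follow the formulas; the default values are illustrative assumptions):

```python
import numpy as np

def step(v, a=0.0, b=1.0, c=0.0):
    """Step function: a below the threshold c, b at or above it."""
    return b if v >= c else a

def ramp(v, a=0.0, b=1.0, c=0.0, d=1.0):
    """Ramp function: a up to c, b beyond d, linear in between."""
    if v <= c:
        return a
    if v >= d:
        return b
    return a + (v - c) * (b - a) / (d - c)

def sigmoid(v, z=0.0, x=1.0, y=0.0):
    """Sigmoid with offset z, slope x and shift y."""
    return z + 1.0 / (1.0 + np.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    """Gaussian (normal density) activation."""
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
```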
Network Architectures
● Three different classes of network architectures
− single-layer feed-forward
− multi-layer feed-forward
− recurrent
● The architecture of a neural network is linked
with the learning algorithm used to train it.
Single Layer Feed-forward
[Figure: an input layer of source nodes projecting directly onto an output layer of neurons.]
Perceptron: Neuron Model
(Special form of single layer feed forward)
− The perceptron, first proposed by Rosenblatt (1958), is a simple
neuron that is used to classify its input into one of two categories.
− A perceptron uses a step function that returns +1 if the weighted sum
of its inputs is ≥ 0 and −1 otherwise.
[Figure: perceptron with inputs x1, x2, …, xn, weights w1, w2, …, wn and bias b; the induced field v is passed through the step function φ(v) to give the output y.]
      φ(v) = +1   if v ≥ 0
      φ(v) = −1   if v < 0
Perceptron for Classification
● The perceptron is used for binary classification.
● First train a perceptron for a classification task.
− Find suitable weights in such a way that the training examples are
correctly classified.
− Geometrically try to find a hyper-plane that separates the examples of the
two classes.
● The perceptron can only model linearly separable classes.
● When the two classes are not linearly separable, it may be
desirable to obtain a linear separator that minimizes the mean
squared error.
● Given training examples of classes C1, C2 train the perceptron in
such a way that :
− If the output of the perceptron is +1 then the input is assigned to class C1
− If the output is -1 then the input is assigned to C2
Boolean function OR – Linearly separable
[Figure: plot of OR in the (X1, X2) plane; the single 'false' point at (0, 0) can be separated from the three 'true' points by a straight line.]
Learning Process for Perceptron
● Initially assign random weights to inputs between -0.5 and
+0.5
● Training data is presented to perceptron and its output is
observed.
● If the output is incorrect, the weights are adjusted accordingly
using the following formula:
wi ← wi + (a · xi · e), where ‘e’ is the error produced
and ‘a’ (−1 ≤ a ≤ 1) is the learning rate
− ‘e’ is 0 if the output is correct; it is positive if the output is too low and
negative if the output is too high.
− Once the modification to weights has taken place, the next piece of
training data is used in the same way.
− Once all the training data have been applied, the process starts again
until all the weights are correct and all errors are zero.
− Each iteration of this process is known as an epoch.
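A compact sketch of this training loop in Python. It assumes a step output of 0/1 (matching the OR example on the next slide) and no bias term; names and values are illustrative, not from the slides:

```python
import numpy as np

def train_perceptron(X, targets, a=0.2, max_epochs=100):
    """Train a simple perceptron with the rule wi <- wi + a * xi * e."""
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.5, 0.5, size=X.shape[1])   # random initial weights in [-0.5, +0.5]
    step = lambda v: 1 if v > 0 else 0            # step activation, no bias term
    for epoch in range(max_epochs):
        errors = 0
        for x, t in zip(X, targets):
            y = step(np.dot(w, x))
            e = t - y                             # 0 if correct, positive if too low, negative if too high
            if e != 0:
                w += a * x * e                    # adjust every weight by a * xi * e
                errors += 1
        if errors == 0:                           # an epoch with no errors ends training
            break
    return w

# Learn the Boolean OR function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 1, 1, 1])
print(train_perceptron(X, t))
```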
Example: Perceptron to learn OR
function
● Initially consider w1 = -0.2 and w2 = 0.4
● For training data x1 = 0 and x2 = 0, the target output is 0.
● Compute y = Step(w1·x1 + w2·x2) = Step(0) = 0. The output is correct, so
the weights are not changed.
● For training data x1 = 0 and x2 = 1, the target output is 1.
● Compute y = Step(w1·x1 + w2·x2) = Step(0.4) = 1. The output is correct,
so the weights are not changed.
● For the next training data x1 = 1 and x2 = 0, the target output is 1.
● Compute y = Step(w1·x1 + w2·x2) = Step(−0.2) = 0. The output is
incorrect, hence the weights are to be changed.
● Assume a = 0.2 and error e = 1:
wi ← wi + (a · xi · e) gives w1 = 0 and w2 = 0.4
● With these weights, test the remaining training data.
● Repeat the process till we get a stable result.
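The single weight update in this trace can be verified directly (plain Python, values taken from the example above):

```python
# Misclassified example: x1 = 1, x2 = 0, target 1, output 0, so e = 1
w1, w2 = -0.2, 0.4
a, e = 0.2, 1
x1, x2 = 1, 0

w1 = w1 + a * x1 * e   # -0.2 + 0.2*1*1 = 0.0
w2 = w2 + a * x2 * e   #  0.4 + 0.2*0*1 = 0.4
print(w1, w2)          # 0.0 0.4
```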
Perceptron: Limitations
● The perceptron can only model linearly separable
functions,
− those functions whose values, drawn in a 2-dimensional graph, can be
separated into two parts by a single straight line.
● Boolean functions given below are linearly separable:
− AND
− OR
− COMPLEMENT
● It cannot model the XOR function, as it is not linearly
separable.
− When the two classes are not linearly separable, it may be
desirable to obtain a linear separator that minimizes the mean
squared error.
XOR – Non linearly separable function
● A typical example of a non-linearly separable function is
XOR, which computes the logical exclusive or.
● This function takes two input arguments with values in {0,1}
and returns one output in {0,1},
● Here 0 and 1 are encoding of the truth values false and
true,
● The output is true if and only if the two inputs have
different truth values.
● XOR is non linearly separable function which can not be
modeled by perceptron.
● For such functions we have to use multi layer feed-forward
network.
These two classes (true and false) cannot be separated using a
line. Hence XOR is non linearly separable.
Input            Output
X1   X2          X1 XOR X2
0    0           0
0    1           1
1    0           1
1    1           0
[Figure: plot of XOR in the (X1, X2) plane; the 'true' points (0, 1) and (1, 0) and the 'false' points (0, 0) and (1, 1) cannot be separated by a single straight line.]
What is ADALINE
ADALINE ARCHITECTURE
Using ADALINE Network
ADALINE Widrow-Hoff Learning
Learning algorithm
Least square minimization
LSM
Application
Comparison with perceptron
SUMMARY
MADALINE
MADALINE ARCHITECTURE
Multi layer feed-forward NN (FFNN)
● FFNN is a more general network architecture, where there are
hidden layers between input and output layers.
● Hidden nodes do not directly receive inputs nor send outputs to
the external environment.
● FFNNs overcome the limitation of single-layer NN.
● They can handle non-linearly separable learning tasks.
[Figure: a 3-4-2 network — an input layer, one hidden layer, and an output layer.]
Inputs        Output of Hidden Nodes        Output Node     X1 XOR X2
X1   X2       H1            H2
0    0        0             0               –0.5 (→ 0)      0
0    1        –1 (→ 0)      1               0.5 (→ 1)       1
1    0        1             –1 (→ 0)        0.5 (→ 1)       1
1    1        0             0               –0.5 (→ 0)      0
Since we are representing two states by 0 (false) and 1 (true), we
will map negative outputs (–1, –0.5) of hidden and output layers
to 0 and positive output (0.5) to 1.
FFNN for XOR
● The ANN for XOR has two hidden nodes that realize this non-linear
separation and use the sign (step) activation function.
● Arrows from input nodes to two hidden nodes indicate the directions of
the weight vectors (1,-1) and (-1,1).
● The output node is used to combine the outputs of the two hidden
nodes.
[Figure: XOR network — input nodes X1 and X2 feed hidden nodes H1 and H2 with weight vectors (1, –1) and (–1, 1); H1 and H2 feed the output node Y with weights 1 and 1, and a threshold of –0.5 is shown.]
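A hand-wired sketch of this XOR network in Python, using a 0/1 step activation. The –0.5 bias on the output node follows the figure; the remaining thresholds are assumptions chosen so the table above is reproduced, so treat this as illustrative rather than a transcription of the slide:

```python
step = lambda v: 1 if v > 0 else 0   # step activation mapped onto {0, 1}

def xor_net(x1, x2):
    """Hand-wired XOR: hidden weight vectors (1, -1) and (-1, 1), combined by an output node."""
    h1 = step(1 * x1 - 1 * x2)        # fires when x1 > x2
    h2 = step(-1 * x1 + 1 * x2)       # fires when x2 > x1
    y = step(1 * h1 + 1 * h2 - 0.5)   # fires when either hidden node fires (bias -0.5)
    return y

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_net(x1, x2))    # prints 0, 1, 1, 0 respectively
```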
FFNN NEURON MODEL
● The classical learning algorithm of FFNN is based on the
gradient descent method.
● For this reason the activation functions used in FFNNs are
continuous functions of the weights, differentiable
everywhere.
● The activation function for node i may be defined as a
simple form of the sigmoid function in the following
manner:
where A > 0, Vi =  Wij * Yj , such that Wij is a weight of the link
from node i to node j and Yj is the output of node j.
      φ(Vi) = 1 / (1 + exp(−A · Vi))
Training Algorithm: Back-propagation
● The back-propagation algorithm learns in the same way as a
single perceptron.
● It searches for weight values that minimize the total error of
the network over the set of training examples (training set).
● Back propagation consists of the repeated application of the
following two passes:
− Forward pass: In this step, the network is activated on one example
and the error of (each neuron of) the output layer is computed.
− Backward pass: in this step the network error is used for updating
the weights. The error is propagated backwards from the output
layer through the network layer by layer. This is done by recursively
computing the local gradient of each neuron.
Feed-forward Network
Feed-forward networks often have one or more hidden layers of sigmoid neurons followed
by an output layer of linear neurons.
Multiple layers of neurons with nonlinear transfer functions allow the network to learn
nonlinear and linear relationships between input and output vectors.
The linear output layer lets the network produce values outside the range -1 to +1. On the
other hand, if you want to constrain the outputs of a network (such as between 0 and 1),
then the output layer should use a sigmoid transfer function (such as logsig).
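A minimal forward-pass sketch of such a network in NumPy (one sigmoid hidden layer followed by a linear output layer; the sizes and weights are made up for illustration):

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass: sigmoid hidden layer followed by a linear output layer."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # hidden activations constrained to (0, 1)
    return W2 @ h + b2                          # linear output, can fall outside [-1, +1]

rng = np.random.default_rng(0)
x = np.array([0.5, -1.2])
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # 2 inputs -> 4 hidden neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # 4 hidden -> 1 output neuron
print(forward(x, W1, b1, W2, b2))
```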
Backpropagation Learning
Algorithm
The following slides describe the training process of a multi-layer neural network
employing the back-propagation algorithm. To illustrate this process, the three-layer neural
network with two inputs and one output shown in the picture below is used:
Learning Algorithm
Backpropagation
Each neuron is composed of two units. The first unit adds the products of the weight
coefficients and the input signals. The second unit realises a nonlinear function, called the
neuron transfer (activation) function. Signal e is the adder output signal, and y = f(e) is the
output signal of the nonlinear element. Signal y is also the output signal of the neuron.
Learning Algorithm:
Backpropagation
To train the neural network we need a training data set. The training data set consists of
input signals (x1 and x2) assigned with the corresponding target (desired output) z.
Network training is an iterative process. In each iteration the weight coefficients of the
nodes are modified using new data from the training data set. The modification is
calculated using the algorithm described below:
Each training step starts with forcing both input signals from the training set. After this
stage we can determine the output signal values for each neuron in each network layer.
Learning Algorithm:
Backpropagation
The pictures below illustrate how the signal propagates through the network.
Symbols w(xm)n represent the weights of connections between network input xm and
neuron n in the input layer. Symbol yn represents the output signal of neuron n.
Learning Algorithm:
Backpropagation
Propagation of signals through the hidden layer. Symbols wmn represent the weights
of connections between the output of neuron m and the input of neuron n in the next
layer.
Learning Algorithm:
Backpropagation
Propagation of signals through the output layer.
Learning Algorithm:
Backpropagation
In the next algorithm step the output signal of the network, y, is
compared with the desired output value (the target), which is found in the
training data set. The difference is called the error signal δ of the output layer
neuron.
Learning Algorithm:
Backpropagation
The idea is to propagate the error signal δ (computed in a single training step)
back to all neurons whose output signals were inputs for the neuron in
question.
Learning Algorithm:
Backpropagation
The weight coefficients wmn used to propagate errors back are equal to
those used during computation of the output value. Only the direction of data
flow is changed (signals are propagated from outputs to inputs, one layer after
another). This technique is used for all network layers. If the propagated errors
come from several neurons, they are added. The illustration is below:
Learning Algorithm:
Backpropagation
When the error signal for each neuron has been computed, the weight
coefficients of each neuron's input connections may be modified. In the formulas
below, df(e)/de represents the derivative of the activation function of the neuron
whose weights are modified.
Weight Update Rule
● The Backprop weight update rule is based on the gradient
descent method:
− It takes a step in the direction yielding the maximum decrease of
the network error E.
− This direction is the opposite of the gradient of E.
● Iteration of the Backprop algorithm is usually terminated
when the sum of squares of errors of the output values for
all training data in an epoch is less than some threshold
such as 0.01
      wij ← wij + Δwij
      Δwij = −η · ∂E/∂wij
Back-prop learning algorithm
(incremental-mode)
n = 1;
initialize weights randomly;
while (stopping criterion not satisfied and n < max_iterations)
    for each example (x, d)
        - run the network with input x and compute the output y
        - update the weights in backward order, starting from those of
          the output layer:
              wji ← wji + Δwji
          with Δwji computed using the (generalized) Delta rule
    end-for
    n = n + 1;
end-while;
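A minimal incremental-mode back-propagation sketch in Python for a network with one sigmoid hidden layer and a sigmoid output, assuming a squared-error objective; all sizes, rates and names are illustrative, and convergence on a task such as XOR can depend on the random initialization:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_backprop(X, d, n_hidden=2, eta=0.5, max_epochs=10000, tol=0.01):
    """Incremental-mode back-propagation for a single-hidden-layer network."""
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-0.5, 0.5, (n_hidden, X.shape[1]))  # input -> hidden weights
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, n_hidden)                # hidden -> output weights
    b2 = 0.0
    for epoch in range(max_epochs):
        sse = 0.0
        for x, t in zip(X, d):
            # forward pass
            h = sigmoid(W1 @ x + b1)
            y = sigmoid(W2 @ h + b2)
            # backward pass: local gradients (generalized Delta rule)
            delta_out = (t - y) * y * (1 - y)
            delta_hid = delta_out * W2 * h * (1 - h)
            # gradient-descent weight updates
            W2 += eta * delta_out * h
            b2 += eta * delta_out
            W1 += eta * np.outer(delta_hid, x)
            b1 += eta * delta_hid
            sse += (t - y) ** 2
        if sse < tol:          # stop when the epoch's sum of squared errors is small
            break
    return W1, b1, W2, b2

# Example: learn XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 0], dtype=float)
train_backprop(X, d)
```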
Stopping criteria
● Total mean squared error change:
− Back-prop is considered to have converged when the absolute
rate of change in the average squared error per epoch is
sufficiently small (in the range 0.01 to 0.1).
● Generalization based criterion:
− After each epoch, the NN is tested for generalization.
− If the generalization performance is adequate then stop.
− If this stopping criterion is used, then the part of the training set
used for testing the network's generalization will not be used for
updating the weights.
● Data representation
● Network Topology
● Network Parameters
● Training
● Validation
NN DESIGN ISSUES
● Data representation depends on the problem.
● In general ANNs work on continuous (real valued) attributes.
Therefore symbolic attributes are encoded into continuous ones.
● Attributes of different types may have different ranges of values
which affect the training process.
● Normalization may be used, like the following one which scales
each attribute to assume values between 0 and 1.
for each value xi of ith attribute, mini and maxi are the minimum and
maximum value of that attribute over the training set.
      xi ← (xi − mini) / (maxi − mini)
Data Representation
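A min-max normalization sketch of the formula above (NumPy, column-wise with one column per attribute; names are illustrative):

```python
import numpy as np

def minmax_normalize(X):
    """Scale each attribute (column) of X to [0, 1] over the training set."""
    mins = X.min(axis=0)
    maxs = X.max(axis=0)
    return (X - mins) / (maxs - mins)   # assumes maxs > mins for every attribute

X = np.array([[2.0, 100.0],
              [4.0, 300.0],
              [6.0, 200.0]])
print(minmax_normalize(X))
# column 1 -> [0.0, 0.5, 1.0]; column 2 -> [0.0, 1.0, 0.5]
```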
● The number of layers and neurons depend on the specific
task.
● In practice this issue is solved by trial and error.
● Two types of adaptive algorithms can be used:
− start from a large network and successively remove some neurons
and links until network performance degrades.
− begin with a small network and introduce new neurons until
performance is satisfactory.
Network Topology
● How are the weights initialized?
● How is the learning rate chosen?
● How many hidden layers and how many
neurons?
● How many examples in the training set?
Network parameters
Initialization of weights
● In general, initial weights are randomly chosen, with
typical values between -1.0 and 1.0 or -0.5 and 0.5.
● If some inputs are much larger than others, random
initialization may bias the network to give much more
importance to larger inputs.
● In such a case, weights can be initialized as follows:
      wij = ± 1 / (2 · N · |xi|),               i = 1, …, N     For weights from the input to the first layer
      wjk = ± 1 / (2 · N · |Σi wij · xi|),      i = 1, …, N     For weights from the first to the second layer
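One plausible reading of this heuristic in NumPy; since the slide's formula is only partially recoverable, treat this strictly as an assumed sketch for the input-to-first-layer case:

```python
import numpy as np

def init_first_layer(x, n_hidden, rng):
    """Initialize input-to-hidden weights with magnitude ~ 1 / (2 * N * |x_i|)."""
    N = len(x)
    scale = 1.0 / (2.0 * N * np.abs(x))              # one scale per input attribute
    sign = rng.choice([-1.0, 1.0], size=(n_hidden, N))
    return sign * scale                              # broadcast the scale across hidden units

rng = np.random.default_rng(0)
x = np.array([0.2, 5.0, 1.0])                        # a representative input vector
print(init_first_layer(x, n_hidden=4, rng=rng))
```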
● The right value of  depends on the application.
● Values between 0.1 and 0.9 have been used in
many applications.
● Other heuristics is that adapt  during the
training as described in previous slides.
Choice of learning rate
Training
● Rule of thumb:
− the number of training examples should be at least five to ten
times the number of weights of the network.
● Other rule:
      N ≥ |W| / (1 − a)
  where |W| = number of weights and a = expected accuracy on the test set.
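A quick numeric check of this rule (plain Python, illustrative numbers):

```python
# N >= |W| / (1 - a): required number of training examples
num_weights = 100     # |W|
accuracy    = 0.9     # desired accuracy a on the test set
N = num_weights / (1 - accuracy)
print(N)              # 1000.0 examples needed
```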
Contd..
● The networks generated using these
weights and input vectors are stable, except
X2.
● X2 stabilizes to X1 (which is at Hamming
distance 1).
● Finally, with the obtained weights and
stable states (X1 and X3), we can stabilize
any new (partial) pattern to one of those.