Back-Propagation Algorithm
Basic Neuron Model In A
Feedforward Network
• Inputs xi arrive through pre-synaptic connections
• Synaptic efficacy is modeled using real weights wi
• The response of the neuron is a nonlinear function f of its weighted inputs
Task
Plot the following type of Neural activation functions.
1(a) Threshold Function
φ(v)= +1 for v≥0
0 for v<0
1(b) Threshold Function
φ(v)= +1 for v≥0
-1 otherwise
2 Piecewise linear Function
φ(v)= 1 for v≥+1/2
v for +1/2>v>-1/2
0 for v≤-1/2
3(a) Sigmoid Function
φ(v)=1/(1+ exp(-λv))
3(b) Sigmoid Function
φ(v)=2/(1+ exp(-λv))
3(c) Sigmoid Function
φ(v)=tanh(λv)
For 3, vary the value of λ and show the changes in the graph.
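A minimal plotting sketch for these activations (the plotting library and the λ values shown are illustrative assumptions, not part of the task statement):

import numpy as np
import matplotlib.pyplot as plt

v = np.linspace(-5, 5, 1001)

def threshold_01(v):                  # 1(a): +1 for v >= 0, 0 for v < 0
    return np.where(v >= 0, 1.0, 0.0)

def threshold_pm1(v):                 # 1(b): +1 for v >= 0, -1 otherwise
    return np.where(v >= 0, 1.0, -1.0)

def piecewise_linear(v):              # 2: 1 for v >= 1/2, v in between, 0 for v <= -1/2
    return np.where(v >= 0.5, 1.0, np.where(v > -0.5, v, 0.0))

def logistic(v, lam):                 # 3(a): 1 / (1 + exp(-lambda * v))
    return 1.0 / (1.0 + np.exp(-lam * v))

fig, axes = plt.subplots(2, 3, figsize=(12, 6))
axes[0, 0].plot(v, threshold_01(v));     axes[0, 0].set_title("1(a) Threshold")
axes[0, 1].plot(v, threshold_pm1(v));    axes[0, 1].set_title("1(b) Threshold (±1)")
axes[0, 2].plot(v, piecewise_linear(v)); axes[0, 2].set_title("2 Piecewise linear")
for lam in (0.5, 1.0, 2.0):               # vary lambda for the sigmoid family
    axes[1, 0].plot(v, logistic(v, lam), label=f"λ={lam}")
    axes[1, 1].plot(v, 2.0 * logistic(v, lam), label=f"λ={lam}")
    axes[1, 2].plot(v, np.tanh(lam * v), label=f"λ={lam}")
for ax, title in zip(axes[1], ["3(a) 1/(1+exp(-λv))", "3(b) 2/(1+exp(-λv))", "3(c) tanh(λv)"]):
    ax.set_title(title); ax.legend()
plt.tight_layout()
plt.show()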
Multiple Input Neuron
Single Layer Artificial Neural Networks
Layer of Neurons
Multilayer Network
Banana & Apple Sorter
Prototype Vectors
Banana Apple Problem
Illustration of a Neural Network
Different networks
☻Perceptron
– Feedforward Network, Linear Decision Boundary, One Neuron for
Each Decision
☻Hamming Network
☻Hopfield Network
- Dynamic Associative Memory Network
☻Error Back Propagation network
☻Radial basis network
☻ART
☻Brain in a box neural network
☻Cellular neural Network
☻Neocognitron
☻Functional
Network Topology
Feedforward: inputs flow forward through the network to the outputs
Feedback: outputs are fed back toward the inputs
1970s
The backpropagation algorithm was first proposed by Paul Werbos in the 1970s. However, it was rediscovered in 1986 by Rumelhart and McClelland and became widely used.
It took about 30 years before the error backpropagation (backprop, for short) algorithm was popularized.
Differences In Networks
Feedforward Networks
• Solutions are known
• Weights are learned
• Evolves in the weight space
• Used for:
– Prediction
– Classification
– Function approximation
Feedback Networks
• Solutions are unknown
• Weights are prescribed
• Evolves in the state space
• Used for:
– Constraint satisfaction
– Optimization
– Feature matching
Architecture
A Back Prop network has at least 3 layers of units:
an input layer, at least one intermediate hidden layer, &
an output layer. Connection weights in a Back Prop
network are one way. Units are connected in a feed-
forward fashion with input units fully connected to units
in the hidden layer & hidden units fully connected to units
in the output layer. When a Back Prop network is cycled,
an input pattern is propagated forward to the output units
through the intervening input-to-hidden and hidden-to-
output weights.
Inputs To Neurons
• Arise from other neurons or from outside
the network
• Nodes whose inputs arise outside the
network are called input nodes and simply
copy values
• An input may excite or inhibit the response
of the neuron to which it is applied,
depending upon the weight of the
connection
Fully connected network
Weights
• Represent synaptic efficacy and may be
excitatory or inhibitory
• Normally, positive weights are considered
as excitatory while negative weights are
thought of as inhibitory
• Learning is the process of modifying the
weights in order to produce a network that
performs some function
Finding net
Output
• The response function is normally nonlinear
• Samples include
– Sigmoid
– Piecewise linear
Sigmoid: f(x) = 1 / (1 + exp(-λx))
Piecewise linear: f(x) = x for x ≥ θ, 0 for x < θ
Back propagation Networks
Example: inputs I1, I2, I3 feed hidden units H1, H2, which feed output units O1, O2; each layer also receives a bias input fixed at 1. Wi,j are the input-to-hidden weights and Wj,k the hidden-to-output weights.
Hidden Layer: H(x) = 1 / (1 + exp(-Σi wx,i Ii))
Output Layer: O(x) = 1 / (1 + exp(-Σj wx,j Hj))
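A sketch of the forward pass through the 3-2-2 network above, with the bias units written as separate bias vectors (the array shapes and the random weight values are illustrative assumptions):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(I, W_in_hid, b_hid, W_hid_out, b_out):
    # Hidden activations: H_x = 1 / (1 + exp(-(sum_i w_{x,i} I_i + bias)))
    H = sigmoid(I @ W_in_hid + b_hid)
    # Output activations: O_x = 1 / (1 + exp(-(sum_j w_{x,j} H_j + bias)))
    O = sigmoid(H @ W_hid_out + b_out)
    return H, O

rng = np.random.default_rng(0)
I = np.array([0.1, 0.9, 0.5])                                                # I1, I2, I3
H, O = forward(I,
               rng.uniform(-0.5, 0.5, (3, 2)), rng.uniform(-0.5, 0.5, 2),    # W_i,j and hidden biases
               rng.uniform(-0.5, 0.5, (2, 2)), rng.uniform(-0.5, 0.5, 2))    # W_j,k and output biases
print(H, O)   # H1, H2 and O1, O2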
Weight update
Backpropagation Preparation
• Training Set
A collection of input-output patterns that are
used to train the network
• Testing Set
A collection of input-output patterns that are
used to assess network performance
• Learning Rate-η
A scalar parameter, analogous to step size in
numerical integration, used to set the rate of
adjustments
Learning
• Learning occurs during a training phase in which each input
pattern in a training set is applied to the input units and then
propagated forward.
• The pattern of activation arriving at the output layer is then
compared with the correct output pattern to calculate an
error signal.
• The error signal for each such target output pattern is then
back propagated from the outputs to the inputs in order to
appropriately adjust the weights in each layer of the network.
Learning
• The process continues for several cycles until the error
falls below a predefined limit.
• After a BackProp network has learned the correct
classification for a set of inputs, it can be tested on a
second set of inputs to see how well it classifies
untrained patterns.
• Thus, an important consideration in applying
BackProp learning is how well the network
generalizes.
The basic principles of the back propagation algorithm are:
(1) the error of the output signal of a neuron is used to
adjust its weights such that the error decreases, and (2)
the error in hidden layers is estimated proportional to the
weighted sum of the (estimated) errors in the layer
above.
Patterns
Training patterns (70%)
Testing patterns (30%)
During the training, the data is presented to the network
several thousand times. For each data sample, the
current output of the network is calculated and compared
to the "true" target value. The error signal dj of neuron j
is computed from the difference between the target and
the calculated output. For hidden neurons, this difference
is estimated by the weighted error signals of the layer
above. The error terms are then used to adjust the
weights wij of the neural network.
A Pseudo-Code Algorithm
• Randomly choose the initial weights
• While error is too large
– For each training pattern (presented in random order)
• Apply the inputs to the network
• Calculate the output for every neuron from the input layer,
through the hidden layer(s), to the output layer
• Calculate the error at the outputs
• Use the output error to compute error signals for pre-output
layers
• Use the error signals to compute weight adjustments
• Apply the weight adjustments
– Periodically evaluate the network performance
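A compact sketch of this pseudo-code for one hidden layer, using the logistic activation and the error signals defined on the following slides (the layer sizes, learning rate, and stopping threshold are illustrative assumptions):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(patterns, n_hidden=2, eta=0.5, target_error=0.01, max_epochs=20000, seed=0):
    """patterns: list of (input_vector, target_vector) pairs."""
    rng = np.random.default_rng(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    # Randomly choose the initial weights (small values around zero)
    W1 = rng.uniform(-0.5, 0.5, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, n_out)); b2 = np.zeros(n_out)
    for epoch in range(max_epochs):
        tsse = 0.0
        for i in rng.permutation(len(patterns)):       # present patterns in random order
            x = np.asarray(patterns[i][0], float)
            t = np.asarray(patterns[i][1], float)
            # Forward pass: input layer -> hidden layer -> output layer
            h = sigmoid(x @ W1 + b1)
            o = sigmoid(h @ W2 + b2)
            tsse += 0.5 * np.sum((t - o) ** 2)          # error at the outputs
            # Error signals: output layer, then back-propagated to the hidden layer
            delta_o = (t - o) * o * (1.0 - o)
            delta_h = h * (1.0 - h) * (delta_o @ W2.T)
            # Compute and apply the weight adjustments
            W2 += eta * np.outer(h, delta_o); b2 += eta * delta_o
            W1 += eta * np.outer(x, delta_h); b1 += eta * delta_h
        if tsse < target_error:                          # "while error is too large"
            break
    return W1, b1, W2, b2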
Network Error
• Total-Sum-Squared-Error (TSSE)
• Root-Mean-Squared-Error (RMSE)
TSSE = (1/2) Σpatterns Σoutputs (desired − actual)²
RMSE = √( 2 · TSSE / (#patterns · #outputs) )
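The same two measures as small helpers (they assume the desired and actual outputs are arranged as a patterns × outputs array):

import numpy as np

def tsse(desired, actual):
    """Total-Sum-Squared-Error: 1/2 * sum over patterns and outputs of (desired - actual)^2."""
    return 0.5 * np.sum((np.asarray(desired) - np.asarray(actual)) ** 2)

def rmse(desired, actual):
    """Root-Mean-Squared-Error: sqrt(2 * TSSE / (#patterns * #outputs))."""
    n_patterns, n_outputs = np.asarray(desired).shape
    return np.sqrt(2.0 * tsse(desired, actual) / (n_patterns * n_outputs))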
Apply Inputs From A Pattern
• Apply the value of each input parameter to each input node
• Input nodes compute only the identity function
Calculate Outputs For Each Neuron Based On The Pattern
• The output from neuron j for pattern p is Opj, where
Opj(netj) = 1 / (1 + exp(-λ netj))
netj = bias·Wbias + Σk Opk Wjk
and k ranges over the input indices and Wjk is the weight on the connection from input k to neuron j
Calculate The Error Signal For
Each Output Neuron
• The output neuron error signal δpj is given
by δpj=(Tpj-Opj) Opj (1-Opj)
• Tpj is the target value of output neuron j for
pattern p
• Opj is the actual output value of output
neuron j for pattern p
Calculate The Error Signal For
Each Hidden Neuron
• The hidden neuron error signal δpj is given by
δpj = Opj (1 − Opj) Σk δpk Wkj
where δpk is the error signal of a post-synaptic neuron k and Wkj is the weight of the connection from hidden neuron j to the post-synaptic neuron k
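Both error signals written as vectorized helpers (the weight-matrix orientation, with the hidden-to-output weights stored as a (n_hidden, n_outputs) array, is an assumption for illustration):

import numpy as np

def output_deltas(T, O):
    # delta_pj = (T_pj - O_pj) * O_pj * (1 - O_pj)
    return (T - O) * O * (1.0 - O)

def hidden_deltas(H, delta_out, W_hid_out):
    # delta_pj = O_pj * (1 - O_pj) * sum_k delta_pk * W_kj
    # W_hid_out[j, k] holds W_kj, the weight from hidden neuron j to output neuron k
    return H * (1.0 - H) * (delta_out @ W_hid_out.T)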
Calculate And Apply Weight
Adjustments
• Compute weight adjustments ∆Wji at time
t by
∆Wji(t)= η δpj Opi
• Apply weight adjustments according to
Wji(t+1) = Wji(t) + ∆Wji(t)
• Some add a momentum term α∗∆Wji(t-1)
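One way to write the adjustment, with the optional momentum term included (the function name and the matrix layout, pre-synaptic units along rows, are illustrative assumptions):

import numpy as np

def weight_step(delta_post, act_pre, eta, alpha=0.0, prev_step=0.0):
    """dW_ji(t) = eta * delta_pj * O_pi, plus an optional momentum term alpha * dW_ji(t-1)."""
    return eta * np.outer(act_pre, delta_post) + alpha * prev_step

# Applying the adjustment: W_ji(t+1) = W_ji(t) + dW_ji(t)
# dW = weight_step(delta_o, h, eta=0.5, alpha=0.9, prev_step=dW)
# W2 = W2 + dW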
• Thus, the network adjusts its weights after each data
sample. This learning process is in fact a gradient
descent in the error surface of the weight space - with all
its drawbacks. The learning algorithm is slow and prone
to getting stuck in a local minimum.
Simulation Issues
 How to Select Initial Weights
 Local Minima
 Solutions to Local minima
 Rate of Learning
 Stopping Criterion
 Initialization
• For the standard back propagation algorithm, the initial
weights of the multi-layer perceptron have to be
relatively small. They can, for instance, be selected
randomly from a small interval around zero. During
training they are slowly adapted. Starting with small
weights is crucial, because large weights are rigid and
cannot be changed quickly.
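A minimal initialization sketch following this advice (the interval half-width of 0.5 is an arbitrary illustrative choice; scaling it by the fan-in is one common refinement):

import numpy as np

rng = np.random.default_rng(42)

def init_layer(n_pre, n_post, half_width=0.5):
    """Small random weights drawn uniformly from an interval around zero."""
    # Optionally shrink the interval with the fan-in, e.g. half_width / sqrt(n_pre),
    # so the induced local fields stay in the sigmoid's roughly linear region.
    return rng.uniform(-half_width, half_width, size=(n_pre, n_post))

W1 = init_layer(3, 2)   # input-to-hidden
W2 = init_layer(2, 2)   # hidden-to-output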
Sequential & Batch modes
For a given training set, back-propagation learning proceeds in two basic ways:
1. Sequential Mode
2. Batch Mode
Sequential mode
• The sequential mode of back-propagation learning is also referred to as on-line, pattern, or stochastic mode.
• To be specific, consider an epoch consisting of N training examples arranged in the order (x(1),d(1)), …, (x(N),d(N)).
• The first example pair (x(1),d(1)) in the epoch is presented to the network, and the sequence of forward and backward computations described previously is performed, resulting in certain adjustments to the synaptic weights and bias levels of the network.
• The second example pair (x(2),d(2)) in the epoch is then presented, and the sequence of forward and backward computations is repeated, resulting in further adjustments to the synaptic weights and bias levels. This process continues until the last example pair (x(N),d(N)) in the epoch is accounted for.
Batch Propagation
• In this mode of back-propagation learning, weight updating is performed after the presentation of all the training examples that constitute an epoch.
• For a particular epoch, the cost function is the average squared error, reproduced here in composite form:
ξav = (1/(2N)) Σn=1..N Σj∈C ej²(n)
• Let N denote the total number of patterns contained in the training set. The average squared error energy is obtained by summing ξ(n) over all n and then normalizing with respect to the set size N, as shown by:
ξav = (1/N) Σn=1..N ξ(n)
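The two modes differ only in when the adjustments are applied; a schematic comparison, assuming a hypothetical backprop_step helper that returns the weight adjustments computed for one example:

# Sequential (on-line) mode: adjust the weights after every example.
def sequential_epoch(patterns, weights, backprop_step, eta):
    for x, d in patterns:
        weights = weights + eta * backprop_step(weights, x, d)
    return weights

# Batch mode: accumulate the adjustments over the whole epoch, then apply once,
# which amounts to descending the average squared error for the epoch.
def batch_epoch(patterns, weights, backprop_step, eta):
    total = sum(backprop_step(weights, x, d) for x, d in patterns)
    return weights + eta * total / len(patterns)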
Stopping Criteria
• The back-propagation algorithm cannot, in general, be shown to converge.
• To formulate a criterion, it is logical to think in terms of the unique properties of a local or global minimum.
• The back-propagation algorithm may be considered to have converged when the Euclidean norm of the gradient vector falls below a sufficiently small threshold.
• Alternatively, the algorithm may be considered to have converged when the absolute rate of change in the average squared error per epoch is sufficiently small.
• The drawback of this convergence criterion is that, for successful trials, learning time may be long.
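Both tests expressed as a small check (the threshold values are illustrative assumptions):

import numpy as np

def has_converged(gradient, prev_avg_error, avg_error,
                  grad_threshold=1e-4, error_change_threshold=1e-5):
    """Stop when the gradient norm is sufficiently small, or when the average
    squared error changes by a sufficiently small amount per epoch."""
    small_gradient = np.linalg.norm(gradient) <= grad_threshold
    small_change = abs(prev_avg_error - avg_error) <= error_change_threshold
    return small_gradient or small_change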
• The back-propagation algorithm makes adjustments by
computing the derivative, or slope of the network error
with respect to each neuron’s output. It attempts to
minimize the overall error by descending this slope to the
minimum value for every weight. It advances one step
down the slope each epoch. If the network takes steps
that are too large, it may pass the global minimum. If it
takes steps that are too small, it may settle into local minima,
or take an inordinate amount of time to arrive at the
global minimum. The ideal step size for a given problem
requires detailed, high-order derivative analysis, a task
not performed by the algorithm.
Minima
• Local minima
• Global minima
Local Minima
For simple 2 layer networks (without a hidden layer), the
error surface is bowl shaped and using gradient-descent to
minimize error is not a problem; the network will always
find an errorless solution (at the bottom of the bowl). Such
errorless solutions are called global minima.
However, adding hidden layers makes the error surface more complex. Since some minima are deeper than others, it is possible that gradient descent will not find the global minimum. Instead, the network may fall into a local minimum, which represents a suboptimal solution.
• The algorithm cycles through the training samples as follows:
• Initialization
• Presentation of training Examples
• Forward Computation
Initialization
• Assuming that no prior information is available, pick the
synaptic weights and thresholds from a uniform
distribution whose mean is zero & whose variance is
chosen to make the standard deviation of the induced
local fields of the neurons lie at the transition between
the linear and saturated parts of the sigmoid activation
function.
Presentation of training Examples
Present the network with an epoch of training examples. For each example in the set, ordered in some fashion, perform the sequence of forward and backward computations described below.
Solutions to Local minima
Usual solution: more hidden layers. Logic:
although additional hidden units increase the complexity of the error surface, the extra dimensionality increases the number of possible escape routes.
Our solution – Tunneling
Rate of Learning
If the learning rate η is very small, then the
algorithm proceeds slowly, but accurately follows
the path of steepest descent in weight space.
If η is large, the algorithm may oscillate.
A simple method of effectively increasing the rate of
learning is to modify the delta rule by including a
momentum term:
Δwji(n) = α Δwji(n−1) + η δj(n) yi(n)
where α is a positive constant termed the momentum
constant. This is called the generalized delta rule.
 The effect is that if the basic delta rule is consistently
pushing a weight in the same direction, then it gradually
gathers "momentum" in that direction.
Forward Computation
An Example: Exclusive “OR”
• Training set
– ((0.1, 0.1), 0.1)
– ((0.1, 0.9), 0.9)
– ((0.9, 0.1), 0.9)
– ((0.9, 0.9), 0.1)
• Testing set
– Use at least 121 pairs equally spaced on the
unit square and plot the results
– Omit the training set (if desired)
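The training set and an 11 × 11 testing grid written out as arrays (using exactly 121 test points, the stated minimum, is an illustrative choice):

import numpy as np

# Training set: XOR with 0.1 / 0.9 standing in for logical 0 / 1
train_x = np.array([[0.1, 0.1], [0.1, 0.9], [0.9, 0.1], [0.9, 0.9]])
train_t = np.array([[0.1], [0.9], [0.9], [0.1]])

# Testing set: 121 points equally spaced on the unit square
g = np.linspace(0.0, 1.0, 11)
test_x = np.array([[a, b] for a in g for b in g])
# After training, evaluate the network on test_x and plot the outputs (e.g. as an 11x11 image).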
An Example (continued): Network Architecture
[Figure: inputs feed a hidden layer, which feeds the output(s)]
An Example (continued): Network Architecture
Sample input: 0.1 and 0.9; target output: 0.9; the 1's are bias inputs.
Feedforward Network Training by
Backpropagation: Process
Summary
• Select an architecture
• Randomly initialize weights
• While error is too large
– Select training pattern and feedforward to find
actual network output
– Calculate errors and backpropagate error
signals
– Adjust weights
• Evaluate performance using the test set
An Example (continued): Network Architecture
Sample input: 0.1 and 0.9; target output: 0.9; actual output: ??? (the weights, shown as ??, are still to be determined by training); the 1's are bias inputs.
Backpropagation
• Very powerful: given enough hidden units, a network can learn essentially any function.
• It has the usual trade-off of generalization vs. memorization: with too many units, the network tends to memorize the input and not generalize well. Some schemes exist to "prune" the neural network.
BackProp networks are not limited in their use, because they can adapt their weights to acquire new knowledge. BackProp networks learn by example and can be used to make predictions.
Write a program to train and simulate a neural network for each of the following networks:
– Input nodes = 2 and output nodes = 1
– Input nodes = 3 and output nodes = 1
Inputs    Outputs
A  B      Y
0  0      0
0  1      1
1  0      1
1  1      0

Inputs       Outputs
A  B  C      Y
0  0  0      0
0  0  1      0
0  1  0      0
0  1  1      0
1  0  0      1
1  0  1      1
1  1  0      1
1  1  1      1
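A possible starting point for the exercise, reusing the train() sketch given after the pseudo-code section above (the hidden-layer size is an assumption):

# Truth tables written as (inputs, target) pairs
xor_2in   = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
table_3in = [([a, b, c], [a]) for a in (0, 1) for b in (0, 1) for c in (0, 1)]  # in this table, Y equals input A

net_2in = train(xor_2in,   n_hidden=2)   # 2 inputs, 1 output
net_3in = train(table_3in, n_hidden=2)   # 3 inputs, 1 output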
References
• Artificial Neural Network – Simon Haykin
• Artificial Neural Network – Jacek Zurada