Parveen Malik
Assistant Professor
KIIT University
Neural Networks
Backpropagation
Background
• The Perceptron Learning Algorithm and Hebbian Learning can classify input patterns only if the patterns are linearly separable.
• We need an algorithm that can train multiple layers of perceptrons and classify patterns that are not linearly separable.
• The algorithm should also be able to use non-linear activation functions.
[Figure: two scatter plots in the $(x_1, x_2)$ plane showing Class 1 vs. Class 2; one data set is linearly separable, the other is linearly non-separable.]
• Non-linear decision boundaries are needed
• The perceptron algorithm can't be used
• A variation of the gradient-descent (GD) rule is used
• More layers are required
• Non-linear activation functions are required
Perceptron algorithm: $W_i^{new} = W_i^{old} + (t - a)\,x_i$

Gradient descent algorithm: $W_i^{new} = W_i^{old} - \eta\,\dfrac{\partial L}{\partial w_i}$, with loss function $L = \dfrac{1}{2}(t - a)^2$
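As a concrete illustration (a minimal sketch, not from the slides; the helper names, eta, and the derivative argument are assumptions), the two update rules can be written side by side in Python:

```python
import numpy as np

def perceptron_update(w, x, t, a):
    """Perceptron rule: w_new = w_old + (t - a) * x, with t, a the 0/1 target and output."""
    return w + (t - a) * x

def gradient_descent_update(w, x, t, a, f_prime_at_wx, eta=0.1):
    """Gradient-descent rule on L = 0.5*(t - a)^2, where a = f(w.x):
    dL/dw = -(t - a) * f'(w.x) * x (chain rule), so w_new = w_old - eta * dL/dw."""
    dL_dw = -(t - a) * f_prime_at_wx * x
    return w - eta * dL_dw
```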
Background- Back Propagation
• The perceptron learning rule of Frank Rosenblatt and the LMS algorithm of Bernard Widrow and
Marcian Hoff were designed to train single-layer perceptron-like networks.
• Single-layer networks suffer from the disadvantage that they are only able to solve linearly separable
classification problems. Both Rosenblatt and Widrow were aware of these limitations and proposed
multilayer networks that could overcome them, but they were not able to generalize their algorithms
to train these more powerful networks.
• The first description of an algorithm to train multilayer networks was contained in the 1974 thesis of Paul Werbos. The thesis presented the algorithm in the context of general networks, with neural networks as a special case, and was not disseminated in the neural network community.
• It was not until the mid-1980s that the backpropagation algorithm was rediscovered and widely publicized. It was rediscovered independently by David Rumelhart, Geoffrey Hinton and Ronald Williams (1986), by David Parker (1985), and by Yann Le Cun (1985).
• The algorithm was popularized by its inclusion in the book Parallel Distributed Processing [RuMc86], which described the work of the Parallel Distributed Processing Group led by psychologists David Rumelhart and James McClelland.
• The multilayer perceptron, trained by the backpropagation algorithm, is currently the most widely
used neural network.
Network Design
Problem: Decide whether you watch a movie or not.
Step 1: Design – The output can be Yes (1) or No (0), so one neuron (perceptron) is sufficient.
Step 2: Choose a suitable activation function at the output along with a rule to update the weights (hard-limit function for the perceptron learning algorithm, sigmoid for the Widrow–Hoff or delta rule).
$W_i^{new} = W_i^{old} - \eta\,\dfrac{\partial L}{\partial w_i}$

$L = \dfrac{1}{2}(y - \hat{y})^2 = \dfrac{1}{2}\big(y - f(wx + b)\big)^2$

$\dfrac{\partial L}{\partial w} = 2 \cdot \dfrac{1}{2}\big(y - f(wx + b)\big)\cdot\left(-\dfrac{\partial f(wx + b)}{\partial w}\right) = -\,(y - \hat{y})\,f'(wx + b)\,x$
[Figure: a single neuron; input x (director, actor, genre, or IMDB rating), weight w, bias $w_0 = b$ fed by a constant input 1, summation $wx + b$, activation f, and output Yes (1) or No (0).]

$\hat{y} = f(wx + b) = \dfrac{1}{1 + e^{-(wx + b)}}$
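A minimal sketch of this single sigmoid neuron and one gradient-descent step (variable names and the learning rate are illustrative, not taken from the slide):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(w, b, x, y, eta=0.25):
    """One gradient-descent step on L = 0.5*(y - yhat)^2 with yhat = sigmoid(w*x + b)."""
    yhat = sigmoid(w * x + b)
    # sigmoid'(wx+b) = yhat * (1 - yhat), so dL/dw = -(y - yhat) * yhat * (1 - yhat) * x
    grad_w = -(y - yhat) * yhat * (1 - yhat) * x
    grad_b = -(y - yhat) * yhat * (1 - yhat)
    return w - eta * grad_w, b - eta * grad_b
```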
Network Design
Problem: Sort the students into 4 houses based on three qualities: lineage, choice, and ethics.

Step 1: Design – Here the input vector is 3-D, i.e. for each student, $\text{Student 1} = \begin{bmatrix} L_1 \\ C_1 \\ E_1 \end{bmatrix}$, $\text{Student 2} = \begin{bmatrix} L_2 \\ C_2 \\ E_2 \end{bmatrix}$.
[Figure: left, a single neuron N with inputs $x_1, x_2, x_3$, bias input $x_0 = 1$, weights $w_1, w_2, w_3$, $w_0 = b$, and a Yes (1) / No (0) output; right, a layer of two neurons $N_1, N_2$ sharing the inputs $x_1, x_2, x_3$ with weights $w_{ij}$ and biases $b_1, b_2$.]

$\hat{y}_1 = f(w_{11}x_1 + w_{12}x_2 + w_{13}x_3 + b_1)$
$\hat{y}_2 = f(w_{21}x_1 + w_{22}x_2 + w_{23}x_3 + b_2)$

Since each output is binary, the actual output pair $(\hat{y}_1, \hat{y}_2)$ forms a 2-bit code that is compared with the target code $(y_1, y_2)$ assigned to the four houses A, B, C, D. A concrete sketch of this design follows below.
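A minimal sketch of the two-neuron design mapping a 3-D student vector (lineage, choice, ethics) to a 2-bit house code; the weight values and the 0.5 threshold are illustrative assumptions, not from the slide:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W = np.array([[ 0.2, -0.5,  0.7],   # w11, w12, w13  (illustrative values)
              [ 0.4,  0.1, -0.3]])  # w21, w22, w23
b = np.array([0.0, 0.0])            # b1, b2

student = np.array([1.0, 0.0, 1.0])        # [lineage, choice, ethics] for one student
y_hat = sigmoid(W @ student + b)           # (yhat1, yhat2)
house_code = (y_hat >= 0.5).astype(int)    # threshold each output to get a 2-bit code
print(y_hat, house_code)
```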
Network Design
Step 2 : Choosing the activation function and rule to update weights
Loss function: $L = \dfrac{1}{2}(y - \hat{y})^2$

[Figure: the two-neuron layer again, with inputs $x_1, x_2, x_3$ (bias inputs of 1), outputs $\hat{y}_1, \hat{y}_2$, and a table comparing the actual output code with the target code for houses A, B, C, D.]

Weight update rule: $W_{ij}(t+1) = W_{ij}(t) - \eta\,\dfrac{\partial L}{\partial w_{ij}}$

For example, $\dfrac{\partial L}{\partial w_{11}} = -\,(y_1 - \hat{y}_1)\,f'(w_{11}x_1 + w_{12}x_2 + w_{13}x_3 + b_1)\,x_1$
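A sketch of one update step for this two-neuron layer, assuming sigmoid activations and treating each output independently (notation follows the slide; the data values passed in are placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_step(W, b, x, y, eta=0.25):
    """One gradient-descent step on L = 0.5 * sum((y - yhat)^2) for a single layer.
    W: 2x3 weights, b: 2 biases, x: 3-D input, y: 2-D target code."""
    yhat = sigmoid(W @ x + b)
    delta = -(y - yhat) * yhat * (1 - yhat)   # dL/d(pre-activation) per output neuron
    W_new = W - eta * np.outer(delta, x)      # dL/dW_ij = delta_i * x_j
    b_new = b - eta * delta
    return W_new, b_new
```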
Network Architectures (Complex)
[Figure: a fully connected network; input layer $x_1, x_2, \ldots, x_i, \ldots, x_n$, hidden layer $h_1, h_2, \ldots, h_j, \ldots, h_m$, output layer $y_1, y_2, \ldots, y_l, \ldots, y_k$, with weight matrix $W^{(1)}$ between input and hidden layers and $W^{(2)}$ between hidden and output layers.]
Network Architectures (More Complex)
[Figure: a deeper network; four input units, two hidden layers $h_1^{(1)}, h_2^{(1)}, h_3^{(1)}$ and $h_1^{(2)}, h_2^{(2)}, h_3^{(2)}$, two outputs $y_1, y_2$, connected by weight matrices $W^{(1)}, W^{(2)}, W^{(3)}$.]
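For the deeper architecture the forward pass is just repeated matrix multiplication followed by the activation; below is a minimal sketch with illustrative shapes matching the figure (four inputs, two hidden layers of three units, two outputs) and randomly initialized weights, all of which are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
shapes = [(3, 4), (3, 3), (2, 3)]                      # W(1), W(2), W(3)
Ws = [rng.normal(scale=0.1, size=s) for s in shapes]
bs = [np.zeros(s[0]) for s in shapes]

x = np.array([1.0, 0.0, 1.0, 0.5])   # an arbitrary 4-D input
z = x
for W, b in zip(Ws, bs):             # layer by layer: z <- sigma(W z + b)
    z = sigmoid(W @ z + b)
print(z)                             # outputs y1, y2
```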
Back-propagation Algorithm (Generalized Expression)

[Figure: three successive layers of units with pre-activations and activations $(a_l, z_l) \to (a_i, z_i) \to (a_j, z_j)$, connected by weights $W_{il}$ and $W_{ji}$; the error signal flows backwards toward the input layer.]

Cost function: $L = \dfrac{1}{2}(y - \hat{y})^2$, with $z_i = \sigma(a_i)$ and $a_i = \sum_l W_{il}\, z_l$.

Define $\delta_i = \dfrac{\partial L}{\partial a_i}$ and $\delta_j = \dfrac{\partial L}{\partial a_j}$.

Output layer: $\delta_j = \dfrac{\partial L}{\partial a_j} = -\,(y - \hat{y})\,\dfrac{\partial \hat{y}}{\partial a_j}$

Hidden layer (chain rule over the units $j$ that unit $i$ feeds): $\delta_i = \dfrac{\partial L}{\partial a_i} = \sum_j \dfrac{\partial L}{\partial a_j}\,\dfrac{\partial a_j}{\partial a_i}$, where $\dfrac{\partial a_j}{\partial a_i} = \dfrac{\partial a_j}{\partial z_i}\,\dfrac{\partial z_i}{\partial a_i} = W_{ji}\,\sigma'(a_i)$

Hence $\delta_i = \sigma'(a_i)\sum_j \delta_j W_{ji}$, with $\sigma'(a_i) = \sigma(a_i)\big(1 - \sigma(a_i)\big)$ for the sigmoid, and the weight gradient is

$\dfrac{\partial L}{\partial W_{il}} = \delta_i\, z_l$
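A compact sketch of these generalized expressions for a fully connected sigmoid network (the list-of-matrices representation and the variable names are illustrative assumptions, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(weights, biases, x, y):
    """Gradients of L = 0.5*(y - yhat)^2 for a stack of sigmoid layers.
    weights[k] maps layer-k activations to layer-(k+1) pre-activations."""
    zs = [x]                                   # activations, zs[0] = input
    for W, b in zip(weights, biases):          # forward pass
        zs.append(sigmoid(W @ zs[-1] + b))
    # Output delta: dL/da_out = -(y - yhat) * sigma'(a_out), with sigma' = z*(1-z)
    delta = -(y - zs[-1]) * zs[-1] * (1 - zs[-1])
    grads_W, grads_b = [], []
    for k in reversed(range(len(weights))):
        grads_W.insert(0, np.outer(delta, zs[k]))            # dL/dW_il = delta_i * z_l
        grads_b.insert(0, delta)
        if k > 0:                                            # propagate delta backwards
            delta = (weights[k].T @ delta) * zs[k] * (1 - zs[k])
    return grads_W, grads_b
```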
Back-propagation Algorithm
Example: a 2-2-1 network with sigmoid activations.

[Figure: inputs $x_1 = 0$, $x_2 = 1$ (plus constant bias inputs of 1), two hidden units with pre-activations $a_1, a_2$ and activations $\sigma(a_1), \sigma(a_2)$, and one output unit with pre-activation $b_1$ and output $\hat{y} = \sigma(b_1)$. Target: $y = 1$.]

Initial weights and biases:
Layer 1: $W_{11}^{(1)} = 0.6$, $W_{12}^{(1)} = -0.1$, $W_{10}^{(1)} = 0.3$ (bias), $W_{21}^{(1)} = -0.3$, $W_{22}^{(1)} = 0.4$, $W_{20}^{(1)} = 0.5$ (bias)
Layer 2: $W_{11}^{(2)} = 0.4$, $W_{12}^{(2)} = 0.1$, $W_{10}^{(2)} = -0.2$ (bias)

Loss function: $L = \dfrac{1}{2}(y - \hat{y})^2$

Weight update rule: $W^{New} = W^{old} - \eta\,\dfrac{\partial L}{\partial W^{old}}$
Back-propagation Algorithm
Step 1: Forward pass

$a_1 = W_{11}^{(1)}x_1 + W_{12}^{(1)}x_2 + W_{10}^{(1)} = 0.6(0) + (-0.1)(1) + 0.3 = 0.2, \qquad \sigma(a_1) = \dfrac{1}{1 + e^{-0.2}} = 0.5498$

$a_2 = W_{21}^{(1)}x_1 + W_{22}^{(1)}x_2 + W_{20}^{(1)} = -0.3(0) + 0.4(1) + 0.5 = 0.9, \qquad \sigma(a_2) = \dfrac{1}{1 + e^{-0.9}} = 0.7109$

$b_1 = W_{11}^{(2)}\sigma(a_1) + W_{12}^{(2)}\sigma(a_2) + W_{10}^{(2)} = 0.4(0.5498) + 0.1(0.7109) - 0.2 = 0.09101$

$\hat{y} = \sigma(b_1) = 0.5227 \qquad (\text{target } y = 1)$
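A runnable sketch that reproduces this forward pass with the slide's numbers (array layout is an assumption; the printed values match the slide to four decimal places):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = np.array([[0.6, -0.1],    # W11^(1), W12^(1)
               [-0.3, 0.4]])   # W21^(1), W22^(1)
b1 = np.array([0.3, 0.5])      # W10^(1), W20^(1)
W2 = np.array([0.4, 0.1])      # W11^(2), W12^(2)
b2 = -0.2                      # W10^(2)

x = np.array([0.0, 1.0])       # inputs x1 = 0, x2 = 1
a = W1 @ x + b1                # -> [0.2, 0.9]
z = sigmoid(a)                 # -> [0.5498, 0.7109]
b_out = W2 @ z + b2            # -> 0.09101
y_hat = sigmoid(b_out)         # -> 0.5227
print(a, z, b_out, y_hat)
```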
Back-propagation Algorithm
Step 2: Backpropagation of error

Using the forward-pass values ($\sigma(a_1) = 0.5498$, $\sigma(a_2) = 0.7109$, $\hat{y} = \sigma(b_1) = 0.5227$), apply the generalized expressions
$\dfrac{\partial L}{\partial W_{il}} = \delta_i z_l, \qquad \delta_i = \sigma'(a_i)\sum_j \delta_j W_{ji}$
to this network: imagine the 2-2-1 example relabelled with the generic layers $(a_l, z_l) \to (a_i, z_i) \to (a_j, z_j)$ and weights $W_{il}, W_{ji}$.
Back-propagation Algorithm
[Figure: the example network with hidden activations $z_1 = \sigma(a_1)$, $z_2 = \sigma(a_2)$, output pre-activation $b_{out}$, and output $z_{out} = \sigma(b_{out}) = \hat{y}$.]

Cost/Error function: $L = \dfrac{1}{2}(y - \hat{y})^2$, with $z_i = \sigma(a_i)$ and $\sigma'(a_i) = \sigma(a_i)\big(1 - \sigma(a_i)\big)$.

Output delta:
$\delta_{out} = \dfrac{\partial L}{\partial b_{out}} = -\,(y - \hat{y})\,\dfrac{\partial \hat{y}}{\partial b_{out}} = -\,(y - z_{out})\,\sigma'(b_{out}) = -\,(y - z_{out})\,\sigma(b_{out})\big(1 - \sigma(b_{out})\big)$

Hidden deltas:
$\delta_1 = \sigma'(a_1)\,\delta_{out}\,W_{11}^{(2)} = -\,\sigma(a_1)\big(1 - \sigma(a_1)\big)\,(y - z_{out})\,\sigma(b_{out})\big(1 - \sigma(b_{out})\big)\,W_{11}^{(2)}$
$\delta_2 = \sigma'(a_2)\,\delta_{out}\,W_{12}^{(2)} = -\,\sigma(a_2)\big(1 - \sigma(a_2)\big)\,(y - z_{out})\,\sigma(b_{out})\big(1 - \sigma(b_{out})\big)\,W_{12}^{(2)}$
Back-propagation Algorithm - error propagation (Update of Layer 1 weights)
Substituting the numbers ($x_1 = 0$, $x_2 = 1$, $y = 1$, $\sigma(a_1) = 0.5498$, $\sigma(a_2) = 0.7109$, $z_{out} = 0.5227$, $W_{11}^{(2)} = 0.4$, $W_{12}^{(2)} = 0.1$):

$\delta_1 = -\,(0.5498)(1 - 0.5498)(1 - 0.5227)(0.5227)(1 - 0.5227)(0.4) = -0.011789$
$\delta_2 = -\,(0.7109)(1 - 0.7109)(1 - 0.5227)(0.5227)(1 - 0.5227)(0.1) = -0.002447$
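A sketch that reproduces these delta values from the forward-pass quantities (variable names are illustrative):

```python
import numpy as np

z = np.array([0.5498, 0.7109])   # hidden activations sigma(a1), sigma(a2)
y_hat, y = 0.5227, 1.0           # network output and target
W2 = np.array([0.4, 0.1])        # W11^(2), W12^(2)

delta_out = -(y - y_hat) * y_hat * (1 - y_hat)   # approx -0.11908
delta_hidden = z * (1 - z) * delta_out * W2      # approx [-0.011789, -0.002447]
print(delta_out, delta_hidden)
```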
Back-propagation Algorithm (Update of Layer 1 weights)
$W_{ij}^{New} = W_{ij}^{old} - \eta\,\dfrac{\partial L}{\partial W_{ij}^{old}}$, with $\eta = 0.25$, $\delta_1 = -0.011789$, $\delta_2 = -0.002447$, inputs $x_1 = 0$, $x_2 = 1$, and bias input $x_0 = 1$.

Gradients:
$\dfrac{\partial L}{\partial W_{11}^{(1)}} = \delta_1 x_1 = -0.011789 \times 0 = 0$
$\dfrac{\partial L}{\partial W_{12}^{(1)}} = \delta_1 x_2 = -0.011789 \times 1 = -0.011789$
$\dfrac{\partial L}{\partial W_{10}^{(1)}} = \delta_1 x_0 = -0.011789 \times 1 = -0.011789$
$\dfrac{\partial L}{\partial W_{21}^{(1)}} = \delta_2 x_1 = -0.002447 \times 0 = 0$
$\dfrac{\partial L}{\partial W_{22}^{(1)}} = \delta_2 x_2 = -0.002447 \times 1 = -0.002447$
$\dfrac{\partial L}{\partial W_{20}^{(1)}} = \delta_2 x_0 = -0.002447 \times 1 = -0.002447$

Updated weights:
$W_{11}^{(1)}(t+1) = 0.6 - 0.25 \times 0 = 0.6$
$W_{12}^{(1)}(t+1) = -0.1 + 0.25 \times 0.011789 = -0.09705$
$W_{10}^{(1)}(t+1) = 0.3 + 0.25 \times 0.011789 = 0.30295$
$W_{21}^{(1)}(t+1) = -0.3 - 0.25 \times 0 = -0.3$
$W_{22}^{(1)}(t+1) = 0.4 + 0.25 \times 0.002447 = 0.400612$
$W_{20}^{(1)}(t+1) = 0.5 + 0.25 \times 0.002447 = 0.500612$
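The same layer-1 update expressed as a short sketch in matrix form (values as on the slide; the array layout is an assumption):

```python
import numpy as np

eta = 0.25
x = np.array([0.0, 1.0])                        # x1, x2 (bias input x0 = 1)
delta_hidden = np.array([-0.011789, -0.002447])
W1 = np.array([[0.6, -0.1], [-0.3, 0.4]])       # [[W11, W12], [W21, W22]] of layer 1
b1 = np.array([0.3, 0.5])                       # W10^(1), W20^(1)

# dL/dW_ij^(1) = delta_i * x_j and dL/dW_i0^(1) = delta_i * 1
W1_new = W1 - eta * np.outer(delta_hidden, x)   # -> [[0.6, -0.09705], [-0.3, 0.400612]]
b1_new = b1 - eta * delta_hidden                # -> [0.302947, 0.500612]
print(W1_new, b1_new)
```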
Back-propagation Algorithm (Update of layer 2 Weights)
$W_{ij}^{New} = W_{ij}^{old} - \eta\,\dfrac{\partial L}{\partial W_{ij}^{old}}$, with $b_{out} = z_1 W_{11}^{(2)} + z_2 W_{12}^{(2)} + W_{10}^{(2)}$ and $z_{out} = \sigma(b_{out}) = \hat{y}$.

Gradients:
$\dfrac{\partial L}{\partial W_{11}^{(2)}} = -\,(y - \hat{y})\,\dfrac{\partial \hat{y}}{\partial W_{11}^{(2)}} = -\,(y - \hat{y})\,\sigma'(b_{out})\,z_1 = \delta_{out}\,z_1$
$\dfrac{\partial L}{\partial W_{12}^{(2)}} = -\,(y - \hat{y})\,\sigma'(b_{out})\,z_2 = \delta_{out}\,z_2$
$\dfrac{\partial L}{\partial W_{10}^{(2)}} = -\,(y - \hat{y})\,\sigma'(b_{out}) = \delta_{out}$

where $\sigma'(b_{out}) = \sigma(b_{out})\big(1 - \sigma(b_{out})\big)$, so
$\delta_{out} = -\,(y - \hat{y})\,\sigma(b_{out})\big(1 - \sigma(b_{out})\big) = -\,(1 - 0.5227)(0.5227)(1 - 0.5227) = -0.11908$

Updated weights ($\eta = 0.25$, $z_1 = 0.5498$, $z_2 = 0.7109$):
$W_{11}^{(2)}(t+1) = W_{11}^{(2)}(t) - \eta\,\dfrac{\partial L}{\partial W_{11}^{(2)}} = 0.4 + 0.25 \times 0.11908 \times 0.5498 = 0.4164$
$W_{12}^{(2)}(t+1) = 0.1 + 0.25 \times 0.11908 \times 0.7109 = 0.1212$
$W_{10}^{(2)}(t+1) = -0.2 + 0.25 \times 0.11908 = -0.17023$
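And the matching sketch for the layer-2 update (values as on the slide):

```python
import numpy as np

eta = 0.25
z = np.array([0.5498, 0.7109])   # hidden activations z1, z2
y_hat, y = 0.5227, 1.0
W2 = np.array([0.4, 0.1])        # W11^(2), W12^(2)
b2 = -0.2                        # W10^(2)

delta_out = -(y - y_hat) * y_hat * (1 - y_hat)   # approx -0.11908
W2_new = W2 - eta * delta_out * z                # -> [0.4164, 0.1212]
b2_new = b2 - eta * delta_out                    # -> -0.17023
print(W2_new, b2_new)
```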