1. BN normalizes the values of z (z = wx + b) in the network.
2. BN is applied in mini-batch mode.
3. Let's assume we are applying BN to layer 2 of the network shown below.
4. Assume batch_size is 10, so every batch contains 10 data points.
Batch Normalization - Algorithm
1. Training - Batch 1:
1. z_vector –
1. For sample 1 in batch 1, the z vector of layer 2 is [z_2_1, z_2_2, ..., z_2_5].
2. The same z vector is computed for every sample in batch 1, from sample 1 through sample 10.
2. z_normalized_vector – znorm
1. The z values of each node are standardized across all samples in the batch to make z_normalized_vector.
2. Even though we say normalization, we are actually standardizing the z values. Normalization rescales data into the range 0 – 1. Standardization converts data into a distribution with mean 0 and S.D. of 1.
3. z_tilda – z~ = ((gamma * z_normalized_vector) + beta)
1. gamma is the scale and beta is the shift.
2. The concept behind gamma and beta – Converting z_vector into z_normalized_vector forces the z values of every node to mean 0 and S.D. of 1. That may not always be the best distribution for the layer. To account for other scenarios, we scale the data by γ, which changes its spread, and then shift it by β, which moves it along the axis. Both values are learned during training.
Batch Normalization – Normalizing z
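The three steps above (z_vector, znorm, z~) can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `batch_norm_forward`, the random input data, and the small constant `eps` (added to avoid division by zero) are assumptions that do not appear in the slides.

```python
import numpy as np

def batch_norm_forward(z, gamma, beta, eps=1e-5):
    """Standardize z per node across the batch, then scale and shift.

    z has shape (batch_size, nodes). eps is an assumed stabilizer.
    """
    mean = z.mean(axis=0)                     # one mean per node
    var = z.var(axis=0)                       # one variance per node
    z_norm = (z - mean) / np.sqrt(var + eps)  # z_normalized_vector
    z_tilde = gamma * z_norm + beta           # z~ = gamma * znorm + beta
    return z_tilde, z_norm, mean, var

# Hypothetical batch: 10 samples, 5 nodes in layer 2.
rng = np.random.default_rng(0)
z = rng.normal(loc=3.0, scale=2.0, size=(10, 5))
gamma = np.ones(5)    # scale, initialized to 1
beta = np.zeros(5)    # shift, initialized to 0

z_tilde, z_norm, mean, var = batch_norm_forward(z, gamma, beta)
```

With gamma at 1 and beta at 0, z~ equals znorm, so each node's values in the batch have mean close to 0 and S.D. close to 1.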
Batch Normalization – Shift & Scale
1. Training - Batch 1: (continued..)
3. z_tilda – (continued..)
2. Update gamma and beta - gamma and beta are initialized to 1 and 0 for all nodes in layer 2 of the network. The values stay the same throughout batch 1. They are updated by the optimizer (for example, gradient descent) before batch 2 starts, in the same way the weights are updated.
1. The FP for samples 1 through 10 is carried out with z_tilda in layer 2 computed from the initialized gamma and beta.
2. During BP, we compute the error gradient vector w.r.t. beta for each sample from 1 through 10. We then average these vectors and use the averaged gradient in the gradient descent formula to update the beta vector of layer 2 for batch 2. gamma is updated in the same way.
Batch Normalization – update β & γ
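The update of β and γ described above can be sketched as one plain gradient-descent step. This is an illustration under stated assumptions: the function name `update_gamma_beta`, the learning rate `lr`, and the random gradients are hypothetical, and the per-sample gradients w.r.t. z~ are taken as given rather than derived from a real loss. Averaging over the batch corresponds to a batch loss defined as the mean of the per-sample losses.

```python
import numpy as np

def update_gamma_beta(d_z_tilde, z_norm, gamma, beta, lr=0.1):
    """One gradient-descent step for the BN parameters of a layer.

    d_z_tilde: per-sample gradient of the error w.r.t. z~, shape (batch, nodes).
    z_norm:    standardized z values from the forward pass, same shape.
    """
    # Since z~ = gamma * z_norm + beta, the chain rule gives, per sample:
    #   d(error)/d(beta)  = d_z_tilde
    #   d(error)/d(gamma) = d_z_tilde * z_norm
    d_beta = d_z_tilde.mean(axis=0)               # averaged over the batch
    d_gamma = (d_z_tilde * z_norm).mean(axis=0)   # averaged over the batch
    return gamma - lr * d_gamma, beta - lr * d_beta

# Hypothetical numbers: 10 samples, 5 nodes.
rng = np.random.default_rng(0)
z_norm = rng.standard_normal((10, 5))
d_z_tilde = rng.standard_normal((10, 5))
gamma, beta = np.ones(5), np.zeros(5)

new_gamma, new_beta = update_gamma_beta(d_z_tilde, z_norm, gamma, beta)
```

The returned `new_gamma` and `new_beta` are the values that would be used for batch 2.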
2. The same process is repeated for every batch after batch 1 until we reach convergence.
3. Test – Test/validation time differs from training time because we may be dealing with one sample at a time, so there is no batch to compute statistics from. How do we normalize z in that case? To normalize z, we need a mean and S.D.
1. We can reuse the mean and S.D. used for normalizing z in layer 2 during the last iteration of training.
2. Alternatively, we can take a weighted average (or plain average) of the mean and S.D. values used for normalizing z in layer 2 across all training iterations.
Batch Normalization – Algorithm
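The second test-time option, a weighted average of the training statistics, can be sketched with an exponential moving average. The class `RunningStats`, the `momentum` value of 0.9, the constant `eps`, and the random batches are assumptions made for illustration. Keras' `BatchNormalization` layer keeps moving averages in a similar spirit, but this sketch is not its implementation.

```python
import numpy as np

class RunningStats:
    """Exponential moving average of per-node batch mean and variance."""

    def __init__(self, num_nodes, momentum=0.9):
        self.momentum = momentum
        self.mean = np.zeros(num_nodes)
        self.var = np.ones(num_nodes)

    def update(self, z_batch):
        # Blend the stored statistics with those of the current batch.
        m = self.momentum
        self.mean = m * self.mean + (1 - m) * z_batch.mean(axis=0)
        self.var = m * self.var + (1 - m) * z_batch.var(axis=0)

def batch_norm_inference(z, stats, gamma, beta, eps=1e-5):
    """Normalize with stored statistics instead of batch statistics."""
    z_norm = (z - stats.mean) / np.sqrt(stats.var + eps)
    return gamma * z_norm + beta

rng = np.random.default_rng(0)
stats = RunningStats(num_nodes=5)
for _ in range(200):  # 200 hypothetical training batches of 10 samples
    stats.update(rng.normal(loc=3.0, scale=2.0, size=(10, 5)))

# At test time a single sample can be normalized without any batch.
single_sample = rng.normal(loc=3.0, scale=2.0, size=(1, 5))
out = batch_norm_inference(single_sample, stats, np.ones(5), np.zeros(5))
```

Because the stored mean and variance do not depend on the test input, the output for a sample is the same whether it arrives alone or alongside others.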
Batch Normalization – β & γ
What would happen if we don't use (β) & (γ) to calculate z~?
Let's assume we don't use (β) & (γ) and we are dealing with the sigmoid activation function. In that case, as the picture shows, the activation function does little useful work. Standardized values lie close to 0, where the sigmoid is almost a straight line, so every data point passes through the activation almost linearly and the non-linearity is lost.
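A quick numerical check of this point: near 0 the sigmoid is well approximated by the straight line 0.5 + z/4 (its first-order Taylor expansion at 0). The range [-1, 1], which holds roughly two thirds of the values if znorm is close to normally distributed, and the scale value 4.0 are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-1.0, 1.0, 21)   # assumed typical range of standardized values

# Without gamma and beta: compare the sigmoid with its tangent line at 0.
linear = 0.5 + z / 4.0
max_gap = np.max(np.abs(sigmoid(z) - linear))

# With a hypothetical learned scale gamma = 4, the same inputs reach the
# curved part of the sigmoid and the tangent line no longer fits.
gamma = 4.0
linear_scaled = 0.5 + gamma * z / 4.0
max_gap_scaled = np.max(np.abs(sigmoid(gamma * z) - linear_scaled))
```

On this range `max_gap` stays below 0.02, while `max_gap_scaled` exceeds 0.5, which is one way to see why a learnable scale and shift let the layer use the non-linearity again.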
1. Large fluctuations in z values make training take longer. BN speeds up training by keeping z values under control. When wide fluctuations in z are limited, the fluctuations in errors and gradients are limited too, so the weight updates stay moderate (neither too large nor too small).
2. BN adds computation to every iteration of the network, so each iteration takes longer, which would suggest a longer training time. However, total training time usually goes down, because with BN the network converges in fewer iterations.
3. BN can be applied to the input layer, thereby normalizing the input data.
4. BN can be applied either after z or after a. The general practice is to apply it after z.
5. No need for a bias when BN is used for a layer – The bias in the computation of z (z = wx + b) shifts the distribution of the data. BN standardizes z, converting its distribution to mean 0 and SD 1, so any constant shift added by the bias is subtracted out again together with the batch mean. Adding the bias therefore has no effect, and β takes over the role of the shift.
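from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization, Activation

model = Sequential()
model.add(Dense(32, use_bias=False))  # no bias: BN's beta provides the shift
model.add(BatchNormalization())       # BN applied after z, before activation
model.add(Activation('relu'))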
6. BN helps in regularization.
1. During BN, we compute the mean and SD of the z values at a specific layer over all the samples in the batch. We use this mean and SD to normalize the z values and compute znorm.
2. The mean and SD come only from the samples in one batch. If batch 1 has 10 samples, the mean and SD describe the z values of those 10 samples, not the entire dataset.
3. For batch 2, we use the mean and SD of the next 10 samples, which differ from the mean and SD of batch 1. This introduces some noise into training and hence helps generalization / regularization.
7. BN reduces the likelihood of vanishing and exploding gradients, because it normalizes the value of z and thereby limits the effect of very large or very small weights. z = wx + b for the first layer and z = wa + b for subsequent layers.
8. BN does not appear to help the network w.r.t. covariate shift. One research paper listed this as a benefit, but later work has argued against that explanation.
Batch Normalization – Points to Note
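Point 5 above (the bias is redundant before BN) can be checked numerically: standardizing wx + b gives the same result as standardizing wx, because the batch mean absorbs the constant b. The helper `standardize`, the constant `eps`, and the random weights and inputs are assumptions made for this sketch.

```python
import numpy as np

def standardize(z, eps=1e-5):
    """Per-node standardization across the batch (the znorm step of BN)."""
    return (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 4))   # 10 samples, 4 input features
w = rng.standard_normal((4, 5))    # weights into a 5-node layer
b = rng.standard_normal(5)         # per-node bias

without_bias = standardize(x @ w)
with_bias = standardize(x @ w + b)

# The bias shifts every sample of a node by the same amount, so it changes
# the batch mean by exactly b and leaves the variance untouched.
same = np.allclose(without_bias, with_bias)
```

Here `same` is True, which is why `use_bias=False` is a reasonable setting for a layer that feeds directly into BN.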

Batch Normalization


Editor's notes

  1. Why is BN not applied in batch or stochastic mode?