Neural networks. Overview
Oleksandr Baiev, PhD
Senior Engineer
Samsung R&D Institute Ukraine
Neural networks. Overview
• Common principles
– Structure
– Learning
• Shallow and Deep NN
• Additional methods
– Conventional
– Voodoo
Canonical/Typical tasks
Solutions in general
๐‘ฅ๐‘— = ๐‘ฅ1, ๐‘ฅ2, ๐‘ฅ3, ๐‘ฅ4, โ€ฆ , ๐‘ฅ๐‘–, โ€ฆ ๐‘— โˆˆ ๐‘‹
๐‘ฆ๐‘— = ๐‘ฆ1, ๐‘ฆ2, โ€ฆ , ๐‘ฆ ๐‘˜, โ€ฆ ๐‘— โˆˆ ๐‘Œ
๐น: ๐‘‹ โ†’ ๐‘Œ
Classification
๐‘ฆ1 = 1,0,0
๐‘ฆ2 = 0,0,1
๐‘ฆ3 = 0,1,0
๐‘ฆ4 = 0,1,0
Index of sample in dataset
sample of class โ€œ0โ€
sample of class โ€œ2โ€
sample of class โ€œ2โ€
sample of class โ€œ1โ€
Regression
๐‘ฆ1 = 0.3
๐‘ฆ2 = 0.2
๐‘ฆ3 = 1.0
๐‘ฆ4 = 0.65
What are artificial Neural Networks?
Is it biology?
Simulation of biological neural networks (synapses, axons, chains, layers, etc.) is a good abstraction for understanding the topology.
Bio NN is only inspiration and illustration. Nothing more!
What are artificial Neural Networks?
Let's imagine a black box!
F: inputs, params → outputs
General form:
$outputs = F(inputs, params)$
Steps:
1) choose the "form" of F
2) find the params
What are artificial Neural Networks?
It's simple math!
Output of the i-th neuron:
$s_i = \sum_{j=1}^{n} w_{ij} x_j + b_i$
$y_i = f(s_i)$
where $w_{ij}$ and $b_i$ are the free parameters and $f$ is the activation function.
What are artificial Neural Networks?
It's simple math!
activation: $y = f(wx + b) = \mathrm{sigmoid}(wx + b)$
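To make the formula concrete, here is a minimal NumPy sketch of one sigmoid neuron (not part of the original slides; the input, weight, and bias values are arbitrary illustrations):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def neuron(x, w, b):
    # s = sum_j w_j * x_j + b, then y = f(s)
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])    # n = 3 inputs
w = np.array([0.1, 0.4, -0.2])    # weights (free parameters)
b = 0.3                           # bias (free parameter)
print(neuron(x, w, b))            # a single output in (0, 1)
```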
What are artificial Neural Networks?
It's simple math!
n inputs, m neurons in the hidden layer.
Output of the i-th neuron:
$s_i = \sum_{j=1}^{n} w_{ij} x_j + b_i$, $y_i = f(s_i)$
Output of the k-th layer:
1) $S^{(k)} = W^{(k)} X^{(k)} + B^{(k)} = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ w_{m1} & w_{m2} & \cdots & w_{mn} \end{pmatrix}^{(k)} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{pmatrix}^{(k)} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_m \end{pmatrix}^{(k)}$
2) $Y^{(k)} = f^{(k)}(S^{(k)})$, applied element-wise.
Form of F: a Kolmogorov–Arnold function superposition.
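As an illustration (not from the slides), the whole layer computation above is one matrix–vector product plus an element-wise activation; the shapes and random values below are made up:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def layer_forward(X, W, B, f=sigmoid):
    # S = W X + B, then Y = f(S) applied element-wise
    return f(W @ X + B)

rng = np.random.default_rng(0)
n, m = 4, 3                       # n inputs, m neurons in the hidden layer
X = rng.normal(size=n)            # input vector of the layer
W = rng.normal(size=(m, n))       # weight matrix
B = rng.normal(size=m)            # bias vector
print(layer_forward(X, W, B))     # output of the layer, shape (m,)
```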
Neural networks. Overview
• Common principles
– Structure
– Learning
• Shallow and Deep NN
• Additional methods
– Conventional
– Voodoo
How to find the parameters W and B?
Supervised learning:
Training set (pairs of variables and responses): $(X; Y)_i,\ i = 1..N$
Find: $W^*, B^* = \underset{W,B}{\arg\min}\ L(F(X), Y)$
Cost function (loss, error):
logloss: $L(F(X), Y) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{i,j} \log f_{i,j}$,
where $y_{i,j}$ is "1" if the i-th sample is of class j, else "0", and the predictions are previously scaled: $f_{i,j} = f_{i,j} / \sum_j f_{i,j}$
rmse: $L(F(X), Y) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (F(X_i) - Y_i)^2}$
These are just examples. The cost function depends on the problem (classification, regression) and on domain knowledge.
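A hedged NumPy sketch of the two example costs above (the `eps` guard against log(0), the variable names, and the toy data are my additions, not from the slides):

```python
import numpy as np

def logloss(F, Y, eps=1e-12):
    # F: predicted scores (N, M); Y: one-hot targets (N, M)
    P = F / F.sum(axis=1, keepdims=True)                 # "previously scaled" predictions
    return -np.mean(np.sum(Y * np.log(P + eps), axis=1))

def rmse(F, Y):
    return np.sqrt(np.mean((F - Y) ** 2))

Y = np.array([[1, 0, 0], [0, 0, 1]])                     # two one-hot labelled samples
F = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])         # predicted class scores
print(logloss(F, Y), rmse(np.array([0.3, 0.9]), np.array([0.2, 1.0])))
```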
Training or optimization algorithm
So, we have the model cost $L$ (i.e., the error of prediction),
and we want to update the weights in order to minimize $L$:
$w^* = w + \alpha \Delta w$
In accordance with gradient descent: $\Delta w = -\nabla L$
This is clear for a network with only one layer (we have predicted outputs and targets, so we can evaluate $L$).
But how do we find $\Delta w$ for the hidden layers?
Meet "Error Back Propagation"
Find $\Delta w$ for each layer, from the last to the first,
as the influence of the weights on the cost:
$\Delta w_{i,j} = \frac{\partial L}{\partial w_{i,j}}$
and:
$\frac{\partial L}{\partial w_{i,j}} = \frac{\partial L}{\partial f_j} \frac{\partial f_j}{\partial s_j} \frac{\partial s_j}{\partial w_{i,j}}$
Error Back Propagation
Details
$\frac{\partial L}{\partial w_{i,j}} = \frac{\partial L}{\partial f_j} \frac{\partial f_j}{\partial s_j} \frac{\partial s_j}{\partial w_{i,j}}$
$\delta_j = \frac{\partial L}{\partial f_j} \frac{\partial f_j}{\partial s_j}$
$\delta_j = \begin{cases} L'(F(X), Y)\, f'(s_j), & \text{output layer} \\ \left( \sum_{l \in \text{next layer}} \delta_l w_{j,l} \right) f'(s_j), & \text{hidden layers} \end{cases}$
$\Delta w = -\alpha\, \delta\, x$
Gradient Descent
in real life
Recall gradient descent:
$w^* = w + \alpha \Delta w$
$\alpha$ is a "step" coefficient; in ML terms, the learning rate. Typical: $\alpha = 0.01..0.1$
Recall the cost function:
$L = \frac{1}{N} \sum_{i=1}^{N} \dots$ — a sum over all samples.
And what if $N = 10^6$ or more?
GD modification: update $w$ for each sample.
Gradient Descent
Stochastic & Minibatch
"Batch" GD (L over the full set): needs a lot of memory.
Stochastic GD (L for each sample): fast, but fluctuates.
Minibatch GD (L over subsets): less memory & fewer fluctuations.
The size of the minibatch depends on the HW. Typical: minibatch = 32…256
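A small sketch of how minibatches are typically drawn each epoch (the helper name `minibatches` and the toy data are illustrative, not from the slides):

```python
import numpy as np

def minibatches(X, Y, batch_size=32, rng=np.random.default_rng(0)):
    # shuffle once per epoch, then yield consecutive slices of the index array
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], Y[batch]

X = np.arange(10, dtype=float).reshape(10, 1)    # 10 toy samples
Y = 2 * X
for Xb, Yb in minibatches(X, Y, batch_size=4):   # chunks of at most 4 samples
    print(Xb.ravel())
# inside a real training loop, the gradient of L would be computed on (Xb, Yb) only
```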
Termination criteria
By epoch count:
a maximum number of iterations over the whole data set. Typical: epochs = 50…200
By value of the gradient:
a gradient equal to 0 means a minimum, but a small gradient => very slow learning.
When the cost didn't change during several epochs:
if the error does not change, the training procedure is not converging.
Early stopping:
stop when the "validation" score starts to increase,
even while the "train" score continues decreasing.
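An illustrative early-stopping loop combining the epoch-count and validation-score criteria; `step`, `validate`, and `patience` are hypothetical names introduced for this sketch, not from the slides:

```python
def train_with_early_stopping(step, validate, max_epochs=200, patience=10):
    # step() runs one epoch of training; validate() returns the validation loss
    best, best_epoch = float("inf"), 0
    for epoch in range(max_epochs):              # "by epochs count" criterion
        step()
        val = validate()
        if val < best:
            best, best_epoch = val, epoch        # validation score still improving
        elif epoch - best_epoch >= patience:     # it started to get worse and stayed worse
            break                                # early stopping
    return best

# toy run: validation loss improves for 20 epochs, then starts to increase
losses = iter([1.0 / (e + 1) if e < 20 else 0.05 + 0.01 * (e - 20) for e in range(200)])
print(train_with_early_stopping(step=lambda: None, validate=lambda: next(losses)))
```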
Neural networks. Overview
• Common principles
– Structure
– Learning
• Shallow and Deep NN
• Additional methods
– Conventional
– Voodoo
What about the "form" of F?
Network topology
"Shallow" networks: 1–2 hidden layers => not enough parameters => poor separation abilities.
"Deep" networks: NNs with 2..10 layers.
"Very deep" networks: NNs with >10 layers.
Deep learning. Problems
• Big networks => too much separating ability => overfitting
• Vanishing gradient problem during training
• Complex error surface => local minima
• Curse of dimensionality => memory & computations:
$\dim W^{(i)} = m^{(i-1)} \cdot m^{(i)}$
(a layer with $m^{(i-1)}$ inputs and $m^{(i)}$ neurons)
Neural networks. Overview
• Common principles
– Structure
– Learning
• Shallow and Deep NN
• Additional methods
– Conventional
– Voodoo
Additional methods
Conventional
• Momentum (damps the oscillations on the error surface):
$\Delta w^{(t)} = -\alpha \nabla L(w^{(t)}) + \beta \Delta w^{(t-1)}$, where $\beta \Delta w^{(t-1)}$ is the momentum term. Typical: $\beta = 0.9$
• LR decay (make smaller steps near the optimum):
$\alpha^{(t)} = k \alpha^{(t-1)}$, $0 < k < 1$. Typical: apply LR decay ($k = 0.1$) every 10..100 epochs
• Weight decay (prevents the weights from growing, and smooths F):
$L^* = L + \lambda \lVert w^{(t)} \rVert$; L1 or L2 regularization is often used. Typical: L2 with $\lambda = 0.0005$
Neural networks. Overview
• Common principles
– Structure
– Learning
• Shallow and Deep NN
• Additional methods
– Conventional
– Voodoo
Additional methods
Contemporary
Dropout/DropConnect
– ensembles of networks
– $2^N$ networks in one: for each example,
hide neuron outputs randomly ($P = 0.5$)
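A minimal dropout sketch; the 1/(1−p) rescaling ("inverted dropout") is a common convention added for completeness and is not mentioned on the slide:

```python
import numpy as np

def dropout(y, p=0.5, train=True, rng=np.random.default_rng(0)):
    if not train:
        return y                          # at test time all neurons are used
    mask = rng.random(y.shape) > p        # hide each neuron's output with probability p
    return y * mask / (1.0 - p)           # rescale the kept activations ("inverted dropout")

h = np.ones(8)                            # pretend hidden-layer activations
print(dropout(h))                         # about half of the outputs are zeroed
```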
Additional methods
Contemporary
Data augmentation — more data covering all available cases:
– affine transformations, flips, crops, contrast, noise, scale
– pseudo-labeling
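A toy augmentation function applying a few of the listed transforms to a 32×32 array; the crop size and jitter ranges are arbitrary choices made for this sketch:

```python
import numpy as np

def augment(img, rng=np.random.default_rng(0)):
    # a few of the cheap transforms listed above, applied to a 32x32 image
    if rng.random() < 0.5:
        img = img[:, ::-1]                               # horizontal flip
    img = img * rng.uniform(0.8, 1.2)                    # contrast jitter
    img = img + rng.normal(scale=0.01, size=img.shape)   # additive noise
    top, left = rng.integers(0, 4, size=2)
    return img[top:top + 28, left:left + 28]             # random 28x28 crop

img = np.zeros((32, 32))
print(augment(img).shape)                                # (28, 28)
```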
Additional methods
Contemporary
New activation functions:
– Linear: $y_i = f(s_i) = a s_i$
– ReLU: $y_i = \max(s_i, 0)$
– Leaky ReLU: $y_i = \begin{cases} s_i, & s_i > 0 \\ a s_i, & \text{otherwise} \end{cases}$. Typical: $a = 0.01$
– Maxout: $y_i = \max(s_{1,i}, s_{2,i}, \dots, s_{k,i})$. Typical: $k = 2..3$
Additional methods
Contemporary
Pre-training
– train layer-by-layer,
– re-train an "other" network
Sources
• Geoffrey Hinton's course "Neural Networks for Machine Learning"
[http://www.coursera.org/course/neuralnets]
• Ian Goodfellow, Yoshua Bengio and Aaron Courville, "Deep Learning"
[http://www.deeplearningbook.org/]
• http://neuralnetworksanddeeplearning.com
• CS231n: Convolutional Neural Networks for Visual Recognition [http://cs231n.stanford.edu/]
• CS224d: Deep Learning for Natural Language Processing [http://cs224d.stanford.edu/]
• Schmidhuber, "Deep Learning in Neural Networks: An Overview"
• kaggle.com competitions and forums