CSE 599 Lecture 6: Neural Networks and Models
• Last Lecture:
  • Neurons, membranes, channels
  • Structure of neurons: dendrites, cell body and axon
    • Input: synapses on dendrites and cell body (soma)
    • Output: axon, myelin for fast signal propagation
  • Function of neurons:
    • Internal voltage controlled by Na/K/Cl… channels
    • Channels gated by chemicals (neurotransmitters) and membrane voltage
    • Inputs at synapses release neurotransmitter, causing increases or decreases in membrane voltage (via currents in channels)
    • Large positive change in voltage → membrane voltage exceeds threshold → action potential or spike generated
Basic Input-Output Transformation
[Figure: input spikes arriving at synapses each produce an excitatory post-synaptic potential (EPSP); their summed effect produces an output spike]
McCulloch–Pitts “neuron” (1943)
• Attributes of neuron
  • m binary inputs and 1 output (0 or 1)
  • Synaptic weights w_ij
  • Threshold θ_i
McCulloch–Pitts Neural Networks
• Synchronous discrete time operation
  • Time quantized in units of synaptic delay
• Output is 1 if and only if the weighted sum of inputs is greater than the threshold
  • Θ(x) = 1 if x ≥ 0 and 0 if x < 0
• Behavior of the network can be simulated by a finite automaton (FA)
• Any FA can be simulated by a McCulloch–Pitts network
n_i(t+1) = Θ( Σ_j w_ij n_j(t) − θ_i )

where n_i = output of unit i, Θ(·) = step function, w_ij = weight from unit j to unit i, and θ_i = threshold of unit i.
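To make the update rule concrete, here is a minimal Python sketch of a McCulloch–Pitts network (not part of the original slides); the two-unit weight matrix and thresholds are made-up illustrative values.

```python
import numpy as np

def mp_step(n, W, theta):
    """One synchronous McCulloch-Pitts update: n_i(t+1) = step(sum_j W[i, j] * n_j(t) - theta_i)."""
    return (W @ n - theta >= 0).astype(int)

# Hypothetical 2-unit network: unit 0 copies unit 1's previous output,
# unit 1 computes the AND of both units' previous outputs.
W = np.array([[0.0, 1.0],
              [1.0, 1.0]])
theta = np.array([0.5, 1.5])

n = np.array([1, 0])              # initial binary state
for t in range(3):                # a few synchronous time steps
    n = mp_step(n, W, theta)
    print(t, n)
```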
Properties of Artificial Neural Networks
• High-level abstraction of the neural input-output transformation
  • Inputs → weighted sum of inputs → nonlinear function → output
  • Typically no spikes
  • Typically use biologically implausible constraints or learning rules
• Often used where data or functions are uncertain
  • Goal is to learn from a set of training data
  • And to generalize from learned instances to new, unseen data
• Key attributes
  • Parallel computation
  • Distributed representation and storage of data
  • Learning (networks adapt themselves to solve a problem)
  • Fault tolerance (insensitive to component failures)
Topologies of Neural Networks
[Figure: three network topologies: completely connected; feedforward (directed, acyclic); recurrent (feedback connections)]
Network Types
• Feedforward versus recurrent networks
  • Feedforward: no loops; input → hidden layers → output
  • Recurrent: use feedback (positive or negative)
• Continuous versus spiking
  • Continuous networks model the mean spike rate (firing rate)
    • Assume spikes are integrated over time
    • Consistent with the rate-code model of neural coding
• Supervised versus unsupervised learning
  • Supervised networks use a “teacher”
    • The desired output for each input is provided by the user
  • Unsupervised networks find hidden statistical patterns in the input data
    • Clustering, principal component analysis
History
• 1943: McCulloch–Pitts “neuron”
  • Started the field
• 1962: Rosenblatt’s perceptron
  • Learned its own weight values; convergence proof
• 1969: Minsky & Papert book on perceptrons
  • Proved limitations of single-layer perceptron networks
• 1982: Hopfield and convergence in symmetric networks
  • Introduced the energy-function concept
• 1986: Backpropagation of errors
  • Method for training multilayer networks
• Present: probabilistic interpretations, Bayesian and spiking networks
Perceptrons
• Attributes
  • Layered feedforward networks
  • Supervised learning
    • Hebbian: adjust weights to enforce correlations
  • Parameters: weights w_ij
  • Binary output = Θ(weighted sum of inputs)
  • Take w_0 to be the threshold, with a fixed input of –1.
Output_i = Θ( Σ_j w_ij ξ_j )

[Figure: single-layer versus multilayer perceptron architectures]
Training Perceptrons to Compute a Function
• Given inputs ξ_j to neuron i and desired output Y_i, find its weight values by iterative improvement:
  1. Feed an input pattern
  2. Is the binary output correct?
     Yes: go to the next pattern
     No: modify the connection weights using the error signal (Y_i – O_i);
         increase a weight if the neuron didn’t fire when it should have, and vice versa
• Learning rule is Hebbian (based on input/output correlation)
  • It converges in a finite number of steps if a solution exists
  • Used in ADALINE (adaptive linear neuron) networks
Δw_ij = η ( Y_i − O_i ) ξ_j ,   where  O_i = Θ( Σ_j w_ij ξ_j )

η = learning rate, ξ_j = input, Y_i = desired output, O_i = actual output.
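This rule is short to write out in code. Below is a minimal Python sketch of the perceptron update, trained on the AND function (linearly separable, as the next slide notes); the learning rate and epoch count are illustrative choices, not from the slides.

```python
import numpy as np

def train_perceptron(X, Y, eta=0.1, epochs=20):
    """Perceptron rule: w += eta * (Y - O) * xi, with w[0] acting as the threshold (fixed input -1)."""
    X = np.hstack([-np.ones((len(X), 1)), X])   # prepend the fixed -1 input for the threshold weight w_0
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, y in zip(X, Y):
            o = 1 if w @ xi >= 0 else 0          # binary output O = step(w . xi)
            w += eta * (y - o) * xi              # update only when the output is wrong
    return w

# Example: the linearly separable AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([0, 0, 0, 1])
print(train_perceptron(X, Y))
```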
Computational Power of Perceptrons
• Consider a single-layer perceptron
  • Assume threshold units
  • Assume binary inputs and outputs
  • The weighted sum forms a linear hyperplane:  Σ_j w_ij ξ_j − θ_i = 0
• Consider a single-output network with two inputs
  • Only functions that are linearly separable can be computed
  • Example: AND is linearly separable
[Figure: the AND function in the input plane; the line separates the region where the output o = 1 from the region where o = 0]
Linear inseparability
• A single-layer perceptron with threshold units fails if the problem is not linearly separable
  • Example: XOR
  • Can use other tricks (e.g. complicated threshold functions) but the complexity blows up
• Minsky and Papert’s book showing these negative results was very influential
Solution in 1980s: Multilayer perceptrons
• Removes many limitations of single-layer networks
  • Can solve XOR
• Exercise: draw a two-layer perceptron that computes the XOR function
  • 2 binary inputs ξ_1 and ξ_2
  • 1 binary output
  • One “hidden” layer
  • Find the appropriate weights and threshold
Solution in 1980s: Multilayer perceptrons
• Examples of two-layer perceptrons that compute XOR
  • E.g. the right-hand network: its output is 1 if and only if  x + y − 2·Θ(x + y − 1.5) − 0.5 > 0
[Figure: two two-layer networks with inputs x and y that compute XOR]
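A quick numerical check of that expression (a throwaway Python sketch, not part of the slides) confirms that it reproduces XOR on all four binary input pairs:

```python
def xor_net(x, y):
    """Output of the two-layer XOR perceptron: 1 iff x + y - 2*step(x + y - 1.5) - 0.5 > 0."""
    hidden = 1 if x + y - 1.5 > 0 else 0       # hidden threshold unit
    return 1 if x + y - 2 * hidden - 0.5 > 0 else 0

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_net(x, y))   # prints 0, 1, 1, 0, i.e. XOR
```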
Multilayer Perceptron
[Figure: multilayer perceptron with input nodes, one or more layers of hidden units (hidden layers), and output neurons]
The most common output function is the sigmoid (a non-linear squashing function):

g(a) = 1 / (1 + e^(−a))

[Figure: plot of g(a), rising smoothly from 0 toward 1]
Example: Perceptrons as Constraint Satisfaction Networks
[Figure: a two-layer perceptron with inputs x and y, two hidden threshold units, and one output unit; what output does it compute as a function of x and y?]
Example: Perceptrons as Constraint Satisfaction Networks
[Figure: the first hidden unit compares x + y against its threshold, defining a decision line in the (x, y) plane; its output is 1 on one side of the line and 0 on the other]
Example: Perceptrons as Constraint Satisfaction Networks
[Figure: the second hidden unit defines a second decision line with a different threshold, again splitting the (x, y) plane into an output-1 and an output-0 region]
Example: Perceptrons as Constraint Satisfaction Networks
[Figure: the output unit combines the two hidden outputs; its weighted sum exceeds its threshold only in the strip between the two decision lines]
Perceptrons as Constraint Satisfaction Networks
[Figure: the complete network and both decision lines; the network outputs 1 exactly for inputs lying between the two lines, which is the XOR region]
Learning networks
• We want networks that configure themselves
  • Learn from the input data or from training examples
  • Generalize from the learned data
[Figure: a multilayer network; can this network configure itself to solve a problem? How do we train it?]
Gradient-descent learning
• Use a differentiable activation function
  • Try a continuous function f(·) instead of the step function Θ(·)
  • First guess: use a linear unit
• Define an error function (cost function or “energy” function)
  • Change the weights in the direction of smaller error
  • Minimizes the mean-squared error over the input patterns μ
  • Called the Delta rule = ADALINE rule = Widrow–Hoff rule = LMS rule
E = (1/2) Σ_{i,μ} [ Y_i^μ − Σ_j w_ij ξ_j^μ ]²

Then  Δw_ij = −η ∂E/∂w_ij = η Σ_μ [ Y_i^μ − Σ_j w_ij ξ_j^μ ] ξ_j^μ

The cost function measures the network’s performance as a differentiable function of the weights.
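As a sketch of how the delta (LMS) rule looks in code, here is a minimal Python version for a single linear output unit; the toy linear-regression data, learning rate, and epoch count are my own illustrative assumptions.

```python
import numpy as np

def delta_rule(X, Y, eta=0.05, epochs=200):
    """LMS / delta rule for one linear unit: minimize E = 0.5 * sum_mu (Y^mu - w . xi^mu)^2."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_mu, y_mu in zip(X, Y):
            err = y_mu - w @ x_mu      # (Y^mu - sum_j w_j xi_j^mu)
            w += eta * err * x_mu      # gradient-descent step on the squared error
    return w

# Toy regression problem: targets generated by an underlying linear rule y = 2*x1 - x2
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 2))
Y = 2 * X[:, 0] - X[:, 1]
print(delta_rule(X, Y))   # should approach [2, -1]
```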
Backpropagation of errors
• Use a nonlinear, differentiable activation function
  • Such as a sigmoid
• Use a multilayer feedforward network
  • Outputs are differentiable functions of the inputs
• Result: credit/blame can be propagated back to internal nodes
  • The chain rule (calculus) gives Δw_ij for internal “hidden” nodes
  • Based on gradient-descent learning
f(h) = 1 / (1 + exp(−2βh)) ,   where  h = Σ_j w_ij ξ_j
Backpropagation of errors (con’t)
[Figure: three-layer network; V_j denotes the output of hidden unit j]
Backpropagation of errors (con’t)
• Let A_i be the activation (weighted sum of inputs) of neuron i
• Let V_j = g(A_j) be the output of hidden unit j
• Learning rule for the hidden-to-output connection weights:
  ΔW_ij = −η ∂E/∂W_ij = η Σ_μ [d_i − a_i] g′(A_i) V_j = η Σ_μ δ_i V_j
• Learning rule for the input-to-hidden connection weights:
  Δw_jk = −η ∂E/∂w_jk = −η (∂E/∂V_j)(∂V_j/∂w_jk)   {chain rule}
        = η Σ_μ ( Σ_i [d_i − a_i] g′(A_i) W_ij ) g′(A_j) ξ_k = η Σ_μ δ_j ξ_k
  (sums over training patterns μ; d_i = desired output, a_i = actual output, δ_i = [d_i − a_i] g′(A_i), δ_j = g′(A_j) Σ_i δ_i W_ij)
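To tie the two rules together, below is a compact Python sketch of backpropagation for a network with one hidden layer, written to mirror the notation above (sigmoid g, δ terms, and the fixed −1 input from the perceptron slides standing in for thresholds). The XOR training set, layer sizes, learning rate, and iteration count are illustrative choices, not taken from the slides.

```python
import numpy as np

def g(a):                       # sigmoid activation g(a) = 1 / (1 + e^-a)
    return 1.0 / (1.0 + np.exp(-a))

def g_prime(a):                 # derivative of the sigmoid
    s = g(a)
    return s * (1.0 - s)

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs xi_k
D = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs d_i (XOR)
X = np.hstack([X, -np.ones((4, 1))])                          # fixed -1 input supplies the thresholds

n_in, n_hid, n_out, eta = 3, 3, 1, 0.5
w = rng.normal(0, 1, (n_hid, n_in))        # input -> hidden weights w_jk
W = rng.normal(0, 1, (n_out, n_hid + 1))   # hidden -> output weights W_ij (+1 for the -1 "bias" unit)

for _ in range(10000):
    A_hid = X @ w.T                                  # hidden activations A_j
    V = np.hstack([g(A_hid), -np.ones((4, 1))])      # hidden outputs V_j, plus the fixed -1 unit
    A_out = V @ W.T                                  # output activations A_i
    a = g(A_out)                                     # network outputs a_i

    delta_out = (D - a) * g_prime(A_out)                      # delta_i = [d_i - a_i] g'(A_i)
    delta_hid = (delta_out @ W[:, :n_hid]) * g_prime(A_hid)   # delta_j = g'(A_j) sum_i delta_i W_ij

    W += eta * delta_out.T @ V                       # hidden -> output update (summed over patterns)
    w += eta * delta_hid.T @ X                       # input -> hidden update

print(np.round(a, 2))                                # should approach [[0], [1], [1], [0]]
```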
Backpropagation
• Can be extended to an arbitrary number of layers, but three is most commonly used
• Can approximate arbitrary functions; the crucial issues are
  • generalization to examples not in the training data set
  • number of hidden units
  • number of samples
  • speed of convergence to a stable set of weights (sometimes a momentum term α Δw_pq is added to the learning rule to speed up learning)
• In your homework, you will use backpropagation and the delta rule in a simple pattern recognition task: classifying noisy images of the digits 0 through 9
  • C code for the networks is already given – you will only need to modify the input and output
Hopfield networks
• Act as “autoassociative” memories to store patterns
  • McCulloch–Pitts neurons with outputs −1 or 1, and threshold θ_i
• All neurons connected to each other
  • Symmetric weights (w_ij = w_ji) and w_ii = 0
• Asynchronous updating of outputs
  • Let s_i be the state of unit i
  • At each time step, pick a random unit
  • Set s_i to 1 if Σ_j w_ij s_j ≥ θ_i; otherwise, set s_i to −1
[Figure: completely connected network]
Hopfield networks
• Hopfield showed that asynchronous updating in symmetric networks minimizes an “energy” function and leads to a stable final state for a given initial state
• Define an energy function (analogous to the gradient-descent error function):
  E = −(1/2) Σ_{i,j} w_ij s_i s_j + Σ_i s_i θ_i
• Suppose a random unit i was updated: E always decreases!
  • If s_i is initially −1 and Σ_j w_ij s_j > θ_i, then s_i becomes +1
    • Change in E = −(1/2) Σ_j (w_ij s_j + w_ji s_j) + θ_i = −Σ_j w_ij s_j + θ_i < 0 !!
  • If s_i is initially +1 and Σ_j w_ij s_j < θ_i, then s_i becomes −1
    • Change in E = (1/2) Σ_j (w_ij s_j + w_ji s_j) − θ_i = Σ_j w_ij s_j − θ_i < 0 !!
Hopfield networks
• Note: the network converges to local minima, which store different patterns.
• Store p N-dimensional pattern vectors x_1, …, x_p using the Hebbian learning rule:
  • w_ji = (1/N) Σ_{m=1..p} x_{m,j} x_{m,i} for all j ≠ i; w_ji = 0 for j = i
  • In matrix form, W = (1/N) Σ_{m=1..p} x_m x_m^T (outer product of vectors; diagonal set to zero)
    • T denotes vector transpose
[Figure: energy landscape with stored patterns (e.g. x_1, x_4) as local minima]
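Putting the Hebbian storage rule and the asynchronous update rule together, a small Python sketch of a Hopfield memory might look as follows; zero thresholds and random ±1 patterns are assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 100, 3
patterns = rng.choice([-1, 1], size=(p, N))        # p random N-dimensional +/-1 patterns

# Hebbian storage: W = (1/N) sum_m x_m x_m^T, with zero diagonal
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

def recall(s, steps=5 * N):
    """Asynchronous updates: pick a random unit, set s_i = +1 if sum_j W_ij s_j >= 0 else -1."""
    s = s.copy()
    for _ in range(steps):
        i = rng.integers(N)
        s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Pattern completion: corrupt 10% of the bits of pattern 0 and let the network clean it up
probe = patterns[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
probe[flip] *= -1
restored = recall(probe)
print("overlap with stored pattern:", (restored @ patterns[0]) / N)   # should be close to 1.0
```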
Pattern Completion in a Hopfield Network
[Figure: a partial or noisy input settles into a local minimum (“attractor”) of the energy function, which stores the complete pattern]
Radial Basis Function Networks
[Figure: RBF network with input nodes, one layer of hidden neurons, and output neurons]
Radial Basis Function Networks
Propagation function (input layer to hidden layer): each hidden unit j computes the Euclidean distance between the input x and its weight vector w_j,

a_j = √( Σ_{i=1}^{n} ( x_i − w_{i,j} )² )

[Figure: RBF network with input nodes and output neurons]
Radial Basis Function Networks
Output function (Gaussian bell-shaped function):

h(a) = e^( −a² / (2σ²) )

[Figure: plot of h(a), peaked at a = 0; RBF network with input nodes and output neurons]
Radial Basis Function Networks
Output of the network:

out_j = Σ_i w_{i,j} h_i

[Figure: RBF network with input nodes and output neurons]
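Combining the three pieces above (distance-based propagation, Gaussian output function, weighted output sum), a forward pass through an RBF network can be sketched in a few lines of Python; the centers, widths, and output weights below are placeholder values, not taken from the slides.

```python
import numpy as np

def rbf_forward(x, centers, sigmas, w_out):
    """RBF forward pass: a_j = ||x - c_j||, h_j = exp(-a_j^2 / (2 sigma_j^2)), out = sum_j w_j h_j."""
    a = np.linalg.norm(centers - x, axis=1)      # Euclidean distance from the input to each hidden unit's center
    h = np.exp(-a ** 2 / (2.0 * sigmas ** 2))    # Gaussian hidden-unit outputs
    return h @ w_out                             # weighted sum at the output neuron

# Placeholder network: 3 hidden units over a 2D input, a single output
centers = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
sigmas = np.array([0.5, 0.5, 0.5])
w_out = np.array([1.0, -1.0, 0.5])
print(rbf_forward(np.array([0.2, 0.8]), centers, sigmas, w_out))
```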
RBF networks
• Radial basis functions
  • Hidden units store means and variances
  • Hidden units compute a Gaussian function of the inputs x_1, …, x_n that constitute the input vector x
  • Learn the weights w_i, means μ_i, and variances σ_i by minimizing the squared-error function (gradient-descent learning)
RBF Networks and Multilayer Perceptrons
[Figure: RBF network versus MLP; the RBF partitions the input space locally around prototypes, while the MLP partitions it globally with hyperplanes]
Recurrent networks
• Employ feedback (positive, negative, or both)
  • Not necessarily stable
  • Symmetric connections can ensure stability
• Why use recurrent networks?
  • Can learn temporal patterns (time series or oscillations)
  • Biologically realistic
    • The majority of connections to neurons in cerebral cortex are feedback connections from local or distant neurons
• Examples
  • Hopfield network
  • Boltzmann machine (a Hopfield-like net with input & output units)
  • Recurrent backpropagation networks: for small sequences, unfold the network in the time dimension and use backpropagation learning
Recurrent networks (con’t)
• Example: Elman networks
  • Partially recurrent
  • Context units keep an internal memory of past inputs
  • Fixed context weights
  • Backpropagation for learning
  • E.g. can disambiguate the sequences ABC and CBA
[Figure: Elman network with input, hidden, output, and context units]
Unsupervised Networks
• No feedback to say how the output differs from the desired output (no error signal), or even whether the output was right or wrong
• The network must discover patterns in the input data by itself
  • This only works if there are redundancies in the input data
  • The network self-organizes to find these redundancies
• Clustering: decide which group an input belongs to
  • The synaptic weights of one neuron represent one group
• Principal Component Analysis: finds the principal eigenvector of the data covariance matrix
  • The Hebb rule performs PCA! (Oja, 1982); see the sketch below
    • Δw_i = η ξ_i y
    • Output y = Σ_i w_i ξ_i
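As a sketch of the Hebbian-PCA idea just stated: the plain Hebb rule Δw_i = η ξ_i y grows the weights without bound, so the Python code below implements Oja's (1982) normalized variant, Δw_i = η y (ξ_i − y w_i), whose weight vector converges to the principal eigenvector of the input covariance. The 2D Gaussian toy data and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 2D data whose largest variance lies along the (1, 1)/sqrt(2) direction
C = np.array([[3.0, 2.0], [2.0, 3.0]])          # data covariance matrix
X = rng.multivariate_normal([0, 0], C, size=5000)

eta = 0.01
w = rng.normal(0, 0.1, 2)                       # initial weights
for xi in X:
    y = w @ xi                                  # linear unit output y = sum_i w_i xi_i
    w += eta * y * (xi - y * w)                 # Oja's normalized Hebbian rule

print("learned w:", w / np.linalg.norm(w))      # close to +/-[0.707, 0.707], the principal eigenvector
```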
Self-Organizing Maps (Kohonen Maps)
• Feature maps
• Competitive networks
  • Neurons have locations
  • For each input, the winner is the unit with the largest output
  • Weights of the winner and nearby units are modified to resemble the input pattern
  • Nearby inputs are thus mapped topographically (a training-loop sketch follows below)
• Biological relevance
  • Retinotopic map
  • Somatosensory map
  • Tonotopic map
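Before the worked example on the following slides, here is a minimal Python sketch of Kohonen's training loop for the same setup (a 10 × 10 grid of neurons with 2D inputs drawn from the unit square). The winner is taken as the unit whose weight vector is closest to the input (the standard best-matching-unit formulation), and the learning-rate and neighborhood-decay schedules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = 10
W = rng.uniform(0, 1, size=(grid, grid, 2))          # weight vectors (w1, w2) of a 10 x 10 array of neurons
rows, cols = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")

n_iters = 5000
for t in range(n_iters):
    x = rng.uniform(0, 1, 2)                         # 2D input (x, y) drawn uniformly from the unit square
    d2 = np.sum((W - x) ** 2, axis=2)                # squared distance of every unit's weights to the input
    wi, wj = np.unravel_index(np.argmin(d2), d2.shape)   # winner = best-matching (closest) unit

    frac = t / n_iters
    eta = 0.5 * (1 - frac) + 0.01                    # decaying learning rate
    radius = 5.0 * (1 - frac) + 0.5                  # decaying neighborhood radius
    grid_d2 = (rows - wi) ** 2 + (cols - wj) ** 2
    h = np.exp(-grid_d2 / (2 * radius ** 2))         # neighborhood function centered on the winner

    W += eta * h[..., None] * (x - W)                # move winner and neighbors toward the input

print(W[0, 0], W[9, 9])   # corner units should end up near opposite corners of the input square
```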
Example of a 2D Self-Organizing Map
• 10 x 10 array of
neurons
• 2D inputs (x,y)
• Initial weights w1
and w2 random
as shown on right;
lines connect
neighbors
Example of a 2D Self-Organizing Map
• 10 x 10 array of
neurons
• 2D inputs (x,y)
• Weights after 10
iterations
Example of a 2D Self-Organizing Map
• 10 x 10 array of
neurons
• 2D inputs (x,y)
• Weights after 20
iterations
Example of a 2D Self-Organizing Map
• 10 x 10 array of
neurons
• 2D inputs (x,y)
• Weights after 40
iterations
Example of a 2D Self-Organizing Map
• 10 x 10 array of
neurons
• 2D inputs (x,y)
• Weights after 80
iterations
Example of a 2D Self-Organizing Map
• 10 x 10 array of
neurons
• 2D inputs (x,y)
• Final Weights (after
160 iterations)
• Topography of inputs
has been captured
Summary: Biology and Neural Networks
• So many similarities
  • Information is contained in synaptic connections
  • The network learns to perform specific functions
  • The network generalizes to new inputs
• But NNs are woefully inadequate compared with biology
  • Simplistic model of neuron and synapse, implausible learning rules
  • Hard to train large networks
  • Network construction (structure, learning rate, etc.) is a heuristic art
• One obvious difference: spike representation
  • Recent models explore spikes and spike-timing dependent plasticity
• Other recent trends: probabilistic approaches
  • NNs as Bayesian networks (allows principled derivation of dynamics, learning rules, and even the structure of the network)
  • Not clear how neurons encode probabilities in spikes
Editor's Notes

1. Possible topologies for neural networks: completely connected (self-associative networks, e.g. Hopfield); feedforward (an approximator with many parameters); recurrent (for modeling time-varying signals). For approximating time-invariant signals, feedforward neural networks are usually used, and we restrict ourselves to these in what follows.
2. Clear structure, information flows forward. With just one hidden layer, any function can be approximated (though possibly requiring many neurons). Other output functions lead to other network types; more on that later. Transition to the step function for …
3. Example: 2 inputs, one output. Step function in the neurons. A constant-1 neuron supplies the threshold of the step function: what does the MLP's output look like as a function of the two inputs x and y?
4. OK, so above the line the red neuron's output is 1, below it 0 → a decision line (a decision plane in 3D).
5. Likewise for the second (blue) neuron.
6. Delimiting a region, depending on the position of the hyperplanes spanned by the hidden neurons.
7. Message: arbitrary regions can be delimited if enough neurons are provided in the hidden layer. So what can a training algorithm actually set automatically, i.e. driven by example data?
8. Other network types can be obtained, for example, with other activation functions. MLP: global partitioning of the feature space; RBF: local partitioning.
9. Propagation function to the hidden layer: Euclidean distance to the weight vector.
10. The hidden-layer output lies in [0, 1]; the closer the input is to the weight vector, the larger the output → local activation (weight vector = prototype).
11. Output: sum over the hidden neurons.
12. The output is thus a weighted sum of basis functions, each parameterized by its weight vector. So RBF networks, too, can approximate arbitrary functions given enough neurons in the hidden layer. Compared with the MLP: hyperplanes partition the space globally, whereas in the RBF it is hyperspheres and a local partitioning. The problem again is interpretation: how should one picture the region “around” a prototype?
13. Retinotopic map: from retina to visual cortex. Somatosensory map: from skin to somatosensory cortex. Tonotopic map: from ear to auditory cortex. Hard problem: try to map a circle to a line.