An Introduction to Neural Networks

Vincent Cheung
Kevin Cannons
Signal & Data Compression Laboratory
Electrical & Computer Engineering
University of Manitoba
Winnipeg, Manitoba, Canada
Advisor: Dr. W. Kinsner

May 27, 2002

Outline

● Fundamentals
● Classes
● Design and Verification
● Results and Discussion
● Conclusion

What Are Artificial Neural Networks?

● An extremely simplified model of the brain
● Essentially a function approximator
   ► Transforms inputs into outputs to the best of its ability

   (Diagram: Inputs → NN → Outputs)

What Are Artificial Neural Networks?

● Composed of many “neurons” that co-operate to perform the desired function

What Are They Used For?

● Classification
   ► Pattern recognition, feature extraction, image matching
● Noise Reduction
   ► Recognize patterns in the inputs and produce noiseless outputs
● Prediction
   ► Extrapolation based on historical data

Why Use Neural Networks?

● Ability to learn
   ► NNs figure out how to perform their function on their own
   ► Determine their function based only upon sample inputs
● Ability to generalize
   ► i.e. produce reasonable outputs for inputs the network has not been taught how to deal with

How Do Neural Networks Work?

● The output of a neuron is a function of the weighted sum of the inputs plus a bias

   (Diagram: inputs i1, i2, i3 with weights w1, w2, w3 and a bias feeding a single neuron)
   Output = f(i1w1 + i2w2 + i3w3 + bias)

● The function of the entire neural network is simply the computation of the outputs of all the neurons
   ► An entirely deterministic calculation

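As a minimal sketch of this computation (not code from the presentation; the logistic activation introduced on the following slides is assumed):

```python
import math

def logistic(x):
    # Sigmoid activation discussed on the next slides
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    # Output = f(i1*w1 + i2*w2 + i3*w3 + bias)
    net = sum(i * w for i, w in zip(inputs, weights)) + bias
    return logistic(net)

# Example with three inputs and arbitrary, hypothetical weights
print(neuron_output([1.0, 0.0, 1.0], [0.5, 0.2, 0.8], bias=-0.5))
```
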
Activation Functions

● Applied to the weighted sum of the inputs of a neuron to produce the output
● The majority of NNs use sigmoid functions
   ► Smooth, continuous, and monotonically increasing (derivative is always positive)
   ► Bounded range – but never reaches the max or min
      ■ Consider “ON” to be slightly less than the max and “OFF” to be slightly greater than the min

Activation Functions

● The most common sigmoid function used is the logistic function
   ► f(x) = 1/(1 + e^(-x))
   ► The calculation of derivatives is important for neural networks, and the logistic function has a very nice derivative
      ■ f’(x) = f(x)(1 - f(x))
● Other sigmoid functions are also used
   ► hyperbolic tangent
   ► arctangent
● The exact nature of the function has little effect on the abilities of the neural network

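A small sketch of the logistic function and its derivative, written directly from the formulas above (illustrative only):

```python
import math

def logistic(x):
    # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def logistic_derivative(x):
    # f'(x) = f(x) * (1 - f(x))
    fx = logistic(x)
    return fx * (1.0 - fx)

# The derivative is largest at x = 0, where f(0) = 0.5 and f'(0) = 0.25
print(logistic(0.0), logistic_derivative(0.0))
```
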
Where Do The Weights Come From?

● The weights in a neural network are the most important factor in determining its function
● Training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function
● There are two main types of training
   ► Supervised Training
      ■ Supplies the neural network with inputs and the desired outputs
      ■ Response of the network to the inputs is measured
         The weights are modified to reduce the difference between the actual and desired outputs

Where Do The Weights Come From?

   ► Unsupervised Training
      ■ Only supplies inputs
      ■ The neural network adjusts its own weights so that similar inputs cause similar outputs
         The network identifies the patterns and differences in the inputs without any external assistance
● Epoch
   ■ One iteration through the process of providing the network with an input and updating the network's weights
   ■ Typically many epochs are required to train the neural network

Perceptrons

● First neural network with the ability to learn
● Made up of only input neurons and output neurons
● Input neurons typically have two states: ON and OFF
● Output neurons use a simple threshold activation function
● In basic form, can only solve linear problems
   ► Limited applications

   (Diagram: input neurons connected to an output neuron through weights, e.g. 0.5, 0.2, 0.8)

How Do Perceptrons Learn?

● Perceptrons use supervised training
● If the output is not correct, the weights are adjusted according to the formula (a code sketch follows below):
   ■ wnew = wold + α(desired – output)*input, where α is the learning rate

● Worked example, with inputs 1, 0, 1 and weights 0.5, 0.2, 0.8:
   1 * 0.5 + 0 * 0.2 + 1 * 0.8 = 1.3
   Assuming the output threshold is 1.2, then 1.3 > 1.2, so the output fires as 1
   Assume the output was supposed to be 0, so the weights are updated with α = 1:
   W1new = 0.5 + 1*(0-1)*1 = -0.5
   W2new = 0.2 + 1*(0-1)*0 = 0.2
   W3new = 0.8 + 1*(0-1)*1 = -0.2

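The worked example above can be reproduced with a short sketch (hypothetical helper names, not code from the presentation):

```python
def perceptron_update(weights, inputs, desired, output, alpha=1.0):
    # w_new = w_old + alpha * (desired - output) * input
    return [w + alpha * (desired - output) * x for w, x in zip(weights, inputs)]

weights = [0.5, 0.2, 0.8]
inputs = [1, 0, 1]
threshold = 1.2

net = sum(w * x for w, x in zip(weights, inputs))   # 1.3
output = 1 if net > threshold else 0                # fires, but the desired output is 0
weights = perceptron_update(weights, inputs, desired=0, output=output)
print(weights)   # [-0.5, 0.2, -0.2], matching the slide
```
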
Multilayer Feedforward Networks

● Most common neural network
● An extension of the perceptron
   ► Multiple layers
      ■ The addition of one or more “hidden” layers in between the input and output layers
   ► Activation function is not simply a threshold
      ■ Usually a sigmoid function
   ► A general function approximator
      ■ Not limited to linear problems
● Information flows in one direction
   ► The outputs of one layer act as inputs to the next layer

XOR Example

(Diagram: a feedforward network with 2 inputs, 2 hidden neurons H1 and H2, and 1 output neuron O)

Inputs: 0, 1

H1: Net = 0(4.83) + 1(-4.83) – 2.82 = -7.65
    Output = 1 / (1 + e^7.65) = 4.758 x 10^-4
H2: Net = 0(-4.63) + 1(4.6) – 2.74 = 1.86
    Output = 1 / (1 + e^-1.86) = 0.8652

O:  Net = 4.758 x 10^-4 (5.73) + 0.8652(5.83) – 2.86 = 2.187
    Output = 1 / (1 + e^-2.187) = 0.8991 ≡ “1”

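Plugging the weights shown on this slide into a small forward-pass sketch (a reconstruction, not code distributed with the talk) reproduces these numbers and gives the correct answer for all four XOR inputs:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights and biases read off the XOR slide (hidden neurons H1, H2; output neuron O)
w_h1, b_h1 = [4.83, -4.83], -2.82
w_h2, b_h2 = [-4.63, 4.60], -2.74
w_o,  b_o  = [5.73, 5.83],  -2.86

def xor_net(x1, x2):
    h1 = logistic(w_h1[0] * x1 + w_h1[1] * x2 + b_h1)
    h2 = logistic(w_h2[0] * x1 + w_h2[1] * x2 + b_h2)
    return logistic(w_o[0] * h1 + w_o[1] * h2 + b_o)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(xor_net(a, b), 4))   # (0, 1) gives ~0.8991, read as "1"
```
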
Backpropagation

● Most common method of obtaining the many weights in the network
● A form of supervised training
● The basic backpropagation algorithm is based on minimizing the error of the network using the derivatives of the error function
   ► Simple
   ► Slow
   ► Prone to local minima issues

Backpropagation

● The most common measure of error is the mean square error:
      E = (target – output)^2

● Partial derivatives of the error with respect to the weights (a code sketch follows below):
   ► Output Neurons:
      let: δj = f’(netj) (targetj – outputj)      j = output neuron
      ∂E/∂wji = -outputi δj                       i = neuron in the last hidden layer
   ► Hidden Neurons:
      let: δj = f’(netj) Σk(δk wkj)               j = hidden neuron, k = neuron in the next layer
      ∂E/∂wji = -outputi δj                       i = neuron in the previous layer

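To make the formulas concrete, here is a hedged sketch (illustrative values and names, not the authors' implementation) of computing the deltas and gradients for one training pair on a tiny 2-2-1 network:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def dlogistic(net):
    fx = logistic(net)
    return fx * (1.0 - fx)

# Hypothetical 2-2-1 network: w_hidden[j][i] connects input i to hidden neuron j
w_hidden, b_hidden = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.0]
w_out, b_out = [0.2, -0.5], 0.0

def gradients(x, target):
    # Forward pass
    net_h = [sum(w * xi for w, xi in zip(w_hidden[j], x)) + b_hidden[j] for j in range(2)]
    out_h = [logistic(n) for n in net_h]
    net_o = sum(w * h for w, h in zip(w_out, out_h)) + b_out
    out_o = logistic(net_o)

    # Output neuron: delta_o = f'(net_o) * (target - out_o), dE/dw_oi = -out_h[i] * delta_o
    delta_o = dlogistic(net_o) * (target - out_o)
    grad_out = [-out_h[i] * delta_o for i in range(2)]

    # Hidden neurons: delta_j = f'(net_j) * sum_k(delta_k * w_kj), dE/dw_ji = -x[i] * delta_j
    delta_h = [dlogistic(net_h[j]) * delta_o * w_out[j] for j in range(2)]
    grad_hidden = [[-x[i] * delta_h[j] for i in range(2)] for j in range(2)]
    return grad_out, grad_hidden

print(gradients([0.0, 1.0], target=1.0))
```
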
Backpropagation

● Calculation of the derivatives flows backwards through the network, hence the name, backpropagation
● These derivatives point in the direction of the maximum increase of the error function
● A small step (learning rate) in the opposite direction will result in the maximum decrease of the (local) error function:
      wnew = wold – α ∂E/∂wold
      where α is the learning rate

Backpropagation

● The learning rate is important
   ► Too small
      ■ Convergence extremely slow
   ► Too large
      ■ May not converge
● Momentum
   ► Tends to aid convergence
   ► Applies smoothed averaging to the change in weights (a code sketch follows below):
      ∆new = β∆old - α ∂E/∂wold        β is the momentum coefficient
      wnew = wold + ∆new
   ► Acts as a low-pass filter by reducing rapid fluctuations

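A minimal sketch of this momentum update (assumed helper names and values, not the authors' code):

```python
def momentum_step(w_old, delta_old, grad, alpha=0.5, beta=0.9):
    # delta_new = beta * delta_old - alpha * dE/dw
    # w_new     = w_old + delta_new
    delta_new = beta * delta_old - alpha * grad
    return w_old + delta_new, delta_new

w, delta = 0.8, 0.0
for _ in range(3):                       # three illustrative steps with a fixed gradient
    w, delta = momentum_step(w, delta, grad=0.2)
    print(round(w, 4), round(delta, 4))
```
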
Local Minima

● Training is essentially minimizing the mean square error function
   ► The key problem is avoiding local minima
   ► Traditional techniques for avoiding local minima:
      ■ Simulated annealing
         Perturb the weights in progressively smaller amounts
      ■ Genetic algorithms
         Use the weights as chromosomes
         Apply natural selection, mating, and mutations to these chromosomes

Counterpropagation (CP) Networks

● Another multilayer feedforward network
● Up to 100 times faster than backpropagation
● Not as general as backpropagation
● Made up of three layers:
   ► Input
   ► Kohonen
   ► Grossberg (Output)

   (Diagram: Inputs → Input Layer → Kohonen Layer → Grossberg Layer → Outputs)

How Do They Work?

● Kohonen Layer:
   ► Neurons in the Kohonen layer sum all of the weighted inputs received
   ► The neuron with the largest sum outputs a 1 and the other neurons output 0
● Grossberg Layer:
   ► Each Grossberg neuron merely outputs the weight of the connection between itself and the one active Kohonen neuron

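Read literally, the two layers amount to a winner-take-all selection followed by a weight lookup; a hedged sketch with made-up weights:

```python
def cp_forward(x, kohonen_w, grossberg_w):
    # Kohonen layer: the neuron with the largest weighted sum wins (outputs 1, others 0)
    sums = [sum(w * xi for w, xi in zip(row, x)) for row in kohonen_w]
    winner = max(range(len(sums)), key=lambda k: sums[k])
    # Grossberg layer: each output neuron emits its weight to the winning Kohonen neuron
    return [row[winner] for row in grossberg_w]

kohonen_w = [[0.9, 0.1], [0.2, 0.8]]     # 2 Kohonen neurons, 2 inputs
grossberg_w = [[0.3, 0.7]]               # 1 output neuron, 2 Kohonen neurons
print(cp_forward([1.0, 0.0], kohonen_w, grossberg_w))   # Kohonen neuron 0 wins -> [0.3]
```
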
Why Two Different Types of Layers?

● More accurate representation of biological neural networks
● Each layer has its own distinct purpose:
   ► Kohonen layer separates inputs into separate classes
      ■ Inputs in the same class will turn on the same Kohonen neuron
   ► Grossberg layer adjusts weights to obtain acceptable outputs for each class

Training a CP Network

● Training the Kohonen layer
   ► Uses unsupervised training
   ► Input vectors are often normalized
   ► The one active Kohonen neuron updates its weights according to the formula (a code sketch follows below):
      wnew = wold + α(input - wold)
      where α is the learning rate
      ■ The weights of the connections are being modified to more closely match the values of the inputs
      ■ At the end of training, the weights will approximate the average value of the inputs in that class

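A small sketch of one Kohonen training step (winner-take-all selection plus the update above; illustrative weights and names):

```python
def kohonen_update(weights, x, alpha=0.3):
    # Find the winning neuron (largest weighted sum) and move only its weight
    # vector toward the input: w_new = w_old + alpha * (input - w_old)
    sums = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    winner = max(range(len(sums)), key=lambda k: sums[k])
    weights[winner] = [w + alpha * (xi - w) for w, xi in zip(weights[winner], x)]
    return winner

weights = [[0.9, 0.1], [0.2, 0.8]]
winner = kohonen_update(weights, [1.0, 0.0])
print(winner, weights)   # the winner's weights drift toward the input [1.0, 0.0]
```
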
Training a CP Network

● Training the Grossberg layer
   ► Uses supervised training
   ► Weight update algorithm is similar to that used in backpropagation

Hidden Layers and Neurons

● For most problems, one hidden layer is sufficient
● Two hidden layers are required when the function is discontinuous
● The number of neurons is very important:
   ► Too few
      ■ Underfit the data – the NN can’t learn the details
   ► Too many
      ■ Overfit the data – the NN learns the insignificant details
   ► Start small and increase the number until satisfactory results are obtained

Overfitting

(Figure: training and test samples with a well-fit curve versus an overfit curve; legend: Training, Test, Well fit, Overfit)

How is the Training Set Chosen?

● Overfitting can also occur if a “good” training set is not chosen
● What constitutes a “good” training set?
   ► Samples must represent the general population
   ► Samples must contain members of each class
   ► Samples in each class must contain a wide range of variations or noise effects

Size of the Training Set

● The size of the training set is related to the number of hidden neurons
   ► E.g. 10 inputs, 5 hidden neurons, 2 outputs:
   ► 11(5) + 6(2) = 67 weights (variables)
   ► If only 10 training samples are used to determine these weights, the network will end up being overfit
      ■ Any solution found will be specific to the 10 training samples
      ■ Analogous to having 10 equations and 67 unknowns: you can come up with a specific solution, but you can’t find the general solution with the given information

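The weight count on this slide follows from giving every neuron one bias in addition to its incoming weights; a throwaway sketch of that arithmetic (hypothetical helper name):

```python
def weight_count(n_inputs, n_hidden, n_outputs):
    # Each hidden neuron has n_inputs weights + 1 bias;
    # each output neuron has n_hidden weights + 1 bias
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

print(weight_count(10, 5, 2))   # 67, as on the slide
```
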
Training and Verification

● The set of all known samples is broken into two orthogonal (independent) sets:
   ► Training set
      ■ A group of samples used to train the neural network
   ► Testing set
      ■ A group of samples used to test the performance of the neural network
      ■ Used to estimate the error rate

   (Diagram: Known Samples split into a Training Set and a Testing Set)

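One generic way to perform such a split (a hedged sketch; the 75/25 split fraction is an assumption, not something the slides prescribe):

```python
import random

def split_samples(samples, train_fraction=0.75, seed=0):
    # Shuffle, then carve the known samples into disjoint training and testing sets
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

samples = list(range(32))                # e.g. the 32 possible face-feature inputs used later
train_set, test_set = split_samples(samples)
print(len(train_set), len(test_set))     # 24 and 8
```
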
Verification

● Provides an unbiased test of the quality of the network
● A common error is to “test” the neural network using the same samples that were used to train the neural network
   ► The network was optimized on these samples, and will obviously perform well on them
   ► Doesn’t give any indication as to how well the network will be able to classify inputs that weren’t in the training set

Verification

● Various metrics can be used to grade the performance of the neural network based upon the results of the testing set
   ► Mean square error, SNR, etc.
● Resampling is an alternative method of estimating the error rate of the neural network
   ► The basic idea is to iterate the training and testing procedures multiple times
   ► Two main techniques are used:
      ■ Cross-Validation
      ■ Bootstrapping

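A compact sketch of the cross-validation idea (generic and illustrative; `train` and `test_error` are hypothetical stand-ins for whatever training and scoring routines are actually used):

```python
def cross_validate(samples, k, train, test_error):
    # Split the samples into k folds; each fold takes one turn as the testing set
    folds = [samples[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        test_set = folds[i]
        train_set = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train(train_set)
        errors.append(test_error(model, test_set))
    return sum(errors) / k   # average error over the k train/test iterations

# Toy usage with placeholder training and scoring functions
avg = cross_validate(list(range(20)), k=5,
                     train=lambda data: None,
                     test_error=lambda model, data: len(data))
print(avg)   # 4.0 with these placeholders
```
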
Results and Discussion

● A simple toy problem was used to test the operation of a perceptron
● Provided the perceptron with 5 pieces of information about a face – the individual’s hair, eye, nose, mouth, and ear type
   ► Each piece of information could take a value of +1 or -1
      ■ +1 indicates a “girl” feature
      ■ -1 indicates a “guy” feature
● The individual was to be classified as a girl if the face had more “girl” features than “guy” features and a boy otherwise

Results and Discussion

● Constructed a perceptron with 5 inputs and 1 output

   (Diagram: face feature input values → 5 input neurons → 1 output neuron → output value indicating boy or girl)

● Trained the perceptron with 24 out of the 32 possible inputs over 1000 epochs
● The perceptron was able to classify the faces that were not in the training set

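The 32 possible inputs and their intended labels are easy to enumerate; a hedged sketch (illustrative only, not the data-generation code used in the talk):

```python
from itertools import product

# Each of the 5 face features is +1 ("girl") or -1 ("guy"); the label is girl
# when girl features outnumber guy features, boy otherwise (a tie is impossible
# with 5 features).
faces = list(product([+1, -1], repeat=5))
labels = ["girl" if sum(f) > 0 else "boy" for f in faces]

print(len(faces))              # 32 possible inputs, 24 of which were used for training
print(faces[0], labels[0])     # (1, 1, 1, 1, 1) -> girl
```
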
Results and Discussion

● A number of toy problems were tested on multilayer feedforward NNs with a single hidden layer and backpropagation:
   ► Inverter
      ■ The NN was trained to simply output 0.1 when given a “1” and 0.9 when given a “0”
         A demonstration of the NN’s ability to memorize
      ■ 1 input, 1 hidden neuron, 1 output
      ■ With a learning rate of 0.5 and no momentum, it took about 3,500 epochs for sufficient training
      ■ Including a momentum coefficient of 0.9 reduced the number of epochs required to about 250

Results and Discussion

   ► Inverter (continued)
      ■ Increasing the learning rate decreased the training time without hampering convergence for this simple example
      ■ Increasing the epoch size (the number of samples per epoch) decreased the number of epochs required and seemed to aid convergence (reduced fluctuations)
      ■ Increasing the number of hidden neurons decreased the number of epochs required
         Allowed the NN to better memorize the training set – the goal of this toy problem
         Not recommended for “real” problems, since the NN loses its ability to generalize

Results and Discussion

   ► AND gate
      ■ 2 inputs, 2 hidden neurons, 1 output
      ■ About 2,500 epochs were required when using momentum
   ► XOR gate
      ■ Same as the AND gate
   ► 3-to-8 decoder
      ■ 3 inputs, 3 hidden neurons, 8 outputs
      ■ About 5,000 epochs were required when using momentum

Results and Discussion

   ► Absolute sine function approximator (|sin(x)|)
      ■ A demonstration of the NN’s ability to learn the desired function, |sin(x)|, and to generalize
      ■ 1 input, 5 hidden neurons, 1 output
      ■ The NN was trained with samples between –π/2 and π/2
         The inputs were rounded to one decimal place
         The desired targets were scaled to between 0.1 and 0.9
      ■ The test data contained samples in between the training samples (i.e. more than 1 decimal place)
         The outputs were translated back to between 0 and 1
      ■ About 50,000 epochs were required with momentum
      ■ Not a smooth function at 0 (only piece-wise continuous)

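A hedged sketch of how such a training set could be generated (the rounding and the 0.1–0.9 target scaling follow the slide; the helper name and sample spacing are assumptions):

```python
import math

def make_sine_training_set():
    # Inputs between -pi/2 and pi/2 rounded to one decimal place;
    # targets |sin(x)| rescaled from [0, 1] into [0.1, 0.9]
    xs = [i / 10 for i in range(-15, 16)]
    return [(x, 0.1 + 0.8 * abs(math.sin(x))) for x in xs]

data = make_sine_training_set()
print(len(data), data[0], data[15])   # 31 samples; (-1.5, ~0.898) and (0.0, 0.1)
```
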
Results and Discussion

   ► Gaussian function approximator (e^(-x²))
      ■ 1 input, 2 hidden neurons, 1 output
      ■ Similar to the absolute sine function approximator, except that the domain was changed to between -3 and 3
      ■ About 10,000 epochs were required with momentum
      ■ Smooth function

Results and Discussion

   ► Primality tester
      ■ 7 inputs, 8 hidden neurons, 1 output
      ■ The input to the NN was a binary number
      ■ The NN was trained to output 0.9 if the number was prime and 0.1 if the number was composite
         Classification and memorization test
      ■ The inputs were restricted to between 0 and 100
      ■ About 50,000 epochs required for the NN to memorize the classifications for the training set
         No attempts at generalization were made due to the complexity of the pattern of prime numbers
      ■ Some issues with local minima

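A hedged sketch of how the 7-bit training data for this test could be assembled (a simple trial-division primality check is assumed; not the authors' code):

```python
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def primality_samples():
    # 7-bit binary input (enough for 0..100); target 0.9 for prime, 0.1 for composite
    samples = []
    for n in range(101):
        bits = [(n >> b) & 1 for b in range(6, -1, -1)]   # most significant bit first
        samples.append((bits, 0.9 if is_prime(n) else 0.1))
    return samples

data = primality_samples()
print(sum(1 for _, target in data if target == 0.9))   # 25 primes between 0 and 100
```
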
Results and Discussion

   ► Prime number generator
      ■ Provide the network with a seed, and a prime number of the same order should be returned
      ■ 7 inputs, 4 hidden neurons, 7 outputs
      ■ Both the inputs and outputs were binary numbers
      ■ The network was trained as an autoassociative network
         Prime numbers from 0 to 100 were presented to the network and it was requested that the network echo the prime numbers
         The intent was to have the network output the closest prime number when given a composite number
      ■ After one million epochs, the network was successfully able to produce prime numbers for about 85–90% of the numbers between 0 and 100
      ■ Using Gray code instead of binary did not improve results
      ■ Perhaps a second hidden layer is needed, or some heuristics could be implemented to reduce local minima issues

Conclusion

● The toy examples confirmed the basic operation of neural networks and also demonstrated their ability to learn the desired function and generalize when needed
● The ability of neural networks to learn and generalize, in addition to their wide range of applicability, makes them very powerful tools

Questions and Comments

Acknowledgements

● Natural Sciences and Engineering Research Council (NSERC)
● University of Manitoba


Weitere ähnliche Inhalte

Was ist angesagt?

Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural NetworkPratik Aggarwal
 
Neural network
Neural networkNeural network
Neural networkmarada0033
 
(Artificial) Neural Network
(Artificial) Neural Network(Artificial) Neural Network
(Artificial) Neural NetworkPutri Wikie
 
Artificial Neural Networks for NIU
Artificial Neural Networks for NIUArtificial Neural Networks for NIU
Artificial Neural Networks for NIUProf. Neeta Awasthy
 
Artificial Neural Networks for NIU session 2016 17
Artificial Neural Networks for NIU session 2016 17 Artificial Neural Networks for NIU session 2016 17
Artificial Neural Networks for NIU session 2016 17 Prof. Neeta Awasthy
 
Neural network
Neural networkNeural network
Neural networkFacebook
 
Artificial nueral network slideshare
Artificial nueral network slideshareArtificial nueral network slideshare
Artificial nueral network slideshareRed Innovators
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural NetworkAtul Krishna
 
Neural networks
Neural networksNeural networks
Neural networksSlideshare
 
Application of artificial_neural_network
Application of artificial_neural_networkApplication of artificial_neural_network
Application of artificial_neural_networkgabo GAG
 
Neural network
Neural networkNeural network
Neural networkSilicon
 
Neural networks
Neural networksNeural networks
Neural networksSlideshare
 
Artificial neural networks and its applications
Artificial neural networks and its applications Artificial neural networks and its applications
Artificial neural networks and its applications PoojaKoshti2
 

Was ist angesagt? (20)

Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
Neural network
Neural networkNeural network
Neural network
 
Artifical Neural Network
Artifical Neural NetworkArtifical Neural Network
Artifical Neural Network
 
(Artificial) Neural Network
(Artificial) Neural Network(Artificial) Neural Network
(Artificial) Neural Network
 
Artificial Neural Networks for NIU
Artificial Neural Networks for NIUArtificial Neural Networks for NIU
Artificial Neural Networks for NIU
 
Artificial Neural Networks for NIU session 2016 17
Artificial Neural Networks for NIU session 2016 17 Artificial Neural Networks for NIU session 2016 17
Artificial Neural Networks for NIU session 2016 17
 
Neural network
Neural networkNeural network
Neural network
 
Artificial Neuron network
Artificial Neuron network Artificial Neuron network
Artificial Neuron network
 
Artificial nueral network slideshare
Artificial nueral network slideshareArtificial nueral network slideshare
Artificial nueral network slideshare
 
Neural network and mlp
Neural network and mlpNeural network and mlp
Neural network and mlp
 
Neural network
Neural networkNeural network
Neural network
 
neural-networks (1)
neural-networks (1)neural-networks (1)
neural-networks (1)
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
Neural networks
Neural networksNeural networks
Neural networks
 
Application of artificial_neural_network
Application of artificial_neural_networkApplication of artificial_neural_network
Application of artificial_neural_network
 
Neural network
Neural networkNeural network
Neural network
 
Neural networks
Neural networksNeural networks
Neural networks
 
Artificial neural networks and its applications
Artificial neural networks and its applications Artificial neural networks and its applications
Artificial neural networks and its applications
 
Artificial Neural Network.pptx
Artificial Neural Network.pptxArtificial Neural Network.pptx
Artificial Neural Network.pptx
 

Ähnlich wie Neural networks.cheungcannonnotes

Ann model and its application
Ann model and its applicationAnn model and its application
Ann model and its applicationmilan107
 
Basics of Artificial Neural Network
Basics of Artificial Neural Network Basics of Artificial Neural Network
Basics of Artificial Neural Network Subham Preetam
 
ENNEoS Presentation - HackMiami
ENNEoS Presentation - HackMiamiENNEoS Presentation - HackMiami
ENNEoS Presentation - HackMiamiDrew Kirkpatrick
 
Dr. Syed Muhammad Ali Tirmizi - Special topics in finance lec 13
Dr. Syed Muhammad Ali Tirmizi - Special topics in finance   lec 13Dr. Syed Muhammad Ali Tirmizi - Special topics in finance   lec 13
Dr. Syed Muhammad Ali Tirmizi - Special topics in finance lec 13Dr. Muhammad Ali Tirmizi., Ph.D.
 
NEUROMORPHIC COMPUTING.pptx
NEUROMORPHIC COMPUTING.pptxNEUROMORPHIC COMPUTING.pptx
NEUROMORPHIC COMPUTING.pptxkomalpawooskar1
 
ENNEoS Presentation - CackalackyCon
ENNEoS Presentation - CackalackyConENNEoS Presentation - CackalackyCon
ENNEoS Presentation - CackalackyConDrew Kirkpatrick
 
machine learning alg.pptx
machine learning alg.pptxmachine learning alg.pptx
machine learning alg.pptxNANDHINIS109942
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 
deep learning from scratch chapter 3 neural network
deep learning from scratch chapter 3 neural networkdeep learning from scratch chapter 3 neural network
deep learning from scratch chapter 3 neural networkJaey Jeong
 
neural network
neural networkneural network
neural networkSTUDENT
 
Deep learning crash course
Deep learning crash courseDeep learning crash course
Deep learning crash courseVishwas N
 
nil-100128213838-phpapp02.pdf
nil-100128213838-phpapp02.pdfnil-100128213838-phpapp02.pdf
nil-100128213838-phpapp02.pdfdlakmlkfma
 
Deep Neural Networks.pptx
Deep Neural Networks.pptxDeep Neural Networks.pptx
Deep Neural Networks.pptxYashPaul20
 
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptxartificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptxREG83NITHYANANTHANN
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionYongsu Baek
 

Ähnlich wie Neural networks.cheungcannonnotes (20)

Neural
NeuralNeural
Neural
 
Neural network
Neural networkNeural network
Neural network
 
CNN.pptx.pdf
CNN.pptx.pdfCNN.pptx.pdf
CNN.pptx.pdf
 
Ann model and its application
Ann model and its applicationAnn model and its application
Ann model and its application
 
Basics of Artificial Neural Network
Basics of Artificial Neural Network Basics of Artificial Neural Network
Basics of Artificial Neural Network
 
ENNEoS Presentation - HackMiami
ENNEoS Presentation - HackMiamiENNEoS Presentation - HackMiami
ENNEoS Presentation - HackMiami
 
Dr. Syed Muhammad Ali Tirmizi - Special topics in finance lec 13
Dr. Syed Muhammad Ali Tirmizi - Special topics in finance   lec 13Dr. Syed Muhammad Ali Tirmizi - Special topics in finance   lec 13
Dr. Syed Muhammad Ali Tirmizi - Special topics in finance lec 13
 
NEUROMORPHIC COMPUTING.pptx
NEUROMORPHIC COMPUTING.pptxNEUROMORPHIC COMPUTING.pptx
NEUROMORPHIC COMPUTING.pptx
 
ENNEoS Presentation - CackalackyCon
ENNEoS Presentation - CackalackyConENNEoS Presentation - CackalackyCon
ENNEoS Presentation - CackalackyCon
 
MACHINE LEARNING.pptx
MACHINE LEARNING.pptxMACHINE LEARNING.pptx
MACHINE LEARNING.pptx
 
ml.pptx
ml.pptxml.pptx
ml.pptx
 
machine learning alg.pptx
machine learning alg.pptxmachine learning alg.pptx
machine learning alg.pptx
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
deep learning from scratch chapter 3 neural network
deep learning from scratch chapter 3 neural networkdeep learning from scratch chapter 3 neural network
deep learning from scratch chapter 3 neural network
 
neural network
neural networkneural network
neural network
 
Deep learning crash course
Deep learning crash courseDeep learning crash course
Deep learning crash course
 
nil-100128213838-phpapp02.pdf
nil-100128213838-phpapp02.pdfnil-100128213838-phpapp02.pdf
nil-100128213838-phpapp02.pdf
 
Deep Neural Networks.pptx
Deep Neural Networks.pptxDeep Neural Networks.pptx
Deep Neural Networks.pptx
 
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptxartificialneuralnetwork-130409001108-phpapp02 (2).pptx
artificialneuralnetwork-130409001108-phpapp02 (2).pptx
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image Recognition
 

Kürzlich hochgeladen

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Kürzlich hochgeladen (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Neural networks.cheungcannonnotes

  • 1. An Introduction to Neural Networks Vincent Cheung Kevin Cannons Signal & Data Compression Laboratory Electrical & Computer Engineering University of Manitoba Winnipeg, Manitoba, Canada Advisor: Dr. W. Kinsner May 27, 2002
  • 2. Neural Networks Outline ● Fundamentals ● Classes ● Design and Verification ● Results and Discussion ● Conclusion Cheung/Cannons 1
  • 3. Fundamentals Neural Networks What Are Artificial Neural Networks? ● An extremely simplified model of the brain Classes ● Essentially a function approximator ► Transforms inputs into outputs to the best of its ability Design Inputs Outputs Results Inputs NN Outputs Cheung/Cannons 2
  • 4. Fundamentals Neural Networks What Are Artificial Neural Networks? ● Composed of many “neurons” that co-operate Classes to perform the desired function Design Results Cheung/Cannons 3
  • 5. Fundamentals Neural Networks What Are They Used For? ● Classification Classes ► Pattern recognition, feature extraction, image matching ● Noise Reduction ► Recognize patterns in the inputs and produce Design noiseless outputs ● Prediction ► Extrapolation based on historical data Results Cheung/Cannons 4
  • 6. Fundamentals Neural Networks Why Use Neural Networks? ● Ability to learn Classes ► NN’s figure out how to perform their function on their own ► Determine their function based only upon sample inputs ● Ability to generalize i.e. produce reasonable outputs for inputs it has not been Design ► taught how to deal with Results Cheung/Cannons 5
  • 7. Fundamentals Neural Networks How Do Neural Networks Work? ● The output of a neuron is a function of the Classes weighted sum of the inputs plus a bias i1 w1 w2 i2 w3 Neuron Output = f(i1w1 + i2w2 + i3w3 + bias) Design i3 bias Results ● The function of the entire neural network is simply the computation of the outputs of all the neurons ► An entirely deterministic calculation Cheung/Cannons 6
  • 8. Fundamentals Neural Networks Activation Functions ● Applied to the weighted sum of the inputs of a Classes neuron to produce the output ● Majority of NN’s use sigmoid functions ► Smooth, continuous, and monotonically increasing (derivative is always positive) Design ► Bounded range - but never reaches max or min ■ Consider “ON” to be slightly less than the max and “OFF” to be slightly greater than the min Results Cheung/Cannons 7
  • 9. Fundamentals Neural Networks Activation Functions ● The most common sigmoid function used is the Classes logistic function ► f(x) = 1/(1 + e-x) ► The calculation of derivatives are important for neural networks and the logistic function has a very nice Design derivative ■ f’(x) = f(x)(1 - f(x)) ● Other sigmoid functions also used hyperbolic tangent Results ► ► arctangent ● The exact nature of the function has little effect on the abilities of the neural network Cheung/Cannons 8
  • 10. Where Do The Weights Come From?
    ● The weights in a neural network are the most important factor in determining its function
    ● Training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function
    ● There are two main types of training
      ► Supervised training
        ■ Supplies the neural network with inputs and the desired outputs
        ■ The response of the network to the inputs is measured; the weights are modified to reduce the difference between the actual and desired outputs
  • 11. Where Do The Weights Come From?
      ► Unsupervised training
        ■ Only supplies inputs
        ■ The neural network adjusts its own weights so that similar inputs cause similar outputs; the network identifies the patterns and differences in the inputs without any external assistance
    ● Epoch
      ■ One iteration through the process of providing the network with an input and updating the network's weights
      ■ Typically many epochs are required to train the neural network
  • 12. Perceptrons
    ● The first neural network with the ability to learn
    ● Made up of only input neurons and output neurons
    ● Input neurons typically have two states: ON and OFF
    ● Output neurons use a simple threshold activation function
    ● In basic form, can only solve linearly separable problems
      ► Limited applications
    [Figure: input neurons connected to an output neuron through the weights 0.5, 0.2, and 0.8]
  • 13. How Do Perceptrons Learn?
    ● Uses supervised training
    ● If the output is not correct, the weights are adjusted according to the formula:
      ■ wnew = wold + α(desired − output) · input, where α is the learning rate
    ● Worked example, with inputs (1, 0, 1), weights (0.5, 0.2, 0.8), output threshold 1.2, and α = 1:
      ■ 1(0.5) + 0(0.2) + 1(0.8) = 1.3, and 1.3 > 1.2, so the output is 1
      ■ Assume the output was supposed to be 0, so the weights are updated:
        w1new = 0.5 + 1·(0 − 1)·1 = −0.5
        w2new = 0.2 + 1·(0 − 1)·0 = 0.2
        w3new = 0.8 + 1·(0 − 1)·1 = −0.2
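A minimal sketch of this update rule in plain Python, reproducing the worked example on the slide above (the threshold and learning rate are the ones stated there):

    # Perceptron learning: w_new = w_old + alpha * (desired - output) * input
    inputs  = [1, 0, 1]
    weights = [0.5, 0.2, 0.8]
    threshold, alpha, desired = 1.2, 1.0, 0

    net = sum(i * w for i, w in zip(inputs, weights))   # 1.3
    output = 1 if net > threshold else 0                # 1.3 > 1.2, so output = 1

    weights = [w + alpha * (desired - output) * i for i, w in zip(inputs, weights)]
    print(weights)   # [-0.5, 0.2, -0.2] (up to floating-point rounding)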
  • 14. Multilayer Feedforward Networks
    ● The most common neural network
    ● An extension of the perceptron
      ► Multiple layers
        ■ The addition of one or more “hidden” layers in between the input and output layers
      ► The activation function is not simply a threshold
        ■ Usually a sigmoid function
      ► A general function approximator
        ■ Not limited to linear problems
    ● Information flows in one direction
      ► The outputs of one layer act as inputs to the next layer
  • 15. XOR Example
    [Figure: a trained 2–2–1 feedforward network computing XOR]
    ● Forward pass for the inputs (0, 1):
      ► H1: Net = 0(4.83) + 1(−4.83) − 2.82 = −7.65; Output = 1 / (1 + e^7.65) = 4.758 × 10^−4
      ► H2: Net = 0(−4.63) + 1(4.6) − 2.74 = 1.86; Output = 1 / (1 + e^−1.86) = 0.8652
      ► O: Net = 4.758 × 10^−4 (5.73) + 0.8652(5.83) − 2.86 = 2.187; Output = 1 / (1 + e^−2.187) = 0.8991 ≡ “1”
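A small sketch that reproduces the slide's forward pass with the weights and biases shown there (plain Python; the variable names are mine):

    import math

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Weights and biases read off the XOR slide, for the inputs (0, 1)
    i1, i2 = 0.0, 1.0
    h1 = logistic(i1 * 4.83    + i2 * (-4.83) - 2.82)   # ~4.758e-4
    h2 = logistic(i1 * (-4.63) + i2 * 4.6    - 2.74)    # ~0.8652
    o  = logistic(h1 * 5.73 + h2 * 5.83 - 2.86)         # ~0.8991, i.e. "1"
    print(h1, h2, o)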
  • 16. Backpropagation
    ● The most common method of obtaining the many weights in the network
    ● A form of supervised training
    ● The basic backpropagation algorithm is based on minimizing the error of the network using the derivatives of the error function
      ► Simple
      ► Slow
      ► Prone to local minima issues
  • 17. Backpropagation
    ● The most common measure of error is the mean square error: E = (target − output)²
    ● Partial derivatives of the error with respect to the weights:
      ► Output neurons (j = output neuron, i = neuron in the last hidden layer):
        let δj = f′(netj)(targetj − outputj)
        ∂E/∂wji = −outputi · δj
      ► Hidden neurons (j = hidden neuron, i = neuron in the previous layer, k = neuron in the next layer):
        let δj = f′(netj) Σk(δk · wkj)
        ∂E/∂wji = −outputi · δj
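A rough sketch of those delta formulas for a single weight in each case, reusing the logistic derivative f′(net) = f(net)(1 − f(net)) from earlier (plain Python; the tiny network values are made up purely for illustration):

    import math

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Illustrative values: hidden neuron i feeding output neuron j
    net_j, output_i, target_j = 0.4, 0.7, 1.0
    output_j = logistic(net_j)

    # Output neuron: delta_j = f'(net_j) * (target_j - output_j)
    delta_j = output_j * (1.0 - output_j) * (target_j - output_j)
    dE_dw_ji = -output_i * delta_j            # dE/dw_ji = -output_i * delta_j

    # Hidden neuron h: delta_h = f'(net_h) * sum over next layer of delta_k * w_kh
    net_h, w_jh = 0.2, 0.5                    # w_jh: weight from h to the output j
    out_h = logistic(net_h)
    delta_h = out_h * (1.0 - out_h) * (delta_j * w_jh)
    print(delta_j, dE_dw_ji, delta_h)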
  • 18. Backpropagation
    ● Calculation of the derivatives flows backwards through the network, hence the name backpropagation
    ● These derivatives point in the direction of the maximum increase of the error function
    ● A small step (learning rate) in the opposite direction will result in the maximum decrease of the (local) error function:
      wnew = wold − α ∂E/∂wold, where α is the learning rate
  • 19. Backpropagation
    ● The learning rate is important
      ► Too small
        ■ Convergence is extremely slow
      ► Too large
        ■ May not converge
    ● Momentum
      ► Tends to aid convergence
      ► Applies smoothed averaging to the change in weights:
        ∆new = β∆old − α ∂E/∂wold, where β is the momentum coefficient
        wnew = wold + ∆new
      ► Acts as a low-pass filter by reducing rapid fluctuations
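A minimal sketch of the plain gradient step and the momentum variant from the two slides above (plain Python; the weight, gradient, learning rate, and momentum coefficient are arbitrary illustration values):

    # Plain step:     w_new = w_old - alpha * dE/dw_old
    # With momentum:  d_new = beta * d_old - alpha * dE/dw_old;  w_new = w_old + d_new
    alpha, beta = 0.5, 0.9       # learning rate, momentum coefficient
    w, d = 0.8, 0.0              # weight and its previous change
    dE_dw = 0.3                  # gradient of the error w.r.t. this weight

    w_plain = w - alpha * dE_dw  # 0.65

    d = beta * d - alpha * dE_dw
    w = w + d
    print(w_plain, w)            # identical on the first step, since d_old = 0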
  • 20. Local Minima
    ● Training is essentially minimizing the mean square error function
      ► The key problem is avoiding local minima
      ► Traditional techniques for avoiding local minima:
        ■ Simulated annealing: perturb the weights in progressively smaller amounts
        ■ Genetic algorithms: use the weights as chromosomes and apply natural selection, mating, and mutations to these chromosomes
  • 21. Counterpropagation (CP) Networks
    ● Another multilayer feedforward network
    ● Up to 100 times faster than backpropagation
    ● Not as general as backpropagation
    ● Made up of three layers:
      ► Input
      ► Kohonen
      ► Grossberg (Output)
    [Figure: inputs flowing through the input layer, Kohonen layer, and Grossberg layer to the outputs]
  • 22. How Do They Work?
    ● Kohonen layer:
      ► Neurons in the Kohonen layer sum all of the weighted inputs received
      ► The neuron with the largest sum outputs a 1 and the other neurons output 0
    ● Grossberg layer:
      ► Each Grossberg neuron merely outputs the weight of the connection between itself and the one active Kohonen neuron
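A small sketch of that forward pass for a toy counterpropagation network (plain Python; the layer sizes and all weight values here are assumptions made up for illustration):

    # Kohonen layer: winner-take-all; Grossberg layer: each output returns its
    # weight to the single active Kohonen neuron.
    inputs = [0.6, 0.8]

    kohonen_w   = [[0.9, 0.1],      # kohonen_w[k][i]: input i -> Kohonen neuron k
                   [0.2, 0.8]]
    grossberg_w = [[0.3, 0.7]]      # grossberg_w[o][k]: Kohonen neuron k -> output o

    sums = [sum(w * x for w, x in zip(row, inputs)) for row in kohonen_w]
    winner = sums.index(max(sums))                  # only this neuron outputs 1
    outputs = [row[winner] for row in grossberg_w]  # each output copies one weight
    print(winner, outputs)                          # 1 [0.7]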
  • 23. Why Two Different Types of Layers?
    ● A more accurate representation of biological neural networks
    ● Each layer has its own distinct purpose:
      ► The Kohonen layer separates inputs into separate classes
        ■ Inputs in the same class will turn on the same Kohonen neuron
      ► The Grossberg layer adjusts weights to obtain acceptable outputs for each class
  • 24. Training a CP Network
    ● Training the Kohonen layer
      ► Uses unsupervised training
      ► Input vectors are often normalized
      ► The one active Kohonen neuron updates its weights according to the formula:
        wnew = wold + α(input − wold), where α is the learning rate
        ■ The weights of the connections are being modified to more closely match the values of the inputs
        ■ At the end of training, the weights will approximate the average value of the inputs in that class
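A minimal sketch of that Kohonen update for the winning neuron (plain Python; the weight vector, input vector, and learning rate are illustrative values only):

    # Kohonen update for the winner: w_new = w_old + alpha * (input - w_old)
    alpha = 0.3
    winner_weights = [0.9, 0.1]
    sample = [0.6, 0.8]

    winner_weights = [w + alpha * (x - w) for w, x in zip(winner_weights, sample)]
    print(winner_weights)   # each weight moves 30% of the way toward the input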
  • 25. Training a CP Network
    ● Training the Grossberg layer
      ► Uses supervised training
      ► The weight update algorithm is similar to that used in backpropagation
  • 26. Hidden Layers and Neurons
    ● For most problems, one hidden layer is sufficient
    ● Two layers are required when the function is discontinuous
    ● The number of neurons is very important:
      ► Too few
        ■ Underfit the data – the NN can't learn the details
      ► Too many
        ■ Overfit the data – the NN learns the insignificant details
      ► Start small and increase the number until satisfactory results are obtained
  • 27. Overfitting
    [Figure: training and test samples fitted by a well-fit curve versus an overfit curve]
  • 28. How is the Training Set Chosen?
    ● Overfitting can also occur if a “good” training set is not chosen
    ● What constitutes a “good” training set?
      ► Samples must represent the general population
      ► Samples must contain members of each class
      ► Samples in each class must contain a wide range of variations or noise effects
  • 29. Size of the Training Set
    ● The size of the training set is related to the number of hidden neurons
      ► E.g. 10 inputs, 5 hidden neurons, 2 outputs:
        11(5) + 6(2) = 67 weights (variables), counting one bias per neuron
      ► If only 10 training samples are used to determine these weights, the network will end up being overfit
        ■ Any solution found will be specific to the 10 training samples
        ■ Analogous to having 10 equations and 67 unknowns – you can come up with a specific solution, but you can't find the general solution with the given information
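A tiny sketch of that weight count for any single-hidden-layer network (plain Python; the helper name is mine):

    def weight_count(n_in, n_hidden, n_out):
        # One bias per neuron: each hidden neuron has n_in weights + 1 bias,
        # each output neuron has n_hidden weights + 1 bias.
        return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out

    print(weight_count(10, 5, 2))   # 67, as in the example above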
  • 30. Training and Verification
    ● The set of all known samples is broken into two orthogonal (independent) sets:
      ► Training set
        ■ A group of samples used to train the neural network
      ► Testing set
        ■ A group of samples used to test the performance of the neural network
        ■ Used to estimate the error rate
    [Figure: the known samples split into a training set and a testing set]
  • 31. Verification
    ● Provides an unbiased test of the quality of the network
    ● A common error is to “test” the neural network using the same samples that were used to train it
      ► The network was optimized on these samples, and will obviously perform well on them
      ► Doesn't give any indication as to how well the network will be able to classify inputs that weren't in the training set
  • 32. Verification
    ● Various metrics can be used to grade the performance of the neural network based upon the results of the testing set
      ► Mean square error, SNR, etc.
    ● Resampling is an alternative method of estimating the error rate of the neural network
      ► The basic idea is to iterate the training and testing procedures multiple times
      ► Two main techniques are used:
        ■ Cross-validation
        ■ Bootstrapping
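As a rough sketch of the cross-validation idea mentioned above (plain Python; the fold count, the seed, and the commented-out train_nn/evaluate placeholders are assumptions, not details from the slides):

    import random

    def k_fold_splits(samples, k=5, seed=0):
        # Shuffle once, then rotate each fold into the role of the testing set
        data = samples[:]
        random.Random(seed).shuffle(data)
        folds = [data[i::k] for i in range(k)]
        for i in range(k):
            testing = folds[i]
            training = [s for j, fold in enumerate(folds) if j != i for s in fold]
            yield training, testing

    # Usage idea: average the error estimate over the k train/test iterations
    # errors = [evaluate(train_nn(tr), te) for tr, te in k_fold_splits(all_samples)]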
  • 33. Results and Discussion
    ● A simple toy problem was used to test the operation of a perceptron
    ● Provided the perceptron with 5 pieces of information about a face – the individual's hair, eye, nose, mouth, and ear type
      ► Each piece of information could take a value of +1 or −1
        ■ +1 indicates a “girl” feature
        ■ −1 indicates a “guy” feature
    ● The individual was to be classified as a girl if the face had more “girl” features than “guy” features, and a boy otherwise
  • 34. Results and Discussion
    ● Constructed a perceptron with 5 inputs and 1 output
    [Figure: face feature input values feed the 5 input neurons; the output neuron's value indicates boy or girl]
    ● Trained the perceptron with 24 out of the 32 possible inputs over 1000 epochs
    ● The perceptron was able to classify the faces that were not in the training set
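A rough reconstruction of this experiment under the setup stated above (plain Python; the zero threshold with a trainable bias, the learning rate, and the particular 24/8 split are my assumptions, not details given on the slides):

    import itertools, random

    # All 32 faces: five features, each +1 ("girl") or -1 ("guy")
    faces = list(itertools.product([1, -1], repeat=5))
    label = lambda f: 1 if sum(f) > 0 else 0          # more girl features -> girl

    random.seed(1)
    random.shuffle(faces)
    train, test = faces[:24], faces[24:]

    w, bias, alpha = [0.0] * 5, 0.0, 0.1
    for _ in range(1000):                             # 1000 epochs, as on the slide
        for f in train:
            out = 1 if sum(wi * xi for wi, xi in zip(w, f)) + bias > 0 else 0
            err = label(f) - out
            w = [wi + alpha * err * xi for wi, xi in zip(w, f)]
            bias += alpha * err

    correct = sum((1 if sum(wi * xi for wi, xi in zip(w, f)) + bias > 0 else 0) == label(f)
                  for f in test)
    print(correct, "of", len(test), "held-out faces classified correctly")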
  • 35. Results and Discussion
    ● A number of toy problems were tested on multilayer feedforward NNs with a single hidden layer and backpropagation:
      ► Inverter
        ■ The NN was trained to simply output 0.1 when given a “1” and 0.9 when given a “0”; a demonstration of the NN's ability to memorize
        ■ 1 input, 1 hidden neuron, 1 output
        ■ With a learning rate of 0.5 and no momentum, it took about 3,500 epochs for sufficient training
        ■ Including a momentum coefficient of 0.9 reduced the number of epochs required to about 250
  • 36. Results and Discussion
      ► Inverter (continued)
        ■ Increasing the learning rate decreased the training time without hampering convergence for this simple example
        ■ Increasing the epoch size (the number of samples per epoch) decreased the number of epochs required and seemed to aid in convergence (reduced fluctuations)
        ■ Increasing the number of hidden neurons decreased the number of epochs required; it allowed the NN to better memorize the training set – the goal of this toy problem – but is not recommended for “real” problems, since the NN loses its ability to generalize
  • 37. Results and Discussion
      ► AND gate
        ■ 2 inputs, 2 hidden neurons, 1 output
        ■ About 2,500 epochs were required when using momentum
      ► XOR gate
        ■ Same as the AND gate
      ► 3-to-8 decoder
        ■ 3 inputs, 3 hidden neurons, 8 outputs
        ■ About 5,000 epochs were required when using momentum
  • 38. Results and Discussion
      ► Absolute sine function approximator (|sin(x)|)
        ■ A demonstration of the NN's ability to learn the desired function, |sin(x)|, and to generalize
        ■ 1 input, 5 hidden neurons, 1 output
        ■ The NN was trained with samples between −π/2 and π/2; the inputs were rounded to one decimal place, and the desired targets were scaled to between 0.1 and 0.9
        ■ The test data contained samples in between the training samples (i.e. more than 1 decimal place); the outputs were translated back to between 0 and 1
        ■ About 50,000 epochs were required with momentum
        ■ The function is not smooth at 0 (only piecewise smooth)
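A small sketch of how that training data could be prepared under the stated setup (plain Python; the linear mapping into [0.1, 0.9] is my reading of “scaled to between 0.1 and 0.9”, since |sin(x)| already lies in [0, 1]):

    import math

    # Training inputs: one-decimal-place samples spanning [-pi/2, pi/2]
    xs = [k / 10 for k in range(-15, 16)]            # -1.5, -1.4, ..., 1.5

    # Targets: |sin(x)| mapped from [0, 1] into [0.1, 0.9] for the sigmoid output
    targets = [0.1 + 0.8 * abs(math.sin(x)) for x in xs]

    # After training, network outputs are translated back to [0, 1]
    unscale = lambda y: (y - 0.1) / 0.8
    print(xs[0], targets[0], round(unscale(targets[0]), 4))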
  • 39. Results and Discussion
      ► Gaussian function approximator (e^(−x²))
        ■ 1 input, 2 hidden neurons, 1 output
        ■ Similar to the absolute sine function approximator, except that the domain was changed to between −3 and 3
        ■ About 10,000 epochs were required with momentum
        ■ A smooth function
  • 40. Results and Discussion
      ► Primality tester
        ■ 7 inputs, 8 hidden neurons, 1 output
        ■ The input to the NN was a binary number
        ■ The NN was trained to output 0.9 if the number was prime and 0.1 if the number was composite; a classification and memorization test
        ■ The inputs were restricted to between 0 and 100
        ■ About 50,000 epochs were required for the NN to memorize the classifications for the training set; no attempts at generalization were made due to the complexity of the pattern of prime numbers
        ■ Some issues with local minima
  • 41. Results and Discussion
      ► Prime number generator
        ■ Provide the network with a seed, and a prime number of the same order should be returned
        ■ 7 inputs, 4 hidden neurons, 7 outputs
        ■ Both the inputs and outputs were binary numbers
        ■ The network was trained as an autoassociative network: prime numbers from 0 to 100 were presented to the network and it was requested that the network echo the prime numbers; the intent was to have the network output the closest prime number when given a composite number
        ■ After one million epochs, the network was successfully able to produce prime numbers for about 85–90% of the numbers between 0 and 100
        ■ Using Gray code instead of binary did not improve the results
        ■ Perhaps it needs a second hidden layer, or some heuristics to reduce local minima issues
  • 42. Conclusion
    ● The toy examples confirmed the basic operation of neural networks and also demonstrated their ability to learn the desired function and generalize when needed
    ● The ability of neural networks to learn and generalize, in addition to their wide range of applicability, makes them very powerful tools
  • 43. Questions and Comments
  • 44. Acknowledgements
    ● Natural Sciences and Engineering Research Council (NSERC)
    ● University of Manitoba