Neural Networks




Machine Learning; Thu May 21, 2008
Motivation
Problem: With our linear methods, we can train the weights but not the basis functions:

[Figure: a linear model, with callouts marking the activator, a trainable weight, and a fixed basis function]

Can we learn the basis functions?

Maybe reuse ideas from our previous technology? Adjustable weight vectors?
Historical notes
This motivation is not how the technology came
about!

Neural networks were developed in an attempt to
achieve AI by modeling the brain (but artificial
neural networks have very little to do with biological
neural networks).
Feed-forward neural network
Two levels of “linear regression” (or classification):

[Figure: network diagram showing the hidden layer feeding into the output layer]
Example

[Figure: example fits, shown across several build slides; images lost in extraction]

tanh activation function for the 3 hidden units

Linear activation function for the output layer
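
To make the evaluation concrete, here is a minimal sketch of the forward pass for such a network: a two-layer net with tanh hidden units and a linear output. The code is illustrative, not from the slides; the names (feed_forward, W1, b1, ...) and shapes are assumptions.

import numpy as np

def feed_forward(x, W1, b1, W2, b2):
    # x: input vector (D,); W1, b1: hidden layer, shapes (M, D) and (M,);
    # W2, b2: output layer, shapes (K, M) and (K,)
    a = W1 @ x + b1        # hidden-layer activations
    z = np.tanh(a)         # hidden-unit outputs (the z_i below)
    y = W2 @ z + b2        # linear output activation (the y_k below)
    return z, y

# Example: 1 input, 3 tanh hidden units, 1 linear output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 1)), rng.normal(size=3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
z, y = feed_forward(np.array([0.5]), W1, b1, W2, b2)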
Network training
If we can give the network a probabilistic interpretation, we can use our “standard” estimation methods for training.

Regression problems: [equation lost in extraction]

Classification problems: [equation lost in extraction]
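
The two lost equations are almost certainly the standard maximum-likelihood pairings (a reconstruction, not recovered from the slides):

E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \lVert \mathbf{y}(\mathbf{x}_n, \mathbf{w}) - \mathbf{t}_n \rVert^2
\quad \text{(regression: Gaussian noise gives a sum-of-squares error)}

E(\mathbf{w}) = -\sum_{n=1}^{N} \bigl\{ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \bigr\}
\quad \text{(binary classification: Bernoulli likelihood gives the cross-entropy)}

In both cases, maximizing the likelihood is the same as minimizing E.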
Training by minimizing an error function.

Typically, we cannot minimize analytically but need numerical optimization algorithms.

Notice: There will typically be more than one minimum of the error function (due to symmetries).

Notice: At best, we can find local minima.
Parameter optimization
Numerical optimization is beyond the scope of this class – you can (should) get away with just using appropriate libraries such as GSL.

These can typically find (local) minima of general functions, but the most efficient algorithms also need the gradient of the function to minimize.
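
To illustrate the library route (SciPy's optimizer here, standing in for GSL; E, grad_E, and the toy quadratic are illustrative stand-ins – in practice both would come from back propagation, described next):

import numpy as np
from scipy.optimize import minimize

w_star = np.array([1.0, -2.0])      # toy "true" weights

def E(w):                           # stand-in error function
    return 0.5 * np.sum((w - w_star) ** 2)

def grad_E(w):                      # its gradient, which the optimizer exploits
    return w - w_star

result = minimize(E, x0=np.zeros(2), jac=grad_E, method="BFGS")
w_hat = result.x                    # a (local) minimizer of E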
Back propagation
Back propagation is an efficient algorithm for computing both the error function and the gradient.

For iid data:

[Equation lost in extraction; a callout, “We just focus on this guy”, marks the per-observation error term]
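
In standard notation the lost equation is the i.i.d. decomposition of the error into per-observation terms (a reconstruction, but the surrounding text pins it down):

E(\mathbf{w}) = \sum_{n=1}^{N} E_n(\mathbf{w})

The callout's “guy” is the single term E_n(\mathbf{w}): it suffices to compute its value and gradient and sum over the data.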
Back propagation
After running the feed-forward algorithm, we know the values z_i and y_k.

We need to compute the δs (called errors).
Back propagation

The choice of error function and output activator typically makes it simple to deal with the output layer.
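
The formula itself was an image; for the standard pairings of error function and output activation (sum-of-squares with linear outputs, as in the example network, or cross-entropy with sigmoid/softmax outputs) the output errors take the same simple form:

\delta_k = y_k - t_k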
Back propagation

For hidden layers, we apply the chain rule.
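
In the usual notation (reconstructed, since the slide's formulas were images), with h the hidden activation function, a_j the input to hidden unit j, and z_i the output of the unit feeding weight w_{ji}:

\delta_j = h'(a_j) \sum_k w_{kj}\,\delta_k
\qquad
\frac{\partial E_n}{\partial w_{ji}} = \delta_j\, z_i

For tanh hidden units, h'(a_j) = 1 - z_j^2.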
Calculating error and gradient

Algorithm

1. Apply feed forward to calculate hidden and output variables and then the error.

2. Calculate output errors and propagate errors back to get the gradient.

3. Iterate ad lib.
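
Putting steps 1–3 together, here is a minimal end-to-end sketch (illustrative assumptions throughout: a one-hidden-layer regression network with tanh hidden units, linear output, sum-of-squares error, and plain gradient descent standing in for a library optimizer):

import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D regression data: noisy samples of sin(x)
X = rng.uniform(-3, 3, size=(50, 1))
T = np.sin(X) + 0.1 * rng.normal(size=X.shape)

M = 3                                       # hidden units
W1, b1 = rng.normal(size=(M, 1)), np.zeros(M)
W2, b2 = rng.normal(size=(1, M)), np.zeros(1)
eta = 0.01                                  # gradient-descent step size

for step in range(2000):                    # 3. iterate ad lib.
    gW1, gb1 = np.zeros_like(W1), np.zeros_like(b1)
    gW2, gb2 = np.zeros_like(W2), np.zeros_like(b2)
    error = 0.0
    for x, t in zip(X, T):
        # 1. Feed forward: hidden and output variables, then the error.
        z = np.tanh(W1 @ x + b1)
        y = W2 @ z + b2
        error += 0.5 * np.sum((y - t) ** 2)
        # 2. Output errors, propagated back to give the gradient.
        d_k = y - t                           # delta_k = y_k - t_k
        d_j = (1 - z ** 2) * (W2.T @ d_k)     # delta_j = h'(a_j) sum_k w_kj delta_k
        gW2 += np.outer(d_k, z); gb2 += d_k
        gW1 += np.outer(d_j, x); gb1 += d_j
    W1 -= eta * gW1; b1 -= eta * gb1          # gradient-descent update
    W2 -= eta * gW2; b2 -= eta * gb2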
Overfitting
The complexity of the network is determined by the hidden layer – and overfitting is still a problem.

Use e.g. held-out test datasets to detect and alleviate this problem.
Early stopping
...or just stop the training when overfitting starts to occur.

[Figure: error curves on training data and test data; images lost in extraction]
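
As a sketch of the bookkeeping this involves (train_epoch and test_error are hypothetical stand-ins for one training pass and a held-out-set evaluation; the "patience" counter is a common but illustrative refinement):

def train_epoch(w):
    # stand-in for one pass of gradient descent over the training data
    return w - 0.1 * (w - 1.0)

def test_error(w, epoch):
    # stand-in for the held-out error; worsens after epoch 50 ("overfitting")
    return (w - 1.0) ** 2 + 0.001 * max(0, epoch - 50)

w = 5.0
best_w, best_err = w, float("inf")
patience, bad = 10, 0
for epoch in range(1000):
    w = train_epoch(w)
    err = test_error(w, epoch)
    if err < best_err:                # still improving on held-out data
        best_w, best_err, bad = w, err, 0
    else:
        bad += 1
        if bad >= patience:           # held-out error stopped improving: stop
            break
w = best_w                            # keep the weights from the best epoch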
Summary
   ●   Neural networks as general regression/classification models
   ●   Feed-forward evaluation
   ●   Maximum-likelihood framework
   ●   Error back-propagation to obtain the gradient
