How does a neural network
work
… literally!
I mean, … really literally!
Don’t give me the formulas,
Don’t give me the Python code,
JUST GIVE ME THE NUMBERS!
Ok, here it is ….
[Diagram: image -> NN -> "HORIZON" or "NO HORIZON"]
Let’s start with an easy neural network to classify images.
[Diagram: every pixel of the image -> NN -> "HORIZON"]
Every pixel value gets fed into the network.
[Diagram: fully connected network classifying the image as "HORIZON"]
On Wikipedia you find images like these, which show that every pixel value is fed into every neuron.
Looks complicated! But the math is actually not that hard.
It’s just a lot of calculations! So let’s break it down.
[Diagram: a two-pixel image with values 1 and 0 -> NN -> "HORIZON"]
Let’s focus on images with two pixels only (for now).
[Diagram: the two pixel values 1 and 0 feeding a single neuron]
And let’s break it down to only one neuron! (for now)
For every input we have a wanted outcome; here it is [1,0] -> 1.
At first we do not know what outcome the neural net will calculate.
[Diagram: inputs 1 and 0 -> neuron -> ?, wanted output 1]
1
0
1
* -0.16
* 0.99
Random Weights
To start with the calculations, we set some random
weights for the different inputs.
These can be any numbers, but mostly one starts
in the range of -1 to 1.
With these we calculate our first inputs to the neuron.
Then we add up the results for every weight:
1 * -0.16 = -0.16
0 *  0.99 =  0.00
sum: -0.16
1 * -0.16 = -0.16
0 *  0.99 =  0.00
sum: -0.16 -> 0.46 -> rounds to 0  (wanted 1)
Then we do one “higher math” step (not really), and get out 0.46.
(Actually we just put -0.16 as x into the sigmoid function, which is explained on the next slide and might look complicated, but really isn’t.)
0.46 rounds to 0, which is wrong … (we wanted 1)
But that’s all there is to getting results from a neural network – that’s all there is in a neuron!
The “higher math”:
With the sum of the weighted inputs (-0.16) we call the sigmoid function: sigmoid(x). Whoohooo!
But what it does is actually pretty easy: it maps every input to a value between 0 and 1, for example:
sigmoid(-1000) = 0.0
sigmoid(-100)  = 0.0
sigmoid(-10)   = 0.00004
sigmoid(-5)    = 0.0066
sigmoid(-1)    = 0.2689
sigmoid(-0.75) = 0.3208
sigmoid(-0.25) = 0.4378
sigmoid(0.0)   = 0.5
sigmoid(0.25)  = 0.5621
sigmoid(0.75)  = 0.6791
sigmoid(1)     = 0.7310
sigmoid(5)     = 0.9933
sigmoid(10)    = 0.9999
sigmoid(100)   = 1.0
sigmoid(1000)  = 1.0
When we put -0.16 into the function we get 0.46 as a result. If we round this value we get 0 as the outcome of our neuron. This is wrong!
What to do? Change the weights!
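(Ok, you said no Python code – but if you do want a few lines after all, here is a minimal sketch of this forward step with numpy. Function and variable names are mine; it just reproduces the numbers above.)

import numpy as np

def sigmoid(x):
    # maps any input to a value between 0 and 1
    return 1.0 / (1.0 + np.exp(-x))

inputs  = np.array([1.0, 0.0])      # the two pixel values
weights = np.array([-0.16, 0.99])   # the random starting weights

weighted_sum = inputs @ weights     # 1*-0.16 + 0*0.99 = -0.16
output = sigmoid(weighted_sum)      # sigmoid(-0.16) ≈ 0.46 -> rounds to 0
print(weighted_sum, output)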
To correct the weights, we…
…calculate the output error: 1 - 0.46 = 0.54
…calculate the error per weight and input:
0.54 * 1 = 0.54
0.54 * 0 = 0.0
…and add it to the old weight:
-0.16 + 0.54 = 0.38
 0.99 + 0.0  = 0.99
New weights:
-0.16 -> 0.38
 0.99 -> 0.99
That’s called “Backpropagation” (because it goes backwards).
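(The same correction as a small Python sketch – illustrative only; the names are mine, not the author’s.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

inputs  = np.array([1.0, 0.0])
weights = np.array([-0.16, 0.99])
target  = 1.0

output = sigmoid(inputs @ weights)   # ≈ 0.46
error  = target - output             # ≈ 0.54
weights += error * inputs            # error per input, added to the old weight
print(weights)                       # ≈ [0.38, 0.99]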
Now we calculate again, and get the right output:
1 * 0.38 = 0.38
0 * 0.99 = 0.0
sum: 0.38 -> 0.59 -> rounds to 1  (wanted 1)
AND THAT’S IT!
Well… basically it really is.
It’s all about getting input values to the right output values by adjusting the weights.
But obviously neural nets have more than one neuron, and normally there is more than one input.
Let’s increase the inputs first!
Let’s assume there is another training input, [1,1], which we want to map to 0.
Now we have:
[1,0] -> 1
[1,1] -> 0
This has some influence on how we adjust the weights on the way back – have a look:
Round 1
Forward pass for both inputs:
[1,0]:  1 * -0.16 = -0.16,  0 * 0.99 = 0.00,  sum -0.16 -> 0.460  (wanted 1)
[1,1]:  1 * -0.16 = -0.16,  1 * 0.99 = 0.99,  sum  0.83 -> 0.696  (wanted 0)
Now it’s 2 values to add here: we do the same for both inputs; all that changes is how we sum up the new weights.
Errors:
1 - 0.460 =  0.54
0 - 0.696 = -0.696
Error per weight and input:
 0.54 * 1 =  0.54      -0.696 * 1 = -0.696
 0.54 * 0 =  0.0       -0.696 * 1 = -0.696
New weights:
-0.16 + 0.54 - 0.696 = -0.316
 0.99 + 0.0  - 0.696 = +0.293
-0.16 -> -0.316
 0.99 -> +0.293
We can write this update with respect to the errors in one line for each input weight:
-0.16 + (1 * 0.54 + 1 * -0.696) = -0.16 + 0.54 - 0.696 = -0.316
 0.99 + (0 * 0.54 + 1 * -0.696) =  0.99 + 0.00 - 0.696 = +0.293
-0.16 -> -0.316
 0.99 -> +0.293
(We will do this again later when we have many weights…)
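(A small Python sketch of this one-line-per-weight update over both training examples – again just an illustration of the numbers above, not the author’s script.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[1.0, 0.0],            # input [1,0], wanted 1
              [1.0, 1.0]])           # input [1,1], wanted 0
y = np.array([1.0, 0.0])
weights = np.array([-0.16, 0.99])

outputs = sigmoid(X @ weights)       # ≈ [0.460, 0.696]
errors  = y - outputs                # ≈ [0.54, -0.696]
weights += X.T @ errors              # each weight gets the sum of (input * error)
print(weights)                       # ≈ [-0.316, 0.293]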
Round 2
Forward pass with the new weights:
[1,0]:  1 * -0.316 = -0.316,  0 * 0.293 = 0.000,  sum -0.316 -> 0.421  (wanted 1)
[1,1]:  1 * -0.316 = -0.316,  1 * 0.293 = 0.293,  sum -0.022 -> 0.494  (wanted 0)
Errors:
1 - 0.421 =  0.578
0 - 0.494 = -0.494
Error per weight and input:
 0.578 * 1 =  0.578     -0.494 * 1 = -0.494
 0.578 * 0 =  0.00      -0.494 * 1 = -0.494
New weights:
-0.316 + 0.578 - 0.494 = -0.232
 0.293 + 0.0   - 0.494 = -0.200
-0.316 -> -0.232
+0.293 -> -0.200
Still not right. So we do another round.
Round 3
Forward pass:
[1,0]:  1 * -0.232 = -0.232,  0 * -0.200 =  0.000,  sum -0.232 -> 0.442  (wanted 1)
[1,1]:  1 * -0.232 = -0.232,  1 * -0.200 = -0.200,  sum -0.432 -> 0.393  (wanted 0)
New weights: -0.232 -> -0.067,  -0.200 -> -0.594
And another round. And another. And another. … Till it fits!
The weights keep moving, round after round (Rounds 4, 5, 6, …):
… -> 1.087, -0.934, 0.483, 0.340, 0.277, -1.238, 0.527, 0.304, 0.568, 0.276, 0.431, -1.515, 0.968, 0.021 …
Round 100: … -> 3.426 and -7.271
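(The whole “round after round” loop, sketched in Python with the same simple update – an illustration, not the author’s script; the exact values per round depend on rounding.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([1.0, 0.0])
weights = np.array([-0.16, 0.99])

for round_no in range(100):           # "and another round … till it fits"
    outputs = sigmoid(X @ weights)
    errors  = y - outputs
    weights += X.T @ errors

print(weights)                         # weight 1 ends up clearly positive, weight 2 clearly negative
print(np.round(sigmoid(X @ weights)))  # [1., 0.] – it fits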
[Chart: weight 1 (started at -0.16), weight 2 (started at 0.99), and the sum of errors over the rounds – the error converges to 0.0 (hopefully :-)]
Ok, I left out oooone little thing, which we need later (with more neurons and more layers):
we do not actually take the error values directly to correct the weights, but a weighted error.
Have a look at sig’(x):
sig(x)  = 1 / (1 + e^(-x))        – we used this to map results to values between 0 and 1
sig’(x) = sig(x) * (1 - sig(x))   – we use this to evaluate which errors are more important
If the NN is not sure whether 1 or 0 is the right answer (result near 0.5), sig’ is at its maximum: 0.25.
If it is sure (result near 0 or 1), sig’ maps to ~0.0.
So we multiply the error with the sigmoid derivative of the result, e.g.:
result = 0.46, error = 1 - 0.46 = 0.54
sigmoid derivative of the result: 0.248
value for correction: 0.54 * 0.248 = 0.134
This makes sure we give more importance to the errors where we are unsure about the answer.
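(A sketch of this derivative-weighted correction in Python; “delta” is my name for the correction value, following common usage, not necessarily the author’s.)

def sigmoid_prime_from_output(out):
    # derivative of the sigmoid, written in terms of the sigmoid's output
    return out * (1.0 - out)

result = 0.46                                          # output of the neuron
error  = 1.0 - result                                  # 0.54
delta  = error * sigmoid_prime_from_output(result)     # 0.54 * 0.248 ≈ 0.134
print(delta)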
Now the weights are corrected more cautiously:
1 * -0.16 = -0.16
0 *  0.99 =  0.0
sum: -0.16 -> 0.46  (wanted 1)
error: 1 - 0.46 = 0.54
sigmoid derivative of 0.46: 0.248
value for correction: 0.54 * 0.248 = 0.134
Error per weight and input:
0.134 * 1 = 0.134
0.134 * 0 = 0.0
New weights:
-0.16 + 0.134 = -0.025
 0.99 + 0.0   =  0.99
-0.16 -> -0.025
 0.99 ->  0.99
Corrected weights before (without the derivative): -0.16 -> 0.38, 0.99 -> 0.99.
Basically only this one number changes a bit.
(Compare this with slide 12: there we added the raw error 0.54 directly to the weights, giving -0.16 -> 0.38 and 0.99 -> 0.99.)
This was fun – let’s try another, more complicated example with our neuron: XOR.
XOR
in     out
0 0 -> 0
0 1 -> 1
1 0 -> 1
1 1 -> 0
Spoiler: it does not work… (with one neuron)
The sum of errors == 0! But…:
[0,0] -> out = 0.5 -> error = -0.5
[0,1] -> out = 0.5 -> error =  0.5
[1,0] -> out = 0.5 -> error =  0.5
[1,1] -> out = 0.5 -> error = -0.5
The weights are actually both 0! If you think about it, it’s pretty obvious: every weight has to end up the same, because a 1 or a 0 leads to a 0 or a 1 in the same number of cases.
For XOR to work we need more layers!
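(You can watch this happen with a few lines of Python – a sketch using the derivative-weighted update from the previous slides and the starting weights we used before; it is an illustration, not the author’s experiment.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])          # XOR
weights = np.array([-0.16, 0.99])

for _ in range(10000):
    out = sigmoid(X @ weights)
    weights += X.T @ ((y - out) * out * (1 - out))

print(weights)               # both weights drift towards 0
print(sigmoid(X @ weights))  # all four outputs end up stuck near 0.5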
We try it with three neurons in a first layer (layer 1) and one output neuron (layer 2).
Let’s look at only one input first: [0,1], wanted output 1.
The forward calculation for one input, [0,1] -> 1:
layer 0 (the inputs): 0 and 1
layer 1 weights: neuron 1 [0.81, -0.84], neuron 2 [-0.77, 0.88], neuron 3 [-0.79, 0.69]
layer 2 weights: [0.90, -0.63, 0.80]
Layer-1 outputs:
sig( 0*0.81  + 1*-0.84 ) = sig(-0.84) = 0.299
sig( 0*-0.77 + 1* 0.88 ) = sig( 0.88) = 0.707
sig( 0*-0.79 + 1* 0.69 ) = sig( 0.69) = 0.667
Layer-2 output:
sig( 0.299*0.90 + 0.707*-0.63 + 0.667*0.80 ) = sig(0.353) = 0.588
The output rounds to 1 – that’s good! Let’s check the other inputs.
Note that in layer 2 we no longer have 0 or 1 as inputs, but fractions!
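(The same two-layer forward pass in Python/numpy – the weight matrices below are just the slide’s numbers typed in; the layout and the layer names are mine.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W1 = np.array([[ 0.81, -0.77, -0.79],    # weights from input 1 to the three layer-1 neurons
               [-0.84,  0.88,  0.69]])   # weights from input 2
W2 = np.array([ 0.90, -0.63,  0.80])     # weights from layer 1 to the layer-2 (output) neuron

x  = np.array([0.0, 1.0])                # the input [0, 1]
l1 = sigmoid(x @ W1)                     # ≈ [0.299, 0.707, 0.667]
l2 = sigmoid(l1 @ W2)                    # ≈ 0.588 -> rounds to 1
print(l1, l2)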
The forward calculation for ALL FOUR inputs (same weights):
Layer-1 outputs per input [0,0], [0,1], [1,0], [1,1]:
neuron 1 (weights  0.81, -0.84): 0.500, 0.299, 0.692, 0.491
neuron 2 (weights -0.77,  0.88): 0.500, 0.707, 0.315, 0.527
neuron 3 (weights -0.79,  0.69): 0.500, 0.667, 0.310, 0.474
Layer-2 outputs (weights 0.90, -0.63, 0.80): 0.630, 0.588, 0.662, 0.619   (wanted: 0, 1, 1, 0)
First results for all inputs – not so good… we have to update the weights, layer by layer.
Backward propagation:
Now we update the old weights with respect to the errors of all inputs – first in layer 2.
Results of the forward pass:
0,0 -> r1 = 0.630
0,1 -> r2 = 0.588
1,0 -> r3 = 0.662
1,1 -> r4 = 0.619
l2_error:
0 - r1 = -0.630
1 - r2 =  0.411
1 - r3 =  0.337
0 - r4 = -0.619
l2_delta (each error weighted with the sigmoid derivative of its result):
sig’(0.63) * -0.63 = -0.146
sig’(0.58) *  0.41 =  0.099
sig’(0.66) *  0.34 =  0.075
sig’(0.62) * -0.62 = -0.14
New layer-2 weights: old weight += sum over all four inputs of (layer-1 output * l2_delta):
 0.90 + (0.500*-0.146 + 0.299*0.099 + 0.692*0.075 + 0.491*-0.14) =  0.84
-0.63 + (0.500*-0.146 + 0.707*0.099 + 0.315*0.075 + 0.527*-0.14) = -0.69
 0.80 + (0.500*-0.146 + 0.667*0.099 + 0.310*0.075 + 0.474*-0.14) =  0.75
Same as before (but with more values…).
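(The layer-2 step as a Python sketch, using the slide’s l2_error/l2_delta names; illustrative only.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1 = np.array([[ 0.81, -0.77, -0.79],
               [-0.84,  0.88,  0.69]])
W2 = np.array([[0.90], [-0.63], [0.80]])

l1 = sigmoid(X @ W1)                    # the 4x3 table of layer-1 outputs above
l2 = sigmoid(l1 @ W2)                   # ≈ [0.630, 0.588, 0.662, 0.619]

l2_error = y - l2                       # ≈ [-0.630, 0.411, 0.337, -0.619]
l2_delta = l2_error * l2 * (1 - l2)     # ≈ [-0.146, 0.099, 0.075, -0.14]
W2 += l1.T @ l2_delta                   # new layer-2 weights ≈ [0.84, -0.69, 0.75]
print(W2)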
And now the weights in layer 1! (Neuron 1 in layer 1 as the example.)
The error in neuron 1 is a fraction of the error in the output neuron (l2). Therefore we take the delta-error of layer 2; neuron 1 contributed to this error with its weight 0.90:
l1_error = l2_delta * weight of neuron 1 (to l2)
0,0 -> -0.146*0.90 = -0.132
0,1 ->  0.099*0.90 =  0.090
1,0 ->  0.075*0.90 =  0.068
1,1 -> -0.140*0.90 = -0.131
And of this we take the delta again -> l1_delta:
sig’(0.500) * -0.132 = -0.033
sig’(0.299) *  0.090 =  0.018
sig’(0.692) *  0.068 =  0.014
sig’(0.491) * -0.131 = -0.033
Again as before (but with more values…), the new weights of neuron 1 (old weight + sum over inputs of input value * l1_delta):
 0.81 + (0*-0.033 + 0*0.018 + 1*0.014 + 1*-0.033) =  0.795
-0.84 + (0*-0.033 + 1*0.018 + 0*0.014 + 1*-0.033) = -0.861
 0.81 ->  0.795
-0.84 -> -0.861
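(Putting it all together: a sketch of the full training loop in Python/numpy. The initialisation scheme below – uniform weights in -1..1 via 2*random-1 – is my assumption; whether np.random.seed(245) then reproduces exactly the slide’s starting weights depends on how the author drew them.)

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

np.random.seed(245)                        # the seed mentioned on the next slide
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])
W1 = 2 * np.random.random((2, 3)) - 1      # assumed init: uniform in -1..1
W2 = 2 * np.random.random((3, 1)) - 1

for _ in range(10000):
    l1 = sigmoid(X @ W1)                           # forward, layer 1
    l2 = sigmoid(l1 @ W2)                          # forward, layer 2
    l2_delta = (y - l2) * l2 * (1 - l2)            # output error, derivative-weighted
    l1_delta = (l2_delta @ W2.T) * l1 * (1 - l1)   # error handed back through the layer-2 weights
    W2 += l1.T @ l2_delta                          # update layer 2 …
    W1 += X.T @ l1_delta                           # … then layer 1

print(l2)                                  # outputs close to [0, 1, 1, 0] for most seeds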
[Chart: the error curve – it depends strongly on the start weights, but it converges pretty well (mostly…)]
(For Python users: the starting weights have been generated with np.random.seed(245).)
Others: [error curves for np.random.seed(11), np.random.seed(44), np.random.seed(101), np.random.seed(1023)]
WOOPS!
Output:
0,0 -> [ 0.0214737 ]
0,1 -> [ 0.98097769]
1,0 -> [ 0.9810124 ]
1,1 -> [ 0.50051869] ? – probably runs into a local minimum…
That’s it so far – hope you had a good ride.
Obviously there are more questions coming up after this session, which I might dig into later:
• Why are the error curves so different?
• How to avoid a wrong output like in the last error curve?
• How many iterations does a neural network have to do to learn?
• How many neurons to take?
• How deep should the layers be (deep learning)?
• Are there other functions to use instead of the sigmoid function?
• Different nets working together
• etc etc…