A New Learning Method for Single Layer Neural Networks Based on a Regularized Cost Function Juan A. Suárez-Romero Óscar Fontenla-Romero Bertha Guijarro-Berdiñas Amparo Alonso-Betanzos Laboratory for Research and Development in Artificial Intelligence Department of Computer Science, University of A Coruña, Spain
Outline: single layer neural networks; supervised learning with regularization; an alternative loss function with an analytical solution; experimental results; conclusions and future work
Single layer neural network
Single layer neural network
Cost function: MSE + regularization term (weight decay). With non-linear neural functions, the cost is not guaranteed to have a unique minimum (local minima)
Alternative loss function
Alternative loss function
Alternative cost function: alternative MSE + regularization term (weight decay)
Alternative cost function
Alternative cost function: a system of linear equations (variables, coefficients, independent terms)
Experimental results
Intrusion Detection problem
Intrusion Detection problem
Intrusion Detection problem (figure: test error vs. training set size; the error stabilizes at 400 samples with regularization and at 700 without)
Box-Jenkins problem
Box-Jenkins problem
Box-Jenkins problem
Box-Jenkins problem
Box-Jenkins problem (figure: with noise level gamma 0.5 the error stabilizes at 198 samples; with gamma 1 it stabilizes at 189 samples with regularization and 207 without)
Conclusions and Future Work
A New Learning Method for Single Layer Neural Networks Based on a Regularized Cost Function Juan A. Suárez-Romero Óscar Fontenla-Romero Bertha Guijarro-Berdiñas Amparo Alonso-Betanzos Laboratory for Research and Development in Artificial Intelligence Department of Computer Science, University of A Coruña, Spain. Thank you for your attention!


Editor's Notes

  1. Thank you very much. I'm going to present a new learning method for single layer neural networks based on a regularized cost function.
  2. Let me first outline the main points of this presentation. I'm going to start with a short introduction to single layer neural networks. Next I'll explain supervised learning with regularization in this kind of network, and show an alternative loss function that allows an analytical solution to be obtained. Finally, I'll show the experimental results, the conclusions and the future work.
  3. Our supervised learning algorithm is applied to a single layer neural network with I inputs and J outputs. To train the network we have S examples. Generally, the activation functions used are non-linear. Finally, as can be seen, in this kind of network the outputs are independent of each other, because the set of weights related to each output is independent of the others.
  4. So, in order to simplify the explanation, we'll work with one output. PRESS NEXT KEY. The real output of the network is obtained through a non-linear function whose input is the weighted sum of the inputs plus the bias. If the error function used is the MSE, as in our case, then the goal is to obtain the values of the weights and the bias that minimize the MSE between the real and the desired outputs.
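In symbols, and reconstructing the notation from the talk (the slide formulas are not preserved in this transcript, so this rendering is my best guess):

```latex
y_s = f\Bigl(\sum_{i=1}^{I} w_i x_{is} + b\Bigr), \qquad
\mathrm{MSE} = \frac{1}{S} \sum_{s=1}^{S} \bigl(d_s - y_s\bigr)^2
```

where $x_{is}$ is input $i$ of example $s$, $d_s$ the desired output, and $f$ the non-linear activation function.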
  5. Adding a regularization term to the cost function, our goal is to minimize a cost function with two terms. PRESS NEXT KEY. The first term is the loss function, here the MSE, which is the squared difference between the desired output and the real output. PRESS NEXT KEY. The second term is the regularization term, weighted by the regularization parameter alpha. In our case the regularization term used is weight decay, which tries to smooth the obtained curve. To minimize this cost function, we can differentiate both terms with respect to the weights and the bias and equate the derivatives to zero. PRESS NEXT KEY. The problem is that, in the first term, the weights are inside the non-linear function, so the cost is not guaranteed to have a unique minimum. Moreover, the minima cannot be obtained with an analytical method, only with an iterative one.
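Putting both terms together, the regularized cost presumably takes the standard weight-decay form (again my reconstruction, not the slide's exact notation):

```latex
C(\mathbf{w}, b) = \frac{1}{S} \sum_{s=1}^{S}
  \bigl(d_s - f(\mathbf{w}^{\top}\mathbf{x}_s + b)\bigr)^2
  + \alpha \sum_{i=1}^{I} w_i^2
```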
  6. In order to solve this problem, we present an alternative loss function that is based on the following theorem. READ THE THEOREM BRIEFLY.
  7. Roughly speaking, the idea is that minimizing the error at the output is equivalent to minimizing the error before the non-linear function, weighted by a factor.
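The theorem itself is not reproduced in this transcript; based on the description, the equivalence is plausibly of the following form, where $\bar d_s = f^{-1}(d_s)$ is the desired output pulled back through the activation and $z_s = \mathbf{w}^{\top}\mathbf{x}_s + b$ is the pre-activation (treat this as my paraphrase, not the authors' statement):

```latex
\sum_{s=1}^{S} \bigl(f(z_s) - d_s\bigr)^2 \;\approx\;
\sum_{s=1}^{S} f'(\bar d_s)^2 \bigl(z_s - \bar d_s\bigr)^2
```

The weighting factor $f'(\bar d_s)^2$ accounts for how strongly the non-linearity stretches errors near each desired output.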
  8. Now, applying the theorem, we have the new cost function. PRESS NEXT KEY. The alternative loss function is the MSE, but measured before the non-linear functions. PRESS NEXT KEY. The regularization term is the same as in the previous cost function. Note that the weights and the bias are now outside the non-linear function.
  9. To minimize the new cost function we differentiate both terms with respect to the weights and the bias and equate the partial derivatives to zero, obtaining the equations shown on the slide.
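Those equations are lost with the slide images; under the alternative cost sketched above they would presumably read (my reconstruction, with constant factors absorbed):

```latex
\frac{\partial C}{\partial w_i}:\;
\sum_{s=1}^{S} f'(\bar d_s)^2 \bigl(\mathbf{w}^{\top}\mathbf{x}_s + b - \bar d_s\bigr)\, x_{is} + \alpha\, w_i = 0,
\qquad i = 1, \dots, I
\\[4pt]
\frac{\partial C}{\partial b}:\;
\sum_{s=1}^{S} f'(\bar d_s)^2 \bigl(\mathbf{w}^{\top}\mathbf{x}_s + b - \bar d_s\bigr) = 0
```

which is linear in the I weights and the bias.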
  10. We can rewrite the previous system as a new system of (I+1) x (I+1) linear equations, where PRESS NEXT KEY we have the variables, which are the weights and the bias, PRESS NEXT KEY the coefficients, PRESS NEXT KEY and the independent terms. PRESS NEXT KEY. So we can use an analytical method to solve this system of equations, obtaining the optimal weights and bias. This means that training is very fast, with a low computational cost. Moreover, this system of equations has a unique minimum, except for degenerate systems. Finally, we can do incremental learning, and even parallel learning, where the training process is divided among several distributed neural networks and the results are merged to obtain the global training. In both cases, only the coefficient matrix and the independent-term vector must be stored.
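As an illustration only, and not the authors' code: a minimal Python sketch of how such a solver could look, assuming a logistic activation, the weighted pre-activation loss reconstructed above, and that the bias is excluded from the weight-decay penalty (that last detail is a guess). The class and method names are mine.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_inv(d):
    # f^{-1}(d) for the logistic function; d must lie strictly in (0, 1)
    return np.log(d / (1.0 - d))

def logistic_deriv(z):
    s = logistic(z)
    return s * (1.0 - s)

class RegularizedSingleLayer:
    """Accumulates the (I+1)x(I+1) coefficient matrix and the independent-term
    vector of the linear system; keeping only these two objects between batches
    is what makes incremental (and distributed) learning possible."""

    def __init__(self, n_inputs, alpha):
        self.alpha = alpha
        self.A = np.zeros((n_inputs + 1, n_inputs + 1))  # coefficients
        self.c = np.zeros(n_inputs + 1)                  # independent terms

    def partial_fit(self, X, d):
        # X: (S, I) inputs; d: (S,) desired outputs, strictly in (0, 1)
        d_bar = logistic_inv(d)             # desired pre-activations
        q = logistic_deriv(d_bar) ** 2      # weighting factors f'(d_bar)^2
        Xa = np.hstack([np.ones((len(X), 1)), X])  # prepend a bias column
        self.A += (Xa * q[:, None]).T @ Xa
        self.c += Xa.T @ (q * d_bar)

    def solve(self):
        # Weight decay: penalize the weights; the bias (component 0) is spared.
        reg = self.alpha * np.eye(len(self.c))
        reg[0, 0] = 0.0
        wb = np.linalg.solve(self.A + reg, self.c)
        return wb[0], wb[1:]                # bias, weights

    def predict(self, X):
        bias, w = self.solve()
        return logistic(X @ w + bias)
```

Incremental learning is then repeated calls to partial_fit, and distributed learning amounts to summing the A matrices and c vectors computed on separate data partitions before calling solve(), which matches the merging step described above.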
  11. In order to test our algorithm, we have applied it to a classification problem and to a regression problem. PRESS NEXT KEY. In both cases, we have used the logistic function as the neural function. PRESS NEXT KEY. The parameter alpha has been constrained to the interval [0, 1].
  12. The first problem, a classification problem, has been extracted from the KDD Cup 99 competition. Each sample summarizes a connection between two hosts and is formed by 41 inputs. The goal is to classify each sample into two classes: attack or normal connection. We have 30,000 samples for training and almost 5,000 for testing.
  13. In order to study the influence of the training set size and the regularization parameter, we have generated several training sets. To do this, we have generated an initial training set formed by 100 samples; each new training set is formed by adding 100 new samples to the previous set, up to 2500 samples. In this way we have 25 training sets. For each training set, several neural networks have been trained with different alphas, from 0 to 1 in steps of 0.005. The whole process has been repeated 12 times to obtain a better estimation of the true error. Finally, the regularization parameter that provides the minimum test classification error is chosen.
  14. As can be seen in the figure, using regularization produces better results in all cases than not using it, mainly for small training set sizes. To check that this difference is really statistically significant, we have applied a statistical test, which confirms this fact. PRESS NEXT KEY. Also, only 400 samples are needed to stabilize the error when using regularization, while 700 samples are needed without it.
  15. The other problem is a regression problem, specifically the Box-Jenkins problem. It consists of estimating the concentration of CO2 in a gas furnace at a given time instant from the 4 previous concentrations and the 6 previous methane flow rates.
  16. As in the previous problem, we have generated several training set sizes. Initially we performed a 10-fold cross validation, using 261 samples for training and 29 for testing. In each validation round, several training sets have been generated, from 9 to 261 examples in steps of 9 samples, using the same process as for the intrusion detection problem. For each training set, several neural networks have been trained and tested, varying alpha from 0 to 1 in steps of 0.001. To obtain a better estimation of the true error, mainly with small training sets, we have repeated the whole process 10 times. Finally, the alpha that produces the minimum normalized MSE has been chosen.
  17. The results are shown in the figure. Though it may seem that using regularization is worse than not using it, statistically there is no difference, except for small training sets.
  18. In this case the neural network performs very well, and using regularization doesn't enhance the results. In order to check the generalization capability of regularization in the presence of noisy data, we have added two levels of Gaussian noise: one with a standard deviation that is half the standard deviation of the original time series (gamma 0.5), and the other with the same standard deviation (gamma 1).
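The noise-injection scheme, as I read it, amounts to the following (a sketch with a synthetic stand-in series, since the Box-Jenkins data file itself is not part of this transcript):

```python
import numpy as np

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 20.0, 290))  # placeholder for the CO2 series

# Add zero-mean Gaussian noise whose standard deviation is gamma times
# the standard deviation of the original time series.
for gamma in (0.5, 1.0):
    noisy = series + rng.normal(0.0, gamma * series.std(), size=series.shape)
```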
  19. Here we show the results together with the previous ones. As we can see, using regularization with noisy data improves the results. In fact, in both cases there is a statistically significant difference between using regularization and not using it. In the case of gamma 0.5, this difference only exists up to a training set size of 225 samples. PRESS NEXT KEY. If we search for the smallest training set size from which the error stabilizes: with gamma 0.5 this size is 198, whether using regularization or not; but with gamma 1, that is, with noisier data, this size is 189 with regularization and 207 without it.
  20. In conclusion, we have proposed a new supervised learning method for single layer neural networks using regularization. Among its features, we can highlight that it obtains the global optimum in an analytical way and, hence, faster than current iterative methods. It allows incremental learning and distributed learning and, due to the regularization term, offers a better generalization capability, mainly with small training sets or noisy data. We have applied it to two kinds of problems, a classification problem and a regression problem, obtaining generally better results. As future work, an analytical method to obtain the regularization parameter is being studied.
  21. Thank you very much