
Training and Testing
Neural Networks

Seoul National University, Department of Industrial Engineering
Production Information Systems Laboratory
Lee Sang-jin (이상진)

Contents

• Introduction
• When Is the Neural Network Trained?
• Controlling the Training Process with Learning
Parameters
• Iterative Development Process
• Avoiding Over-training
• Automating the Process

Introduction (1)
• Training a neural network
– perform a specific processing function
1) which parameters?
2) how are they used to control the training process?
3) how does the management of the training data affect the training process?
– Development Process
• 1) Data preparation
• 2) neural network model & architecture 선택
• 3) train the neural network
– determined by the structure of the neural network and its function
– Application
– “trained”

Introduction (2)
• Learning Parameters for Neural Network

• Disciplined approach to iterative neural network
development

When Is the Neural Network Trained?
• When is the network trained? It depends on
– the type of neural network
– the function being performed
• classification
• clustering data
• build a model or time-series forecast
– the acceptance criteria
• meets the specified accuracy
– the connection weights are “locked”
– cannot be adjusted

Classification (1)
• Measure of success : percentage of correct
classification
– incorrect classification
– no classification : unknown, undecided
• threshold limit
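A minimal sketch of how the three outcomes above could be counted. The threshold value, the sample outputs, and the helper names `classify` and `score` are illustrative assumptions, not part of the original slides.

```python
def classify(outputs, threshold=0.5):
    """Return the index of the strongest output, or None when it stays
    below the threshold (the "unknown / undecided" case)."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best if outputs[best] >= threshold else None


def score(predictions, targets):
    """Return the fractions of correct, incorrect, and unknown classifications."""
    n = len(targets)
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    unknown = sum(1 for p in predictions if p is None)
    incorrect = n - correct - unknown
    return correct / n, incorrect / n, unknown / n


# Hypothetical network outputs for four patterns and three categories.
outputs = [[0.9, 0.1, 0.0], [0.4, 0.35, 0.25], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]]
targets = [0, 0, 2, 2]

predictions = [classify(o, threshold=0.6) for o in outputs]
rates = score(predictions, targets)
print(predictions, rates)
```

Raising the threshold trades incorrect classifications for undecided ones, so the acceptance criteria should state which of the two is less costly.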

Classification (2)
• Confusion matrix
: possible output categories and the corresponding
percentage of correct and incorrect classifications

              Category A   Category B   Category C
Category A       0.60         0.25         0.15
Category B       0.25         0.45         0.30
Category C       0.15         0.30         0.55
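A confusion matrix of this kind could be computed roughly as follows. The slide does not say which axis holds the actual category, so this sketch assumes rows are actual categories and columns are predicted ones, with each row normalized to sum to 1. The sample labels are made up.

```python
def confusion_matrix(actual, predicted, labels):
    """Row = actual category, column = predicted category (an assumption).
    Each row is normalized so that it sums to 1."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        counts[index[a]][index[p]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical test results.
actual = ["A", "A", "A", "A", "B", "B", "C", "C"]
predicted = ["A", "A", "A", "B", "B", "C", "C", "A"]

matrix = confusion_matrix(actual, predicted, ["A", "B", "C"])
for label, row in zip("ABC", matrix):
    print(label, row)
```

The diagonal holds the fraction of correct classifications per category; off-diagonal entries show which categories get confused with each other.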

Clustering (1)
• Output of a clustering network
– open to analysis by the user
• Training regimen is determined:
– the number of times the data is presented to the neural
network
– how fast the learning rate and the neighborhood decay
• Adaptive resonance network training (ART)
– vigilance training parameter
– learning rate

Clustering (2)
• Lock the ART network weights
– disadvantage: online learning is lost
• ART networks are sensitive to the order of the
training data

Modeling (1)
• Modeling or regression problems
• Usual Error measure
– RMS (Root Mean Square) error
• Measure of Prediction accuracy
– average
– MSE(Mean Square Error)
• The Expected behavior
– the RMS error is very high at first, but it gradually
settles to a stable minimum
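For reference, the two error measures can be computed as in the short sketch below. The function names and the sample values are illustrative only.

```python
import math


def mse(desired, predicted):
    """Mean squared error over all patterns."""
    return sum((d - p) ** 2 for d, p in zip(desired, predicted)) / len(desired)


def rms(desired, predicted):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(desired, predicted))


# Hypothetical desired outputs and network predictions.
desired = [1.0, 0.0, 1.0, 0.0]
predicted = [0.5, 0.5, 0.5, 0.5]

print(mse(desired, predicted), rms(desired, predicted))
```

Plotting the RMS error after every epoch gives the curve described above: high at the start, then flattening out as training converges.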

Modeling (2)

Modeling (3)
• When the error does not stabilize
– the network falls into a local minimum
• the prediction error does not fall
• it oscillates up and down
– Remedies
• reset (randomize) the weights and start again
• change the training parameters
• change the data representation
• change the model architecture
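The "reset the weights and start again" remedy can be illustrated on a toy problem. The sketch below is not a neural network: it assumes a single weight `w` and a hand-picked error curve with one local and one global minimum, and uses plain gradient descent as a stand-in for training.

```python
import random


def error(w):
    # Toy error curve: local minimum near w = 1.13, global minimum near w = -1.30.
    return w ** 4 - 3.0 * w ** 2 + w


def gradient(w):
    return 4.0 * w ** 3 - 6.0 * w + 1.0


def train(w, learning_rate=0.01, steps=500):
    """Plain gradient descent from the starting weight w."""
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


def train_with_restarts(restarts, rng):
    """Randomize the starting weight several times and keep the best result."""
    best_w = None
    for _ in range(restarts):
        w = train(rng.uniform(-2.0, 2.0))
        if best_w is None or error(w) < error(best_w):
            best_w = w
    return best_w


stuck = train(1.5)                                   # ends in the local minimum
best = train_with_restarts(20, random.Random(0))     # restarts find a lower error
print(stuck, error(stuck))
print(best, error(best))
```

A single run that starts on the wrong side of the curve settles in the shallower minimum; restarting from random weights gives other runs a chance to reach the deeper one.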

Forecasting (1)
• Forecasting
– prediction problem
– visualize : time plot of the actual and desired network
output
• Time-series forecasting
– long-term trend
• influenced by cyclical factor etc.
– random component
• variability and uncertainty
– neural networks are excellent tools for modeling
complex time-series problems
• recurrent neural network : nonlinear dynamic systems
– no self-feedback loop & no hidden neurons

Forecasting (2)

Controlling the Training Process with
Learning Parameters (1)
• Learning parameters depend on
– Type of learning algorithm
– Type of neural network

- Supervised training

[Figure: a Pattern is fed to the Neural Network, which produces a Prediction that is compared with the Desired Output]
1) How the error is computed
2) How big a step we take when adjusting the
connection weights

• Learning rate
– magnitude of the change when adjusting the connection
weights
– the current training pattern and desired output
• large rate
– giant oscillations
• small rate
– to learn the major features of the problem
• generalize to patterns
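The effect of the learning rate can be seen on a one-weight toy problem. This sketch assumes a quadratic error `(w - target)^2` and plain gradient descent; the rates 0.1 and 1.1 are chosen only to show the two regimes.

```python
def descend(learning_rate, steps=20, start=0.0, target=3.0):
    """Gradient descent on the error (w - target)^2; returns every weight visited."""
    w = start
    history = [w]
    for _ in range(steps):
        gradient = 2.0 * (w - target)
        w -= learning_rate * gradient
        history.append(w)
    return history


small = descend(0.1)   # approaches the target smoothly
large = descend(1.1)   # overshoots on every step, with growing swings

print(small[-1])
print(large[:5])
```

With the small rate each step removes a fixed fraction of the remaining error. With the large rate every step jumps past the target and lands farther away than it started, which is the "giant oscillations" case.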

• Momentum
– filter out high-frequency changes in the weight values
– prevents oscillating around a set of values
– errors keep influencing the weights for a long time
• Error tolerance
– how close is close enough
– 0.1 in many cases
– why it is needed
• net input must be quite large?
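A hedged sketch of both parameters, assuming the common momentum form "new change = learning-rate step + momentum × previous change". The function names and the gradient sequences are illustrative, not taken from the slides.

```python
def within_tolerance(desired, output, tolerance=0.1):
    """A pattern counts as learned when the output is close enough."""
    return abs(desired - output) <= tolerance


def momentum_step(weight, gradient, previous_delta, learning_rate=0.1, momentum=0.9):
    """One weight update: part of the previous change is carried over."""
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta


def run(gradients, learning_rate=0.1, momentum=0.9):
    """Apply a sequence of gradients and record each weight change."""
    weight, delta, deltas = 0.0, 0.0, []
    for g in gradients:
        weight, delta = momentum_step(weight, g, delta, learning_rate, momentum)
        deltas.append(delta)
    return deltas


alternating = run([1.0, -1.0] * 50)   # gradient flips sign on every step
steady = run([1.0] * 100)             # gradient keeps the same sign

print(abs(alternating[-1]), abs(steady[-1]))
print(within_tolerance(1.0, 0.93), within_tolerance(1.0, 0.85))
```

Without momentum every change would have magnitude 0.1. With momentum 0.9 the rapidly alternating changes shrink to about 0.05 while the consistent ones grow toward 1.0, which is the filtering of high-frequency changes noted above.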

- Unsupervised learning
• Parameter
– selection for the number of outputs
• granularity of the segmentation
(clustering, segmentation)
– learning parameters (architecture is set)
• neighborhood parameter : Kohonen maps
• vigilance parameter : ART

• Neighborhood
– the area around the winning unit, where the non-winning
units will also be modified
– roughly half the size of the maximum dimension of the
output layer
– 2 methods for controlling
• square neighborhood function, linear decrease in the learning
rate
• Gaussian shaped neighborhood, exponential decay of the
learning rate
– the number of epochs parameter
– important in keeping the locality of the topographic
maps
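The two neighborhood shapes and the two decay schedules could look like the sketch below. The exact formulas, the final values, and the 10 x 10 map size are assumptions for illustration; real Kohonen map implementations differ in detail.

```python
import math


def square_neighborhood(distance, radius):
    """Every unit within the radius is updated at full strength."""
    return 1.0 if distance <= radius else 0.0


def gaussian_neighborhood(distance, radius):
    """Update strength falls off smoothly with distance from the winner."""
    return math.exp(-(distance ** 2) / (2.0 * radius ** 2))


def linear_decay(initial, epoch, total_epochs, final=0.0):
    """Straight-line decrease from initial to final (needs total_epochs > 1)."""
    return initial + (final - initial) * epoch / (total_epochs - 1)


def exponential_decay(initial, epoch, total_epochs, final=0.01):
    """Geometric decrease from initial to final (needs total_epochs > 1)."""
    return initial * (final / initial) ** (epoch / (total_epochs - 1))


# Assumed 10 x 10 output layer: start with a radius of half the largest dimension.
initial_radius = max(10, 10) / 2
epochs = 100

for epoch in (0, 50, 99):
    rate = exponential_decay(0.5, epoch, epochs)
    radius = exponential_decay(initial_radius, epoch, epochs, final=1.0)
    print(epoch, round(rate, 4), round(radius, 2), round(gaussian_neighborhood(2, radius), 3))
```

The number of epochs sets how quickly both the learning rate and the radius shrink, so it effectively controls how long distant units keep moving together with the winner.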

• Vigilance
– control how picky the neural network is going to be
when clustering data
– discriminating when evaluating the differences between
two patterns
– close-enough
– Too-high Vigilance
• use up all of the output units
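The role of vigilance can be sketched with a much simplified clustering loop. This is not the ART algorithm itself: it assumes a distance-based similarity, a fixed number of output units, and hypothetical helper names, and only shows how a higher vigilance forces more clusters.

```python
import math


def similarity(a, b):
    """1.0 for identical patterns, smaller as they move apart."""
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


def cluster(patterns, vigilance, max_units, learning_rate=0.5):
    """Assign each pattern to the closest prototype if it is "close enough",
    otherwise commit a new output unit while any are left."""
    prototypes, assignments = [], []
    for pattern in patterns:
        best, best_sim = None, -1.0
        for i, proto in enumerate(prototypes):
            s = similarity(pattern, proto)
            if s > best_sim:
                best, best_sim = i, s
        if best is not None and best_sim >= vigilance:
            # Close enough: move the winning prototype toward the pattern.
            prototypes[best] = [p + learning_rate * (x - p)
                                for p, x in zip(prototypes[best], pattern)]
            assignments.append(best)
        elif len(prototypes) < max_units:
            prototypes.append(list(pattern))
            assignments.append(len(prototypes) - 1)
        else:
            assignments.append(None)  # all output units are used up
    return prototypes, assignments


patterns = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]

_, relaxed = cluster(patterns, vigilance=0.5, max_units=3)
_, picky = cluster(patterns, vigilance=0.95, max_units=3)
print(relaxed)
print(picky)
```

With the lower vigilance the four patterns fall into two clusters. With the higher one, nearly identical patterns are treated as different, the three output units are used up, and the last pattern cannot be placed.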

Iterative Development Process (1)

• Network convergence issues
– the error falls quickly and then stays flat / the network has reached the global
minimum
– the error oscillates up and down / the network is trapped in a local minimum
– Remedies
• add some random noise
• reset the network weights and start all over again
• revisit the design decisions


• Model selection
– inappropriate neural network model for the function to
perform
– add hidden units or another layer of hidden units
– strong temporal or time element embedded
• recurrent back propagation
• radial basis function network
• Data representation
– key parameter is not scaled or coded
– key parameter is missing from the training data
– experience


• Model architecture
– does not converge: the problem is too complex for the architecture
– adding a few hidden units helps
– adding many more?
• the network just memorizes the training patterns
– keeping the hidden layers as thin as possible gives the
best results

Avoiding Over-training

• Over-training
– the network keeps learning the same patterns over and over
– cannot generalize
– poor handling of new patterns
– switch between training and testing data
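One common way to act on "switch between training and testing data" is to check the test error after every training epoch and stop once it no longer improves. The sketch below assumes the per-epoch test errors are already available; the `patience` parameter and the error values are made up for illustration.

```python
def best_stopping_epoch(test_errors, patience=3):
    """Return the epoch with the lowest test error seen before the error
    failed to improve for `patience` consecutive epochs."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(test_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # test error has stopped improving: further training over-fits
    return best_epoch


# Hypothetical test-set error measured after each training epoch.
test_errors = [0.90, 0.60, 0.40, 0.35, 0.37, 0.41, 0.46, 0.30]

stop_at = best_stopping_epoch(test_errors, patience=3)
print(stop_at)
```

The training error usually keeps falling the whole time; it is the rise in test error that signals the network has started to memorize the training patterns. The weights saved at the returned epoch would be the ones to lock.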

Automating the Process

• Automate the selection of the appropriate number
of hidden layers and hidden units
– pruning out nodes and connections
– genetic algorithms
– opposite approach to pruning
– the use of intelligent agents
