A DYNAMIC FORMULATION OF THE PATTERN RECOGNITION PROBLEM

M.B. Shavlovsky¹, O.V. Krasotkina², V.V. Mottl³

¹ Moscow Institute of Physics and Technology
Dolgoprudny, Moscow Region, 141700, Institutsky Pereulok, 9, shavlovsky@yandex.ru
² Tula State University
Tula, 300600, Lenin Ave., 92, krasotkina@uic.tula.ru
³ Computing Center of the Russian Academy of Sciences
Moscow, 119333, Vavilov St., 40, vmottl@yandex.ru
The classical learning problem of pattern recognition in a finite-dimensional linear space of real-valued features is studied under the conditions of a non-stationary universe. The simplest statement of this problem, with two classes of objects, is considered under the assumption that the instantaneous property of the universe is completely expressed by a discriminant hyperplane whose parameters change sufficiently slowly in time. In this case, any object has to be considered along with a time marker which specifies when the object was selected from the universe, and the training set becomes, actually, a time series. The training criterion of non-stationary pattern recognition is formulated as a generalization of the classical Support Vector Machine. The respective numerical algorithm has computational complexity proportional to the length of the training time series.
Introduction*
The aim of this study is to create the basic mathematical framework and the simplest algorithms for solving typical practical problems of pattern recognition learning in universes whose properties change in time. The commonly known classical statement of the pattern recognition problem is based on the tacit assumption that the properties of the universe at the moment of decision making remain the same as when the training set was formed. The more realistic assumption of the non-stationarity of the universe, which is accepted in this work, inevitably leads to the necessity of analyzing a sequence of samples at successive time moments and finding different recognition rules for them.

This work is supported by grants of the Russian
Foundation for Basic Research No. 05-01-00679
and 06-01-00412.
The classical stationary pattern recognition problem with two classes of objects: the Support Vector Machine
Let each object $\omega$ of the universe $\Omega$ be represented by a point in the linear space of features $\mathbf{x}(\omega) = \bigl(x^{(1)}(\omega), \ldots, x^{(n)}(\omega)\bigr) \in \mathbb{R}^n$, and let its hidden membership in one of two classes be specified by the value of the class index $y(\omega) \in \{-1, 1\}$. The classical approach to the training problem developed by V. Vapnik [1] is based on treating the model of the universe in the form of a discriminant function defined by a hyperplane with an a priori unknown direction vector and threshold: $f(\mathbf{x}(\omega)) = \mathbf{a}^T\mathbf{x}(\omega) + b$ is primarily $> 0$ if $y(\omega) = 1$, and $< 0$ if $y(\omega) = -1$.
The unknown parameters of the hyperplane are to be estimated by analyzing a training set of objects $\{\omega_j,\ j = 1, \ldots, N\}$ represented by their feature vectors and class-membership indices, so that the training set as a whole is a finite set of pairs $\{(\mathbf{x}_j \in \mathbb{R}^n,\, y_j \in \mathbb{R}),\ j = 1, \ldots, N\}$. The commonly adopted training principle is that of the optimal discriminant hyperplane, chosen by the criterion of maximizing the number of points which are classified correctly with a guaranteed margin, conventionally taken equal to unity:

$$J(\mathbf{a}, b, \delta_j,\ j = 1, \ldots, N) = \mathbf{a}^T\mathbf{a} + C\sum_{j=1}^{N}\delta_j \to \min,$$
$$y_j(\mathbf{a}^T\mathbf{x}_j + b) \geq 1 - \delta_j, \quad \delta_j \geq 0, \quad j = 1, \ldots, N. \qquad (1)$$
The notion of time is completely absent here.
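To make criterion (1) concrete, here is a minimal sketch that solves it directly as a quadratic program with the open-source cvxpy modeling package; the package choice, the function name, and the default solver are assumptions of this sketch, not anything prescribed by the paper.

```python
# A sketch of criterion (1) solved as a generic QP with cvxpy
# (assumed tooling, not the paper's own implementation).
import numpy as np
import cvxpy as cp

def classical_svm(X, y, C=1.0):
    """Minimize a'a + C*sum(delta_j) s.t. y_j (a'x_j + b) >= 1 - delta_j, delta_j >= 0."""
    N, n = X.shape
    a = cp.Variable(n)        # direction vector of the hyperplane
    b = cp.Variable()         # threshold
    delta = cp.Variable(N)    # slack variables
    objective = cp.Minimize(cp.sum_squares(a) + C * cp.sum(delta))
    constraints = [cp.multiply(y, X @ a + b) >= 1 - delta, delta >= 0]
    cp.Problem(objective, constraints).solve()
    return a.value, b.value
```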
The mathematical model of a non-stationary universe and the main kinds of training problems
The principal novelty of the concept of the non-stationary universe is the introduction of the time factor $t$. It is assumed that the main property of the non-stationary universe is completely expressed by a time-varying discriminant hyperplane which, in its turn, is completely determined by a direction vector and a threshold, both being functions of time: $f_t(\mathbf{x}(\omega)) = \mathbf{a}_t^T\mathbf{x}(\omega) + b_t$ is primarily $> 0$ if $y(\omega) = 1$ and $< 0$ if $y(\omega) = -1$.
Any object $\omega$ is always to be considered along with the time mark of its appearance $(\omega, t)$. As a result, the training set gains the structure of a set of triples instead of pairs: $\{(\mathbf{x}_j \in \mathbb{R}^n,\, y_j,\, t_j),\ j = 1, \ldots, N\}$. If we order the objects as they appear, it is appropriate to speak of a training sequence rather than a training set, and to consider it as a time series with, in the general case, varying time steps.
The hidden discriminant hyperplane has different values of the direction vector and threshold at different time moments $t_j$. So, there exists a two-component time series with one hidden and one observable component, respectively $(\mathbf{a}_j, b_j)$ and $(\mathbf{x}_j, y_j)$.
The dynamic formulation turns the training problem into one of two-component time series analysis, in which it is required to estimate the hidden component from the observable one. This is a standard signal (time series) analysis problem whose specificity boils down to the assumed model of the relationship between the hidden and the observable component. In accordance with the classification introduced by N. Wiener [2], it is natural to distinguish between at least two kinds of training problems as those of estimating the hidden component.
The problem of filtration of the training time series. Let a new object appear at the time moment $t_j$ when the feature vectors and class-membership indices of the previous objects $\{\ldots, (\mathbf{x}_{j-1}, y_{j-1}), (\mathbf{x}_j, y_j)\}$ are already registered, including the current moment $(\ldots, t_{j-1}, t_j)$. It is required to recurrently estimate the parameters of the discriminant hyperplane $(\hat{\mathbf{a}}_j, \hat{b}_j)$ at each time moment $t_j$, immediately in the process of observation.
The problem of interpolation. Let the training time series be completely registered in some time interval $\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_N, y_N)\}$ before its processing starts. It is required to estimate the time-varying parameters of the discriminant hyperplane over the entire observation interval $\{(\hat{\mathbf{a}}_1, \hat{b}_1), \ldots, (\hat{\mathbf{a}}_N, \hat{b}_N)\}$.
It is assumed that the parameters of the discriminant hyperplane $\mathbf{a}_t$ and $b_t$ change slowly in the sense that the values

$$\frac{(\mathbf{a}_j - \mathbf{a}_{j-1})^T(\mathbf{a}_j - \mathbf{a}_{j-1})}{t_j - t_{j-1}} \quad\text{and}\quad \frac{(b_j - b_{j-1})^2}{t_j - t_{j-1}}$$

are, as a rule, sufficiently small. This assumption prevents the degeneration of the filtration and interpolation problems into a collection of independent ill-posed two-class training problems, each with a single observation.
From the formal point of view, the interpolation-based estimate of the discriminant hyperplane parameters $(\hat{\mathbf{a}}_N, \hat{b}_N)$ obtained at the last point of the observation interval is just the solution of the filtration problem at this time moment. However, the essence of the filtration problem is the requirement of evaluating the estimates in the on-line mode, immediately as the observations come one after another, without solving, each time, the interpolation problem for the time series of increasing length.
The training criterion in the interpolation mode
We consider here only the interpolation problem. The proposed formulation of this problem differs from a collection of classical SVM-based criteria (1) for consecutive time moments only by the presence of additional terms which penalize the difference between adjacent values of the hyperplane parameters $(\mathbf{a}_{j-1}, b_{j-1})$ and $(\mathbf{a}_j, b_j)$:

$$J(\mathbf{a}_j, b_j, \delta_j,\ j = 1, \ldots, N) = \mathbf{a}_1^T\mathbf{a}_1 + C\sum_{j=1}^{N}\delta_j + D_{\mathbf{a}}\sum_{j=2}^{N}\frac{(\mathbf{a}_j - \mathbf{a}_{j-1})^T(\mathbf{a}_j - \mathbf{a}_{j-1})}{t_j - t_{j-1}} + D_b\sum_{j=2}^{N}\frac{(b_j - b_{j-1})^2}{t_j - t_{j-1}} \to \min,$$
$$y_j(\mathbf{a}_j^T\mathbf{x}_j + b_j) \geq 1 - \delta_j, \quad \delta_j \geq 0, \quad j = 1, \ldots, N. \qquad (2)$$
The coefficients $D_{\mathbf{a}} > 0$ and $D_b > 0$ are hyper-parameters which preset the desired level of smoothing of the instantaneous parameters of the discriminant hyperplane.
The criterion (2) implements the concept of an optimal, sufficiently smooth sequence of discriminant hyperplanes, in contrast to the concept of the single optimal hyperplane in (1). The sought-for hyperplanes have to provide the correct classification of the feature vectors for as many time moments as possible, with a guaranteed margin taken equal to unity just as in (1).
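Before turning to the specialized algorithm of the next section, criterion (2) can be prototyped with the same generic QP tooling as above. The sketch below is a direct transcription of (2) as reconstructed here (cvxpy and the function name are again assumptions); it solves the problem in generic polynomial time, not in the linear time claimed for the dedicated algorithm.

```python
# A sketch of criterion (2) with cvxpy (assumed tooling): one hyperplane
# (a_j, b_j) per observation, chained together by the smoothness penalties.
import numpy as np
import cvxpy as cp

def dynamic_svm_interpolation(X, y, t, C=1.0, D_a=1.0, D_b=1.0):
    N, n = X.shape
    A = cp.Variable((N, n))    # row j is the direction vector a_j
    b = cp.Variable(N)         # thresholds b_1, ..., b_N
    delta = cp.Variable(N)     # slack variables
    dt = np.diff(t)            # time steps t_j - t_{j-1}

    # Smoothness penalties on adjacent direction vectors and thresholds.
    smooth_a = cp.sum(cp.multiply(1.0 / dt, cp.sum(cp.square(A[1:] - A[:-1]), axis=1)))
    smooth_b = cp.sum(cp.multiply(1.0 / dt, cp.square(b[1:] - b[:-1])))
    objective = cp.Minimize(cp.sum_squares(A[0]) + C * cp.sum(delta)
                            + D_a * smooth_a + D_b * smooth_b)

    # Margin constraints y_j (a_j' x_j + b_j) >= 1 - delta_j for every moment j.
    margins = cp.multiply(y, cp.sum(cp.multiply(A, X), axis=1) + b)
    cp.Problem(objective, [margins >= 1 - delta, delta >= 0]).solve()
    return A.value, b.value
```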
The training algorithm
Just like the classical training problem, the dynamic problem (2) is a quadratic programming problem, but it contains $N(n+1) + N$ variables in contrast to $(n+1) + N$ variables in (1). It is known that the computational complexity of a quadratic programming problem of the general kind is proportional to the cube of the number of variables, i.e. the dynamic problem appears, at first glance, to be essentially more complicated than the classical one. However, the goal function of the dynamic problem $J(\mathbf{a}_j, b_j, \delta_j,\ j = 1, \ldots, N)$ is pair-wise separable, i.e. it is representable as a sum of partial functions, each of which depends only on the variables associated with one or two adjacent time moments. This circumstance makes it possible to build an algorithm which numerically solves the problem in time proportional to the length $N$ of the training time series.
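Written out in the notation of the reconstruction of (2) above, the pair-wise separable structure is a chain-shaped sum of node and edge terms:

$$J = \underbrace{\mathbf{a}_1^T\mathbf{a}_1 + C\sum_{j=1}^{N}\delta_j}_{\text{node terms}} \;+\; \sum_{j=2}^{N}\underbrace{\left(D_{\mathbf{a}}\frac{(\mathbf{a}_j-\mathbf{a}_{j-1})^T(\mathbf{a}_j-\mathbf{a}_{j-1})}{t_j-t_{j-1}} + D_b\frac{(b_j-b_{j-1})^2}{t_j-t_{j-1}}\right)}_{\text{edge term linking moments } j-1 \text{ and } j},$$

which is exactly the kind of structure that dynamic-programming-style sweeps over $j$ can exploit.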
Applying the Kuhn-Tucker theorem to the dynamic problem (2) turns it into the dual form with respect to the Lagrange multipliers $\lambda_j \geq 0$ at the inequality constraints $y_j(\mathbf{a}_j^T\mathbf{x}_j + b_j) \geq 1 - \delta_j$:

$$W(\lambda_1, \ldots, \lambda_N) = \sum_{j=1}^{N}\lambda_j - \frac{1}{2}\sum_{j=1}^{N}\sum_{l=1}^{N} y_j y_l \lambda_j \lambda_l \bigl(\mathbf{x}_j^T\mathbf{Q}_{jl}\mathbf{x}_l + f_{jl}\bigr) \to \max,$$
$$\sum_{j=1}^{N} y_j\lambda_j = 0, \quad 0 \leq \lambda_j \leq C/2, \quad j = 1, \ldots, N. \qquad (3)$$
The matrices $\mathbf{Q}_{jl}$ $(n \times n)$ and $\mathbf{F} = (f_{jl})$ $(N \times N)$ do not depend on the training time series; they are determined only by the coefficients $D_{\mathbf{a}}$ and $D_b$ which penalize, in (2), the unsmoothness of the sequence of hyperplane parameters, respectively the direction vectors and the thresholds.
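Assuming the scalar quantities $g_{jl} = \mathbf{x}_j^T\mathbf{Q}_{jl}\mathbf{x}_l + f_{jl}$ have been assembled into an $N \times N$ array, the dual objective (3) and its gradient can be evaluated as in the sketch below. This naive version costs $O(N^2)$ per evaluation and does not reproduce the $O(N)$ evaluation that the paper obtains from the band structure of the smoothness penalties.

```python
# A sketch of W(lambda) and its gradient from (3); the array
# G[j, l] = x_j' Q_jl x_l + f_jl is assumed to be precomputed.
import numpy as np

def dual_value_and_gradient(lam, y, G):
    yl = y * lam                          # elementwise y_j * lambda_j
    W = lam.sum() - 0.5 * yl @ G @ yl     # W(lambda_1, ..., lambda_N)
    grad = 1.0 - y * (G @ yl)             # dW / dlambda_j
    return W, grad
```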
Theorem. The solution of the training problem (2) is completely determined by the training time series and the values of the Lagrange multipliers $(\lambda_1, \ldots, \lambda_N)$ obtained as the solution of the dual problem (3):

$$\hat{\mathbf{a}}_j = \sum_{l:\,\lambda_l > 0} \lambda_l y_l \mathbf{Q}_{jl}\mathbf{x}_l, \qquad \hat{b}_j = b + \sum_{l:\,\lambda_l > 0} \lambda_l y_l f_{jl}, \qquad (4)$$

$$b = \frac{b' + b''}{2}, \quad b' = \min_{j:\,0 < \lambda_j < C/2}\Bigl(y_j - \sum_{l:\,\lambda_l > 0}\lambda_l y_l\bigl(\mathbf{x}_j^T\mathbf{Q}_{jl}\mathbf{x}_l + f_{jl}\bigr)\Bigr), \quad b'' = \max_{j:\,0 < \lambda_j < C/2}\Bigl(y_j - \sum_{l:\,\lambda_l > 0}\lambda_l y_l\bigl(\mathbf{x}_j^T\mathbf{Q}_{jl}\mathbf{x}_l + f_{jl}\bigr)\Bigr). \qquad (5)$$
It is seen from these formulas that the solution of the dynamic training problem depends only on those elements of the training time series $(\mathbf{x}_j, y_j)$ whose Lagrange multipliers have obtained positive values $\lambda_j > 0$. It is natural to call the feature vectors of the respective objects the support vectors. So, we have come to a generalization of the Support Vector Machine [1] which follows from the concept of the optimal discriminant hyperplane (1).
The classical training problem is a particular case of problem (2) in which the penalties on the time variation of the hyperplane parameters grow infinitely, $D_{\mathbf{a}} \to \infty$ and $D_b \to \infty$. In this case we have $\mathbf{Q}_{jl} = \mathbf{I}$, $f_{jl} = 0$, and the dual problem (3) turns into the classical dual problem [1] which corresponds to the initial problem (1):

$$W(\lambda_1, \ldots, \lambda_N) = \sum_{j=1}^{N}\lambda_j - \frac{1}{2}\sum_{j=1}^{N}\sum_{l=1}^{N} y_j y_l \lambda_j \lambda_l\, \mathbf{x}_j^T\mathbf{x}_l \to \max,$$
$$\sum_{j=1}^{N} y_j\lambda_j = 0, \quad 0 \leq \lambda_j \leq C/2, \quad j = 1, \ldots, N.$$
The formulas (4) and (5) then determine the training result in accordance with the classical support vector method, $\hat{\mathbf{a}} = \hat{\mathbf{a}}_1 = \ldots = \hat{\mathbf{a}}_N$ and $\hat{b} = \hat{b}_1 = \ldots = \hat{b}_N$:

$$\hat{\mathbf{a}} = \sum_{j:\,\lambda_j > 0} \lambda_j y_j \mathbf{x}_j, \qquad \hat{b} = \frac{b' + b''}{2},$$
$$b' = \min_{j:\,0 < \lambda_j < C/2}\Bigl(y_j - \sum_{l:\,\lambda_l > 0}\lambda_l y_l\, \mathbf{x}_j^T\mathbf{x}_l\Bigr), \qquad b'' = \max_{j:\,0 < \lambda_j < C/2}\Bigl(y_j - \sum_{l:\,\lambda_l > 0}\lambda_l y_l\, \mathbf{x}_j^T\mathbf{x}_l\Bigr).$$
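As a hypothetical sanity check of this limiting behaviour, one can solve (2) with very large $D_{\mathbf{a}}, D_b$ on toy data and compare the result with a single classical SVM. The snippet below reuses dynamic_svm_interpolation from the sketch above and uses scikit-learn's SVC purely as an independent reference; note that criterion (1) equals twice the more common form $\frac{1}{2}\mathbf{a}^T\mathbf{a} + C'\sum_j\delta_j$ with $C' = C/2$, which also matches the box bound $\lambda_j \leq C/2$ in (3).

```python
# Hypothetical check of the limiting case: with huge D_a, D_b the dynamic
# estimates (a_j, b_j) should collapse onto one classical SVM hyperplane.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N, C = 40, 10.0
X = np.vstack([rng.normal(-1.0, 1.0, (N // 2, 2)),
               rng.normal(+1.0, 1.0, (N // 2, 2))])
y = np.hstack([-np.ones(N // 2), np.ones(N // 2)])
t = np.arange(N, dtype=float)

A, b = dynamic_svm_interpolation(X, y, t, C=C, D_a=1e6, D_b=1e6)
clf = SVC(kernel="linear", C=C / 2).fit(X, y)   # C' = C/2, see the remark above

print(np.ptp(A, axis=0), np.ptp(b))   # spread across time: near zero
print(A[0], clf.coef_.ravel())        # directions: nearly identical
```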
Despite the fact that the dual problem (3) is not pair-wise separable, the pair-wise separability of the initial problem (2) makes it possible to compute the gradient of the goal function $W(\lambda_1, \ldots, \lambda_N)$ at each point $(\lambda_1, \ldots, \lambda_N)$, and then to find the optimal admissible maximization direction relative to the constraints, via an algorithm of linear computational complexity with respect to the length of the training time series. In particular, the standard steepest descent method of solving quadratic programming problems [3], applied to the function $W(\lambda_1, \ldots, \lambda_N)$, yields a generalization of the known SMO (Sequential Minimal Optimization) algorithm [4] which is typically used for solving dual problems in SVM.
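As an illustration of such a generalized SMO step — a sketch under the same assumptions as above, with pair selection heuristics and stopping rules omitted — a single update moves a pair of multipliers along the direction that preserves the equality constraint of (3), takes the optimal step of the resulting one-dimensional quadratic, and clips it to the box $[0, C/2]$:

```python
# One SMO-style pair update on the dual (3).  G is the same N x N array
# g_jl = x_j' Q_jl x_l + f_jl as in the gradient sketch above (assumed input).
import numpy as np

def smo_pair_step(lam, i, j, y, G, C):
    grad = 1.0 - y * (G @ (y * lam))              # gradient of W at lam
    curvature = G[i, i] + G[j, j] - 2.0 * G[i, j]
    if curvature <= 0.0:                          # skip degenerate pairs
        return lam
    # Direction d_i = y_i, d_j = -y_j keeps sum_j y_j lam_j unchanged.
    s = (y[i] * grad[i] - y[j] * grad[j]) / curvature

    def step_bounds(l, sign):                     # admissible s: 0 <= l + s*sign <= C/2
        return (-l, C / 2.0 - l) if sign > 0 else (l - C / 2.0, l)

    lo_i, hi_i = step_bounds(lam[i], y[i])
    lo_j, hi_j = step_bounds(lam[j], -y[j])
    s = float(np.clip(s, max(lo_i, lo_j), min(hi_i, hi_j)))
    new = lam.copy()
    new[i] += s * y[i]
    new[j] -= s * y[j]
    return new
```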
References
1. Vapnik V. Statistical Learning Theory. John Wiley & Sons, Inc., 1998.
2. Wiener N. Extrapolation, Interpolation, and Smoothing of Stationary Random Time Series with Engineering Applications. Technology Press of MIT, John Wiley & Sons, 1949, 163 p.
3. Bazaraa M.S., Sherali H.D., Shetty C.M. Nonlinear Programming: Theory and Algorithms. John Wiley & Sons, 1993.
4. Platt J.C. Fast training of support vector machines using sequential minimal optimization. Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, 1999.