4th International Summer School
Achievements and Applications of Contemporary Informatics, Mathematics and Physics
National Technical University of Ukraine
Kiev, Ukraine, August 5-16, 2009



                              Classification Theory
                       Modelling of Kernel Machine by
                   Infinite and Semi-Infinite Programming

                   Süreyya Özöğür-Akyüz, Gerhard-Wilhelm Weber *

                  Institute of Applied Mathematics, METU, Ankara, Turkey

       * Faculty of Economics, Management Science and Law, University of Siegen, Germany
           Center for Research on Optimization and Control, University of Aveiro, Portugal



August 7, 2009
Motivation: Prediction of Cleavage Sites

[Figure: a protein sequence split at the cleavage site into a signal part and a mature part.]
Logistic Regression

\[ \log \frac{P(Y=1 \mid X=x_l)}{P(Y=0 \mid X=x_l)} = \beta_0 + \beta_1 x_{l1} + \beta_2 x_{l2} + \dots + \beta_p x_{lp} \qquad (l = 1, 2, \dots, N) \]
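As a minimal numerical sketch (toy coefficients and data, not from the talk), the log-odds model and its inversion to class probabilities can be written as:

```python
import numpy as np

def log_odds(X, beta0, beta):
    """log[P(Y=1|x_l)/P(Y=0|x_l)] = beta_0 + <beta, x_l> for each row x_l of X."""
    return beta0 + X @ beta

def prob_y1(X, beta0, beta):
    """Invert the logit: P(Y=1|x_l) = 1 / (1 + exp(-log-odds))."""
    return 1.0 / (1.0 + np.exp(-log_odds(X, beta0, beta)))

X = np.array([[0.5, 1.0], [2.0, -1.0]])     # N = 2 samples, p = 2 features
beta0, beta = 0.1, np.array([0.4, -0.3])
p = prob_y1(X, beta0, beta)                 # p[0] = 0.5: its log-odds are exactly 0
```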
Linear Classifiers

Maximum margin classifier:

\[ \gamma_i := y_i \cdot (\langle w, x_i \rangle + b) \]

Note: \( \gamma_i > 0 \) implies correct classification.

[Figure: separating hyperplane with margin \( \gamma \); the support vectors satisfy \( y_j \cdot (\langle w, x_j \rangle + b) = 1 \) and \( y_k \cdot (\langle w, x_k \rangle + b) = 1 \).]
Linear Classifiers

•  The geometric margin:  \( \gamma = \dfrac{2}{\|w\|_2} \)

Maximizing \( \dfrac{2}{\|w\|_2} \) is equivalent to minimizing \( \dfrac{\|w\|_2^2}{2} \), which gives a convex problem:

\[ \min_{w,b} \; \frac{\|w\|^2}{2} \quad \text{subject to} \quad y_i \cdot (\langle w, x_i \rangle + b) \ge 1 \quad (i = 1, 2, \dots, l). \]
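A toy check of these margin quantities (w, b, and the data points are illustrative and assumed separable):

```python
import numpy as np

# w, b define the hyperplane <w, x> + b = 0 for a toy separable data set.
w = np.array([3.0, 4.0])                   # ||w||_2 = 5
b = -1.0
X = np.array([[1.0, 0.5], [-1.0, -0.5]])
y = np.array([1.0, -1.0])

gamma = 2.0 / np.linalg.norm(w)            # geometric margin 2 / ||w||_2
functional = y * (X @ w + b)               # gamma_i > 0: correct classification
feasible = bool((functional >= 1.0).all()) # constraints of the convex problem
```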
Linear Classifiers

Dual problem:

\[ \max_{\alpha} \; \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle \]
\[ \text{subject to} \quad \sum_{i=1}^{l} y_i \alpha_i = 0, \qquad \alpha_i \ge 0 \quad (i = 1, 2, \dots, l). \]
Linear Classifiers

Dual problem, with the inner product replaced by a kernel function \( \kappa \):

\[ \max_{\alpha} \; \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j \, \kappa(x_i, x_j) \]
\[ \text{subject to} \quad \sum_{i=1}^{l} y_i \alpha_i = 0, \qquad \alpha_i \ge 0 \quad (i = 1, 2, \dots, l). \]
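The dual objective is cheap to evaluate once a Gram matrix is available; a minimal sketch with a toy linear-kernel Gram matrix and a hand-picked feasible alpha:

```python
import numpy as np

def dual_objective(alpha, y, K):
    """W(alpha) = sum_i alpha_i - 1/2 sum_{i,j} y_i y_j alpha_i alpha_j K_ij."""
    ya = y * alpha
    return alpha.sum() - 0.5 * ya @ K @ ya

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K = X @ X.T                                # Gram matrix of the linear kernel
y = np.array([1.0, 1.0, -1.0])
alpha = np.array([0.2, 0.2, 0.4])          # chosen to satisfy the equality constraint

assert abs((alpha * y).sum()) < 1e-12      # sum_i y_i alpha_i = 0
value = dual_objective(alpha, y, K)
```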
Linear Classifiers

Soft margin classifier:

•  Introduce slack variables \( \xi_i \) to allow the margin constraints to be violated:

\[ y_i \cdot (\langle w, x_i \rangle + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \quad (i = 1, 2, \dots, l). \]

This yields the problem

\[ \min_{\xi, w, b} \; \frac{\|w\|^2}{2} + C \sum_{i=1}^{l} \xi_i^2 \quad \text{subject to} \quad y_i \cdot (\langle w, x_i \rangle + b) \ge 1 - \xi_i, \; \xi_i \ge 0 \quad (i = 1, 2, \dots, l). \]
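For fixed (w, b) the optimal slack is determined by the constraints, xi_i = max(0, 1 - y_i(<w, x_i> + b)), so the soft-margin objective can be sketched directly (toy data; C is illustrative):

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """1/2 ||w||^2 + C sum_i xi_i^2 with slacks xi_i = max(0, 1 - y_i(<w,x_i> + b))."""
    xi = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * w @ w + C * (xi ** 2).sum(), xi

w, b, C = np.array([1.0, 0.0]), 0.0, 10.0
X = np.array([[2.0, 0.0], [0.5, 1.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0])
obj, xi = soft_margin_objective(w, b, X, y, C)   # only the second point needs slack
```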
Linear Classifiers

•  Projection of the data into a higher-dimensional feature space.
•  Mapping the input space X into a new space F:

\[ x = (x_1, \dots, x_n) \mapsto \phi(x) = (\phi_1(x), \dots, \phi_N(x)) \]

[Figure: data that are not linearly separable in X become linearly separable among the images \( \phi(x) \) in F.]
Nonlinear Classifiers

Set of hypotheses:
\[ f(x) = \sum_{i=1}^{N} w_i \phi_i(x) + b, \]

dual representation (the inner product is the kernel function):
\[ f(x) = \sum_{i=1}^{l} \alpha_i y_i \langle \phi(x_i), \phi(x) \rangle + b. \]

Examples:
polynomial kernel:      \( \kappa(x, z) = (1 + x^T z)^k \)
sigmoid kernel:         \( \kappa(x, z) = \tanh(a x^T z + b) \)
Gaussian (RBF) kernel:  \( \kappa(x, z) = \exp(-\|x - z\|_2^2 / \sigma^2) \)
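The three example kernels in numpy (the parameter values k, a, b, sigma are illustrative defaults, not from the talk):

```python
import numpy as np

def poly_kernel(x, z, k=2):
    """Polynomial kernel (1 + x^T z)^k."""
    return (1.0 + x @ z) ** k

def sigmoid_kernel(x, z, a=1.0, b=0.0):
    """Sigmoid kernel tanh(a x^T z + b)."""
    return np.tanh(a * (x @ z) + b)

def rbf_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||x - z||_2^2 / sigma^2)."""
    return np.exp(-np.sum((x - z) ** 2) / sigma ** 2)

x = np.array([1.0, 0.0])
z = np.array([0.0, 1.0])
```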
(In-) Finite Kernel Learning

•  Based on the motivation of multiple kernel learning (MKL):

\[ \kappa(x_i, x_j) = \sum_{k=1}^{K} \beta_k \, \kappa_k(x_i, x_j), \]

with kernel functions \( \kappa_k(\cdot, \cdot) \) and weights \( \beta_k \ge 0 \) \( (k = 1, \dots, K) \), \( \sum_{k=1}^{K} \beta_k = 1 \).

•  Semi-infinite LP formulation:

(SILP MKL)
\[ \max_{\theta, \beta} \; \theta \qquad (\theta \in \mathbb{R},\; \beta \in \mathbb{R}^K) \]
\[ \text{such that} \quad 0 \le \beta, \quad \sum_{k=1}^{K} \beta_k = 1, \]
\[ \sum_{k=1}^{K} \beta_k S_k(\alpha) \ge \theta \quad \forall \alpha \in \mathbb{R}^l \text{ with } 0 \le \alpha \le C\mathbf{1} \text{ and } \sum_{i=1}^{l} \alpha_i y_i = 0, \]

where
\[ S_k(\alpha) := \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j \kappa_k(x_i, x_j) - \sum_{i=1}^{l} \alpha_i. \]
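A minimal sketch of the MKL building blocks: the convex combination of base Gram matrices and the function S_k (toy points and weights are illustrative):

```python
import numpy as np

def combined_gram(beta, grams):
    """Convex combination K = sum_k beta_k K_k of base Gram matrices (the MKL kernel)."""
    beta = np.asarray(beta)
    assert np.all(beta >= 0) and abs(beta.sum() - 1.0) < 1e-12
    return sum(b * K for b, K in zip(beta, grams))

def S_k(alpha, y, K):
    """S_k(alpha) = 1/2 sum_{i,j} alpha_i alpha_j y_i y_j K_ij - sum_i alpha_i."""
    ya = y * alpha
    return 0.5 * ya @ K @ ya - alpha.sum()

X = np.array([[1.0, 0.0], [0.0, 1.0]])               # two toy points
K1 = X @ X.T                                          # linear kernel Gram matrix
K2 = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1))   # Gaussian kernel, sigma = 1
K = combined_gram([0.3, 0.7], [K1, K2])

y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])                          # satisfies sum_i alpha_i y_i = 0
s = S_k(alpha, y, K1)                                 # S_1 evaluated at this alpha
```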
Infinite Kernel Learning: Infinite Programming

Example:
\[ \kappa(x_i, x_j, \omega) := \omega \exp\!\left( -\omega^* \|x_i - x_j\|_2^2 \right) + (1 - \omega)\,(1 + x_i^T x_j)^d \]

\( H(\omega) := \kappa(x_i, x_j, \omega) \) is a homotopy between
\[ H(0) = (1 + x_i^T x_j)^d \quad \text{and} \quad H(1) = \exp\!\left( -\omega^* \|x_i - x_j\|_2^2 \right). \]

Integrating over the kernel parameter leads to infinite programming:
\[ \kappa_\beta(x_i, x_j) := \int_\Omega \kappa(x_i, x_j, \omega)\, d\beta(\omega). \]
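A sketch of this one-parameter kernel family in numpy; the RBF width omega_star and the polynomial degree d are free parameters, fixed here only for illustration:

```python
import numpy as np

def kappa(xi, xj, omega, omega_star=1.0, d=2):
    """omega-blend of an RBF kernel (width omega_star) and a degree-d polynomial kernel."""
    rbf = np.exp(-omega_star * np.sum((xi - xj) ** 2))
    poly = (1.0 + xi @ xj) ** d
    return omega * rbf + (1.0 - omega) * poly      # H(0) = poly, H(1) = rbf

xi, xj = np.array([1.0, 0.0]), np.array([0.0, 1.0])
h0 = kappa(xi, xj, 0.0)    # polynomial end of the homotopy
h1 = kappa(xi, xj, 1.0)    # Gaussian end of the homotopy
```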
Infinite Kernel Learning: Infinite Programming

•  Introducing Riemann-Stieltjes integrals into the problem (SILP MKL), we get the following general problem formulation:

\[ \kappa_\beta(x_i, x_j) = \int_\Omega \kappa(x_i, x_j, \omega)\, d\beta(\omega), \qquad \Omega = [0, 1]. \]
Infinite Kernel Learning: Infinite Programming

(IP)
\[ \max_{\theta, \beta} \; \theta \qquad (\theta \in \mathbb{R},\; \beta : [0,1] \to \mathbb{R} \text{ monotonically increasing}) \]
\[ \text{subject to} \quad \int_0^1 d\beta(\omega) = 1, \]
\[ \int_\Omega \Big( S(\omega, \alpha) - \sum_{i=1}^{l} \alpha_i \Big)\, d\beta(\omega) \ge \theta \quad \forall \alpha \in A, \]

where
\[ S(\omega, \alpha) := \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j \, \kappa(x_i, x_j, \omega), \qquad T(\omega, \alpha) := S(\omega, \alpha) - \sum_{i=1}^{l} \alpha_i, \]
\[ A := \Big\{ \alpha \in \mathbb{R}^l \;\Big|\; 0 \le \alpha \le C\mathbf{1} \text{ and } \sum_{i=1}^{l} \alpha_i y_i = 0 \Big\}. \]
Infinite Kernel Learning: Infinite Programming

(IP)
\[ \max_{\theta, \beta} \; \theta \qquad (\theta \in \mathbb{R},\; \beta \text{ a positive measure on } \Omega) \]
\[ \text{such that} \quad \theta - \int_\Omega T(\omega, \alpha)\, d\beta(\omega) \le 0 \quad \forall \alpha \in A, \qquad \int_\Omega d\beta(\omega) = 1. \]

Dual of (IP), again an infinite programming problem:

(DIP)
\[ \min_{\sigma, \rho} \; \sigma \qquad (\sigma \in \mathbb{R},\; \rho \text{ a positive measure on } A) \]
\[ \text{such that} \quad \sigma - \int_A T(\omega, \alpha)\, d\rho(\alpha) \ge 0 \quad \forall \omega \in \Omega, \qquad \int_A d\rho(\alpha) = 1. \]

•  Duality conditions: Let \( (\theta, \beta) \) and \( (\sigma, \rho) \) be feasible for their respective problems and complementarily slack, i.e.,
   \( \beta \) has measure only where \( \sigma = \int_A T(\omega, \alpha)\, d\rho(\alpha) \), and
   \( \rho \) has measure only where \( \theta = \int_\Omega T(\omega, \alpha)\, d\beta(\omega) \).
   Then both solutions are optimal for their respective problems.
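Once beta and rho are restricted to point masses on finite grids, (IP) and (DIP) collapse to an ordinary LP pair, and weak duality (every feasible theta is bounded by every feasible sigma) can be checked numerically. The table T below is synthetic, standing in for T(omega, alpha) values on a grid:

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(size=(5, 7))   # synthetic values T(omega_m, alpha_n) on a 5 x 7 grid

beta = np.full(5, 1.0 / 5)    # discrete measure on the Omega-grid, total mass 1
rho = np.full(7, 1.0 / 7)     # discrete measure on the A-grid, total mass 1

theta = (beta @ T).min()      # best feasible theta in (IP) for this beta
sigma = (T @ rho).max()       # best feasible sigma in (DIP) for this rho
# weak duality: theta <= sigma for any such feasible pair
```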
Infinite Kernel Learning: Infinite Programming

•  An interesting theoretical problem here is to find conditions which ensure that the solutions are point masses (i.e., the original monotonic \( \beta \) is a step function).

•  Because of this, and in view of the compactness of the feasible (index) sets A and \( \Omega \) at the lower levels, we are interested in the nondegeneracy of the local minima of the lower level problem, in order to get finitely many local minimizers of
\[ g((\sigma, \rho), \omega) := \sigma - \int_A T(\omega, \alpha)\, d\rho(\alpha). \]

•  Lower level problem: For a given parameter \( (\sigma, \rho) \), we consider

(LLP)
\[ \min_{\omega} \; g((\sigma, \rho), \omega) \quad \text{subject to} \quad \omega \in \Omega. \]
Infinite Kernel Learning: Infinite Programming

Key ingredients for the numerical treatment:
•  "reduction ansatz"
•  Implicit Function Theorem
•  parametrical measures

Together, these allow us to pass to "finite optimization".
Infinite Kernel Learning: Infinite Programming

•  "reduction ansatz"
•  Implicit Function Theorem
•  parametrical measures, e.g., with densities

Gaussian:     \( f(\omega; (\mu, \sigma^2)) = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\!\Big( \dfrac{-(\omega - \mu)^2}{2\sigma^2} \Big) \)

Exponential:  \( f(\omega; \lambda) = \begin{cases} \lambda \exp(-\lambda \omega), & \omega \ge 0 \\ 0, & \omega < 0 \end{cases} \)

Uniform:      \( f(\omega; (a, b)) = \dfrac{H(\omega - a) - H(\omega - b)}{b - a} \)

Beta:         \( f(\omega; (\alpha, \beta)) = \dfrac{\omega^{\alpha-1} (1 - \omega)^{\beta-1}}{\int_0^1 u^{\alpha-1} (1 - u)^{\beta-1}\, du} \)

•  "finite optimization"
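The four candidate densities for a parametric measure beta on Omega = [0, 1], as a numpy sketch (the Beta normalization integral is approximated by a midpoint rule):

```python
import numpy as np

def gaussian(w, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return np.exp(-(w - mu) ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))

def exponential(w, lam):
    """Exponential density: lam * exp(-lam * w) for w >= 0, else 0."""
    return np.where(w >= 0, lam * np.exp(-lam * w), 0.0)

def uniform(w, a, b):
    """Uniform density on [a, b) written with Heaviside steps, as on the slide."""
    return (np.heaviside(w - a, 1.0) - np.heaviside(w - b, 1.0)) / (b - a)

def beta_density(w, alpha, beta, n=100000):
    """Beta density; the normalizing integral is computed by a midpoint rule."""
    u = (np.arange(n) + 0.5) / n
    norm = (u ** (alpha - 1) * (1 - u) ** (beta - 1)).mean()
    return w ** (alpha - 1) * (1 - w) ** (beta - 1) / norm
```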
Infinite Kernel Learning: Reduction Ansatz

•  "reduction ansatz"
•  Implicit Function Theorem
•  parametrical measures

\[ g(x, y) \ge 0 \;\; \forall y \in I \;\Longleftrightarrow\; \min_{y \in I} g(x, y) \ge 0 \]

[Figure: graphs of \( g(x, \cdot) \) and \( g(\tilde{x}, \cdot) \) over \( \Omega \), with local minimizers \( y_j, \tilde{y}_j, \dots, y_p \); each \( x \mapsto y_j(x) \) is an implicit function.]
Infinite Kernel Learning: Reduction Ansatz

Based on the reduction ansatz:

\[ \min f(x) \quad \text{subject to} \quad g_j(x) := g(x, y_j(x)) \ge 0 \quad (j \in J := \{1, 2, \dots, p\}). \]

[Figure: graphs of \( g((\sigma, \rho), \cdot) \) with minimizers \( \omega \) and \( \tilde{\omega} = \tilde{\omega}(\sigma, \rho) \) as the parameter \( (\sigma, \rho) \) varies; this motivates the choice of topology.]
Infinite Kernel Learning: Regularization

Regularization penalizes the first (or second) derivative of the cumulative measure:

\[ \min_{\theta, \beta} \; -\theta + \mu \sup_{t \in [0,1]} \frac{d}{dt} \int_0^t d\beta(\omega) \quad \Big( \text{or } \mu \sup_{t \in [0,1]} \frac{d^2}{dt^2} \int_0^t d\beta(\omega) \Big), \quad \text{subject to the constraints.} \]

On a grid \( 0 = t_0 < t_1 < \dots < t_\iota = 1 \), the derivatives are approximated by difference quotients:

\[ \frac{d}{dt} \int_0^{t_\nu} d\beta(\omega) \;\approx\; \frac{ \int_0^{t_{\nu+1}} d\beta(\omega) - \int_0^{t_\nu} d\beta(\omega) }{ t_{\nu+1} - t_\nu } \;=\; \frac{1}{t_{\nu+1} - t_\nu} \int_{t_\nu}^{t_{\nu+1}} d\beta(\omega), \]

\[ \frac{d^2}{dt^2} \int_0^{t_\nu} d\beta(\omega) \;\approx\; \frac{ \dfrac{1}{t_{\nu+2} - t_{\nu+1}} \displaystyle\int_{t_{\nu+1}}^{t_{\nu+2}} d\beta(\omega) \;-\; \dfrac{1}{t_{\nu+1} - t_\nu} \displaystyle\int_{t_\nu}^{t_{\nu+1}} d\beta(\omega) }{ t_{\nu+1} - t_\nu }. \]
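These difference quotients are easy to sketch: store the beta-mass of each grid cell, m[v] for the cell [t_v, t_{v+1}], and the slide's derivative penalties become simple array operations (the grid and masses below are illustrative):

```python
import numpy as np

def diff_quotients(t, m):
    """First/second difference quotients of F(t) = int_0^t d beta, where m[v]
    is the beta-mass of cell [t_v, t_{v+1}]."""
    h = np.diff(t)
    first = m / h                               # d/dt F at t_v, forward difference
    second = (first[1:] - first[:-1]) / h[:-1]  # d^2/dt^2 F at t_v
    return first, second

t = np.linspace(0.0, 1.0, 6)             # uniform grid, h = 0.2
m = np.array([0.1, 0.3, 0.2, 0.2, 0.2])  # cell masses, total mass 1
first, second = diff_quotients(t, m)
penalty = np.abs(first).max()            # sup-type regularization term
```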
Infinite Kernel Learning: Topology

Radon measure: a measure on the \( \sigma \)-algebra of Borel sets of E that is locally finite and inner regular (inner regularity: \( \mu(A) = \sup \{ \mu(K_\nu) \mid K_\nu \subset A,\; K_\nu \text{ compact} \} \)).

(E, d): metric space
\( \mathrm{H}(E) \): set of Radon measures on E

Neighbourhood of a measure \( \rho \), for a continuous bounded function f (an element of the dual space \( (\mathrm{H}(E))' \)):

\[ B_\rho^f(\varepsilon) := \Big\{ \mu \in \mathrm{H}(E) \;\Big|\; \Big| \int_E f\, d\mu - \int_E f\, d\rho \Big| < \varepsilon \Big\}. \]
Infinite Kernel Learning: Topology

Def.: Basis of neighbourhoods of a measure \( \rho \) \( (f_1, \dots, f_n \in (\mathrm{H}(E))';\; \varepsilon > 0) \):

\[ \Big\{ \mu \in \mathrm{H}(E) \;\Big|\; \Big| \int_E f_i\, d\rho - \int_E f_i\, d\mu \Big| < \varepsilon \;\; (i = 1, 2, \dots, n) \Big\}. \]

Def.: Prokhorov metric:

\[ d_0(\mu, \rho) := \inf \{ \varepsilon \ge 0 \mid \mu(A) \le \rho(A^\varepsilon) + \varepsilon \text{ and } \rho(A) \le \mu(A^\varepsilon) + \varepsilon \;\; (A \text{ closed}) \}, \]
where \( A^\varepsilon := \{ x \in E \mid d(x, A) < \varepsilon \} \).

Open \( \delta \)-neighbourhood of a measure \( \rho \):
\[ B_\delta(\rho) := \{ \mu \in \mathrm{H}(E) \mid d_0(\rho, \mu) < \delta \}. \]
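For discrete measures on a small finite support, the Prokhorov distance can be computed by brute force: scan candidate values of epsilon and test the defining inequalities over every subset A of the support (a sketch for tiny supports only; the grid of epsilon candidates limits the precision):

```python
from itertools import combinations
import numpy as np

def prokhorov(support, mu, rho, eps_grid):
    """Smallest eps in eps_grid with mu(A) <= rho(A^eps) + eps and the symmetric
    inequality for all subsets A of the support; A^eps = {x : d(x, A) < eps}."""
    pts = np.asarray(support)
    idx = range(len(pts))
    subsets = [np.array(s) for r in range(1, len(pts) + 1)
               for s in combinations(idx, r)]
    for eps in sorted(eps_grid):
        ok = True
        for s in subsets:
            # membership of each support point in the eps-blow-up A^eps
            blow = np.abs(pts[:, None] - pts[s][None, :]).min(axis=1) < eps
            if mu[s].sum() > rho[blow].sum() + eps or rho[s].sum() > mu[blow].sum() + eps:
                ok = False
                break
        if ok:
            return eps
    return None

support = np.array([0.0, 0.3])
mu = np.array([1.0, 0.0])      # point mass at 0.0
rho = np.array([0.0, 1.0])     # point mass at 0.3
d0 = prokhorov(support, mu, rho, np.linspace(0.0, 1.0, 101))
```

For two point masses at distance 0.3 the distance is 0.3, recovered here up to the grid resolution.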
Infinite Kernel Learning: Numerical Results

[Figures: numerical results.]
References

Özöğür, S., Shawe-Taylor, J., Weber, G.-W., and Ögel, Z.B., Pattern analysis for the prediction of eukaryotic pro-peptide cleavage sites, in the special issue Networks in Computational Biology of Discrete Applied Mathematics 157, 10 (May 2009) 2388-2394.

Özöğür-Akyüz, S., and Weber, G.-W., Infinite kernel learning by infinite and semi-infinite programming, Proceedings of the Second Global Conference on Power Control and Optimization, AIP Conference Proceedings 1159, Bali, Indonesia, 1-3 June 2009, Subseries: Mathematical and Statistical Physics, ISBN 978-0-7354-0696-4 (August 2009) 306-313; Hakim, A.H., Vasant, P., and Barsoum, N., guest eds.

Özöğür-Akyüz, S., and Weber, G.-W., Infinite kernel learning via infinite and semi-infinite programming, to appear in a special issue of Optimization Methods and Software (OMS) on the occasion of the International Conference on Engineering Optimization (EngOpt 2008; Rio de Janeiro, Brazil, June 1-5, 2008), Schittkowski, K. (guest ed.).

Özöğür-Akyüz, S., and Weber, G.-W., On numerical optimization theory of infinite kernel learning, preprint at IAM, METU; submitted to the Journal of Global Optimization (JOGO).

Advanced energy technology for sustainable development. Part 2
 
Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1
 
Fluorescent proteins in current biology
Fluorescent proteins in current biologyFluorescent proteins in current biology
Fluorescent proteins in current biology
 
Neurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functionsNeurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functions
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Basic Intentional Injuries Health Education
Basic Intentional Injuries Health EducationBasic Intentional Injuries Health Education
Basic Intentional Injuries Health Education
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 

Classification Theory

  • 1. 4th International Summer School "Achievements and Applications of Contemporary Informatics, Mathematics and Physics", National University of Technology of the Ukraine, Kiev, Ukraine, August 5-16, 2009. Classification Theory: Modelling of Kernel Machine by Infinite and Semi-Infinite Programming. Süreyya Özöğür-Akyüz, Gerhard-Wilhelm Weber*. Institute of Applied Mathematics, METU, Ankara, Turkey. * Faculty of Economics, Management Science and Law, University of Siegen, Germany; Center for Research on Optimization and Control, University of Aveiro, Portugal. August 7, 2009
  • 2. Motivation: Prediction of Cleavage Sites. (Figure: a protein sequence split into a signal part and a mature part, with margin $\gamma$ at the cleavage site.)
  • 3. Logistic Regression:
    $\log\!\left(\dfrac{P(Y=1 \mid X = x_l)}{P(Y=0 \mid X = x_l)}\right) = \beta_0 + \beta_1 x_{l1} + \beta_2 x_{l2} + \cdots + \beta_p x_{lp} \quad (l = 1, 2, \ldots, N)$.
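The log-odds model above can be fitted by maximizing the likelihood; as a minimal sketch (not the authors' pipeline), here is logistic regression by gradient descent on synthetic data, where the data, learning rate, and iteration count are all illustrative choices:

```python
import numpy as np

# Toy data: N = 100 samples with p = 2 features, binary labels y in {0, 1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

# Design matrix with an intercept column for beta_0.
Xb = np.hstack([np.ones((X.shape[0], 1)), X])
beta = np.zeros(Xb.shape[1])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient descent on the negative log-likelihood.
for _ in range(2000):
    p = sigmoid(Xb @ beta)           # P(Y = 1 | X = x_l) under current beta
    grad = Xb.T @ (p - y) / len(y)   # gradient of the mean negative log-likelihood
    beta -= 0.5 * grad

# The fitted log-odds log(p / (1 - p)) are exactly Xb @ beta.
```

Thresholding `sigmoid(Xb @ beta)` at 0.5 then gives the class prediction.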
  • 4. Linear Classifiers. Maximum margin classifier: $\gamma_i := y_i (\langle w, x_i \rangle + b)$. Note: $\gamma_i > 0$ implies correct classification. (Figure: margin $\gamma$ between the support hyperplanes $y_k (\langle w, x_k \rangle + b) = 1$ and $y_j (\langle w, x_j \rangle + b) = 1$.)
  • 5. Linear Classifiers. The geometric margin: $\gamma = \dfrac{2}{\|w\|_2}$. Maximizing the margin, $\max_{w,b} \dfrac{2}{\|w\|_2}$, is equivalent to the convex problem
    $\min_{w,b} \dfrac{\|w\|_2^2}{2}$ subject to $y_i (\langle w, x_i \rangle + b) \ge 1 \quad (i = 1, 2, \ldots, l)$.
  • 6. Linear Classifiers. Dual problem:
    $\max_\alpha \sum_{i=1}^{l} \alpha_i - \dfrac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle$
    subject to $\sum_{i=1}^{l} y_i \alpha_i = 0, \; \alpha_i \ge 0 \; (i = 1, 2, \ldots, l)$.
  • 7. Linear Classifiers. Dual problem with a kernel function:
    $\max_\alpha \sum_{i=1}^{l} \alpha_i - \dfrac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j \, \kappa(x_i, x_j)$
    subject to $\sum_{i=1}^{l} y_i \alpha_i = 0, \; \alpha_i \ge 0 \; (i = 1, 2, \ldots, l)$.
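The kernelized dual can be solved numerically in many ways; as a rough sketch (not the solver used in the talk), projected gradient ascent on a symmetric two-point toy problem, alternating a projection onto $\sum_i y_i \alpha_i = 0$ with clipping to $\alpha \ge 0$ (adequate here, though not an exact projection onto the intersection in general):

```python
import numpy as np

# Two separable points with opposite labels; linear kernel kappa(x_i, x_j) = <x_i, x_j>.
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])
K = X @ X.T

Q = np.outer(y, y) * K           # Hessian of the quadratic term
alpha = np.zeros(2)
for _ in range(500):
    grad = 1.0 - Q @ alpha       # gradient of sum(alpha) - 0.5 * alpha' Q alpha
    alpha = np.maximum(alpha + 0.1 * grad, 0.0)   # ascent step, then clip to alpha >= 0
    alpha -= y * (y @ alpha) / (y @ y)            # project onto sum(y_i alpha_i) = 0
    alpha = np.maximum(alpha, 0.0)

# Recover the primal weight vector w = sum_i alpha_i y_i x_i.
w = ((alpha * y)[:, None] * X).sum(axis=0)
```

For this toy problem the optimum is $\alpha = (1/4, 1/4)$, giving $w = (1/2, 1/2)$ and unit functional margin on both points.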
  • 8. Linear Classifiers. Soft margin classifier: introduce slack variables to allow the margin constraints to be violated:
    $\min_{\xi, w, b} \dfrac{\|w\|^2}{2} + C \sum_{i=1}^{l} \xi_i^2$
    subject to $y_i (\langle w, x_i \rangle + b) \ge 1 - \xi_i, \; \xi_i \ge 0 \; (i = 1, 2, \ldots, l)$.
  • 9. Linear Classifiers. Projection of the data into a higher dimensional feature space: mapping the input space $X$ into a new space $F$,
    $x = (x_1, \ldots, x_n) \mapsto \phi(x) = (\phi_1(x), \ldots, \phi_N(x))$. (Figure: images $\phi(x)$, $\phi(0)$ of the two classes in feature space.)
  • 10. Nonlinear Classifiers. Set of hypotheses: $f(x) = \sum_{i=1}^{N} w_i \phi_i(x) + b$; dual representation: $f(x) = \sum_{i=1}^{l} \alpha_i y_i \langle \phi(x_i), \phi(x) \rangle + b$, where the inner product is the kernel function. Examples: polynomial kernel $\kappa(x, z) = (1 + x^T z)^k$; sigmoid kernel $\kappa(x, z) = \tanh(a x^T z + b)$; Gaussian (RBF) kernel $\kappa(x, z) = \exp(-\|x - z\|^2 / \sigma^2)$.
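The three example kernels are one-liners in numpy; the parameter values $k$, $a$, $b$, $\sigma$ below are illustrative, not ones from the talk:

```python
import numpy as np

def poly_kernel(x, z, k=3):
    """Polynomial kernel (1 + x'z)^k."""
    return (1.0 + x @ z) ** k

def sigmoid_kernel(x, z, a=0.5, b=-1.0):
    """Sigmoid kernel tanh(a x'z + b)."""
    return np.tanh(a * (x @ z) + b)

def rbf_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / sigma^2)."""
    return np.exp(-np.linalg.norm(x - z) ** 2 / sigma ** 2)

x = np.array([1.0, 0.0])
z = np.array([0.0, 1.0])
print(poly_kernel(x, z))   # x'z = 0, so (1 + 0)^3 = 1.0
print(rbf_kernel(x, x))    # exp(0) = 1.0: any RBF kernel of a point with itself
```

Note the RBF kernel always equals 1 on the diagonal, while the sigmoid kernel is not positive semidefinite for all parameter choices.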
  • 11. (In-) Finite Kernel Learning. Based on the motivation of multiple kernel learning (MKL):
    $\kappa(x_i, x_j) = \sum_{k=1}^{K} \beta_k \, \kappa_k(x_i, x_j)$, with kernel functions $\kappa_k(\cdot, \cdot)$, $\beta_k \ge 0 \; (k = 1, \ldots, K)$, $\sum_{k=1}^{K} \beta_k = 1$.
    Semi-infinite LP formulation (SILP MKL):
    $\max_{\theta, \beta} \theta \quad (\theta \in \mathbb{R}, \; \beta \in \mathbb{R}^K)$
    such that $0 \le \beta$, $\sum_{k=1}^{K} \beta_k = 1$, and $\sum_{k=1}^{K} \beta_k S_k(\alpha) \ge \theta$ for all $\alpha \in \mathbb{R}^l$ with $0 \le \alpha \le C\mathbf{1}$ and $\sum_{i=1}^{l} \alpha_i y_i = 0$, where
    $S_k(\alpha) := \dfrac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j \kappa_k(x_i, x_j) - \sum_{i=1}^{l} \alpha_i$.
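The MKL combination and the SILP quantity $S_k(\alpha)$ are easy to compute once the base Gram matrices are fixed; a minimal sketch, where the data, the two base kernels, and the weights $\beta$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))
y = np.sign(rng.normal(size=10))

# Two base Gram matrices: linear and Gaussian (bandwidth 1, an arbitrary choice).
K1 = X @ X.T
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K2 = np.exp(-sq)

beta = np.array([0.3, 0.7])        # beta_k >= 0, sum_k beta_k = 1
K = beta[0] * K1 + beta[1] * K2    # the combined MKL kernel

def S_k(alpha, y, Kk):
    """The SILP quantity S_k(alpha) = 0.5 * alpha'(yy' * Kk)alpha - sum(alpha)."""
    return 0.5 * alpha @ ((np.outer(y, y) * Kk) @ alpha) - alpha.sum()

# A convex combination of positive semidefinite Gram matrices stays PSD:
assert np.linalg.eigvalsh(K).min() > -1e-8
```

Because $S_k$ is affine in the Gram matrix and the weights sum to one, $\sum_k \beta_k S_k(\alpha)$ equals $S$ evaluated on the combined kernel, which is what makes the SILP constraint linear in $\beta$.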
  • 12. Infinite Kernel Learning — Infinite Programming. Example:
    $\kappa(x_i, x_j, \omega) := \omega \exp\!\left(\dfrac{-\omega^* \|x_i - x_j\|^2}{2}\right) + (1 - \omega)(1 + x_i^T x_j)^d$,
    a homotopy $H(\omega) := \kappa(x_i, x_j, \omega)$ with
    $H(0) = (1 + x_i^T x_j)^d$ and $H(1) = \exp\!\left(\dfrac{-\omega^* \|x_i - x_j\|^2}{2}\right)$.
    $\kappa_\beta(x_i, x_j) := \int_\Omega \kappa(x_i, x_j, \omega) \, d\beta(\omega)$ — infinite programming.
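A sketch of this homotopy kernel; the placement of the fixed Gaussian width $\omega^*$ and the degree $d$ are reconstructed from the slide, so treat them as assumptions:

```python
import numpy as np

def homotopy_kernel(x, z, omega, omega_star=1.0, d=2):
    """H(omega): polynomial kernel at omega = 0, Gaussian kernel at omega = 1."""
    gauss = np.exp(-omega_star * np.linalg.norm(x - z) ** 2 / 2.0)
    poly = (1.0 + x @ z) ** d
    return omega * gauss + (1.0 - omega) * poly

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])

# kappa_beta integrates over omega; for a discrete (step-function) measure beta,
# the Riemann-Stieltjes integral collapses to a finite weighted sum.
omegas = np.array([0.0, 0.5, 1.0])
weights = np.array([0.2, 0.5, 0.3])          # masses of beta, summing to 1
kappa_beta = sum(w * homotopy_kernel(x, z, o) for o, w in zip(omegas, weights))
```

The discrete case is exactly the point-mass situation discussed on slide 16: when $\beta$ is a step function, infinite kernel learning reduces to finite MKL over the kernels $\kappa(\cdot, \cdot, \omega_k)$.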
  • 13. Infinite Kernel Learning — Infinite Programming. Introducing Riemann–Stieltjes integrals into the problem (SILP-MKL), we get the following general problem formulation:
    $\kappa_\beta(x_i, x_j) = \int_\Omega \kappa(x_i, x_j, \omega) \, d\beta(\omega), \quad \Omega = [0, 1]$.
  • 14. Infinite Kernel Learning — Infinite Programming.
    (IP) $\max_{\theta, \beta} \theta \quad (\theta \in \mathbb{R}, \; \beta : [0,1] \to \mathbb{R}$ monotonically increasing$)$
    subject to $\int_0^1 d\beta(\omega) = 1$ and
    $\int_\Omega \left[\dfrac{1}{2} S(\omega, \alpha) - \sum_{i=1}^{l} \alpha_i\right] d\beta(\omega) \ge \theta$ for all $\alpha \in \mathbb{R}^l$ with $0 \le \alpha \le C\mathbf{1}$, $\sum_{i=1}^{l} \alpha_i y_i = 0$, where
    $S(\omega, \alpha) := \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j \kappa(x_i, x_j, \omega)$,
    $T(\omega, \alpha) := \dfrac{1}{2} S(\omega, \alpha) - \sum_{i=1}^{l} \alpha_i$,
    $A := \left\{\alpha \in \mathbb{R}^l \;\middle|\; 0 \le \alpha \le C\mathbf{1} \text{ and } \sum_{i=1}^{l} \alpha_i y_i = 0\right\}$.
  • 15. Infinite Kernel Learning — Infinite Programming.
    (IP) $\max_{\theta, \beta} \theta \quad (\theta \in \mathbb{R}, \; \beta$ a positive measure on $\Omega)$
    such that $\theta - \int_\Omega T(\omega, \alpha) \, d\beta(\omega) \le 0$ for all $\alpha \in A$, and $\int_\Omega d\beta(\omega) = 1$.
    Infinite programming dual of (IP):
    (DIP) $\min_{\sigma, \rho} \sigma \quad (\sigma \in \mathbb{R}, \; \rho$ a positive measure on $A)$
    such that $\sigma - \int_A T(\omega, \alpha) \, d\rho(\alpha) \ge 0$ for all $\omega \in \Omega$, and $\int_A d\rho(\alpha) = 1$.
    Duality conditions: let $(\theta, \beta)$ and $(\sigma, \rho)$ be feasible for their respective problems and complementary slack, so that $\beta$ has measure only where $\sigma = \int_A T(\omega, \alpha) \, d\rho$, and $\rho$ has measure only where $\theta = \int_\Omega T(\omega, \alpha) \, d\beta$. Then both solutions are optimal for their respective problems.
  • 16. Infinite Kernel Learning — Infinite Programming. The interesting theoretical problem here is to find conditions which ensure that solutions are point masses (i.e., the original monotonic $\beta$ is a step function). Because of this, and in view of the compactness of the feasible (index) sets at the lower levels, $A$ and $\Omega$, we are interested in the nondegeneracy of the local minima of the lower level problem, so as to obtain finitely many local minimizers of
    $g((\sigma, \rho), \omega) := \sigma - \int_A T(\omega, \alpha) \, d\rho(\alpha)$.
    Lower level problem: for a given parameter $(\sigma, \rho)$, we consider
    (LLP) $\min_\omega g((\sigma, \rho), \omega)$ subject to $\omega \in \Omega$.
  • 17. Infinite Kernel Learning — Infinite Programming. Ingredients: "reduction ansatz" and Implicit Function Theorem; parametrical measures; "finite optimization".
  • 18. Infinite Kernel Learning — Infinite Programming. Parametrical measures, e.g.:
    Gaussian: $f(\omega; (\mu, \sigma)) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(\dfrac{-(\omega - \mu)^2}{2\sigma^2}\right)$;
    exponential: $f(\omega; \lambda) = \lambda \exp(-\lambda\omega)$ for $\omega \ge 0$, and $0$ for $\omega < 0$;
    uniform: $f(\omega; (a, b)) = \dfrac{H(\omega - a) - H(\omega - b)}{b - a}$;
    Beta: $f(\omega; (\alpha, \beta)) = \dfrac{\omega^{\alpha-1}(1-\omega)^{\beta-1}}{\int_0^1 u^{\alpha-1}(1-u)^{\beta-1} \, du}$.
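As a quick numerical check that these parametrizations are densities on $\Omega = [0,1]$, here is a sketch for the uniform and Beta families (parameter values are illustrative; the Gaussian and exponential would need truncation to $[0,1]$ and renormalization):

```python
import numpy as np

# Fine grid on Omega = [0, 1] for a crude quadrature of the densities.
w = np.linspace(0.0, 1.0, 100001)
dw = w[1] - w[0]

# Uniform density via Heaviside steps: (H(w - a) - H(w - b)) / (b - a), a=0.2, b=0.8.
uniform = np.where((w >= 0.2) & (w < 0.8), 1.0 / 0.6, 0.0)

# Beta density with (alpha, beta) = (2, 3), normalized by its own quadrature sum.
a_, b_ = 2.0, 3.0
beta_pdf = w ** (a_ - 1) * (1 - w) ** (b_ - 1)
beta_pdf /= beta_pdf.sum() * dw
```

Both families assign total mass one to $[0,1]$, so each can serve as a candidate $d\beta(\omega)$ in the Riemann-Stieltjes integral defining $\kappa_\beta$.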
  • 19. Infinite Kernel Learning — Reduction Ansatz. (Figure: graph of $g(x, \cdot)$ over $\Omega$ with finitely many global minimizers $y_j, \ldots, y_p$.)
    $g(x, y) \ge 0 \;\; \forall y \in I \iff \min_{y \in I} g(x, y) \ge 0$; $x \mapsto y_j(x)$ is an implicit function.
  • 20. Infinite Kernel Learning — Reduction Ansatz. Based on the reduction ansatz:
    $\min f(x)$ subject to $g_j(x) := g(x, y_j(x)) \ge 0 \quad (j \in J := \{1, 2, \ldots, p\})$.
    (Figure: graphs of $g((\sigma, \rho), \cdot)$ for nearby parameters; the minimizer $\tilde{\omega} = \omega(\sigma, \rho)$ varies with $(\sigma, \rho)$ in a suitable topology.)
  • 21. Infinite Kernel Learning — Regularization.
    $\min_{\theta, \beta} \; -\theta + \sup_{t \in [0,1]} \mu \left[ \dfrac{d}{dt} \int_0^t d\beta(\omega), \; \dfrac{d^2}{dt^2} \int_0^t d\beta(\omega) \right]$ subject to the constraints,
    discretized on a grid $0 = t_0 < t_1 < \cdots < t_\iota = 1$ via divided differences:
    $\dfrac{d}{dt} \int_0^t d\beta(\omega) \approx \dfrac{\int_0^{t_{\nu+1}} d\beta(\omega) - \int_0^{t_\nu} d\beta(\omega)}{t_{\nu+1} - t_\nu}$,
    $\dfrac{d^2}{dt^2} \int_0^t d\beta(\omega) \approx \dfrac{\dfrac{1}{t_{\nu+2} - t_{\nu+1}} \int_{t_{\nu+1}}^{t_{\nu+2}} d\beta(\omega) - \dfrac{1}{t_{\nu+1} - t_\nu} \int_{t_\nu}^{t_{\nu+1}} d\beta(\omega)}{t_{\nu+1} - t_\nu}$.
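The divided differences above are two lines of numpy; a minimal sketch on a uniform grid with an illustrative monotone $\beta$ (here $B(t) = \int_0^t d\beta = t^2$):

```python
import numpy as np

# Grid 0 = t_0 < t_1 < ... < t_iota = 1 and a cumulative measure B(t) = t^2.
t = np.linspace(0.0, 1.0, 11)
B = t ** 2

# First divided difference: (B(t_{v+1}) - B(t_v)) / (t_{v+1} - t_v).
first = np.diff(B) / np.diff(t)

# Second divided difference: difference of consecutive slopes over the step size.
second = np.diff(first) / np.diff(t)[1:]

# For B(t) = t^2 we expect B' = 2t and B'' = 2, so `second` should be constant 2.
```

In the regularized problem these differences replace the derivatives of $t \mapsto \int_0^t d\beta(\omega)$, penalizing rapid changes of the measure $\beta$ with weight $\mu$.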
  • 22. Infinite Kernel Learning — Topology. Radon measure: a measure on the $\sigma$-algebra of Borel sets of $E$ that is locally finite and inner regular, where $(E, d)$ is a metric space and inner regularity is taken with respect to compact sets $K_\nu \subset E$. $\mathrm{H}(E)$: the set of Radon measures on $E$. Neighbourhood of a measure $\rho$:
    $B_\rho(\varepsilon) := \left\{ \mu \in \mathrm{H}(E) \;\middle|\; \left| \int_A f \, d\mu - \int_A f \, d\rho \right| < \varepsilon \right\}$,
    with $f$ taken from the dual space $(\mathrm{H}(E))'$ of continuous bounded functions, $f \in (\mathrm{H}(E))'$.
  • 23. Infinite Kernel Learning — Topology.
    Def.: Basis of neighbourhoods of a measure $\rho$ $(f_1, \ldots, f_n \in (\mathrm{H}(E))'; \; \varepsilon > 0)$:
    $\left\{ \mu \in \mathrm{H}(E) \;\middle|\; \left| \int_E f_i \, d\rho - \int_E f_i \, d\mu \right| < \varepsilon \; (i = 1, 2, \ldots, n) \right\}$.
    Def.: Prokhorov metric:
    $d_0(\mu, \rho) := \inf \left\{ \varepsilon \ge 0 \;\middle|\; \mu(A) \le \rho(A^\varepsilon) + \varepsilon \text{ and } \rho(A) \le \mu(A^\varepsilon) + \varepsilon \; (A \text{ closed}) \right\}$,
    where $A^\varepsilon := \{ x \in E \mid d(x, A) < \varepsilon \}$.
    Open $\delta$-neighbourhood of a measure $\rho$: $B_\delta(\rho) := \{ \mu \in \mathrm{H}(E) \mid d_0(\rho, \mu) < \delta \}$.
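For discrete probability measures on a small finite support, the Prokhorov metric can be evaluated directly from its definition by checking the two inequalities over all subsets $A$ of the support and bisecting on $\varepsilon$; a brute-force sketch (exponential in the support size, so for tiny examples only):

```python
import itertools

def prokhorov(pts1, w1, pts2, w2, tol=1e-6):
    """Prokhorov distance between two discrete probability measures on the line."""
    pts = sorted(set(pts1) | set(pts2))
    mu = {p: 0.0 for p in pts}
    rho = {p: 0.0 for p in pts}
    for p, w in zip(pts1, w1):
        mu[p] += w
    for p, w in zip(pts2, w2):
        rho[p] += w

    def ok(eps):
        # Check mu(A) <= rho(A^eps) + eps and rho(A) <= mu(A^eps) + eps
        # for every nonempty subset A of the joint support.
        for r in range(1, len(pts) + 1):
            for A in itertools.combinations(pts, r):
                Aeps = [q for q in pts if min(abs(q - a) for a in A) < eps + tol]
                mA, rA = sum(mu[a] for a in A), sum(rho[a] for a in A)
                mAe, rAe = sum(mu[q] for q in Aeps), sum(rho[q] for q in Aeps)
                if mA > rAe + eps + tol or rA > mAe + eps + tol:
                    return False
        return True

    lo, hi = 0.0, 1.0          # eps = 1 always works for probability measures
    for _ in range(40):        # bisect to the infimum
        mid = 0.5 * (lo + hi)
        if ok(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For two point masses $\delta_0$ and $\delta_{0.3}$ the distance is $0.3$: the mass must be moved by $0.3$, so the defining inequalities first hold at $\varepsilon = 0.3$.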
  • 24. Infinite Kernel Learning — Numerical Results. (Figure.)
  • 25. References.
    Özöğür, S., Shawe-Taylor, J., Weber, G.-W., and Ögel, Z.B., Pattern analysis for the prediction of eukoryatic pro peptide cleavage sites, in the special issue "Networks in Computational Biology" of Discrete Applied Mathematics 157, 10 (May 2009) 2388-2394.
    Özöğür-Akyüz, S., and Weber, G.-W., Infinite kernel learning by infinite and semi-infinite programming, Proceedings of the Second Global Conference on Power Control and Optimization, AIP Conference Proceedings 1159, Bali, Indonesia, 1-3 June 2009; Subseries: Mathematical and Statistical Physics; ISBN 978-0-7354-0696-4 (August 2009) 306-313; Hakim, A.H., Vasant, P., and Barsoum, N., guest eds.
    Özöğür-Akyüz, S., and Weber, G.-W., Infinite kernel learning via infinite and semi-infinite programming, to appear in the special issue of OMS (Optimization Methods and Software) on the occasion of the International Conference on Engineering Optimization (EngOpt 2008; Rio de Janeiro, Brazil, June 1-5, 2008), Schittkowski, K. (guest ed.).
    Özöğür-Akyüz, S., and Weber, G.-W., On numerical optimization theory of infinite kernel learning, preprint at IAM, METU; submitted to JOGO (Journal of Global Optimization).