Detection Tracking and Recognition of Human Poses
            for a Real Time Spatial Game

Feifei Huo, Emile Hendriks, A.H.J. Oomes, Pascal van Beek, Remco Veltkamp




                Presenter: Feifei Huo
                Information and Communication Theory (ICT) Group
                Delft University of Technology




                                                                   June 16, 2009
Outline:
•   Introduction to visual analysis system
•   People detection, tracking and pose recognition system
     – Human body detection and body parts segmentation
     – Feature points representation and tracking
     – Pose recognition

•   Experimental results and conclusion
•   Spatial game application and future work
Introduction to Visual Analysis System

1. virtual reality
2. smart environment systems
3. sports video indexing
4. advanced user interfaces




                               Video-based
                               applications




                               Pose-Driven
                               Spatial Game
Pose-Driven Spatial Game
The state of the art:
 • combining bottom-up and top-down approaches.
 • incorporating appearance, kinematic, temporal constraints, etc.


The proposed system:
 • real time system
 • a variety of poses
 • spatial game control




              Fig.1. The flowchart of the proposed system
People Detection, Tracking and Pose Recognition System


    Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

    People Detection: Initial Frame → Whole Human Blob Detection → Different Body Parts Segmentation
Methodology
•   Background subtraction
     – Mixture of Gaussian
•   Head and torso detection and tracking
     – 2D upper-body model with two regions, F and B, of equal area: Area(F) = Area(B)

    Fig.2. (a) Foreground binary image of the initial frame, (b) 2D upper-body model for
    human torso detection and tracking.
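
As a rough illustration of the background subtraction step, the sketch below keeps one running Gaussian (mean and variance) per pixel and marks a pixel as foreground when it lies more than `k` standard deviations from the mean. This is a deliberate simplification: the slides name a mixture of Gaussians, which keeps several such components per pixel. The function name and the parameters `alpha`, `k` and `min_var` are assumptions made for this example, not values from the presentation.

```python
def update_background(mean, var, frame, alpha=0.05, k=2.5, min_var=4.0):
    """Classify pixels and update a per-pixel running Gaussian in place.

    mean, var and frame are equally sized 2D lists of grayscale values.
    Returns a binary foreground mask (1 = foreground, 0 = background).
    Simplification: a single Gaussian per pixel, not a full mixture.
    """
    mask = []
    for y, row in enumerate(frame):
        mask_row = []
        for x, value in enumerate(row):
            diff = value - mean[y][x]
            is_foreground = diff * diff > (k * k) * var[y][x]
            mask_row.append(1 if is_foreground else 0)
            if not is_foreground:
                # Only pixels judged to be background adapt the model.
                mean[y][x] += alpha * diff
                var[y][x] = max(min_var, (1 - alpha) * var[y][x] + alpha * diff * diff)
        mask.append(mask_row)
    return mask


# Toy 3x4 frame: a flat background around 10 with two bright pixels.
mean = [[10.0] * 4 for _ in range(3)]
var = [[4.0] * 4 for _ in range(3)]
frame = [[10, 11, 10, 9], [10, 200, 210, 10], [9, 10, 11, 10]]
mask = update_background(mean, var, frame)
```

The resulting binary mask plays the role of the foreground image in Fig.2(a).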
Particle Filtering
•   Sample set: $\{ s^{(n)},\ n = 1, 2, 3, \ldots, N \}$
•   Observation: $P(B \mid A = s^{(n)})$
•   Weight set: $\{ \pi^{(n)},\ n = 1, 2, 3, \ldots, N \}$
People detection and tracking
• A sample set $\{ s^{(n)}, \pi^{(n)},\ n = 1, 2, \ldots, N \}$ is generated with an initial
  distribution $s^{(n)} = p^{(n)} = ( x^{(n)}, y^{(n)}, \mathrm{scale}^{(n)} )$.

• Then the observation step takes place.

$$
P(B \mid A = s^{(n)}) = \omega^{(n)} = \frac{1}{\mathrm{Area}(F^{(n)})} \times
\begin{cases}
\sum F^{(n)} - \sum B^{(n)}, & \text{if } \sum F^{(n)} > \sum B^{(n)} \\
0, & \text{otherwise}
\end{cases}
$$
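
The observation for a single particle can be sketched as follows, assuming the foreground image is a binary mask and that the model regions F and B are axis-aligned rectangles. The rectangle layout (one F box flanked by two B boxes of the same total area) and the helper names are hypothetical; the slides only state that Area(F) = Area(B).

```python
def region_sum(mask, box):
    """Count foreground pixels in a half-open box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))


def box_area(box):
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)


def observation_weight(mask, f_box, b_boxes):
    """omega = (sum F - sum B) / Area(F) if sum F > sum B, else 0."""
    sum_f = region_sum(mask, f_box)
    sum_b = sum(region_sum(mask, b) for b in b_boxes)
    if sum_f > sum_b:
        return (sum_f - sum_b) / box_area(f_box)
    return 0.0


# Toy mask: a 2-pixel-wide "torso" in columns 2 and 3 of a 4x6 image.
mask = [[0, 0, 1, 1, 0, 0] for _ in range(4)]

# Particle centred on the torso: F covers it, B covers the empty flanks.
aligned = observation_weight(mask, (2, 0, 4, 4), [(1, 0, 2, 4), (4, 0, 5, 4)])

# Particle shifted right by one column: F and B each see half the torso.
shifted = observation_weight(mask, (3, 0, 5, 4), [(2, 0, 3, 4), (5, 0, 6, 4)])
```

In this toy case the well-aligned particle scores higher than the shifted one, which is the behaviour the observation formula is meant to produce.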
People detection and tracking

•   This observation is updated by taking
    the prior weight into account.

$$
\omega_t^{(n)} = \omega^{(n)} \times \frac{\pi_{t-1}^{(n)}}{\sum_{n=1}^{N} \pi_{t-1}^{(n)}}
$$

•   The normalized observation forms a
    new set of particle weights.

$$
\pi_t^{(n)} = \frac{\omega_t^{(n)}}{\sum_{n=1}^{N} \omega_t^{(n)}}
$$

Fig.3. 2D upper-body model for human torso detection and tracking.
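
The two weight formulas on this slide translate almost directly into code. The sketch below is a minimal version, assuming the observations and prior weights are plain lists; the fallback to uniform weights when every particle scores zero is an assumption added for this example and is not described in the slides.

```python
def update_weights(omega, prior):
    """Combine observations with prior weights, then normalize.

    omega: observation values omega^(n) for the current frame.
    prior: particle weights pi_{t-1}^(n) from the previous frame.
    Returns the new weights pi_t^(n), which sum to 1.
    """
    total_prior = sum(prior)
    omega_t = [w * p / total_prior for w, p in zip(omega, prior)]
    total = sum(omega_t)
    if total == 0:
        # Assumption: fall back to uniform weights if no particle matches.
        return [1.0 / len(omega)] * len(omega)
    return [w / total for w in omega_t]


new_weights = update_weights([1.0, 0.5, 0.0], [0.25, 0.25, 0.5])
```

Here the third particle had the largest prior weight, but its zero observation removes it from the new weight set.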
Methodology

•   Hand detection and tracking
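    – Foreground pixels are segmented into skin-color and non-skin-color
      regions. A pixel with color components (R, G, B) is treated as skin
      color if

      $\left| \arctan\left(\frac{B}{R}\right) - \frac{\pi}{4} \right| < \frac{\pi}{8}$,  $\left| \arctan\left(\frac{G}{R}\right) - \frac{\pi}{6} \right| < \frac{\pi}{18}$,  $\left| \arctan\left(\frac{B}{G}\right) - \frac{\pi}{5} \right| < \frac{\pi}{15}$


    – The face is excluded from the candidate hand regions by using
      the size of the connected skin-color area.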
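
A per-pixel version of the skin-color test could look like the sketch below. It reads the three conditions as absolute-value bounds on the arctangent of each color ratio. Treating pixels with a zero red or green component as non-skin is an assumption made here to avoid division by zero; the function name is also hypothetical.

```python
import math


def is_skin(r, g, b):
    """Apply the three arctan ratio tests to one RGB pixel."""
    if r <= 0 or g <= 0:
        # Assumption: the ratios are undefined, so reject the pixel.
        return False
    return (
        abs(math.atan(b / r) - math.pi / 4) < math.pi / 8
        and abs(math.atan(g / r) - math.pi / 6) < math.pi / 18
        and abs(math.atan(b / g) - math.pi / 5) < math.pi / 15
    )


skin_like = is_skin(200, 130, 100)   # reddish tone
white = is_skin(255, 255, 255)       # fails the G/R test
green = is_skin(0, 255, 0)           # rejected by the zero check
```

Working on ratios rather than raw intensities makes the test less sensitive to overall brightness, which is presumably why the rule is expressed this way.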
People Detection, Tracking and Pose Recognition System


    Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

    People Tracking: Multiple Views → Feature Points Location
                     Subsequent Video Frames → Feature Points Tracking
Torso and Hand Segmentation




    Fig.4. Results of torso and hand segmentation
3D Reconstruction
•   Three synchronized cameras are used.
     – One front view
     – Two side views


•   The 3D positions of torso and hands can be obtained.




                     Fig.5. Multiple camera settings
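
The slides do not say how the 3D positions are computed from the views. One common option is midpoint triangulation: each camera defines a ray through the detected feature point, and the 3D estimate is the midpoint of the shortest segment between two rays. The sketch below shows that technique for two cameras; the camera positions and ray directions are made-up values, and a calibrated setup would derive them from the camera parameters.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Hypothetical front camera at (0, 0, -5) and side camera at (-5, 0, 0),
# both with rays passing through the same hand position.
hand = triangulate_midpoint((0, 0, -5), (1, 2, 5), (-5, 0, 0), (6, 2, 0))
```

With three cameras, the same idea can be applied to each pair of views and the results averaged, though that is one possible design rather than the method in the slides.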
People Detection, Tracking and Pose Recognition System


    Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

    Pose Recognition stage (flowchart boxes): Construction; Predefined Key Poses; Classifier; Pose Recognition
Pose Recognition

•   Feature space construction
     – 2D and 3D positions of the torso center and the hands
     – normalized feature space
     – relative positions between hands and torso center
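
A feature vector built from relative hand positions might be assembled as in the sketch below. The slides do not give the normalization, so dividing by a single `scale` value (for example a torso size estimate) is an assumption, as are the function name and the ordering of the features.

```python
def pose_features(torso, left_hand, right_hand, scale):
    """Hand positions relative to the torso center, divided by a scale.

    Each position is a tuple of coordinates (2D or 3D). The result is a
    flat list: left-hand offsets followed by right-hand offsets.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    features = []
    for hand in (left_hand, right_hand):
        features.extend((h - t) / scale for h, t in zip(hand, torso))
    return features


features = pose_features(
    torso=(1.0, 1.2, 2.0),
    left_hand=(0.5, 1.7, 2.0),
    right_hand=(1.5, 1.7, 2.1),
    scale=0.5,
)
```

Subtracting the torso center removes the dependence on where the person stands, and the scale factor reduces the dependence on body size and distance to the camera.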
Predefined Key Poses

                         Pose Classification
                       • 9 poses into 9 classes
                       • 15 persons
                       • 1515 samples in total
Results and Discussion
Cross-validation results of pose classifiers (mean error, standard deviation in parentheses)

      Method        LOPO                                FORO
                    mean pose err.    max pose err.     mean pose err.    max pose err.
      NMC           0.06(0.09)        0.18(0.35)        0.04(0.02)        0.09(0.10)
      LDC           0.06(0.07)        0.14(0.35)        0.01(0.01)        0.04(0.05)
      QDC           0.10(0.11)        0.23(0.34)        0.01(0.01)        0.04(0.06)
      LDA+QDC       0.07(0.09)        0.16(0.35)        0.02(0.01)        0.04(0.06)
      Parzen        0.07(0.09)        0.16(0.35)        0.01(0.01)        0.02(0.04)
      LDA+Parzen    0.06(0.07)        0.14(0.35)        0.00(0.00)        0.01(0.03)


Conclusion: the simplest method (NMC) provides comparable
performance to more complex classifiers.
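
For readers unfamiliar with the nearest mean classifier (NMC), the sketch below shows the idea: store one mean feature vector per pose and assign a new sample to the closest mean. The evaluation loop assumes that LOPO stands for leave-one-person-out, which the slides do not spell out, and it reports a plain mean error over held-out persons rather than the per-pose statistics in the table. The toy data is invented.

```python
def fit_nmc(samples, labels):
    """Return a dict mapping each label to the mean of its samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict_nmc(means, x):
    """Label of the class mean closest to x (squared Euclidean distance)."""
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda y: sq_dist(means[y]))


def leave_one_person_out_error(samples, labels, persons):
    """Mean error rate when each person in turn is held out for testing."""
    rates = []
    for held_out in sorted(set(persons)):
        rows = list(zip(samples, labels, persons))
        train = [(x, y) for x, y, p in rows if p != held_out]
        test = [(x, y) for x, y, p in rows if p == held_out]
        means = fit_nmc([x for x, _ in train], [y for _, y in train])
        wrong = sum(predict_nmc(means, x) != y for x, y in test)
        rates.append(wrong / len(test))
    return sum(rates) / len(rates)


# Invented toy data: three persons, two well-separated poses.
samples = [[-1.0, 1.0], [1.0, 1.0], [-1.2, 0.9], [0.9, 1.1], [-0.8, 1.0], [1.1, 0.8]]
labels = ["P1", "P2", "P1", "P2", "P1", "P2"]
persons = ["A", "A", "B", "B", "C", "C"]
error = leave_one_person_out_error(samples, labels, persons)
```

Because the model is just one mean per class, training and prediction are cheap, which fits the real-time goal of the system.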
Results and Discussion
                    Confusion matrix of the nine poses
              (rows: true labels, columns: estimated labels)

             P1    P2    P3    P4    P5    P6    P7    P8    P9
       P1   198     0     0     0     0     0     0     0     0
       P2     0   193     0     0     0     0     0     0     0
       P3     2     0   157     0     0     0     0     0     0
       P4     0     0     0   159     0    20     0     0     0
       P5     1     0     1     0   164     0     2     0     0
       P6     2     3     6     0     0   129     0     0     0
       P7     0     0     1     0     3     0   164     0     0
       P8     0     0     9     0     6     0     1   162     0
       P9     0     0     5     3     0     0     0     0   133

Conclusion: most of the poses are recognized very well.
However, pose 4 is often misclassified as pose 6 (20 samples).
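
The conclusion can be checked directly from the confusion matrix. The sketch below copies the counts from the slide, computes the fraction of each true pose that was labeled correctly, and finds the largest off-diagonal entry. The helper names are chosen for this example.

```python
# Rows are true labels P1..P9, columns are estimated labels P1..P9.
CONFUSION = [
    [198, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 193, 0, 0, 0, 0, 0, 0, 0],
    [2, 0, 157, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 159, 0, 20, 0, 0, 0],
    [1, 0, 1, 0, 164, 0, 2, 0, 0],
    [2, 3, 6, 0, 0, 129, 0, 0, 0],
    [0, 0, 1, 0, 3, 0, 164, 0, 0],
    [0, 0, 9, 0, 6, 0, 1, 162, 0],
    [0, 0, 5, 3, 0, 0, 0, 0, 133],
]


def per_class_recall(matrix):
    """Fraction of each true class that received the correct label."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def largest_confusion(matrix):
    """Return (true label, estimated label, count) of the biggest error."""
    size = len(matrix)
    count, i, j = max(
        (matrix[i][j], i, j) for i in range(size) for j in range(size) if i != j
    )
    return f"P{i + 1}", f"P{j + 1}", count


recall = per_class_recall(CONFUSION)
worst = largest_confusion(CONFUSION)
```

The largest single error is the pose 4 row landing in the pose 6 column, which matches the conclusion on the slide.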
People Detection, Tracking and Pose Recognition System


    Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game

    Spatial Game: Pose → Color Control
                  Location → Position Control
Spatial Game Demo
Application: Spatial Game

•   Real-time application: 20 frames/second

•   Robust to different environments: different indoor settings

•   Adapts to different users: various users

PRSD Studio, http://prsysdesign.net/
Future Work

•   Improve the robustness of the system
     – better skin colour detection, more robust feature detection

•   Develop multiple-user applications
     – solve the occlusion problem
Thanks for your attention!

            ?
