
Embedded Fest 2019. Ihor Tanyenkov and Igor Uspeniev. Action Recognition from Live CCTV Streams


An action recognition system for video surveillance. We describe how a computer vision module based on deep learning and analytical models was integrated into production, the challenges we met and the approaches we took. We cover how we handle multiple video streams and reduce false positives, and how to deal with the lack of datasets for action recognition.

Published in: Education


  1. Action Recognition System. Ihor Tanyenkov, Igor Uspeniev.
  2. CCTV: Detect Conflict Behaviour. Detect: 1. A person pushing another person. A push, punch or kick is a hand or leg movement at a speed above a configurable (not fixed) threshold value that ends with a touch to another person. 2. A person fighting another person by kicking or punching. Requirements: #1 The hitting hand, the touch and the participants are clearly visible. #4 The projection of the strike motion onto the camera image is distinguishable as fast motion.
  3. CCTV: Falling Detection. Detect: 1. A person falling down from a punch. 2. A person on the ground being kicked or beaten up. 3. A person lying on the ground. Requirements: #1 Clearly visible standing person. #2 Clearly visible lying person.
  4. False Depth Perception. Fist is in front; head is far behind. People who are close in angular position relative to the camera but at different distances can look, on the RGB image, as if they are very close or even touching. In this case, if one of them is moving fast (dancing, rotating, etc.), the other is not influenced by these moves. So we should analyse the correlation between the movement intensity of people who appear close to each other on the RGB image, and filter out false positives if their movements are independent.
  5. Occluded Participant. Frames 1, 2: normal behaviour with good visibility. Frame 3: hit while the person is occluded. Frame 4: fall while the person is occluded.
  6. Occluded Hit. Frame 1: normal behaviour with good visibility. Frames 2, 3, 4: occluded hit.
  7. Power Standoff Without Fast Movements. No single frame contains a strike. The sequence of frames contains no fast motion.
  8. False Hit. A friendly hug or a pat on the shoulder can be fast and even strong. The difference from a power struggle lies in the manner of movement: it is a complex of movements of various parts of the body.
  9. False Grassing. Many (perhaps most) falls are caused not by blows but by ridiculous accidents.
  10. Standing Point Lower or Higher Than Ground Level. Impossible to detect falling relative to ground level. Problems in full-body position detection.
  11. Fighting in the Crowd. Huge number of persons in the field of view. Mutual occlusion and chaotic movement. Performance problems.
  12. Review of Existing Solutions. Group 1: Instant frame classification: ● Body position classification (problem: lots of false positives) ● Classification of motion as smoothed areas. Group 2: Motion tracking in a frame sequence: ● Optical flow for motion estimation and classification (problem: frame rate dependency). Group 3: Body matching in a frame sequence: ● Body parts detection and matching ● Motion sequence classification.
  13. Used Approach Step 1: Pose Estimation and Analytical Motion. Pose estimation: detect keypoints and connections. Challenges: ● Closely located persons with body intersections ● Clothing on the body ● Hidden/occluded body parts ● Crowded scenes. Multiframe body matching and action classification.
  14. Proposed Approach Step 2: Frame Sequence Classification. Diagram labels: CNN feature extraction; vectors of the form {1 0 1 0 1 0 ...}; deep LSTM network. Steps: frame collection and preprocessing; extracting feature maps; create an embedding for each feature map; build the embedding sequence; predict the sequence with LSTM networks.
  15. Data Representativeness and Accuracy. Dataset variability: static features (body parts, primitives); dynamics (motion matching, speed estimation). Dataset ground truth: action classification.
  16. Challenges. Dataset quality on public artificial data: ● Slow hits ● Deceleration before hitting ● Fighting is only dynamics ● Poor action list scenario ● No ground truth. Dataset representativeness: ● No touches ● No falling ● Low variability: ○ environment ○ no crowd ○ person's appearance.
  17. Challenges and Uncertainties. ● Smoothed motion ● Occluded strike ● Spatial orientation estimation ● Performance improvement: GPU parallelism, serving multiple models, intelligent preprocessing ● Voting system ● Dataset mining and labeling, requests for proprietary datasets.
  18. Architecture Design
  19. Slow Motion: Filter of Third Level
  20. Speed Function Reconstruction
  21. Example of Function Differentiation
  22. Example of Function Differentiation
  23. Space of Equilibrium: Singularity Stability
  24. Space of Equilibrium: Homogeneous Deviation
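
Slide 2 defines a strike as a movement faster than a configurable threshold that ends with a touch to another person. The following is a minimal sketch of that rule, not the authors' implementation. It assumes the pose estimator already yields one (x, y) pixel position per frame for the hitting hand and an axis-aligned bounding box for the other person; the names `keypoint_speeds`, `is_strike`, `speed_threshold` and `touch_margin` are hypothetical.

```python
import math


def keypoint_speeds(track, fps):
    """Speed in pixels/second between consecutive (x, y) positions of one keypoint."""
    return [
        math.hypot(x1 - x0, y1 - y0) * fps
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]


def is_strike(hand_track, target_box, fps, speed_threshold, touch_margin=0.0):
    """Return True if the hand exceeded speed_threshold and its last position
    lies inside target_box (left, top, right, bottom), widened by touch_margin.

    Assumption: "touch" is approximated by the final hand position falling
    inside the other person's bounding box.
    """
    speeds = keypoint_speeds(hand_track, fps)
    if not speeds:
        return False  # fewer than two positions: no motion to measure
    x, y = hand_track[-1]
    left, top, right, bottom = target_box
    touches = (
        left - touch_margin <= x <= right + touch_margin
        and top - touch_margin <= y <= bottom + touch_margin
    )
    return max(speeds) > speed_threshold and touches
```

Keeping `speed_threshold` as a parameter mirrors the slide's point that the threshold is configurable rather than fixed; in practice it would likely need tuning per camera, since pixel speed depends on distance and viewing angle.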
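
Slide 4 proposes filtering false positives by checking whether the movement intensities of two people who look close on the RGB image are correlated. One possible way to express that check is a Pearson correlation over per-frame intensity series, sketched below. The slides do not specify the correlation measure or any cut-off, so `pearson`, `movements_are_independent` and the `min_correlation` default are assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a motionless person carries no correlation signal
    return cov / math.sqrt(var_a * var_b)


def movements_are_independent(intensity_a, intensity_b, min_correlation=0.3):
    """Treat two people as not interacting when their per-frame movement
    intensities correlate below min_correlation (a hypothetical cut-off)."""
    return pearson(intensity_a, intensity_b) < min_correlation
```

A candidate hit between two people would then be discarded when `movements_are_independent` returns True, for example a dancer next to a bystander who stays still.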
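
Slide 14 lists the steps "create an embedding for each feature map", "build the embedding sequence" and "predict the sequence with LSTM networks". The sketch below illustrates only the sequence-building step, under the assumption that sequences are fixed-length sliding windows over the stream. The CNN and the LSTM are not implemented here: `classify_sequence` is a hypothetical stand-in for the trained sequence model, and `SequenceBuffer`, `length` and `stride` are illustrative names.

```python
from collections import deque


class SequenceBuffer:
    """Collects per-frame embeddings into fixed-length, overlapping sequences."""

    def __init__(self, length, stride=1):
        if length < 1 or stride < 1:
            raise ValueError("length and stride must be positive")
        self.length = length
        self.stride = stride
        self._frames = deque(maxlen=length)  # oldest embedding drops out automatically
        self._since_emit = 0

    def push(self, embedding):
        """Add one embedding; return a full sequence when one is due, else None."""
        self._frames.append(embedding)
        self._since_emit += 1
        if len(self._frames) == self.length and self._since_emit >= self.stride:
            self._since_emit = 0
            return list(self._frames)
        return None


def classify_stream(embeddings, buffer, classify_sequence):
    """Feed embeddings through the buffer and classify every emitted sequence.

    classify_sequence is a placeholder for the sequence model (an LSTM in the
    slides); any callable taking a list of embeddings works here.
    """
    results = []
    for embedding in embeddings:
        sequence = buffer.push(embedding)
        if sequence is not None:
            results.append(classify_sequence(sequence))
    return results
```

A larger `stride` reduces how often the sequence model runs, which is one simple lever for the performance concerns raised on slides 11 and 17.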
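
Slide 17 mentions a "voting system" without saying what is voted on. One possible reading, sketched here purely as an assumption, is temporal voting: an alarm is raised only when enough of the most recent per-sequence decisions agree, which suppresses isolated false positives. `MajorityVoter`, `window` and `min_votes` are hypothetical names.

```python
from collections import deque


class MajorityVoter:
    """Raises an alarm only if at least min_votes of the last `window`
    decisions were positive."""

    def __init__(self, window, min_votes):
        if not 1 <= min_votes <= window:
            raise ValueError("min_votes must be between 1 and window")
        self._votes = deque(maxlen=window)  # keeps only the latest decisions
        self.min_votes = min_votes

    def update(self, detected):
        """Record one decision and return the smoothed alarm state."""
        self._votes.append(bool(detected))
        return sum(self._votes) >= self.min_votes
```

With `window=5` and `min_votes=3`, a single positive decision surrounded by negatives never triggers the alarm, at the cost of a short delay before a real event is reported.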
