3. Motivation
Large numbers of surveillance cameras have been deployed in recent years.
London Heathrow airport has more than 5000 cameras!
www.monash.edu.au
5. Research Question
How to recognize unusual, unsafe and
abnormal human and group behaviors from a
surveillance video stream in real-time?
Automatic detection of abnormal
behaviors to aid the human
monitors
Reduce the dependence on
human monitors
Improve the reliability of
surveillance systems for ensuring
human security
6. Proposed Research Framework
A real-time behavior recognition framework for visual surveillance
Pipeline (from the framework diagram):
Surveillance video stream
→ 1. Environment Modeling → identified active agents
→ 2. Feature Extraction and Agent Classification → classified active agents
→ 3. Agent Tracking with Occlusion Handling → tracked trajectories
→ 4. Event/Behavior Recognition (with pattern database) → high level description of unusual actions and interactions → Alarm!
7. Targeted Behaviors
Mob violence
Crowding
Sudden group
formation/deformation
Shooting
Public panic
9. 1. Environment Modeling
How to extract the active regions from a surveillance video stream?
Background Subtraction
Current frame - Background = Moving foreground
Challenges!!
• Background initialization is not practical in the real world
• The background environment is dynamic due to illumination
variation, local motion, camera displacement and shadow
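The subtraction idea on this slide can be illustrated with a minimal sketch. It assumes a fixed, already-known background image and a hand-picked threshold (both are illustrative assumptions, and the function name `subtract_background` is hypothetical); it is exactly the naive setting that the challenges above argue against.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary mask: 1 where |frame - background| exceeds threshold."""
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([1 if abs(f - b) > threshold else 0
                     for f, b in zip(frame_row, bg_row)])
    return mask


# Tiny 3x4 grayscale example; all pixel values are made up for illustration.
background = [[100, 100, 100, 100],
              [100, 100, 100, 100],
              [100, 100, 100, 100]]
frame = [[100, 102, 100, 98],
         [100, 200, 210, 100],
         [99, 205, 100, 100]]

foreground_mask = subtract_background(frame, background)
for row in foreground_mask:
    print(row)
```

Small intensity changes (such as 102 or 98 against 100) stay below the threshold, while the three bright pixels are marked as moving foreground.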
10. Environment Modeling in Literature (1 of 4)
Environment modeling
Background subtraction
Background modeling
Background maintenance
Foreground detection
Moving foreground detection
Object detection
Moving object detection
Pixel-based approaches
• Single Gaussian Model (Wren et al. PAMI’ 97)
• Gaussian Mixture Model (Stauffer et al. CVPR’ 99, Lee PAMI’ 05)
• Generalized Gaussian Mixture Model (Allili et al. CRV’ 07)
• Gaussian Mixture Model with SVM (Zhang et al. THS’ 07)
• Cascaded Classifiers (Chen et al. WMVS’ 07)
11. Environment Modeling in Literature (2 of 4)
Region and texture-based approaches
Incorporate neighborhood information using block or texture measures. (Sheikh et al. PAMI’ 07, Heikkila et al. PAMI’ 06, Schindler et al. ACCV’ 06)
Shape-based approaches
Use shape-based features instead of color features. (Noriega et al. BMVC’ 06, Jacobs et al. WMVC’ 07)
12. Environment Modeling in Literature (3 of 4)
Predictive modeling
Uses probabilistic prediction of the
expected background. (Toyama et al.
ICCV’ 99, Monnet et al. ICCV’ 03)
Model initialization approaches
Recovering clear background from a
given sequence containing moving
objects. (Gutchess et al. ICCV’ 01,
Wang et al. ACCV’ 06, Figueroa et al.
IVC’ 06)
13. Environment Modeling in Literature (4 of 4)
Nonparametric background modeling
Density estimation based on a sample of intensity values. (Elgammal et al. ECCV’ 00)
Stationary foreground detection
Uses multiple models operating on multiple time scales. (Cheng et al. WMVC’ 07)
14. 2. Agent Classification
How to classify the active regions in real-time?
[Diagram: classification hierarchy]
• Active regions: human or non-human
• Categories shown: single person, people in groups, person carrying object, vehicle
• Further labels shown: carrying object, not carrying any object
Which features to use? (B. Liu and H. Zhou, NNSP’ 03)
• Position
• Width/Height
• Centroid/Perimeter
• Aspect Ratio
• Compactness
• Others
Challenges!!
• Identifying the appropriate features for the targeted behaviors
• Real-time classification using those features
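As a rough illustration of the listed features, the sketch below computes them for a region given as a set of foreground pixel coordinates. The function name `region_features`, the 4-connected perimeter count, and the compactness definition (4π·area / perimeter²) are assumptions chosen for illustration; the slide does not specify how these quantities are defined.

```python
import math


def region_features(pixels):
    """Compute simple shape features for a region given as a set of (x, y) pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    area = len(pixels)
    # Perimeter: count pixel edges that border a non-region pixel (4-connectivity).
    perimeter = sum(
        1
        for (x, y) in pixels
        for (dx, dy) in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if (x + dx, y + dy) not in pixels
    )
    return {
        "position": (min(xs), min(ys)),          # top-left of bounding box
        "width": width,
        "height": height,
        "centroid": (sum(xs) / area, sum(ys) / area),
        "perimeter": perimeter,
        "aspect_ratio": width / height,
        # Assumed definition: 1.0 for a disc, smaller for elongated shapes.
        "compactness": 4 * math.pi * area / perimeter ** 2,
    }


# A 2-wide, 5-tall upright blob, loosely resembling a standing person.
region = {(x, y) for x in range(2) for y in range(5)}
features = region_features(region)
print(features)
```

An upright region yields an aspect ratio below 1, whereas a wide region such as a vehicle seen from the side would typically yield a value above 1.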
15. Agent Classification in Literature
Agent Classification
Generic Classification Approaches
• Binary image classification techniques
• Algorithms for calculating ellipticity, rectangularity, and triangularity
• Feature evaluation techniques
• Classification using tracked trajectories
Domain Specific Classifiers
• Residential security system: for identifying humans, pets, and other objects.
• Industrial robot manipulator: for classifying objects on a moving conveyor.
• Traffic monitoring system: vehicle (including motorcycle, car, bus and truck) and human (including pedestrian and bicycler).
• Coastline surveillance system: for classifying different kinds of ships.
16. 3. Occlusion Handling during Tracking
Occlusion handling is a major problem in visual surveillance.
During occlusion, only portions of each object are visible, often at very low resolution.
Challenges!!
Better models need to be developed to cope with the correspondence between features, so as to eliminate errors when tracking multiple objects.
17. Occlusion Handling in Literature (1 of 3)
The most practical method for addressing occlusion is
the use of multiple cameras.
Progress is being made using statistical methods to predict
object pose, position, and so on, from available image
information.
18. Occlusion Handling in Literature (2 of 3)
Region-based tracking works well in scenes containing
only a few objects (such as highways).
Active contour-based tracking reduces computational
complexity and can track under partial occlusion, but is sensitive
to the initialization of tracking.
19. Occlusion Handling in Literature (3 of 3)
Model-based tracking has a high computational cost and is unsuitable for real-time implementations.
Feature-based tracking can handle occlusion between two objects as long as the velocities of their centroids are distinguishable.
[Figure: bounding box with (x,y), width and height, and the centroid of the bounding box]
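The centroid-velocity idea can be sketched as follows. This is only an illustration: it assumes that (x, y) is the top-left corner of the bounding box, that velocity is measured in pixels per frame, and that "distinguishable" means the two velocity vectors differ by at least a hand-picked `min_diff`. All function names here are hypothetical.

```python
def bbox_centroid(x, y, width, height):
    """Centroid of a bounding box whose top-left corner is assumed to be (x, y)."""
    return (x + width / 2.0, y + height / 2.0)


def centroid_velocity(prev_box, curr_box, dt=1.0):
    """Centroid displacement per unit time between two (x, y, width, height) boxes."""
    px, py = bbox_centroid(*prev_box)
    cx, cy = bbox_centroid(*curr_box)
    return ((cx - px) / dt, (cy - py) / dt)


def velocities_distinguishable(v1, v2, min_diff=1.0):
    """True if the two velocity vectors differ by at least min_diff (Euclidean)."""
    dvx, dvy = v1[0] - v2[0], v1[1] - v2[1]
    return (dvx * dvx + dvy * dvy) ** 0.5 >= min_diff


# Two objects approaching each other; box values are made up for illustration.
v_a = centroid_velocity((10, 20, 4, 8), (14, 24, 4, 8))
v_b = centroid_velocity((50, 20, 4, 8), (46, 20, 4, 8))
print(v_a, v_b, velocities_distinguishable(v_a, v_b))
```

When the two velocities are clearly different, as here, a tracker can in principle use them to re-associate the objects after they cross.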
20. 4. Behavior Recognition
How to learn and recognize a particular behavior?
[Diagram: movement pattern, pattern database, crowd behavior recognition; example behaviors: violence, sudden group formation]
Challenges!!
• Identifying the time-varying features for a particular behavior
• Automatic learning of behaviors
• Recognizing the learned behaviors in different scenarios
21. Behavior Recognition in Literature (1 of 3)
Behavior
Recognition
Following another person
Altering one’s path to
meet another
Carrying object
Depositing an object
Exchanging objects
Real-time system for recognizing
human behaviors including
following another person and
altering one’s path to meet another.
(Oliver et al. PAMI’ 00)
Real-time system to determine
whether people are carrying
objects, depositing an object,
exchanging bags.
(Haritaoglu et al. PAMI’ 00)
22. Behavior Recognition in Literature (2 of 3)
Behavior
Recognition
Identifying abnormal movement patterns. (Grimson et al. CVPR’ 98)
Interaction patterns among a group of people based on simple statistics computed on tracked trajectories. Behaviors: loitering, stalking and following. (Wei et al. ICME’ 04)
Real-time behavior interpretation from traffic video for producing lexical output. (Kumar et al. ITS’ 05)
Behaviors shown: abnormal movement pattern; loitering; stalking; following; target moving towards point; target crossing a point; target stopped at a point
23. Behavior Recognition in Literature (3 of 3)
Behavior
Recognition
Tracking groups of people in metro scenes and recognizing abnormal behaviors. Appearance/disappearance of groups, dynamics (split and merge) and failure of motion detector. (Cupillard et al. WAVS’ 01)
Analyzing vehicular trajectories for recognizing driving patterns. (Niu et al. ICSP’ 03)
Surveillance event primitives: entry/exit, crowding, splitting and track loss. (Guha et al. VSPETS’ 05)
Behaviors shown: appearance of groups; disappearance of groups; merging of groups; splitting of groups; turn/stop; entry/exit; crowding; track loss
25. Environment Modeling in the Proposed Framework
[Framework diagram: surveillance video stream → 1. Environment Modeling → 2. Feature Extraction and Agent Classification → 3. Agent Tracking with Occlusion Handling → 4. Event/Behavior Recognition (with pattern database) → Alarm!]
26. Environment Modeling
Environment Modeling: surveillance video stream → identified moving objects
Baseline
• Pixel-based approaches are more suitable for visual surveillance
• The most popular and widely used pixel-based method was introduced at MIT by Stauffer and Grimson (CVPR’ 99)
• It uses a Gaussian Mixture Model (GMM) for environment modeling
• Improved adaptability was proposed by Lee (PAMI’ 05)
27. Environment Modeling using Gaussian Mixtures
[Figure: per-pixel distributions P(x) over pixel intensity x, drawn as Gaussians with mean µ and variance σ². Example labels: sky, cloud, leaf, moving person; road, shadow, moving car; floor, shadow, walking people.]
28. Moving Object Detection
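[Figure: a pixel observed from Frame 1 to Frame N shows road, shadow, car, shadow, road]
K models, ordered by ω/σ:
• Model 1: ω1, σ1², µ1 (road, 65%)
• Model 2: ω2, σ2², µ2 (shadow, 20%)
• Model 3: ω3, σ3², µ3 (car, 15%)
Background Models (T = 70%):
$B = \arg\min_b \left( \sum_{k=1}^{b} \omega_k > T \right)$
T is the minimum portion of the data in the environment accounted for by the background.
A model matches a new pixel value $X_t$ if $|X_t - \mu| < M_{th} \cdot \sigma$.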
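The selection of background models and the matching test can be sketched as below. The weights follow the road/shadow/car example on this slide (65%, 20%, 15%) and T = 70%; the means, standard deviations, and the matching threshold value 2.5 are hypothetical values chosen only so the example runs, and the function names are illustrative.

```python
def select_background_models(models, T=0.7):
    """Rank models by w/sigma and keep the first B whose cumulative weight exceeds T."""
    ranked = sorted(models, key=lambda m: m["w"] / m["sigma"], reverse=True)
    total = 0.0
    for b, model in enumerate(ranked, start=1):
        total += model["w"]
        if total > T:
            return ranked[:b]
    return ranked


def matches(x, model, m_th=2.5):
    """Matching test: |x - mu| < M_th * sigma."""
    return abs(x - model["mu"]) < m_th * model["sigma"]


def is_background(x, models, T=0.7, m_th=2.5):
    """A pixel value is background if it matches any selected background model."""
    return any(matches(x, m, m_th) for m in select_background_models(models, T))


# Weights from the slide; mu and sigma are assumed for illustration.
models = [
    {"name": "road", "w": 0.65, "mu": 120.0, "sigma": 5.0},
    {"name": "shadow", "w": 0.20, "mu": 80.0, "sigma": 8.0},
    {"name": "car", "w": 0.15, "mu": 200.0, "sigma": 20.0},
]

background_names = [m["name"] for m in select_background_models(models)]
print(background_names)
print(is_background(123, models), is_background(200, models))
```

With these numbers the road model alone covers 65%, which does not exceed T, so the shadow model is included as well and the car model is left out of the background.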
30. Background Representation
How to obtain a visual representation of the background from the environment model?
Why? Current frame - Background = Moving foreground
[Figure: a pixel observed from Frame 1 to Frame N shows road, shadow, car, shadow, road]
Which value should be used to represent the background?
K models, ordered by ω/σ, each with a stored value m:
• Model 1: ω1, σ1², µ1, m1 (road)
• Model 2: ω2, σ2², µ2, m2 (shadow)
• Model 3: ω3, σ3², µ3, m3 (car)
Background Representation:
$m_j$ where $j = \arg\max_{1 \le i \le K} \frac{\omega_i}{\sigma_i}$
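A minimal sketch of this selection, assuming the rule picks the stored value m of the model with the largest ω/σ (the same quantity used to order the models). The slide does not define m or give numeric values, so every number and the function name below are hypothetical.

```python
def background_value(models):
    """Return the stored value m of the model with the largest w/sigma."""
    best = max(models, key=lambda model: model["w"] / model["sigma"])
    return best["m"]


# Hypothetical per-pixel models: weight w, std. dev. sigma, mean mu, stored value m.
models = [
    {"name": "shadow", "w": 0.20, "sigma": 8.0, "mu": 80.0, "m": 78.0},
    {"name": "road", "w": 0.65, "sigma": 5.0, "mu": 120.0, "m": 121.0},
    {"name": "car", "w": 0.15, "sigma": 20.0, "mu": 200.0, "m": 190.0},
]

pixel_background = background_value(models)
print(pixel_background)
```

Under this rule the displayed background pixel is a value stored with a single model, not an average over several models.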
31. Representation of the Computed Background
[Figure panels: (a) Test Frame; (b) Lee’s Formulation; (c) Proposed Approach]
Lee (PAMI' 05) gave an intuitive solution: compute the expected value of the
observations believed to be background.
$E[X \mid B] = \sum_{k=1}^{K} E[X \mid G_k]\, P(G_k \mid B) = \frac{\sum_{k=1}^{K} \mu_k\, P(B \mid G_k)\, P(G_k)}{\sum_{j=1}^{K} P(B \mid G_j)\, P(G_j)}$
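As a worked example of the expectation above, the sketch below evaluates it for one pixel, taking E[X | G_k] = µ_k and P(G_k) = ω_k. The probabilities P(B | G_k) are supplied directly as inputs; how they are estimated is not covered on this slide, so the values used here, like the means and the function name, are purely hypothetical.

```python
def expected_background(mus, weights, p_b_given_g):
    """E[X|B] = sum_k mu_k P(B|G_k) P(G_k) / sum_j P(B|G_j) P(G_j)."""
    numerator = sum(mu * pb * w for mu, pb, w in zip(mus, p_b_given_g, weights))
    denominator = sum(pb * w for pb, w in zip(p_b_given_g, weights))
    return numerator / denominator


mus = [120.0, 80.0, 200.0]        # component means (assumed)
weights = [0.65, 0.20, 0.15]      # P(G_k), taken as the mixture weights
p_b_given_g = [0.9, 0.6, 0.1]     # P(B | G_k), assumed for illustration

value = expected_background(mus, weights, p_b_given_g)
print(round(value, 3))
```

The result is a weighted blend of all component means, so it need not coincide with the mean of any single model.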
32. Another Observation
Contradiction in model dropping strategy!!
[Figure: a pixel observed from Frame 1 to Frame N shows road, shadow, car, road, shadow]
K = 3 models, ordered by ω/σ, each with ω, σ², µ, m:
• Model 1: ω1, σ1², µ1, m1 (road, 65%)
• Model 2: ω2, σ2², µ2, m2 (shadow, 20%)
• Model 3: ω3, σ3², µ3, m3 (car, 15%)
Which model should be dropped?
Selecting the least probable model for the new pixel value could
sacrifice the most appropriate model representing the background!
33. Model Dropping Strategy
Objectives
• To have a realistic background representation
• To retain the most contributing background models as long as possible
[Figure: a pixel observed from Frame 1 to Frame N shows road, shadow, car, road, shadow]
K = 3 models, ordered by ω/σ, each with ω, σ², µ, m:
• Model 1: ω1, σ1², µ1, m1 (road, 65%)
• Model 2: ω2, σ2², µ2, m2 (shadow, 20%)
• Model 3: ω3, σ3², µ3, m3 (car, 15%)
Which model should be dropped?
The model having the least evidence for representing the background.
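The difference between two dropping rules can be illustrated with a small sketch. Two assumptions are made here that the slide does not state explicitly: the original strategy is taken to drop the lowest-ranked model by ω/σ, and "least evidence" is taken to mean the smallest weight ω. The numbers and function names are hypothetical and chosen so that the two rules disagree.

```python
def drop_by_rank(models):
    """Assumed original strategy: drop the model with the lowest w/sigma."""
    return min(models, key=lambda m: m["w"] / m["sigma"])


def drop_by_evidence(models):
    """Assumed modified strategy: drop the model with the lowest weight w."""
    return min(models, key=lambda m: m["w"])


# Hypothetical state: the road model has the most evidence but a large variance.
models = [
    {"name": "road", "w": 0.45, "sigma": 30.0},    # w/sigma = 0.015
    {"name": "shadow", "w": 0.35, "sigma": 10.0},  # w/sigma = 0.035
    {"name": "car", "w": 0.20, "sigma": 5.0},      # w/sigma = 0.040
]

dropped_by_rank = drop_by_rank(models)["name"]
dropped_by_evidence = drop_by_evidence(models)["name"]
print(dropped_by_rank, dropped_by_evidence)
```

In this constructed case, ranking by ω/σ discards the road model even though it accounts for the largest share of observations, while the weight-based rule discards the car model instead.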
34. Representation of the Computed Background
And it works!
[Figure panels: (a) Test Frame; (b) Lee’s Formulation; (c) Proposed (ODS); (d) Proposed (MDS)]
ODS - Original Dropping Strategy
MDS - Modified Dropping Strategy
39. Experiments
Moving Object Detection
Current frame - Background = Moving foreground
False Classification: False Positive (FP) and False Negative (FN)
Datasets
Total 14 test sequences
5 PETS sequences (Performance Evaluation for Tracking and Surveillance)
7 Wallflower sequences (Microsoft Research)
2 other sequences
Evaluation
Compared with the two most widely used GMM-based methods:
Stauffer and Grimson (CVPR’ 99) and Lee (PAMI’ 05)
Results are evaluated both visually and numerically
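One simple way to carry out the numerical evaluation is to count false positives and false negatives per pixel against a ground-truth mask. The sketch below assumes binary masks (1 = foreground) of equal size; the function name and the tiny masks are made up for illustration and are not taken from the test sequences.

```python
def count_errors(detected, ground_truth):
    """Return (false_positives, false_negatives) for two binary masks."""
    fp = fn = 0
    for det_row, gt_row in zip(detected, ground_truth):
        for d, g in zip(det_row, gt_row):
            if d and not g:
                fp += 1   # background pixel marked as foreground
            elif g and not d:
                fn += 1   # foreground pixel missed
    return fp, fn


detected = [[0, 1, 1],
            [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [1, 1, 0]]

fp, fn = count_errors(detected, ground_truth)
print("FP =", fp, "FN =", fn, "false classifications =", fp + fn)
```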
40. Involved parameters, thresholds and constants
Learning Rate (α)
Maximum number of distributions per pixel model (K)
Matching threshold (Mth)
Subtraction threshold (Sth)
Initial high variance assigned to a new distribution (V0)
Initial low weight assigned to a new distribution (W0)
K=3
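To show where most of these parameters enter, here is a minimal sketch of one per-pixel mixture update in the general style of GMM background modeling. It is not the exact update used in the compared methods: the second learning rate is simplified to ρ = α, an unmatched value replaces the model with the lowest ω/σ, and the default values for α, Mth, V0 and W0 are hypothetical. The subtraction threshold Sth is not used in this step.

```python
def update_pixel_model(models, x, alpha=0.01, K=3, m_th=2.5, v0=900.0, w0=0.05):
    """Update one pixel's mixture in place; return True if x matched a model.

    Each model is a dict with weight "w", mean "mu" and variance "var".
    """
    def rank(m):
        return m["w"] / m["var"] ** 0.5

    # Find the best-ranked model that matches x: |x - mu| < M_th * sigma.
    matched = None
    for m in sorted(models, key=rank, reverse=True):
        if abs(x - m["mu"]) < m_th * m["var"] ** 0.5:
            matched = m
            break

    # Weights decay by (1 - alpha); the matched model gains alpha.
    for m in models:
        m["w"] = (1 - alpha) * m["w"] + (alpha if m is matched else 0.0)

    if matched is not None:
        rho = alpha  # simplification assumed for this sketch
        matched["mu"] = (1 - rho) * matched["mu"] + rho * x
        matched["var"] = (1 - rho) * matched["var"] + rho * (x - matched["mu"]) ** 2
    else:
        # New distribution: high initial variance V0, low initial weight W0.
        new_model = {"w": w0, "mu": float(x), "var": v0}
        if len(models) >= K:
            models.remove(min(models, key=rank))
        models.append(new_model)

    # Renormalize so the weights sum to 1.
    total = sum(m["w"] for m in models)
    for m in models:
        m["w"] /= total
    return matched is not None


pixel = [{"w": 1.0, "mu": 100.0, "var": 25.0}]
hit = update_pixel_model(pixel, 102)    # close to the existing mean
miss = update_pixel_model(pixel, 200)   # far away, so a new model is created
print(hit, miss, len(pixel))
```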
41. Experimental Results (PETS Dataset)
[Figure: for each sequence (1) to (5): First Frame, Test Frame, Ground Truth, GMM (Stauffer), GMM (Lee), Proposed (ODS), Proposed (MDS)]
(1) PETS2000; (2) PETS2006-S7-T6-B-1; (3) PETS2006-S7-T6-B-2; (4) PETS2006-S7-T6-B-3; and (5) PETS2006-S7-T6-B-4.
42. Experimental Results (Wallflower Sequences)
[Figure: for each sequence (6) to (12): First Frame, Test Frame, Ground Truth, GMM (Stauffer), GMM (Lee), Proposed (ODS), Proposed (MDS)]
(6) Bootstrap; (7) Camouflage; (8) Foreground Aperture; (9) Light Switch; (10) Moved Object; (11) Time Of Day; and (12) Waving Tree
43. Experimental Results (Football and Walk)
[Figure: for each sequence (13) and (14): First Frame, Test Frame, Ground Truth, GMM (Stauffer), GMM (Lee), Proposed (ODS), Proposed (MDS)]
(13) Football; and (14) Walk
47. Environment Modeling
Environment Modeling: surveillance video stream → identified moving objects
Contributions
• Independent of any environment-sensitive parameter
• Better detection quality than existing GMM-based methods
• No post-processing step required
• Operational with the same parameter setting in different environments
• Fault tolerant to small camera displacement