The document is a presentation about monocular human pose estimation using Bayesian networks. It includes:
- An outline with sections on introduction, approach overview, model learning, pose estimation, feature extraction, experiments and conclusions.
- Discussion of applications of human motion capture such as animation, games, medical diagnosis and visual surveillance.
- Comparison of different sensor approaches for human pose estimation including active markers, passive markers and markerless methods using cameras.
- Description of the proposed approach which uses Bayesian networks to represent the articulated human body and estimate 2D and 3D joint positions through representation, learning and inference steps.
2. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 2
Outline
1. Introduction
2. Markerless Monocular Human Pose Estimation
3. Overview of the Approach
4. Model Learning by EM algorithm
5. Pose Estimation by Approximate Inference
6. Feature Extraction
7. Experimental Results
8. Conclusions
3. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 3
1. Introduction
• Applications of Human Motion
Capture
– Performance animation in movie making
– Game
– Medical diagnosis
– Sport & Health
– Visual surveillance
4. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 4
Performance Animation
• Avatar
• The Lord of the Rings
5. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 5
Game
• Microsoft's Project Natal for XBOX360
6. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 6
Medical Diagnosis
• Gait analysis for
Rehabilitation
7. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 7
Sport & Health
• Golf training
8. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 8
Visual Surveillance
• Behavior analysis for event detection
– Irregular movement, body language, and
unusual interactions, fighting
– Car crash
• Content-based retrieval
9. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 9
Sensor Approaches
• Active sensors
– Types
• Electro-magnetic marker
• Optical
• Accelerometer
– Wired connection
– Drawbacks (too many wires)
• Intrusive
• Expensive
• Time consuming
• Passive sensors by camera
– Marker-based
– Markerless
10. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 10
Marker-based Sensors
• Add visual markers on body
– Active marker
• Visual/non-visual light
– Passive marker
• Need computer vision algorithms
• Advantages
– No wires
• Drawbacks
– Semi-intrusive
– Time consuming
[Figure: examples of an active marker and a passive marker]
11. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 11
Markerless Sensors
• No attachment on human body
• Heavily dependent on the computer vision analyzer (a pure vision solution)
– Stereo/Multiple cameras
– Monocular cameras
12. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 12
Sensor v.s. Analyzer
T. B. Moeslund, "Computer vision-based human motion capture – a
survey", Technical report LIA 99-02, University of AALBORG, 1999.
13. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 13
Pose Estimation vs. Gesture Recognition
[Figure: pose estimation output compared with gesture recognition output (label: Walking)]
14. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 14
2D v.s. 3D
15. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 15
2. Markerless Monocular
Human Motion Capture
• Goal
– Markerless
– Single camera
– 3D poses
• Challenges
– Ill-posed
– Highly articulated
– Self-occluding
[Figure: depth ambiguities and occlusion using monocular silhouettes]
16. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 16
Joint Representation
• Articulated human body is linked by
joints
17. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 17
Abstract Representation
2D 3D
Stick
Surface/
Volume
18. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 18
Literature Review
Pipeline from low-level observation to high-level abstraction:
Image space (pixel domain) → Human segmentation (S) → Image feature descriptor (F) → 2D joint location (J) → 3D model parametric space (pose domain, P)
• Low-level observation
– Background subtraction
– Object detection
• Human segmentation (S)
– Full body
– Body parts
• Image feature descriptor (F)
– Shape
– Silhouette
– Color
– Appearance
– Motion
– Feature location (corner)
– ...
• 3D model parametric space (P)
– Joint angle Θi
– Joint point Pi
• Mappings
– P = f(S)
– P = f(F)
– P = f(J) (marker-based)
A two-stage approach is proposed: P = f1(f2(F))
[Figure: 3D stick figure with labeled joints and axes X, y, Z]
19. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 19
Approaches
• Model-free [Agarwal, 2006] [Loy, 2004]
– Joint articulation is not used to constrain the search for the mapping function
P = f(X)
• Model-based [Rbert, 2006] [Rohr, 1994]
– A model of human articulation constrains the search for f and P
– Two kinds of approach
• Discriminative
• Generative: Bayesian networks (BNs)
Training: $\hat{f} = \arg\max_{f} L_1(\text{Training}, f)$
Inference: $\hat{P} = \arg\max_{P} L_2(\hat{f} \mid X, P)$
20. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 20
An Articulated Model
= A Bayesian Network
• The human body is represented as a kinematic tree consisting of segments linked by joints
• Kinematic models are expressed as graphical probability networks
• Graphical probability models are computed via a Bayesian network
[Figure: 3D stick figure with labeled joints (neck, shoulders, elbows, hands, bottom, waists, knees, feet) and axes X, y, Z]
21. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 21
Three Steps to Utilize BNs
• Representation, learning and inference
– Representation: joints (X1) and features (X2, X3, X4) are the nodes of the network
– Learning: feature-joint correspondence by conditional probability P(X1|X2,X3,X4)
$\hat{f} = \arg\max_{f} L_1(\text{Training}, f)$
– Inference: pose estimation
$\hat{P} = \arg\max_{P} L_2(\hat{f} \mid X, P)$
22. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 22
Two Causal Models in BNs
• Undirected acyclic graph [Lan, 2008] [Hua, 2005]
– The model is a tree or graph in which the edge linking two nodes has no direction; an edge X1, X2 carries the joint probability P(X1,X2).
• Directed acyclic graph [Ramanan, 2007] [Lee, 2006] [Leonid, 2003]
– Every edge is a directed arc from one node to another; an arc between X1 and X2 carries the conditional probability P(X1|X2).
23. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 23
Directed Bayesian Articulated
Model
• Nodes in a directed acyclic graph (DAG) are not influenced by their child nodes.
• Dependencies between human body parts are therefore not regarded as two-way.
[Figure: directed 15-node body model with nodes h2d,1 to h2d,15]
24. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 24
Inference of Bayesian Networks
• Top-down approach [Gavrila, 1996]
– Strong at finding human body parts in the image.
• Bottom-up approach [Ren, 2005]
– Strong at finding people in the image.
• Combined approach [Navaraman, 2005][Lee, 2002]
– Benefits from the advantages of both.
25. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 25
3. Overview of the Approach
[Figure: 2D and 3D articulated body models with labeled joints (head, neck, shoulders, elbows, hands, bottom, waists, knees, feet)]
Both models are belief propagation networks that use an annealed Gibbs sampling algorithm.
26. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 26
System Architecture
• We estimate the 2D human joint positions before 3D estimation.
• Training
– 2D model training: training features → 2D Bayesian human model setting → EM training
– 3D model training: training features → 3D Bayesian human model setting → EM training
• Testing
– Testing image → feature extraction → 2D Bayesian inference with annealed Gibbs sampling → 3D Bayesian inference with annealed Gibbs sampling → result
27. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 27
2D Human Graphical Model
• The articulated structure of the 2D human body is represented by a 15-node graphical model.
$H_{2D} = \{h_{2d,1}, \ldots, h_{2d,15}\}$
[Figure: 2D stick figure (articulated model) with nodes h2d,1 to h2d,15 placed at the head, neck, shoulders, elbows, hands, bottom, waists, knees and feet]
28. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 28
3D Human Graphical Model
• The 3D human body model is described by a 45-dimensional vector H3D that holds the three coordinates of each of the 15 joint nodes in 3D space
$H_{3D} = \{h_{3d,1}, \ldots, h_{3d,15}\}$
[Figure: 3D stick figure (articulated model) with nodes h3d,1 to h3d,15 and axes X, y, Z]
29. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 29
The BN Model
• A directed acyclic graph $G = (V, E, C)$
– V: vertex set {Vi, 1≤i≤N}
– E: a set of directed edges (i,j)
– C: (i,j) → R+, edge cost functions
• To encode probabilistic information
– An edge indicates a probabilistic dependence
– C: P(Vi | Vj), the conditional probability function set
• The 2D and 3D BNs
$G_{2D} = (V_{2D}, E_{2D}, C_{2D})$, $G_{3D} = (V_{3D}, E_{3D}, C_{3D})$
30. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 30
2D Graphical Model
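$V_{2D} = \{H_{2D}, O_{2D}\}$
$O_{2D}$: the observations Nc (normalized center), S (silhouette sampling), A (spatial distribution of skin color) and C (corners of silhouette)
$C_{2D} = \{P(h_{2d,i} \mid pa(h_{2d,i}))\}$
[Figure: 2D Bayesian network over the joint nodes h2d,i and the observation nodes]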
31. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 31
3D Graphical Model
$V_{3D} = \{H_{3D}, O_{3D}\}$
$O_{3D}$: the 2D joints h2d, the normalized width wN and the length L
$C_{3D} = \{P(h_{3d,i} \mid pa(h_{3d,i}))\}$
[Figure: 3D Bayesian network split into an upper-body part (nodes hu3d,1 to hu3d,7) and a lower-body part (nodes hl3d,1 to hl3d,7), each linked to 2D nodes h2d,i]
32. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 32
Joint Probability Distribution
(JPD)
• The two proposed graphical models specify two unique JPDs: P2D(V2D) and P3D(V3D)
• Let P(V) represent the two JPDs
$P(V) = \prod_{i=1}^{n} P(V_i \mid pa(V_i))$
• The factorization of the JPD comes from the Markov blanket, a local Markov property
• If we can learn the finite set of conditional probabilities, we can infer the human pose
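The factorization P(V) = ∏ P(Vi | pa(Vi)) can be illustrated with a minimal sketch. The three-node chain, the binary states and all probability values below are made up for illustration and are not the slides' 15-node model.

```python
from itertools import product

# Hypothetical 3-node chain: neck -> shoulder -> elbow, each variable binary.
parents = {"neck": (), "shoulder": ("neck",), "elbow": ("shoulder",)}

# cpt[node][parent_values][value] = P(node = value | parents = parent_values)
# (illustrative numbers only)
cpt = {
    "neck": {(): {0: 0.6, 1: 0.4}},
    "shoulder": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "elbow": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.5, 1: 0.5}},
}


def joint_probability(assignment):
    """Return P(V) as the product of P(V_i | pa(V_i)) over all nodes."""
    prob = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_values][assignment[node]]
    return prob


# Summing the factorized joint over every assignment gives 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)
```

Because each factor is a local conditional table, learning the model reduces to estimating one table per node.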
33. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 33
Two Problems
• Training problem
– Given a training set: {O2d, O3d}
– How can we learn the edge cost function C = { P(h | pa(h)) }?
– We apply the EM algorithm
• Inference problem
– Given evidence O
– How can we infer the human pose P(H | O) from P(V)?
– We propose an annealed Gibbs sampling algorithm
34. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 34
4. Model Learning by EM
• Why apply the EM algorithm for model learning?
– The human poses and observations are incomplete and sparse
• Incomplete: occlusion due to the single camera
• Sparse: few training samples in a high-dimensional space
35. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 35
The Likelihood Function
• The training set D={D1,…,DN}
– N represents the number of training samples
– Dl={V1[l],…,Vn[l]} is the l-th training sample
• Let θ be the learning model: C = { P(h | pa(h)) }
• The estimate of θ (the prior P(θ) is dropped, i.e. treated as uniform, so only the likelihood remains):
$\hat{\theta} = \arg\max_{\theta} P(\theta \mid D) = \arg\max_{\theta} \frac{P(D \mid \theta)\, P(\theta)}{P(D)} = \arg\max_{\theta} P(D \mid \theta) = \arg\max_{\theta} \prod_{l=1}^{N} P(D_l \mid \theta)$
• A log-likelihood function $L_D(\theta) = \log P(D \mid \theta)$ is formulated based on the independence assumption of training samples
$L_D(\theta) = \log \prod_{l=1}^{N} P(V_1[l], \ldots, V_n[l] \mid \theta) = \sum_{i=1}^{n} \sum_{l=1}^{N} \log P(V_i[l] \mid pa(V_i[l]), \theta)$
36. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 36
MLE v.s. EM
• If D is complete, we can apply MLE (Maximum Likelihood Estimation) to find θ
• However, D is incomplete because of occlusion and partial observability
• Let D=Y∪U
– Y is the observed data
– U is the missing data
37. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 37
The EM
• Expectation step
– Computes the expectation of the log-likelihood function
$Q(\theta \mid \theta^{(t)}) = E_{\theta^{(t)}}[\log P(D \mid \theta) \mid \theta^{(t)}, Y]$
• Maximization step
– Updates the parameter θ(t+1) of step t+1 from the current parameter θ(t)
$\theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)})$
• Stop condition of the E-M iteration
– $L_D(\theta^{(t+1)}) - L_D(\theta^{(t)})$ converges
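The E-step, M-step and stop condition can be sketched on a toy problem. The example below is an assumption-laden illustration, not the slides' training code. It uses a hypothetical two-node network A → B with binary variables, where A is sometimes missing (written `None`), standing in for an occluded joint. The names `em` and `log_likelihood` and all data are invented for the sketch.

```python
import math


def log_likelihood(data, p_a, p_b):
    """Observed-data log-likelihood; a missing A is summed out."""
    def joint(a, b):
        pa = p_a if a else 1 - p_a
        pb = p_b[a] if b else 1 - p_b[a]
        return pa * pb

    total = 0.0
    for a, b in data:
        total += math.log(joint(a, b) if a is not None else (joint(0, b) + joint(1, b)))
    return total


def em(data, p_a=0.5, p_b=(0.4, 0.6), tol=1e-9, max_iter=200):
    """EM for P(A=1) and P(B=1 | A=a) with some values of A missing."""
    p_b = list(p_b)
    history = [log_likelihood(data, p_a, p_b)]
    for _ in range(max_iter):
        # E-step: w = P(A=1 | b, theta_t) for missing A, observed value otherwise.
        w = []
        for a, b in data:
            if a is None:
                j1 = p_a * (p_b[1] if b else 1 - p_b[1])
                j0 = (1 - p_a) * (p_b[0] if b else 1 - p_b[0])
                w.append(j1 / (j0 + j1))
            else:
                w.append(float(a))
        # M-step: re-estimate the conditional tables from expected counts.
        n1 = sum(w)
        n0 = len(data) - n1
        p_a = n1 / len(data)
        p_b = [
            sum((1 - wi) * b for wi, (_, b) in zip(w, data)) / n0,
            sum(wi * b for wi, (_, b) in zip(w, data)) / n1,
        ]
        # Stop when the log-likelihood no longer improves.
        history.append(log_likelihood(data, p_a, p_b))
        if history[-1] - history[-2] < tol:
            break
    return p_a, p_b, history


complete = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]
incomplete = complete + [(None, 1), (None, 0), (None, 1)]
p_a_c, p_b_c, hist_c = em(complete)
p_a_i, p_b_i, hist_i = em(incomplete)
```

With complete data the E-step has nothing to fill in, so the first M-step already returns the maximum likelihood estimate. With missing values the log-likelihood in `history` never decreases from one iteration to the next.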
38. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 38
5. Pose Estimation by
Approximate Inference
• Let the observed data be O'=O-U
– U is the set of hidden variables that are unobservable due to occlusion
• The best estimated pose is a vector H*, defined as the pose with the maximum probability given O'.
$H^* = \arg\max_{H} P(H \mid O') = \arg\max_{H} \int_{u \in U} P(H, u \mid O')\,du = \arg\max_{H} \int_{u \in U} P(H, O', u)\,du = \arg\max_{H} \int_{u \in U} \prod_{i=1}^{n} P(V_i \mid pa(V_i))\,du$
where the product is the JPD P(V) with $V = H \cup O' \cup U$
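For discrete variables the integral over the hidden variables becomes a sum, and the estimate is the pose whose summed joint probability is largest. The sketch below uses a made-up table of P(H, O', u) for one fixed observation O'. The pose labels and the numbers are hypothetical.

```python
# Hypothetical joint P(H, O', u) for a fixed observation O':
# keys are (pose, hidden value u); the probabilities are illustrative only.
joint = {
    ("arm_up", 0): 0.10,
    ("arm_up", 1): 0.25,
    ("arm_down", 0): 0.30,
    ("arm_down", 1): 0.02,
}


def map_pose(joint_table):
    """Sum out the hidden variable u, then take the arg max over poses."""
    scores = {}
    for (pose, _u), prob in joint_table.items():
        scores[pose] = scores.get(pose, 0.0) + prob
    best = max(scores, key=scores.get)
    return best, scores


best_pose, pose_scores = map_pose(joint)
```

The largest single entry belongs to "arm_down", yet after summing over u the selected pose is "arm_up". The hidden variables must therefore be marginalized before maximizing. Exhaustive summation is only feasible for tiny models, which is why the slides turn to approximate inference.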
39. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 39
Inference of Posterior Probability
• How to calculate the posterior probability?
$H^* = \arg\max_{H} \int_{u \in U} \prod_{i=1}^{n} P(V_i \mid pa(V_i))\,du$
– Exact inference
• Junction tree, Message passing
– Approximate inference
• Loopy belief propagation , Variational method
• Markov chain Monte Carlo (MCMC) sampling
– Metropolis-Hastings
– Gibbs sampling
40. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 40
Approximate Inference (1/2)
• The MCMC algorithm uses sampling to approximate the posterior distribution P(V) by random number generation
• The key idea of MCMC is to simulate the sampling process as a Markov chain
• Definition
– A sample vector v of V
– A proposal distribution q(v*|v(t-1)) to generate v*
– An acceptance distribution α to accept v* as v(t)
$\alpha(v^{(t-1)}, v^*) = \min\left\{1, \frac{p(v^*)\, q(v^{(t-1)} \mid v^*)}{p(v^{(t-1)})\, q(v^* \mid v^{(t-1)})}\right\}$
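The acceptance rule α = min{1, p(v*) q(v(t-1)|v*) / (p(v(t-1)) q(v*|v(t-1)))} can be sketched as a generic Metropolis-Hastings loop. The three-state target, the uniform proposal and the function names are assumptions chosen to keep the example small.

```python
import random


def acceptance(p, q, v_prev, v_star):
    """alpha = min(1, p(v*) q(v_prev | v*) / (p(v_prev) q(v* | v_prev)))."""
    numerator = p(v_star) * q(v_prev, v_star)
    denominator = p(v_prev) * q(v_star, v_prev)
    return min(1.0, numerator / denominator)


def metropolis_hastings(p, q, propose, v0, steps, seed=0):
    """Build a Markov chain whose stationary distribution is proportional to p."""
    rng = random.Random(seed)
    chain = [v0]
    for _ in range(steps):
        v_prev = chain[-1]
        v_star = propose(v_prev, rng)
        if rng.random() < acceptance(p, q, v_prev, v_star):
            chain.append(v_star)   # accept the proposal
        else:
            chain.append(v_prev)   # reject: repeat the previous sample
    return chain


# Toy target (unnormalized) over three states; illustrative values only.
weights = [1.0, 2.0, 1.0]


def p(v):
    return weights[v]


def q(v_new, v_old):
    """q(v_new | v_old): uniform proposal that ignores the current state."""
    return 1.0 / 3.0


def propose(v_old, rng):
    return rng.randrange(3)


chain = metropolis_hastings(p, q, propose, v0=0, steps=20000)
freq_state_1 = chain.count(1) / len(chain)
```

State 1 carries half of the target mass, so its relative frequency in a long chain should be close to 0.5. Only ratios of p appear in the rule, so p does not need to be normalized.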
41. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 41
Approximate Inference (2/2)
• MCMC generates a Markov chain (v(0), v(1), ..., v(k), ...), because the transition probability from v(t-1) to v(t)
– Depends only on v(t-1)
– But not on (v(0), v(1), ..., v(t-2))
• The chain approaches its stationary distribution
– Once it has, the samples (v(k+1), ..., v(k+n)) are samples from P(V)
• However, if V is high-dimensional, MCMC converges slowly
42. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 42
Annealed Gibbs Sampling (1/4)
• Gibbs sampling method
– Formally proposed by Geman & Geman in 1984 for Markov random fields (MRF)
– Here the sampler is revised for the proposed two-stage Bayesian network
– The basic idea
• Sampling univariate conditional distributions
• That is, the Markov chain (v(0), v(1), ..., v(k), ...) is advanced by changing only one variable of v at a time
43. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 43
Annealed Gibbs Sampling (2/4)
• We draw from the distribution
$v_j^{(t)} \sim P(V_j \mid v_1^{(t)}, \ldots, v_{j-1}^{(t)}, v_{j+1}^{(t)}, \ldots, v_n^{(t)})$
• The Annealed Gibbs (AG) sampler
– The sampling of the univariate conditional distributions is controlled by a stochastic process of simulated cooling
$q(v^* \mid v^{(t)}) = \begin{cases} p(v_j^* \mid v_{-j}^{(t)}) & \text{if } v_{-j}^* = v_{-j}^{(t)} \\ 0 & \text{otherwise} \end{cases}$
$\alpha_{AG} = \min\left\{1, \frac{p(v^*)^{1/T(t)}\, q(v^{(t)} \mid v^*)}{p(v^{(t)})^{1/T(t)}\, q(v^* \mid v^{(t)})}\right\}$
44. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 44
Annealed Gibbs Sampling (3/4)
• The function T(t) is called the cooling schedule
$T(t) = T_0 \left(\frac{T_f}{T_0}\right)^{t/n}$
• The particular value of T at any point in the chain is called the temperature
– T0 is the start temperature
– Tf is the final temperature, reached after cooling down over n steps
• As the process proceeds, we decrease the probability of down-hill moves
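A minimal sketch of the geometric cooling schedule T(t) = T0 (Tf/T0)^(t/n) follows. The function name and the temperatures used in the usage lines are illustrative assumptions.

```python
def temperature(t, n, t0, tf):
    """Cooling schedule: T(0) = t0, T(n) = tf, geometric decay in between."""
    return t0 * (tf / t0) ** (t / n)


# Illustrative values: cool from 10 down to 0.1 over 100 steps.
start = temperature(0, 100, 10.0, 0.1)
middle = temperature(50, 100, 10.0, 0.1)
end = temperature(100, 100, 10.0, 0.1)
```

Halfway through, the temperature is the geometric mean of the start and final values (here 1.0), so most of the steps are spent at low temperatures, where down-hill moves are rarely accepted.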
45. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 45
Annealed Gibbs Sampling (4/4)
• The AG sampler is a stochastic iterative algorithm that converges to the set of points that are the global maxima of the given function
• The advantage of the AG sampler is that it is more efficient than the Gibbs sampler
• This is because, instead of approximating P(V),
– We want to find the global maximum, i.e., the maximum of the posterior distribution
– We run a Markov chain of invariant distribution P(V) and estimate only the global mode
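The sketch below shows one way an annealed Gibbs search for the global mode could look on a tiny discrete distribution. It is not the slides' implementation. It proposes from the univariate conditional p(v_j | v_-j) and accepts with min{1, (p(v*)/p(v))^(1/T - 1)}, which is what the annealed acceptance ratio reduces to for that proposal. The table, the schedule parameters and the extra bookkeeping of the best state seen so far are assumptions made for illustration.

```python
import random


def annealed_gibbs_mode(p, shape, n_steps, t0=1.0, tf=0.05, seed=0):
    """Search for the mode of a strictly positive discrete distribution p."""
    rng = random.Random(seed)
    v = [rng.randrange(k) for k in shape]
    best = tuple(v)
    for t in range(1, n_steps + 1):
        temp = t0 * (tf / t0) ** (t / n_steps)      # geometric cooling schedule
        j = rng.randrange(len(shape))               # variable to update
        # Unnormalized conditional p(V_j | v_-j) used as the proposal.
        cond = []
        for value in range(shape[j]):
            cand = list(v)
            cand[j] = value
            cond.append(p(tuple(cand)))
        v_star = list(v)
        v_star[j] = rng.choices(range(shape[j]), weights=cond)[0]
        # Annealed acceptance: the proposal terms cancel down to this ratio.
        ratio = (p(tuple(v_star)) / p(tuple(v))) ** (1.0 / temp - 1.0)
        if rng.random() < min(1.0, ratio):
            v = v_star
        if p(tuple(v)) > p(best):                   # remember the best state seen
            best = tuple(v)
    return best


# Toy 3x3 table (unnormalized, illustrative) with a single mode at (2, 1).
table = {(i, k): 1.0 for i in range(3) for k in range(3)}
table[(2, 1)] = 8.0

mode = annealed_gibbs_mode(table.__getitem__, shape=(3, 3), n_steps=2000)
```

At T = 1 the exponent is zero and every proposal is accepted, which is plain Gibbs sampling. As T falls, moves to lower-probability states are accepted less and less often, so the chain settles on the mode instead of exploring the whole distribution.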
46. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 46
6. Feature Extraction
• Human silhouette sampling
• Normalized width
• Normalized center
• Spatial distribution of skin color
• Corners of silhouette
[Figure: bounding box with Width and Length marked]
47. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 47
Human Silhouette Sampling (S)
• Human segmentation
• Human silhouette capturing [Suzuki, 1985]
• Uniform sampling is used in human
silhouette sampling.
48. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 48
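Normalized Width (wN)
• Human segmentation
• Binary image profile
• Width adjustment (normalization)
$w_N = x_R - x_L$
– $x_L$: the first $x$, scanning $x = 1 \to w$, with $h_x \geq \text{threshold}$ and $h_{x-1} < \text{threshold}$
– $x_R$: the first $x$, scanning $x = w \to 1$, with $h_x \geq \text{threshold}$ and $h_{x+1} < \text{threshold}$
– $h_x$ is the pixel accumulation value of the profile at column $x$
[Figure: profile of X coordinate; horizontal axis: x coordinate of image, vertical axis: pixel accumulation value; the bounding box is marked with Width and Length]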
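The width computation wN = xR − xL can be sketched from a binary silhouette mask. The mask layout (a list of rows of 0/1 values), the 0-based column indices and the function name are assumptions made for illustration.

```python
def normalized_width(mask, threshold):
    """Return (w_N, x_L, x_R) from the column profile of a binary mask."""
    width = len(mask[0])
    # h_x: number of foreground pixels accumulated in column x.
    profile = [sum(row[x] for row in mask) for x in range(width)]
    # Scan left to right, then right to left, for the first column whose
    # profile reaches the threshold (raises StopIteration if none does).
    x_left = next(x for x in range(width) if profile[x] >= threshold)
    x_right = next(x for x in reversed(range(width)) if profile[x] >= threshold)
    return x_right - x_left, x_left, x_right


# Toy 4x8 mask: the silhouette occupies columns 2..5; column 0 holds one noisy pixel.
mask = [
    [1, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
]
w_n, x_l, x_r = normalized_width(mask, threshold=2)
```

The threshold keeps isolated noisy pixels, such as the one in column 0, from widening the estimated silhouette.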
49. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 49
Normalized Center (Nc)
• Boundary adjustment
• Center of new boundary
$x_N = x_p + 0.5\, w_N$
$y_N = y_p + 0.5\, L$
[Figure: bounding box with Width and Length marked]
50. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 50
Spatial Distribution of Skin Color (A)
Skin color detection by GMM → Morphology → Region segment → Spatial distribution of skin color
51. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 51
Corners of Silhouette (C)
• Human segmentation
• Human silhouette capturing
• The level curve curvature approach [Lindeberg, 1998]
$\tilde{I}(x, y) = \arg\max \left( D_x^2 D_{yy} + D_y^2 D_{xx} - 2 D_x D_y D_{xy} \right)$
• Adaptive corner choice
52. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 52
7. Experimental Results
• Experimental environment
– CPU: 1.86 GHz, RAM: 1 GB, VC6.0
– HumanEva database I
53. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 53
HumanEva Database I
• Provider:
– Department of Computer Science in Brown Univ.
• Actions of HumanEva I
Action | Description
Walking | Subjects walked in an elliptical path around the capture space.
Jog | Subjects jogged in an elliptical path around the capture space.
Gesture | Subjects performed "hello" and "good-bye" gestures in repetition.
Throw/Catch | Subjects tossed and caught a baseball with the help of the lab assistant.
Box | Subjects imitated boxing.
Combo | Subjects performed combinational actions of walking and jogging.
54. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 54
Environment Setting
• 7 cameras
– 3 color cameras (C1, C2, C3)
– 4 gray-level cameras (BW1, BW2, BW3, BW4)
[Figure: camera layout around the capture space (marked 3 m and 2 m), showing BW1 to BW4, C1 to C3 and the control station]
55. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 55
The Experimental Data
• Our proposed method was trained on 1900 images from the walking sequences of subjects 1 and 2, captured by camera C1
• 200 testing images:
– 100 images from subject 1
– 100 images from subject 2
• Difficulties:
– Self-occlusion
– Clothing variation
– Large variation of joint location
56. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 56
Evaluation of Accuracy
• Average distance error of poses between estimated results and ground truth
• Let H = {h1, h2, ..., hM}, where hm ∈ R3 (or hm ∈ R2 for the 2D body model), be the position vector of the body pose in the world (or in the image, respectively)
• D(H, H*): the error of the estimated pose H* with respect to the ground truth pose H
$D(H, H^*) = \sum_{m=1}^{M} \frac{\lVert h_m - h_m^* \rVert}{M}$
• ξ: the average of D over the indices n = 1, ..., N and t = 1, ..., T
$\xi = \frac{1}{NT} \sum_{n=1}^{N} \sum_{t=1}^{T} D(H_{t,n}, H_{t,n}^*)$
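The two error measures can be sketched directly: D(H, H*) is the mean Euclidean distance over the M joints of one pose, and ξ averages D over all evaluated poses. The function names and the nested-list layout (a list of sequences, each a list of poses) are assumptions made for illustration.

```python
import math


def pose_error(h_true, h_est):
    """D(H, H*): mean Euclidean distance between corresponding joints."""
    return sum(math.dist(a, b) for a, b in zip(h_true, h_est)) / len(h_true)


def mean_error(true_sequences, est_sequences):
    """xi: average of D over every pose of every sequence (equal lengths assumed)."""
    errors = [
        pose_error(h, h_star)
        for seq_true, seq_est in zip(true_sequences, est_sequences)
        for h, h_star in zip(seq_true, seq_est)
    ]
    return sum(errors) / len(errors)


# Toy example with two joints per pose (made-up coordinates).
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
estimate = [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
d = pose_error(truth, estimate)                       # (5 + 0) / 2
xi = mean_error([[truth, truth]], [[estimate, truth]])
```

The same functions work for 2D poses, because `math.dist` accepts points of any matching dimension.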
57. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 57
Performance Comparison Between
Two-stage and One-stage methods
• The AG sampler performs better than the Gibbs sampler
• The two-stage approach performs better than the classical one-stage approach
• The AG sampler takes less inference time
58. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 58
Effect of Iteration Number
on Accuracy
59. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 59
2D Results of Subject 1
[Figure: ground truth (GT) and AGs estimates for frames 1122, 1149, 1172 and 1200]
60. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 60
2D Results of Subject 2
[Figure: ground truth (GT) and AGs estimates for frames 804, 835, 875 and 899]
61. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 61
3D Results
• Frame 1110 of subject 1
[Figure: 3D plots of the ground truth and the AGs estimation result]
62. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 62
3D Results (Cont.)
• Frame 1135 of subject 1
[Figure: 3D plots of the ground truth and the AGs estimation result]
63. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 63
3D Results (Cont.)
• Frame 845 of subject 2
[Figure: 3D plots of the ground truth and the AGs estimation result]
64. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 64
3D Results (Cont.)
• Frame 872 of subject 2
[Figure: 3D plots of the ground truth and the AGs estimation result]
65. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 65
8. Conclusions
• A markerless and monocular motion
capture problem is considered
• The proposed two-stage annealed Gibbs
sampling method can estimate more
accurate poses with less computation time
• The method can overcome three challenges
of the problem
– Self-occlusion
– High-degree variation of joint locations
– Clothing limitation
66. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 66
Future Work
• Use GMM to approximate prior and
posterior distribution of our human models
• Combine model-free method and model-
based methods to obtain benefits of both
• Exploit HMM to infer human motions in time series
• Add human parts detectors to help locate
human joints
67. Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 67