Unistroke and multistroke gesture recognizers have always striven to be robust to the variations encountered when people issue gestures by hand on touch surfaces or with sensing devices. For this purpose, successful stroke recognizers rely on a gesture recognition algorithm that satisfies a series of invariance properties, such as stroke-order, stroke-number, stroke-direction, position, scale, and rotation invariance. Before initiating any recognition activity, these algorithms ensure these properties by performing several pre-processing operations. These operations add computational cost to the recognition process, as well as a potential error bias. To cope with this problem, we introduce an algorithm that ensures all these properties analytically instead of statistically, based on vector algebra. Instead of points, the recognition algorithm works on vectors between vectors. We demonstrate that this approach not only eliminates the need for these pre-processing operations but also supports an entire class of structure-preserving transformations.
Paper available at https://dial.uclouvain.be/pr/boreal/en/object/boreal%3A217006
2. Vector-based, Structure Preserving Stroke Gesture Recognition
DMSVIVA’2019 (Lisbon, Portugal, July 8th-9th, 2019)
Nathan Magrofuoco1, Paolo Roselli2, Jorge Luis Perez-Medina1,3,
Jean Vanderdonckt1, Santiago Villarreal1
1LouRIM, Université catholique de Louvain, Belgium
2Università degli Studi di Roma “Tor Vergata”, Roma, Italy
3Universidad de Las Américas, Quito, Ecuador
3. Background on Stroke Gesture Recognition
• Two families of approaches
• Specific: tied to a particular gesture set
• Machine Learning, SVM, Hidden Markov Models, Neural
networks,…
• Two main limitations:
• Need to re-train, re-model if gesture set is modified
• Overfitting problem
• Generic: independent of any gesture set
• Nearest-Neighbor-Classification (NNC)
• Pattern matching
• Two advantages
• No need to re-train or re-model if the gesture set is modified
• No overfitting
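The generic, NNC-style approach above can be sketched in a few lines: classify a candidate stroke by its distance to each stored template, with no training phase. This is a minimal illustration, not the paper's algorithm; the function names and the per-point Euclidean distance are ours, and strokes are assumed to be equally sampled.

```python
import math

def stroke_distance(a, b):
    """Average point-wise Euclidean distance between two
    equally sampled strokes (illustrative distance only)."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def classify_1nn(candidate, templates):
    """1-NN: return the label of the nearest training template.
    Adding or removing a template needs no re-training."""
    return min(templates, key=lambda t: stroke_distance(candidate, t[1]))[0]

templates = [
    ("line", [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]),
    ("diag", [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]),
]
print(classify_1nn([(0.0, 0.1), (0.5, 0.1), (1.0, 0.1)], templates))  # "line"
```

Extending the gesture set is just appending another `(label, points)` pair to `templates`, which is exactly the advantage the slide highlights over retraining a specific classifier.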
4. • Invariance properties
• Variation of strokes => stroke clustering
Stroke invariance = independence of any combination of
strokes
• Variation of directions => direction interpretation
Direction invariance = independence of any direction in
gesture recognition
5. • Invariance properties
• Variation of sampling => re-sampling needed
Sampling invariance = independence of any sampling
Source: J.O. Wobbrock, A.D. Wilson, Y. Li, Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes, Proc. of UIST’2007
6. • Invariance properties
• Variation of scale => different sizes depending on the
surface of the platform: rescaling may be needed
Scale invariance = independence of any size/scale
7. • Invariance properties
• Variation of location => different locations depending
on the surface of the platform: translation may be
needed
Translation invariance = independence of any location
8. • Invariance properties
• Variation of angle => different orientations depending
on position
Rotation invariance = independence of any rotation
9. Nearest-Neighbor-Classification (NNC)
• Pre-processing steps to ensure invariance
• Re-sampling
• Equal spacing between points: isometricity
• Equal time between points: isochronicity
• Same number of points: isoparameterization
• Re-Scaling
• Normalisation of the bounding box into [0..1]x[0..1] square
• Rotation to reference angle
• Rotate to 0°
• Re-rotating and distance computation
• Distance computed between candidate gesture and
reference gestures (1-NN)
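The pre-processing steps listed above can be sketched as follows. This is a simplified, $1-style sketch under our own assumptions (strokes as lists of `(x, y)` tuples, a fixed resampling count `n`); the function names are illustrative, not the published code.

```python
import math

def resample(points, n=64):
    """Re-sampling: n equidistantly spaced points (isometricity)."""
    path_len = sum(math.dist(points[i - 1], points[i])
                   for i in range(1, len(points)))
    step, acc = path_len / (n - 1), 0.0
    pts, out, i = list(points), [points[0]], 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if acc + d >= step:
            # Interpolate a new point exactly one 'step' along the path.
            t = (step - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # continue measuring from the new point
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:       # guard against floating-point shortfall
        out.append(pts[-1])
    return out

def scale_to_unit(points):
    """Re-scaling: normalize the bounding box into [0..1]x[0..1]."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    w, h = (max(xs) - min(xs)) or 1.0, (max(ys) - min(ys)) or 1.0
    return [((x - min(xs)) / w, (y - min(ys)) / h) for x, y in points]
```

These are exactly the steps the vector-based approach of this paper aims to make unnecessary.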
10. Nearest-Neighbor-Classification (NNC)
• Two families of approaches
• “Between points” distance
• $-Family recognizers: $1, $3, $N, $P, $P+,
$V, $Q,…
• Variants and optimizations: ProTractor,
Protactor3D,…
• “Vector between points” distance
• PennyPincher, JackKnife,…
• A third new family of approaches
• “Vector between vectors” distance:
this paper!
11. • Definition of a basic gesture as a vector
• From p1 to p2, create vector 𝑢
• From p2 to p3, create vector 𝑣
• By derivation, create vector 𝑏 = 𝑢 + 𝑣
• Note that −(𝑢 + 𝑣) is the inverse vector
[Figure: points p1, p2, p3 with vectors 𝑢 (p1 → p2), 𝑣 (p2 → p3), their sum 𝑏 = 𝑢 + 𝑣, and the inverse −(𝑢 + 𝑣)]
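The basic-gesture definition above translates directly into code: three consecutive points yield the vector pair (𝑢, 𝑣) and the derived third side 𝑏 = 𝑢 + 𝑣 of the local triangle. A minimal sketch (function name is ours):

```python
def vectorize(p1, p2, p3):
    """Build the vector pair (u, v) for three consecutive points,
    plus the derived vector b = u + v (the triangle's third side)."""
    u = (p2[0] - p1[0], p2[1] - p1[1])  # p1 -> p2
    v = (p3[0] - p2[0], p3[1] - p2[1])  # p2 -> p3
    b = (u[0] + v[0], u[1] + v[1])      # p1 -> p3, i.e. u + v
    return u, v, b

u, v, b = vectorize((0, 0), (1, 0), (1, 1))
print(u, v, b)  # (1, 0) (0, 1) (1, 1)
```

The inverse vector −(𝑢 + 𝑣) is simply `(-b[0], -b[1])`.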
12. • Local Shape Distance between 2 triangles based on similarity
[Figure: the LSD formula ("the only simple formula to compute") applied to the vector pairs (𝑎, 𝑏) and (𝑢, 𝑣), whose triangles have third sides 𝑎 + 𝑏 and 𝑢 + 𝑣]
13. • Step 1. Vectorization: for each triple of three consecutive points, create a pair of vectors
[Figure: training gesture sampled as points p1…p6, candidate gesture sampled as points q1…q6]
14. • Step 1 (continued). Vectorization: for each triple of three consecutive points, create a pair of vectors
[Figure: the same training (p1…p6) and candidate (q1…q6) gestures, with every triple of consecutive points turned into a pair of vectors]
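Step 1 over a whole stroke can be sketched as a sliding window of three consecutive points, each triple yielding one vector pair. The function name and data layout are ours:

```python
def to_vector_pairs(points):
    """Step 1 (vectorization): for every triple of consecutive points
    (p[i], p[i+1], p[i+2]), build the vector pair
    (p[i] -> p[i+1], p[i+1] -> p[i+2])."""
    pairs = []
    for i in range(len(points) - 2):
        p1, p2, p3 = points[i], points[i + 1], points[i + 2]
        pairs.append((
            (p2[0] - p1[0], p2[1] - p1[1]),   # u
            (p3[0] - p2[0], p3[1] - p2[1]),   # v
        ))
    return pairs

stroke = [(0, 0), (1, 0), (2, 1), (3, 3)]
print(len(to_vector_pairs(stroke)))  # 2 pairs for 4 points
```

A stroke of n points thus yields n − 2 local triangles, which is why the slides pair p1p2p3 with q1q2q3, p2p3p4 with q2q3q4, and so on.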
18. • Step 4. Summing all individual figures into the final one
• Step 5. Iterate for every training gesture
[Figure: the training gesture (p1…p6) is matched against the candidate gesture (q1…q6), triangle by triangle:]
(N)LSD(p1p2p3, q1q2q3) = 0.02
(N)LSD(p2p3p4, q2q3q4) = 0.04
(N)LSD(p3p4p5, q3q4q5) = 0.0001
(N)LSD(p4p5p6, q4q5q6) = 0.03
Sum = 0.02 + 0.04 + 0.0001 + 0.03 = 0.0901 (indicative figures)
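Steps 4 and 5 can be sketched as a sum of per-triangle distances followed by a 1-NN scan over the training set. Note the caveat: `local_shape_distance` below is a stand-in that compares the side lengths of the two local triangles, NOT the published LSD formula; the structure of steps 4 and 5 is what the sketch illustrates, and all names are ours.

```python
import math

def local_shape_distance(pair_a, pair_b):
    """Stand-in for the paper's LSD: compares the two local
    triangles by the lengths of their sides (|u|, |v|, |u+v|).
    NOT the published formula; illustrative only."""
    def sides(u, v):
        return (math.hypot(*u), math.hypot(*v),
                math.hypot(u[0] + v[0], u[1] + v[1]))
    sa, sb = sides(*pair_a), sides(*pair_b)
    return sum((x - y) ** 2 for x, y in zip(sa, sb))

def gesture_distance(candidate_pairs, training_pairs):
    """Step 4: sum all individual triangle distances into one figure."""
    return sum(local_shape_distance(a, b)
               for a, b in zip(candidate_pairs, training_pairs))

def recognize(candidate_pairs, training_set):
    """Step 5: iterate over every training gesture, keep the nearest."""
    return min(training_set,
               key=lambda t: gesture_distance(candidate_pairs, t[1]))[0]
```

With the slide's indicative figures, step 4 would produce 0.02 + 0.04 + 0.0001 + 0.03 = 0.0901 for this training gesture, and step 5 would repeat the computation for every other training gesture before picking the minimum.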
19. • Normalized Local Shape Distance between 2 triangles based on similarity
• LSD is not symmetric => anisotropic distance
• NLSD is symmetric => isotropic distance
𝑁𝐿𝑆𝐷((𝑎, 𝑏), (𝑐, 𝑑)) = 𝐿𝑆𝐷((𝑎/‖𝑎‖, 𝑏/‖𝑏‖), (𝑐/‖𝑐‖, 𝑑/‖𝑑‖))
20. Discussion
• Advantages
• !FTL matches $P, the state-of-the-art recognizer, in
recognition rate on this gesture set
• !FTL(LSD) is 4 times faster, and !FTL(NLSD) 3 times faster, than $P
• Position, Scale, Rotation invariances
• Are algebraically guaranteed due to vector-based approach
• Can be controlled on-demand
• No need to perform pre-processing such as
• Normalization
• Re-scaling
• Re-rotation