Modeling Gaze Dynamics in Human-Agent Interactions
1.
Modeling the Dynamics of Gaze-Contingent Social Behaviors in Human-Agent Interaction
University of Augsburg, Germany
Human Centered Multimedia
Elisabeth André
2.
My Background
Social Robotics and Virtual Agents
European and BMBF Projects on Affective Computing
4.
Explicit versus Implicit Interaction with Eye Gaze
Explicit Interaction:
Open interaction with a system in which humans intentionally input discrete commands to explicitly express their needs.
Implicit Interaction:
Information that people convey indirectly in a conversation, but which may be derived from dialogue and context information.
Unconscious Interaction:
Continuous (often nonverbal) behavior that people do not voluntarily control, but which may be (though is not necessarily expected to be) interpreted as the implicit expression of a particular need or intention.
(Image source: http://www.vision-systems.com/)
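To make the distinction concrete, here is a small illustrative Python sketch; the event fields, the dwell threshold, and the classification heuristic are all my assumptions, not part of the original work:

```python
from dataclasses import dataclass
from enum import Enum, auto

class Channel(Enum):
    EXPLICIT = auto()     # intentional, discrete command (e.g., dwell-select)
    IMPLICIT = auto()     # indirect information derived from dialogue context
    UNCONSCIOUS = auto()  # continuous behavior not voluntarily controlled

@dataclass
class GazeEvent:
    target: str        # object or region the gaze rests on
    dwell_ms: float    # how long the gaze stayed on the target
    in_dialogue: bool  # whether a conversation is currently ongoing

def classify(event: GazeEvent, dwell_threshold_ms: float = 800.0) -> Channel:
    """Hypothetical heuristic: long dwells on a target count as explicit
    commands; shorter fixations during dialogue are treated as implicit
    context; everything else is unconscious behavior."""
    if event.dwell_ms >= dwell_threshold_ms:
        return Channel.EXPLICIT
    if event.in_dialogue:
        return Channel.IMPLICIT
    return Channel.UNCONSCIOUS

print(classify(GazeEvent("menu_button", 950.0, False)))  # Channel.EXPLICIT
```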
5.
Eye Gaze to Initiate Contact with a Human User
Breaking the Ice in Human-Agent Communication: Eye-Gaze Based Initiation of Contact with an Embodied Conversational Agent. Bee et al., IVA 2009.
6.
Five Phases of Flirting [Givens, 1978]
Attention Phase
Men and women arouse each other's attention
Ambivalent nonverbal behavior
Recognition Phase
One interactant recognizes the interest of the other
He or she may then signal readiness to continue the interaction, e.g., by a friendly smile.
Interaction Phase
After mutual interest has been established, the man or woman may initiate the interaction phase and engage in a conversation.
The sexual-arousal and resolution phases are of little relevance to human-agent communication.
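Purely as an illustration (the trigger events and the mapping below are my simplifications, not Givens' model or the paper's implementation), the first three phases can be read as a small finite-state machine driven by gaze and expression events:

```python
# Minimal sketch: Givens' first three phases as a finite-state machine.
# The trigger events are hypothetical simplifications.
TRANSITIONS = {
    ("attention",   "gaze_detected"):  "recognition",  # other's interest noticed
    ("recognition", "smile_returned"): "interaction",  # readiness signaled
    ("interaction", "user_leaves"):    "attention",    # fall back, start over
}

def step(phase: str, event: str) -> str:
    """Advance the phase if the event triggers a transition, else stay."""
    return TRANSITIONS.get((phase, event), phase)

phase = "attention"
for event in ["gaze_detected", "smile_returned"]:
    phase = step(phase, event)
print(phase)  # interaction
```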
8.
Interaction Modes
Interactive version
Non-interactive version with ideal flirt behavior:
In the non-interactive ideal version, the virtual agent behaves as in the interactive version, except that it does not respond to the user's eye gaze behavior but assumes perfect eye gaze behavior from the user and thus follows a fixed sequence.
Non-interactive version with anti-flirt behavior:
Duration of mutual gaze is increased from 3 s to 7 s
Facial expression remains neutral (which can be interpreted as a bored attitude towards the user)
The virtual agent looks away upwards after gazing at the user instead of downwards
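Since the three conditions differ only in a handful of parameters, they can be captured as configurations. A sketch, where the parameter names are mine but the values are the ones stated above:

```python
from dataclasses import dataclass

@dataclass
class FlirtConfig:
    responds_to_user_gaze: bool  # interactive vs. fixed sequence
    mutual_gaze_s: float         # how long mutual gaze is held
    smiles: bool                 # friendly vs. neutral facial expression
    look_away: str               # direction of gaze aversion

INTERACTIVE = FlirtConfig(True,  3.0, True,  "down")
IDEAL       = FlirtConfig(False, 3.0, True,  "down")  # assumes perfect user gaze
ANTI_FLIRT  = FlirtConfig(False, 7.0, False, "up")    # prolonged stare, neutral face
```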
9.
Results
1. In the interactive and the ideal mode, the agent was able to show users that Alfred had an interest in them, and users also had the feeling that he was flirting with them.
2. We found that the effect was increased when moving from the ideal to the interactive mode.
3. The interactive version contributed to users' enjoyment and increased their interest in continuing the interaction or even engaging in a conversation with Alfred.
10.
Conclusions
Alfred was lacking in attractiveness, but the gaze-enabled agent improved the flirting interaction.
Flirting tactics as implemented in this work benefit a much broader range of situations with agents than just dating, e.g., initiating human-agent interaction or regulating turn-taking in dialogues.
11.
Setting
Discovering eye gaze behavior during human-agent conversation in an interactive storytelling application. ICMI-MLMI 2010.
12.
Gaze Model
Parameters were set on the basis of data from the literature.

Behavior                        Non-interactive     Interactive
Looks around                    4.0 s (2-6 s)       4.0 s (2-6 s)
Gazes at user (wait for gaze)   2.0 s (1-3 s)       2.0 s (1-3 s)
Mutual gaze                     n/a                 1.0 s (0.75-1.25 s)
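A minimal sketch of how such a gaze model might be driven, sampling each duration uniformly from the ranges in the table; the control loop, function names, and uniform sampling are my assumptions:

```python
import random

# Duration ranges (seconds) from the gaze model table above.
LOOK_AROUND = (2.0, 6.0)
GAZE_AT_USER = (1.0, 3.0)    # interactive mode: maximum wait for the user's gaze
MUTUAL_GAZE = (0.75, 1.25)   # interactive mode only

def sample(lo_hi):
    return random.uniform(*lo_hi)

def gaze_cycle(interactive: bool, user_gazes_back: bool):
    """One cycle of the agent's gaze behavior (timings only, no rendering)."""
    yield ("look_around", sample(LOOK_AROUND))
    yield ("gaze_at_user", sample(GAZE_AT_USER))
    if interactive and user_gazes_back:
        # Hold mutual gaze only if the user actually looks back.
        yield ("mutual_gaze", sample(MUTUAL_GAZE))

for action, duration in gaze_cycle(interactive=True, user_gazes_back=True):
    print(f"{action}: {duration:.2f} s")
```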
13.
Evaluation
Compared the two different gaze behavior models: non-interactive vs. interactive
Study with 19 subjects
How do people respond to the different gaze models?
Does the gaze model affect their sense of social presence?
The order of the two gaze models was randomized for each subject to avoid any bias due to ordering effects.
17.
Results
In total, users looked at Emma much more than is typical of human-human interaction.

                                          Argyle & Cook   Kendon          Our Study
Looking at interlocutor                   58%             50% (28%-70%)   76% (46%-98%)
Looking at interlocutor while listening   75%             n/a             81%
Looking at interlocutor while speaking    41%             n/a             71%
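Percentages of this kind can be computed from annotated intervals. A sketch under the assumption that gaze and speaking activity are available as (start, end) intervals in seconds; the data below is made up:

```python
def overlap(intervals_a, intervals_b):
    """Total time (s) covered by both interval lists."""
    total = 0.0
    for a0, a1 in intervals_a:
        for b0, b1 in intervals_b:
            total += max(0.0, min(a1, b1) - max(a0, b0))
    return total

def duration(intervals):
    return sum(end - start for start, end in intervals)

# Hypothetical annotations: when the user looked at the agent / spoke.
gaze_at_agent = [(0.0, 4.0), (6.0, 10.0)]
user_speaking = [(1.0, 3.0), (7.0, 8.0)]
session = [(0.0, 10.0)]

print("looking overall:", duration(gaze_at_agent) / duration(session))
print("looking while speaking:",
      overlap(gaze_at_agent, user_speaking) / duration(user_speaking))
```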
18.
Conclusions
The interactive gaze mode led to a better user experience compared to the non-interactive gaze mode.
Users adhere to patterns of gaze behavior for speaker and addressee that are also characteristic of dyadic human-human interactions.
They looked more often at the virtual interlocutor than is typical of human-human interactions.
20.
Empathetic Artificial Listener
Attention: pay attention to the signals produced by a speaker
Perception: perceive the signals
Comprehension: understand the meaning attached to the signals
Internal reaction: comprehension of the meaning may trigger a cognitive and emotional reaction
Decision: decide whether or not to communicate the internal reaction
Generation: display the corresponding behaviors
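These stages form a pipeline. Purely as an illustration (every function name below is hypothetical, and the stubs stand in for real perception and appraisal models), the flow can be sketched as:

```python
# Illustrative stubs: each stage would be a real model in practice.
def perceive(signal):            return {"signal": signal}
def comprehend(percept):         return {"meaning": "agreement"}
def internal_reaction(meaning):  return {"valence": 0.6}
def decide_to_communicate(r):    return r["valence"] > 0.5
def generate_behavior(r):        return "head_nod"

def listener_step(signal):
    """One pass through the listener pipeline for a single observed signal."""
    percept = perceive(signal)              # Attention / Perception
    meaning = comprehend(percept)           # Comprehension
    reaction = internal_reaction(meaning)   # Internal reaction
    if decide_to_communicate(reaction):     # Decision
        return generate_behavior(reaction)  # Generation
    return None

print(listener_step("user nods"))  # head_nod
```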
21.
Generation of Facial Expressions
FACS (Facial Action Coding System) can be used to generate and recognize facial expressions.
Action Units are used to describe emotional expressions.
Seven Action Units were identified for the robotic face (out of 40 Action Units for the human face).
Upper face: inner brow raiser (AU 1), brow lowerer (AU 4), upper lid raiser (AU 5), and eye closure (AU 43).
Lower face: lip corner puller (AU 12), lip corner depressor (AU 15), and lips part (AU 25).
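The seven Action Units lend themselves to a simple lookup table. A sketch; the expression-to-AU recipes shown are common FACS-based combinations, not necessarily the ones used for this particular robot:

```python
# The seven Action Units identified for the robotic face.
ROBOT_AUS = {
    1:  "inner brow raiser",
    4:  "brow lowerer",
    5:  "upper lid raiser",
    12: "lip corner puller",
    15: "lip corner depressor",
    25: "lips part",
    43: "eye closure",
}

# Hypothetical expression recipes built from the available AUs.
EXPRESSIONS = {
    "happiness": [12, 25],    # smile with parted lips
    "sadness":   [1, 15],     # raised inner brows, depressed lip corners
    "surprise":  [1, 5, 25],  # raised brows, widened eyes, parted lips
}

def describe(expression: str) -> str:
    aus = EXPRESSIONS[expression]
    return ", ".join(f"AU {au} ({ROBOT_AUS[au]})" for au in aus)

print(describe("sadness"))  # AU 1 (inner brow raiser), AU 15 (lip corner depressor)
```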
22.
Social Signal Interpretation: SSI by Augsburg University
Multiple sensor input: ECG, skin conductance, blood glucose level, speech, acceleration, ...
Preprocessing and feature analysis: filtering, frequency analysis, ...
Pattern recognition
Fusion and final decision: physiological and affective state, context information
SSI is freely available at: http://www.openssi.net
Johannes Wagner, Florian Lingenfelser, Tobias Baur, Ionut Damian, Felix Kistler, Elisabeth André: The Social Signal Interpretation (SSI) Framework: Multimodal Signal Processing and Recognition in Real-Time. ACM Multimedia 2013: 831-834
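SSI itself is a C++ framework configured through its own pipeline descriptions; the following Python sketch only illustrates the generic stage structure described above (sensor input, feature extraction, per-modality recognition, fusion) and uses no actual SSI API:

```python
# Illustrative only: the sensor -> features -> recognition -> fusion flow
# of an SSI-style pipeline. Not actual SSI API.
class Pipeline:
    def __init__(self, sensors, extractors, recognizers, fusion):
        self.sensors = sensors          # e.g., ECG, skin conductance, speech
        self.extractors = extractors    # filtering, frequency analysis, ...
        self.recognizers = recognizers  # per-modality pattern recognition
        self.fusion = fusion            # final multimodal decision

    def step(self):
        raw = {name: read() for name, read in self.sensors.items()}
        feats = {name: self.extractors[name](x) for name, x in raw.items()}
        scores = {name: self.recognizers[name](f) for name, f in feats.items()}
        return self.fusion(scores)      # e.g., affective state + confidence

pipe = Pipeline(
    sensors={"ecg": lambda: [0.1, 0.2, 0.15]},
    extractors={"ecg": lambda x: sum(x) / len(x)},    # toy feature: mean
    recognizers={"ecg": lambda f: {"arousal": f * 4}},
    fusion=lambda s: max(s.values(), key=lambda d: d["arousal"]),
)
print(pipe.step())  # {'arousal': 0.6...}
```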
29.
Multimodal Dialogue with a Robot
G. Mehlmann, M. Häring, K. Janowski, T. Baur, P. Gebhard, E. André: Exploring a Model of Gaze for Grounding in Multimodal HRI. ICMI 2014: 247-254
30.
Research Strategy
An iterative cycle: collect a corpus of human social behaviors; build a model of human social behaviors from it using statistics; simulate the model through multimodal behavior simulation; evaluate the result; refine the model.
33.
Gaze Recognition
The glasses provide the video image and the gaze coordinates.
Example coordinate stream (x, y in image pixels; "-, -" marks a missing sample):
-, -  /  -, -  /  ...  /  156, 543  /  189, 527  /  145, 567  /  211, 542
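A sketch of parsing such a stream, where "-" means no gaze sample was available (e.g., during a blink or tracking loss); the exact line format is my assumption:

```python
from typing import Optional, Tuple

def parse_sample(line: str) -> Optional[Tuple[int, int]]:
    """Parse one 'x, y' line from the eye tracker; '-' marks missing data."""
    x, y = (field.strip() for field in line.split(","))
    if x == "-" or y == "-":
        return None  # e.g., blink or tracking loss
    return int(x), int(y)

stream = ["-, -", "-, -", "156, 543", "189, 527", "145, 567", "211, 542"]
samples = [parse_sample(line) for line in stream]
valid = [s for s in samples if s is not None]
print(f"{len(valid)}/{len(samples)} valid samples, first fixation at {valid[0]}")
```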
34.
Gaze-Based Disambiguation
While the user asks "Do you mean this red object there?", the eye tracker records which of the numbered objects in the scene the user's gaze rests on at each moment:
Gaze:   1 1 1 2 2 3 3 3 2 1 1 1 2 3
Speech: "Do you mean this red object there?"
Use this information for disambiguation: aligning the gaze track with the speech reveals which object the deictic expression refers to.
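A minimal sketch of the underlying idea: count which object the gaze rested on while the referring expression was uttered and take the most fixated one. The window boundaries below are illustrative, not from the paper:

```python
from collections import Counter

# Object IDs fixated per time step while the user says
# "Do you mean this red object there?"
gaze_track = [1, 1, 1, 2, 2, 3, 3, 3, 2, 1, 1, 1, 2, 3]

def disambiguate(gaze_ids, start, end):
    """Most fixated object within the utterance window [start, end)."""
    counts = Counter(gaze_ids[start:end])
    return counts.most_common(1)[0][0]

# Assume the phrase "this red object there" spans samples 4..9.
print(disambiguate(gaze_track, 4, 9))  # 3 -> the object looked at most
```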
36.
Robot Behavior
The robot's behavior depends on its role.
In the speaker role, the robot awaits the dialog manager's decision to play a behavior.
In the addressee role, the robot shows some idle gaze behavior, occasionally reacting to the user's gaze movements, emotional expressions, and other cues.
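A sketch of this role-dependent dispatch; the function and event names are hypothetical:

```python
import random

def act(role: str, pending_behavior=None, user_event=None):
    """Pick the robot's next action depending on its conversational role."""
    if role == "speaker":
        # Wait for the dialog manager's decision, then play it.
        return pending_behavior or "wait_for_dialog_manager"
    if role == "addressee":
        if user_event in ("gaze_shift", "emotional_expression"):
            return f"react_to_{user_event}"  # occasionally mirror the user
        return random.choice(["idle_gaze_left", "idle_gaze_right", "idle_blink"])
    raise ValueError(f"unknown role: {role}")

print(act("addressee", user_event="gaze_shift"))  # react_to_gaze_shift
```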
37.
Gaze-based Interaction
Object grounding:
The robot follows the user’s hand movements.
The robot follows the user’s gaze.
Social grounding:
The robot seeks and recognizes mutual gaze.
Turn management:
The robot recognizes when the user yields the turn.
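As an illustration of the social grounding part, mutual gaze can be detected by checking that both parties look at each other within overlapping time windows; the interval representation and the 0.5 s threshold are my assumptions:

```python
def mutual_gaze(user_on_robot, robot_on_user, min_overlap_s=0.5):
    """Detect mutual gaze from two lists of (start, end) intervals in seconds."""
    for u0, u1 in user_on_robot:
        for r0, r1 in robot_on_user:
            # Mutual gaze requires both gaze intervals to overlap long enough.
            if min(u1, r1) - max(u0, r0) >= min_overlap_s:
                return True
    return False

print(mutual_gaze([(2.0, 3.2)], [(2.5, 4.0)]))  # True: 0.7 s of overlap
```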
39.
Results of a Study
Object grounding was more effective than social grounding.
People were able to interact more efficiently with object grounding.
Social grounding did not improve the perception of the interaction.
Assumption:
People were concentrating on the task rather than on the social interaction with the robot.
42.
Conclusions
Effect of gaze-aware agents:
Gaze-aware agents have a positive effect on user perception.
Gaze-aware agents improve grounding.
Side effects:
Midas Touch Problem:
• The agent should not respond to each detected gaze behavior.
Unnatural user behavior:
• Use of gaze as a pointing device
Timing is the key.
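One common remedy for the Midas touch problem noted above is a dwell-time filter: the agent reacts only when a fixation persists, so brief glances pass without triggering a response. A sketch; the threshold of five consecutive samples is illustrative:

```python
def filtered_fixations(samples, dwell_threshold=5):
    """Yield a target only after the gaze has rested on it for
    `dwell_threshold` consecutive samples; shorter glances are ignored."""
    current, count = None, 0
    for target in samples:
        count = count + 1 if target == current else 1
        current = target
        if count == dwell_threshold:
            yield target  # fire once per sustained fixation

stream = ["A", "B", "B", "C", "C", "C", "C", "C", "A"]
print(list(filtered_fixations(stream)))  # ['C'] -- brief glances at A, B ignored
```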