1. On Grounding Human Communication with
Human-Computer Interaction Designs
Hao-Chuan Wang (王浩全)
Department of Computer Science
Institute of Information Systems and Applications
National Tsing Hua University
http://www.cs.nthu.edu.tw/~haochuan
May 26, 2014 @ Department of Communication and Technology, National Chiao Tung University
3. The two “senses” of Human-Computer Interaction: From interface …
“Interaction” in the sense of computers listening and responding to people’s input
4. … to problem solving and value creation in the real world
“Interaction” in the sense of designing technologies based on user needs, goals, constraints, and characteristics.
UCD: User-Centered Design. Identifying and fixing usability problems
Technology-supported education
Persuasive (behavioral change) computing
5. HCI: Studying Existing and Possible Relationships between Computers and People
ACM SIGCHI Curricula, 1996
6. 30 Years of the HCI Community
ACM SIGCHI: 9 Turing Award Winners / 188 ACM Fellows
http://dl.acm.org/sig.cfm?id=SP923
7. What’s Changing in HCI Today?
The big picture is still there, but:
• More emphasis is on use contexts and applications.
• Computers are of many forms, doing all sorts of things.
• Computing is not necessarily done by silicon-chip computers.
- Input and output are versatile. Not necessarily “keyboard and mouse”, “text, speech or graphics”
- Collaboration and social. Not necessarily “one human, one computer”.
11. Supporting Human Communication
Communication in the sense of data transmission across physical distance is not that hard today
• Wired and wireless computer networking, internet etc.
Communication in the sense of understanding each other, or crossing the “psychological distance” between people, remains hard
• Difficulties in expressing or understanding thoughts
• Barriers between generations, genders, professions, languages, and cultures.
Supporting human communication continues to be a challenging yet worthwhile topic in HCI.
14. Lost in Technologies
However, technology development does not always approach the goal effectively. For example:
Video conferencing
• Bandwidth-demanding. Video lagging that disrupts conversation
• Adoption is not guaranteed. Privacy and other social concerns
Machine translation
• Quality concern
• A non-fluent second language can beat MT (cf. Yamashita & Ishida, 2006).
15. Observation
Designs of CMC can work better when features and constraints of human communication are investigated and considered.
Ex. Awareness indicator that makes “typing” visible in instant messaging.
Basic research stays relevant!
What are the features of successful and unsuccessful communication?
What’s the nature of “understanding”?
17. How Would You Describe…
Where you live in Hsinchu?
Where you lived when you were in the U.S.?
20. My Answer
Where you live in Hsinchu?
Near the back gate of National Tsing Hua University (清大後門).
Where you lived when you were in the U.S.?
In Ithaca, a college town in the middle of New York State, if you know where it is. It’s where Cornell University is located.
Do you see the general difference? Why?
The amount of knowledge that we share.
21. Common Ground
Knowledge, beliefs, and attitudes we share, and know that we share, and know that we know that we share, influence how we use language to communicate.
Grounding: Interactive process by which communicators exchange evidence of their understanding to arrive at the state of common ground.
Herbert Clark
23. Evidence of Common Ground
Physical co-presence (being co-located)
• “Close that door”
Shared community membership
• “Let’s meet at 7-Eleven (小七)”
Linguistic co-presence (can access same utterances)
“What’s this?”
25. The Role of Media: Affordances
An influential HCI-rooted concept, which roughly means “action-permitting properties” of objects that people see
• Chair affords sitting
• Door-knob affords door-opening
• Virtual keyboard affords typing (but is this trivial?)
Don Norman
27. Technology Changes Grounding
Affordances of media constrain how people may interact with one another
• E.g., if no visibility, impossible to use head-nodding as a technique for grounding
People may learn to adapt their grounding behaviors (this happens, e.g., emoticons in IM)
or
Design new CMC tools with useful properties to support grounding and communication.
30. Kinect-taped Communication:
Using Motion Sensing to Study Gesture Use
and Similarity in Face-to-Face and
Computer-Mediated Brainstorming
Hao-Chuan Wang, Chien-Tung Lai
National Tsing Hua University, Taiwan
31. How do media influence communication?
Computer-mediated communication (CMC) tools are prevalent, but are they all equal?
• Ex. Video vs. Audio
Media properties influence aspects of communication differently
• Task performance, grounding, styles, similarity of language patterns, social processes and outcomes, etc.
[cf. Bos et al., 2002; Setlock et al., 2004; Scissors et al., 2008; Wang et al., 2009]
32. The (missing) non-verbal aspect in CMC research
Communication could be more than speaking.
Both verbal and non-verbal channels are active during conversations.
Facial expression
Gesture
[cf. Goldin-Meadow, 1999; Giles & Coupland, 1991]
33. Studying gesture use in communication
Current methods:
• Videotaping with manual coding.
• Giving specific instructions to participants (e.g., to gesture or not).
• Using confederates etc.
Problems to solve:
• High cost. Labor-intensiveness.
• Resolution of manual analysis: hard to recognize and reliably label small movements.
• Scalability: hard to study arbitrary communication in the wild.
34. “Kinect-taping” method
As with videotaping, we use motion-sensing devices, such as Microsoft Kinect, to record hand and body movements during conversations.
• Detailed, easier-to-process representations.
• Behavioral science instrument (“microscope”) to study non-verbal communication in ad hoc groups.
• Low cost if automatic measures are satisfactory.
35. Re-appropriating motion sensors in HCI: Sensing-aided user research for future designs
From sensors as design elements to sensors as research instruments to help future designs.
[cf. Mark et al., 2014]
36. A media comparison study
Investigate how people use gestures during face-to-face and computer-mediated brainstorming
Compare three communication media
• Face-to-Face
• Video
• Audio
Figure 1. A sample study setting that compares (a) face-to-face (F2F) communication to (b) video-mediated communication by using Kinect as a behavioral science instrument.
37. Hypotheses
H1. Visibility increases gesture use
Proportion of gesture
Face-to-Face Video Audio
H2. Visibility increases accommodation
Similarity between group members’ gestures
Face-to-Face Video Audio
Also explore how gesture use, level of understanding,
and ideation productivity correlate.
[cf. Clark & Brennan, 1991]
[cf. Giles & Coupland, 1991]
38. Experimental design
36 individuals, 18 two-person groups
Kinect-taped group brainstorming sessions
Face-to-Face
Video
Audio
Three trials (15 min each) in counterbalanced order
Data analysis
Amount and similarity of gestures,
Level of understanding, Productivity
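A minimal sketch of how a counterbalanced order for the three media could be assigned. The slides do not say which scheme was used; cycling through all six permutations of the media, so that each order is run by three of the 18 groups, is an assumption, and `counterbalanced_orders` is a hypothetical helper.

```python
from itertools import permutations

# The three media compared in the study.
CONDITIONS = ("Face-to-Face", "Video", "Audio")


def counterbalanced_orders(n_groups, conditions=CONDITIONS):
    """Assign each group one condition order, cycling through all permutations.

    Assumption: full counterbalancing (every possible order used equally often).
    """
    orders = list(permutations(conditions))  # 3! = 6 possible orders
    return [orders[g % len(orders)] for g in range(n_groups)]


schedule = counterbalanced_orders(18)
for group_id, order in enumerate(schedule, start=1):
    print(f"Group {group_id:2d}: " + " -> ".join(order))
```

With 18 groups and 6 orders, each medium appears in each trial position for exactly 6 groups, which is what keeps trial order from being confounded with medium.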
39. How to quantify gestures?
How many gestures are there in a 15 min talk?
47. Feature extraction and representation
Unit motions are represented as feature vectors
• Time length, path length, displacement,
velocity, speed, angular movement etc.
• Features extracted for both hands and both
elbows.
73 features extracted for each unit motion.
Similarity between unit motions: Cosine value
between the two vectors.
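To make this representation concrete, here is a small sketch that turns one joint's trajectory into a feature vector and compares two unit motions by cosine value. The four features are a subset chosen for illustration (the study extracted 73 across both hands and both elbows), and the function names and sample coordinates are hypothetical.

```python
import math


def unit_motion_features(points, timestamps):
    """Build a small feature vector for one joint's unit motion.

    points: list of (x, y, z) positions; timestamps: matching times in seconds.
    Returns [time length, path length, displacement, mean speed].
    """
    duration = timestamps[-1] - timestamps[0]
    # Path length: sum of distances between consecutive samples.
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    # Displacement: straight-line distance from start to end.
    displacement = math.dist(points[0], points[-1])
    mean_speed = path_length / duration if duration > 0 else 0.0
    return [duration, path_length, displacement, mean_speed]


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two made-up hand trajectories sampled at the same times.
times = [0.0, 0.5, 1.0]
motion_a = unit_motion_features([(0, 0, 0), (1, 0, 0), (1, 1, 0)], times)
motion_b = unit_motion_features([(0, 0, 0), (2, 0, 0), (2, 2, 0)], times)
similarity = cosine_similarity(motion_a, motion_b)
print(f"cosine similarity: {similarity:.3f}")
```

Because features such as time length and path length are on different scales, one might standardize each feature before taking the cosine; whether the study did so is not stated in the slides.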
48. Validating the similarity metric
• Randomly select motion queries from the Kinect-taped motion database
• Retrieve similar and dissimilar motions
• Compare machine ranking (1, 2, 3) with human ranking (1, 2, 3)
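One way to compare a machine ranking against a human ranking is a rank correlation. The slides do not name the agreement measure that was used, so Spearman's rho is an assumption here, and the sketch only handles rankings without ties.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items.

    rank_a[i] and rank_b[i] are the ranks (1 = most similar) that the two
    raters gave to item i. Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    """
    if len(rank_a) != len(rank_b) or len(rank_a) < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings of three retrieved motions for one query.
machine_ranking = [1, 2, 3]
human_ranking = [2, 1, 3]
print(spearman_rho(machine_ranking, human_ranking))
```

A value near 1 would mean the metric orders retrieved motions the way people do; a value near -1 would mean it orders them in reverse.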
50. Key Results
H1: Amount of gesture use
H2: Similarity between group members
Associations
• Amount of gesture and understanding
• Amount of gesture and ideation productivity
• Gesture similarity and ideation productivity
51. Visibility on proportion of gesture use
[Bar chart: proportion of gesture use (%), axis 0 to 16, for Face-to-face, Video, and Audio]
H1 not supported. Media did not influence percentage of gesture.
People gesture as much in Audio as in F2F and Video.
52. Association between self-gesture and level of understanding
[Plot: model-predicted understanding vs. individual’s own gesture use (%); series: Audio, F2F, Video]
• Non-communicative function of gesture.
• Understanding correlates with self-gesture but not partner-gesture.
• Stronger correlation with reduced or no visibility.
53. Similarity between group members
[Bar chart: between-participant gestural similarity, axis 0.46 to 0.55, for Face-to-face, Video, and Audio]
H2 supported. Similarity F2F Video Audio.
People gesture more similarly when they can see each other.
55. Summary and implications
Kinect-taping Method
• Motion sensing for studying non-verbal behaviors in CMC.
Media Comparison Study
• Visibility influences similarity but not amount of gesture.
• Only self-gesture correlates with understanding.
• Gesture doesn’t seem to convey much meaning to the partner. Seeing the partner is not crucial to understanding.
56. Summary and implications (cont.)
Kinect-taping Method
• Study communication of ad hoc groups in the wild.
• Distributed deployment study of CMC tools.
• Cross-lingual and cross-cultural communication.
Media Comparison Study
• The value of video may be relatively limited to the social and collaborative aspect (similarity etc.).
• Feedback that promotes self-gesturing may help understanding.
57. Effects of Interface Interactivity on Collecting
Language Data to Power Dialogue Agents
Hao-Chuan Wang, Tau-Heng Yeo, Hsin-Hui Lee, Ai-Ju Huang
National Tsing Hua University, Hsinchu, Taiwan
Jia-Jang Tu, Sen-Chia Chang
Industrial Technology Research Institute, Hsinchu, Taiwan
58. “What’s the top-grossing movie in
2012?”
“Let me see... The Avengers.”
“The top-grossing movie in 2012
is The Avengers”
59. Spoken Dialogue Systems
Young, S., Keiser, S., & Gašić, M. Spoken Dialogue Management using Partially Observable Markov Decision Processes
ChiCHI 2014 |
Effects of Interface Interactivity on Collecting Language Data to Power
Dialogue Agents
60. Language Generation Task
How to collect more natural language responses?
Young, S., Keiser, S., & Gašić, M. Spoken Dialogue Management using Partially Observable Markov Decision Processes
61. Some Existing Methods
• One-on-one interviews to get the responses
from people
- Manual data collection.
- Expensive.
• Using surveys with specific instructions,
“Imagine that you’re answering people’s
questions …”
- Less expensive.
- Non-interactive, “imagined interaction”.
62. Idea: Using an Interactive Chat Bot to Elicit Natural Responses
63. Anthropomorphic features:
✓ Greet workers
✓ Simulate human typing delays
✓ Wait for response
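A rough sketch of the three anthropomorphic features listed here (greeting, typing delay, waiting for a reply). The greeting text, typing speed, delay cap, and function names are all assumptions for illustration, not the interface used in the study.

```python
import time


def typing_delay(message, chars_per_second=8.0, max_delay=3.0):
    """Seconds to pause before 'sending' a message, capped at max_delay.

    Both the typing speed and the cap are assumed values.
    """
    return min(len(message) / chars_per_second, max_delay)


def run_chat(questions, read_reply=input, send=print, sleep=time.sleep):
    """Greet the worker, then ask each question and wait for the reply.

    read_reply, send, and sleep are injectable so the loop can be driven
    by a web front end or a test instead of the terminal.
    """
    script = ["Hi! Thanks for helping out today."] + list(questions)
    replies = []
    for i, message in enumerate(script):
        sleep(typing_delay(message))      # simulate human typing time
        send(message)
        if i > 0:                         # the greeting expects no answer
            replies.append(read_reply())  # block until the worker responds
    return replies
```

Calling `run_chat(["What's the top-grossing movie in 2012?"])` in a terminal prints the greeting, pauses, asks the question, and blocks on `input()` until the worker types an answer.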
64. Static Interface
65. Crowdsourcing Answer Generation and Evaluation
Stage 1: Creation
• Compare interactive and static interface
Stage 2: Aggregation
• Crowdsourcing to select quality responses
Evaluation Stage
• Evaluate the results with end users
66. Multilingual Crowdsourcing Study
• PTT: A BBS System and Online Community in Taiwan
• MTurk
Chinese and English versions of ads and task instructions are prepared for crowdsourcing
67. Stage 1 : Answer Creation
• 223 workers
- 122 from MTurk
- 101 from PTT
Stage 2 : Answer Aggregation
• 222 workers
Evaluation
• 165 workers
- 98 from MTurk
- 67 from PTT
68. Key Results
69. Interactive vs. Static Interface
• 73.6% of comments show
preference for working with
the interactive chat bot.
• Increasing the satisfaction of
workers (Kittur, A., et al. 2013)
70. Interactive vs. Static Interface
“Chat is much fun and more likely to
make me think, while questionnaire is
more standardized, like an exam.”
“the chat interface is much better. it
recognizes the text entered in real time and
responds accordingly with artificial
intelligence and recognition. very nice”
72. MTurk vs. PTT: Language
• Two platforms are highly language-specific.
[Bar chart, axis 0 to 120: answers in English and answers in Chinese for Chinese recruitment ads (PTT), English recruitment ads (PTT), Chinese recruitment ads (MTurk), and English recruitment ads (MTurk)]
73. Evaluation: Ultimate User Experience
• Cultural Differences (Nisbett, R., 2003; Hall, E. T., 1977).
[Chart: enjoyability, axis 2.5 to 3.5, of answers collected w/ interactivity vs. answers collected w/ questionnaire, for Chinese and English]
74. Conclusion
• We present an interactive chat-bot-based interface for crowdsourcing language generation tasks, for building natural dialogue agents.
• Interactivity led to higher worker satisfaction and better perceived enjoyability among Chinese-speaking users.
• We also identified the language specificity of crowdsourcing platforms, which helps to inform crowdsourcing practices.
75. Thank you for listening.
Acknowledgement
This study is partially supported by
Project D352B24310 and
conducted at ITRI under the
sponsorship of the Ministry of
Economic Affairs, Taiwan.
Contact
Hao-Chuan Wang
haochuan@cs.nthu.edu.tw
Ai-Ju (Ivy) Huang
navys23@gmail.com
76. Key Messages
Supporting human communication continues to be an important topic in HCI, both to research and design practice.
• Focusing on how to shorten the “psychological distance” between people. “Mind-connecting”!
Basic and applied behavioral, cognitive and social sciences help to understand the features of successful and unsuccessful communication
• Insight that we should focus on CMC affordances as much as technicality.
Interdisciplinary work can benefit both sides: Social and behavioral sciences help technology design, and vice versa.
78. NTHU Collaborative and Social Computing Lab (CSC Lab) (國立清華大學人機合作與社群運算實驗室)
Acknowledgement for Support from
• Ministry of Science and Technology, Taiwan (科技部)
• Google Inc. (美國Google總部)
• Microsoft Research Asia (微軟亞洲研究院)
• Industrial Technology Research Institute (ITRI) (工業技術研究院)
• Delta Corp (台達電子公司)
• National Science Foundation, USA (美國NSF)