2. Mixed-Initiative Interaction
Conventional systems: User initiates interaction and commands the system
Mixed-initiative: System sometimes initiates interaction with the user
– You have mail.
– Can I help you find that?
– Here is something useful to you.
MIT Media Lab, Microsoft Research
Related PARC Research:
– Responsive Technologies: Camera-based tracking [ICDSC 2008]
– Psychographic Profiling: Clothing Recognition, Responsive Mirror [IUI 2008, HCII 2009]
– Information Recommendation: Magitti [CHI 2008]
– also: Human-Robot Interaction, Multi-party conversations
3. Business Marketplace
In-store signage
– Traditional: Point-of-Purchase displays, shelf positioning,
packaging, store-handouts (coupons), specials (e.g., Kmart
blue light), aisle coupons, loyalty programs (lower price)
– Emerging: digital kiosks, digital signage, directed audio
Companies
– NewsAmerica leases store space and sells ad space to
consumer packaged goods companies
Search Engine Marketing
$13B to $26B in 2013
Advertisers pay more for
personalization
Reactrix charged higher rates than
static digital signage.
Reactrix is just the tip of the iceberg.
4. Avatars
Today we are just at the tip of the iceberg in conversational interaction
In the future we will interact with all types of technology as if they were social entities
Voice Systems
Media
Robots
Service Agents
Marketing
Sales
Education
Therapy
Performance
Coaching
5. In-Store Product Recommendations
Sensor Data Type | Question | Inferred Perception | User Goal | Personalized Recommendation
Web Shopping Personal Profile | Previous purchases | Items: x, y, z, … | Interest Profile: style, colors, price range, etc. | Similar items today
Eye-contact sensors | What product is she looking at now? | A blue blazer | Looking for business clothes | Matching skirts in the store
Tracking Sensors | Is she searching or just browsing? | Browsing | Shopping for gifts | Highlight gift items
Floor Sensors | Is this a group or individual? | Group | Wants to show “Fashion sense” | Highlight new trendy fashions
Motion Sensors | Is she rushing in a hurry? | Rushing | Needs to decide quickly | Display impulse-buy items
Responsive Personalized Sales Promotions
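The chain on this slide (sensor data, inferred perception, user goal, recommendation) can be sketched as a small rule table. This is only an illustration: the `Perception` fields and the `recommend` function are hypothetical names, and the rules simply restate the example rows of the slide rather than any deployed system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Perception:
    """Hypothetical summary of what the in-store sensors inferred."""
    gazed_product: Optional[str] = None  # eye-contact sensors
    browsing: bool = False               # tracking sensors
    in_group: bool = False               # floor sensors
    rushing: bool = False                # motion sensors


def recommend(p: Perception) -> List[str]:
    """Map inferred perceptions to the recommendations listed on the slide."""
    recs: List[str] = []
    if p.gazed_product is not None:
        recs.append(f"show items matching {p.gazed_product}")
    if p.browsing:
        recs.append("highlight gift items")
    if p.in_group:
        recs.append("highlight new trendy fashions")
    if p.rushing:
        recs.append("display impulse-buy items")
    return recs
```

For example, `recommend(Perception(gazed_product="a blue blazer", rushing=True))` yields one recommendation for matching items and one for impulse-buy items.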
6. Existing Research: Many indicators
of a person’s engagement with media
Indicators: eye gaze; facial affect; eye blinks; pupil dilation; skin temperature; vocal affect; proximity, orientation of head & body
Cited work: [Daugman ’94], [Haro, Flickner, Essa 2000], [Grauman et al. ’01], [Cohen et al. ’03], [Vogel & Balakrishnan ’04], [Yu, Aoki, Woodruff, PARC ’04]
Component technologies exist, but not integrated, not
directed by behavioral models:
- What are sequential structures of engaged interactions?
- Which indicators are most predictive of engagement?
- Can we predict disengagement before it happens?
7. Responsive and Personalized Public Information Display
Attract and maintain audience engagement
Content follows interaction model toward an objective:
Marketing, Entertainment, Education, …
[HCII 2009]
Interaction Structure of a Marketing Engagement [1]
–Approach (hook)
–Assess
–Relax
–Describe
–Benefit
–How to buy
–Reduce resistance
–Incentive to act
Monitor & Re-engage as needed
Engineering approach (Reactrix) currently achieves Phase 1 using disruptive techniques
Phase 4 is the real value – requires recognizing human micro-behaviors
Conversation and interaction analysis bring clarity to vague notions like “engagement”
– Detect, describe and model the structured organization of natural interaction
– Create systems that interact and respond to individuals
[1] Robert Prus, Making Sales
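The engagement stages on this slide, combined with "Monitor & Re-engage as needed", can be sketched as a simple sequencer. This is a minimal illustration under two assumptions not stated on the slide: engagement arrives as a boolean signal per step, and re-engaging means returning to the Approach stage. The names `next_stage` and `run` are hypothetical.

```python
from typing import Iterable, List

STAGES = [
    "Approach", "Assess", "Relax", "Describe",
    "Benefit", "How to buy", "Reduce resistance", "Incentive to act",
]


def next_stage(current: int, engaged: bool) -> int:
    """Advance one stage while engaged; otherwise restart at Approach (assumed)."""
    if not engaged:
        return 0
    return min(current + 1, len(STAGES) - 1)


def run(engagement_signals: Iterable[bool]) -> List[str]:
    """Return the sequence of stages visited for a stream of engagement signals."""
    stage = 0
    trace = [STAGES[stage]]
    for engaged in engagement_signals:
        stage = next_stage(stage, engaged)
        trace.append(STAGES[stage])
    return trace
```

A real system would likely re-enter at a stage chosen by the interaction model rather than always restarting; the restart rule here only keeps the sketch short.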
8. Improving social capability and interactive personalization
Linking research in human behavior to technology design.
Making systems socially interactive
Conversation analysis (CA) can build a more personalized, smooth interaction between technological systems and
humans
Interaction Analysis provides frameworks, inspired by conversational structures, for designing technology
Previous research: Sotto Voce, Responsive Mirror, Human-Robot Interaction
Broad Applications of Conversational Responsiveness
Any field with interactive customer-facing features:
– Call centers and interactive voice response: improve voice interactions
– Games: make characters more interactive
– Mobile phone manufacturers: make more use of conversational data (i.e., analyze conversations to provide feedback)
– Automobiles: design better audio-based interfaces
9. Sales Interaction Model
Representing Elements of Sellers’ Goals
Representation of dependencies and degree to which
each sales goal has been achieved
[Diagram: sales goals Engage, Offer Service, Assess Customer Need, Present Products, Show, Neutralize Reservations, Obtain Commitment; trust goals Generate Trust, Maintain Trust, Maximize Trust; transitions labeled “engaged”, “Not engaged”, “Appears uninterested”, “low”]
[adapted from Making Sales, Robert Prus]
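One way to represent "dependencies and degree to which each sales goal has been achieved" is a small goal graph. This is a sketch only: the goal names come from the slide, but the dependency chain in `example_model`, the 0.0 to 1.0 degree scale, and the 0.5 threshold are assumptions made for illustration, since the diagram's actual arrows are not recoverable here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SalesGoal:
    name: str
    depends_on: List[str] = field(default_factory=list)
    degree: float = 0.0  # assumed scale: 0.0 = not started, 1.0 = fully achieved


class SalesGoalModel:
    def __init__(self, goals: List[SalesGoal]) -> None:
        self.goals: Dict[str, SalesGoal] = {g.name: g for g in goals}

    def update(self, name: str, degree: float) -> None:
        """Record progress on a goal, clamped to the range [0, 1]."""
        self.goals[name].degree = max(0.0, min(1.0, degree))

    def ready(self, threshold: float = 0.5) -> List[str]:
        """Unachieved goals whose dependencies all meet the threshold."""
        return [
            g.name for g in self.goals.values()
            if g.degree < threshold
            and all(self.goals[d].degree >= threshold for d in g.depends_on)
        ]


def example_model() -> SalesGoalModel:
    # Hypothetical linear dependency chain, not taken from the diagram.
    return SalesGoalModel([
        SalesGoal("Engage"),
        SalesGoal("Assess Customer Need", ["Engage"]),
        SalesGoal("Present Products", ["Assess Customer Need"]),
        SalesGoal("Obtain Commitment", ["Present Products"]),
    ])
```

Calling `ready()` after each `update` gives the goals the seller (or system) could pursue next.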
10. Psychographic Profiling
through Clothes Recognition
Men’s shirts: multiple features
– Collar vs. crew neck
– Short vs. long sleeve
– Color, texture
– Pattern, emblems
What you wear says more about
your tastes than demographics
[Zhang, et al. IUI 2008]
12. Shirt style classification
Classes
Class | Collar | Sleeve | Button
T-shirt | No | Short | No
Polo shirt | Yes | Short | Half
Casual shirt | Yes | Short | Full
Business shirt | Yes | Long | Full
SVM results (rows: actual class; columns: classified as)
Actual | T-shirt | Polo | Casual | Business
T-shirt | 80.8% | 3.9% | 15.4% | 0%
Polo | 16.7% | 41.7% | 8.3% | 33.3%
Casual | 0% | 12.5% | 50% | 37.5%
Business | 0% | 5% | 5% | 90%
Overall accuracy: 72.7%
Sellers would approach someone wearing a T-shirt differently
than someone wearing a Business shirt
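The overall accuracy cannot be read off the per-class percentages alone, because it depends on how many test images each class had. The sketch below shows the arithmetic under an explicit assumption: the per-class totals (26, 12, 8, 20) are not on the slide; they are counts chosen because they approximately reproduce the rounded row percentages. The function names are illustrative.

```python
from typing import List

CLASSES = ["T-shirt", "Polo", "Casual", "Business"]

# Row-normalized confusion matrix from the slide
# (rows: actual class, columns: classified as).
PERCENT = [
    [80.8, 3.9, 15.4, 0.0],
    [16.7, 41.7, 8.3, 33.3],
    [0.0, 12.5, 50.0, 37.5],
    [0.0, 5.0, 5.0, 90.0],
]

# ASSUMPTION: test-set size per class, inferred from the percentages.
ASSUMED_ROW_TOTALS = [26, 12, 8, 20]


def counts_from_percent(percent: List[List[float]], totals: List[int]) -> List[List[int]]:
    """Convert row percentages back to integer counts."""
    return [[round(p / 100 * n) for p in row] for row, n in zip(percent, totals)]


def overall_accuracy(counts: List[List[int]]) -> float:
    """Fraction of all samples on the diagonal (correctly classified)."""
    correct = sum(counts[i][i] for i in range(len(counts)))
    total = sum(sum(row) for row in counts)
    return correct / total


counts = counts_from_percent(PERCENT, ASSUMED_ROW_TOTALS)
accuracy = overall_accuracy(counts)  # 48 correct out of 66
```

Under these assumed totals the result is 48/66, which rounds to the 72.7% reported on the slide; a different class balance would give a different overall figure.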
13. Research Opportunities
Perception
• Detect external cues that indicate internal mental state
Computer Vision
• Robust algorithms to detect specific behaviors
• Measures of inaccuracy
• Other Sensors
• Audio, thermal, pupil, etc.
Composable Content
• Content organized according to abstract actions
Multimedia Data Structures
• Efficient data structures for realtime program re-composition
Decision Engine & Objective Model
• Select best abstract response toward objective
Ethnography
• Internal user mental states
• External behavioral cues
• Abstract actions toward objective
Interaction Engine
• Develop realtime decision engine
14. Interdependencies Between Perception,
Decision and Action Components
Perception
• Detect external cues that indicate internal mental state
Computer Vision
• Robust algorithms to detect specific behaviors
• Measures of inaccuracy
Composable Content
• Content organized according to abstract actions
Multimedia Data Structures
• Efficient data structures for realtime program composition
Decision Engine & Objective Model
• Select best abstract response toward objective
Ethnography
• Identify user mental states
• Identify external cues of mental state
• Identify abstract actions leading to an objective
Between Perception and Decision:
1. Decision engine and objective model depend on reliability of computer vision techniques.
2. Required computer vision depends on needs of decision engine and objective model.
Between Decision and Content:
1. Structure of composable content framework depends on output of decision engine and objective model.
2. Output of decision engine and objective model should allow for realtime composition of content.
15. Responsive Interaction Platform
Sensing of Environment
– Image/Video Analyzer: eye gaze, hand/body gestures, facial expression
– Audio Analyzer: non-vocal sounds, speech
– Sensor Analyzer: sensor features …
Perception of Environment
– Person Model (model of internal state): Emotional state, Energy level, Patience, Mental activity – thinking, confusion, Interest level, Attitude toward information, …
– Home position, Interaction among people, Positions and postures
– Output: state of environment
Interaction Engine
Interaction Model
This is the sequencing structure in the POMDP framework that defines what stages the interaction should follow. E.g., Sales*:
• Approach
• Assess
• Relax
• Pitch
• Benefit
• Reduce Resistance
• Incentive to act
Decision Engine
Select “best” abstract action based on abstract state of environment and the objective. Use the framework of Partially Observable Markov Decision Processes (POMDP).
– Inputs: Interaction stages (from Interaction Model), Objective metrics (from Objective Model)
– Output: abstract action – e.g., Promote Interest, Gain Trust, Present Product, make joke, …
Objective Model
This is the objective function in the POMDP framework that defines what the “best” action is. Example Objectives:
Increase brand awareness,
Introduce new product,
Direct sales to mobile device,
Provide navigation information, …
Content Actuation Engine
Convert abstract action to content segments.
Abstract action | Display | Sound | Ambient Motion, Lights
Promote Interest | Fast Animation | Catchy music | Movement, light flash
Gain trust | Scenes of family life with product | Smooth music | Non-distracting
… | … | … | …
Content Actuation: actuator control
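The decision and actuation steps on this slide can be sketched as a belief update over a hidden engagement state, a choice of abstract action, and a lookup from action to content. This is an illustration, not the platform's implementation: the two-state model, the observation names, and every probability and reward below are hypothetical, and the one-step greedy choice stands in for full POMDP planning, which would optimize expected long-term reward. Only the `CONTENT` rows come from the slide's table.

```python
from typing import Dict

STATES = ["engaged", "disengaged"]
ACTIONS = ["Promote Interest", "Gain Trust"]

# Hypothetical model parameters (all numbers are assumptions).
TRANSITION = {  # P(next_state | state, action)
    "Promote Interest": {"engaged": {"engaged": 0.7, "disengaged": 0.3},
                         "disengaged": {"engaged": 0.5, "disengaged": 0.5}},
    "Gain Trust": {"engaged": {"engaged": 0.9, "disengaged": 0.1},
                   "disengaged": {"engaged": 0.2, "disengaged": 0.8}},
}
OBSERVATION = {  # P(observation | next_state)
    "engaged": {"gaze_on_display": 0.8, "gaze_away": 0.2},
    "disengaged": {"gaze_on_display": 0.3, "gaze_away": 0.7},
}
REWARD = {  # objective model: immediate reward R(state, action)
    "engaged": {"Promote Interest": 0.2, "Gain Trust": 1.0},
    "disengaged": {"Promote Interest": 1.0, "Gain Trust": 0.0},
}

# Abstract action -> content segments, taken from the slide's table.
CONTENT = {
    "Promote Interest": {"display": "Fast Animation", "sound": "Catchy music",
                         "ambient": "Movement, light flash"},
    "Gain Trust": {"display": "Scenes of family life with product",
                   "sound": "Smooth music", "ambient": "Non-distracting"},
}


def update_belief(belief: Dict[str, float], action: str, observation: str) -> Dict[str, float]:
    """Bayes filter step: predict with the transition model, correct with the observation."""
    unnormalized = {}
    for s2 in STATES:
        predicted = sum(belief[s] * TRANSITION[action][s][s2] for s in STATES)
        unnormalized[s2] = OBSERVATION[s2][observation] * predicted
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}


def select_action(belief: Dict[str, float]) -> str:
    """Greedy one-step choice: maximize expected immediate reward under the belief."""
    return max(ACTIONS, key=lambda a: sum(belief[s] * REWARD[s][a] for s in STATES))
```

Usage: start from a belief such as `{"engaged": 0.5, "disengaged": 0.5}`, call `update_belief` after each observation, then pass the result of `select_action` to `CONTENT` to obtain the display, sound, and ambient segments.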
16. Summary
Mixed-Initiative Interaction generates new business opportunities
Mixed-Initiative Interaction Engine
– Inference models to measure audience engagement
» Identify the most predictive set of sensors and the cost tradeoffs
– Precise assessment metrics of content effectiveness
– Engagement Detection
» Convert raw data to human-meaningful cues of engagement
Dynamic content framework
– Maps abstract actions to content segments to achieve the objective
– Tailorable to structure of engagements across multiple target domains
» Education, Training, Service, Sales, etc.
Far-reaching research and invention of next-generation interaction
paradigm for media technologies
– Displays, mobile device, speech conversation, etc.