2. 2
Introduction
• Interested in the functional benefits of emotion
for a cognitive agent
▫ Appraisal theories of emotion
▫ PEACTIDM theory of cognitive control
• Use emotion as a reward signal to a
reinforcement learning agent
▫ Demonstrates a functional benefit of emotion
▫ Provides a theory of the origin of intrinsic reward
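As a rough illustration of using emotion as a reward signal, the sketch below runs one tabular Q-learning update in which the reward is a scalar feeling intensity rather than a reward supplied by the environment. The function name, the state and action labels, the learning-rate and discount values, and the choice of Q-learning itself are assumptions made for illustration; the slides do not specify the learning rule.

```python
from collections import defaultdict

ALPHA = 0.3  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)

def q_update(q, state, action, reward, next_state, next_actions):
    """Apply one tabular Q-learning step and return the new Q(state, action)."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_error = reward + GAMMA * best_next - q[(state, action)]
    q[(state, action)] += ALPHA * td_error
    return q[(state, action)]

q = defaultdict(float)
# Hypothetical: the agent's current feeling, scaled to [-1, 1], serves as the reward.
feeling_reward = 0.5
new_value = q_update(q, "s0", "move-south", feeling_reward,
                     "s1", ["move-south", "move-east"])
```

Because a feeling is available on every cycle, an update like this can run far more often than one that waits for a sparse external reward.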
3. 3
Outline
• Background
▫ Integration of emotion and cognition
▫ Integration of emotion and reinforcement learning
▫ Implementation in Soar
• Learning task
• Results
4. 4
Appraisal Theories of Emotion
• A situation is evaluated along a number of appraisal
dimensions, many of which relate the situation to
current goals
▫ Novelty, goal relevance, goal conduciveness, expectedness,
causal agency, etc.
• Appraisals influence emotion
• Emotion can then be coped with (via internal or
external actions)
[Diagram with labels: Situation, Goals, Appraisals, Emotion, Coping]
5. 5
Appraisals to Emotions (Scherer 2001)
Appraisal dimension           Joy                 Fear          Anger
Suddenness                    High/medium         High          High
Unpredictability              High                High          High
Intrinsic pleasantness        -                   Low           -
Goal/need relevance           High                High          High
Cause: agent                  -                   Other/nature  Other
Cause: motive                 Chance/intentional  -             Intentional
Outcome probability           Very high           High          Very high
Discrepancy from expectation  -                   High          High
Conduciveness                 Very high           Low           Low
Control                       -                   -             High
Power                         -                   Very low      High
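One way to read the table is as a set of appraisal profiles that a concrete appraisal can be compared against. The sketch below uses only the five rows in which all three emotions have an entry, and ranks emotions by how many dimensions agree. Scoring by a simple agreement count is an assumption for illustration, not the method of Scherer (2001) or of the talk, and the dictionary keys are hypothetical names.

```python
# Profiles taken from the table rows that list a value for all three emotions.
PROFILES = {
    "joy": {"suddenness": "high/medium", "unpredictability": "high",
            "goal_relevance": "high", "outcome_probability": "very high",
            "conduciveness": "very high"},
    "fear": {"suddenness": "high", "unpredictability": "high",
             "goal_relevance": "high", "outcome_probability": "high",
             "conduciveness": "low"},
    "anger": {"suddenness": "high", "unpredictability": "high",
              "goal_relevance": "high", "outcome_probability": "very high",
              "conduciveness": "low"},
}

def match_score(appraisal, profile):
    """Count the dimensions on which the appraisal agrees with the profile."""
    score = 0
    for dimension, expected in profile.items():
        # "high/medium" means either value is acceptable.
        if appraisal.get(dimension) in expected.split("/"):
            score += 1
    return score

def rank_emotions(appraisal):
    """Return (emotion, score) pairs, best match first."""
    scores = [(name, match_score(appraisal, p)) for name, p in PROFILES.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranked = rank_emotions({
    "suddenness": "high", "unpredictability": "high", "goal_relevance": "high",
    "outcome_probability": "high", "conduciveness": "low",
})
```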
6. 6
Cognitive Control: PEACTIDM (Newell 1990)
Perceive: Obtain raw perception
Encode: Create domain-independent representation
Attend: Choose stimulus to process
Comprehend: Generate structures that relate stimulus to tasks and can be used to inform behavior
Task: Perform task maintenance
Intend: Choose an action, create prediction
Decode: Decompose action into motor commands
Motor: Execute motor commands
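The eight PEACTIDM functions can be written down as an ordered enumeration, which makes the origin of the acronym explicit. The sketch below also passes a value through the stages in listed order. The `run_cycle` helper, the handler dictionary, and the strictly linear ordering are assumptions for illustration; a real agent may choose among functions rather than run them in a fixed sequence.

```python
from enum import Enum

class Stage(Enum):
    PERCEIVE = "Obtain raw perception"
    ENCODE = "Create domain-independent representation"
    ATTEND = "Choose stimulus to process"
    COMPREHEND = "Generate structures that relate stimulus to tasks"
    TASK = "Perform task maintenance"
    INTEND = "Choose an action, create prediction"
    DECODE = "Decompose action into motor commands"
    MOTOR = "Execute motor commands"

def run_cycle(handlers, data=None):
    """Pass data through each stage's handler in definition order."""
    trace = []
    for stage in Stage:
        # Stages without a handler pass the data through unchanged.
        handler = handlers.get(stage, lambda d: d)
        data = handler(data)
        trace.append(stage.name)
    return data, trace

# The first letters of the stage names spell the acronym.
acronym = "".join(stage.name[0] for stage in Stage)
result, trace = run_cycle({Stage.ENCODE: lambda d: d + 1}, 0)
```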
7. 7
Unification of PEACTIDM and Appraisal Theories
[Figure: the PEACTIDM cycle annotated with the appraisals generated at each stage]
Perceive → Raw Perceptual Information → Encode
Encode → Stimulus Relevance (Suddenness, Unpredictability, Goal Relevance, Intrinsic Pleasantness) → Attend
Attend → Stimulus chosen for processing → Comprehend
Comprehend → Current Situation Assessment (Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power) → Intend
Intend → Action, Prediction (Outcome Probability) → Decode
Decode → Motor Commands → Motor
Motor → Environmental Change → Perceive
8. 8
Distinction between emotion, mood, and feeling
(Marinier & Laird 2007)
• Emotion: Result of appraisals
▫ Is about the current situation
• Mood: “Average” over recent emotions
▫ Provides historical context
• Feeling: Emotion “+” Mood
▫ What agent actually perceives
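The relationship among the three terms can be sketched with scalar values. Below, mood is an exponential moving average of recent emotions and feeling is their clipped sum. The scalar representation, the `rate` value, the plain sum, and the clipping range are all assumptions for illustration; the quotation marks around "Average" and "+" on the slide suggest the actual operations are more involved.

```python
def update_mood(mood, emotion, rate=0.1):
    """Move mood a fraction of the way toward the current emotion."""
    return mood + rate * (emotion - mood)

def feeling(emotion, mood):
    """Combine emotion and mood, clipped to the range [-1, 1]."""
    return max(-1.0, min(1.0, emotion + mood))

mood = 0.0
feelings = []
for emotion in [0.8, 0.0, 0.0]:
    feelings.append(feeling(emotion, mood))  # what the agent perceives this cycle
    mood = update_mood(mood, emotion)        # mood carries history forward
```

In this run the emotion drops to zero after the first cycle, yet the feeling stays slightly positive because mood still carries a trace of the earlier emotion.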
13. 14
Learning task: Encoding
North: Passable: false; On path: false; Progress: true
East: Passable: false; On path: true; Progress: true
West: Passable: false; On path: false; Progress: true
South: Passable: true; On path: true; Progress: true
14. 15
Learning task: Encoding & Appraisal
North: Intrinsic Pleasantness: Low; Goal Relevance: Low; Unpredictability: High
East: Intrinsic Pleasantness: Low; Goal Relevance: High; Unpredictability: High
West: Intrinsic Pleasantness: Low; Goal Relevance: Low; Unpredictability: High
South: Intrinsic Pleasantness: Neutral; Goal Relevance: High; Unpredictability: Low
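Comparing the encoded features with the appraisals for the same four directions suggests a simple mapping, sketched below: an impassable direction gets low intrinsic pleasantness, and a direction on the path gets high goal relevance. This rule is inferred only from the four examples on the slides and may not match the agent's actual appraisal rules. The function and key names are hypothetical, and unpredictability is left out because the slides do not show which feature determines it.

```python
def appraise(passable, on_path):
    """Map encoded features of a direction to two appraisal values (inferred rule)."""
    return {
        "intrinsic_pleasantness": "Neutral" if passable else "Low",
        "goal_relevance": "High" if on_path else "Low",
    }

south = appraise(passable=True, on_path=True)
north = appraise(passable=False, on_path=False)
```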
15. 16
Learning task: Attending,
Comprehending & Appraisal
South
Intrinsic Pleasantness: Neutral
Goal Relevance: High
Unpredictability: Low
Conduciveness: High
Control: High …
18. 19
What is being learned?
• When to Attend vs. Task
• If Attending, what to Attend to
• If Tasking, which subtask to create
• When to Intend vs. Ignore
20. 21
Results: With and without mood
[Figure: median processing cycles (y-axis, 240 to 300) by episode (x-axis, 8 to 15) for three series: Feeling=Emotion, Feeling=Emotion+Mood, and Optimal]
21. 22
Discussion
• Agent learns both internal (tasking) and external
(movement) actions
• Emotion provides more frequent rewards, so the
agent learns faster than with standard RL
• Mood “fills in the gaps,” allowing for even faster
learning and less variability
22. 23
Conclusion & Future Work
• Demonstrated a computational model that integrates
emotion and cognitive control
• Confirmed that emotion can drive reinforcement learning
• We have already successfully demonstrated similar
learning in a more complex domain
• Would like to explore multi-agent scenarios