FORR is a cognitive architecture that allows agents to develop expertise through experience. It combines multiple knowledge representations (descriptives) with decision-making heuristics called Advisors. As an agent solves problems, it learns which Advisors are most useful and how best to combine their recommendations, improving its performance and adapting its decision process. FORR-based agents have been successfully applied to play games, navigate environments, solve constraint satisfaction problems, and collaborate as teams.
Susan Epstein at the IBM CSIG Speaker Series
1. FORR
A Cognitive Architecture for Expertise
Susan L. Epstein
The Graduate Center and Hunter College of
The City University of New York
2. Executive summary
• FORR (FOr the Right Reasons) is an architecture
• FORR-based systems develop expertise
• FORR-based systems learn quickly from problem-solving experience
• FORR-based systems are built from
  ◦ World knowledge (descriptives)
  ◦ Good reasons for making decisions (Advisors)
• FORR-based systems can restructure their decision process
• FORR has been confirmed cognitively plausible with human subjects
Background • FORR • Applications
3. People, agents and expertise
• People are our best model of intelligent agents
  ◦ Some human approaches work well on really hard problems
  ◦ Their methods are robust to imperfect data
  ◦ They pursue multiple goals
• If an agent is to collaborate with people, it must understand human decision processes
• A cognitively plausible agent simulates significant human characteristics
• An expert does things faster and better than the rest of us [D'Andrade 1990]
4. Characteristics of human experts
• They work in a domain (a set of related problem classes)
• They satisfice = make good-enough decisions
• They entertain multiple decision-making heuristics [Ratterman & Epstein 1995]
• They access multiple representations
• They do situation-based reasoning [Klein & Calderwood 1991]
• Human experts are made, not born
Learning is the hallmark of human intelligence
5. Agent architecture
• Postulates general principles
• System shell for diverse domains
• Requirements for cognitive plausibility:
  ◦ Display reasonable behavior
    • Make obvious decisions
    • Avoid obvious errors
    • Solve easy problems quickly
  ◦ Balance accuracy and speed
  ◦ Be robust to error
  ◦ Tolerate and reason with inconsistent, incomplete, noisy data
  ◦ Learn
Do forever:
  Sense the world
  Select an action
  Execute that action
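The "do forever" loop above can be sketched in Python. `CounterWorld` and the method names (`sense`, `select_action`, `execute`, `done`) are illustrative stand-ins, not FORR's actual interface:

```python
class CounterWorld:
    """Toy world for illustration: the state is a counter, the task ends at 3."""
    def __init__(self):
        self.state = 0
    def done(self):
        return self.state >= 3
    def sense(self):
        return self.state
    def execute(self, action):
        self.state += action

def run_agent(world, select_action):
    """The slide's loop: sense the world, select an action, execute it.
    'Do forever' is bounded here only by the toy world's termination test."""
    steps = 0
    while not world.done():
        state = world.sense()           # sense the world
        action = select_action(state)   # select an action
        world.execute(action)           # execute that action
        steps += 1
    return steps
```

In FORR, "select an action" is where the Advisor hierarchy described later does its work.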
6. Fundamental issues for a learning architecture
• What is there to learn?
• From whom to learn?
• When to learn?
• How to learn?
• How to use learned knowledge to make decisions?
• How to manage reality and noise?
7. Cornerstones of FORR's pragmatic approach
• Expertise is learned, that is, it develops with experience
• Easy questions should have fast (reactive) answers
• Satisfice = make good-enough decisions in a simplified model of a complex world (and recover if need be)
• Exploit the synergy inherent in multiplicity
  ◦ Multiple domain-dependent representations
  ◦ Multiple domain-dependent heuristics for decision making
  ◦ Multiple learning methods
• Maintain flexibility
  ◦ Decouple data, learning methods, and decision methods
  ◦ Restructure its own decision-making process
• Transparency: explain decisions
FORR's building blocks are descriptives and Advisors
8. Multiple representations
• Descriptive = a shared data object
  ◦ Value provided on demand
  ◦ Defined with functions that determine how and when to update it
  ◦ Value may be learned
• Although a descriptive has a single representation, many descriptives can represent the same world state
• Examples (tic-tac-toe): the X/O/blank contents, the empty/occupied squares, the lines on the board
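A minimal way to realize a descriptive as described above — a value computed on demand, cached, and recomputed only when its update condition fires — is sketched below. The class and method names are hypothetical, not FORR's implementation:

```python
class Descriptive:
    """Shared data object: value derived lazily from the world, cached until
    an update function declares it stale."""
    def __init__(self, compute):
        self._compute = compute   # function that derives the value from the world
        self._value = None
        self._stale = True

    def invalidate(self):
        """Called when the world changes in a way that affects this descriptive."""
        self._stale = True

    def value(self, world):
        """Provide the value on demand, recomputing only if stale."""
        if self._stale:
            self._value = self._compute(world)
            self._stale = False
        return self._value
```

Because Advisors only ask for values on demand, a descriptive that no Advisor consults in a given state is never computed, which matches the "lazy descriptive computation" speedup noted in the lessons-learned slide.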
9. Multiple ways to use knowledge
• Operationalization = how to use a data object
• Although a descriptive has a single representation, it can be operationalized in many ways
• Ways to reason about the empty/occupied squares:
  ◦ Calculate possible actions
  ◦ Predict the opponent's move
• Ways to reason about the lines:
  ◦ Report a result
  ◦ Finish a winning line
  ◦ Block your opponent's winning line
  ◦ Create a fork
  ◦ Plan a win on a specific line
10. An Advisor operationalizes descriptives
• Implements a class-independent, action-selection rationale
• Limitedly-rational (resource-limited) procedure
• Input: state of the world + descriptives + possible actions
• Output: comments whose strengths express intensity of support for or opposition to individual actions (or sets of actions)
• Domain-specific
[Diagram: the current state, possible actions, and relevant descriptives flow into an Advisor, which emits comments ⟨Advisor, action, strength⟩.]
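The Advisor contract above (state + descriptives + possible actions in, comments ⟨Advisor, action, strength⟩ out) might look like this in Python. The `victory` Advisor and the `winning_moves` descriptive are invented tic-tac-toe examples, not FORR code:

```python
from typing import NamedTuple

class Comment(NamedTuple):
    """⟨Advisor, action, strength⟩: positive strength supports an action,
    negative strength opposes it."""
    advisor: str
    action: int
    strength: int

def victory(state, descriptives, actions):
    """Hypothetical 'Victory' Advisor: strongly support any available move
    that completes one of our winning lines. The set of such moves is
    assumed to come from a 'winning_moves' descriptive."""
    winning = descriptives["winning_moves"]
    return [Comment("Victory", a, 10) for a in actions if a in winning]
```

An Advisor that sees nothing relevant simply returns no comments; it never has to rank every action.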
11. Often, Advisors disagree
[Diagram: a tic-tac-toe position in which Victory (win!), Panic (prevent immediate loss), and Worried (prevent long-range loss) each favor a different move.]
Advisors also rely on learned descriptives:
• Good openings
• Endgame play
• Strategies
…
12. More about Advisors
• Advisors have different properties
  ◦ Some are always right
  ◦ Some need more time to decide
  ◦ Some would like to make a sequence of decisions, not just one
• Comments are opinions from the perspective of the Advisor's rationale
  ◦ On a single action: "do x", "don't do z", "x is better than y", "do x or y", "x is a 10, y is an 8, but z is a −3"
  ◦ On an (unordered, or fully or partially ordered) set of actions: "do x and y", "do p and then q", "do p and then do q and r"
13. FORR (FOr the Right Reasons)
• Premise: synergy among domain-specific rationales solves problems
• Descriptives isolate representation from reasoning
• Advisor hierarchy
  ◦ Tier 1: correct, quick, pre-sequenced
  ◦ Tier 2: reactive plan rationales
  ◦ Tier 3: voting among heuristics based on their comment strengths and learned weights
Example comments:
⟨AdvisorA, action2, 10⟩  ⟨AdvisorA, action4, 8⟩  ⟨AdvisorA, action7, 6⟩
⟨AdvisorB, action2, 7⟩   ⟨AdvisorB, action3, 9⟩
⟨AdvisorC, action1, 9⟩   ⟨AdvisorC, action2, 7⟩  ⟨AdvisorC, action3, 9⟩  ⟨AdvisorC, action7, 9⟩
…
Voting: over Advisors i and actions j, select  argmax_j Σ_i d_i w_i c_ij
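The tier-3 vote can be sketched as follows, assuming each comment is an (advisor, action, strength) tuple, `weights` holds the learned w_i, and `discounts` the optional per-Advisor d_i (defaulting to 1.0). This is a sketch of the argmax, not FORR's implementation:

```python
from collections import defaultdict

def vote(comments, weights, discounts=None):
    """Tier-3 voting: choose argmax_j sum_i d_i * w_i * c_ij, where c_ij is
    Advisor i's comment strength for action j, w_i its learned weight, and
    d_i an optional per-Advisor discount."""
    discounts = discounts or {}
    totals = defaultdict(float)
    for advisor, action, strength in comments:
        totals[action] += discounts.get(advisor, 1.0) * weights[advisor] * strength
    return max(totals, key=totals.get)
```

On the example comments above with equal weights, action2 wins with total strength 10 + 7 + 7 = 24; raising the other Advisors' weights can shift the choice, which is exactly what weight learning exploits.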
14. The FORR decision cycle
[Diagram: the state, possible actions, and descriptives feed Tier 1 (reaction from perfect knowledge: Victory and other Tier-1 Advisors); if Tier 1 reaches a decision, take that action.]
15. The FORR decision cycle
[Diagram: if Tier 1 does not decide, control passes to Tier 2 (plans triggered by situation recognition); if a Tier-2 Advisor reaches a decision, begin its plan.]
16. The FORR decision cycle
[Diagram: if Tiers 1 and 2 do not decide, the Tier-3 heuristic Advisors comment, and voting selects the action to take.]
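The three-tier decision cycle above can be sketched as a sequential filter. The callables and their signatures here are illustrative assumptions, not FORR's actual interfaces:

```python
def decide(state, actions, descriptives, tier1, tier2, tier3_vote):
    """Sketch of the FORR decision cycle as a cascade of three tiers."""
    # Tier 1: correct, quick, pre-sequenced reactions; the first decision wins.
    for advisor in tier1:
        action = advisor(state, descriptives, actions)
        if action is not None:
            return action
    # Tier 2: plans triggered by situation recognition; begin the plan.
    for planner in tier2:
        plan = planner(state, descriptives, actions)
        if plan:
            return plan[0]   # take the plan's first action
    # Tier 3: heuristic Advisors comment, and voting selects an action.
    return tier3_vote(state, descriptives, actions)
```

The cascade realizes "easy questions should have fast answers": correct Tier-1 reactions short-circuit the expensive voting, so obvious decisions never reach Tier 3.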
17. How to develop a problem solver
• Specialize FORR with domain knowledge
  ◦ Problem classes
  ◦ Advisors
  ◦ Descriptives with learning methods
• To solve a class of problems robustly, FORR learns
  ◦ Descriptives' values
  ◦ Rationales' relative utility
  ◦ New Advisors
  ◦ How to reorganize tier 3
[Diagram: domain knowledge specializes FORR into a FORR-based problem solver; experience with a problem class turns it into a learned problem solver.]
WARNING: problem solving often provides noisy data
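One naive way to learn the rationales' relative utility from problem-solving experience is reinforcement-style credit assignment over Advisor weights. The sketch below is a toy illustration of that idea only; FORR's actual weight-learning algorithm is more careful, precisely because of the noisy-data warning above and the "example extraction" pitfalls noted later:

```python
def update_weights(weights, comments, good_action, rate=0.1):
    """Toy credit assignment: raise each Advisor's weight in proportion to
    how strongly it supported the action later judged good, and lower it in
    proportion to its support for other actions."""
    new = dict(weights)
    for advisor, action, strength in comments:
        delta = rate * strength / 10.0   # scale strengths into [0, 1]
        new[advisor] = new.get(advisor, 1.0) + (delta if action == good_action else -delta)
    return new
```

Over many decisions, Advisors whose rationales track good outcomes accumulate weight and dominate the tier-3 vote, while inaccurate heuristics fade, and can eventually be dropped for the speedup the lessons-learned slide reports.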
18. FORR-based single agents
• Hoyle learned to play 19 two-person, perfect-information, finite-board games as well as or better than human and machine experts [Epstein, 2001]
• Ariadne learned to navigate efficiently in grid worlds, despite perceptual limitations and no map [Epstein, 1995]
• ACE learned to solve constraint satisfaction problems and rediscovered the Brélaz heuristic [Epstein & Freuder, 2005]
• SemaFORR controls an autonomous search-and-rescue robot [Epstein, Schneider, Ozgelen, Munoz, Costantino, Sklar & Parsons, 2012]
19. Lessons learned
• Reactive plans work well
• Elimination of inaccurate heuristics produces substantial speedup
• Lazy descriptive computation also provides speedup
• Self-awareness supports transparency
• Advisor weights may have problem-stage context
• Weight learning has subtle pitfalls (example extraction)
• Autonomous restructuring must balance accuracy against risk
• Sometimes it is more efficient not to reason at all
20. FORR-based collaborating agents
• Co-FORR: 5 collaborating agents for 2D park design [Epstein, 1998]
• FORRSooth: learned to conduct a spoken dialogue with a library patron who orders books [Epstein, Passonneau, Gordon & Ligorio, 2012]
• SemaFORR: controls an autonomous search-and-rescue robot team [Epstein, Aroor, Evanusa, Sklar & Parsons, 2015]
Each new domain poses new challenges whose solution strengthens FORR
21. FORR-based results
• PhD theses
  ◦ Shih on learning multiple behavior sequences, 2000
  ◦ Lock on learning multiple plans from behavior sequences, 2003
  ◦ Petrovic on weight learning for multiple Advisors, 2008
  ◦ Ligorio on learning to select attributes, 2011
  ◦ Li on representation and exploitation of multiple complex relationships, 2011
  ◦ Yun on parallelization of multiple solvers, 2013
  ◦ Osisek on application of multiple relationships in recommendation (in progress)
  ◦ Aroor on reactive planning for multiple robots (in progress)
• Applications to bioinformatics (with Dr. Lei Xie)
  ◦ Protein-protein interaction networks
  ◦ Virtual drug screening
22. Take home message
To develop expertise
FORR learns to harness the synergy of
multiplicity in representation and reasoning
23. Acknowledgements
We gratefully acknowledge the support of
the National Science Foundation and
CUNY's High Performance Computing Center.
Continued thanks to my collaborators
Gene Freuder, Rebecca Passonneau, Rick Wallace, Lei Xie, Elizabeth Sklar, and Simon Parsons,
and a host of undergraduate and graduate students
with whom I continue to learn.
24. Selected references
• Epstein, S. L. 2001. Learning to Play Expertly: A Tutorial on Hoyle. In Machine Learning in Game Playing.
• Epstein, S. L. 1998. Pragmatic Navigation: Reactivity, Heuristics, and Search. Artificial Intelligence, 100(1-2): 275-322.
• Epstein, S. L., E. C. Freuder and M. Wallace. 2005. Learning to Support Constraint Programmers. Computational Intelligence, 21(4): 337-371.
• Epstein, S. L., R. J. Passonneau, T. Ligorio and J. Gordon. 2012. Data Mining to Support Human-Machine Dialogue for Autonomous Agents. In Proceedings of Agents and Data Mining Interaction (ADMI 2011).
• Epstein, S. L., A. Aroor, M. Evanusa, E. I. Sklar and S. Parsons. 2015. Navigation with Learned Spatial Affordances. In Proceedings of CogSci 2015.
http://www.cs.hunter.cuny.edu/~epstein/