The document discusses context-oriented programming (COP) and an approach called Auto-COP to generate adaptations from system execution using reinforcement learning. COP allows a system's behavior to dynamically change based on its context. Auto-COP monitors a system's execution trace to extract the contexts and related behaviors. It uses reinforcement learning to learn action sequences that optimize a goal for different system states, with the aim of generating adaptations without needing them to be predefined by developers.
Generating Adaptations from the System Execution using Reinforcement Learning Options
1. Nicolás Cardozo - Ivana Dusparic
@ncardoz - @ivanadusparic
n.cardozo@uniandes.edu.co - ivana.dusparic@scss.tcd.ie
COP’21 - International Workshop on Context-Oriented Programming and Advanced Modularity - July 12 - (Virtual)
2. Context-oriented programming
COP systems are software systems which have
to dynamically adapt their behavior in order to
cope with a changing environment.
[Acher et al. 09]
[context-aware systems] use context to provide
relevant information and/or services to the user,
where relevancy depends on the user’s task
[Dey 01]
Context-oriented Programming (COP) as a new
programming technique to enable context-
dependent computation. We claim that Context-
oriented Programming brings a similar degree of
dynamicity to the notion of behavioral variations
[Hirschfeld et al. 08]
Context-oriented programming [6] is a
technique to modularize context-dependent
behavioral variations in a program, where those
behavioral variations can be dynamically
switched on and off in response to changes of
execution contexts
[Aotani et al. 11]
Context-oriented Programming (COP) [4]
enriches programming languages and
execution environments with features to
explicitly represent context-dependent
behavior variations.
[Appeltauer et al. 08]
Context-Oriented Programming (COP) [3] to
support programmers in developing software that
can dynamically change its behavior depending
on context information. [Afanasov et al. 13]
Context-aware systems are able to adapt their
behaviour depending on their context of use
without explicit user intervention
[Bainomugisha et al. 09]
described as a way to promote runtime
variability use and as a mechanism for
managing context features dynamically to cater
to the needs of dynamic adaptation
[Capilla et al. 14]
the goal of Context-oriented
Programming is to avoid having to spread
context-dependent behavior throughout a
program…
… can only be applied for context-
dependent behavior that are anticipated
in the software development process.
[Costanza et al. 05]
COP addresses the need for applications to
behave differently accordingly to the changing
run-time context in which they are embedded.
[Ghezzi et al. 10]
In order to implement systems that are able to
use the implicit situational information…
… the system is able to learn from the user
preferences in order to autonomously evolve his
rules for future behavior
[Alegre et al. 16]
The combination of COP with computational reflection
opens further possibilities for runtime software adaptivity.
[Gonzalez et al. 09]
4. Context-oriented programming
Contexts are
(meaningful) situations
gathered from the
surrounding environment
Behavior variations correspond to
the specialized behavior
appropriate for a specific context
Adaptations correspond to the
behavior observed by the
system when executing in a
context
[S. Gonzalez. Programming in Ambience: Gearing up for dynamic adaptation to context. PhD thesis, 2008]
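The three definitions above can be illustrated with a minimal sketch. It is not the API of any COP language: the `Context` and `Phone` classes, the `adapt` method, and the "most recently registered variation wins" rule are assumptions made for illustration, using the silent-mode phone example from these slides.

```python
class Context:
    """A named situation that can be switched on and off at run time."""

    def __init__(self, name):
        self.name = name
        self.active = False

    def activate(self):
        self.active = True

    def deactivate(self):
        self.active = False


class Phone:
    """Base behavior plus behavior variations registered per context."""

    def __init__(self):
        # (context, function) pairs; later entries take priority (assumed rule).
        self.variations = []

    def adapt(self, context, behavior):
        self.variations.append((context, behavior))

    def receive_call(self):
        # The adaptation is the behavior actually observed: the most recently
        # registered variation whose context is active, else the default.
        for context, behavior in reversed(self.variations):
            if context.active:
                return behavior()
        return "ringtone"


silent = Context("Silent")
phone = Phone()
phone.adapt(silent, lambda: "vibrate")

before = phone.receive_call()   # no context active
silent.activate()
during = phone.receive_call()   # Silent context active
silent.deactivate()
after = phone.receive_call()    # back to the default
print(before, during, after)
```

Here the context is the situation (`Silent`), the behavior variation is the function registered for it, and the adaptation is what `receive_call` returns while the context is active.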
6. Context-oriented programming
Off-hook Silent Meeting Forwarding Behavior
x x x Ringtone
x ✔ x Vibrate
x ✔ ✔ x Vibrate
x ✔ ✔ ✔ Call forwarding
✔ x x Call waiting
✔ ✔ x Call waiting
✔ ✔ ✔ x Call waiting
✔ ✔ ✔ ✔ Call forwarding
15. System design
The system must:
• Have users or be autonomous
• Have a defined goal
• Have a way to know whether it progresses towards the goal
• Have a finite set of states
• Have a finite set of actions
16. Context monitoring
To incorporate contexts (and their behavior)
dynamically, we need to capture the system state
and possible actions
5x5 grid
move_north()
move_south()
move_east()
move_west()
pickup()
dropoff()
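One way to capture the system state and the actions taken in it is sketched below for the 5x5 grid. The `GridRobot` class, its state tuple `(row, col, carrying)`, the convention that north decreases the row, and the always-successful `pickup` are assumptions for illustration, not details from the slides.

```python
from dataclasses import dataclass, field

GRID_SIZE = 5  # 5x5 grid, as on the slide
ACTIONS = ["move_north", "move_south", "move_east", "move_west", "pickup", "dropoff"]


@dataclass
class GridRobot:
    row: int = 0
    col: int = 0
    carrying: bool = False
    trace: list = field(default_factory=list)  # recorded (state, action) pairs

    def state(self):
        return (self.row, self.col, self.carrying)

    def do(self, action):
        """Record the state the action is taken in, then apply the action."""
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.trace.append((self.state(), action))
        if action == "move_north":
            self.row = max(self.row - 1, 0)
        elif action == "move_south":
            self.row = min(self.row + 1, GRID_SIZE - 1)
        elif action == "move_east":
            self.col = min(self.col + 1, GRID_SIZE - 1)
        elif action == "move_west":
            self.col = max(self.col - 1, 0)
        elif action == "pickup":
            self.carrying = True
        elif action == "dropoff":
            self.carrying = False


robot = GridRobot(row=2, col=3)
for action in ["pickup", "move_south", "move_west"]:
    robot.do(action)
print(robot.trace)
```

The recorded trace pairs each action with the state it was executed in, which is the raw material from which contexts and their behavior could later be extracted.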
28. RL options
Action (sequences) get a reward for every state
Accumulate rewards for action sequences: $\sum_{a \in A} q(s, a)$
Use RL to learn the best action sequence as the option, based on the accumulated reward
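As a small worked example of choosing an option by accumulated reward, the sketch below compares two candidate action sequences from the same state. The sequences and reward values are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical per-step rewards for two candidate action sequences
# starting from the same state (values invented for illustration).
candidates = {
    ("move_south", "move_west", "dropoff"): [-1, -1, 10],
    ("move_west", "move_south", "move_south", "dropoff"): [-1, -1, -1, 10],
}


def accumulated_reward(rewards):
    """Sum the rewards collected along one action sequence."""
    return sum(rewards)


totals = {seq: accumulated_reward(rs) for seq, rs in candidates.items()}
best_option = max(totals, key=totals.get)
print(totals[best_option], best_option)
```

The shorter sequence accumulates 8 and the longer one 7, so the shorter one would be kept as the option for that state.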
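29. RL options
# Learning: build the action sequence (option) observed from each state
for i in 1..batchSize:
    while not done:
        if P(s) >= 𝜀:
            a = argmax(q_val[s])        # exploit the best known action
        else:
            a = random_action()         # explore (assumed 𝜀-greedy fallback)
        r, next_state, done = step(a)
        q_val[s][a] = (1-𝛼)*q_val[s][a] + 𝛼*(r + 𝛾*max(q_val[next_state]))
        action_sequence[s].push(a, r + q_val[s][a])
        s = next_state
    options.add(s, action_sequence)
-----
# Execution: when an adaptation exists for the current state, run its option
while true:
    if available_adaptation(s):
        context, option = pick_option(𝜀, s)
        context.activate()
        execute(option)
        context.deactivate()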
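The learning loop can be made concrete with a small runnable sketch. It uses standard tabular Q-learning with 𝜀-greedy exploration on a toy one-dimensional corridor, not the authors' implementation: the environment, the reward values, the hyperparameters, and the `train` and `extract_options` helpers are all assumptions for illustration.

```python
import random

N_STATES = 5                       # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def step(state, action):
    """Deterministic toy environment: -1 per move, +10 on reaching the goal."""
    nxt = max(state - 1, 0) if action == "left" else min(state + 1, N_STATES - 1)
    done = nxt == N_STATES - 1
    return (10 if done else -1), nxt, done


def greedy(q_val, state):
    """Action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q_val[state][a])


def train(episodes=500, max_steps=100, seed=0):
    rng = random.Random(seed)
    q_val = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        s, done, steps = 0, False, 0
        while not done and steps < max_steps:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            a = greedy(q_val, s) if rng.random() >= EPSILON else rng.choice(ACTIONS)
            r, nxt, done = step(s, a)
            target = r + GAMMA * max(q_val[nxt].values())
            q_val[s][a] = (1 - ALPHA) * q_val[s][a] + ALPHA * target
            s, steps = nxt, steps + 1
    return q_val


def extract_options(q_val):
    """For each non-goal state, follow the greedy policy to get an action sequence."""
    options = {}
    for start in range(N_STATES - 1):
        s, done, sequence = start, False, []
        while not done and len(sequence) < N_STATES:
            a = greedy(q_val, s)
            _, s, done = step(s, a)
            sequence.append(a)
        options[start] = sequence
    return options


options = extract_options(train())
print(options)
```

Each entry of `options` maps a state to the action sequence learned for it, which is the piece of behavior an adaptation would then execute when its context is active.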
31. Warehouse robot delivery
Robot moves in a defined space searching for packages and takes them to the delivery area
Packages are at fixed locations
Paths to delivery are always the same!
32. Warehouse robot delivery
There is a context for each location of the robot, for each product
ContextDiamond23 = new cop.Context({
  name: "Diamond-2,3"
})
ContextShirt20 = new cop.Context({
  name: "Shirt-2,0"
})
ContextCarrot44 = new cop.Context({
  name: "Carrot-4,4"
})
…
34. Warehouse robot delivery
Robot moves in a defined space searching for packages and takes them to the delivery area
Context23false = new cop.Context({
  name: "Context23false"
})
BAContext23false = Trait({
  option: function() {
    this.south();
    this.west();
    this.west();
    this.south();
    this.dropoff();
  }
})
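How such a context and behavior could be generated from a learned option, rather than written by hand, is sketched below. This is an illustration only: `generate_adaptation`, `RecordingRobot`, and the reading of the name `Context23false` as grid cell (2,3) plus a boolean flag from the state are assumptions, not details confirmed by the slides.

```python
class RecordingRobot:
    """Stand-in robot that only records which actions were requested."""

    def __init__(self):
        self.log = []

    def south(self):
        self.log.append("south")

    def west(self):
        self.log.append("west")

    def dropoff(self):
        self.log.append("dropoff")


def generate_adaptation(state, option):
    """Build a context name and a behavior that replays the option's actions."""
    row, col, flag = state
    # Assumed naming scheme: "Context" + row + col + lower-cased boolean flag.
    name = f"Context{row}{col}{str(flag).lower()}"

    def behavior(robot):
        for action in option:
            getattr(robot, action)()  # call the robot method named by the action

    return name, behavior


name, behavior = generate_adaptation(
    (2, 3, False), ["south", "west", "west", "south", "dropoff"]
)
robot = RecordingRobot()
behavior(robot)
print(name, robot.log)
```

The generated behavior replays the same five actions as the hand-written `option` function on the slide, which is the kind of code generation that would spare developers from predefining adaptations.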
36.
✓ Continuously process execution traces to extract action sequences and their states
✓ Generate adaptations from the extracted options
✓ Use RL to manage options as the system’s most appropriate behavior, and continuously update them with new options
Pushing COP beyond enabling dynamic behavior variations: Auto-COP lets systems become adaptive to unknown contexts and behavior
@ncardoz n.cardozo@uniandes.edu.co
Explore more system types
Integrate lifelong learning techniques to manage generated
adaptations