10. But what if…
…your task consists of a series of decisions made to
reach or maintain optimal performance?
13. Reinforcement Learning
…What?
• Building agents that are able to learn an optimal policy to perform a task within a
Markovian environment
• In a Markovian environment, the next state depends only on the current state and the
action performed by the agent
• This task can be episodic or continuing
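A minimal sketch of the Markov property and of an episodic task, assuming a made-up three-state environment. The state names, actions, probabilities, and rewards are all hypothetical; the point is only that `step` looks at nothing but the current state and the chosen action.

```python
import random

# Hypothetical toy MDP: TRANSITIONS[(state, action)] -> list of
# (probability, next_state, reward). All names and numbers are illustrative.
TRANSITIONS = {
    ("start", "left"): [(1.0, "start", 0.0)],
    ("start", "right"): [(0.8, "middle", 0.0), (0.2, "start", 0.0)],
    ("middle", "left"): [(1.0, "start", 0.0)],
    ("middle", "right"): [(1.0, "goal", 1.0)],
}
TERMINAL_STATES = {"goal"}


def step(state, action, rng):
    """Sample the next state and reward from the current state and action only."""
    outcomes = TRANSITIONS[(state, action)]
    threshold = rng.random()
    cumulative = 0.0
    for probability, next_state, reward in outcomes:
        cumulative += probability
        if threshold < cumulative:
            return next_state, reward
    # Guard against floating-point rounding: fall back to the last outcome.
    _, next_state, reward = outcomes[-1]
    return next_state, reward


def run_episode(policy, rng, max_steps=100):
    """Episodic task: begin in "start", stop at a terminal state or after max_steps."""
    state, total_reward, steps = "start", 0.0, 0
    while state not in TERMINAL_STATES and steps < max_steps:
        state, reward = step(state, policy(state), rng)
        total_reward += reward
        steps += 1
    return state, total_reward, steps


final_state, total_reward, steps = run_episode(lambda s: "right", random.Random(0))
print(final_state, total_reward, steps)
```

A continuing task would be the same loop without terminal states, so the interaction never ends on its own.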
18. Reinforcement Learning
…How?
[Diagram: Agent → Action → Environment → Reward, New State → Agent]
• Reach an optimal policy π
• π can be deterministic or stochastic
• A deterministic version of π can be derived from the action-value function Q(s, a)
• You are free to choose your policy type
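As a rough illustration of the two policy types, here is a sketch that derives a deterministic (greedy) policy and a stochastic (epsilon-greedy) policy from the same action-value table. The table contents, state names, and the choice of epsilon-greedy as the stochastic example are assumptions, not something the slides prescribe.

```python
import random

# Hypothetical action-value table Q[state][action]; the numbers are made up.
Q = {
    "s0": {"left": 0.1, "right": 0.7},
    "s1": {"left": 0.4, "right": 0.2},
}


def greedy_policy(q_table, state):
    """Deterministic policy: always pick the action with the highest Q(s, a)."""
    action_values = q_table[state]
    return max(action_values, key=action_values.get)


def epsilon_greedy_policy(q_table, state, epsilon, rng):
    """Stochastic policy: explore uniformly with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_table[state]))
    return greedy_policy(q_table, state)


rng = random.Random(0)
print(greedy_policy(Q, "s0"))
samples = [epsilon_greedy_policy(Q, "s0", 0.3, rng) for _ in range(1000)]
print(samples.count("right") / len(samples))
```

With epsilon set to 0 the stochastic policy collapses to the greedy one, which is one way to read "a deterministic version of π can be derived from Q".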
22. What’s hot 🔥 about DeepRL
• Reinforcement learning has existed since the early 1980s
• Before the deep learning hype, reinforcement learning relied on dynamic
programming algorithms
• Monte Carlo, Sarsa (not salsa 💃), Q-learning, Expected Sarsa, etc.
• These use data structures (tables) that hold the action values of each state
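A small sketch of what such a data structure and its update rules can look like, assuming a nested dictionary as the Q-table. The action names, learning rate `alpha`, and discount `gamma` are illustrative choices; the two update functions follow the standard Q-learning and Sarsa rules.

```python
from collections import defaultdict

ACTIONS = ("left", "right")  # hypothetical action set


def make_q_table():
    """Q-table as a nested dict: q[state][action] -> estimated return, default 0.0."""
    return defaultdict(lambda: {action: 0.0 for action in ACTIONS})


def q_learning_update(q, state, action, reward, next_state, done, alpha=0.5, gamma=0.9):
    """Q-learning: bootstrap from the best action in the next state (off-policy)."""
    best_next = 0.0 if done else max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def sarsa_update(q, state, action, reward, next_state, next_action, done, alpha=0.5, gamma=0.9):
    """Sarsa: bootstrap from the action the agent actually takes next (on-policy)."""
    next_value = 0.0 if done else q[next_state][next_action]
    q[state][action] += alpha * (reward + gamma * next_value - q[state][action])


q = make_q_table()
q_learning_update(q, "middle", "right", reward=1.0, next_state="goal", done=True)
q_learning_update(q, "start", "right", reward=0.0, next_state="middle", done=False)
print(q["middle"]["right"], q["start"]["right"])
```

The table needs one entry per (state, action) pair, which is workable only while both sets stay small.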
25. What’s hot 🔥 about DeepRL
[Image labels: Bio, Stocks, Games, Robots]
• Modern environments present complex action and state spaces
• Deep neural networks are able to extract features from different state types
• Deep neural networks are able to approximate functions that map an observation to
a desired output space
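To make the function-approximation idea concrete, here is a deliberately simplified sketch: a linear model stands in for the deep network and maps an observation vector to one Q-value per action. This is not a DQN. The feature count, action count, learning rate, and target value are all assumptions, and a real deep network would replace the linear map.

```python
# Linear stand-in for a Q-network: q(obs)[a] = sum_i W[a][i] * obs[i] + b[a].
# A deep network would replace this linear map; all sizes below are made up.


def make_params(num_features, num_actions):
    """Zero-initialised weights and biases, one row of weights per action."""
    weights = [[0.0] * num_features for _ in range(num_actions)]
    biases = [0.0] * num_actions
    return weights, biases


def q_values(params, observation):
    """Map an observation vector to one estimated Q-value per action."""
    weights, biases = params
    return [
        sum(w * x for w, x in zip(row, observation)) + bias
        for row, bias in zip(weights, biases)
    ]


def td_step(params, observation, action, target, learning_rate=0.1):
    """One gradient step on 0.5 * (target - q(obs)[action]) ** 2 for the chosen action."""
    weights, biases = params
    error = target - q_values(params, observation)[action]
    for i, x in enumerate(observation):
        weights[action][i] += learning_rate * error * x
    biases[action] += learning_rate * error
    return error


params = make_params(num_features=3, num_actions=2)
observation = [1.0, 0.5, -0.5]
for _ in range(50):
    td_step(params, observation, action=1, target=2.0)
print(q_values(params, observation))
```

Unlike a table, the parameters are shared across observations, so the model can produce estimates for states it has never stored explicitly.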
26. DeepRL workshop
• Inspecting a dynamic programming version of Q-learning
• Inspecting its limitations and the use case for deep neural networks
• Implementing Deep Q-learning with the TensorFlow Keras API and PyTorch
• Getting introduced to OpenAI Gym for reinforcement learning environments
• Visualizing the training and inference of a DQN agent
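For orientation before the workshop, here is a sketch of the reset/step interaction pattern that Gym-style environments expose. `CorridorEnv` is a hypothetical stand-in written for this example, not part of the Gym library, and the exact tuple returned by `step` differs between Gym versions, so treat the three-value return here as an assumption.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment with a Gym-like reset/step shape (not the Gym library)."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply action (0 = left, 1 = right); return (observation, reward, done)."""
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(env, policy, max_steps=200):
    """Agent-environment loop: observe, act, receive reward and new state."""
    observation, total_reward = env.reset(), 0.0
    for step_count in range(1, max_steps + 1):
        observation, reward, done = env.step(policy(observation))
        total_reward += reward
        if done:
            break
    return total_reward, step_count


rng = random.Random(0)
print(run_episode(CorridorEnv(), lambda obs: rng.choice([0, 1])))
print(run_episode(CorridorEnv(), lambda obs: 1))
```

A trained agent would plug in as the `policy` argument, replacing the random and always-right policies used here.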
27. Other hot topics
• Multi-agent reinforcement learning
• Imitation learning and behaviour cloning
• The problem of generalization in Deep RL
• Policy-based methods: PPO, A2C, A3C…
• DeepRL frameworks: RLlib, TF-Agents…
28. Resources
• Berkeley Deep RL Bootcamp on YouTube
• Reinforcement Learning: An Introduction (Sutton and Barto)
• Udacity DeepRL Nanodegree, if possible
• RL course by David Silver on YouTube
• OpenAI Gym documentation