RL Adventure
TO THE RAINBOW
성태경 양홍선 이의령 김예찬
DQN, Double DQN & Dueling DQN
PER and NoisyNet
Distributional RL
RAINBOW
성태경 양홍선 이의령 김예찬
RL Adventure
DQN, Double DQN & Dueling DQN
성태경
OUTLINE
2
DQN
Double
DQN
Dueling
DQN
PER
C51 NoisyNet Rainbow + implementation
RL APPLICATIONS
[Atari]
[Robotics] [Autonomous driving]
[Mario] [Pommerman] [Go]
4
HIGH-LEVEL PROCESS
5
[Decision]
[Pixel information]
reward
HIGH-LEVEL PROCESS
6
[Decision]
[Pixel information]
[ ……… ]
[Input values]
PREPROCESS
[Neural networks]
MLPs, CNNs, RNNs, …
TRAINING
DQN, Double DQN, DDQN, …
[Objective function]
reward
DEEP Q-NETWORK (DQN)
7
DQN
NEURAL NETWORKS IN ONE SLIDE
8
Weight operations
Backpropagation
Non-linear function
DQN
NEURAL NETWORKS IN ONE SLIDE
9
Convolutional neural network
Max-pooling
Softmax
Weight operations
Backpropagation
Non-linear function
DQN
Q-LEARNING
‣ Goal: determine which action is best to take in the current state
V. Minh, et al. Playing Atari with Deep Reinforcement Learning. NIPS, 2013
C. J. C. H. Watkins, P. Dayan. Q-learning. 1992.
$Q^{\text{new}}(s_t, a_t) \leftarrow (1 - \alpha)\,Q(s_t, a_t) + \alpha\left(r_t + \gamma \max_{a} Q(s_{t+1}, a)\right)$
[Value iteration update]
$Q^\pi(s, a) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t R(x_t, a_t)\right], \quad \gamma \in (0, 1)$
[Expected rewards]
10
[Labels: current reward value; reward value of the next state]
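As a concrete illustration of the value-iteration update above, a minimal tabular Q-learning sketch in Python (the state/action counts and hyperparameters are illustrative, not from the slides):

```python
import numpy as np

# Illustrative sizes and hyperparameters (assumptions).
n_states, n_actions = 16, 4
alpha, gamma = 0.1, 0.99
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    # Q(s,a) <- (1 - alpha) * Q(s,a) + alpha * (r + gamma * max_a' Q(s',a'))
    target = r + gamma * Q[s_next].max()
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * target
```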
DQN
MOTIVATION
11
V. Minh, et al. Playing Atari with Deep Reinforcement Learning. NIPS, 2013
$Q(s, a) \rightarrow Q(s, a; \theta)$
$L_i(\theta_i) = \mathbb{E}_{s,a,r,s'}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i) - Q(s, a; \theta_i)\right)^2\right]$
[Target value − predicted value = TD error]
Approximate the Q-function with a neural network
DQN
PROBLEM
12
‣ Unstable update
‣ High correlations between input data
V. Minh, et al. Playing Atari with Deep Reinforcement Learning. NIPS, 2013
https://curt-park.github.io/2018-05-17/dqn/
‣ Non-stationary targets (the same network parameters)
[Objective function]
$L_i(\theta_i) = \mathbb{E}_{s,a,r,s'}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i) - Q(s, a; \theta_i)\right)^2\right]$
DQN
SOLUTION
13
‣ Experience replay
Matiisen, Tambet Demystifying Deep Reinforcement Learning. Computational Neuroscience LAB. 2015.
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature, 2015.
Episode → $\{s_1, a_1, r_1, s_2, \ldots, s_{T-1}, a_{T-1}, r_{T-1}, s_T\}$
[Diagram: experiences from episodes are stored in a buffer; training samples from the buffer]
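A minimal replay-buffer sketch in Python, in the spirit of the experience replay described above (the capacity and interface are illustrative):

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition; old experience is evicted FIFO once full.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between
        # consecutive transitions within an episode.
        batch = random.sample(self.buffer, batch_size)
        state, action, reward, next_state, done = zip(*batch)
        return state, action, reward, next_state, done

    def __len__(self):
        return len(self.buffer)
```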
DQN
SOLUTION
14
‣ Fixed Q-targets
$L_i(\theta_i) = \mathbb{E}_{s,a,r,s'}\left[\left(r + \gamma \max_{a'} \hat{Q}(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)^2\right]$
$L_i(\theta_i) = \mathbb{E}_{s,a,r,s'}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i) - Q(s, a; \theta_i)\right)^2\right]$
[Objective function: the target is now computed with a separate, periodically synced network $\hat{Q}(\cdot\,; \theta^-)$]
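A hedged PyTorch sketch of this objective with fixed Q-targets (`model`, `target_model`, and the batch layout are assumptions, not the notebook's exact code):

```python
import torch
import torch.nn.functional as F

def compute_td_loss(model, target_model, batch, gamma=0.99):
    # state/next_state: float tensors; action: long tensor (batch,);
    # reward/done: float tensors (batch,). Shapes are assumptions.
    state, action, reward, next_state, done = batch
    q_values = model(state)                          # Q(s, .; theta)
    q_value = q_values.gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = target_model(next_state).max(1)[0]  # max_a' Q^(s', a'; theta-)
        target = reward + gamma * next_q * (1 - done)
    return F.mse_loss(q_value, target)

# Periodically copy the online weights into the target network:
# target_model.load_state_dict(model.state_dict())
```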
DQN
DQN - PREPROCESSING
15
[Raw Atari frame: 210 × 160]
https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/
DQN
DQN - PREPROCESSING
16
https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/
Reduce the input size by downscaling and converting to grayscale
[Downscaled frame: 84 × 84]
DQN
DQN - PREPROCESSING
17
https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/
Parameter: 4 (number of stacked frames)
Frames $x_1, x_2, x_3, x_4, x_5, x_6, x_7, \ldots$
$s_1 = (x_1, x_2, x_3, x_4)$
$s_2 = (x_2, x_3, x_4, x_5)$
[Input]: (84 × 84 × 4)
Frame skipping
DQN
DQN - PREPROCESSING
18
https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/
DeepMind took the component-wise maximum over two consecutive frames (Atari setting)
Frame skipping
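A preprocessing sketch under the settings above; the 84 × 84 grayscale size and component-wise max follow the slides, while the use of OpenCV for resizing is an assumption:

```python
import numpy as np
import cv2  # assumption: OpenCV handles the grayscale/resize steps

def preprocess(frame, prev_frame):
    # Component-wise max over two consecutive frames removes Atari flicker.
    frame = np.maximum(frame, prev_frame)
    # Grayscale + downscale to 84x84 shrinks the input.
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    return cv2.resize(gray, (84, 84), interpolation=cv2.INTER_AREA)

def stack_frames(frames):
    # A state is the 4 most recent preprocessed frames: (84, 84, 4).
    return np.stack(frames[-4:], axis=-1)
```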
DQN
IMPLEMENTATION
[Slides 19-27: code walkthrough]
https://github.com/higgsfield/RL-Adventure/blob/master/1.dqn.ipynb
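The implementation slides above are code screenshots that did not survive extraction; as a stand-in, a hedged PyTorch sketch of a DQN-style convolutional Q-network (layer sizes follow the Nature DQN and are not necessarily the notebook's exact code):

```python
import torch.nn as nn

class CnnDQN(nn.Module):
    def __init__(self, in_channels=4, num_actions=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),  # 7x7 for 84x84 inputs
            nn.Linear(512, num_actions),            # one Q-value per action
        )

    def forward(self, x):
        return self.fc(self.features(x))
```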
DOUBLE DQN
28
DOUBLE Q-LEARNING
MOTIVATION
‣ Problem with DQN:
van Hasselt H., Guez A. and Silver D. Deep reinforcement learning with double Q-learning, AAAI, 2015
van Hasselt H., Double Q-learning, NIPS, 2011
29
$Q(s, a) = r(s, a) + \gamma \max_{a} Q(s', a)$
Q-target Accumulated rewards Maximum Q-value of next state
Overestimating the action values. What if the environment is noisy?
DOUBLE Q-LEARNING
MOTIVATION
30
‣ Solution:
$Q(s, a) = r(s, a) + \gamma\, Q\big(s', \operatorname{argmax}_{a} Q(s', a)\big)$
[The DQN network chooses the action for the next state]
DOUBLE Q-LEARNING
IMPLEMENTATION
[Slides 31-33: code walkthrough]
https://github.com/higgsfield/RL-Adventure/blob/master/2.double%20dqn.ipynb
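A hedged PyTorch sketch of the Double DQN target above: the online network selects the next action and the target network evaluates it (function and variable names are illustrative):

```python
import torch

def double_dqn_target(model, target_model, reward, next_state, done, gamma=0.99):
    with torch.no_grad():
        # argmax_a Q(s', a; theta) from the online network
        next_action = model(next_state).argmax(1, keepdim=True)
        # Q^(s', argmax_a Q(s', a); theta-) from the target network
        next_q = target_model(next_state).gather(1, next_action).squeeze(1)
    return reward + gamma * next_q * (1 - done)
```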
DUELING DQN
34
DUELING DQN
MOTIVATION
35
$Q(s, a) = V(s) + A(s, a)$
[Q-value decomposition] State value Advantage value
‣ Adds comparative information (the advantage) on top of the current state's value
‣ The difference in value (the advantage) → faster learning
Only the value of the single chosen action is updated; the other actions stay the same
The advantage expresses how much better an action is compared with the chosen one
DUELING DQN
IMPLEMENTATION
[Slides 36-38: code walkthrough]
https://github.com/higgsfield/RL-Adventure/blob/master/3.dueling%20dqn.ipynb
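A Dueling head sketch in PyTorch; subtracting the mean advantage follows the Dueling DQN paper's identifiability trick (feature and action sizes are illustrative):

```python
import torch.nn as nn

class DuelingHead(nn.Module):
    # Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)
    def __init__(self, feature_dim=512, num_actions=6):
        super().__init__()
        self.value = nn.Linear(feature_dim, 1)                 # V(s)
        self.advantage = nn.Linear(feature_dim, num_actions)   # A(s, a)

    def forward(self, features):
        v = self.value(features)
        a = self.advantage(features)
        return v + a - a.mean(dim=1, keepdim=True)
```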
EXPERIMENTS
EXPERIMENTAL COMPARISON
[DQN] [Double DQN] [Dueling DQN]
39
EXTRA
DQN LINEAGE
40
Niels Justesen, Philip Bontrager, Julian Togelius, Sebastian Risi. Deep Learning for Video Game Playing. 2017.
Thank you
41
RL Adventure
PER and NoisyNet
양홍선
1
PER
Prioritized Experience Replay
2
Replay Memory
3
$(s_t, a_t, r_t, s_{t+1})$
Replay Buffer
[Chart: replay frequency per stored transition]
Replay Memory
4
Replay Buffer
More task-relevant transitions deserve a higher replay frequency
$(s_t, a_t, r_t, s_{t+1})$
6
Which experiences to store
Which experiences to replay
Design of Replay Memory
A Motivating Example
8
Two actions: 'right' and 'wrong'
The environment requires an exponential number of random steps until the first non-zero reward
The most relevant transitions are hidden in a mass of highly redundant failure cases
9
How?
10
Prioritizing with TD-Error
A transition's TD error $\delta$ measures how 'surprising' or unexpected the transition is
11
Weaknesses
• A transition with a low TD error on its first visit may not be replayed for a long time
• Prioritizing by TD error is sensitive to noise spikes
• Greedy prioritization focuses on a small subset of the experience
12
Stochastic Sampling!
Stochastic Prioritization
Proportional prioritization
• $p_i = |\delta_i| + \epsilon$
• $P(i) = \dfrac{p_i^\alpha}{\sum_k p_k^\alpha}$
• $p_i > 0$: the priority of transition $i$
• $\alpha$: determines how much prioritization is used
• Sum-tree
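A minimal NumPy sketch of proportional prioritization (without a sum-tree; the values of $\alpha$ and $\epsilon$ are illustrative):

```python
import numpy as np

def sample_indices(td_errors, batch_size, alpha=0.6, eps=1e-5):
    # p_i = |delta_i| + eps, P(i) = p_i^alpha / sum_k p_k^alpha
    priorities = np.abs(td_errors) + eps
    probs = priorities ** alpha
    probs /= probs.sum()
    indices = np.random.choice(len(td_errors), batch_size, p=probs)
    return indices, probs
```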
13
Stochastic Prioritization
Rank-based prioritization
• $p_i = \dfrac{1}{\text{rank}(i)}$
• $\text{rank}(i)$ is the rank of transition $i$ when the replay memory is sorted according to $\delta_i$
• More robust
• Binary heap
14
Annealing the Bias
• Importance-Sampling (IS) weights
• $w_i = \left(\dfrac{1}{N} \cdot \dfrac{1}{P(i)}\right)^\beta$
• Normalize: $w_i \leftarrow \dfrac{w_i}{\max_j w_j}$
• $\Delta \leftarrow \Delta + w_i \cdot \delta_i \cdot \nabla_\theta Q(S_{i-1}, A_{i-1})$
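A minimal sketch of the IS weights above (NumPy; the value of $\beta$ is illustrative):

```python
import numpy as np

def is_weights(probs, indices, beta=0.4):
    # w_i = (1/N * 1/P(i))^beta, normalized by max_i w_i so the
    # weights only ever scale updates down.
    N = len(probs)
    weights = (N * probs[indices]) ** (-beta)
    return weights / weights.max()
```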
15
17
Proportional prioritization (without sum-tree)
18
19
$P(i) = \dfrac{p_i^\alpha}{\sum_k p_k^\alpha}$
20
21
Replayed at least once
22
23
$P(i) = \dfrac{p_i^\alpha}{\sum_k p_k^\alpha}$
24
IS weights
25
26
Update priorities with the new TD errors
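A sketch of that priority refresh step (the `priorities` array and $\epsilon$ are illustrative):

```python
import numpy as np

def update_priorities(priorities, indices, td_errors, eps=1e-5):
    # After a learning step, the sampled transitions' priorities are
    # replaced with their newly computed TD errors.
    priorities[indices] = np.abs(td_errors) + eps
```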
NoisyNet
Noisy Networks for Exploration
27
28
Exploration
Exploitation
29
[Diagram: exploration vs. exploitation; the optimum lies between high exploration and high exploitation]
30
Efficient exploration
31
Exploration methods
$\epsilon$-greedy: act randomly with probability $\epsilon$
Entropy regularization: a penalty added to the loss that keeps the policy from becoming one-sided
$-\sum_a \pi(s, a) \log \pi(s, a)$
32
$\epsilon$-greedy, Entropy regularization
Random perturbations
Hard to produce the large-scale behavioural patterns needed for efficient exploration
35
NoisyNet!!
36
In NoisyNet, learned perturbations of the network weights are used to drive exploration
37
$\theta := \mu + \Sigma \odot \epsilon$
Learnable parameters: $\zeta := (\mu, \Sigma)$
Noise variables: $\epsilon$
40
$y = wx + b$
$y := \left(\mu^w + \sigma^w \odot \epsilon^w\right) x + \mu^b + \sigma^b \odot \epsilon^b$
• p inputs and q outputs
• Independent Gaussian noise
• Uses an independent Gaussian noise entry per weight
• pq + q noise variables
• Factorised Gaussian noise (see the sketch below)
• Uses an independent noise per output and per input
• p + q noise variables
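A minimal sketch of factorised Gaussian noise generation, using $f(x) = \operatorname{sgn}(x)\sqrt{|x|}$ as in the paper (PyTorch; names are illustrative):

```python
import torch

def factorised_noise(p, q):
    # One noise vector per input (p) and one per output (q);
    # the (q, p) weight noise is their outer product after f(x).
    f = lambda x: x.sign() * x.abs().sqrt()
    eps_in, eps_out = f(torch.randn(p)), f(torch.randn(q))
    eps_w = torch.outer(eps_out, eps_in)  # weight noise from p + q samples
    eps_b = eps_out                       # bias noise
    return eps_w, eps_b
```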
41
NoisyNet
42
Loss
$L(\theta) = \mathbb{E}_{(x,a,r,y)\sim D}\left[\left(r + \gamma \max_{b \in A} Q(y, b; \theta^-) - Q(x, a; \theta)\right)^2\right]$  (DQN)
$\bar{L}(\zeta) = \mathbb{E}\left[\mathbb{E}_{(x,a,r,y)\sim D}\left[\left(r + \gamma \max_{b \in A} Q(y, b, \epsilon'; \zeta^-) - Q(x, a, \epsilon; \zeta)\right)^2\right]\right]$  (NoisyNet)
43
Initialisation of NoisyNet
• An unfactorised NoisyNet
• $\mu_{i,j} \sim U\left[-\sqrt{3/p},\; +\sqrt{3/p}\right]$
• p: the number of inputs
• $\sigma_{i,j} = 0.017$
• A factorised NoisyNet
• $\mu_{i,j} \sim U\left[-1/\sqrt{p},\; +1/\sqrt{p}\right]$
• $\sigma_{i,j} = \sigma_0 / \sqrt{p}$
• $\sigma_0 = 0.5$
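A sketch of the factorised initialisation above (PyTorch; the function signature is an illustrative assumption):

```python
import math
import torch.nn as nn

def init_noisy(weight_mu, weight_sigma, bias_mu, bias_sigma, sigma0=0.5):
    # mu ~ U[-1/sqrt(p), 1/sqrt(p)]; sigma = sigma0 / sqrt(p)
    p = weight_mu.size(1)  # number of inputs
    bound = 1.0 / math.sqrt(p)
    nn.init.uniform_(weight_mu, -bound, bound)
    nn.init.uniform_(bias_mu, -bound, bound)
    nn.init.constant_(weight_sigma, sigma0 / math.sqrt(p))
    nn.init.constant_(bias_sigma, sigma0 / math.sqrt(p))
```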
44
45
[Figure: the learning curves of the average noise parameter $\bar{\Sigma}$]
46
47
Factorised NoisyNet
48
49
Learnable parameters
50
Factorised NoisyNet
$\mu_{i,j} \sim U\left[-1/\sqrt{p},\; +1/\sqrt{p}\right]$
51
Factorised
52
53
$\theta := \mu + \Sigma \odot \epsilon$
54
Code: https://github.com/higgsfield/RL-Adventure
PER: https://arxiv.org/abs/1511.05952
NoisyNet: https://arxiv.org/abs/1706.10295
Thank you
Q&A
55
1
RL Adventure
Distributional RL
이의령
C51
Distributional RL
Contents
1. Motivation
2. Distributional RL (C51) explained
3. C51 results
4. Code walkthrough
3
1. Motivation
4
5
Motivation
+ $ 200
- $ 1,800
$\mathbb{E}[R(x)] = \frac{35}{36} \times 200 - \frac{1}{36} \times 1{,}800 \approx 144$
6
Motivation
+ $ 200
- $ 1,800
$R_{t+1} + \gamma R_{t+2} + \cdots + \gamma^{T-t-1} R_T$
Sum of rewards
7
Expected RL
+ $ 200
- $ 1,800
Bellman equation
$v(x) = \mathbb{E}\left[R_{t+1} + \gamma R_{t+2} + \cdots \mid S_t = x\right]$
$= \mathbb{E}\left[R_{t+1} + \gamma\, v(x') \mid S_t = x\right]$
$= \mathbb{E}[R(x)] + \gamma\, \mathbb{E}[v(x')]$
Viewing the reward as a random variable…
§ The value function returns the expectation of discounted future rewards.
§ The expectation is a scalar, not a distribution.
§ Future reward values are complex and multimodal.
§ The expectation cannot capture the intrinsic characteristics of the individual rewards.
8
Expected RL
$\mathbb{E}[R(x)] = \frac{35}{36} \times 200 - \frac{1}{36} \times 1{,}800 \approx 144$
Viewing the reward as a random variable…
9
Expected RL
A remedy for these limitations of expected RL
→ A Distributional Perspective on RL (C51)
Model the return as a distribution, so that its randomness and information are reflected as fully as possible
$V^\pi = \mathbb{E}[Z^\pi(x)] = \mathbb{E}[R(x)] + \gamma\, \mathbb{E}[Z^\pi(X')]$
$Z^\pi(x) = R(x) + \gamma\, Z^\pi(X')$
2. Distributional RL
13
§ Expected RL → Distributional RL
§ Build a value distribution over the return.
§ C51 = categorical / discrete distribution
§ The distribution is built from 51 bins (atoms).
14
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
https://arxiv.org/abs/1707.06887
§ Distributional Bellman Equation
§ Cf) Bellman Equation
§ $Z(s, a)$ denotes a distribution, and it is used to generate the target distribution
15
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
$Q^\pi(x, a) = R(x, a) + \gamma\, Q^\pi(x', a')$
$Q(s, a) = \mathbb{E}[Z(s, a)] = \sum_{i=0}^{N} p_i z_i$
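A sketch of collapsing the categorical distribution into Q-values, $Q(s,a) = \sum_i p_i z_i$ (PyTorch; the 51-atom support on $[V_{\min}, V_{\max}]$ follows C51):

```python
import torch

def q_from_dist(dist, v_min=-10.0, v_max=10.0, num_atoms=51):
    # dist: (batch, actions, num_atoms) probabilities over the support.
    support = torch.linspace(v_min, v_max, num_atoms)  # the atoms z_i
    return (dist * support).sum(dim=-1)                # (batch, actions)
```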
16
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
17
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
18
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
19
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
20
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
C51 = DQN + Projection Distribution
(building the distribution)
Distributional DQN
1. Build a value distribution (51 bins) over the return.
2. At each step, measure the distance between the predicted and target value distributions
→ the paper defines this in theory with the Wasserstein distance, but computes KL divergence in the experiments
3. Compute the loss between the distributions with cross-entropy
21
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
22
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
23
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
Sample a batch from the replay buffer
24
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
Projection Distribution
(building the distribution)
25
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
Bellman distributional operator
$V_{\max} = 10$, $V_{\min} = -10$
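A hedged PyTorch sketch of the categorical projection step (the distributional Bellman operator with $V_{\min} = -10$, $V_{\max} = 10$; tensor shapes are assumptions):

```python
import torch

def project_distribution(next_dist, reward, done, v_min=-10.0, v_max=10.0,
                         num_atoms=51, gamma=0.99):
    # next_dist: (batch, num_atoms) probabilities for the chosen next action;
    # reward, done: (batch,) float tensors.
    batch_size = reward.size(0)
    delta_z = (v_max - v_min) / (num_atoms - 1)
    support = torch.linspace(v_min, v_max, num_atoms)

    # Shift and shrink the support by the Bellman operator, then clamp.
    Tz = reward.unsqueeze(1) + gamma * support.unsqueeze(0) * (1 - done).unsqueeze(1)
    Tz = Tz.clamp(v_min, v_max)
    b = (Tz - v_min) / delta_z  # fractional atom index in [0, num_atoms - 1]
    l, u = b.floor().long(), b.ceil().long()
    # When Tz hits an atom exactly (l == u), nudge so no mass is dropped.
    l[(u > 0) & (l == u)] -= 1
    u[(l < num_atoms - 1) & (l == u)] += 1

    # Spread each atom's probability onto its two neighbouring bins.
    proj = torch.zeros(batch_size, num_atoms)
    proj.scatter_add_(1, l, next_dist * (u.float() - b))
    proj.scatter_add_(1, u, next_dist * (b - l.float()))
    return proj
```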
26
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
27
Distributional RL
A Distributional Perspective on Reinforcement Learning (C51)
Compute the loss between the distributions with KL divergence (cross-entropy)
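A sketch of that cross-entropy loss between the projected target distribution and the prediction (PyTorch; `log_p` is assumed to come from a log-softmax over atoms for the taken actions):

```python
def c51_loss(log_p, target_dist):
    # Minimizing -sum_i m_i * log p_i is equivalent to minimizing
    # KL(m || p) here, since the entropy of the fixed target m is constant.
    return -(target_dist * log_p).sum(dim=1).mean()
```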
28
Performance
A Distributional Perspective on Reinforcement Learning (C51)
Relative Performance Comparison
4. Code walkthrough
29
Thank you.
» urleee@naver.com
30
RL Adventure
RAINBOW
김예찬
1
INDEX
1. Environment
2. Before RAINBOW
DDQN(Double Deep Q-Learning)
Dueling DQN
Multi-Step TD(Temporal Difference)
PER(Prioritized Experience Replay)
Noisy Network
Categorical DQN(C51)
3. RAINBOW
4. RAINBOW - Code
2
OPENAI GYM
HTTPS://GYM.OPENAI.COM
HTTPS://GITHUB.COM/OPENAI/GYM
1. EXPERIMENT ENVIRONMENT
3
2. BEFORE RAINBOW : DOUBLE DQN
4
HTTPS://ARXIV.ORG/ABS/1509.06461
2. BEFORE RAINBOW : DUELING DQN
HTTPS://ARXIV.ORG/ABS/1511.06581
5
2. BEFORE RAINBOW : DUELING DQN
6
HTTPS://ARXIV.ORG/ABS/1511.06581
2. BEFORE RAINBOW : MULTI-STEP LEARNING
7
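The multi-step slide carries no formula in the transcript; a minimal sketch of the n-step return Rainbow uses, $R_t^{(n)} = \sum_{k=0}^{n-1} \gamma^k r_{t+k}$, bootstrapped after n steps (plain Python; $\gamma$ is illustrative):

```python
def n_step_return(rewards, gamma=0.99):
    # rewards = [r_t, ..., r_{t+n-1}] collected over n steps.
    ret = 0.0
    for k, r in enumerate(rewards):
        ret += (gamma ** k) * r
    # Full target: ret + gamma**n * max_a Q(s_{t+n}, a)
    return ret
```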
2. BEFORE RAINBOW : PER
HTTPS://ARXIV.ORG/ABS/1511.05952
8
2. BEFORE RAINBOW : NOISY NETWORK
HTTPS://ARXIV.ORG/ABS/1706.10295
9
2. BEFORE RAINBOW : NOISY NETWORK
HTTPS://ARXIV.ORG/ABS/1706.10295
10
2. BEFORE RAINBOW : CATEGORICAL DQN(C51)
HTTPS://ARXIV.ORG/PDF/1707.06887.PDF
11
2. BEFORE RAINBOW : CATEGORICAL DQN(C51)
HTTPS://ARXIV.ORG/PDF/1707.06887.PDF
12
RAINBOW
3. RAINBOW
13
3. RAINBOW
RAINBOW
DDQN(Double Deep Q-Learning)
+
Dueling DQN
+
Multi-Step TD(Temporal Difference)
+
PER(Prioritized Experience Replay)
+
Noisy Network
+
Categorical DQN(C51)
14
3. RAINBOW
15
3. RAINBOW
HYPERPARAMETERS
16
3. RAINBOW
17
3. RAINBOW
18
PONG
4. RAINBOW - CODE
19
NOISYLINEAR
4. RAINBOW - CODE
20
DUELING + NOISY + C51
4. RAINBOW - CODE
21
PROJECTION STEP
4. RAINBOW - CODE
22
CROSS-ENTROPY LOSS
4. RAINBOW - CODE
23
TEST
4. RAINBOW - CODE
24
Thank you
RAINBOW
김예찬
25