Value Iteration Networks
A. Tamar, Y. Wu, G. Thomas, S. Levine, and P. Abbeel
Dept. of Electrical Engineering and Computer Sciences, UC Berkeley
Presenter: Keisuke Fujimoto
(Twitter @peisuke)
Value Iteration Networks
Purpose: Machine-learning-based robot path planning. The planner also works in new environments that are not included in the training data set.
Strategy: Predict the optimal action. The method learns the reward of each location and chooses actions that collect high rewards.
Result: Planning on 28 x 28 grid maps; also applicable to continuous robot control.
[Figure: inputs (map, pose, velocity, goal) and output (action)]
Presenter: Keisuke Fujimoto (ABEJA)
Background
Target: Autonomous robots
• Manipulation robots, navigation robots, transfer robots
Problem:
• Policies learned by reinforcement learning tend not to work outside of the training environments.
[Figure: manipulation robot (target object) and navigation robot (goal)]
Contribution
• Value Iteration Networks (VIN)
• Model-free training
  • It does not require a robot dynamics model.
• Generalized action prediction in new environments
  • It can work outside of the training environments.
• Key approach
  • Represents value-iteration planning with a CNN
  • Predicts a reward map and computes the sum of future rewards.
Overview of VIN
Input: State of the robot (pose, velocity), goal, map (left fig.)
Output: Action (direction, motor torque)
Strategy: Determine the optimal action using the predicted rewards (right fig.).
[Figure: state (left) and rewards (right)]
Reward propagation
• The action can be determined from the sum of future rewards, which is computed by reward propagation.
[Figure: one-step propagation example with panels: map, reward from map, left-move action, up-move action. Cell values shown include -10, -9, 1, and 0.9.]
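The propagation idea on this slide can be sketched in a few lines of plain Python. This is only an illustrative, simplified variant: the update rule, the discount `gamma`, the 4-connected moves, and the function name `propagate_rewards` are assumptions made for the sketch and are not taken from the slides or from the authors' code.

```python
def propagate_rewards(reward, gamma=0.9, iterations=20, wall=-10.0):
    """Simplified deterministic reward propagation on a 4-connected grid.

    Update rule (an assumption of this sketch, not the slide's exact rule):
        V(s) <- max(R(s), gamma * max over neighbors s' of V(s'))
    Wall cells keep their reward and never receive propagated values.
    """
    rows, cols = len(reward), len(reward[0])
    value = [[float(x) for x in row] for row in reward]  # V_0 = R
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    for _ in range(iterations):
        new_value = [row[:] for row in value]  # synchronous update
        for r in range(rows):
            for c in range(cols):
                if reward[r][c] == wall:
                    continue  # obstacles are not updated
                neighbors = [
                    value[r + dr][c + dc]
                    for dr, dc in moves
                    if 0 <= r + dr < rows and 0 <= c + dc < cols
                ]
                if neighbors:
                    new_value[r][c] = max(reward[r][c], gamma * max(neighbors))
        value = new_value
    return value


# Hypothetical 3x3 map: goal (reward 1) in the top-right corner, wall in the center.
example_reward = [
    [0, 0, 1],
    [0, -10, 0],
    [0, 0, 0],
]
example_value = propagate_rewards(example_reward)
```

In this sketch a free cell ends up with the value `gamma ** d`, where `d` is its shortest obstacle-free distance to the goal, so values decrease with distance from the goal.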
Determination of action
• At each cell reached by propagation, the optimal action is the one with the maximum reward (middle fig.).
• The optimal action is determined using the propagated reward (right fig.).
[Figure: left-move and up-move propagation maps (left), their element-wise max (middle), and the map after reward propagation (right)]
After reward propagation (the figure also marks the current robot pose):
-10 -10 -9 -8 -10
-10 -10 -9 1 0.9
-9 -10 -10 0.9 0.8
-8 -9 -9 0.8 0.7
-7 -8 -8 0.7 0.6
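Choosing an action from the propagated map can be sketched as a greedy lookup over the neighboring cells. The grid below is the "after reward propagation" map from this slide; the four move names, the function `greedy_action`, and the example robot position are assumptions for illustration, since the marker for the robot pose did not survive in the text.

```python
# Propagated value map copied from the slide.
VALUE_MAP = [
    [-10, -10, -9, -8, -10],
    [-10, -10, -9, 1, 0.9],
    [-9, -10, -10, 0.9, 0.8],
    [-8, -9, -9, 0.8, 0.7],
    [-7, -8, -8, 0.7, 0.6],
]

# Hypothetical action set: (row offset, column offset) per move.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def greedy_action(value, position):
    """Return the move whose destination cell has the highest value."""
    r, c = position
    best_action, best_value = None, float("-inf")
    for name, (dr, dc) in MOVES.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(value) and 0 <= nc < len(value[0]):
            if value[nr][nc] > best_value:
                best_action, best_value = name, value[nr][nc]
    return best_action


# Assumed robot position (row 3, column 3, zero-based).
action = greedy_action(VALUE_MAP, (3, 3))
```

From the assumed position the neighbors hold 0.9 (up), 0.7 (down), -9 (left), and 0.7 (right), so the greedy choice moves the robot toward the goal cell with value 1.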
Value Iteration Module
• Reward propagation with a convolutional neural network
• The input is the reward map and the output is the map of summed future rewards
• Q is the hidden reward map, V is the map of summed future rewards
[Figure: VI module: convolution, max, output]
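The convolution-then-max structure of the module can be illustrated with a small pure-Python sketch. In a real VIN the kernels are learned; here they are set by hand (an identity kernel for the reward and discounted shift kernels for the value), and the names `conv3x3`, `vi_module`, and `GAMMA` are hypothetical. This is not the authors' or the presenter's implementation.

```python
def conv3x3(x, kernel):
    """3x3 cross-correlation with zero padding on a 2D list."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for i in range(3):
                for j in range(3):
                    rr, cc = r + i - 1, c + j - 1
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * x[rr][cc]
            out[r][c] = total
    return out


def vi_module(reward, reward_kernels, value_kernels, k):
    """Run k rounds of: Q_a = conv(R, W_a^R) + conv(V, W_a^V); V = max_a Q_a."""
    rows, cols = len(reward), len(reward[0])
    value = [[0.0] * cols for _ in range(rows)]
    for _ in range(k):
        q_maps = []
        for kr, kv in zip(reward_kernels, value_kernels):
            qr, qv = conv3x3(reward, kr), conv3x3(value, kv)
            q_maps.append(
                [[qr[r][c] + qv[r][c] for c in range(cols)] for r in range(rows)]
            )
        # Max over the action channels gives the next value map.
        value = [
            [max(q[r][c] for q in q_maps) for c in range(cols)] for r in range(rows)
        ]
    return value


GAMMA = 0.9  # assumed discount factor
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
# One hand-set kernel per action: read the value of the cell above, below, left, right.
SHIFTS = [
    [[0, GAMMA, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, GAMMA, 0]],
    [[0, 0, 0], [GAMMA, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, GAMMA], [0, 0, 0]],
]

toy_reward = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
toy_value = vi_module(toy_reward, [IDENTITY] * 4, SHIFTS, k=2)
```

With these hand-set kernels each Q channel equals the local reward plus the discounted value of one neighbor, so after two rounds the goal cell holds 1 and its four direct neighbors hold 0.9.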
Value Iteration Networks
• Deep architecture of Value Iteration Networks
• The input is the map and state; f_R predicts the reward map
• The attention module crops the value map around the robot position
• 𝜓 outputs the optimal action
Attention function
• The attention module crops a subset of the values around the current robot pose.
• The optimal action depends only on the values near the current robot pose.
• Thanks to this attention module, predicting the optimal action becomes easier.
Value map (the figure marks the robot position with "If robot is here."):
-10 -10 -9 -8 -10
-10 -10 -9 1 0.9
-9 -10 -10 0.9 0.8
-8 -9 -9 0.8 0.7
-7 -8 -8 0.7 0.6
Selected area:
-10 0.9 0.8
-9 0.8 0.7
-8 0.7 0.6
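The cropping step can be reproduced on the slide's own numbers. The sketch below assumes the robot sits at row 3, column 3 (zero-based), which is the position that yields the selected area shown on the slide; the function name `crop_around` and the padding value used outside the map are assumptions.

```python
# Value map copied from the slide.
VALUE_MAP = [
    [-10, -10, -9, -8, -10],
    [-10, -10, -9, 1, 0.9],
    [-9, -10, -10, 0.9, 0.8],
    [-8, -9, -9, 0.8, 0.7],
    [-7, -8, -8, 0.7, 0.6],
]


def crop_around(value, center, radius=1, pad=-10.0):
    """Return the (2*radius+1) square window of `value` centered on `center`.

    Cells outside the map are filled with `pad` (an assumed choice that
    treats the outside as obstacles).
    """
    r0, c0 = center
    rows, cols = len(value), len(value[0])
    return [
        [
            value[r][c] if 0 <= r < rows and 0 <= c < cols else pad
            for c in range(c0 - radius, c0 + radius + 1)
        ]
        for r in range(r0 - radius, r0 + radius + 1)
    ]


selected = crop_around(VALUE_MAP, (3, 3))
```

The resulting window is what a small fully connected layer would receive, so the action predictor only has to compare a handful of local values rather than the whole map.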
Grid-World Domain
Environment:
Occupancy grid maps; test sizes range from 8x8 to 28x28
The number of recurrences is 20 for the 28x28 maps
The training dataset consists of 5000 maps with 7 trajectories each.
Network architecture:
(Map, Goal) -> CNN (3-layer net, 150 hidden nodes) -> Reward map -> VI module (10 channels in Q-layer) -> Attention (with current position) -> FC layer (80 parameters) -> Action
Compared methods:
CNN-based Deep Q-Network; direct action prediction using an FCN
Results of Grid-World Domain
[Figure: predicted path, reward, and sum of future rewards]
Mars Rover Navigation
Environment:
• Navigating the surface of Mars with a rover.
• It predicts a path from the surface image only, without obstacle information.
• Success rate is 90.3%.
Red points mark sharp elevation changes. At prediction time, VIN does not use the elevation information.
Continuous Control
Environment:
• Applied to a continuous control space.
• Grid size is 28x28
• The input is position and velocity, given as floating-point values.
• The output is a 2D continuous control parameter vector.
[Figure: comparison of the final distance to the goal]
This result is from the authors' presentation.
WebNav Challenge
Environment:
• Navigate website links to find a query
• Features: average word embeddings
• Uses an approximate graph for planning
Evaluation:
• Success rate within the top-4 predictions
• Test set 1: start from the index page
• Test set 2: start from a random page
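One plausible reading of the top-4 criterion is that a prediction counts as a success when the correct link is among the four highest-scored links. The sketch below implements that reading only; the function name `top_k_success_rate`, the input layout, and the example scores are hypothetical and are not taken from the WebNav benchmark.

```python
def top_k_success_rate(scores_per_query, targets, k=4):
    """Fraction of queries whose target link index is among the k best scores."""
    hits = 0
    for scores, target in zip(scores_per_query, targets):
        # Link indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Made-up scores over six candidate links for two queries.
link_scores = [0.1, 0.5, 0.2, 0.9, 0.3, 0.05]
rate = top_k_success_rate([link_scores, link_scores], targets=[0, 2])
```

In the made-up example the four best links are 3, 1, 4, and 2, so the query with target 2 succeeds and the query with target 0 fails.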
Result:
Conclusion
Purpose:
• Machine-learning-based robot path planning.
Method:
• Learn the reward of each location and predict the action using the propagated rewards.
Result:
• VIN policies learn an approximate planning computation relevant for solving the task.
• Demonstrated on grid worlds, continuous control, and even navigation of Wikipedia links.
Code:
https://github.com/peisuke/vin
The code is implemented in Chainer.
Twitter:
@peisuke
We are hiring!
https://www.wantedly.com/companies/abeja

Weitere ähnliche Inhalte

Was ist angesagt?

物体検出の歴史まとめ(1) 20180417
物体検出の歴史まとめ(1) 20180417物体検出の歴史まとめ(1) 20180417
物体検出の歴史まとめ(1) 20180417Masakazu Shinoda
 
【論文紹介】U-GAT-IT
【論文紹介】U-GAT-IT【論文紹介】U-GAT-IT
【論文紹介】U-GAT-ITmeownoisy
 
Transfer learning-presentation
Transfer learning-presentationTransfer learning-presentation
Transfer learning-presentationBushra Jbawi
 
(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter
(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter
(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature FilterMorpho, Inc.
 
Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017mooopan
 
[DL輪読会] Residual Attention Network for Image Classification
[DL輪読会] Residual Attention Network for Image Classification[DL輪読会] Residual Attention Network for Image Classification
[DL輪読会] Residual Attention Network for Image ClassificationDeep Learning JP
 
Sift特徴量について
Sift特徴量についてSift特徴量について
Sift特徴量についてla_flance
 
完全自動運転実現のための信頼度付き自己位置推定の提案
完全自動運転実現のための信頼度付き自己位置推定の提案完全自動運転実現のための信頼度付き自己位置推定の提案
完全自動運転実現のための信頼度付き自己位置推定の提案Naoki Akai
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learningSubrat Panda, PhD
 
[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...
[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...
[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...Deep Learning JP
 
[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos
[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos
[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual VideosDeep Learning JP
 
"Playing Atari with Deep Reinforcement Learning"
"Playing Atari with Deep Reinforcement Learning""Playing Atari with Deep Reinforcement Learning"
"Playing Atari with Deep Reinforcement Learning"mooopan
 
Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -
Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -
Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -Project Samurai
 
Reinforcement Learning 5. Monte Carlo Methods
Reinforcement Learning 5. Monte Carlo MethodsReinforcement Learning 5. Monte Carlo Methods
Reinforcement Learning 5. Monte Carlo MethodsSeung Jae Lee
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn nsAndrew Brozek
 
[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...
[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...
[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...Deep Learning JP
 
単一物体追跡論文のサーベイ
単一物体追跡論文のサーベイ単一物体追跡論文のサーベイ
単一物体追跡論文のサーベイHitoshi Nishimura
 

Was ist angesagt? (20)

物体検出の歴史まとめ(1) 20180417
物体検出の歴史まとめ(1) 20180417物体検出の歴史まとめ(1) 20180417
物体検出の歴史まとめ(1) 20180417
 
【論文紹介】U-GAT-IT
【論文紹介】U-GAT-IT【論文紹介】U-GAT-IT
【論文紹介】U-GAT-IT
 
Transfer learning-presentation
Transfer learning-presentationTransfer learning-presentation
Transfer learning-presentation
 
(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter
(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter
(文献紹介)エッジ保存フィルタ:Side Window Filter, Curvature Filter
 
Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017
 
[DL輪読会] Residual Attention Network for Image Classification
[DL輪読会] Residual Attention Network for Image Classification[DL輪読会] Residual Attention Network for Image Classification
[DL輪読会] Residual Attention Network for Image Classification
 
Sift特徴量について
Sift特徴量についてSift特徴量について
Sift特徴量について
 
完全自動運転実現のための信頼度付き自己位置推定の提案
完全自動運転実現のための信頼度付き自己位置推定の提案完全自動運転実現のための信頼度付き自己位置推定の提案
完全自動運転実現のための信頼度付き自己位置推定の提案
 
Agents_AI.ppt
Agents_AI.pptAgents_AI.ppt
Agents_AI.ppt
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learning
 
[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...
[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...
[DL輪読会]Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforc...
 
[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos
[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos
[DL輪読会]BANMo: Building Animatable 3D Neural Models from Many Casual Videos
 
Mobilenet
MobilenetMobilenet
Mobilenet
 
"Playing Atari with Deep Reinforcement Learning"
"Playing Atari with Deep Reinforcement Learning""Playing Atari with Deep Reinforcement Learning"
"Playing Atari with Deep Reinforcement Learning"
 
Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -
Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -
Pythonで画像処理をやってみよう!第7回 - Scale-space 第6回 -
 
Reinforcement Learning 5. Monte Carlo Methods
Reinforcement Learning 5. Monte Carlo MethodsReinforcement Learning 5. Monte Carlo Methods
Reinforcement Learning 5. Monte Carlo Methods
 
#10 pydata warsaw object detection with dn ns
#10   pydata warsaw object detection with dn ns#10   pydata warsaw object detection with dn ns
#10 pydata warsaw object detection with dn ns
 
KCFの紹介
KCFの紹介KCFの紹介
KCFの紹介
 
[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...
[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...
[DL輪読会]Model soups: averaging weights of multiple fine-tuned models improves ...
 
単一物体追跡論文のサーベイ
単一物体追跡論文のサーベイ単一物体追跡論文のサーベイ
単一物体追跡論文のサーベイ
 

Andere mochten auch

Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsKen Kuroki
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoderssuga93
 
Fast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansFast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansKimikazu Kato
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...Shuhei Yoshida
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learningmooopan
 
Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmKatsuki Ohto
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Kazuto Fukuchi
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentHiroyuki Fukuda
 
時系列データ3
時系列データ3時系列データ3
時系列データ3graySpace999
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowTatsuya Shirakawa
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence LearningDeep Learning JP
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics Koichi Hamada
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...Kusano Hitoshi
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Kentaro Minami
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learningKazuki Fujikawa
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介Kohei Hayashi
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural NetworksSeiya Tokui
 

Andere mochten auch (18)

Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and Physics
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
 
Fast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansFast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-Means
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gen...
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
 
Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithm
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descent
 
時系列データ3
時系列データ3時系列データ3
時系列データ3
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive Flow
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learning
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks
 

Ähnlich wie Value iteration networks

自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用Ryo Iwaki
 
Rapid motor adaptation for legged robots
Rapid motor adaptation for legged robotsRapid motor adaptation for legged robots
Rapid motor adaptation for legged robotsRohit Choudhury
 
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...Neo4j
 
Artificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot NavigationArtificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot NavigationMithun Chowdhury
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesisSeoung-Ho Choi
 
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networksJehong Lee
 
Combinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learningCombinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learning민재 정
 
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRLcrowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL민재 정
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigationguest90654fd
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigationguest90654fd
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overviewNatalia Díaz Rodríguez
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahulKirtoniya
 
Analysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachAnalysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachLorenzo Cesaretti
 
Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.SUMITRAJ312049
 
Farkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdfFarkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdfMonesseKHAMISSIA1
 
Procedural modeling using autoencoder networks
Procedural modeling using autoencoder networksProcedural modeling using autoencoder networks
Procedural modeling using autoencoder networksShuhei Iitsuka
 
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...AutonomyIncubator
 

Ähnlich wie Value iteration networks (20)

自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用自然方策勾配法の基礎と応用
自然方策勾配法の基礎と応用
 
Rapid motor adaptation for legged robots
Rapid motor adaptation for legged robotsRapid motor adaptation for legged robots
Rapid motor adaptation for legged robots
 
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
Novel Graph Modeling Framework for Feature Importance Determination in Unsupe...
 
Artificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot NavigationArtificial Neural Network based Mobile Robot Navigation
Artificial Neural Network based Mobile Robot Navigation
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
 
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks[Paper research] GOSELO: for Robot navigation using Reactive neural networks
[Paper research] GOSELO: for Robot navigation using Reactive neural networks
 
Introduction to Deep Reinforcement Learning
Introduction to Deep Reinforcement LearningIntroduction to Deep Reinforcement Learning
Introduction to Deep Reinforcement Learning
 
Combinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learningCombinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learning
 
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRLcrowd-robot interaction: crowd-aware robot navigation with attention-based DRL
crowd-robot interaction: crowd-aware robot navigation with attention-based DRL
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigation
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigation
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overview
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
 
Analysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachAnalysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approach
 
Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.Road Network Extraction using Satellite Imagery.
Road Network Extraction using Satellite Imagery.
 
Farkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdfFarkhatdinov Robotics education for children 2017 Accepted.pdf
Farkhatdinov Robotics education for children 2017 Accepted.pdf
 
Procedural modeling using autoencoder networks
Procedural modeling using autoencoder networksProcedural modeling using autoencoder networks
Procedural modeling using autoencoder networks
 
Conv xg
Conv xgConv xg
Conv xg
 
Resume_2016
Resume_2016Resume_2016
Resume_2016
 
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
Autonomy Incubator Seminar Series: Tractable Robust Planning and Model Learni...
 

Mehr von Fujimoto Keisuke

A quantum computational approach to correspondence problems on point sets
A quantum computational approach to correspondence problems on point setsA quantum computational approach to correspondence problems on point sets
A quantum computational approach to correspondence problems on point setsFujimoto Keisuke
 
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...
F0-Consistent Many-to-many Non-parallel Voice Conversion via Conditional Auto...Fujimoto Keisuke
 
YOLACT real-time instance segmentation
YOLACT real-time instance segmentationYOLACT real-time instance segmentation
YOLACT real-time instance segmentationFujimoto Keisuke
 
Product Managerの役割、周辺ロールとの差異
Product Managerの役割、周辺ロールとの差異Product Managerの役割、周辺ロールとの差異
Product Managerの役割、周辺ロールとの差異Fujimoto Keisuke
 
ChainerRLで株売買を結構頑張ってみた(後編)
ChainerRLで株売買を結構頑張ってみた(後編)ChainerRLで株売買を結構頑張ってみた(後編)
ChainerRLで株売買を結構頑張ってみた(後編)Fujimoto Keisuke
 
Temporal Cycle Consistency Learning
Temporal Cycle Consistency LearningTemporal Cycle Consistency Learning
Temporal Cycle Consistency LearningFujimoto Keisuke
 
20190414 Point Cloud Reconstruction Survey
20190414 Point Cloud Reconstruction Survey20190414 Point Cloud Reconstruction Survey
20190414 Point Cloud Reconstruction SurveyFujimoto Keisuke
 
20180925 CV勉強会 SfM解説
20180925 CV勉強会 SfM解説20180925 CV勉強会 SfM解説
20180925 CV勉強会 SfM解説Fujimoto Keisuke
 
Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Sliced Wasserstein Distance for Learning Gaussian Mixture ModelsSliced Wasserstein Distance for Learning Gaussian Mixture Models
Sliced Wasserstein Distance for Learning Gaussian Mixture ModelsFujimoto Keisuke
 
LiDAR-SLAM チュートリアル資料
LiDAR-SLAM チュートリアル資料LiDAR-SLAM チュートリアル資料
LiDAR-SLAM チュートリアル資料Fujimoto Keisuke
 
Stock trading using ChainerRL
Stock trading using ChainerRLStock trading using ChainerRL
Stock trading using ChainerRLFujimoto Keisuke
 
Cold-Start Reinforcement Learning with Softmax Policy Gradient
Cold-Start Reinforcement Learning with Softmax Policy GradientCold-Start Reinforcement Learning with Softmax Policy Gradient
Cold-Start Reinforcement Learning with Softmax Policy GradientFujimoto Keisuke
 
Representation learning by learning to count
Representation learning by learning to countRepresentation learning by learning to count
Representation learning by learning to countFujimoto Keisuke
 
Dynamic Routing Between Capsules
Dynamic Routing Between CapsulesDynamic Routing Between Capsules
Dynamic Routing Between CapsulesFujimoto Keisuke
 
Deep Learning Framework Comparison on CPU
Deep Learning Framework Comparison on CPUDeep Learning Framework Comparison on CPU
Deep Learning Framework Comparison on CPUFujimoto Keisuke
 
Global optimality in neural network training
Global optimality in neural network trainingGlobal optimality in neural network training
Global optimality in neural network trainingFujimoto Keisuke
 

Value iteration networks