Bayesian Nonparametric Motor-skill Representations
for Efficient Learning of Robotic Clothing Assistance
Workshop on Practical Bayesian Nonparametrics, NIPS 2016
Nishanth Koganti (1,2), Tomoya Tamei (1), Kazushi Ikeda (1), Tomohiro Shibata (2)
(1) Nara Institute of Science and Technology, Ikoma, Japan
(2) Kyushu Institute of Technology, Kitakyushu, Japan
February 11, 2017
Robotic Clothing Assistance
Aging causes a loss of the motor function needed for dexterous tasks.
Goal: Develop a learning framework that lets humanoid robots perform clothing assistance.
Challenge: Close interaction of the robot with both the clothes and the human.
[Figure: left, non-rigid clothing material (Ramisa et al., 2011); right, varying posture of the human (Dan MacLeod Posture Study)]
Reinforcement Learning for Clothing Assistance
A Markov Decision Process (MDP) is formulated with low-dimensional state and policy representations [1].
[1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011
Clothing Assistance Framework [1]: Outline
[1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011
Clothing Assistance Framework [1]: Policy
The control policy is parametrized by via-points [2] of the trajectory.
A finite-difference policy gradient method is used for the policy update:
\[ \frac{\partial \eta(\theta)}{\partial \theta} \approx \frac{r(\theta_i + \Delta\theta) - r(\theta_i - \Delta\theta)}{2\Delta\theta} \]
\[ \theta \leftarrow \theta + \alpha \frac{\partial \eta(\theta)}{\partial \theta} \]
[1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011
[2] Wada, Y. et al., “Theory for handwriting on minimization principle”, in Biological Cybernetics, 1995
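The finite-difference update above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_reward` is a hypothetical stand-in for a real rollout reward, and the step size `alpha`, perturbation `delta`, and iteration count are arbitrary choices, not values from the slides.

```python
def finite_difference_gradient(reward, theta, delta=1e-2):
    """Central-difference estimate of d(reward)/d(theta), one parameter at a time."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += delta
        minus[i] -= delta
        grad.append((reward(plus) - reward(minus)) / (2.0 * delta))
    return grad


def policy_gradient_ascent(reward, theta, alpha=0.1, delta=1e-2, iterations=200):
    """Repeat theta <- theta + alpha * gradient for a fixed number of iterations."""
    theta = list(theta)
    for _ in range(iterations):
        grad = finite_difference_gradient(reward, theta, delta)
        theta = [t + alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy reward with its maximum at theta = (1, -2); on a robot,
# each call would correspond to executing one rollout of the policy.
def toy_reward(theta):
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


theta_star = policy_gradient_ascent(toy_reward, [0.0, 0.0])
```

Each gradient estimate costs two reward evaluations per parameter, which is one reason a low-dimensional policy representation matters when every evaluation is a physical rollout.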
Problem: Adaptive Learning of Clothing Skills
Designing a robust motor-skills learning framework is crucial for real-world implementation on low-cost robots.
Tight coupling with the cloth and close proximity to the human.
The optimal policy varies with the initial conditions.
[Figure: left, non-rigid clothing material (Ramisa et al., 2011); right, varying posture of the human (Dan MacLeod Posture Study)]
Reinforcement Learning in Latent Space
Combining motor-skills learning with dimensionality reduction:
A tractable search space reduces learning time.
The latent space can be modeled to capture task-space constraints.
Existing methods rely on linear models or a MAP estimate of the latent space.
[Figure: left, Bitzer et al., 2010 [1]; right, Luck et al., 2014 [2]]
[1] Bitzer, S. et al., “Using dimensionality reduction in reinforcement learning”, in IEEE/RSJ IROS, 2010
[2] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014
Motor-skill Learning in Latent Spaces
Use Bayesian nonparametric nonlinear dimensionality reduction for efficient learning of clothing skills [1].
[1] Koganti, N. et al., “Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Clothing Assistance”, in Workshop on Practical Bayesian Nonparametrics, NIPS, 2016
Bayesian Gaussian Process Latent Variable Model
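Latent variable model (Titsias et al., 2010 [1]):
\[ y = f(x) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2 I) \]
$y \in \mathbb{R}^D$: observed variable
$x \in \mathbb{R}^Q$ ($Q \ll D$): unknown latent variable
$f : x \mapsto y$: mapping given by a Gaussian process
\[ p(Y \mid X) = \prod_{d=1}^{D} \mathcal{N}(y_d \mid 0, K_{NN} + \beta^{-1} I_N) \]
[Figure: graphical model with latent x, mapping f, hyperparameters w and θ, and observation y]
[1] Titsias, M. K. et al., “Bayesian Gaussian Process Latent Variable Model”, in AISTATS 2010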
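Writing the product over output dimensions out in full may make the training objective more concrete. The expansion below assumes the usual GPLVM conventions, which the slide does not spell out: $y_d$ is the $d$-th column of the $N \times D$ data matrix $Y$, and $K$ is shorthand (introduced here) for the noisy kernel matrix.

```latex
\log p(Y \mid X)
  = -\frac{DN}{2}\log(2\pi)
    - \frac{D}{2}\log\lvert K \rvert
    - \frac{1}{2}\operatorname{tr}\!\left(K^{-1} Y Y^{\top}\right),
\qquad K = K_{NN} + \beta^{-1} I_N
```

The latent points $X$ enter only through $K_{NN}$, so the likelihood depends on them nonlinearly through the kernel.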
BGPLVM: Manifold Learning
Bayesian inference: posterior distribution over the latent space.
\[ p(Y) = \int p(Y \mid X)\, p(X)\, dX \]
Marginalization is made tractable using variational inference [1]:
\[ q(X) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu_n, S_n) \]
\[ \log p(Y) \geq \int q(X) \log p(Y \mid X)\, dX - \int q(X) \log \frac{q(X)}{p(X)}\, dX \]
Automatic dimensionality reduction is possible using an ARD kernel:
\[ k(x, x') = \sigma_f^2 \exp\left( -\frac{1}{2} \sum_{q=1}^{Q} w_q (x_q - x'_q)^2 \right) \]
[1] Titsias, M. K. et al., “Bayesian Gaussian Process Latent Variable Model”, in AISTATS 2010
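A small Python sketch of the ARD kernel shows why it supports automatic dimensionality reduction: a latent dimension whose weight $w_q$ is zero has no effect on the kernel value. The function name `ard_kernel` and the example weights are illustrative assumptions, not taken from the authors' code.

```python
import math


def ard_kernel(x, x_prime, weights, sigma_f=1.0):
    """ARD squared-exponential kernel with one weight per latent dimension."""
    weighted_sq_dist = sum(
        w * (a - b) ** 2 for w, a, b in zip(weights, x, x_prime)
    )
    return sigma_f ** 2 * math.exp(-0.5 * weighted_sq_dist)


# Hypothetical weights: dimension 0 is active, dimension 1 is switched off.
weights = [1.0, 0.0]

# The points differ only in the switched-off dimension, so the kernel
# treats them as identical.
k_same_dim0 = ard_kernel([0.3, -5.0], [0.3, 7.0], weights)

# The points differ by 1.0 in the active dimension.
k_diff_dim0 = ard_kernel([0.3, 0.0], [1.3, 0.0], weights)
```

In a trained model the weights are learned, and dimensions whose weights shrink toward zero can be discarded.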
Motor-skills Transfer through Latent Space
A BGPLVM model is trained on robot joint angles ($\in \mathbb{R}^{14}$) recorded from kinesthetic demonstrations of clothing assistance [1].
[1] Koganti, N. et al., “Motor-skill Learning in Latent Spaces for Robotic Clothing Assistance”, in RSJ Annual Conference, 2016
Reinforcement Learning in BGPLVM Space
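Apply the Cross Entropy Method (CEM) to perform policy improvement [1]:
\[ \theta^{*} \sim \mathcal{N}(\theta \mid \mu^{*}, \Sigma^{*}) \]
\[ \mu^{*} := \operatorname{mean}(\operatorname{argmax}\, \theta_{\text{old}}), \qquad \Sigma^{*} := \operatorname{var}(\operatorname{argmax}\, \theta_{\text{old}}) \]
Represent the policy using a Dynamic Movement Primitive (DMP):
\[ \tau \ddot{x} = K(g - x) - D\dot{x} + (g - x_0) f \]
\[ f(s) = \frac{\sum_i w_i \psi_i(s)\, s}{\sum_i \psi_i(s)}, \quad \text{where } \tau \dot{s} = -\alpha s \]
[1] Koganti, N. et al., “Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Clothing Assistance”, in Workshop on Practical Bayesian Nonparametrics, NIPS, 2016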
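The CEM update can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a diagonal covariance, reads "argmax" as the best-scoring samples of the previous iteration, adds a small floor `min_std` on the standard deviation, and uses a hypothetical `toy_reward` in place of a DMP rollout. The defaults of 50 rollouts and 5 elite samples follow the experiment settings quoted later in the deck.

```python
import random


def cross_entropy_method(reward, mean, std, n_rollouts=50, n_elite=5,
                         iterations=30, min_std=1e-3, seed=0):
    """Refit a diagonal Gaussian over parameters to the best samples."""
    rng = random.Random(seed)
    mean, std = list(mean), list(std)
    dims = len(mean)
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        samples = [[rng.gauss(mean[d], std[d]) for d in range(dims)]
                   for _ in range(n_rollouts)]
        # Keep the n_elite samples with the highest reward.
        elite = sorted(samples, key=reward, reverse=True)[:n_elite]
        # Refit mean and standard deviation to the elite set, per dimension.
        for d in range(dims):
            values = [s[d] for s in elite]
            mean[d] = sum(values) / n_elite
            var = sum((v - mean[d]) ** 2 for v in values) / n_elite
            std[d] = max(var ** 0.5, min_std)  # floor is an assumption
    return mean, std


# Hypothetical reward with its maximum at theta = (1, -2).
def toy_reward(theta):
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


best_mean, best_std = cross_entropy_method(toy_reward, [0.0, 0.0], [2.0, 2.0])
```

Because CEM needs only reward values and no gradients, the same loop can be run unchanged in joint space or in a learned latent space.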
Reinforcement Learning in BGPLVM Space
Represent the reward function by the distance of the current policy from the desired via-points:
\[ R(\pi(\theta)) = \sum_{i=1}^{n_{\text{dims}}} \sum_{j=1}^{n_{\text{via}}} \lVert V_{i,j} - \pi_i(\theta, t_{i,j}) \rVert^2 \]
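As a rough sketch, the via-point term can be computed as below. The callable `policy(i, t)` is a hypothetical stand-in for $\pi_i(\theta, t)$ with $\theta$ held fixed, and each trajectory dimension is assumed to be scalar-valued. The function returns the sum exactly as written on the slide; whether it is negated before being maximized is not stated in the source.

```python
def via_point_distance(via_points, via_times, policy):
    """Sum over dimensions i and via-points j of (V[i][j] - policy(i, t[i][j]))**2."""
    total = 0.0
    for i, (targets, times) in enumerate(zip(via_points, via_times)):
        for v, t in zip(targets, times):
            total += (v - policy(i, t)) ** 2
    return total


# Hypothetical two-dimensional trajectory: dimension i follows (i + 1) * t.
def toy_policy(i, t):
    return (i + 1) * t


via_points = [[0.0, 0.5, 1.0], [0.0, 1.0, 2.5]]
via_times = [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]
distance = via_point_distance(via_points, via_times, toy_policy)
```

In this toy case only the last via-point of the second dimension is missed, by 0.5, so it contributes the entire value.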
Latent Space Controller for Clothing Tasks [1]
[1] Koganti, N. et al., “Motor-skill Learning in Latent Spaces for Robotic Clothing Assistance”, in RSJ Annual Conference, 2016
Generalization in Latent Space
Evaluation: reconstruction error of the latent space, measured by RMS error [1].
Dataset: clothing trajectories for 4 postures, with shoulder angle $\in \{65^\circ, 70^\circ, 75^\circ, 80^\circ\}$.
[Figure: panels for PCA, GPLVM, and BGPLVM]
[1] Koganti, N. et al., “Motor-skill Learning in Latent Spaces for Robotic Clothing Assistance”, in RSJ Annual Conference, 2016
Reinforcement Learning in Latent Space
Apply reinforcement learning in different action spaces with the same formulation and reward function [1].
Parameters: $50 \times n_{\text{dims}}$ basis functions.
CEM: 50 rollouts per iteration.
Policy update: 5 best rollouts per iteration.
[1] Koganti, N. et al., “Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Clothing Assistance”, in Workshop on Practical Bayesian Nonparametrics, NIPS, 2016
Moving forward
Immediate goal: latent spaces for robotics applications:
Auto-regressive prior on the latent space to capture task dynamics.
Explicit model of human-robot interaction as a constraint.
Ambitious goal: combine policy-search RL and BGPLVM:
Non-linear dimensionality reduction.
Bayesian and data-efficient learning.
[Figure: panels "Data-efficient" and "Bayesian Inference" [1]]
[1] Deisenroth, M. P. et al., “Gaussian processes for data-efficient learning in robotics and control”, in IEEE Transactions PAMI, 2015
Appendix
Topology Coordinates
To approximate a Markov Decision Process, the relationship between the cloth and the subject needs to be observed as fully as possible.
Low-dimensional representations are needed to keep learning time short.
Topology coordinates were introduced to address both requirements.
The concept was proposed by Ho and Komura (2009) [1].
Given two line segments, the amount of twist (writhe) between them is given by the Gaussian Linking Integral (GLI):
\[ w = \mathrm{GLI}(\gamma_1, \gamma_2) = \frac{1}{4\pi} \int_{\gamma_1} \int_{\gamma_2} \frac{d\gamma_1 \times d\gamma_2 \cdot (\gamma_1 - \gamma_2)}{\lVert \gamma_1 - \gamma_2 \rVert^3} \qquad (1) \]
[1] Ho, E. S. L. and Komura, T., “Character motion synthesis by topology coordinates”, Eurographics 2009
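To make the GLI concrete, the double integral can be approximated numerically for two polylines. The sketch below uses a simple midpoint rule over all pairs of segments; it is an illustrative approximation, not the closed-form segment formula used in the topology-coordinates literature, and the helper names are assumptions. For two closed curves the value approaches the linking number, so two interlocked circles give a magnitude near 1 and two separated circles give a value near 0.

```python
import math


def gauss_linking_integral(curve1, curve2):
    """Midpoint-rule approximation of the GLI between two polylines.

    Each curve is a list of (x, y, z) points; consecutive points form segments.
    """
    def segments(curve):
        # Return (direction, midpoint) for every segment of the polyline.
        result = []
        for p, q in zip(curve[:-1], curve[1:]):
            direction = [b - a for a, b in zip(p, q)]
            midpoint = [(a + b) / 2.0 for a, b in zip(p, q)]
            result.append((direction, midpoint))
        return result

    total = 0.0
    segs2 = segments(curve2)
    for d1, m1 in segments(curve1):
        for d2, m2 in segs2:
            r = [a - b for a, b in zip(m1, m2)]
            cross = [d1[1] * d2[2] - d1[2] * d2[1],
                     d1[2] * d2[0] - d1[0] * d2[2],
                     d1[0] * d2[1] - d1[1] * d2[0]]
            dist = math.sqrt(sum(c * c for c in r))
            total += sum(c * x for c, x in zip(cross, r)) / dist ** 3
    return total / (4.0 * math.pi)


def circle_xy(n=100):
    """Closed unit circle in the xy-plane, centred at the origin."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n), 0.0)
            for k in range(n + 1)]


def circle_xz(center_x, n=100):
    """Closed unit circle in the xz-plane, centred at (center_x, 0, 0)."""
    return [(center_x + math.cos(2 * math.pi * k / n), 0.0,
             math.sin(2 * math.pi * k / n)) for k in range(n + 1)]


linked = gauss_linking_integral(circle_xy(), circle_xz(1.0))    # interlocked
unlinked = gauss_linking_integral(circle_xy(), circle_xz(5.0))  # far apart
```

The sign of the result depends on the orientation of the two curves, so only its magnitude is meaningful in this example.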
Topology Space
The relationship between two chains of line segments is defined by the writhe matrix $T_{n \times m}$.
Given chains $S_1$, $S_2$ with $n$ and $m$ links, $T_{n \times m}$ is given by:
\[ T_{ij} = \mathrm{GLI}(S_1^i, S_2^j) \]
The parameters writhe, center, and density are derived from the writhe matrix and together form the topology space [1].
[1] Ho, E. S. L. and Komura, T., “Character motion synthesis by topology coordinates”, Eurographics 2009
Clothing Assistance Framework [1]: State and Reward
Low-dimensional state representation using topology coordinates [2].
Reward given by the distance between the final state and the target state:
\[ r_i = -\lVert s_i^{\text{target}} - s_i \rVert \quad (i = 1, 2, 3), \qquad r(s) = \sum_{i=1}^{3} \frac{r_i - \mu_i}{\sigma_i} \]
[1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011
[2] Ho, E. S. L. et al., “Character motion synthesis by topology coordinates”, in Computer Graphics Forum 2009
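The normalized reward can be sketched in Python as below. The sketch assumes each of the three state components is a scalar and treats $\mu_i$ and $\sigma_i$ as given normalization constants; how they are estimated is not described on the slide, and the numeric values here are purely hypothetical.

```python
def topology_reward(state, target, mu, sigma):
    """Sum of normalized negative distances between state and target components."""
    total = 0.0
    for s_i, target_i, mu_i, sigma_i in zip(state, target, mu, sigma):
        r_i = -abs(target_i - s_i)          # negative distance to the target
        total += (r_i - mu_i) / sigma_i     # normalize each component
    return total


# Hypothetical final state, target state, and normalization constants.
state = [0.5, 1.0, 2.0]
target = [1.0, 1.0, 1.0]
mu = [0.0, 0.0, 0.0]
sigma = [1.0, 1.0, 2.0]
reward = topology_reward(state, target, mu, sigma)
```

Dividing by $\sigma_i$ puts the three components on a comparable scale, so that no single coordinate dominates the total.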
Combining DR and RL
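Policy representation [1]:
\[ a = W(Z^{\top}\Phi) + M\Phi + E\Phi \]
Expectation step: posterior distribution over the latent variables:
\[ p_{\theta_{\text{old}}}(Z^{\top}\Phi \mid a) = \mathcal{N}\big(C W^{\top}(a - M\Phi),\; C\sigma^2 \operatorname{tr}(\Phi\Phi^{\top})\big), \qquad C = (\sigma^2 I + W^{\top}W) \]
Maximization step: compute gradients with respect to the policy parameters:
\[ \frac{\partial \ln p(a)\, Q^{\pi}_t}{\partial M}, \quad \frac{\partial \ln p(a)\, Q^{\pi}_t}{\partial W}, \quad \frac{\partial \ln p(a)\, Q^{\pi}_t}{\partial \sigma^2} \]
[1] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014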
DR as Preprocessing for RL
Bitzer et al. (2010) [1]: GPLVM-based latent space encoding task-space constraints.
Non-linear dimensionality reduction.
Data-efficient learning with a GP mapping.
Value-function reinforcement learning (TD(0)) applied to the tractable search space.
[1] Bitzer, S. et al., “Using dimensionality reduction in reinforcement learning”, in IEEE/RSJ IROS, 2010
Combining DR and RL
Luck et al. (2014) [1]: joint learning of the latent space and the optimal policy.
\[ a = W(Z^{\top}\Phi) + M\Phi + E\Phi \qquad (2) \]
PePPER: an Expectation-Maximization formulation based on a KL-divergence lower bound.
Probabilistic PCA is used as the model for learning the latent space.
[1] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014
Combining DR and RL
Inverse kinematics: planning in the joint-angle space of a highly redundant robot (20 DOF) [1].
Standing on one leg: applied to a full humanoid robot, with the policy learned from scratch.
[1] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014
Discussion
Robotic clothing assistance involves several problems, including close interaction with non-rigid clothing and a human whose posture varies.
We propose combining DR with RL for efficient motor-skills learning.
Future Work
Implement the latent space RL framework within the clothing assistance framework.
Combine real-time state estimation with the motor-skills learning framework.
References
Tamei, Tomoya, et al. “Reinforcement learning of clothing assistance with a
dual-arm robot.” Humanoid Robots (Humanoids), 2011 11th IEEE-RAS
International Conference on. IEEE, 2011.
Ho, Edmond SL, and Taku Komura. “Character motion synthesis by topology
coordinates.” Computer Graphics Forum. Vol. 28. No. 2. Blackwell Publishing
Ltd, 2009.
Pohl, William F. “The self-linking number of a closed space curve (Gauss integral
formula treated for disjoint closed space curves linking number).” Journal of
Mathematics and Mechanics 17 (1968): 975-985.
Miyamoto, Hiroyuki, et al. “A kendama learning robot based on bi-directional
theory.” Neural networks 9.8 (1996): 1281-1302.
Koganti, Nishanth, et al. “Cloth dynamics modeling in latent spaces and its
application to robotic clothing assistance.” Intelligent Robots and Systems
(IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015.
Deisenroth, Marc Peter, Dieter Fox, and Carl Edward Rasmussen. “Gaussian
processes for data-efficient learning in robotics and control.” Pattern Analysis
and Machine Intelligence, IEEE Transactions on 37.2 (2015): 408-423.
Levine, Sergey, et al. “End-to-end training of deep visuomotor policies.” arXiv
preprint arXiv:1504.00702 (2015).
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 

Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Robotic Clothing Assistance

  • 1. Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Robotic Clothing Assistance. Workshop on Practical Bayesian Nonparametrics, NIPS 2016. Nishanth Koganti [1,2], Tomoya Tamei [1], Kazushi Ikeda [1], Tomohiro Shibata [2]. [1] Nara Institute of Science and Technology, Ikoma, Japan. [2] Kyushu Institute of Technology, Kitakyushu, Japan. February 11, 2017.
  • 2. Robotic Clothing Assistance. Aging causes a loss of the motor functions needed to perform dexterous tasks. Goal: develop a learning framework for humanoid robots to perform clothing assistance. Challenge: close interaction of the robot with the clothes and the human. [Figures: non-rigid clothing material (left, Ramisa et al., 2011); varying posture of the human (right, Dan MacLeod Posture Study)]
  • 3. Reinforcement Learning for Clothing Assistance. A Markov Decision Process (MDP) is formulated with low-dimensional state and policy representations [1]. [1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011.
  • 4. Clothing Assistance Framework [1]: Outline. [1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011.
  • 5. Clothing Assistance Framework [1]: Policy. The control policy is parametrized by via-points [2] of the trajectory. A finite-difference policy gradient method is used for the policy update: $\frac{\partial \eta(\theta)}{\partial \theta} \approx \frac{r(\theta_i + \Delta\theta) - r(\theta_i - \Delta\theta)}{2\Delta\theta}$, followed by $\theta \leftarrow \theta + \alpha \frac{\partial \eta(\theta)}{\partial \theta}$. [1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011. [2] Wada, Y. et al., “Theory for handwriting on minimization principle”, in Biological Cybernetics, 1995.
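
The finite-difference update on slide 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_reward`, its target via-points, the step size `alpha` and the perturbation `delta` are hypothetical stand-ins, not values from the slides, and the real reward would come from robot rollouts.

```python
def finite_difference_gradient(reward, theta, delta=1e-3):
    """Central-difference estimate of d(reward)/d(theta), one parameter at a time."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += delta
        minus[i] -= delta
        grad.append((reward(plus) - reward(minus)) / (2.0 * delta))
    return grad


def policy_update(reward, theta, alpha=0.1, delta=1e-3):
    """One gradient-ascent step: theta <- theta + alpha * gradient."""
    grad = finite_difference_gradient(reward, theta, delta)
    return [t + alpha * g for t, g in zip(theta, grad)]


def toy_reward(theta):
    # Hypothetical reward: negative squared distance to assumed target via-points.
    target = [1.0, -2.0]
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


theta = [0.0, 0.0]
for _ in range(200):
    theta = policy_update(toy_reward, theta)
print(theta)  # approaches the assumed target [1.0, -2.0]
```

Each update costs two reward evaluations per policy parameter, which is one reason a low-dimensional policy representation matters when every evaluation is a physical rollout.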
  • 6. Problem: Adaptive Learning of Clothing Skills. Designing a robust motor-skills learning framework is crucial for real-world implementation on low-cost robots. The task involves tight coupling with the cloth and close proximity to the human. The optimal policy varies with the initial conditions. [Figures: non-rigid clothing material (left, Ramisa et al., 2011); varying posture of the human (right, Dan MacLeod Posture Study)]
  • 7. Reinforcement Learning in Latent Space. Combining motor-skills learning with dimensionality reduction: a tractable search space reduces learning time, and the latent space can be modeled to capture task space constraints. Existing methods rely on linear models or a MAP estimate of the latent space. [Figures: Bitzer et al., 2010 [1]; Luck et al., 2014 [2]] [1] Bitzer, S. et al., “Using dimensionality reduction in reinforcement learning”, in IEEE/RSJ IROS, 2010. [2] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014.
  • 8. Motor-skill Learning in Latent Spaces. Use Bayesian nonparametric nonlinear dimensionality reduction for efficient learning of clothing skills [1]. [1] Nishanth, K. et al., “Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Clothing Assistance”, in Workshop on Practical Bayesian Nonparametrics, NIPS, 2016.
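  • 9. Bayesian Gaussian Process Latent Variable Model. Latent variable model (Titsias et al., 2010 [1]): $y = f(x) + \epsilon$, $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, where $y \in \mathbb{R}^D$ is the observed variable, $x \in \mathbb{R}^Q$ ($Q \ll D$) is the unknown latent variable, and $f: x \to y$ is the mapping given by a Gaussian Process. Likelihood: $p(Y|X) = \prod_{d=1}^{D} \mathcal{N}(y_d \mid 0, K_{NN} + \beta^{-1} I_N)$. [Figure: graphical model with nodes x, f, y and parameters w, θ] [1] Titsias, M. K. et al., “Bayesian Gaussian Process Latent Variable Model”, in AISTATS 2011.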
  • 10. BGPLVM: Manifold Learning. Bayesian inference: posterior distribution on the latent space, which requires the marginal likelihood $p(Y) = \int p(Y|X)\, p(X)\, dX$. The marginalization is made tractable using variational inference: $q(X) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu_n, S_n)$, giving the lower bound $\log p(Y) \geq \int q(X) \log p(Y|X)\, dX - \int q(X) \log \frac{q(X)}{p(X)}\, dX$. Automatic dimensionality reduction is possible using the ARD kernel: $k(x, x') = \sigma_f^2 \exp\left(-\frac{1}{2} \sum_{q=1}^{Q} w_q (x_q - x'_q)^2\right)$. [1] Titsias, M. K. et al., “Bayesian Gaussian Process Latent Variable Model”, in AISTATS 2011.
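
To make the ARD kernel on slide 10 concrete, here is a minimal pure-Python sketch. The function name `ard_kernel` and the example weights are assumptions for illustration; in BGPLVM the weights $w_q$ are learned, not set by hand.

```python
import math


def ard_kernel(x, x_prime, weights, sigma_f=1.0):
    """ARD squared-exponential kernel: sigma_f^2 * exp(-0.5 * sum_q w_q (x_q - x'_q)^2)."""
    weighted_sq_dist = sum(
        w * (a - b) ** 2 for w, a, b in zip(weights, x, x_prime)
    )
    return sigma_f ** 2 * math.exp(-0.5 * weighted_sq_dist)


# Hand-picked weights: the second latent dimension has weight 0,
# so differences along it do not change the kernel value at all.
weights = [1.0, 0.0]
print(ard_kernel([0.0, 0.0], [0.0, 5.0], weights))  # 1.0
print(ard_kernel([0.0, 0.0], [1.0, 0.0], weights))  # exp(-0.5)
```

A latent dimension whose weight is driven towards zero stops influencing the mapping, which is how the kernel prunes the latent dimensionality automatically.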
  • 11. Motor-skills Transfer through Latent Space. A BGPLVM model is trained on robot joint angles $\in \mathbb{R}^{14}$ for kinesthetic demonstration of clothing assistance [1]. [1] Nishanth, K. et al., “Motor-skill Learning in Latent Spaces for Robotic Clothing Assistance”, in RSJ Annual Conference, 2016.
  • 12. Reinforcement Learning in BGPLVM Space. Apply the Cross Entropy Method to perform policy improvement [1]: $\theta^* \sim \mathcal{N}(\theta \mid \mu^*, \Sigma^*)$, with $\mu^* := \mathrm{mean}(\operatorname{argmax} \theta_{old})$ and $\Sigma^* := \mathrm{var}(\operatorname{argmax} \theta_{old})$. Represent the policy using a Dynamic Movement Primitive (DMP): $\tau \ddot{x} = K(g - x) - D\dot{x} + (g - x_0) f$, with $f(s) = \frac{\sum_i w_i \psi_i(s)\, s}{\sum_i \psi_i(s)}$, where $\tau \dot{s} = -\alpha s$. [1] Nishanth, K. et al., “Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Clothing Assistance”, in Workshop on Practical Bayesian Nonparametrics, NIPS, 2016.
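
The Cross Entropy Method update on slide 12 can be illustrated with a small standard-library sketch. Several things here are assumptions rather than the authors' setup: the covariance is taken to be diagonal, `toy_reward` and its target are hypothetical, and a small floor is added to the standard deviation to avoid collapse. The defaults of 50 rollouts and 5 elite samples mirror the numbers quoted on slide 16.

```python
import random
import statistics


def cem_step(reward, mu, sigma, n_rollouts=50, n_elite=5, rng=random):
    """One CEM iteration: sample parameters, keep the best, refit mean and std."""
    samples = [
        [rng.gauss(m, s) for m, s in zip(mu, sigma)] for _ in range(n_rollouts)
    ]
    samples.sort(key=reward, reverse=True)  # highest reward first
    elite = samples[:n_elite]
    new_mu = [statistics.mean(column) for column in zip(*elite)]
    # Small floor keeps the sampling distribution from degenerating (assumption).
    new_sigma = [statistics.pstdev(column) + 1e-6 for column in zip(*elite)]
    return new_mu, new_sigma


def toy_reward(theta):
    # Hypothetical reward: negative squared distance to an assumed optimum.
    target = [1.0, -2.0]
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


rng = random.Random(0)
mu, sigma = [0.0, 0.0], [2.0, 2.0]
for _ in range(30):
    mu, sigma = cem_step(toy_reward, mu, sigma, rng=rng)
print(mu)
```

In the latent-space setting, `theta` would hold the DMP weights defined over the BGPLVM latent dimensions, so the number of sampled parameters shrinks with the latent dimensionality.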
  • 13. Reinforcement Learning in BGPLVM Space. Represent the reward function by the distance of the current policy from the desired via-points: $R(\pi(\theta)) = \sum_{i=1}^{n_{dims}} \sum_{j=1}^{n_{via}} \left\| V_{i,j} - \pi_i(\theta, t_{i,j}) \right\|^2$.
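
The via-point distance on slide 13 can be computed as below. This is a sketch under assumptions: `linear_policy` is a hypothetical placeholder for the DMP trajectory $\pi_i(\theta, t)$, and the via-point values are made up for the example.

```python
def via_point_distance(via_points, policy, theta):
    """Sum over dimensions i and via-points j of (V_ij - pi_i(theta, t_ij))^2.

    via_points[i] is a list of (t_ij, V_ij) pairs for dimension i.
    """
    total = 0.0
    for i, points in enumerate(via_points):
        for t, v in points:
            total += (v - policy(i, theta, t)) ** 2
    return total


def linear_policy(i, theta, t):
    # Hypothetical stand-in for a DMP rollout: dimension i moves linearly in time.
    return theta[i] * t


via_points = [
    [(0.5, 1.0), (1.0, 2.0)],  # dimension 0: two via-points
    [(0.5, -1.0)],             # dimension 1: one via-point
]
print(via_point_distance(via_points, linear_policy, [2.0, -2.0]))  # 0.0
print(via_point_distance(via_points, linear_policy, [0.0, 0.0]))   # 6.0
```

Because this quantity is a distance, a policy search that maximizes reward would use its negative, or equivalently minimize it.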
  • 14. Latent Space Controller for Clothing Tasks [1]. [1] Nishanth, K. et al., “Motor-skill Learning in Latent Spaces for Robotic Clothing Assistance”, in RSJ Annual Conference, 2016.
  • 15. Generalization in Latent Space. Evaluation: reconstruction error of the latent space, measured as RMS error [1]. Dataset: clothing trajectories for 4 postures, with shoulder angle $\in \{65^\circ, 70^\circ, 75^\circ, 80^\circ\}$. [Figure panels: PCA, GPLVM, BGPLVM] [1] Nishanth, K. et al., “Motor-skill Learning in Latent Spaces for Robotic Clothing Assistance”, in RSJ Annual Conference, 2016.
  • 16. Reinforcement Learning in Latent Space. Apply reinforcement learning in different action spaces with the same formulation and reward function [1]. Parameters: $50 \times n_{dims}$ basis functions. CEM: 50 rollouts per iteration. Policy update: 5 best rollouts per iteration. [1] Nishanth, K. et al., “Bayesian Nonparametric Motor-skill Representations for Efficient Learning of Clothing Assistance”, in Workshop on Practical Bayesian Nonparametrics, NIPS, 2016.
  • 17. Moving forward. Immediate goal: latent spaces for robotics applications, with an auto-regressive prior on the latent space to capture task dynamics and an explicit model of human-robot interaction as a constraint. Ambitious goal: combine policy search RL and BGPLVM for non-linear dimensionality reduction and Bayesian, data-efficient learning. [Figures: data-efficient learning [1]; Bayesian inference [1]] [1] Deisenroth, M. P. et al., “Gaussian processes for data-efficient learning in robotics and control”, in IEEE Transactions PAMI, 2015.
  • 19. Topology Coordinates. To approximate a Markov Decision Process, the relationship between the cloth and the subject needs to be observed as fully as possible, while low-dimensional representations are needed to keep learning time short. Topology coordinates were introduced to address both requirements; the concept was proposed by Ho et al. (2009) [1]. Given 2 line segments, the amount of twist (writhe) between them is given by the Gaussian Linking Integral (GLI): $w = \mathrm{GLI}(\gamma_1, \gamma_2) = \frac{1}{4\pi} \int_{\gamma_1} \int_{\gamma_2} \frac{d\gamma_1 \times d\gamma_2 \cdot (\gamma_1 - \gamma_2)}{\| \gamma_1 - \gamma_2 \|^3}$ (1). [1] Ho et al., “Motion Synthesis using Topology Coordinates”, Eurographics 2009.
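
Equation (1) on slide 19 can be approximated numerically for two straight segments. The sketch below uses a plain midpoint rule with `n` sub-intervals per segment; the function name, the discretization and the example segments are assumptions chosen for illustration, not the method used in the cited work.

```python
import math


def gauss_linking_integral(a0, a1, b0, b1, n=200):
    """Midpoint-rule approximation of the GLI between segments a0->a1 and b0->b1."""
    da = [(q - p) / n for p, q in zip(a0, a1)]  # line element of gamma_1
    db = [(q - p) / n for p, q in zip(b0, b1)]  # line element of gamma_2
    # For straight segments the cross product d(gamma_1) x d(gamma_2) is constant.
    cross = (
        da[1] * db[2] - da[2] * db[1],
        da[2] * db[0] - da[0] * db[2],
        da[0] * db[1] - da[1] * db[0],
    )
    total = 0.0
    for i in range(n):
        pa = [p + (i + 0.5) * d for p, d in zip(a0, da)]
        for j in range(n):
            pb = [p + (j + 0.5) * d for p, d in zip(b0, db)]
            r = [x - y for x, y in zip(pa, pb)]  # gamma_1 - gamma_2
            dist = math.sqrt(sum(c * c for c in r))
            total += sum(c * rc for c, rc in zip(cross, r)) / dist ** 3
    return total / (4.0 * math.pi)


# Two perpendicular segments of half-length 10 crossing at unit vertical offset.
w = gauss_linking_integral((-10, 0, 0), (10, 0, 0), (0, -10, 1), (0, 10, 1))
print(w)  # close to -0.455; tends to -0.5 as the segments grow longer

# Parallel segments have zero cross product, hence zero writhe.
print(gauss_linking_integral((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)))
```

The sign of the result encodes the direction of the crossing, and its magnitude grows as the two segments wind more tightly around each other.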
  • 20. Topology Space. The relationship between line segments is defined by the writhe matrix $T_{n \times m}$. Given line segments $S_1$, $S_2$ with $n$ and $m$ links respectively, $T_{n \times m}$ is given by $T_{ij} = \mathrm{GLI}(S_1^i, S_2^j)$. The parameters writhe, center, and density are defined from the writhe matrix and together form the topology space [1]. [1] Ho et al., “Motion Synthesis using Topology Coordinates”, Eurographics 2009.
  • 21. Clothing Assistance Framework [1]: State and Reward. Low-dimensional state representation using topology coordinates [2]. The reward is given by the distance between the final state and the target state: $r_i = -\| s_i^{target} - s_i \|$ ($i = 1, 2, 3$), $r(s) = \sum_{i=1}^{3} \frac{r_i - \mu_i}{\sigma_i}$. [1] Tamei, T. et al., “Reinforcement learning of clothing assistance”, in IEEE-RAS Humanoids 2011. [2] Ho, E. S., et al., “Character synthesis by topology coordinates”, in Computer Graphics Forum 2009.
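  • 22. Combining DR and RL [1]. Policy representation: $a = W(Z^T \Phi) + M\Phi + E\Phi$. Expectation step: posterior distribution over the latent variables, $p_{\theta_{old}}(Z^T \Phi \mid a) = \mathcal{N}\left(C W^T (a - M\Phi),\; C \sigma^2 \operatorname{tr}(\Phi\Phi^T)\right)$, with $C = (\sigma^2 I + W^T W)$. Maximization step: compute gradients with respect to the policy parameters, $\frac{\partial \ln p(a)\, Q_t^{\pi}}{\partial M}$, $\frac{\partial \ln p(a)\, Q_t^{\pi}}{\partial W}$, $\frac{\partial \ln p(a)\, Q_t^{\pi}}{\partial \sigma^2}$. [1] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014.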
  • 23. DR as Preprocessing for RL. Bitzer et al. (2010) [1]: a GPLVM-based latent space encoding task space constraints. Non-linear dimensionality reduction. Data-efficient learning with the GP mapping. Value-function reinforcement learning (TD(0)) applied to the tractable search space. [1] Bitzer, S. et al., “Using dimensionality reduction in reinforcement learning”, in IEEE/RSJ IROS, 2010.
  • 24. Combining DR and RL. Luck et al. (2014) [1]: joint learning of the latent space and the optimal policy, $a = W(Z^T \Phi) + M\Phi + E\Phi$ (2). PePPER: an Expectation-Maximization formulation based on a KL-divergence lower bound. Probabilistic PCA is used as the model for learning the latent space. [1] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014.
  • 25. Combining DR and RL [1]. Inverse kinematics: planning in the joint angle space of a highly redundant robot (20 DOF). Standing on one leg: applied to a full humanoid robot, with the policy learned from scratch. [1] Luck, K. S. et al., “Latent space policy search for robotics”, in IEEE/RSJ IROS, 2014.
  • 26. Discussion. Robotic clothing assistance involves several problems. We propose the use of DR with RL for efficient motor-skills learning. Future work: implement the latent space RL framework for the clothing assistance framework; combine real-time state estimation with the motor-skills learning framework.
  • 27. References
      Tamei, Tomoya, et al. “Reinforcement learning of clothing assistance with a dual-arm robot.” Humanoid Robots (Humanoids), 2011 11th IEEE-RAS International Conference on. IEEE, 2011.
      Ho, Edmond SL, and Taku Komura. “Character motion synthesis by topology coordinates.” Computer Graphics Forum. Vol. 28. No. 2. Blackwell Publishing Ltd, 2009.
      Pohl, William F. “The self-linking number of a closed space curve (Gauss integral formula treated for disjoint closed space curves linking number).” Journal of Mathematics and Mechanics 17 (1968): 975-985.
      Miyamoto, Hiroyuki, et al. “A kendama learning robot based on bi-directional theory.” Neural Networks 9.8 (1996): 1281-1302.
      Koganti, Nishanth, et al. “Cloth dynamics modeling in latent spaces and its application to robotic clothing assistance.” Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015.
      Deisenroth, Marc Peter, Dieter Fox, and Carl Edward Rasmussen. “Gaussian processes for data-efficient learning in robotics and control.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 37.2 (2015): 408-423.
      Levine, Sergey, et al. “End-to-end training of deep visuomotor policies.” arXiv preprint arXiv:1504.00702 (2015).