[course site]
Attention Models
Day 3 Lecture 6
#DLUPC
Amaia Salvador
amaia.salvador@upc.edu
PhD Candidate
Universitat Politècnica de Catalunya
Attention Models: Motivation
Image:
H x W x 3
bird
The whole input volume is used to predict the output...
...despite the fact that not all pixels are equally important
2
Attention Models: Motivation
3
A bird flying over a body of water
Attend to different parts of the input to optimize a certain output
Case study: Image Captioning
Previously D3L5: Image Captioning
4
Multimodal Recurrent Neural Network: only takes into account image features in the first hidden state
Karpathy and Fei-Fei. "Deep visual-semantic alignments for generating image descriptions." CVPR 2015
LSTM Decoder for Image Captioning
[Figure: a CNN encodes the image into D-dimensional features; a chain of LSTM units decodes them into "A bird flying ... <EOS>"]
5
Vinyals et al. Show and tell: A neural image caption generator. CVPR 2015
Limitation: All output predictions are based on the final and static output of the encoder
Attention for Image Captioning
CNN
Image:
H x W x 3
6
Attention for Image Captioning
CNN
Image:
H x W x 3
Features f:
L x D
h0
7
a1: attention weights (L, one per location)
y1: predicted word
c0: first context vector is the average of the features
y0: first word (<start> token)
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
c1
Visual features weighted with attention give the next context vector
y1
h1
a2 y2
8
a1 y1
c0 y0
Predicted word in previous timestep
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
c1 y1
h1
a2 y2
h2
a3 y3
c2 y2
9
a1 y1
c0 y0
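The decoding loop drawn on these slides can be sketched in a few lines of NumPy. This is only an illustration, not the model from the cited papers: a plain tanh recurrence stands in for the LSTM, all weights are random, attention scores are read directly from the hidden state, and the sizes L, D, H, V as well as the <start> index 0 are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
# Hypothetical sizes: locations, feature dim, hidden size, vocabulary size.
L, D, H, V = 6, 8, 16, 10

f = rng.normal(size=(L, D))                    # CNN features, one D-dim vector per location
W_h = rng.normal(size=(H, H + D + V)) * 0.1    # toy recurrence (stand-in for the LSTM)
W_a = rng.normal(size=(L, H)) * 0.1            # hidden state -> scores over the L locations
W_y = rng.normal(size=(V, H)) * 0.1            # hidden state -> word scores

def step(h, c, y_prev):
    """One decoder step: read context c and previous word, emit attention and a word."""
    x = np.concatenate([h, c, np.eye(V)[y_prev]])
    h = np.tanh(W_h @ x)
    a = softmax(W_a @ h)                       # attention weights, sum to 1
    y = int(np.argmax(W_y @ h))                # greedy word choice
    return h, a, y

h = np.zeros(H)
c = f.mean(axis=0)      # c0: the first context vector is the average of the features
y = 0                   # y0: index of the <start> token (assumed)
words, attns = [], []
for t in range(3):
    h, a, y = step(h, c, y)
    c = a @ f           # next context: features weighted by the attention just predicted
    words.append(y)
    attns.append(a)
```

Each pass through the loop corresponds to one column of the diagram: (c, y) go in, (a, y) come out, and the new attention produces the context for the following step.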
Attention for Image Captioning
Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
10
Attention for Image Captioning
Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
11
Attention for Image Captioning
Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
12
Some outputs can probably be predicted without looking at the image...
Attention for Image Captioning
Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
13
Some outputs can probably be predicted without looking at the image...
Attention for Image Captioning
14
Can we focus on the image only when necessary?
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
c1 y1
h1
a2 y2
h2
a3 y3
c2 y2
15
a1 y1
c0 y0
“Regular” spatial attention
Attention for Image Captioning
CNN
Image:
H x W x 3 c1 y1
a2 y2 a3 y3
c2 y2
16
a1 y1
c0 y0
Attention with sentinel: LSTM is modified to output a “non-visual” feature to attend to
s0 h0 s1 h1 s2 h2
Lu et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning. CVPR 2017
Attention for Image Captioning
Lu et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning. CVPR 2017
17
Attention weights indicate when it’s more important to look at the image features, and when it’s better to rely on the current LSTM state
(a has L + 1 entries: one per image location, plus a last one for the sentinel)
If sum(a[0:L]) > a[L]: image features are needed for the final decision
Else: RNN state is enough to predict the next word
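A small sketch of the rule above, assuming the attention vector holds one weight per image location followed by a final weight for the sentinel. The function name, the shapes and the raw scores are hypothetical; in the paper the scores come from learned layers that are not reproduced here.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def adaptive_context(features, sentinel, scores):
    """features: (L, D) visual features; sentinel: (D,) non-visual feature from the LSTM;
    scores: (L + 1,) unnormalized scores, the last one belonging to the sentinel."""
    a = softmax(scores)                            # L visual weights + 1 sentinel weight
    candidates = np.vstack([features, sentinel])   # (L + 1, D)
    context = a @ candidates                       # mix of visual and non-visual information
    looks_at_image = bool(a[:-1].sum() > a[-1])    # the If/Else rule from the slide
    return context, a, looks_at_image

feats = np.ones((4, 3))
sent = np.zeros(3)
_, a_text, look_text = adaptive_context(feats, sent, np.array([0.0, 0.0, 0.0, 0.0, 5.0]))
_, a_img, look_img = adaptive_context(feats, sent, np.array([2.0, 2.0, 2.0, 2.0, 0.0]))
```

With a high sentinel score the decoder leans on its own state (look_text is False); with high visual scores it attends to the image (look_img is True).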
Soft Attention
Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
CNN
Image:
H x W x 3
Grid of features (each D-dimensional):
a b
c d
Distribution over grid locations (from RNN): p_a, p_b, p_c, p_d, with p_a + p_b + p_c + p_d = 1
Soft attention: summarize ALL locations
z = p_a a + p_b b + p_c c + p_d d
Context vector z (D-dimensional)
Derivative dz/dp is nice!
Train with gradient descent
Slide Credit: CS231n
18
Soft Attention
Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
CNN
Image:
H x W x 3
Grid of features (each D-dimensional):
a b
c d
Distribution over grid locations (from RNN): p_a, p_b, p_c, p_d, with p_a + p_b + p_c + p_d = 1
Soft attention: summarize ALL locations
z = p_a a + p_b b + p_c c + p_d d
Context vector z (D-dimensional)
Differentiable function
Train with gradient descent
Slide Credit: CS231n
● Still uses the whole input!
● Constrained to a fixed grid
19
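As a worked example of the soft-attention summary, the sketch below uses a 2 x 2 grid with made-up 2-dimensional features and a made-up distribution p. It also checks numerically that dz/dp is simply the feature matrix, which is why the operation trains with gradient descent. All values are illustrative assumptions.

```python
import numpy as np

# Hypothetical D = 2 features for the four grid cells a, b, c, d.
features = np.array([[1.0, 0.0],    # a
                     [0.0, 1.0],    # b
                     [1.0, 1.0],    # c
                     [2.0, 2.0]])   # d
p = np.array([0.1, 0.2, 0.3, 0.4])  # distribution over locations, sums to 1

def soft_attention(p, feats):
    """z = p_a a + p_b b + p_c c + p_d d, written as a vector-matrix product."""
    return p @ feats

z = soft_attention(p, features)     # context vector, D-dimensional

# z is linear in p, so the Jacobian dz/dp is the transposed feature matrix.
analytic = features.T               # shape (D, 4)
eps = 1e-6
numeric = np.stack(
    [(soft_attention(p + eps * np.eye(4)[k], features) - z) / eps for k in range(4)],
    axis=1,
)
```

Here z evaluates to [1.2, 1.3], and the finite-difference Jacobian matches the analytic one.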
Hard Attention
Input image:
H x W x 3
Box Coordinates:
(xc, yc, w, h)
Cropped and
rescaled image:
X x Y x 3
Not a differentiable function !
Can’t train with backprop :(
20
Hard attention:
Sample a subset
of the input
Need other optimization strategies
e.g.: reinforcement learning
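One way to picture hard attention is to sample a single grid location instead of averaging over all of them. The sketch below does that and adds a score-function (REINFORCE-style) gradient with respect to the logits, as one example of the "other optimization strategies" mentioned above. The features, logits and reward are made up, and this is not the training procedure of any specific paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def hard_attention(logits, feats, rng):
    """Sample ONE location k ~ p and return only its feature vector."""
    p = softmax(logits)
    k = rng.choice(len(p), p=p)     # discrete sampling: no gradient flows through this
    return feats[k], k, p

def reinforce_grad(p, k, reward):
    """Score-function estimate: reward * d log p_k / d logits = reward * (onehot_k - p)."""
    onehot = np.zeros_like(p)
    onehot[k] = 1.0
    return reward * (onehot - p)

rng = np.random.default_rng(0)
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
logits = np.array([0.0, 0.5, 1.0, 1.5])

z, k, p = hard_attention(logits, feats, rng)
g = reinforce_grad(p, k, reward=1.0)

# Averaged over many samples, the hard context approaches the soft-attention context.
samples = np.array([hard_attention(logits, feats, rng)[0] for _ in range(20000)])
soft = softmax(logits) @ feats
```

A single sample uses only part of the input, but the estimator is noisy, which is the usual price of hard attention.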
Spatial Transformer Networks
Input image:
H x W x 3
Box Coordinates:
(xc, yc, w, h)
Cropped and
rescaled image:
X x Y x 3
CNN
bird
Jaderberg et al. Spatial Transformer Networks. NIPS 2015
Not a differentiable function !
Can’t train with backprop :(
Make it differentiable
Train with backprop :)
21
Spatial Transformer Networks
Jaderberg et al. Spatial Transformer Networks. NIPS 2015
Input image: H x W x 3
Cropped and rescaled image: X x Y x 3
Can we make this function differentiable?
Idea: Function mapping pixel coordinates (xt, yt) of output to pixel coordinates (xs, ys) of input
Repeat for all pixels in output
Network attends to input by predicting
Mapping given by box coordinates (translation + scale)
Slide Credit: CS231n
22
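The coordinate mapping can be sketched as follows. The example assumes image coordinates normalized to [-1, 1] and a box (xc, yc, w, h) in the same units, and it uses bilinear interpolation to read the input at the mapped positions. The function names are hypothetical; this is a NumPy illustration of the idea, not the reference implementation.

```python
import numpy as np

def box_to_grid(xc, yc, w, h, out_h, out_w):
    """For every output pixel (xt, yt) in [-1, 1], compute the source position (xs, ys)."""
    yt, xt = np.meshgrid(np.linspace(-1, 1, out_h),
                         np.linspace(-1, 1, out_w), indexing="ij")
    xs = xc + 0.5 * w * xt      # translation + scale along x
    ys = yc + 0.5 * h * yt      # translation + scale along y
    return xs, ys

def bilinear_sample(img, xs, ys):
    """Read img (H, W, C) at normalized positions using bilinear interpolation."""
    H, W = img.shape[:2]
    px = np.clip((xs + 1) * 0.5 * (W - 1), 0, W - 1)   # normalized -> pixel units
    py = np.clip((ys + 1) * 0.5 * (H - 1), 0, H - 1)
    x0 = np.floor(px).astype(int)
    y0 = np.floor(py).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (px - x0)[..., None]   # fractional offsets act as interpolation weights
    wy = (py - y0)[..., None]
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom

def crop(img, box, out_h, out_w):
    xs, ys = box_to_grid(*box, out_h, out_w)
    return bilinear_sample(img, xs, ys)

img = np.arange(5 * 5 * 3, dtype=float).reshape(5, 5, 3)
same = crop(img, (0.0, 0.0, 2.0, 2.0), 5, 5)     # the full-image box reproduces the input
patch = crop(img, (0.5, -0.5, 1.0, 1.0), 3, 3)   # upper-right region, rescaled to 3 x 3
```

Because every output value is a weighted sum of input pixels whose weights vary continuously with the box parameters, gradients can reach (xc, yc, w, h), unlike a crop made with integer indexing.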
Spatial Transformer Networks
Jaderberg et al. Spatial Transformer Networks. NIPS 2015
Easy to incorporate in any network, anywhere!
Differentiable module
Insert spatial transformers into a classification network and it learns to attend and transform the input
23
Spatial Transformer Networks
Jaderberg et al. Spatial Transformer Networks. NIPS 2015
24
Fine-grained classification
Also used as an alternative to RoI pooling in proposal-based detection & segmentation pipelines
Deformable Convolutions
Dai, Qi, Xiong, Li, Zhang et al. Deformable Convolutional Networks. arXiv Mar 2017
25
Dynamic & learnable receptive field
Resources
26
Seq2seq implementations with attention:
● TensorFlow
● PyTorch
Spatial Transformers
● TensorFlow
● Coming soon to PyTorch (thread here)
Deformable Convolutions
● MXNet (Original)
● TensorFlow / Keras (slow)
● [WIP] PyTorch
Questions?
Attention Mechanism
28
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
The vector to be fed to the RNN at each timestep is a
weighted sum of all the annotation vectors.
Attention Mechanism
29
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
An attention weight (scalar) is predicted at each time-step for each annotation vector h_j with a simple fully connected neural network.
Inputs: annotation vector h_1 and recurrent state z_i. Output: attention weight (a_1)
Attention Mechanism
30
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
An attention weight (scalar) is predicted at each time-step for each annotation vector h_j with a simple fully connected neural network.
Inputs: annotation vector h_2 and recurrent state z_i. Output: attention weight (a_2)
Shared for all j
Attention Mechanism
31
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
Once a relevance score (weight) is estimated for each word, they are normalized
with a softmax function so they sum up to 1.
Attention Mechanism
32
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
Finally, a context-aware representation c_{i+1} for the output word at timestep i can be defined as the weighted sum of the annotation vectors:
c_{i+1} = \sum_j a_j h_j
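The three steps of this mechanism (score each annotation vector, normalize with a softmax, take the weighted sum) can be sketched as below. The one-hidden-layer scoring network, its sizes and its random parameters W and v are assumptions chosen for illustration; the slides only state that a simple fully connected network, shared for all j, produces the scores.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_context(annotations, state, W, v):
    """annotations: (T, Dh) vectors h_j; state: (Dz,) recurrent state z_i.
    W: (K, Dh + Dz) and v: (K,) form a small scoring network shared for all j."""
    T = annotations.shape[0]
    inputs = np.hstack([annotations, np.tile(state, (T, 1))])   # pair every h_j with z_i
    scores = np.tanh(inputs @ W.T) @ v    # one scalar relevance score per annotation
    a = softmax(scores)                   # normalized so the weights sum to 1
    context = a @ annotations             # weighted sum of all annotation vectors
    return context, a

rng = np.random.default_rng(0)
T, Dh, Dz, K = 5, 4, 3, 8                 # hypothetical sizes
h = rng.normal(size=(T, Dh))
z_i = rng.normal(size=Dz)
W = rng.normal(size=(K, Dh + Dz))
v = rng.normal(size=K)

c_next, a = attention_context(h, z_i, W, v)
```

Since the weights are non-negative and sum to 1, the context vector always lies between the smallest and largest annotation value in each dimension.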
Attention Mechanism
33
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
The model automatically finds the correspondence structure between two languages
(alignment).
(Edge thicknesses represent the attention weights found by the attention model)
Attention Models
Attend to different parts of the input to optimize a certain output
34
Attention Models
35
Chan et al. Listen, Attend and Spell. ICASSP 2016
Source: distill.pub
Input: Audio features; Output: Text
Attend to different parts of the input to optimize a certain output
Attention for Image Captioning
36
Side-note: attention can be computed with previous or current hidden state
CNN
Image:
H x W x 3
h1
v y1
h2 h3
v y2
a1
y1
v y0 (average)
c1
a2
y2
c2
a3
y3
c3
Attention for Image Captioning
37
Attention with sentinel: LSTM is modified to output a “non-visual” feature to attend to
CNN
Image:
H x W x 3 v y1 v y2
a1
y1
v y0 (average)
c1
a2
y2
c2
a3
y3
c3
s1 h1 s2 h2 s3 h3
Semantic Attention: Image Captioning
You et al. Image Captioning with Semantic Attention. CVPR 2016
38
Visual Attention: Saliency Detection
Kuen et al. Recurrent Attentional Networks for Saliency Detection. CVPR 2016
39
Visual Attention: Fixation Prediction
Cornia et al. Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model.
40

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...gajnagarg
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...amitlee9823
 
Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...gajnagarg
 
Call Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night Standamitlee9823
 
âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...amitlee9823
 
Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...gajnagarg
 

KĂŒrzlich hochgeladen (20)

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Just Call Vip call girls Bellary Escorts ☎9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎9352988975 Two shot with one girl ...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls In Hsr Layout ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 đŸ„” Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 đŸ„” Book Your One night Stand
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎9352988975 Two shot with one girl (...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
âž„đŸ” 7737669865 đŸ”â–» Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
âž„đŸ” 7737669865 đŸ”â–» Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎9352988975 Two shot with one girl...
 
Call Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 đŸ„” Book Your One night Stand
 
âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
âž„đŸ” 7737669865 đŸ”â–» Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
 
Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎9352988975 Two shot with one girl (E...
 

Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)

  • 1. [course site] Attention Models. Day 3 Lecture 6. #DLUPC. Amaia Salvador (amaia.salvador@upc.edu), PhD Candidate, Universitat Politècnica de Catalunya
  • 2. Attention Models: Motivation. Image (H x W x 3) → "bird". The whole input volume is used to predict the output, despite the fact that not all pixels are equally important.
  • 3. Attention Models: Motivation. "A bird flying over a body of water". Attend to different parts of the input to optimize a certain output. Case study: Image Captioning.
  • 4. Previously, in D3L5: Image Captioning. The Multimodal Recurrent Neural Network only takes into account image features in the first hidden state. Karpathy and Fei-Fei. "Deep visual-semantic alignments for generating image descriptions." CVPR 2015
  • 5. LSTM Decoder for Image Captioning. Diagram: a CNN encodes the image into features of dimension D, and a chain of LSTM cells decodes them into "A bird flying ... <EOS>". Limitation: all output predictions are based on the final and static output of the encoder. Vinyals et al. Show and tell: A neural image caption generator. CVPR 2015
  • 6. Attention for Image Captioning. Diagram: Image (H x W x 3) → CNN.
  • 7. Attention for Image Captioning. Diagram: Image (H x W x 3) → CNN → Features f: L x D. The hidden state h0 takes c0 (the first context vector is the average) and y0 (first word, <start> token), and outputs a1 (attention weights, LxD) and y1 (predicted word).
  • 8. Attention for Image Captioning. Diagram: visual features weighted with attention give the next context vector c1. The hidden state h1 takes c1 and y1 (the word predicted in the previous timestep) and outputs a2 and y2.
  • 9. Attention for Image Captioning. Diagram: the recurrence continues; h2 takes c2 and y2 and outputs a3 and y3.
  • 10. Attention for Image Captioning. Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
  • 11. Attention for Image Captioning. Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
  • 12. Attention for Image Captioning. Some outputs can probably be predicted without looking at the image... Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
  • 13. Attention for Image Captioning. Some outputs can probably be predicted without looking at the image... Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015
  • 14. Attention for Image Captioning. Can we focus on the image only when necessary?
  • 15. Attention for Image Captioning. "Regular" spatial attention. Diagram: Image (H x W x 3) → CNN; hidden states h0, h1, h2 take (c0, y0), (c1, y1), (c2, y2) and output (a1, y1), (a2, y2), (a3, y3).
  • 16. Attention for Image Captioning. Attention with sentinel: the LSTM is modified to output a "non-visual" feature to attend to. Diagram: the same recurrence, with each step also producing a sentinel (s0, s1, s2) next to its hidden state (h0, h1, h2). Lu et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning. CVPR 2017
  • 17. Attention for Image Captioning. Attention weights indicate when it's more important to look at the image features, and when it's better to rely on the current LSTM state. If sum(a[0:LxD]) > a[LxD], image features are needed for the final decision; else, the RNN state is enough to predict the next word. Lu et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning. CVPR 2017
  • 18. Soft Attention. Image (H x W x 3) → CNN → grid of features a, b, c, d (each D-dimensional). From the RNN: a distribution over grid locations pa, pb, pc, pd, with pa + pb + pc + pd = 1. Soft attention: summarize ALL locations into the context vector z (D-dimensional): z = pa a + pb b + pc c + pd d. Derivative dz/dp is nice! Train with gradient descent. Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015. Slide Credit: CS231n
  • 19. Soft Attention. Image (H x W x 3) → CNN → grid of features a, b, c, d (each D-dimensional). From the RNN: a distribution over grid locations pa, pb, pc, pd, with pa + pb + pc + pd = 1. Soft attention: summarize ALL locations into the context vector z (D-dimensional): z = pa a + pb b + pc c + pd d. Differentiable function. Train with gradient descent. ● Still uses the whole input! ● Constrained to a fixed grid. Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015. Slide Credit: CS231n
  • 20. Hard Attention. Input image: H x W x 3. Box coordinates: (xc, yc, w, h). Cropped and rescaled image: X x Y x 3. Hard attention: sample a subset of the input. Not a differentiable function! Can't train with backprop :( Need other optimization strategies, e.g. reinforcement learning.
  • 21. Spatial Transformer Networks. Input image: H x W x 3. Box coordinates: (xc, yc, w, h). Cropped and rescaled image: X x Y x 3 → CNN → "bird". Not a differentiable function! Can't train with backprop :( → Make it differentiable. Train with backprop :) Jaderberg et al. Spatial Transformer Networks. NIPS 2015
  • 22. Spatial Transformer Networks. Input image: H x W x 3. Cropped and rescaled image: X x Y x 3. Can we make this function differentiable? Idea: a function mapping pixel coordinates (xt, yt) of the output to pixel coordinates (xs, ys) of the input. Repeat for all pixels in the output. The network attends to the input by predicting the mapping, given by box coordinates (translation + scale). Jaderberg et al. Spatial Transformer Networks. NIPS 2015. Slide Credit: CS231n
  • 23. Spatial Transformer Networks. Differentiable module. Easy to incorporate in any network, anywhere! Insert spatial transformers into a classification network and it learns to attend and transform the input. Jaderberg et al. Spatial Transformer Networks. NIPS 2015
  • 24. Spatial Transformer Networks. Fine-grained classification. Also used as an alternative to RoI pooling in proposal-based detection & segmentation pipelines. Jaderberg et al. Spatial Transformer Networks. NIPS 2015
  • 25. Deformable Convolutions. Dynamic & learnable receptive field. Dai, Qi, Xiong, Li, Zhang et al. Deformable Convolutional Networks. arXiv Mar 2017
  • 26. Resources. Seq2seq implementations with attention: ● Tensorflow ● Pytorch. Spatial Transformers: ● Tensorflow ● Coming soon to Pytorch (thread here). Deformable Convolutions: ● MXNet (Original) ● Tensorflow / Keras (slow) ● [WIP] PyTorch
  • 28. Attention Mechanism. The vector to be fed to the RNN at each timestep is a weighted sum of all the annotation vectors. Kyunghyun Cho, "Introduction to Neural Machine Translation with GPUs" (2015)
  • 29. Attention Mechanism. An attention weight (scalar) is predicted at each time-step for each annotation vector hj with a simple fully connected neural network. Inputs: annotation vector h1 and recurrent state zi. Output: attention weight a1. Kyunghyun Cho, "Introduction to Neural Machine Translation with GPUs" (2015)
  • 30. Attention Mechanism. An attention weight (scalar) is predicted at each time-step for each annotation vector hj with a simple fully connected neural network, shared for all j. Inputs: annotation vector h2 and recurrent state zi. Output: attention weight a2. Kyunghyun Cho, "Introduction to Neural Machine Translation with GPUs" (2015)
  • 31. Attention Mechanism. Once a relevance score (weight) is estimated for each word, the scores are normalized with a softmax function so they sum up to 1. Kyunghyun Cho, "Introduction to Neural Machine Translation with GPUs" (2015)
  • 32. Attention Mechanism. Finally, a context-aware representation ci+1 for the output word at timestep i can be defined as the sum of the annotation vectors hj weighted by their attention weights aj. Kyunghyun Cho, "Introduction to Neural Machine Translation with GPUs" (2015)
  • 33. Attention Mechanism. The model automatically finds the correspondence structure between two languages (alignment). (Edge thicknesses represent the attention weights found by the attention model.) Kyunghyun Cho, "Introduction to Neural Machine Translation with GPUs" (2015)
  • 34. Attention Models. Attend to different parts of the input to optimize a certain output.
  • 35. Attention Models. Attend to different parts of the input to optimize a certain output. Input: audio features; Output: text. Chan et al. Listen, Attend and Spell. ICASSP 2016. Source: distill.pub
  • 36. Attention for Image Captioning. Side-note: attention can be computed with the previous or the current hidden state. Diagram labels: Image (H x W x 3) → CNN → features v (with their average at the first step); input words y0, y1, y2; hidden states h1, h2, h3; attention weights a1, a2, a3; context vectors c1, c2, c3; predicted words y1, y2, y3.
  • 37. Attention for Image Captioning. Attention with sentinel: the LSTM is modified to output a "non-visual" feature to attend to. Diagram labels: same as the previous slide, with a sentinel (s1, s2, s3) next to each hidden state (h1, h2, h3).
  • 38. Semantic Attention: Image Captioning. You et al. Image Captioning with Semantic Attention. CVPR 2016
  • 39. Visual Attention: Saliency Detection. Kuen et al. Recurrent Attentional Networks for Saliency Detection. CVPR 2016
  • 40. Visual Attention: Fixation Prediction. Cornia et al. Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model.
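
The attention mechanism of slides 28 to 32 (and the soft attention of slides 18 and 19) can be sketched in NumPy as follows. This is a minimal illustration, not the lecture's code: the one-hidden-layer tanh scorer and the names `W_h`, `W_z`, `v` and `additive_attention` are assumptions, since the slides only say that a "simple fully connected neural network", shared across all annotation vectors, predicts each scalar weight.

```python
import numpy as np


def softmax(x):
    """Normalize scores so they are positive and sum to 1."""
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()


def additive_attention(annotations, state, W_h, W_z, v):
    """Soft attention over T annotation vectors.

    annotations: (T, D) annotation vectors h_j (or image grid features)
    state:       (S,)   recurrent state z_i
    W_h, W_z, v: parameters of the (assumed) scoring network,
                 shapes (D, A), (S, A) and (A,)
    """
    # The same small network scores every annotation vector (shared for all j).
    hidden = np.tanh(annotations @ W_h + state @ W_z)  # (T, A)
    scores = hidden @ v                                # (T,) one scalar per h_j
    weights = softmax(scores)                          # attention weights a_j
    context = weights @ annotations                    # weighted sum, shape (D,)
    return weights, context


rng = np.random.default_rng(0)
T, D, S, A = 5, 8, 6, 4
annotations = rng.normal(size=(T, D))
state = rng.normal(size=S)
W_h = rng.normal(size=(D, A))
W_z = rng.normal(size=(S, A))
v = rng.normal(size=A)

weights, context = additive_attention(annotations, state, W_h, W_z, v)
print("attention weights:", weights)
print("context vector shape:", context.shape)
```

With all scores equal (for example `v` set to zeros), the weights become uniform and the context vector reduces to the average of the annotation vectors, which matches the "first context vector is the average" note on slide 7.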
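
The decision rule on slide 17 (attention with a visual sentinel) can be illustrated with a short sketch. It assumes one attention weight per spatial location plus one final weight for the sentinel, all normalized by a single softmax; the function name `adaptive_context` and the toy inputs are hypothetical.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def adaptive_context(features, sentinel, scores):
    """Mix image features and a non-visual sentinel vector.

    features: (L, D)  visual features, one per spatial location
    sentinel: (D,)    "non-visual" vector produced by the LSTM
    scores:   (L+1,)  unnormalized scores; the last one is for the sentinel
    """
    a = softmax(scores)
    beta = a[-1]                                    # weight given to the sentinel
    context = a[:-1] @ features + beta * sentinel   # (D,)
    # Slide 17: if the visual weights together outweigh the sentinel weight,
    # the image is needed; otherwise the RNN state is enough.
    needs_image = bool(a[:-1].sum() > beta)
    return context, beta, needs_image


features = np.eye(3)                      # toy example: L = 3 locations, D = 3
sentinel = np.array([5.0, 5.0, 5.0])

_, beta_hi, look_hi = adaptive_context(features, sentinel, np.array([0.0, 0.0, 0.0, 20.0]))
_, beta_lo, look_lo = adaptive_context(features, sentinel, np.array([0.0, 0.0, 0.0, -20.0]))
print("sentinel dominates:", beta_hi, "look at image?", look_hi)
print("image dominates:   ", beta_lo, "look at image?", look_lo)
```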
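
The coordinate mapping on slide 22 can be sketched as a forward pass in NumPy. This is an illustration under stated assumptions rather than the Spatial Transformer implementation: the box `(xc, yc, w, h)` is taken to be in pixel units, output coordinates are normalized to [-1, 1], and bilinear interpolation is used to read the input at non-integer positions. The function name `crop_and_rescale` is hypothetical.

```python
import numpy as np


def crop_and_rescale(image, box, out_h, out_w):
    """Sample an (out_h, out_w) crop from `image` using a box mapping.

    image: (H, W, C) float array
    box:   (xc, yc, w, h) box center and size in input pixel units (assumed)
    """
    H, W = image.shape[:2]
    xc, yc, w, h = box

    # Output pixel coordinates (xt, yt), normalized to [-1, 1].
    xt = np.linspace(-1.0, 1.0, out_w)
    yt = np.linspace(-1.0, 1.0, out_h)

    # Translation + scale: map each output coordinate to an input coordinate.
    xs = np.clip(xc + 0.5 * w * xt, 0, W - 1)
    ys = np.clip(yc + 0.5 * h * yt, 0, H - 1)

    # Bilinear interpolation between the four neighbouring input pixels.
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (xs - x0)[None, :, None]   # horizontal blend factor, (1, out_w, 1)
    wy = (ys - y0)[:, None, None]   # vertical blend factor, (out_h, 1, 1)

    top = (1 - wx) * image[y0][:, x0] + wx * image[y0][:, x1]
    bottom = (1 - wx) * image[y1][:, x0] + wx * image[y1][:, x1]
    return (1 - wy) * top + wy * bottom


# Toy image whose value encodes its own coordinates: value = x + 10 * y.
H, W = 5, 7
yy, xx = np.mgrid[0:H, 0:W]
image = (xx + 10.0 * yy)[..., None]

crop = crop_and_rescale(image, box=(3.0, 2.0, 2.0, 2.0), out_h=5, out_w=5)
print(crop[..., 0])
```

Because each output value is a weighted blend of input pixels, with blend factors that depend continuously on the box coordinates, gradients can flow back to those coordinates, unlike the integer crop used by hard attention on slide 20.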