Visual Question Answering 2.0
Slides by Francisco Roldán
BSc Thesis, UPC
12th July, 2017
Author: Francisco Roldán Sánchez
Advisors: Xavier Giró-i-Nieto, Santiago Pascual de la Puente, Issey Masuda Mora
Roadmap
1. Introduction
2. Related Work
3. Methodology
4. Results
5. Conclusion
6. Future Work
1. Introduction
Visual Question Answering
[Figure: AI System]
Why VQA?
● Multidisciplinary task.
● Models need to tackle different sub-tasks at once.
[Figure: Natural Language Processing, Knowledge Representation, Computer Vision]
VQA Challenge
Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C. L., & Parikh, D. (2015). VQA: Visual Question Answering. ICCV.
2. Related Work
VQA: Common solution
Convolutional Neural Networks (CNN)
Word embeddings
Recurrent Neural Networks (RNN)
3. Methodology
VQA Dataset 2.0
Goyal, Y., Khot, T., Summers-Stay, D., Batra, D., & Parikh, D. (2016). Making the V in VQA matter: Elevating the role of image
understanding in Visual Question Answering. arXiv preprint arXiv:1612.00837.
VQA Dataset 2.0 Population
            Train       Validation  Test
Images      82,783      40,504      81,434
Questions   443,757     214,354     447,793
Answers     4,437,570   2,143,540   4,477,930
Goyal, Y., Khot, T., Summers-Stay, D., Batra, D., & Parikh, D. (2016). Making the V in VQA matter: Elevating the role of image
understanding in Visual Question Answering. arXiv preprint arXiv:1612.00837.
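The population figures for VQA Dataset 2.0 are consistent with ten collected answers per question in every split. A minimal sketch that checks this arithmetic, using only the numbers from the table (the dictionary names are illustrative, not from the thesis):

```python
# Question and answer counts copied from the VQA Dataset 2.0 population table.
questions = {"train": 443_757, "validation": 214_354, "test": 447_793}
answers = {"train": 4_437_570, "validation": 2_143_540, "test": 4_477_930}

# Answers per question for each split (integer division is exact here).
answers_per_question = {
    split: answers[split] // questions[split] for split in questions
}

for split, ratio in answers_per_question.items():
    print(f"{split}: {ratio} answers per question")
```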
Evaluation Metric
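The formula on this slide did not survive extraction. As a hedged sketch, the snippet below assumes the standard VQA accuracy from Antol et al. (2015): an answer counts as fully correct if at least three human annotators gave it, and gets partial credit otherwise. The function name and the lowercase normalisation are assumptions for illustration; the official evaluation script applies more answer normalisation and averages over subsets of the human answers.

```python
from collections import Counter


def vqa_accuracy(predicted, human_answers):
    """Simplified VQA accuracy: min(#humans who gave that answer / 3, 1)."""
    # Count how many annotators gave each (lightly normalised) answer.
    counts = Counter(a.strip().lower() for a in human_answers)
    matches = counts[predicted.strip().lower()]
    return min(matches / 3.0, 1.0)


# Hypothetical question with ten human answers.
human = ["yes"] * 2 + ["no"] * 8
for candidate in ("no", "yes", "maybe"):
    print(candidate, vqa_accuracy(candidate, human))
```

Under this metric a prediction does not have to match the majority answer to earn credit, which softens the effect of annotator disagreement.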
Loss Function
UPC 2016 Model
Yes/No  Number  Other  Overall
66.05   29.77   20.35  40.25
Masuda, I., de la Puente, S. P., & Giro-i-Nieto, X. (2016). Open-Ended Visual Question-Answering. arXiv preprint arXiv:1610.02692.
Language-Only Model
Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global vectors for word representation. EMNLP.
Freezing or fine-tuning GloVe embeddings:

             Yes/No  Number  Other  Overall
Fine-tuning  66.14   30.42   24.47  42.31
Freezing     66.15   31.17   24.87  42.59
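The difference between the two settings can be sketched as follows. This is a minimal illustration, not the thesis training code: a plain SGD step in which any parameter group listed as frozen is left untouched. The parameter names, values, and the `sgd_step` helper are hypothetical.

```python
def sgd_step(params, grads, lr, frozen=()):
    """Return updated parameters; groups named in `frozen` are not changed."""
    updated = {}
    for name, values in params.items():
        if name in frozen:
            # Frozen: keep the pretrained values (e.g. GloVe vectors) as they are.
            updated[name] = list(values)
        else:
            # Trainable: move against the gradient.
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated


# Hypothetical toy parameters and gradients.
params = {"embedding": [0.1, 0.2], "classifier": [1.0, -1.0]}
grads = {"embedding": [1.0, 1.0], "classifier": [0.5, 0.5]}

frozen_run = sgd_step(params, grads, lr=0.1, frozen=("embedding",))
finetuned_run = sgd_step(params, grads, lr=0.1)
print(frozen_run["embedding"], finetuned_run["embedding"])
```

With the embedding frozen, only the rest of the network adapts to the VQA data, so the pretrained word vectors cannot drift.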
[Figure: VQA 1.0 and VQA 2.0]
ResNet based model
Selection of the merge operand
Operands compared: Concat + FC, Concat, Product, Sum

Yes/No  Number  Other  Overall
64.66   29.49   21.27  40.08
65.51   29.99   22.02  40.84
65.90   30.00   22.80  41.37
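The four merge operands named on this slide combine an image feature vector with a question feature vector. A minimal sketch of what each could look like on plain Python lists, under assumptions: the vector sizes, the example values, and the weights of the fully connected (FC) layer are made up for illustration, and the thesis models may differ in detail.

```python
def merge_sum(image_vec, question_vec):
    """Element-wise sum; both vectors must have the same length."""
    return [a + b for a, b in zip(image_vec, question_vec)]


def merge_product(image_vec, question_vec):
    """Element-wise product; both vectors must have the same length."""
    return [a * b for a, b in zip(image_vec, question_vec)]


def merge_concat(image_vec, question_vec):
    """Concatenation; the two lengths may differ."""
    return list(image_vec) + list(question_vec)


def merge_concat_fc(image_vec, question_vec, weights, bias):
    """Concatenation followed by a linear (fully connected) projection."""
    joined = merge_concat(image_vec, question_vec)
    return [
        sum(w * x for w, x in zip(row, joined)) + b
        for row, b in zip(weights, bias)
    ]


# Hypothetical 3-dimensional features.
image_vec = [1.0, 2.0, 3.0]
question_vec = [0.5, -1.0, 2.0]
# Hypothetical FC layer that maps the 6-dim concatenation to 2 dims.
weights = [[1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1]]
bias = [0.0, 0.0]

summed = merge_sum(image_vec, question_vec)
multiplied = merge_product(image_vec, question_vec)
joined = merge_concat(image_vec, question_vec)
projected = merge_concat_fc(image_vec, question_vec, weights, bias)
print(summed, multiplied, joined, projected)
```

Sum and product keep the feature dimension fixed but require both branches to output the same size; concatenation doubles the dimension, and the extra FC layer adds learnable parameters on top of it.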
VGG based model
Selection of the merge operand: Sum, Product
Adding Average Pooling to VGG based model: No pooling, Avg pooling
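The slides do not state the pooling window, so the sketch below assumes global average pooling: a convolutional feature map of shape height x width x channels is reduced to one value per channel by averaging over all spatial positions. The function name and the toy feature map are illustrative only.

```python
def global_average_pool(feature_map):
    """Average a height x width x channels nested list over its spatial positions."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for position in row:
            for c, value in enumerate(position):
                totals[c] += value
    return [total / (height * width) for total in totals]


# Hypothetical 2 x 2 feature map with 2 channels.
feature_map = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
]
pooled = global_average_pool(feature_map)
print(pooled)
```

Pooling shrinks the image representation before it is merged with the question features, at the cost of discarding where in the image each activation occurred.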
Deciding between GloVe or learnable embeddings:

           Yes/No  Number  Other  Overall
GloVe      66.59   31.01   25.83  43.21
Learnable  67.10   31.54   25.46  43.30
4. Results
Results
Accuracy (bar chart, 0% to 100%):
Priors      25.98%
UPC 2016*   40.25%
Our method  43.48%
LV_NUS      69.71%
* Accuracy obtained on the test-dev split, where our method obtained 43.30%.
Qualitative Results
5. Conclusion
Conclusions
Summary:
- GloVe embeddings work better when frozen.
- The VQA task needs spatial information.
- Models tend to learn language biases.
Goals achieved:
- Improved on last year's model by 3.05 percentage points (40.25% to 43.30% on test-dev).
- Participated in the VQA 2.0 Challenge, obtaining 43.48% accuracy.
- Explored many different Deep Learning techniques.
- Built reusable, modular software.
6. Future Work
Visual Reasoning
Johnson, J., Hariharan, B., van der Maaten, L., Fei-Fei, L., Zitnick, C. L., & Girshick, R. (2016). CLEVR: A diagnostic dataset for
compositional language and elementary visual reasoning. arXiv preprint arXiv:1612.06890.
Johnson, J., Hariharan, B., van der Maaten, L., Hoffman, J., Fei-Fei, L., Zitnick, C. L., & Girshick, R. (2017). Inferring and
Executing Programs for Visual Reasoning. arXiv preprint arXiv:1705.03633.
46

Weitere ähnliche Inhalte

Was ist angesagt?

Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...Databricks
 
Question Answering System using machine learning approach
Question Answering System using machine learning approachQuestion Answering System using machine learning approach
Question Answering System using machine learning approachGarima Nanda
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Saeedeh Shekarpour
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You NeedDaiki Tanaka
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix ScaleJustin Basilico
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer VisionDongmin Choi
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDevashish Shanker
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer modelsDing Li
 
Sequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningSequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningRoberto Pereira Silveira
 
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - IntroductionChristian Perone
 
CNNs: from the Basics to Recent Advances
CNNs: from the Basics to Recent AdvancesCNNs: from the Basics to Recent Advances
CNNs: from the Basics to Recent AdvancesDmytro Mishkin
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronMostafa G. M. Mostafa
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOleg Mygryn
 
Deep Learning: Application & Opportunity
Deep Learning: Application & OpportunityDeep Learning: Application & Opportunity
Deep Learning: Application & OpportunityiTrain
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 

Was ist angesagt? (20)

Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
Deep Learning for Domain-Specific Entity Extraction from Unstructured Text wi...
 
Question Answering System using machine learning approach
Question Answering System using machine learning approachQuestion Answering System using machine learning approach
Question Answering System using machine learning approach
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems
 
Introduction to Transformer Model
Introduction to Transformer ModelIntroduction to Transformer Model
Introduction to Transformer Model
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer Vision
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
Sequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningSequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learning
 
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
 
Word2 vec
Word2 vecWord2 vec
Word2 vec
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - Introduction
 
CNNs: from the Basics to Recent Advances
CNNs: from the Basics to Recent AdvancesCNNs: from the Basics to Recent Advances
CNNs: from the Basics to Recent Advances
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Deep Learning: Application & Opportunity
Deep Learning: Application & OpportunityDeep Learning: Application & Opportunity
Deep Learning: Application & Opportunity
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 

Ähnlich wie Visual Question Answering 2.0

2023 Google Solution Challenge Kickoff From Idea to Execution
2023 Google Solution Challenge Kickoff From Idea to Execution2023 Google Solution Challenge Kickoff From Idea to Execution
2023 Google Solution Challenge Kickoff From Idea to ExecutionGDSCUniversitasMatan
 
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural NetworksTemporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural NetworksUniversitat Politècnica de Catalunya
 
Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...
Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...
Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...Mathias Magdowski
 
2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx
2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx
2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptxGDSC2
 
Poster CELePro for the eScience Network Conference 06/2013
Poster CELePro for the eScience Network Conference 06/2013Poster CELePro for the eScience Network Conference 06/2013
Poster CELePro for the eScience Network Conference 06/2013Anja Lorenz
 
Thesis+of+soumaya+medini.ppt
Thesis+of+soumaya+medini.pptThesis+of+soumaya+medini.ppt
Thesis+of+soumaya+medini.pptPtidej Team
 
Applying Machine Learning to Data Visaulization: What, Why, Where, and How
Applying Machine Learning to Data Visaulization: What, Why, Where, and HowApplying Machine Learning to Data Visaulization: What, Why, Where, and How
Applying Machine Learning to Data Visaulization: What, Why, Where, and HowQianwen Wang
 
Which visual questions are difficult to answer? Analysis with Entropy of Answ...
Which visual questions are difficult to answer? Analysis with Entropy of Answ...Which visual questions are difficult to answer? Analysis with Entropy of Answ...
Which visual questions are difficult to answer? Analysis with Entropy of Answ...Toru Tamaki
 
Solution Challenge_ Info Session.pptx
Solution Challenge_ Info Session.pptxSolution Challenge_ Info Session.pptx
Solution Challenge_ Info Session.pptxbcedsc
 
Ice 2013-A Structured Team Building Method for Collaborative Crowdsourcing
Ice 2013-A Structured Team Building Method for Collaborative CrowdsourcingIce 2013-A Structured Team Building Method for Collaborative Crowdsourcing
Ice 2013-A Structured Team Building Method for Collaborative CrowdsourcingErre Quadro
 
Multi-Attribute Decision Making with VIKOR Method for Any Purpose Decision
Multi-Attribute Decision Making with VIKOR Method for Any Purpose DecisionMulti-Attribute Decision Making with VIKOR Method for Any Purpose Decision
Multi-Attribute Decision Making with VIKOR Method for Any Purpose DecisionUniversitas Pembangunan Panca Budi
 
2022 Solution Challenge Info Session
2022 Solution Challenge Info Session2022 Solution Challenge Info Session
2022 Solution Challenge Info SessionSHRIHARRIPRIYAR
 
Multimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-AnsweringMultimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-AnsweringNAVER D2
 
Solution Challenge Event GDSC TU Berlin.pdf
Solution Challenge Event GDSC TU Berlin.pdfSolution Challenge Event GDSC TU Berlin.pdf
Solution Challenge Event GDSC TU Berlin.pdfGDSCTUBerlin
 
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...YutaSuzuki27
 
DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...
DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...
DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...RamithaDevi
 
Smart Hydroponic Plant Growing System using IoT
Smart Hydroponic Plant Growing System using IoTSmart Hydroponic Plant Growing System using IoT
Smart Hydroponic Plant Growing System using IoTGustavo Sanchez Collado
 

Ähnlich wie Visual Question Answering 2.0 (20)

2023 Google Solution Challenge Kickoff From Idea to Execution
2023 Google Solution Challenge Kickoff From Idea to Execution2023 Google Solution Challenge Kickoff From Idea to Execution
2023 Google Solution Challenge Kickoff From Idea to Execution
 
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural NetworksTemporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
 
Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...
Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...
Personalized Tasks and Anonymous Peer Feedback in the Fundamentals of Electri...
 
2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx
2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx
2023 Google Solution Challenge Kickoff_ From Idea to Execution.pptx
 
Poster CELePro for the eScience Network Conference 06/2013
Poster CELePro for the eScience Network Conference 06/2013Poster CELePro for the eScience Network Conference 06/2013
Poster CELePro for the eScience Network Conference 06/2013
 
Thesis+of+soumaya+medini.ppt
Thesis+of+soumaya+medini.pptThesis+of+soumaya+medini.ppt
Thesis+of+soumaya+medini.ppt
 
Applying Machine Learning to Data Visaulization: What, Why, Where, and How
Applying Machine Learning to Data Visaulization: What, Why, Where, and HowApplying Machine Learning to Data Visaulization: What, Why, Where, and How
Applying Machine Learning to Data Visaulization: What, Why, Where, and How
 
Which visual questions are difficult to answer? Analysis with Entropy of Answ...
Which visual questions are difficult to answer? Analysis with Entropy of Answ...Which visual questions are difficult to answer? Analysis with Entropy of Answ...
Which visual questions are difficult to answer? Analysis with Entropy of Answ...
 
Solution Challenge_ Info Session.pptx
Solution Challenge_ Info Session.pptxSolution Challenge_ Info Session.pptx
Solution Challenge_ Info Session.pptx
 
Speech Conditioned Face Generation with Deep Adversarial Networks
Speech Conditioned Face Generation with Deep Adversarial NetworksSpeech Conditioned Face Generation with Deep Adversarial Networks
Speech Conditioned Face Generation with Deep Adversarial Networks
 
Ice 2013-A Structured Team Building Method for Collaborative Crowdsourcing
Ice 2013-A Structured Team Building Method for Collaborative CrowdsourcingIce 2013-A Structured Team Building Method for Collaborative Crowdsourcing
Ice 2013-A Structured Team Building Method for Collaborative Crowdsourcing
 
Multi-Attribute Decision Making with VIKOR Method for Any Purpose Decision
Multi-Attribute Decision Making with VIKOR Method for Any Purpose DecisionMulti-Attribute Decision Making with VIKOR Method for Any Purpose Decision
Multi-Attribute Decision Making with VIKOR Method for Any Purpose Decision
 
2022 Solution Challenge Info Session
2022 Solution Challenge Info Session2022 Solution Challenge Info Session
2022 Solution Challenge Info Session
 
Meetup Giugno - c-ResUNET.pdf
Meetup Giugno - c-ResUNET.pdfMeetup Giugno - c-ResUNET.pdf
Meetup Giugno - c-ResUNET.pdf
 
Evip Medbiq2
Evip Medbiq2Evip Medbiq2
Evip Medbiq2
 
Multimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-AnsweringMultimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-Answering
 
Solution Challenge Event GDSC TU Berlin.pdf
Solution Challenge Event GDSC TU Berlin.pdfSolution Challenge Event GDSC TU Berlin.pdf
Solution Challenge Event GDSC TU Berlin.pdf
 
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
 
DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...
DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...
DeepDRImageGuidedDiabeticRetinopathyDetectionUsingAttentionBasedDeepLearningS...
 
Smart Hydroponic Plant Growing System using IoT
Smart Hydroponic Plant Growing System using IoTSmart Hydroponic Plant Growing System using IoT
Smart Hydroponic Plant Growing System using IoT
 

Mehr von Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

Mehr von Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Visual Question Answering 2.0