https://imatge.upc.edu/web/publications/visual-memorability-egocentric-cameras
This project explores the visual memorability of egocentric images and makes three main contributions. The first and main contribution is a new annotation tool for visual memorability in egocentric images: a web application that collects memorability annotations for still images through an online game. The second contribution is a convolutional neural network model for visual memorability prediction that adapts an off-the-shelf model to egocentric images. In addition, a visualization study localizes the regions of an image that are more memorable than others; these memorability maps are compared with saliency maps, opening a new research branch that uses memorability maps for saliency prediction. Image memorability is also related to sentiment by applying a model that predicts that feature. The final contribution connects the visual memorability of images with human behaviour and physical state, finding a relation between memory and physiological signals such as heart rate, galvanic skin response and electroencephalographic signals.
3. Outline
➔ Introduction
➔ Contributions
◆ Annotation tool for visual memorability
◆ EgoMemNet: visual memorability adaptation to egocentric images
◆ Visual memorability and physiological signals
➔ Conclusions
5. “Brain is designed to forget in order to survive”
● Lifelogger → a person who captures their daily life in order to build a virtual and digital memory.
● Wearable cameras → capture first-person vision.
● Big data → 1,400–2,000 images/day.
● Challenge → retrieval!
Introduction
What do we want to remember?
13. Why an annotation tool?
[Diagram: a convolutional neural network maps an input (image) to an output (label); image + label pairs are needed to train the model.]
14. ● Inspired by MIT research work [1]
● Visual memory game:
○ Simple task → press ‘d’ when a repeated image appears
○ Duration: 9 minutes
○ Output: text file with detections
[1] A. Khosla, A. S. Raju, A. Torralba and A. Oliva. Understanding and Predicting Image Memorability at a Large Scale. ICCV 2015.
Annotation tool for visual memorability
UTEgocentric
Insight Centre for Data Analytics
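As a sketch of how the game’s output could become training labels: the detection log can be aggregated into a per-image memorability score, defined (following the MIT protocol) as the fraction of players who recognised the image’s repeat. The session structure below is a hypothetical simplification, not the tool’s actual file format.

```python
# Hypothetical sketch: aggregating the memory game's detections into
# per-image memorability scores. The session structure is an assumption.
from collections import defaultdict

def memorability_scores(sessions):
    """Each session maps image_id -> True if the player pressed 'd'
    on the image's repeat (a hit), False otherwise (a miss)."""
    hits = defaultdict(int)
    shown = defaultdict(int)
    for session in sessions:
        for image_id, detected in session.items():
            shown[image_id] += 1
            hits[image_id] += int(detected)
    # Memorability = hit rate: fraction of players who recognised the repeat.
    return {img: hits[img] / shown[img] for img in shown}

sessions = [
    {"img_a": True, "img_b": False},
    {"img_a": True, "img_b": True},
]
scores = memorability_scores(sessions)
```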
17. ● Docker:
○ Container with the operating system and software required.
○ Always runs the same in any environment.
● Simple implementation → dockerfile
Annotation tool implementation
Why use Docker?
First Docker implementation for research in GPI
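A minimal sketch of what such a dockerfile might look like; the base image, package names and entry point are illustrative assumptions, not the project’s actual configuration.

```dockerfile
# Illustrative dockerfile sketch for the annotation web app
# (base image and packages are assumptions).
FROM ubuntu:16.04

# Install the runtime the web application needs.
RUN apt-get update && apt-get install -y python python-pip

# Copy the annotation tool into the container and install its dependencies.
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt

# Serve the memory game on port 80.
EXPOSE 80
CMD ["python", "app.py"]
```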
20. Convolutional neural network: definition
● Automatic learning paradigm inspired by how the human brain works
● Interconnected neurons that work together to generate an output stimulus or activation
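The idea above can be illustrated with a single convolutional layer plus a ReLU activation, written in plain NumPy; a full CNN stacks many such layers with learned kernels.

```python
# Minimal sketch of one convolutional layer followed by a ReLU activation,
# using only NumPy; a real CNN stacks many such layers with learned kernels.
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    # The neuron "fires" (positive activation) or stays silent.
    return np.maximum(x, 0)

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 image
kernel = np.array([[-1.0, 1.0], [-1.0, 1.0]])      # responds to horizontal gradient
activation = relu(conv2d(image, kernel))
```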
24. Data augmentation strategies
● No augmentation
● Spatial data augmentation → common method
● Temporal data augmentation → exploits the egocentric (sequential) nature of the data
[Figure: examples of spatial and temporal data augmentation]
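A toy sketch of the two strategies, assuming `frames` is a temporally ordered list of egocentric images; the function names and the choice of mirroring as the spatial transform are illustrative.

```python
# Sketch of the two augmentation strategies; names are illustrative.
import numpy as np

def spatial_augment(image):
    """Common spatial augmentation: horizontal mirroring."""
    return np.fliplr(image)

def temporal_augment(frames, index, offset=1):
    """Egocentric-specific: treat temporal neighbours of a frame as
    extra training samples sharing the same memorability label."""
    lo = max(0, index - offset)
    hi = min(len(frames), index + offset + 1)
    return [frames[i] for i in range(lo, hi) if i != index]

# Toy sequence: frame t is a constant image filled with the value t.
frames = [np.full((2, 2), t, dtype=float) for t in range(5)]
mirrored = spatial_augment(frames[0])
neighbours = temporal_augment(frames, 2)   # frames 1 and 3
```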
25. Quantitative results
Spearman’s rank correlation: measures the similarity between the positions of items in two different ranked lists (here, the predicted memorability rank vs. the ground-truth rank).
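A minimal sketch of the metric, assuming no tied scores (so ranking both lists and applying the closed-form formula is valid); the example scores are illustrative.

```python
# Sketch of Spearman's rank correlation between predicted memorability
# scores and ground-truth scores (assumes no tied values).
import numpy as np

def spearman(pred, truth):
    """Rank both score lists, then apply the closed-form formula
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) on the rank differences."""
    rank_p = np.argsort(np.argsort(pred))    # rank of each predicted score
    rank_t = np.argsort(np.argsort(truth))   # rank of each ground-truth score
    d = rank_p - rank_t
    n = len(pred)
    return 1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1))

pred  = [0.9, 0.4, 0.7, 0.1]   # predicted memorability scores (illustrative)
truth = [0.8, 0.3, 0.2, 0.6]   # ground-truth scores (illustrative)
rho = spearman(pred, truth)
```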
31. Memorability vs. saliency maps
[Figure: original image, saliency map (SalNet CNN) [Pan, CVPR 2016] and memorability map (EgoMemNet* CNN), binarized with a learned threshold.]
In green, regions shared between the saliency and memorability maps; in blue, memorable but non-salient regions; in red, salient but non-memorable regions.
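The map comparison can be sketched as: binarize both maps with a threshold, then classify each pixel as shared (green), memorable-only (blue) or salient-only (red). The fixed threshold below is illustrative, not the learned one.

```python
# Sketch of the saliency vs. memorability comparison: binarize both maps,
# then count pixels in each overlap class. Threshold value is illustrative.
import numpy as np

def compare_maps(saliency, memorability, threshold=0.5):
    sal = saliency >= threshold
    mem = memorability >= threshold
    return {
        "shared": int(np.sum(sal & mem)),          # green regions
        "memorable_only": int(np.sum(mem & ~sal)), # blue regions
        "salient_only": int(np.sum(sal & ~mem)),   # red regions
    }

saliency = np.array([[0.9, 0.2],
                     [0.8, 0.1]])
memorability = np.array([[0.7, 0.6],
                         [0.3, 0.1]])
counts = compare_maps(saliency, memorability)
```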
43. Conclusions
● A new annotation tool allows creating novel datasets for egocentric memorability.
● An egocentric (first-person vision) dataset containing 50 annotated images.
● EgoMemNet, a model adapted for memorability prediction on egocentric images, outperforms MemNet, a convolutional neural network trained on human-taken images.
● Physiological signals show potential for memorability prediction.
44. Extended abstract
Carné-Herrera M., Giró-i-Nieto X., Gurrin C. EgoMemNet: Visual Memorability Adaptation to Egocentric Images. 4th Workshop on Egocentric (First-Person) Vision, CVPR 2016, Las Vegas, NV, USA.