GaZIR is a gaze-based interface for searching and browsing images. We first describe the system in detail: how users interact with it and how it uses eye-tracking to predict the relevance of the images the user is searching for. We then evaluate the system by testing both its image-relevance predictions and its actual image-retrieval accuracy.
GaZIR: Gaze-based Zooming Interface for Image Retrieval
1. Intelligent Multimodal Interaction 2014
Francesco Bonadiman
Craig Kershaw
GaZIR
“Gaze-based Zooming Interface
for Image Retrieval”
László Kozma, Arto Klami, Samuel Kaski
Helsinki Institute for Information Technology HIIT
2. What is GaZIR?
● Gaze-based interface for searching and
browsing images
○ Collecting information from what the user would do
naturally via eye-tracking (implicit feedback)
● The user can zoom-in and out
○ Focusing on the centre or the borders
○ Allowing Image retrieval
3. Put a Ring on it!
● Consists of 3 Rings of images
○ Each consecutive ring shows the next set of
relevant images based on information gathered
from the previous ring
● Better for predicting image relevance
○ Avoids users scanning images row-by-row, as with
grid-based layouts
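The ring layout above can be sketched with a little trigonometry: images are spaced evenly around concentric circles rather than in a grid. The counts and radii below are illustrative, not the paper's exact configuration.

```python
import math

def ring_layout(counts, radii):
    """Place images evenly on concentric rings around the centre.

    counts: number of images on each ring (inner to outer)
    radii:  radius of each ring
    Returns a list of (ring_index, x, y) positions.
    """
    positions = []
    for ring, (n, r) in enumerate(zip(counts, radii)):
        for i in range(n):
            angle = 2 * math.pi * i / n  # evenly spaced angles
            positions.append((ring, r * math.cos(angle), r * math.sin(angle)))
    return positions

# e.g. 8, 12, and 16 images on three rings (hypothetical numbers)
layout = ring_layout([8, 12, 16], [1.0, 2.0, 3.0])
```

Zooming in then simply shifts each ring outwards and fills the new inner ring with the next set of retrieved images.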
5. Eye Tracking
● Eye tracker follows the user's pupil movements
○ Fixation of >120ms → Relevant image
● 3 main advantages of Eye-Tracking
○ Effortlessness, user only needs to look at the images
○ “I-will-know-it-when-I-see-it” search problems
○ Hands are not needed → Motor disabilities
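The fixation rule on this slide reduces to a simple threshold test: any image that receives a fixation longer than 120 ms is treated as implicitly relevant. A minimal sketch, assuming a gaze log of per-image fixation durations (the log format and names are hypothetical):

```python
def relevant_images(fixations, threshold_ms=120):
    """Mark an image as implicitly relevant if any single fixation
    on it lasted longer than the threshold (120 ms in GaZIR)."""
    return {img for img, durations in fixations.items()
            if any(d > threshold_ms for d in durations)}

# hypothetical gaze log: image id -> fixation durations in ms
log = {"img1": [80, 150], "img2": [40], "img3": [300]}
relevant_images(log)  # -> {"img1", "img3"}
```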
6. Similar Work
● Only preliminary studies
● Oyekoya et al.
○ Simple retrieval → relevance from viewing time
● Klami et al.
○ More complex predictions
○ Only measure isolated predictions
○ Artificial setup
7. … and GaZIR?
● GaZIR → combines two approaches
○ More sophisticated relevance predictor
○ Real retrieval search engine
○ Interface designed for gaze-based interaction
8. Aim of the research paper
● To provide a user interface that is more
fluid and natural for searching
● Test whether such a system is feasible to
build and whether it works in practice with
pre-existing CBIR (content-based image
retrieval) search engines
○ Designed to work with any CBIR engine
9. Data collection
● Simplifications were made
○ user was only expected to zoom inwards
○ not allowed to reset the process
○ images only retrieved when zooming in
○ mouse wheel used for zooming (no eye control)
● Training data collected to create a model
○ show images closer to users’ expectations
10. Experiment 1
● 6 different users
● Each of them performing 6 search tasks
○ look into the MirFlickr database
○ search images matching the category description
○ indicate which ones were relevant
● On average around 120 images per task
○ eye-movement data over 4300 user-task-image instances
11. Experiment 2
● 3 of the users from the previous experiment
● 6 new search tasks:
○ 2 with the gaze-based relevance predictor
○ 2 with a dummy interface
○ 2 same interface + explicit feedback (mouse click)
● Performance measured:
○ as the proportion of relevant images retrieved
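The measure above is just precision over the images shown during a task. A minimal sketch (function name and inputs are my own, not from the paper):

```python
def precision(shown, relevant):
    """Proportion of shown images that the user judged relevant."""
    if not shown:
        return 0.0
    return sum(1 for img in shown if img in relevant) / len(shown)

precision(["a", "b", "c", "d"], {"a", "c"})  # -> 0.5
```

Comparing this proportion across the three interface conditions (gaze-based, dummy, explicit feedback) gives the per-condition performance.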
12. Results
● Prediction accuracy > random for all users
○ confirms “relevance through eye movements”
● Huge differences between the users
○ due to different tasks or different use of the system
○ for some users, prediction accuracy → excellent
○ for others → only slightly better than random
○ explicit feedback (mouse click) → the best
○ predicted feedback → comparable for 50% of tasks
13. Contribution
● Distinction between false positives and false negatives
○ former: images that look similar to relevant ones but miss details
○ latter: images (too) easy to recognise as relevant, so fixated only briefly
● Promising results → further experiments
● GaZIR is concluded to be
○ “first attempt of building a sophisticated image
retrieval interface utilizing implicit gaze information”