This paper presents a study that examined the selection of Web search results with a gaze-based input device. A standard list interface was compared to a grid and a tabular layout with regard to task performance and subjective ratings. Furthermore, the gaze-based input device was compared to conventional mouse interaction. Participants had to accomplish a series of search tasks by selecting search results. The study revealed that mouse users accomplished more tasks correctly than users of the gaze-based input device. However, no differences were found between input devices regarding the number of search results taken into account to accomplish a task. Regarding task completion time and ease of search result selection, gaze-based interaction was inferior to mouse interaction only in the list interface. Moreover, with a gaze-based input device, search tasks were accomplished faster in the tabular presentation than in the standard list interface, suggesting that a tabular interface is best suited for gaze-based interaction.
In order to test the suitability of the different search results interfaces for gaze-based search result selection, an experiment was conducted in which participants successively used the three different search results interfaces. Additionally, the gaze-based input device was compared to conventional mouse interaction.

We expected the mouse to be superior to the gaze-based input device for search result selection, resulting in higher task performance and more positive subjective ratings. For instance, gaze-based search result selection should take longer and evoke more accidental selections. For gaze-based interaction, the standard list interface was expected to be the least suitable of the three search results interfaces, as the to-be-selected search results are stacked vertically, close to one another, in the screen periphery (to the left of the screen). In contrast, we hypothesized that the tabular interface would be most apt for a gaze-based input device: As the summary and URL of a search result can be read without the risk of accidental selections, the Midas Touch Problem should be reduced. Therefore, the tabular interface was expected to lead to higher task performance and more positive subjective ratings than the list interface. The suitability of the grid interface was expected to lie between that of the two other search results interfaces because the likelihood of calibration errors should be reduced, but not the Midas Touch Problem.

4 Method

4.1 Experimental Setup

Thirty-six able-bodied university students (7 male; mean age: 23.33 years) participated in this experiment. All participants had normal or corrected-to-normal vision. Participants reported intermediate or advanced computer and Web search skills, with no differences between the two experimental conditions (i.e., gaze-based input device vs. mouse). None of the participants had experience with gaze-based computer input.

The eye gaze data were collected with a Tobii 1750 remote eye tracker built into a 17” monitor set to a resolution of 1280 x 1024 pixels. Participants were seated on a height-adjustable seat with a backrest. The viewing distance was approximately 65 cm, and the recorded gaze data were smoothed by a filter algorithm.

4.2 Tasks and Material

In order to investigate a rather natural Web search situation, participants were requested to find the answers to specific questions by selecting search results presented by a search engine. Twenty-seven search tasks and associated result lists were created, covering a broad range of topics including sports, movies, travel, news, computers, literature, and automotive. Example tasks included: “When did Apollo 13 take off?” or “Who was the youngest world champion in chess?” Each search task started with a control page containing one of the 27 questions and a brief task description. When the participant pressed the space bar, a Google SERP with predefined query terms (e.g., “take off Apollo 13”) and nine search results appeared. The search results were manipulated such that for each task there was exactly one search result that led to the correct answer. The eight other results were distractors. Note that the correct search result did not contain the answer, but clearly indicated that the answer could be found on the corresponding Web page. The correct search result was displayed in one of the nine positions, so that each position held the correct result in three of the 27 tasks. The search results were not linked to real Web pages, but to control pages that a) stated that the correct answer could not be found on this page or b) presented the correct answer. In case of a) participants could navigate back to the SERP by clicking on a hyperlink “back to the Google page” placed in the center of the screen.

Apart from the experimental manipulation of the search results interfaces (see section 4.3), the SERPs were displayed in Google style because of people’s familiarity with this search engine. However, ads and the hyperlinks “in cache” and “similar pages” were not included on the SERPs. In the experiment the SERPs were presented in full screen mode such that no browser task bar was displayed. In each of the interfaces the nine search results fit on the screen, thus obviating the need for scrolling.

4.3 Experimental Design

The experiment was a 3 (within-subjects) x 2 (between-subjects) mixed-model factorial design.

As a first factor, the search results interface was varied within subjects by presenting search results in a list interface, a grid interface, or a tabular interface (see Figure 1). In the list interface the nine search results were listed from top to bottom to the left of the screen. In the grid interface, search results were arranged in three rows and three columns, towards the center of the screen. In the tabular interface every search result element was presented in a separate column. The titles (i.e., the hyperlinks) were presented in the left column, the summaries in the middle column, and the URLs in the right column. The nine search results were listed from top to bottom, with the hyperlinks being presented to the left of the screen.

As a second factor, the input device was manipulated between subjects: participants used either a computer mouse or a gaze-based input device for operating the search results interfaces. In the gaze-based input device, a dwell-time based selection mechanism was used. Because of the complexity of result selection, involving visual scan and decision processes, the dwell time was set to 750 ms. With either interaction technique, the hyperlinks indicated their interactivity by inverting their color when hovered over. Participants were randomly assigned to one of the two conditions, with 19 participants using the gaze-based input device and 17 the mouse.

Figure 1. SERP types, from back to front: List interface, grid interface, tabular interface.

4.4 Procedure

Participants were tested in individual sessions of approximately one hour. Before starting with the experiment participants were asked to provide some demographic and personal data, received
some general instructions and were calibrated on the eye tracking system using a nine-point calibration. Subsequently, the first experimental run started with a training task to become acquainted with gaze control and the interaction with the search results interface. Then, participants performed 9 tasks (with the correct search result being located once at each of the nine positions). Participants were asked to accomplish each task as fast as possible and with as few clicks as possible. They were informed that for each task only one of the nine search results presented on a SERP led to the correct answer. A search task was regarded as successfully accomplished if the correct search result was selected within a time limit of 90 seconds. Participants received feedback on their task accomplishment after each task and were then provided with the next task. After all tasks of a run had been processed, a questionnaire addressing participants’ subjective ratings regarding the interface was administered. Afterwards, the eye tracker was recalibrated and the second experimental run started.

All participants performed the same 27 search tasks. The order in which participants used the three interfaces was counterbalanced across participants, as were the order of the search tasks and the position of the correct search results.

4.5 Dependent Measures

To test the suitability of the three search results interfaces for accomplishing the fact-finding tasks, we examined participants’ task performance and subjective ratings with either the mouse or the gaze-based input device.

Task performance. Task performance was determined by three dependent measures. First, the number of correctly accomplished tasks (with a minimum of 0 and a maximum of 9 tasks) was counted. Second, task completion time was recorded (in ms) from the start of the task with the pressing of the space bar until the selection of the correct search result. However, the time spent on wrong pages (i.e., the time from the moment of having selected a wrong search result until the return to the SERP) was not included in the time measurement. Only correctly accomplished tasks were included in the calculations. For gaze-based interaction the dwell time (750 ms/click) was included in the analysis of task completion time. Third, the number of search results selected per task was counted both for correctly accomplished tasks and for all tasks (i.e., also including failed tasks). The number of search results selected in correctly accomplished tasks comprised the number of false search results selected plus the selection of the correct search result per task (resulting in a minimum of 1).

Subjective ratings. Subjective ratings included the following measures: First, participants were asked to rate their mental demand during task processing on a scale ranging from 0=very low to 100=very high. Second, participants were presented with three statements, which they were asked to rate on a five-point scale (5=highly agree). The statements addressed 1) how much they liked the layout of the interface, 2) how easy they found the search result selection from an interface, and 3) how satisfied they were with the interface.

5 Results

Tables 1 and 2 show the mean values of the seven dependent measures as a function of the two factors input device and interface. For statistical analyses, first, comparisons between mouse interaction and gaze-based interaction were made. Second, the suitability of the three different interfaces for gaze-based search results selection was analyzed. Because of space limitations, statistical values are only reported for significant results.

Table 1. Means and standard deviations (in parentheses) of task performance.

                        Mouse interaction                         Gaze-based interaction
                        List          Grid          Tab.          List          Grid          Tab.
# correct tasks         8.88 (0.33)   8.88 (0.33)   8.88 (0.33)   8.05 (1.13)   8.05 (1.31)   8.53 (0.70)
time (in s)             22.99 (6.03)  22.46 (5.77)  23.47 (5.72)  29.67 (9.42)  26.03 (9.34)  24.26 (6.41)
# clicks / corr. task   1.76 (0.60)   1.75 (0.70)   1.65 (0.52)   1.82 (0.68)   1.64 (0.62)   1.47 (0.57)
# clicks / all tasks    1.78 (0.60)   1.79 (0.72)   1.67 (0.51)   1.84 (0.63)   1.67 (0.65)   1.51 (0.57)

Table 2. Means and standard deviations (in parentheses) of subjective ratings.

                        Mouse interaction                         Gaze-based interaction
                        List          Grid          Tab.          List          Grid          Tab.
mental demand           43.53 (23.8)  51.76 (21.0)  46.76 (17.7)  34.74 (18.0)  40.53 (22.0)  32.37 (19.7)
layout                  3.24 (0.97)   2.82 (0.95)   2.82 (1.43)   3.32 (0.89)   3.53 (1.07)   3.58 (1.12)
ease of selection       3.88 (0.78)   3.29 (1.11)   3.59 (1.06)   2.79 (0.92)   3.00 (1.20)   3.47 (1.17)
satisfaction            3.47 (0.87)   3.06 (0.90)   3.41 (1.28)   3.42 (0.90)   3.37 (1.12)   3.63 (1.27)

5.1 Comparisons between Mouse- and Gaze-based Interaction

To compare task performance and subjective ratings between mouse-based and gaze-based interaction we conducted MANOVAs with input device as between-subjects factor.

Task performance. The MANOVA showed a significant effect of the input device on the number of correctly accomplished tasks (F(3, 32)=4.71, p=.01). Univariate analyses revealed that in all three result interfaces mouse users accomplished more tasks correctly than users of the gaze-based input device, although the difference was only marginally significant in the tabular interface (list interface: F(1, 34)=8.50, p=.01; grid interface: F(1, 34)=6.42, p=.02; tabular interface: F(1, 34)=3.68, p=.06). The greatest difference between the two input devices appeared in the list interface and the smallest in the tabular interface. With regard to the task completion time, the MANOVA showed a marginally significant effect of the input device (F(3, 32)=2.50, p=.08). Univariate analyses revealed that this effect could be traced back to the list interface (F(1, 34)=6.25, p=.02). In the list interface tasks were accomplished significantly faster with the mouse than with the gaze-based device. For the grid interface and the tabular interface, however, there were no differences between input devices. Contrary to our expectations, the number of search results selected per task did not differ between input devices, irrespective of whether only correctly accomplished tasks were included in the analyses or all tasks.

Subjective ratings. The MANOVA showed no significant differences between input devices on participants’ perceived mental demand. Although not reaching statistical significance, the MANOVA showed a statistical trend of input device on participants’ ratings regarding the layout of the interfaces (F(3, 32)=2.30, p=.10). Univariate analyses revealed that this effect
could be traced back to the grid interface (F(1, 34)=4.28, p=.05) and marginally to the tabular interface (F(1, 34)=3.16, p=.08). Users of the gaze-based input device tended to like the layout of these alternative interfaces more than mouse users. With regard to participants’ ratings about the ease of selection of search results from a SERP, the MANOVA showed a significant effect of input device (F(3, 32)=4.90, p=.01). Again, univariate analyses revealed that this effect could be traced back to the list interface (F(1, 34)=14.62, p=.001). Users of the gaze-based device rated search result selection in the list interface as less easy (i.e., more strenuous) than mouse users, whereas for the grid and the tabular interface ratings did not differ between input devices. Finally, the MANOVA showed no differences between input devices with regard to users’ overall satisfaction.

5.2 Suitability of Search Results Interfaces for Gaze-based Interaction

To compare task performance and subjective ratings between the three search results interfaces, repeated-measures ANOVAs with interface as within-subjects factor were conducted.

Task performance. The repeated-measures ANOVA showed no significant differences between the three interfaces on the number of correctly accomplished tasks. However, with regard to task completion time, the ANOVA showed a marginally significant effect of interface (F(2, 36)=2.55, p=.09). Bonferroni-adjusted post hoc tests showed that in the tabular interface tasks tended to be accomplished faster than in the list interface (p=.08). With regard to the number of search results selected per task, the ANOVA showed a significant effect of interface (F(2, 36)=3.30, p=.05). Again, post hoc tests revealed that in the tabular interface participants selected fewer search results to accomplish the task than in the list interface (p=.03). When the number of clicks was analyzed for all tasks, this effect was even stronger (p=.01). Furthermore, for both variables (time and clicks), values for the grid interface were in between, differing neither from the list interface nor from the tabular interface.

Subjective ratings. Although not reaching statistical significance, the ANOVA showed a statistical trend of interface on participants’ perceived mental demand (F(2, 36)=2.51, p=.10). Post hoc tests revealed that in the grid interface participants tended to perceive a higher mental demand than in the tabular interface (p=.09). Besides this, no differences were found between interfaces on users’ ratings regarding the layout of the interfaces, the ease of search result selection, and their overall satisfaction with the interfaces. In the case of mouse operation, no significant differences were found between the interfaces with regard to task performance and subjective ratings.

6 Conclusions

As expected, the study showed that mouse users accomplished more tasks correctly than users of the gaze-based input device. Of note, however, almost all tasks were accomplished correctly irrespective of the input device. No differences were found between input devices regarding the number of search results selected to accomplish a task, with very few wrong search results being selected. Thus, contrary to our expectations, gaze-based search result selection in general did not evoke more accidental selections. Furthermore, with regard to task completion time and ease of search result selection, gaze-based interaction was inferior to mouse interaction only in the list interface, but not in the two alternative interfaces. The layouts of the two alternative interfaces were also liked better when operated with the gaze-based input device than when operated by mouse. Moreover, when search results were presented in a tabular interface with a gaze-based input device, in line with our expectations, search tasks were accomplished faster and with fewer search results selected than in a standard list interface. One drawback of the grid interface might be that it is perceived as more mentally demanding than the tabular interface, which might be due to its unclear arrangement of the search results.

To conclude, even though not all of the experimental results reached statistical significance, the study quite clearly speaks against using conventional list interfaces for gaze-based search result selection. Rather, this study provides first indications that the tabular interface is best suited for gaze-based interaction among the given alternatives. Its suitability for a series of consecutive searches (in case the desired information cannot be found among the presented results) or for browsing or navigational tasks has yet to be shown. Furthermore, one can assume that including a separate activation link instead of using the title as link might further increase the advantage of a tabular layout. Nonetheless, without further manipulation of search engines, the tabular interface as it was used in the current study presents a first step towards more efficient Web searching for situations in which the user’s hands cannot be used, for instance, due to motor impairment.
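As an illustration of the dwell-time based selection mechanism described in Section 4.3, the following Python sketch shows one way such a mechanism could work: a hyperlink is selected once the gaze has rested on it continuously for the 750 ms dwell time. This is not the software used in the study. The names `Rect`, `hit_test`, and `dwell_selections`, the sample format (timestamp in ms, x, y), and the rule that a link fires at most once per visit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

DWELL_TIME_MS = 750  # dwell time reported in Section 4.3


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned screen region of one hyperlink, in pixels."""
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def hit_test(links: Dict[str, Rect], x: float, y: float) -> Optional[str]:
    """Return the name of the link under (x, y), or None if there is none."""
    for name, rect in links.items():
        if rect.contains(x, y):
            return name
    return None


def dwell_selections(
    samples: Iterable[Tuple[float, float, float]],
    links: Dict[str, Rect],
    dwell_ms: float = DWELL_TIME_MS,
) -> List[Tuple[float, str]]:
    """Return (timestamp_ms, link) for every dwell-triggered selection.

    `samples` are (timestamp_ms, x, y) gaze points in time order.
    Assumption: a link is selected at most once per uninterrupted visit.
    """
    selections: List[Tuple[float, str]] = []
    current: Optional[str] = None  # link currently under the gaze
    entered_at = 0.0               # time the gaze entered that link
    fired = False                  # whether this visit already selected
    for t, x, y in samples:
        target = hit_test(links, x, y)
        if target != current:
            # Gaze moved to another link or off all links: restart the timer.
            current, entered_at, fired = target, t, False
        elif current is not None and not fired and t - entered_at >= dwell_ms:
            selections.append((t, current))
            fired = True
    return selections


if __name__ == "__main__":
    links = {"result_1": Rect(0, 0, 200, 20), "result_2": Rect(0, 40, 200, 20)}
    glance = [(t, 10, 10) for t in range(0, 425, 25)]     # 400 ms on result_1
    away = [(t, 500, 10) for t in range(425, 1000, 25)]   # off all links
    dwell = [(t, 10, 50) for t in range(1000, 1825, 25)]  # 800 ms on result_2
    print(dwell_selections(glance + away + dwell, links))
    # prints [(1750, 'result_2')]
```

In this sketch only the link regions can trigger a selection. If, as in the tabular interface, the link regions cover only the title column, gaze samples on the summary or URL columns never start the dwell timer, which is the property the hypotheses rely on for reducing the Midas Touch Problem.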