Analysis of recordings made by a wearable eye tracker is complicated by video stream synchronization, pupil coordinate mapping, eye movement analysis, and tracking of dynamic Areas Of Interest (AOIs) within the scene. In this paper a semi-automatic system is developed to help automate these processes. Synchronization is accomplished via side-by-side video playback control. A deformable eye template and calibration dot marker allow reliable initialization via simple drag and drop, as well as a user-friendly way to correct the algorithm when it fails. Specifically, drift may be corrected by nudging the detected pupil center to the appropriate coordinates. In a case study, the impact of surrogate nature views on physiological health and perceived well-being is examined via analysis of gaze over images of nature. A match-moving methodology was developed to track AOIs for this particular application, but it is applicable to similar future studies.
Figure 3: Screen flash for synchronization visible as eye reflection.
Figure 4: Initialization of pupil/limbus and dot tracking.
No IR illumination is used, simplifying the hardware and reducing cost. The eye tracker functions in environments with significant ambient IR illumination (e.g., outdoors on a sunny day; see Ryan et al. [2008]). However, lacking a stable corneal reflection and visible spectrum filtering, video processing is more challenging. Specular reflections often occlude the limbus, and contrast at the pupil boundary is inconsistent.

3.1 Stimulus for Video Processing

For video synchronization and calibration, a laptop computer is placed in front of the participant. To synchronize the two videos, a simple program that flashes the display several times is executed. Next, a roving dot is displayed for calibration purposes. The participant is asked to visually track the dot as it moves. The laptop display is then flashed again to signify the end of calibration. For good calibration the laptop display should appear entirely within the scene image frame, and should span most of the frame. After calibration the laptop is moved away and the participant is free to view the scene normally. After a period of time (in this instance about two minutes) the recording is stopped and video collection is complete. All subsequent processing is then carried out offline. Note that during recording it is impossible to judge camera alignment. Poor camera alignment is the single greatest impediment to successful data processing.

3.2 Synchronization

Video processing begins with synchronization. Synchronization is necessary because the two cameras might not begin recording at precisely the same time. This situation would be alleviated if the cameras could be synchronized via hardware or software control (e.g., via IEEE 1394 bus control). In the present case, no such mechanism was available. As suggested previously [Li and Parkhurst 2006], a flash of light visible in both videos is used as a marker. Using the marker, an offset necessary for proper frame alignment is established. To find these marker locations in the two video streams, both are displayed side by side, each with its own playback control. The playback speed is adjustable in forward and reverse directions. Single frame advance is also possible. To synchronize the videos, the playback controls are used to manually advance/rewind each video to the last frame where the light flash is visible (see Figure 3).

3.3 Calibration & Gaze Point Mapping

Pupil center coordinates are produced by a search algorithm executed over eye video frames. The goal is to map the pupil center to gaze coordinates in the corresponding scene video frame. Calibration requires sequential viewing of a set of spatially distributed calibration points with known scene coordinates. Once calibration is complete the eye is tracked and gaze coordinates are computed for the remainder of the video. A traditional video-oculography approach [Pelz et al. 2000; Li et al. 2006] calculates the point of gaze by mapping the pupil center (x, y) to scene coordinates (sx, sy) via a second-order polynomial [Morimoto and Mimica 2005],

sx = a0 + a1 x + a2 y + a3 xy + a4 x² + a5 y²
sy = b0 + b1 x + b2 y + b3 xy + b4 x² + b5 y².   (1)

The unknown parameters ak and bk are computed via least squares fitting (e.g., see Lancaster and Šalkauskas [1986]).

3.3.1 Initialization of Pupil/Limbus and Dot Tracking

Pupil center in the eye video stream and calibration dot in the scene video stream are tracked by different local search algorithms, both initialized by manually positioning a template over recognizable eye features and a crosshair over the calibration dot. Grip boxes allow for adjustment of the eye template (see Figure 4). During initialization, only one playback control is visible, controlling advancement of both video streams. It may be necessary to advance to the first frame with a clearly visible calibration dot. Subsequent searches exploit temporal coherence by using the previous search result as the starting location.

3.3.2 Dot Tracking

A simple greedy algorithm is used to track the calibration dot. The underlying assumption is that the dot is a set of bright pixels surrounded by darker pixels (see Figure 4). The sum of differences is largest at a bright pixel surrounded by dark pixels. The dot moves from one location to the next in discrete steps determined by the refresh rate of the display. To the human eye this appears as smooth motion, but in a single frame of video it appears as a short trail of multiple dots. To mitigate this effect the image is blurred with a Gaussian smoothing function, increasing the algorithm's tolerance to variations in dot size. In the present application the dot radius was roughly 3 to 5 pixels in the scene image frame.

The dot tracking algorithm begins with an assumed dot location obtained from the previous frame of video, or from initialization. A sum of differences is evaluated over an 8×8 reference window:

∑i ∑j [I(x, y) − I(x − i, y − j)],   −8 < i, j < 8.   (2)

This evaluation is repeated over a 5×5 search field centered at the assumed location (x, y). If the assumed location yields a maximum within the 25 pixel field then the algorithm stops. Otherwise the location with the highest sum of differences becomes the new assumed location and the computation is repeated.

One drawback of this approach is that the dot is not well tracked near the edge of the laptop display. Reducing the search field and reference window allows better discrimination between the dot and display edges while reducing the tolerance to rapid dot movement.

3.4 Pupil/Limbus Tracking

A two-step process is used to locate the limbus (iris-sclera boundary) and hence the pupil center in an eye image.
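The calibration fit of Equation (1) reduces to ordinary least squares on six monomial terms. A minimal numpy sketch (function and variable names are ours, not from the system described here):

```python
import numpy as np

def fit_gaze_map(pupil, scene):
    """Fit Equation (1): second-order polynomial from pupil (x, y)
    to scene (sx, sy) via least squares.
    pupil, scene: (n, 2) arrays of corresponding calibration samples."""
    x, y = pupil[:, 0], pupil[:, 1]
    # design matrix of monomials: 1, x, y, xy, x^2, y^2
    Phi = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    a, *_ = np.linalg.lstsq(Phi, scene[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(Phi, scene[:, 1], rcond=None)
    return a, b

def map_gaze(a, b, x, y):
    """Map one pupil-center sample to scene coordinates."""
    phi = np.array([1.0, x, y, x * y, x * x, y * y])
    return phi @ a, phi @ b
```

With nine or more well-spread calibration points the design matrix is full rank and the six coefficients per axis are uniquely determined.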
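The greedy dot search of Section 3.3.2 can be sketched as a hill climb over the sum-of-differences score of Equation (2); the window and field sizes follow the text (differences over −8 < i, j < 8, a 5×5 search field), while the image and helper names are our own:

```python
import numpy as np

def sum_of_differences(img, x, y, r=7):
    """Equation (2) score: large when (x, y) is a bright pixel
    surrounded by darker ones, over a (2r+1) x (2r+1) window."""
    win = img[y - r:y + r + 1, x - r:x + r + 1]
    return win.size * float(img[y, x]) - float(win.sum())

def track_dot(img, start, r=7, field=2, max_iter=100):
    """Greedy local search over a (2*field+1)^2 field (5x5 here),
    starting from the dot location in the previous frame."""
    x, y = start
    for _ in range(max_iter):
        best = max(
            ((sum_of_differences(img, x + i, y + j, r), x + i, y + j)
             for i in range(-field, field + 1)
             for j in range(-field, field + 1)),
            key=lambda t: t[0])
        if (best[1], best[2]) == (x, y):  # current location is the maximum
            return x, y
        x, y = best[1], best[2]
    return x, y
```

As in the paper, this relies on temporal coherence: the previous frame's result must start the search close enough to the (blurred) dot for the climb to reach it.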
Figure 5: Constrained search for limbic feature points: (a) constrained ray origin and termination point; (b) resultant rays, fitted ellipse, and center. For clarity of presentation only 36 rays are displayed in (b); in practice 360 feature points are identified.
Figure 6: Display of fitted ellipse and computed gaze point.
First, feature points are detected. Second, an ellipse is fit to the feature points. The ellipse center is a good estimate of the pupil center.

3.4.1 Feature Detection

The purpose of feature detection is to identify point locations on the limbus. We use a technique similar to Starburst [Li et al. 2005]. A candidate feature point is found by casting a ray R away from an origin point O and terminating the ray as it exits a dark region. We determine if the ray is exiting a dark region by checking the gradient magnitude collinear with the ray. The location with maximum collinear gradient component max ∇ is recorded as a feature point. Starburst used a fixed threshold value rather than the maximum and did not constrain the length of the rays.

Consistent and accurate feature point identification and selection is critical for stable and accurate eye tracking. Erroneous feature points are often located at the edges of the pupil, eyelid, or at a specular reflection. To mitigate these effects the feature point search area is constrained by further exploiting temporal coherence. The limbic boundary is not expected to move much from one frame to the next; therefore it is assumed that feature points will be near the ellipse E identified in the previous frame. If P is the intersection of ray R and ellipse E, the search is constrained according to:

max ∇(O + α(P − O) : 0.8 < α < 1.2),   (3)

as depicted in Figure 5. For the first frame in the video we use the eye model manually aligned at initialization to determine P.

3.4.2 Ellipse Fitting and Evaluation

Ellipses are fit to the set of feature points using linear least squares minimization (e.g., [Lancaster and Šalkauskas 1986]). This method will generate ellipses even during blinks when no valid ellipse is attainable. In order to detect these invalid ellipses we implemented an ellipse evaluation method.

Each pixel that the ellipse passes through is labeled as acceptable or not depending upon the magnitude and direction of the gradient at that pixel. The percentage of acceptable pixels is computed and included in the output as a confidence measure.

3.4.3 Recovery From Failure

The ellipse fitting algorithm occasionally fails to identify a valid ellipse due to blinks or other occlusions. Reliance on temporal coherence can prevent the algorithm from recovering from such situations. To mitigate this problem we incorporated both manual and automatic recovery strategies. Automatic recovery relies on ellipse evaluation: if an ellipse evaluates poorly, it is not used to constrain the search for feature points in the subsequent frame. Instead, we revert to using the radius of the eye model as determined at initialization, in conjunction with the center of the last good ellipse. Sometimes this automatic recovery is insufficient to provide a good fit, however. Manual recovery is provided by displaying each fitted ellipse on the screen. If the user observes drift in the computed ellipse, the center may be nudged to the correct location using a simple drag and drop action.

These strategies are analogous to traditional keyframing operations, e.g., when match-moving. If a feature tracker fails to track a given pixel pattern, manual intervention is required at specific frames. The result is a semi-automatic combination of manual trackbox positioning and automatic trackbox translation. Although not as fast as a fully automatic approach, this is still considerably better than the fully manual, frame-by-frame alternative. A screenshot of the user interface is shown in Figure 6.

3.4.4 Tracking Accuracy

The DejaView camera has approximately a 60° field of view, with video resolution of 320×240. Therefore a simple multiplication by 0.1875 converts our measurement in pixels of Euclidean distance between gaze point and calibration coordinates to degrees visual angle. Using this metric, the eye tracker's horizontal accuracy is better than 2°, on average [Ryan et al. 2008]. Vertical and horizontal accuracy is roughly equivalent.

3.5 Fixation Detection

After mapping eye coordinates to scene coordinates via Equation (1), the collected gaze points and timestamps x = (x, y, t) are analyzed to detect fixations in the data stream. Prior to this type of analysis, raw eye movement data is not very useful as it represents a conjugate eye movement signal, composed of a rapidly changing component (generated by fast saccadic eye movements) and a comparatively stationary component representative of fixations, the eye movements generally associated with cognitive processing.

There are two leading methods for detecting fixations in the raw eye movement data stream: the position-variance and velocity-based approaches. The former defines fixations spatially, with centroid and variance indicating spatial distribution [Anliker 1976]. If the variance of a given point is above some threshold, then that point is considered outside of any fixation cluster and is considered to be part of a saccade. The latter approach, which could be considered a dual of the former, examines the velocity of a gaze point, e.g., via differential filtering,

ẋi = (1/∆t) ∑j=0..k xi+j gj,   i ∈ [0, n − k),

where k is the filter length and ∆t = tk − ti. A 2-tap filter with coefficients gj = {1, −1}, while noisy, can produce acceptable results. The point xi is considered to be a saccade if the velocity ẋi is above threshold [Duchowski et al. 2002]. It is possible to combine these methods by either checking the two threshold detector outputs (e.g., for agreement) or by deriving state-probability estimates, e.g., via Hidden Markov Models [Salvucci and Goldberg 2000].

In the present implementation, fixations are identified by a variant of the position-variance approach, with a spatial deviation threshold of 19 pixels and number of samples set to 10 (the fixation analysis code is freely available on the web¹).
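The 2-tap differential filter described above amounts to a finite difference on successive gaze samples. A sketch (the threshold and data below are illustrative, not the study's values):

```python
import numpy as np

def detect_saccades(gaze, dt, threshold):
    """Flag inter-sample velocities above threshold, using the 2-tap
    filter g = {1, -1}: v_i = (x_i - x_{i+1}) / dt.
    gaze: (n, 2) array of (x, y); returns a boolean mask of length n-1
    (sign is irrelevant, since only the speed magnitude is thresholded)."""
    diffs = gaze[:-1] - gaze[1:]
    speed = np.hypot(diffs[:, 0], diffs[:, 1]) / dt
    return speed > threshold
```

Samples not flagged as saccadic can then be clustered into fixations, e.g., by the position-variance criterion the implementation actually uses.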
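For the ellipse fit of Section 3.4.2, one common linear least-squares formulation (a sketch of ours, not necessarily the exact parameterization used in the system) solves for conic coefficients and recovers the center from the zero of the conic's gradient:

```python
import numpy as np

def fit_ellipse_center(pts):
    """Least-squares conic fit a x^2 + b xy + c y^2 + d x + e y = 1,
    then the center from grad F = 0. Assumes the conic does not pass
    through the origin (the constant term is normalized to 1).
    pts: (n, 2) array of feature points, n >= 5."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x * x, x * y, y * y, x, y])
    a, b, c, d, e = np.linalg.lstsq(A, np.ones_like(x), rcond=None)[0]
    # grad F = 0:  [2a b; b 2c] [x0 y0]^T = [-d -e]^T
    x0, y0 = np.linalg.solve([[2 * a, b], [b, 2 * c]], [-d, -e])
    return x0, y0
```

Because the fit is linear in the conic coefficients, it is cheap enough to run per frame; the paper's ellipse evaluation step then decides whether the result is trustworthy.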
Note that this approach is independent of frame rate, so long as each gaze point is listed with its timestamp, unlike a previous approach where fixation detection was tied to the video frame rate [Munn et al. 2008].

The sequence of detected fixations can be processed to gain insight into the attentional deployment strategy employed by the wearer of the eye tracking apparatus. A common approach is to count the number of fixations observed over given Areas Of Interest, or AOIs, in the scene. To do so in dynamic media, i.e., over video, it is necessary to track the AOIs as their apparent position in the video translates due to camera movement.

3.6 Feature Tracking

By tracking the movement of individual features it is possible to approximate the movement of identified AOIs. We allow the user to place trackboxes at any desired feature in the scene. The trackbox then follows the feature as it translates from frame to frame. This is similar in principle to the match-moving tracker window in common compositing software packages (e.g., Apple's Shake [Paolini 2006]). Locations of trackboxes are written to the output data file along with corresponding gaze coordinates. We then post-process the data to compute fixation and AOI information from gazepoint and trackbox data.

The user places a trackbox by clicking on the trackbox symbol, dragging and dropping it onto the desired feature. A user may place as many trackboxes as desired. For our study trackboxes were placed at the corners of each monitor.

Feature tracking is similar to that used for tracking the calibration dot, with some minor adaptations. Computation is reduced by precomputing a summed area table S [Crow 1984]. The value of any pixel in S stores the sum of all pixels above and to the left of the corresponding pixel in the original image,

S(x, y) = ∑i ∑j I(i, j),   0 < i < x, 0 < j < y.   (4)

Computation of the summation table is efficiently performed by a dynamic programming approach (see Algorithm 1).

for (y = 0 to h)
  sum = 0
  for (x = 0 to w)
    sum = sum + I(x, y)
    S(x, y) = sum + S(x, y − 1)

Algorithm 1: Single-pass computation of summation table.

The summation table is then used to efficiently compute the average pixel value within the reference window of the trackbox (see Figure 7). As in dot tracking, a 5×5 search field is used within an 8×8 reference window. Equation (2) is now replaced with I(x, y) − µ, where

µ = ((S(A) + S(B)) − (S(C) + S(D))) / (p × q).

Figure 7: AOI trackbox with corners labeled (A, B, C, D).
Figure 8: Trackboxes t1, t2, t3, AOIs A, B, ..., I, and fixation x.

Trackable features include both bright spots and dark spots in the scene image. For a bright spot, I(x, y) − µ is maximum at the target location. Dark spots produce minima at target locations. Initial placement of the trackbox determines whether the feature to be tracked is a bright or dark spot, based on the sign of the initial evaluation of I(x, y) − µ.

Some features cannot be correctly tracked because they exit the camera field. For this study three trackboxes were sufficient to properly track all areas of interest within the scene viewed by participants in the study. Extra trackboxes were placed and the three that appeared to be providing the best track were selected manually. Our implementation output a text file and a video. The text file contained one line per frame of video. Each line included a frame number, the (x, y) coordinates of each trackbox, the (x, y) coordinates of the corresponding gaze point, and a confidence number. See Figure 6 for a sample frame of the output video. Note the frame number in the upper left corner.

The video was visually inspected to determine frame numbers for the beginning and end of stimulus presentation, and the most usable trackboxes. Text files were then manually edited to remove extraneous information.

3.7 AOI Labeling

The most recent approach to AOI tracking used structure from motion to compute 3D information from eye gaze data [Munn and Pelz 2008]. We found such complex computation unnecessary because we did not need 3D information. We only wanted analysis of fixations in AOIs. While structure from motion is able to extract 3D information including head movement, it assumes a static scene. Our method makes no such assumption: AOIs may move independently from the observer, and independently from each other. Structure from motion can, however, handle some degree of occlusion that our approach does not. Trackboxes are unable to locate any feature that becomes obstructed from view.

AOI labeling begins with the text files containing gaze data and trackbox locations as described above. The text files were then automatically parsed and fed into our fixation detection algorithm. Using the location of the trackboxes at the end of fixation, we were able to assign AOI labels to each fixation. For each video a short program was written to apply translation, rotation, and scaling before labeling the fixations, with selected trackboxes defining the local frame of reference.

¹ The position-variance fixation analysis code was originally made available by LC Technologies. The original fixfunc.c can still be found on Andrew R. Freed's eye tracking web page: <http://freedville.com/professional/thesis/eyetrack-readme.html>. The C++ interface and implementation ported from C by Mike Ashmore are available at: <http://andrewd.ces.clemson.edu/courses/cpsc412/fall08>.
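Algorithm 1 and the window-mean computation can be reproduced with a prefix-sum sketch; the inclusion-exclusion below is the standard four-corner lookup, and the naming is ours:

```python
import numpy as np

def summed_area_table(img):
    """S holds, at each pixel, the sum of all pixels above and to the
    left (inclusive), per Equation (4) / Algorithm 1."""
    return img.cumsum(axis=0).cumsum(axis=1)

def window_mean(S, y0, x0, y1, x1):
    """Mean pixel value over img[y0:y1+1, x0:x1+1] in O(1) time;
    this is the quantity the trackbox evaluation calls mu."""
    total = S[y1, x1]
    if y0 > 0:
        total -= S[y0 - 1, x1]
    if x0 > 0:
        total -= S[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += S[y0 - 1, x0 - 1]
    return total / ((y1 - y0 + 1) * (x1 - x0 + 1))
```

Once the table is built, every candidate trackbox position costs four lookups instead of a full window sum, which is what makes the per-frame search cheap.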
The programs varied slightly depending upon which trackboxes were chosen. For example, consider a fixation detected at location x, with trackboxes t1, t2, t3, and AOIs A, B, ..., I as illustrated in Figure 8. Treating t1 as the origin of the reference frame, trackboxes t2 and t3 as well as the fixation x are translated to the origin by subtracting the coordinates of trackbox t1. Following translation, the coordinates of trackbox t2 define the rotation angle, θ = tan−1(t2y/t2x). A standard rotation matrix is used to rotate fixation point x to bring it into alignment with the horizontal x-axis. Finally, if trackbox t3 is located two-thirds across and down the panel display, then the fixation coordinates are scaled by 2/3. The now axis-aligned and scaled fixation point x is checked for which third of the axis-aligned box it is positioned in, and the appropriate label is assigned. Note that this method of AOI tracking is scale- and 2D-rotationally-invariant. It is not, however, invariant to shear resulting from feature rotation in 3D (e.g., perspective rotation).

Figure 9: Labeling AOIs. Trackboxes, usually at image corners, are used to maintain position and orientation of the 9-window display panel; each of the AOIs is labeled in sequential alphanumeric order from top-left to bottom-right—the letter 'O' is used to record when a fixation falls outside of the display panels. In this screenshot, the viewer is looking at the purple flower field.

Following fixation localization, another text file is then output with one line per fixation. Each line contains the subject number, stimulus identifier, AOI label, and fixation duration. This information is then reformatted for subsequent statistical analysis by the statistical package used (R in this case).

4 Applied Example

In an experiment conducted to better understand the potential health benefits of images of nature in a hospital setting, participants' gaze was recorded along with physiological and self-reported psychological data.

Eye Movement Analysis. For analysis of fixations within AOIs, trackboxes were placed at the corners of the 3×3 panel display in the scene video. All 9 AOIs were assumed to be equally-sized connected rectangles (see Figure 9). Trackboxes were used to determine AOI position, orientation, and scale. Out-of-plane rotation was not considered. Trackboxes on the outside corners of the 3×3 grid were preferred. Otherwise linear interpolation was used to determine exterior boundaries of the grid.

Stimulus. Using the prospect-refuge theory of landscape preference [Appleton 1996], four different categories of images (see Figure 10) were viewed by participants before and after undergoing a pain stressor (hand in ice water for up to 120 seconds). A fifth group of participants (control) viewed the same display wall (see below) with the monitors turned off.

Apparatus, Environment, & Data Collected. Participants viewed each image on a display wall consisting of nine video monitors arranged in a 3×3 grid. Each of the nine video monitors' display areas measured 36″ wide × 21″ high, with each monitor framed by a 1/2″ black frame for an overall measurement of 9′ wide × 5′3″ high.

The mock patient room measured approximately 15.6′ × 18.6′. Participants viewed the display wall from a hospital bed facing the monitors. The bed was located approximately 5′3″ from the display wall with its footboard measuring 3.6′ high off the floor (the monitors were mounted 3′ from the floor). As each participant lay on the bed, their head location measured approximately 9.6′ from the center of the monitors. Given these dimensions and distances, and using θ = 2 tan−1(r/(2D)) to represent visual angle, with r = 9′ and D = 9.6′, the monitors subtended θ = 50.2° visual angle.

Pain perception, mood, blood pressure, and heart rate were continually assessed during the experiment. Results from these measurements are omitted here; they are mentioned to give the reader a sense of the complete procedure employed in the experiment.

Procedure. Each participant was greeted and asked to provide documentation of informed consent. After situating themselves on the bed facing the display wall, each participant involved in the eye tracking portion of the study donned the wearable eye tracker. A laptop was then placed in front of them on a small rolling table and the participant was asked to view the calibration dot sequence. Following calibration, each participant viewed the image stimulus (or blank monitors) for two minutes as timed by a stopwatch.

Subjects. 109 healthy college students took part in the study, with a small subsample (21) participating in the eye tracking portion.

Experimental Design. The study used a mixed randomized design. Analysis of recorded gaze points by participants wearing the eye tracker was performed based on a repeated-measures design where the set of fixations generated by each individual was treated as the within-subjects fixed factor.

Discarded Data. Four recordings were collected over each of four stimulus images, with four additional recordings displaying no image as control. There was one failed attempt to record data over the purple flower field stimulus. A replacement recording was made. There were 21 sessions in all.

Ten recordings were discarded during post-processing because video quality prohibited effective eye tracking. In each of these videos some combination of multiple factors rendered them unusable. These factors included heavy mascara, eyelid occlusion, frequent blinking, low contrast between iris and sclera, poor positioning of eye cameras, and calibration dots not in the field of view. We successfully processed 2 control, 4 yellow field, 1 tree, 2 fire, and 2 purple flower field videos.

Poor camera positioning could have been discovered and corrected if the cameras provided real-time video feedback. Our hardware did not support online processing. Online processing could have provided additional feedback allowing for detection and mitigation of most other video quality issues.

5 Results

Using AOIs and image type as fixed factors (with participant as the random factor [Baron and Li 2007]), repeated-measures two-way ANOVA indicates a marginally significant main effect of AOI on fixation duration (F(9,1069) = 2.08, p < 0.05, see Figure 11).²

² Assuming sphericity as computed by R.
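The translate-rotate-scale labeling of Section 3.7 can be sketched as follows (a hypothetical helper of ours; the original per-video programs are not available). Trackbox t1 acts as the origin, t2 fixes the rotation, and t3 is assumed, as in the paper's example, to sit two-thirds across and down the panel:

```python
import math

def rotate(p, ang):
    """Rotate point p = (x, y) by ang radians about the origin."""
    c, s = math.cos(ang), math.sin(ang)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def label_fixation(t1, t2, t3, fix):
    """Assign an AOI label A-I (or 'O' for outside the panel) to a
    fixation, given trackboxes t1 (origin), t2 (on the x-axis), and
    t3 (two-thirds across and down the 3x3 panel)."""
    # translate so t1 is the origin
    t2 = (t2[0] - t1[0], t2[1] - t1[1])
    t3 = (t3[0] - t1[0], t3[1] - t1[1])
    f = (fix[0] - t1[0], fix[1] - t1[1])
    # rotate so t2 lies on the horizontal axis
    theta = math.atan2(t2[1], t2[0])
    t3 = rotate(t3, -theta)
    f = rotate(f, -theta)
    # scale so the panel spans [0,1] x [0,1]; t3 marks (2/3, 2/3)
    u, v = f[0] / (t3[0] / (2 / 3)), f[1] / (t3[1] / (2 / 3))
    if not (0 <= u <= 1 and 0 <= v <= 1):
        return 'O'
    col, row = min(int(u * 3), 2), min(int(v * 3), 2)
    return "ABCDEFGHI"[row * 3 + col]
```

As the paper notes, this labeling is invariant to 2D rotation and scale of the tracked panel, but not to shear from out-of-plane rotation.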
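The reported visual angle follows directly from θ = 2 tan⁻¹(r/(2D)); as a quick check (assuming, as the dimensions suggest, r = 9 ft and D = 9.6 ft):

```python
import math

# theta = 2 * atan(r / (2D)): display width r = 9 ft, viewing distance D = 9.6 ft
theta = math.degrees(2 * math.atan(9.0 / (2 * 9.6)))  # ~50.3 degrees
```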
Figure 10: Stimulus images: (a) yellow field: prospect (Getty Images), (b) tree: refuge (Getty Images), (c) fire: hazard (Getty Images), (d) purple flower field: mixed prospect and refuge (courtesy Ellen A. Vincent).
Figure 11: Comparison of mean fixation duration per AOI averaged over image types, with standard error bars.
Figure 12: Comparison of mean fixation duration per AOI and per image type, with standard error bars.
Averaging over image types, pair-wise t-tests with pooled SD indicate no significant differences in fixation durations between any pair of AOIs.

Repeated-measures ANOVA also indicates a significant main effect of image type on fixation duration (F(34,1044) = 1.78, p < 0.01), with the AOI × image interaction not significant (see Figure 12). Averaging over AOIs, pair-wise t-tests with pooled SD indicate significantly different fixation durations between the control image (blank screen) and the tree image (p < 0.01, with Bonferroni correction). No other significant differences were detected.

6 Discussion

Averaging over image types, the marginally significant difference in fixation durations over AOIs suggests that the longest durations tend to fall on the central AOIs (E and H). This simply suggests that viewers tend to fixate the image center. This is not unusual, particularly in the absence of a specific viewing task [Wooding 2002]. Post-hoc pair-wise comparisons failed to reveal significant differences, which is likely due to the relatively high variability of the data.

Averaging over AOIs shows that the tree image drew significantly shorter fixations than the control (blank) screen. Due to averaging, however, it is difficult to infer further details regarding fixation duration distributions over particular image regions. Cursory examination of Figure 12 suggests shorter fixations over the center panels (E & H), compared to the longer dwell times made when the screen was blank. Considering the averaging inherent in ANOVA, this could just mean that fixations are more evenly distributed over the tree image than over the blank display, where it is fairly clear that viewers mainly looked at the center panels. This may suggest a greater amount of visual interest offered by the tree image and a propensity of viewers to look around more when presented with a stimulus than when there is nothing of interest at all.

A similar observation could be made regarding fixation durations found over region C (upper right) for the purple flower field image, an image with which viewers perceived lower sensory pain compared to those who viewed other landscape images and no images, with statistical significance at α = 0.1 [Vincent et al. 2009]. However, the difference in fixation durations over region C is not significant according to the pair-wise post-hoc analysis.

7 Conclusion

A match-moving approach was presented to help automate analysis of eye movements collected by a wearable eye tracker. Technical contributions addressed video stream synchronization, pupil detection, eye movement analysis, and tracking of dynamic scene Areas Of Interest (AOIs). The techniques were demonstrated in the evaluation of eye movements on images of nature viewed by subjects participating in an experiment on the perception of well-being. Although descriptive statistics of gaze locations over AOIs failed to show significance of any particular AOI except the center, the methodology is applicable toward similar future studies.

References

Anliker, J. 1976. Eye Movements: On-Line Measurement, Analysis, and Control. In Eye Movements and Psychological Processes, R. A. Monty and J. W. Senders, Eds. Lawrence Erlbaum Associates, Hillsdale, NJ, 185–202.

Appleton, J. 1996. The Experience of Landscape. John Wiley & Sons, Ltd., Chichester, UK.
Babcock, J. S. and Pelz, J. B. 2004. Building a Lightweight Eyetracking Headgear. In ETRA '04: Proceedings of the 2004 Symposium on Eye Tracking Research & Applications. ACM, San Antonio, TX, 109–114.

Ballard, D. H., Hayhoe, M. M., and Pelz, J. B. 1995. Memory Representations in Natural Tasks. Journal of Cognitive Neuroscience 7, 1, 66–80.

Baron, J. and Li, Y. 2007. Notes on the use of R for psychology experiments and questionnaires. Online Notes. URL: <http://www.psych.upenn.edu/~baron/rpsych/rpsych.html> (last accessed December 2007).

Buswell, G. T. 1935. How People Look At Pictures. University of Chicago Press, Chicago, IL.

Crow, F. C. 1984. Summed-area tables for texture mapping. In SIGGRAPH '84: Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques. ACM, New York, NY, 207–212.

Duchowski, A., Medlin, E., Cournia, N., Gramopadhye, A., Nair, S., Vorah, J., and Melloy, B. 2002. 3D Eye Movement Analysis. Behavior Research Methods, Instruments, & Computers (BRMIC) 34, 4 (November), 573–591.

Freed, A. R. 2003. The Effects of Interface Design on Telephone Dialing Performance. M.S. thesis, Pennsylvania State University, University Park, PA.

Jacob, R. J. K. and Karn, K. S. 2003. Eye Tracking in Human-Computer Interaction and Usability Research: Ready to Deliver the Promises. In The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research, J. Hyönä, R. Radach, and H. Deubel, Eds. Elsevier Science, Amsterdam, The Netherlands, 573–605.

Lancaster, P. and Šalkauskas, K. 1986. Curve and Surface Fitting: An Introduction. Academic Press, San Diego, CA.

Land, M., Mennie, N., and Rusted, J. 1999. The Roles of Vision and Eye Movements in the Control of Activities of Daily Living. Perception 28, 11, 1307–1432.

Land, M. F. and Hayhoe, M. 2001. In What Ways Do Eye Movements Contribute to Everyday Activities? Vision Research 41, 25-26, 3559–3565. (Special Issue on Eye Movements and Vision in the Natural World, with most contributions to the volume originally presented at the 'Eye Movements and Vision in the Natural World' symposium held at the Royal Netherlands Academy of Sciences, Amsterdam, September 2000).

Li, D. 2006. Low-Cost Eye-Tracking for Human Computer Interaction. M.S. thesis, Iowa State University, Ames, IA. Techreport TAMU-88-010.

Li, D., Babcock, J., and Parkhurst, D. J. 2006. openEyes: A Low-Cost Head-Mounted Eye-Tracking Solution. In ETRA '06: Proceedings of the 2006 Symposium on Eye Tracking Research & Applications. ACM, San Diego, CA.

Li, D. and Parkhurst, D. 2006. Open-Source Software for Real-Time Visible-Spectrum Eye Tracking. In Conference on Communication by Gaze Interaction. COGAIN, Turin, Italy.

Li, D., Winfield, D., and Parkhurst, D. J. 2005. Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. In Vision for Human-Computer Interaction Workshop (in conjunction with CVPR).

Megaw, E. D. and Richardson, J. 1979. Eye Movements and Industrial Inspection. Applied Ergonomics 10, 145–154.

Morimoto, C. H. and Mimica, M. R. M. 2005. Eye Gaze Tracking Techniques for Interactive Applications. Computer Vision and Image Understanding 98, 4–24.

Munn, S. M. and Pelz, J. B. 2008. 3D point-of-regard, position and head orientation from a portable monocular video-based eye tracker. In ETRA '08: Proceedings of the 2008 Symposium on Eye Tracking Research & Applications. ACM, Savannah, GA, 181–188.

Munn, S. M., Stefano, L., and Pelz, J. B. 2008. Fixation-identification in dynamic scenes: Comparing an automated algorithm to manual coding. In APGV '08: Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization. ACM, New York, NY, 33–42.

Paolini, M. 2006. Apple Pro Training Series: Shake 4. Peachpit Press, Berkeley, CA.

Pelz, J. B., Canosa, R., and Babcock, J. 2000. Extended Tasks Elicit Complex Eye Movement Patterns. In ETRA '00: Proceedings of the 2000 Symposium on Eye Tracking Research & Applications. ACM, Palm Beach Gardens, FL, 37–43.

Reich, S., Goldberg, L., and Hudek, S. 2004. Deja View Camwear Model 100. In CARPE '04: Proceedings of the 1st ACM Workshop on Continuous Archival and Retrieval of Personal Experiences. ACM Press, New York, NY, 110–111.

Ryan, W. J., Duchowski, A. T., and Birchfield, S. T. 2008. Limbus/pupil switching for wearable eye tracking under variable lighting conditions. In ETRA '08: Proceedings of the 2008 Symposium on Eye Tracking Research & Applications. ACM, New York, NY, 61–64.

Salvucci, D. D. and Goldberg, J. H. 2000. Identifying Fixations and Saccades in Eye-Tracking Protocols. In ETRA '00: Proceedings of the 2000 Symposium on Eye Tracking Research & Applications. ACM, Palm Beach Gardens, FL, 71–78.

Smeets, J. B. J., Hayhoe, H. M., and Ballard, D. H. 1996. Goal-Directed Arm Movements Change Eye-Head Coordination. Experimental Brain Research 109, 434–440.

Vincent, E., Battisto, D., Grimes, L., and McCubbin, J. 2009. Effects of nature images on pain in a simulated hospital patient room. Health Environments Research and Design. In press.

Webb, N. and Renshaw, T. 2008. Eyetracking in HCI. In Research Methods for Human-Computer Interaction, P. Cairns and A. L. Cox, Eds. Cambridge University Press, Cambridge, UK, 35–69.

Wooding, D. 2002. Fixation Maps: Quantifying Eye-Movement Traces. In Proceedings of ETRA '02. ACM, New Orleans, LA.