This paper discusses a new application of eye-tracking, namely power management, and outlines its implementation in a personal computer system. Unlike existing power management technology, which “senses” a PC user through the keyboard and/or mouse, our technology “watches” the user through a single camera. The technology tracks the user’s eyes, keeping the display active only if the user looks at the screen. Otherwise it dims the display or even switches it off to save energy. We implemented the technology in hardware and present the results of its experimental evaluation.
2.2 User Presence Detection

The goal of this task is to determine from the camera readings whether or not the user is currently present in front of the display. To detect the user’s presence, we first localize the face search by applying background subtraction and skin-color segmentation to the RGB representation of the input image. Skin is defined by the following criteria [Douxchamps and Campbell 2008]: 0.55 < R < 0.85, 1.15 < R/G < 1.19, 1.15 < R/B < 1.5 and 0.6 < (R+G+B) < 1.8. To accelerate the face-area extraction, two additional filters are used. The first limits the size of the head to a reasonable range. The second verifies that the face contains a minimum of 25% skin-colored pixels. Thus, if the total number of pixels in the derived face area exceeds a given threshold, the user is assumed present.
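For illustration, the presence test reduces to counting moving skin-colored pixels. The Python sketch below implements the criteria above on normalized RGB values; the background-difference threshold and the minimum pixel count are hypothetical placeholders, since the paper leaves them parametric:

    import numpy as np

    # Placeholder thresholds: the paper leaves the background-difference
    # and presence thresholds parametric, so these values are illustrative.
    DIFF_THRESH = 0.1
    MIN_SKIN_PIXELS = 400

    def user_present(rgb, background):
        """Presence test: background subtraction, then skin-color segmentation.
        rgb and background are (H, W, 3) float arrays with channels in [0, 1]."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        eps = 1e-6  # guard against division by zero

        # Background subtraction: keep pixels that differ from the stored background.
        moving = np.abs(rgb - background).sum(axis=-1) > DIFF_THRESH

        # Skin-color criteria of [Douxchamps and Campbell 2008] used in the paper.
        skin = ((0.55 < r) & (r < 0.85) &
                (1.15 < r / (g + eps)) & (r / (g + eps) < 1.19) &
                (1.15 < r / (b + eps)) & (r / (b + eps) < 1.5) &
                (0.6 < (r + g + b)) & ((r + g + b) < 1.8))

        # The user is assumed present if enough moving skin-colored pixels remain.
        return np.count_nonzero(moving & skin) > MIN_SKIN_PIXELS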
Figure 1: Illustration of the SSR filter (six segments, numbered 1-3 in the top row and 4-6 in the bottom row)

2.3 Eye-Gaze Detection

The eye-gaze detector implements the algorithm proposed by [Kawato 2005], which scans the Six-Segment Rectangular (SSR) filter over the integral-image representation of the input image to find the Between-The-Eyes (BTE) pattern of a human face (Fig.1) and then searches regions 1 and 3, on the left and right sides of the BTE pattern, to locate the eyes. The algorithm does not depend on illumination, face occlusion or eye closure, and it is more stable, robust and less complex than other eye-tracking formulations. However, it is still very computationally demanding. To locate all faces in an image (without restriction on face size, motion and rotation), the algorithm scans the whole image six times, performing over 28M operations per 640x480 frame. Though such a full search might be necessary in some applications, it is redundant when tracking the eyes of a PC user.

In our eye-tracking application we can assume that:

1. The target object is a single PC user. The user sits in front of the PC at a relatively close distance of 50-70 cm.
2. The user’s motion is slow relative to the frame rate.
3. The background is stable and constant.

Based on these assumptions, we apply the following algorithmic optimizations to reduce eye-tracking complexity [Yamamoto and Moshnyaga 2009]:

• Fixed SSR filter size: when the user is 50-70 cm from the camera, a BTE interval of 55 pixels and a filter size ratio of 2:3 ensure minimal computational complexity at an almost 100% detection rate.
• Single SSR filter scan: this follows from the single-user assumption and the fixed SSR filter size.
• Pixel displacement of the SSR filter during the scan: experiments showed that the computational complexity decreases by a factor of 3 for a displacement of 2 pixels, and by a factor of 4.5 for a displacement of 3 pixels, without affecting the detection rate of the original (full-scan) algorithm.
• Low frame processing rate (5-10 fps): because the user motion is very slow, high processing rates are redundant.

Fig.2 shows the modified algorithm. For the first frame, or for any frame in which the search for a BTE candidate was unsuccessful, we search the image area reduced by background and skin-color extraction; otherwise the search is restricted to a small area (S) of ±8 pixels around the previously located BTE pattern. For the chosen area, the algorithm first transforms the green component of the corresponding image into the integral-image representation and then scans it with the SSR filter to select a BTE candidate. If a BTE candidate is found, the system takes it as the starting point to locate the eyes. If the eyes are detected, the user is assumed to be looking at the screen; otherwise not. If no BTE candidate is found, the user is considered to be not looking at the screen.

Figure 2: The modified eye-tracking algorithm (BTE candidates found by the SSR filter are confirmed by SVM before eye localization)

To detect a BTE pattern in an image, we scan the SSR filter over the search area S in row-first fashion and at each location compare the integral sums of the rectangular segments corresponding to the eyes, cheeks and nose (i.e. 1 and 2, 1 and 4, 3 and 2, and 3 and 6) as follows:

    Sum(1) < Sum(2) & Sum(1) < Sum(4)
    Sum(3) < Sum(2) & Sum(3) < Sum(6)        (2)

If the above criteria are satisfied, the SSR location is considered a candidate for the BTE pattern (i.e. a face), and two local-minimum (i.e. dark) points are extracted from regions 1 and 3 of the SSR for the left and right eyes, respectively.
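As a concrete sketch of this scan (the helper functions and names are ours; only the 30x20 filter size, the 2-pixel displacement and criteria (2) come from the paper):

    import numpy as np

    def integral_image(green):
        """Summed-area table with a zero row/column pad for easy rectangle sums."""
        ii = np.zeros((green.shape[0] + 1, green.shape[1] + 1), dtype=np.int64)
        ii[1:, 1:] = np.cumsum(np.cumsum(green, axis=0), axis=1)
        return ii

    def rect_sum(ii, x, y, w, h):
        """Sum of pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
        return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

    def find_bte_candidates(green, fw=30, fh=20, step=2):
        """Scan the SSR filter (segments laid out as [[1,2,3],[4,5,6]]) over the
        green plane in row-first order, returning (x, y) positions that satisfy
        criteria (2). fw, fh and step follow the sizes given in the paper."""
        ii = integral_image(green)
        cw, ch = fw // 3, fh // 2            # segment (cell) width and height
        candidates = []
        for y in range(0, green.shape[0] - fh, step):
            for x in range(0, green.shape[1] - fw, step):
                s1 = rect_sum(ii, x,          y,      cw, ch)  # left eye region
                s2 = rect_sum(ii, x + cw,     y,      cw, ch)  # between the eyes
                s3 = rect_sum(ii, x + 2 * cw, y,      cw, ch)  # right eye region
                s4 = rect_sum(ii, x,          y + ch, cw, ch)  # left cheek
                s6 = rect_sum(ii, x + 2 * cw, y + ch, cw, ch)  # right cheek
                # Criteria (2): eye regions darker than the BTE area and cheeks.
                if s1 < s2 and s1 < s4 and s3 < s2 and s3 < s6:
                    candidates.append((x, y))
        return candidates

In the full system, each candidate returned by such a scan is still confirmed by SVM before eye localization proceeds (Fig.2).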
The eye localization procedure is organized as a scan over the green-plane representation of regions 1 and 3 for a continuous segment of dark pixels (i.e. pixels whose value is lower than a threshold k). During the search for the eyes, we ignore 2 pixels at the border of the regions to avoid the effects of eyebrows, hair and beard. Also, because the eyebrows have almost the same grey level as the eyes, the search starts from the lowest positions of regions 1 and 3. Similarly to [Kawato 2000], we assume that the eyes are located if the distance (D) between the located eyes and the angle (A) at the center point of BTE area 2 (see Fig.3, left) satisfy the following: 30 < D < 42 & 115° < A < 180°. If both eyes are detected, the user’s gaze is considered to be on the screen. The eye positions found in the current frame are then used to reduce the complexity of processing the successive frames: the search in the next frame is limited to a small region, which spans 8 pixels in the vertical and horizontal directions around the eye points of the current frame.
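The geometric test can be written compactly. The sketch below assumes pixel coordinates for the two eye points and the center of BTE region 2, with the bounds 30 < D < 42 and 115° < A < 180° taken from the text; the angle computation is our reading of the criterion:

    import math

    def eyes_plausible(left_eye, right_eye, bte_center):
        """Geometric check in the spirit of [Kawato 2000]: accept the eye pair
        if the inter-eye distance D and the angle A subtended at the center of
        BTE region 2 satisfy 30 < D < 42 and 115 deg < A < 180 deg."""
        (lx, ly), (rx, ry), (cx, cy) = left_eye, right_eye, bte_center
        d = math.hypot(rx - lx, ry - ly)

        # Angle at the BTE center between the rays toward each eye.
        a_left = math.atan2(ly - cy, lx - cx)
        a_right = math.atan2(ry - cy, rx - cx)
        a = math.degrees(abs(a_left - a_right))
        if a > 180.0:
            a = 360.0 - a

        return 30.0 < d < 42.0 and 115.0 < a < 180.0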
Figure 3: An illustration of the eye detection heuristics (left) and the search area reduction (right)

Fig.3 (right) demonstrates the search area reduction by our algorithm: the dashed line shows the area defined by background extraction; the dotted line depicts the area obtained by skin-color segmentation; the solid dark line shows the area around the BTE pattern found in the previous image frame; the white crosses show the computed locations of the eyes.

3 Implementation
We implemented the proposed PC display power management system in hardware. Fig.4 outlines the block diagram of the system.

Figure 4: System overview

The user tracking unit receives an RGB color image and outputs two logic signals, u1 and u0. If the user is detected in the image, the signal u0 is set to 1; otherwise it is 0. A zero value of u0 forces the voltage converter to shrink the backlight supply voltage to 0 Volts, dimming the display off. If the eye-gaze detector determines that the user looks at the screen, it sets u1=1. When both u0 and u1 are 1, the display operates as usual. If the user’s gaze has been off the screen for more than N consecutive frames, u1 becomes 0. If u0=1 and u1=0, the voltage converter lowers the input voltage (Vb) of the high-voltage inverter by ∆V. This voltage drop lowers the backlight luminance and so shrinks the power consumption of the display. Any on-screen gaze in this low-power mode restores the initial backlight luminance and returns the display to normal mode. However, if u0=0 and the backlight luminance has already reached the lowest level, the display is turned off.
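The hardware realizes this policy in logic; behaviorally it is a small state machine. The following Python sketch mirrors the policy just described, where V_MAX, DELTA_V and N are illustrative placeholders, since the paper keeps these values parametric:

    V_MAX, DELTA_V, N = 12.0, 2.0, 15  # illustrative values only

    class BacklightController:
        """Per-frame sketch of the backlight control policy described above."""
        def __init__(self):
            self.vb = V_MAX          # current backlight input voltage Vb
            self.off_frames = 0      # consecutive frames with gaze off screen

        def step(self, user_present, gaze_on_screen):
            if not user_present:          # u0 = 0: dim the display off
                self.vb = 0.0
                self.off_frames = 0
            elif gaze_on_screen:          # u0 = u1 = 1: restore normal mode
                self.vb = V_MAX
                self.off_frames = 0
            else:                         # u0 = 1; u1 drops after N frames
                self.off_frames += 1
                if self.off_frames > N:
                    self.vb = max(self.vb - DELTA_V, 0.0)
                    self.off_frames = 0   # one ∆V step per N-frame period
            return self.vb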
The user-tracking unit was realized on a single Xilinx FPGA board connected to a VGA camera through a parallel I/O interface (see [Moshnyaga et al. 2009] for details). The unit operates at a 48 MHz frequency and 3.3 V supply voltage and provides eye tracking at a 20 fps rate. Due to capacity limitations of the on-chip SRAM memory, input images were 160x120 pixels in size; the SSR filter was 30x20 pixels. The total power consumption of the design was 150 mW, which is 35 times less than a software implementation on a desktop PC [Moshnyaga et al. 2009].

4 Experimental Evaluation

4.1 Eye-Detection Accuracy

To evaluate the accuracy of the gaze detector, we ran four different tests, each conducted by a different user. The users were free to look at the camera/display, read from materials on the table, type text, wear eyeglasses, move, gesticulate or even leave the PC whenever they wanted. Fig.5 illustrates the detection results on 4 images. The + marks depict the positions where the system assumes the eyes to be. As we see, even though the lighting conditions of the faces vary, the results are correct. Ordinary pairs of glasses (see Fig.5, top row) have no bad effect on the performance for frontal faces. In some face orientations, however, the frame of a pair of glasses can hide a part of the eyeball, causing the system to lose the eye; or it sometimes takes an eyebrow or hair as an eye and tracks it in the following frames.

Figure 5: Examples of correct eye-detection

Table 1 summarizes the results. The second column gives the total number of frames considered in each test; the columns marked ‘True’ and ‘False’ give the number of true and false detections, respectively, for the positive and negative cases. The false positives correspond to cases in which one of the eyes is tracked on the eyebrow or on the hair near the eye. The false negatives reflect cases in which the user gazed off the screen (both eyes tracked on the eyebrows). The Accuracy column shows the ratio of true decisions to the total number of decisions made. As the tests showed, the eye-tracking accuracy of the proposed system is quite high (88% on average).

Table 1: Results of evaluation on test sequences

    Test     Frames   True pos.  False pos.  True neg.  False neg.  Accuracy (%)
    1        151      127        0           6          18          88
    2        240      149        1           65         25          89
    3        100      74         0           16         10          90
    4        180      142        4           18         24          84
    Average  167      123        1           26         19          88

4.2 Energy Reduction Efficiency

Next, we estimated the energy efficiency of the proposed camera-based power management system by measuring the total power consumption taken from the wall by the system itself and by the 17” IO-DATA TFT LCD display it controls. Fig.6 profiles the results measured per frame on a 100 sec (2000 frame) test. In the test, the user was present in front of the display (frames 1-299, 819-1491, 1823-2001); moved a little away from the display while still present in the camera view (frames 1300-1491); and stepped away from the PC, disappearing from the camera (frames 300-818, 1492-1822). The system was set to step down from the current power level if the eye gaze was continuously detected off the screen for more than 15 frames (i.e. almost 1 sec). The ACPI line in Fig.6 shows the power consumption level of the ACPI method.
Figure 6: Display power consumption per frame (power in Watts over the 2000-frame test, with the “no user”, “gaze off screen” and “gaze on screen” intervals and the constant ACPI level marked)

We see that our technology is very effective: it changes the display power according to the user behavior, dimming the display when the user’s gaze is off the screen and powering the display up when the user looks at it. Changing the brightness from one power level to another in our system takes only 20 ms, which is unnoticeable to the user. Fig.7 shows screenshots of the display and the corresponding power consumption levels (see the numbers displayed in the lower-right corner of the screenshots; the second row from the bottom shows the power).

Figure 7: Screenshots of the display and the corresponding power consumption: when the user looks at the screen, the screen is bright and the power is 35W (top picture); else the screen is dimmed and the power is 15.6W (bottom picture)

The total power overhead of the system is 960mW. Even though the system takes a little more power than ACPI (see the horizontal line in Fig.6) in active mode, it saves 36% of the total energy consumed by the display on this short test. In environments where users frequently divert their attention from the screen or leave computers unattended (e.g. school, university, office), the energy savings could be significant.

5 Conclusion

In this paper we presented a novel eye-tracking application, namely display power management, and outlined an implementation technology which makes the application viable. Experiments showed that camera-based display power management is more efficient than the currently used ACPI method due to its ability to adjust the display power adaptively to the viewer behavior. The application-specific algorithm optimizations and the hardware implementation of eye tracking allowed us to reduce the power overhead below 1W while satisfying the real-time and high-accuracy requirements of the application. This power could be reduced even further with a custom design.

In the current work we restricted ourselves to the simple case of monitoring a single user. However, when talking about monitoring in general, some critical issues arise. For instance, how should the technology behave when more than one person is looking at the screen? The user might not look at the screen while the others do. Concerning this point, we believe that a feasible solution is to keep the display active while there is someone looking at the screen. We are currently investigating this issue, as well as the influence of camera positioning, user gender/race, etc.
References

ACPI: Advanced Configuration and Power Interface Specification. 2004. Rev. 3.0, Sept. http://www.acpi.info/spec.htm

DAI, X., AND RAYCHANDRAN, K. 2003. Computer screen power management through detection of user presence. US Patent 6650322.

DOUXCHAMPS, D., AND CAMPBELL, N. 2008. Robust real time face tracking for the analysis of human behavior. In Machine Learning for Multimodal Interaction, LNCS 4892, 1-10.

Fujitsu-Siemens Report. 2007. Energy savings with personal computers. Fujitsu-Siemens Corp. http://www.fujitsu-simens.nl/aboutus/sor/energy_saving/prof_desk_prod.html

Global Citizenship Report. 2006. Hewlett-Packard Co. www.hp.com/hpinfo/globalcitizenship/gcreport/pdf/hp2006gcreport_lowres.pdf

KAWATO, S., AND OHYA, J. 2000. Two-step approach for real-time eye tracking with a new filtering technique. Proc. IEEE SMC, 1366-1371.

KAWATO, S., TETSUTANI, N., AND OSAKA, K. 2005. Scale-adaptive face detection and tracking in real time with SSR filters and support vector machine. IEICE Trans. Information & Systems, E88-D (12), 2857-2863.

MAHESRI, A., AND VARDHAN, V. 2005. Power consumption breakdown on a modern laptop. Proc. Power-Aware Computing Systems, LNCS 3471, 165-180.

MORIMOTO, C., AND MIMICA, M.R.M. 2004. Eye gaze tracking techniques for interactive applications. Computer Vision and Image Understanding 98 (1), 4-24.

MOSHNYAGA, V.G., HASIMOTO, K., SUETSUGU, T., AND HIGASHI, S. 2009. A hardware implementation of the user-centric display energy management. Proc. PATMOS 2009, LNCS 5953, 56-65.

PARK, W.I. 1999. Power saving in a portable computer. EU Patent EP0949557.

YAMAMOTO, S., AND MOSHNYAGA, V.G. 2009. Algorithm optimizations for low-complexity eye tracking. Proc. IEEE SMC, 18-22.