4. "I have data! I need a model!"
This Note contributes a method that supports model acquisition in HCI.
5. We focus on nonlinear regression models
Examples from HCI:
Choice: Hick-Hyman law, T = b log2(n + 1)   (1)
Pointing: Fitts' law, T = a + b log2(2A/W)   (2)
Learning: Power law of learning, T = a P^b + c   (3)
Stevens' power law
Foraging: Pirolli's patch model [18]. A user has to navigate from one patch to another (e.g., from one WWW site or search-engine result to another). Patches can have different gain rates, and the interface can impose a between-patch cost; g_i(t_W) is the cumulative gain in patch i after time t_W has been spent. In a well-organized interface there is a steep g_i and a quick depletion. If a user defines a policy for how long to stay within each patch, the total gain is
Σ_{i=1..P} λ_i T_B g_i(t_Wi) = T_B Σ_{i=1..P} λ_i g_i(t_Wi),   (15)
and substituting into Eq. 13 gives the average rate of gain
R = Σ_{i=1..P} λ_i g_i(t_Wi) / (1 + Σ_{i=1..P} λ_i t_Wi).   (16)
This is the patch model. The linear case is simple; cases where g_i are nonlinear are approached by Charnov's marginal value theorem, which says a forager should stay in a patch as long as the slope of g_i is above the average gain rate R for the environment.
(Slide background shows cropped excerpts of the source papers; recoverable metadata: keywords "Linear menus; User Performance; Mathematical models; Visual search"; ACM Classification H.5.2 Information Interfaces and Presentation: Miscellaneous.)
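As a concrete illustration of fitting one of these models (not from the Note; the pointing data below are invented), Fitts' law (Eq. 2) reduces to ordinary least squares once the index of difficulty is computed:

```python
import numpy as np

# Hypothetical pointing data: distance A, width W (px), movement time T (s)
A = np.array([64.0, 128.0, 256.0, 512.0, 256.0, 512.0])
W = np.array([16.0, 16.0, 16.0, 16.0, 32.0, 32.0])
T = np.array([0.55, 0.71, 0.86, 1.02, 0.70, 0.85])

ID = np.log2(2 * A / W)                      # index of difficulty (bits), Eq. 2
X = np.column_stack([np.ones_like(ID), ID])  # design matrix for T = a + b*ID
(a, b), *_ = np.linalg.lstsq(X, T, rcond=None)

pred = a + b * ID
r2 = 1 - np.sum((T - pred) ** 2) / np.sum((T - T.mean()) ** 2)
print(round(float(a), 3), round(float(b), 3), round(float(r2), 3))
```

The fit returns the intercept a, slope b (s/bit), and R² for this synthetic dataset.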
6. We focus on nonlinear regression models
(Same content as slide 5, annotated: such models are "white box" and efficient.)
7. We focus on nonlinear regression models
(Same content as slide 5, annotated: "white box", efficient.)
Applications in HCI:
1. Engineering models
2. Adaptive interfaces
3. Interface optimization
8. We focus on nonlinear regression models
(Same content as slide 5, annotated: "white box", efficient, but hard to acquire!)
Applications in HCI:
1. Engineering models
2. Adaptive interfaces
3. Interface optimization
10. Exploration is inefficient and laborious
The set of all possible models defined by your task
Unexplored model space
11. (Same as slide 10.)
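A back-of-envelope count shows why manual exploration does not scale. The assumptions here (single-predictor terms, transformation chains of length at most 2 drawn from the 16-transformation palette, no interactions) are mine, chosen only to illustrate the growth:

```python
from math import comb

def candidate_terms(m, palette=16, max_chain=2):
    """One predictor wrapped in an ordered chain of 0..max_chain transforms."""
    return m * sum(palette ** c for c in range(max_chain + 1))

def model_count(m, max_terms):
    """Models = nonempty subsets of candidate terms, up to max_terms terms."""
    terms = candidate_terms(m)
    return sum(comb(terms, k) for k in range(1, max_terms + 1))

print(candidate_terms(5))   # 5 * (1 + 16 + 256) = 1365 candidate terms
print(model_count(5, 3))    # 423,884,825 model structures already
```

Even under these restrictive assumptions, five predictors and three-term models yield hundreds of millions of candidate structures.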
12. We propose automated model search
Dataset + Constraints → Automated model search → Best models
It builds on work in symbolic programming [6, 15].
13. We propose automated model search
(As slide 12, showing the search as a Generate-Test loop between the dataset, the constraints, and the best models.)
14. Iterative search in a model space
Dependent variable y; predictor variables X = {x1, ..., xm}
y = β1 f1(X) + ... + βn fn(X)   ← Winner
15. Iterative search in a model space
(As slide 14, adding the starting model:)
y = β1 x1 + ... + βn xm   ← Start
16. Iterative search in a model space
(As slide 15, adding the search loop:)
y = β1 (xl ∘ xk) + ... + βn xm   ← Transform/Drop (∘ denotes an applied transformation)
Iterate, scoring each candidate with a fitness function.
17. Iterative search in a model space
(As slide 16.) Transformation classes: algebraic, exponential, logarithmic, and trigonometric; presently 16 transformations.
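The generate-test loop of slides 14-17 can be sketched as a tiny hill-climbing search (my own minimal reconstruction, not the authors' implementation; the drop move and most of the 16 transformations are omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset: y depends nonlinearly on x1, linearly on x2
x1 = rng.uniform(1, 10, 100)
x2 = rng.uniform(1, 10, 100)
y = 2.0 + 1.5 * np.log2(x1) + 0.5 * x2

TRANSFORMS = [np.log2, np.log10, np.sqrt, np.square]  # subset of the 16

def fitness(terms):
    """OLS-fit the betas for y = b0 + sum_j b_j*term_j; return R^2."""
    X = np.column_stack([np.ones_like(y)] + terms)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

terms = [x1, x2]                 # Start: y = b0 + b1*x1 + b2*x2
best = fitness(terms)
for _ in range(200):             # Iterate: transform one term, keep if better
    i = int(rng.integers(len(terms)))
    f = TRANSFORMS[int(rng.integers(len(TRANSFORMS)))]
    candidate = terms[:i] + [f(terms[i])] + terms[i + 1:]
    score = fitness(candidate)
    if score > best:
        terms, best = candidate, score
print(round(float(best), 3))     # best R^2 after the search
```

Starting from the plain linear model, the loop discovers that a logarithmic transform of x1 sharply improves the fit.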
25. Multiple controls offered
Constraints on the model space:
• Max. number of free parameters
• Transformations: types, number per term
• Seed equation
→ Your model space
26. Multiple controls offered
(As slide 25, adding controls over the search process:)
• Stochasticity
• Fitness function (e.g., R², AIC, BIC)
• Local search depth
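The fitness functions named on this slide can be computed for any OLS candidate. A minimal sketch (my own, with synthetic data; `fit_scores` is a hypothetical helper): AIC and BIC penalize the number of free parameters k, which plain R² does not.

```python
import numpy as np

def fit_scores(X, y):
    """OLS fit of y on X (intercept added); return (R2, AIC, BIC)."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    k = Xd.shape[1]                               # number of free parameters
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    ss_res = np.sum((y - Xd @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    aic = n * np.log(ss_res / n) + 2 * k          # Gaussian-likelihood form,
    bic = n * np.log(ss_res / n) + k * np.log(n)  # additive constants dropped
    return r2, aic, bic

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 50)
junk = rng.uniform(0, 1, 50)
y = 3 * x + rng.normal(0, 0.1, 50)

r2_1, aic_1, bic_1 = fit_scores(x[:, None], y)
r2_2, aic_2, bic_2 = fit_scores(np.column_stack([x, junk]), y)
print(r2_2 >= r2_1)   # R^2 never drops when a predictor is added
```

Because R² is monotone in the number of predictors, a penalized score such as BIC is the safer fitness choice when the search may add terms freely.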
29. Case 1: Comparison with 11 existing models in the literature
Datasets: mouse pointing (D, W), two-thumb tapping (ID, T_elapsed), menu selection (B, I, D, W, Fr), ...
30. Case 1: Comparison with 11 existing models in the literature
(As slide 29, noting that the datasets grow in numbers of predictors, observations, and model terms.)
31. Case 1: Comparison with 11 existing models in the literature
Improvements to fitness found in 7 out of 11 cases; comparable model fitness in the others.
32. Benchmarking results
Table 1. Benchmarking automatic modeling against previously published models of response time in HCI.

 #  Dataset                        Predictors*      n    k  Model provided in paper (Baseline)    R²**      Best model found*** (This paper)   R²
 1  Stylus tapping (1 oz) [8]      A, W             16   2  a + b log2(2A/W)                      .966      a + b log2(A/W)                    .966
 2  Reanalyzed data [8]            A, We                    a + b log2(A/We + 1)                  .987      a + b(log2(log2 A) − We)           .981
 3  Mouse pointing [8]             A, W             16   2  a + b log2(A/W + 1)                   .984      a + b log2(A/W)                    .973
 4                                 A, We                    a + b log2(A/We + 1)                  .980      a + b log10(A/We)                  .978
 5  Trackball dragging [8]         A, W             16   2  a + b log2(A/W + 1)                   .965      a + b log2(A − (W − 3)^4)          .981
 6                                 A, We                    a + b log2(A/We + 1)                  .817      a + b(A/(1 − e^(log10 We)))        .941
 7  Magic lens pointing [13]       A, W, S          16   3  a + b log2(D/S + 1) + c log2(S/2/A)   .88       a + b(1 − 1/A) + c W^9             .947
 8  Tactile guidance [7]           N, I, D          16   3  Eq. 8-9, nonlinear                    .91, .95  Nonlinear (k = 3)                  .980
 9  Pointing, angular [3], Exp. 2  W, H, α, A       310  4  Eq. 33, ID_pr, nonlinear              .953      Nonlinear (k = 4)                  .962
10  Two-thumb tapping [11]         ID, T_elapsed    20   6  Eq. 5-6, quadratic                    .79       a + b(T_elapsed² / ID)             .929
11  Menu selection [2]             B, I, D, W, Fr   10   6  Eq. 1-7, nonlinear                    .99, .52  Nonlinear (k = 6)                  .990

Notes: n = number of observations (data rows); k = number of free parameters; * all variable names from the original papers, except I is interface type (dummy coded); ** as reported in the paper; *** some equations omitted due to space restrictions.
(Excerpt from the paper's discussion, cropped on the slide:) ... to fixed terms. A second is deciding on a meaningful fitness score: we currently use R², but this can be changed to cross-validation metrics. A third is model diagnostics. For instance, the use of OLS assumes non-collinear predictors and homogeneous error variance [9]; the latter is probably an unrealistic assumption in many HCI datasets, and analytics are needed to examine the consequences. Fourthly, the equations are not ...

Observations:
1. Pointing datasets 1-6 provide the least room to improve, since the R²s are high to begin with.
2. The method is more successful when there are more predictors: the improvements obtained for datasets 7-11 range from small (8, 9, and 11) to medium (7) to large (10).

Constraining of model exploration
See the full table in the paper.
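The cross-validation alternative to R² mentioned in the discussion can be sketched as follows (a generic k-fold scorer of my own, not the authors' implementation; data are synthetic):

```python
import numpy as np

def cv_r2(X, y, folds=5, seed=0):
    """Mean out-of-fold R^2 for OLS with intercept (one guard against fishing)."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    Xd = np.column_stack([np.ones(n), X])
    scores = []
    for part in np.array_split(idx, folds):
        train = np.setdiff1d(idx, part)      # indices not in the held-out fold
        beta, *_ = np.linalg.lstsq(Xd[train], y[train], rcond=None)
        resid = y[part] - Xd[part] @ beta
        ss_tot = np.sum((y[part] - y[train].mean()) ** 2)
        scores.append(1 - np.sum(resid ** 2) / ss_tot)
    return float(np.mean(scores))

# Synthetic example: the true structure is logarithmic
rng = np.random.default_rng(2)
x = rng.uniform(1, 10, 80)
y = 1 + 2 * np.log2(x) + rng.normal(0, 0.05, 80)
print(round(cv_r2(np.log2(x)[:, None], y), 3))
```

Scoring candidates on held-out folds rather than in-sample R² directly addresses the "fishing" warning in the conclusion.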
33. (Same table and discussion as slide 32.)
35. Case 2: Complex dataset
Multitouch-rotation data [Hoggan et al., Proc. CHI '13]
Dependent variable: MT
Predictors: Angle, Diameter, X position, Y position, Direction
36. Case 2: Complex dataset
(As slide 35.) The search found a model with R² = 0.835; however, the method also found a model with seven free parameters and R² = 0.827. Also, a model with four parameters and R² = 0.805 was found:

a + b x1 + (nonlinear term in x0...x3 combining cos, exp, and log10; garbled on the slide) + d tan x3

Here, variables x0, ..., x3 refer to x-position, y-position, angle, and diameter, respectively. Further analysis is in the paper.
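Fitting such a free-form nonlinear candidate and reporting its R² can be sketched as follows. The model below is a hypothetical stand-in (the actual Case 2 equation is garbled on the slide), fitted with SciPy's `curve_fit` rather than the authors' tooling:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical four-parameter candidate (NOT the paper's Case 2 equation):
# a + b*x1 + c*log10(x3) + d*tan(x3/100)
def model(X, a, b, c, d):
    x1, x3 = X                                # e.g., y-position and diameter
    return a + b * x1 + c * np.log10(x3) + d * np.tan(x3 / 100.0)

rng = np.random.default_rng(3)
x1 = rng.uniform(0, 1, 60)
x3 = rng.uniform(10, 60, 60)                  # tan argument stays below pi/2
y = model((x1, x3), 1.0, 0.5, 0.8, 0.3) + rng.normal(0, 0.01, 60)

popt, _ = curve_fit(model, (x1, x3), y, p0=[0.0, 0.0, 0.0, 0.0])
pred = model((x1, x3), *popt)
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(len(popt), round(float(r2), 3))         # k = 4 free parameters
```

The parameter count k reported alongside R² in the slides corresponds to `len(popt)` here.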
39. Case 3: Theoretically motivated operations
Dataset → Model, with the operations restricted to theoretically motivated transformations.
For this constrained exploration, we took Dataset 11 and limited the transformations (1/x, log2(x), ×, /, +, −) to those of the original paper. Many models were found with R² = 0.90.
(The slide also previews modeling multiple datasets with a single model: in pointing papers, the model terms are shared, with parameters fitted per dataset. We tested a model covering three datasets (1, 3, and 5).)
41. Case 4: Multiple datasets, one model
Dataset 1 + Dataset 2 + Dataset 3 → one shared Model
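The one-model-many-datasets idea can be sketched as sharing the model form while refitting the parameters per dataset (my own minimal illustration with invented data, not the authors' code):

```python
import numpy as np

def fit_shared_form(datasets):
    """Fit T = a + b*log2(A/W) per dataset; shared form, per-dataset (a, b)."""
    results = []
    for A, W, T in datasets:
        X = np.column_stack([np.ones(len(T)), np.log2(A / W)])
        beta, *_ = np.linalg.lstsq(X, T, rcond=None)
        resid = T - X @ beta
        r2 = 1 - np.sum(resid ** 2) / np.sum((T - T.mean()) ** 2)
        results.append((beta, r2))
    return results

# Two invented devices obeying the same form with different parameters
A = np.array([64.0, 128.0, 256.0, 512.0])
W = np.array([16.0, 16.0, 16.0, 16.0])
mouse = (A, W, 0.20 + 0.15 * np.log2(A / W))
trackball = (A, W, 0.40 + 0.30 * np.log2(A / W))
for beta, r2 in fit_shared_form([mouse, trackball]):
    print(np.round(beta, 2), round(float(r2), 3))
```

The model structure is identified once; only the coefficients (a, b) vary across devices, mirroring how pointing papers reuse one Fitts-style form across conditions.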
42. Conclusion & Discussion
• Proof-of-concept
• Model identification by defining constraints
• Supports different modeling tasks in HCI
• Promising results
• Limitations and open questions
• E.g., assumptions of nonlinear modeling (see paper)
• “Brute force” approach
• Warning against “fishing”!
• Future work: performance and expressive controls
43. Project homepage (code forthcoming!)
http://www.mpi-inf.mpg.de/~oantti/nonlinearmodeling/
Acknowledgements: This research was funded by the Max Planck Centre for
Visual Computing and Communication and the Cluster of Excellence on Multimodal
Computing and Interaction at Saarland University.
antti.oulasvirta@aalto.fi
Automated Nonlinear Regression
Modeling for HCI
Take-away:
• Model identification by constraint definition