Lecture11

Introduction to sample size
and power calculations
How much chance do we have to
reject the null hypothesis when
the alternative is in fact true?
(what’s the probability of detecting
a real effect?)

Can we quantify how much
power we have for given
sample sizes?

Study 1: 263 cases, 1241 controls

Null distribution: difference=0.

Clinically relevant alternative: difference=10%.

Rejection region: any value >= 6.5 (0+3.3*1.96).
For 5% significance level, one-tail area=2.5% (Zα/2 = 1.96).

Power = chance of being in the rejection region if the alternative is true = area to the right of this line (in yellow).

Rejection region: any value >= 6.5 (0+3.3*1.96).

Power here:

$$P\left(Z > \frac{6.5 - 10}{3.3}\right) = P(Z > -1.06) \approx 85\%$$
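As a quick check of the arithmetic for study 1, here is a minimal Python sketch. It assumes the same normal approximation as the slide and takes the standard error (3.3 percentage points) as given; the variable names are illustrative and not part of the lecture.

```python
from statistics import NormalDist

# Values taken from the slide for study 1 (in percentage points).
null_difference = 0.0
standard_error = 3.3   # standard error of the difference, as given on the slide
z_alpha_2 = 1.96       # two-tailed 5% significance level
alternative = 10.0     # clinically relevant alternative difference

# Critical value: the null mean plus 1.96 standard errors (about 6.5).
critical_value = null_difference + standard_error * z_alpha_2

# Power: area of the alternative distribution to the right of the critical value.
z = (critical_value - alternative) / standard_error
power = 1 - NormalDist().cdf(z)

print(f"critical value = {critical_value:.2f}, z = {z:.2f}, power = {power:.2f}")
```

The result should land at roughly 85 to 86%, in line with the slide; small differences come from rounding the critical value to 6.5.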

Power= chance of being in the
rejection region if the alternative
is true=area to the right of this
line (in yellow)


Critical value=
0+10*1.96=20

Z α/2 =1.96

2.5% area
Power closer to
15% now.

Study 2: 18 treated, 72 controls, STD DEV = 2
Critical value=
0+0.52*1.96 = 1

Clinically relevant
alternative:
difference=4 points

Power is nearly
100%!

Study 2: 18 treated, 72 controls, STD DEV=10
Critical value=
0+2.58*1.96 = 5

Power is about
40%

Study 2: 18 treated, 72 controls, effect size=1.0

Critical value=
0+0.52*1.96 = 1

Power is about
50%
Clinically relevant
alternative:
difference=1 point

Factors Affecting Power
1. Size of the effect
2. Standard deviation of the characteristic
3. Bigger sample size
4. Significance level desired

1. Bigger difference from the null mean
[Figure: null distribution and clinically relevant alternative; x-axis: average weight from samples of 100]

2. Bigger standard deviation

3. Bigger sample size

4. Higher significance level
[Figure: rejection region]
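The four factors can be explored numerically. The sketch below is only an illustration under the normal approximation used in these slides (power = area of the alternative distribution beyond the upper critical value, ignoring the negligible opposite tail). The function name `power_two_means` and the scenario values are assumptions chosen for illustration, loosely based on study 2.

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(difference, sd, n1, n2, z_alpha_2=1.96):
    """Approximate power to detect a difference in means between two groups."""
    se = sqrt(sd**2 / n1 + sd**2 / n2)       # standard error of the difference
    critical_value = 0 + se * z_alpha_2      # edge of the rejection region under the null
    z = (critical_value - difference) / se   # critical value on the alternative's Z scale
    return 1 - NormalDist().cdf(z)           # area to the right of the critical value

baseline = power_two_means(difference=1.0, sd=2.0, n1=18, n2=72)
bigger_effect = power_two_means(difference=4.0, sd=2.0, n1=18, n2=72)
bigger_sd = power_two_means(difference=1.0, sd=10.0, n1=18, n2=72)
bigger_n = power_two_means(difference=1.0, sd=2.0, n1=72, n2=288)
# z = 1.645 corresponds to a two-tailed significance level of 10% instead of 5%.
higher_alpha = power_two_means(difference=1.0, sd=2.0, n1=18, n2=72, z_alpha_2=1.645)

for label, value in [("baseline", baseline), ("bigger effect", bigger_effect),
                     ("bigger SD", bigger_sd), ("bigger n", bigger_n),
                     ("higher alpha", higher_alpha)]:
    print(f"{label:14s} power = {value:.2f}")
```

Power rises with a bigger effect, a bigger sample, or a higher significance level, and falls with a bigger standard deviation.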


Sample size calculations




Based on these elements, you can write
a formal mathematical equation that
relates power, sample size, effect size,
standard deviation, and significance
level…
**WE WILL DERIVE THESE
FORMULAS FORMALLY SHORTLY**

Simple formula for difference in means

$$n = \frac{2\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2}$$

where :
n = sample size in each group (assumes equal sized groups)
σ = standard deviation of the outcome variable
Zβ = represents the desired power (typically .84 for 80% power)
Zα/2 = represents the desired level of statistical significance (typically 1.96)
difference = effect size (the difference in means)

Simple formula for difference in proportions

$$n = \frac{2\,\bar{p}(1-\bar{p})(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

where :
n = sample size in each group (assumes equal sized groups)
p̄(1 − p̄) = a measure of variability (similar to standard deviation)
Zβ = represents the desired power (typically .84 for 80% power)
Zα/2 = represents the desired level of statistical significance (typically 1.96)
p1 − p2 = effect size (the difference in proportions)

Derivation of sample size
formula….

Study 2: 18 treated, 72 controls, effect size=1.0
Critical value= 0+.52*1.96=1

Power close to 50%

SAMPLE SIZE AND POWER FORMULAS
Critical value = 0 + standard error(diff) * Zα/2

Power = area to the right of Zβ, where

$$Z_\beta = \frac{\text{critical value} - \text{alternative difference (here = 1)}}{\text{standard error(diff)}}$$

e.g. here:

$$Z_\beta = \frac{1 - 1}{.52} = 0; \quad \text{power} = 50\%$$

Power = area to the right of Zβ:

$$Z_\beta = \frac{\text{critical value} - \text{alternative difference}}{\text{standard error(diff)}}$$

$$Z_\beta = \frac{Z_{\alpha/2} \cdot \text{standard error(diff)} - \text{difference}}{\text{standard error(diff)}}$$

$$Z_\beta = Z_{\alpha/2} - \frac{\text{difference}}{\text{standard error(diff)}}$$

$$-Z_\beta = \frac{\text{difference}}{\text{standard error(diff)}} - Z_{\alpha/2}$$

Power is the area to the right of Zβ, OR power is the area to the left of -Zβ. Since normal charts give us the area to the left by convention, we need to use -Zβ to get the correct value. Most textbooks just call this "Zβ"; I'll use the term Z power to avoid confusion.

$$Z_{\text{power}} = -Z_\beta$$

→ the area to the left of Z power = the area to the right of Zβ

All-purpose power formula…

$$Z_{\text{power}} = \frac{\text{difference}}{\text{standard error(difference)}} - Z_{\alpha/2}$$
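The all-purpose formula translates directly into code. This is a minimal sketch, assuming the standard error is already known; `power_from_z` is an illustrative name, and the two calls reuse the study 1 and study 2 numbers from earlier slides.

```python
from statistics import NormalDist

def power_from_z(difference, standard_error, z_alpha_2=1.96):
    """Power = area to the left of Z_power on a standard normal curve."""
    z_power = difference / standard_error - z_alpha_2
    return NormalDist().cdf(z_power)

# Study 1: difference of 10 percentage points, standard error 3.3.
study1_power = power_from_z(10.0, 3.3)
# Study 2: difference of 1 point, standard error 0.52.
study2_power = power_from_z(1.0, 0.52)

print(f"study 1 power = {study1_power:.2f}")
print(f"study 2 power = {study2_power:.2f}")
```

These should come out at roughly 85% and close to 50%, in line with the figures quoted for the two studies.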

Derivation of a sample size formula…

$$\text{s.e.(diff)} = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}$$

Sample size is embedded in the standard error….

If ratio r of group 2 to group 1:

$$\text{s.e.(diff)} = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{r n_1}}$$

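Algebra…

$$Z_{\text{power}} = \frac{\text{difference}}{\sqrt{\dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{r n_1}}} - Z_{\alpha/2}$$

$$\therefore Z_{\text{power}} = \frac{\text{difference}}{\sqrt{\dfrac{(r+1)\sigma^2}{r n_1}}} - Z_{\alpha/2}$$

$$(Z_{\text{power}} + Z_{\alpha/2})^2 = \left(\frac{\text{difference}}{\sqrt{\dfrac{(r+1)\sigma^2}{r n_1}}}\right)^2$$

$$(r+1)\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2 = r n_1\,\text{difference}^2$$

$$r n_1\,\text{difference}^2 = (r+1)\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2$$

$$n_1 = \frac{(r+1)\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{r\,\text{difference}^2}$$

$$n_1 = \frac{(r+1)}{r} \cdot \frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

If r = 1 (equal groups), then

$$n_1 = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$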

Sample size formula for difference in means

$$n_1 = \frac{(r+1)}{r} \cdot \frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

where :
n1 = size of smaller group
r = ratio of larger group to smaller group
σ = standard deviation of the characteristic
difference = clinically meaningful difference in means of the outcome
Z power = corresponds to power (.84 = 80% power)
Zα/2 = corresponds to two-tailed significance level (1.96 for α = .05)
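A minimal sketch of this formula in Python follows. The function name `n1_difference_in_means` is illustrative, the defaults are the rounded Z values quoted in these slides (.84 and 1.96), and the inputs in the example (σ = 10, difference = 5) are hypothetical values chosen only to show the effect of the ratio r.

```python
from math import ceil

def n1_difference_in_means(sd, difference, r=1.0, z_power=0.84, z_alpha_2=1.96):
    """Size of the smaller group; the larger group has r times as many subjects."""
    return (r + 1) / r * sd**2 * (z_power + z_alpha_2) ** 2 / difference**2

# Hypothetical inputs for illustration: sd = 10, difference = 5.
n_equal = n1_difference_in_means(sd=10.0, difference=5.0, r=1.0)
n_two_to_one = n1_difference_in_means(sd=10.0, difference=5.0, r=2.0)

print(f"r = 1: n1 = {n_equal:.2f} (round up to {ceil(n_equal)})")
print(f"r = 2: n1 = {n_two_to_one:.2f} (round up to {ceil(n_two_to_one)})")
```

With r = 2 the smaller group shrinks, but the total (n1 plus 2·n1) is larger than with equal groups.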

Examples


Example 1: You want to calculate how much power
you will have to see a difference of 3.0 IQ points
between two groups: 30 male doctors and 30 female
doctors. If you expect the standard deviation to be
about 10 on an IQ test for both groups, then the
standard error for the difference will be about:

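$$\sqrt{\frac{10^2}{30} + \frac{10^2}{30}} \approx 2.58$$

Power formula… (d* = the difference to detect; σ(d*) = its standard error)

$$Z_{\text{power}} = \frac{d^*}{\sigma(d^*)} - Z_{\alpha/2} = \frac{d^*}{\sqrt{\dfrac{2\sigma^2}{n}}} - Z_{\alpha/2} = \frac{d^*\sqrt{n}}{\sigma\sqrt{2}} - Z_{\alpha/2}$$

$$Z_{\text{power}} = \frac{3}{2.58} - 1.96 \approx -.80 \quad \text{or} \quad Z_{\text{power}} = \frac{3\sqrt{30}}{10\sqrt{2}} - 1.96 \approx -.80$$

P(Z ≤ -.80) = .21; only 21% power to see a difference of 3 IQ points.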
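Example 1 can be reproduced in a few lines. This sketch assumes the normal approximation from the slides and uses illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

sd = 10.0          # expected standard deviation of IQ in both groups
n = 30             # doctors per group
difference = 3.0   # difference in IQ points we hope to detect
z_alpha_2 = 1.96   # two-tailed 5% significance level

# Standard error of the difference between two group means.
se = sqrt(sd**2 / n + sd**2 / n)

# All-purpose power formula: Z_power = difference / se - Z_alpha/2.
z_power = difference / se - z_alpha_2
power = NormalDist().cdf(z_power)

print(f"se = {se:.2f}, Z_power = {z_power:.2f}, power = {power:.2f}")
```

The power comes out at roughly 21%, so a study of 30 per group is unlikely to detect a 3-point difference.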



Example 2: How many people would you need to sample in each group to achieve power of 80% (corresponds to Zβ=.84)?

$$n = \frac{2\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{(d^*)^2} = \frac{100(2)(.84 + 1.96)^2}{(3)^2} = 174$$

174/group; 348 altogether
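One way to sanity-check Example 2 is to compute n and then plug it back into the power formula. The sketch below does that, using the rounded Z values from the slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sd = 10.0
difference = 3.0
z_power = 0.84     # rounded value used in the slides for 80% power
z_alpha_2 = 1.96   # two-tailed 5% significance level

# Sample size per group for equal-sized groups.
n_per_group = 2 * sd**2 * (z_power + z_alpha_2) ** 2 / difference**2

# Round trip: with that n, what power does the all-purpose formula give?
se = sqrt(2 * sd**2 / n_per_group)
achieved_power = NormalDist().cdf(difference / se - z_alpha_2)

print(f"n per group = {n_per_group:.1f}, achieved power = {achieved_power:.3f}")
```

The unrounded result is a little above 174 per group, and the round trip returns a power of about 80%, as intended.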

Sample Size needed for
comparing two proportions:
Example: I am going to run a case-control study
to determine if pancreatic cancer is linked to
drinking coffee. If I want 80% power to detect a
10% difference in the proportion of coffee
drinkers among cases vs. controls (if coffee
drinking and pancreatic cancer are linked, we
would expect that a higher proportion of cases
would be coffee drinkers than controls), how
many cases and controls should I sample?
About half the population drinks coffee.

formula:
The standard error of the difference of two proportions is:

$$\sqrt{\frac{\bar{p}(1-\bar{p})}{n_1} + \frac{\bar{p}(1-\bar{p})}{n_2}}$$

Here, if we assume equal sample sizes and that, under the null hypothesis, the proportion of coffee drinkers is .5 in both cases and controls, then

$$\text{s.e.(diff)} = \sqrt{\frac{.5(1-.5)}{n} + \frac{.5(1-.5)}{n}} = \sqrt{.5/n}$$

$$Z_{\text{power}} = \frac{\text{test statistic}}{\text{s.e.(test statistic)}} - Z_{\alpha/2}$$

$$Z_{\text{power}} = \frac{.10}{\sqrt{.5/n}} - 1.96$$

For 80% power…
(There is 80% area to the left of a Z-score of .84 on a standard normal curve; therefore, there is 80% area to the right of -.84.)

$$.84 = \frac{.10}{\sqrt{.5/n}} - 1.96$$

$$.84 + 1.96 = \frac{.10}{\sqrt{.5/n}}$$

$$(.84 + 1.96)^2 = \frac{.10^2\, n}{.5}$$

$$\therefore n = \frac{.5(.84 + 1.96)^2}{.10^2} = 392$$

Would take 392 cases and 392 controls to have 80% power!
Total=784
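The coffee example can be checked with a short function. This is a sketch assuming equal group sizes and the rounded Z values from the slides; `n_per_group_proportions` is an illustrative name.

```python
def n_per_group_proportions(p_bar, difference, z_power=0.84, z_alpha_2=1.96):
    """Per-group sample size for comparing two proportions (equal groups)."""
    return 2 * p_bar * (1 - p_bar) * (z_power + z_alpha_2) ** 2 / difference**2

# About half the population drinks coffee; detect a 10% difference.
n = n_per_group_proportions(p_bar=0.5, difference=0.10)

print(f"cases = {n:.0f}, controls = {n:.0f}, total = {2 * n:.0f}")
```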

Question 2:
How many total cases and controls would I have
to sample to get 80% power for the same
study, if I sample 2 controls for every case?


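Ask yourself, what changes here?

$$Z_{\text{power}} = \frac{\text{test statistic}}{\text{s.e.(test statistic)}} - Z_{\alpha/2}$$

$$\sqrt{\frac{\bar{p}(1-\bar{p})}{2n} + \frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{.25}{2n} + \frac{.25}{n}} = \sqrt{\frac{.25}{2n} + \frac{.5}{2n}} = \sqrt{\frac{.75}{2n}}$$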

Different size groups…

$$.84 = \frac{.10}{\sqrt{.75/2n}} - 1.96$$

$$.84 + 1.96 = \frac{.10}{\sqrt{.75/2n}}$$

$$(.84 + 1.96)^2 = \frac{(.10^2)\,2n}{.75}$$

$$\therefore n = \frac{.75(.84 + 1.96)^2}{(2)\,.10^2} = 294$$

Need: 294 cases and 2x294=588 controls. 882 total.
Note: you get the best power for the lowest sample size if you keep both groups equal (882 > 784).
You would only want to make groups unequal if there was an obvious difference in the cost or ease of
collecting data on one group. E.g., cases of pancreatic cancer are rare and take time to find.
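The comparison between equal and 2:1 sampling can be reproduced with a small sketch. It assumes the rounded Z values from the slides; `n_smaller_group` is an illustrative name, with r the number of controls per case.

```python
def n_smaller_group(p_bar, difference, r=1.0, z_power=0.84, z_alpha_2=1.96):
    """Size of the smaller group when the larger group is r times as big."""
    return (r + 1) / r * p_bar * (1 - p_bar) * (z_power + z_alpha_2) ** 2 / difference**2

totals = {}
for r in (1, 2):
    cases = n_smaller_group(p_bar=0.5, difference=0.10, r=r)
    controls = r * cases
    totals[r] = cases + controls
    print(f"r = {r}: cases = {cases:.0f}, controls = {controls:.0f}, total = {totals[r]:.0f}")
```

With r = 2, fewer cases are needed but the total is larger, which matches the note that equal groups give the best power for a given total sample size.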

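General sample size formula

$$\text{s.e.(diff)} = \sqrt{\frac{\bar{p}(1-\bar{p})}{rn} + \frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{\bar{p}(1-\bar{p})}{rn} + \frac{r\bar{p}(1-\bar{p})}{rn}} = \sqrt{\frac{(r+1)\bar{p}(1-\bar{p})}{rn}}$$

$$n = \frac{r+1}{r} \cdot \frac{\bar{p}(1-\bar{p})(Z_{\text{power}} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$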

General sample size needs when outcome is binary:

$$n = \frac{r+1}{r} \cdot \frac{\bar{p}(1-\bar{p})(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

where :
n = size of smaller group
p1 − p2 = clinically meaningful difference in proportions of the outcome
Zβ = corresponds to power (.84 = 80% power)


Compare with when outcome is continuous:

$$n_1 = \frac{(r+1)}{r} \cdot \frac{\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2}$$

where :
n1 = size of smaller group
σ = standard deviation of the characteristic
difference = clinically meaningful difference in means of the outcome

Question


How many subjects would we need to sample
to have 80% power to detect an average
increase in MCAT biology score of 1 point, if
the average change without instruction (just
due to chance) is plus or minus 3 points
(=standard deviation of change)?

Standard error here =

$$\frac{\sigma_{\text{change}}}{\sqrt{n}} = \frac{3}{\sqrt{n}}$$

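$$Z_{\text{power}} = \frac{\text{test statistic}}{\text{s.e.(test statistic)}} - Z_{\alpha/2}$$

$$Z_{\text{power}} = \frac{D}{\sigma_D / \sqrt{n}} - Z_{\alpha/2}$$

Where D = change from test 1 to test 2 (difference).

$$(Z_{\text{power}} + Z_{\alpha/2})^2 = \frac{n D^2}{\sigma_D^2}$$

$$n = \frac{\sigma_D^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{D^2}$$

Therefore, need:
(9)(1.96+.84)^2/1 ≈ 70 people total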
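The MCAT calculation can be written as a short function. This sketch assumes the rounded Z values from the slides; `n_paired` is an illustrative name.

```python
from math import ceil

def n_paired(sd_change, difference, z_power=0.84, z_alpha_2=1.96):
    """Number of subjects (pairs) needed for a paired, before/after design."""
    return sd_change**2 * (z_power + z_alpha_2) ** 2 / difference**2

# Standard deviation of change = 3 points; detect an average increase of 1 point.
n = n_paired(sd_change=3.0, difference=1.0)

print(f"n = {n:.2f}; rounded up = {ceil(n)}")
```

The unrounded value is 70.56, which the slide reports as about 70; rounding up to the next whole person would give 71.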

Sample size for paired data:

$$n = \frac{\sigma_d^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2}$$

where :
n = sample size
σd = standard deviation of the within-pair difference
difference = clinically meaningful difference

Paired data difference in proportion: sample size:

$$n = \frac{\bar{p}(1-\bar{p})(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

where :
n = sample size for 1 group
p1 − p2 = clinically meaningful difference in dependent proportions

