This document discusses sample size and power calculations for clinical studies. It provides formulas for calculating the required sample size to detect a desired effect size with a specified power and significance level. Formulas are presented for comparing means between two independent groups, comparing proportions between two independent groups, and comparing means within a paired or dependent groups design. Key factors that affect statistical power, and thus required sample size, are described as the size of the effect, standard deviation, sample size, and desired significance level. Examples are provided to demonstrate how to apply the formulas and calculate sample size for different study designs and scenarios.
Lecture11
1. Introduction to sample size
and power calculations
How much chance do we have to
reject the null hypothesis when
the alternative is in fact true?
(what’s the probability of detecting
a real effect?)
2. Can we quantify how much
power we have for given
sample sizes?
3. study 1: 263 cases, 1241 controls
Null
Distribution:
difference=0.
Rejection region.
Any value >= 6.5
(0+3.3*1.96)
For 5% significance level,
one-tail area=2.5%
(Z α/2 = 1.96)
Clinically relevant alternative: difference=10%.
Power= chance of being in the rejection region if the alternative
is true=area to the right of this line (in yellow)
4. study 1: 263 cases, 1241 controls
Rejection region.
Any value >= 6.5
(0+3.3*1.96)
Power here:
$$P\left(Z > \frac{6.5 - 10}{3.3}\right) = P(Z > -1.06) = 85\%$$
Power= chance of being in the
rejection region if the alternative
is true=area to the right of this
line (in yellow)
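The calculation on this slide can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `power_from_critical_value` is a hypothetical helper, not something from the lecture, and the inputs are the slide's own numbers (standard error 3.3, alternative difference 10).

```python
from statistics import NormalDist

def power_from_critical_value(critical_value, alternative, std_error):
    """Area of the alternative distribution at or beyond the critical value."""
    z_beta = (critical_value - alternative) / std_error
    # Power is the area to the right of z_beta on a standard normal curve.
    return 1.0 - NormalDist().cdf(z_beta)

std_error = 3.3                          # standard error of the difference (percentage points)
critical_value = 0 + std_error * 1.96    # null difference plus 1.96 standard errors, about 6.5
power = power_from_critical_value(critical_value, alternative=10, std_error=std_error)

print(f"critical value = {critical_value:.2f}")
print(f"power = {power:.3f}")
```

The result should land at roughly 85 to 86%, in line with the figure quoted on the slide; the small spread comes from rounding the critical value to 6.5.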
5. study 1: 50 cases, 50 controls
Critical value=
0+10*1.96=20
Z α/2 =1.96
2.5% area
Power closer to
15% now.
6. Study 2: 18 treated, 72 controls, STD DEV = 2
Critical value=
0+0.52*1.96 = 1
Clinically relevant
alternative:
difference=4 points
Power is nearly
100%!
7. Study 2: 18 treated, 72 controls, STD DEV=10
Critical value=
0+2.58*1.96 = 5
Power is about
40%
8. Study 2: 18 treated, 72 controls, effect size=1.0
Critical value=
0+0.52*1.96 = 1
Power is about
50%
Clinically relevant
alternative:
difference=1 point
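The three Study 2 scenarios (slides 6 to 8) differ only in standard deviation and effect size, so they can be reproduced with one function. This is a sketch under the slides' normal-approximation setup; `power_two_means` is a hypothetical name and the standard error is computed from the stated group sizes rather than read off the slides.

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(n1, n2, sd, difference, z_alpha_2=1.96):
    """Approximate power to detect `difference` between two independent group means."""
    std_error = sqrt(sd**2 / n1 + sd**2 / n2)   # standard error of the difference
    critical_value = 0 + std_error * z_alpha_2  # edge of the rejection region
    z_beta = (critical_value - difference) / std_error
    return 1.0 - NormalDist().cdf(z_beta)       # area to the right of z_beta

# (standard deviation, clinically relevant difference) for slides 6, 7 and 8
for sd, difference in [(2, 4), (10, 4), (2, 1)]:
    power = power_two_means(18, 72, sd, difference)
    print(f"SD={sd:>2}, difference={difference}: power = {power:.2f}")
```

Under these assumptions the first scenario gives power of essentially 100% and the third a little under 50%, matching the slides. The second comes out near one third, somewhat below the "about 40%" quoted on slide 7.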
9. Factors Affecting Power
1. Size of the effect
2. Standard deviation of the characteristic
3. Sample size
4. Significance level desired
10. 1. Bigger difference from the null mean
[Figure: null distribution and clinically relevant alternative distribution of the average weight from samples of 100]
14. Sample size calculations
Based on these elements, you can write
a formal mathematical equation that
relates power, sample size, effect size,
standard deviation, and significance
level…
**WE WILL DERIVE THESE
FORMULAS FORMALLY SHORTLY**
15. Simple formula for difference
in means
$$n = \frac{2\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2}$$
where:
n = sample size in each group (assumes equal sized groups)
σ = standard deviation of the outcome variable
difference = effect size (the difference in means)
Zβ = represents the desired power (typically .84 for 80% power)
Zα/2 = represents the desired level of statistical significance (typically 1.96)
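To see how the pieces of this formula trade off, here is a small sketch that codes it directly. The function name `n_per_group_means` and the input values (SD of 10, difference of 5) are illustrative assumptions, not numbers from the lecture.

```python
def n_per_group_means(sd, difference, z_power=0.84, z_alpha_2=1.96):
    """Per-group sample size for comparing two means with equal-sized groups."""
    return 2 * sd**2 * (z_power + z_alpha_2) ** 2 / difference**2

base = n_per_group_means(sd=10, difference=5)           # hypothetical baseline scenario
half_effect = n_per_group_means(sd=10, difference=2.5)  # effect size halved
double_sd = n_per_group_means(sd=20, difference=5)      # standard deviation doubled

print(f"baseline n per group: {base:.1f}")
print(f"halving the difference multiplies n by {half_effect / base:.1f}")
print(f"doubling the SD multiplies n by {double_sd / base:.1f}")
```

Because the standard deviation and the difference both enter squared, halving the effect size or doubling the standard deviation each quadruples the required sample size.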
16. Simple formula for difference
in proportions
$$n = \frac{2\,p(1-p)\,(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$
where:
n = sample size in each group (assumes equal sized groups)
p(1 − p) = a measure of variability (similar to standard deviation)
p1 − p2 = effect size (the difference in proportions)
Zβ = represents the desired power (typically .84 for 80% power)
Zα/2 = represents the desired level of statistical significance (typically 1.96)
18. Study 2: 18 treated, 72 controls, effect size=1.0
Critical value= 0+.52*1.96=1
Power close to 50%
19. SAMPLE SIZE AND POWER FORMULAS
Critical value = 0 + standard error(difference) × Zα/2
Power = area to the right of Zβ, where
$$Z_\beta = \frac{\text{critical value} - \text{alternative difference (here = 1)}}{\text{standard error(diff)}}$$
e.g. here: $Z_\beta = \frac{1 - 1}{\text{standard error(diff)}} = 0$; power = 50%
20. Power = area to the right of Zβ
$$Z_\beta = \frac{\text{critical value} - \text{alternative difference}}{\text{standard error(diff)}}$$
$$Z_\beta = \frac{Z_{\alpha/2} \times \text{standard error(diff)} - \text{difference}}{\text{standard error(diff)}}$$
$$Z_\beta = Z_{\alpha/2} - \frac{\text{difference}}{\text{standard error(diff)}}$$
$$-Z_\beta = \frac{\text{difference}}{\text{standard error(diff)}} - Z_{\alpha/2}$$
Power is the area to the right of Zβ, OR power is the area to the left of -Zβ. Since normal charts give us the area to the left by convention, we need to use -Zβ to get the correct value. Most textbooks just call this "Zβ"; I'll use the term Z power to avoid confusion.
$$Z_{\text{power}} = -Z_\beta$$
Power = the area to the left of Z power = the area to the right of Zβ
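The equivalence between the two areas can be confirmed numerically. The sketch below reuses the Study 2 numbers from slide 18 (difference of 1, standard error of about 0.52); the function names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_beta(difference, std_error, z_alpha_2=1.96):
    return z_alpha_2 - difference / std_error

def z_power(difference, std_error, z_alpha_2=1.96):
    return difference / std_error - z_alpha_2   # equals -z_beta

zb = z_beta(1, 0.52)
zp = z_power(1, 0.52)

area_right_of_z_beta = 1.0 - std_normal.cdf(zb)
area_left_of_z_power = std_normal.cdf(zp)

print(f"Z_beta = {zb:.3f}, Z_power = {zp:.3f}")
print(f"right of Z_beta: {area_right_of_z_beta:.4f}, left of Z_power: {area_left_of_z_power:.4f}")
```

Both areas agree because the standard normal curve is symmetric about zero, which is why a left-tail table lookup of Z power returns the power directly.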
22. Derivation of a sample size
formula…
$$\text{s.e.(diff)} = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}$$
Sample size is embedded in the standard error….
If ratio r of group 2 to group 1:
$$\text{s.e.(diff)} = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{r n_1}}$$
23. Algebra…
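$$Z_{\text{power}} = \frac{\text{difference}}{\sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{r n_1}}} - Z_{\alpha/2}$$
$$\therefore Z_{\text{power}} = \frac{\text{difference}}{\sqrt{\frac{(r+1)\sigma^2}{r n_1}}} - Z_{\alpha/2}$$
$$(Z_{\text{power}} + Z_{\alpha/2})^2 = \left(\frac{\text{difference}}{\sqrt{\frac{(r+1)\sigma^2}{r n_1}}}\right)^2$$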
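24.
$$(r+1)\,\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2 = r\, n_1\, \text{difference}^2$$
$$r\, n_1\, \text{difference}^2 = (r+1)\,\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2$$
$$n_1 = \frac{(r+1)\,\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{r\, \text{difference}^2}$$
$$n_1 = \frac{(r+1)}{r} \cdot \frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$
If r = 1 (equal groups), then
$$n_1 = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$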
25. Sample size formula for
difference in means
$$n_1 = \frac{(r+1)}{r} \cdot \frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$
where:
n1 = size of smaller group
r = ratio of larger group to smaller group
σ = standard deviation of the characteristic
difference = clinically meaningful difference in means of the outcome
Z power = corresponds to power (.84 = 80% power)
Zα/2 = corresponds to two-tailed significance level (1.96 for α = .05)
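The role of the ratio r is easier to see with numbers. This is a sketch of the formula above; `n_smaller_group` is a hypothetical name, and the standard deviation of 10 and difference of 3 are illustrative inputs.

```python
def n_smaller_group(sd, difference, r=1.0, z_power=0.84, z_alpha_2=1.96):
    """Size of the smaller group when the larger group is r times as big."""
    return (r + 1) / r * sd**2 * (z_power + z_alpha_2) ** 2 / difference**2

equal_groups = n_smaller_group(sd=10, difference=3, r=1)

for r in (1, 2, 4, 10):
    n1 = n_smaller_group(sd=10, difference=3, r=r)
    print(f"r={r:>2}: smaller group = {n1:6.1f}, larger group = {r * n1:7.1f}")

# With r = 1 the general formula collapses to the equal-groups form 2*sd^2*(...)^2 / difference^2.
print(f"equal-groups check: {2 * 10**2 * (0.84 + 1.96) ** 2 / 3**2:.1f} vs {equal_groups:.1f}")
```

Since (r + 1)/r falls from 2 toward 1 as r grows, enlarging the second group can at best halve the size needed for the smaller group; it never drives it to zero.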
26. Examples
Example 1: You want to calculate how much power
you will have to see a difference of 3.0 IQ points
between two groups: 30 male doctors and 30 female
doctors. If you expect the standard deviation to be
about 10 on an IQ test for both groups, then the
standard error for the difference will be about:
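$$\sqrt{\frac{10^2}{30} + \frac{10^2}{30}} = 2.58$$
27. Power formula…
$$Z_{\text{power}} = \frac{d^*}{\sigma(d^*)} - Z_{\alpha/2} = \frac{d^*}{\sqrt{\frac{2\sigma^2}{n}}} - Z_{\alpha/2} = \frac{d^*}{\sigma}\sqrt{\frac{n}{2}} - Z_{\alpha/2}$$
$$Z_{\text{power}} = \frac{d^*}{\sigma(d^*)} - Z_{\alpha/2} = \frac{3}{2.58} - 1.96 = -.80$$
or
$$Z_{\text{power}} = \frac{d^*}{\sigma}\sqrt{\frac{n}{2}} - Z_{\alpha/2} = \frac{3}{10}\sqrt{\frac{30}{2}} - 1.96 = -.80$$
P(Z ≤ -.80) = .21; only 21% power to see a difference of 3 IQ points.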
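Example 1 can be reproduced in a few lines. This is a minimal sketch of the equal-groups power formula; `power_equal_groups` is a hypothetical name, and the inputs (difference 3, standard deviation 10, 30 per group) come from the example.

```python
from math import sqrt
from statistics import NormalDist

def power_equal_groups(d, sd, n, z_alpha_2=1.96):
    """Return (Z_power, power) for two equal groups of size n."""
    z_power = (d / sd) * sqrt(n / 2) - z_alpha_2
    # Power is the area to the left of Z_power on a standard normal curve.
    return z_power, NormalDist().cdf(z_power)

z_pow, power = power_equal_groups(d=3, sd=10, n=30)
print(f"Z_power = {z_pow:.2f}, power = {power:.2f}")
```

The power comes out at about 21%, so a study of this size would miss a true 3-point difference roughly four times out of five.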
28.
Example 2: How many people would
you need to sample in each group to
achieve power of 80% (corresponds to
Zβ=.84)
$$n = \frac{2\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{(d^*)^2} = \frac{100(2)(.84 + 1.96)^2}{(3)^2} = 174$$
174/group; 348 altogether
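As a cross-check, the sketch below recomputes Example 2 and then feeds the answer back into the power formula. Rounding the per-group size up to a whole person is an assumption added here as a conservative convention; the slide itself reports 174.

```python
from math import ceil, sqrt
from statistics import NormalDist

sd, d = 10, 3                      # standard deviation and difference from the example
z_power, z_alpha_2 = 0.84, 1.96    # 80% power, two-tailed alpha = .05

n_exact = 2 * sd**2 * (z_power + z_alpha_2) ** 2 / d**2
n_rounded_up = ceil(n_exact)       # assumption: round up so power is not below target

# Power actually achieved with the rounded-up group size.
achieved = NormalDist().cdf((d / sd) * sqrt(n_rounded_up / 2) - z_alpha_2)

print(f"n per group (exact) = {n_exact:.1f}")
print(f"n per group (rounded up) = {n_rounded_up}, total = {2 * n_rounded_up}")
print(f"achieved power = {achieved:.3f}")
```

The exact value is about 174.2 per group, and plugging the rounded-up size back in gives power just above 80%, which confirms that the sample size and power formulas are two views of the same relationship.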
29. Sample Size needed for
comparing two proportions:
Example: I am going to run a case-control study
to determine if pancreatic cancer is linked to
drinking coffee. If I want 80% power to detect a
10% difference in the proportion of coffee
drinkers among cases vs. controls (if coffee
drinking and pancreatic cancer are linked, we
would expect that a higher proportion of cases
would be coffee drinkers than controls), how
many cases and controls should I sample?
About half the population drinks coffee.
30. Derivation of a sample size
formula:
The standard error of the difference of two proportions is:
$$\sqrt{\frac{p(1-p)}{n_1} + \frac{p(1-p)}{n_2}}$$
31. Derivation of a sample size
formula:
Here, if we assume equal sample sizes and
that, under the null hypothesis, the proportion of
coffee drinkers is .5 in both cases and
controls, then
$$\text{s.e.(diff)} = \sqrt{\frac{.5(1-.5)}{n} + \frac{.5(1-.5)}{n}} = \sqrt{.5/n}$$
33. For 80% power…
$$.84 = \frac{.10}{\sqrt{.5/n}} - 1.96$$
$$.84 + 1.96 = \frac{.10}{\sqrt{.5/n}}$$
$$(.84 + 1.96)^2 = \frac{.10^2\, n}{.5}$$
$$\therefore n = \frac{.5(.84 + 1.96)^2}{.10^2} = 392$$
There is 80% area to the
left of a Z-score of .84
on a standard normal
curve; therefore, there is
80% area to the right of
-.84.
Would take 392 cases and 392 controls to have 80% power!
Total=784
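The Z-scores used here do not have to come from a table. The sketch below derives them with the standard library's inverse normal CDF and recomputes the coffee example; `n_per_group_proportions` is a hypothetical name.

```python
from statistics import NormalDist

std_normal = NormalDist()
z_power_exact = std_normal.inv_cdf(0.80)     # Z with 80% of the area to its left
z_alpha_2_exact = std_normal.inv_cdf(0.975)  # Z with 2.5% of the area to its right

def n_per_group_proportions(p, diff, z_power, z_alpha_2):
    """Per-group sample size for comparing two proportions with equal groups."""
    return 2 * p * (1 - p) * (z_power + z_alpha_2) ** 2 / diff**2

n_table = n_per_group_proportions(0.5, 0.10, 0.84, 1.96)  # rounded table values
n_exact = n_per_group_proportions(0.5, 0.10, z_power_exact, z_alpha_2_exact)

print(f"Z_power = {z_power_exact:.4f}, Z_alpha/2 = {z_alpha_2_exact:.4f}")
print(f"n per group with table values: {n_table:.1f}")
print(f"n per group with unrounded Z:  {n_exact:.1f}")
```

With the table values the answer is 392 per group, as on the slide; the unrounded Z-scores change it by less than one person.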
34. Question 2:
How many total cases and controls would I have
to sample to get 80% power for the same
study, if I sample 2 controls for every case?
Ask yourself, what changes here?
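$$Z_{\text{power}} = \frac{\text{test statistic}}{\text{s.e.(test statistic)}} - Z_{\alpha/2}$$
$$\text{s.e.(diff)} = \sqrt{\frac{p(1-p)}{2n} + \frac{p(1-p)}{n}} = \sqrt{\frac{.25}{2n} + \frac{.25}{n}} = \sqrt{\frac{.25}{2n} + \frac{.5}{2n}} = \sqrt{\frac{.75}{2n}}$$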
35. Different size groups…
$$.84 = \frac{.10}{\sqrt{.75/2n}} - 1.96$$
$$.84 + 1.96 = \frac{.10}{\sqrt{.75/2n}}$$
$$(.84 + 1.96)^2 = \frac{(.10^2)\,2n}{.75}$$
$$\therefore n = \frac{.75(.84 + 1.96)^2}{(2)\,.10^2} = 294$$
Need: 294 cases and 2x294=588 controls. 882 total.
Note: you get the best power for the lowest sample size if you keep both groups equal (882 > 784).
You would only want to make groups unequal if there was an obvious difference in the cost or ease of
collecting data on one group. E.g., cases of pancreatic cancer are rare and take time to find.
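The trade-off in this note can be tabulated for several control-to-case ratios. This sketch keeps the coffee example's assumptions (p = .5, a 10% difference, 80% power); `sizes_for_ratio` is a hypothetical name, and ratios beyond 2 are added purely for illustration.

```python
def sizes_for_ratio(r, p=0.5, diff=0.10, z_power=0.84, z_alpha_2=1.96):
    """Return (cases, controls, total) when sampling r controls per case."""
    cases = (r + 1) / r * p * (1 - p) * (z_power + z_alpha_2) ** 2 / diff**2
    controls = r * cases
    return cases, controls, cases + controls

for r in (1, 2, 3, 4):
    cases, controls, total = sizes_for_ratio(r)
    print(f"{r} control(s) per case: cases = {cases:.0f}, "
          f"controls = {controls:.0f}, total = {total:.0f}")
```

Each extra control per case reduces the number of cases needed but raises the total, so unequal allocation only pays off when cases are much harder or costlier to recruit than controls.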
36. General sample size formula
$$\text{s.e.(diff)} = \sqrt{\frac{p(1-p)}{rn} + \frac{p(1-p)}{n}} = \sqrt{\frac{p(1-p)}{rn} + \frac{r\,p(1-p)}{rn}} = \sqrt{\frac{(r+1)\,p(1-p)}{rn}}$$
$$n = \frac{r+1}{r} \cdot \frac{p(1-p)(Z_{\text{power}} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$
37. General sample size needs
when outcome is binary:
$$n = \frac{r+1}{r} \cdot \frac{p(1-p)(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$
where:
n = size of smaller group
r = ratio of larger group to smaller group
p1 − p2 = clinically meaningful difference in proportions of the outcome
Zβ = corresponds to power (.84 = 80% power)
Zα/2 = corresponds to two-tailed significance level (1.96 for α = .05)
Compare: $n = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{(\text{diff})^2}$
38. Compare with when outcome
is continuous:
$$n_1 = \frac{(r+1)}{r} \cdot \frac{\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2}$$
where:
n1 = size of smaller group
r = ratio of larger group to smaller group
σ = standard deviation of the characteristic
difference = clinically meaningful difference in means of the outcome
Zβ = corresponds to power (.84 = 80% power)
Zα/2 = corresponds to two-tailed significance level (1.96 for α = .05)
39. Question
How many subjects would we need to sample
to have 80% power to detect an average
increase in MCAT biology score of 1 point, if
the average change without instruction (just
due to chance) is plus or minus 3 points
(=standard deviation of change)?
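41.
$$Z_{\text{power}} = \frac{\text{test statistic}}{\text{s.e.(test statistic)}} - Z_{\alpha/2}$$
$$Z_{\text{power}} = \frac{D}{\sigma_D / \sqrt{n}} - Z_{\alpha/2}$$
Where D = change from test 1 to test 2 (difference).
$$(Z_{\text{power}} + Z_{\alpha/2})^2 = \frac{n D^2}{\sigma_D^2}$$
$$n = \frac{\sigma_D^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{D^2}$$
Therefore, need: (9)(1.96+.84)²/1 ≈ 70 people total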
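The MCAT question can be worked through with a short sketch of the paired-data formula. The name `n_paired` is hypothetical, and rounding up to a whole person is an assumption added here rather than something stated on the slide.

```python
from math import ceil

def n_paired(sd_change, change, z_power=0.84, z_alpha_2=1.96):
    """Number of subjects (pairs) needed to detect a mean within-subject change."""
    return sd_change**2 * (z_power + z_alpha_2) ** 2 / change**2

n = n_paired(sd_change=3, change=1)   # SD of change = 3 points, target change = 1 point
print(f"n = {n:.2f} subjects; rounded up: {ceil(n)}")
```

The formula gives about 70.6, which the slide reports as 70 people; rounding up instead would call for 71. There is no factor of 2 here because each subject supplies a single difference score rather than belonging to one of two independent groups.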
42. Sample size for paired data:
$$n = \frac{\sigma_d^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2}$$
where:
n = sample size
σd = standard deviation of the within-pair difference
difference = clinically meaningful difference
Zβ = corresponds to power (.84 = 80% power)
Zα/2 = corresponds to two-tailed significance level (1.96 for α = .05)
43. Paired data difference in
proportion: sample size:
$$n = \frac{p(1-p)(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$
where:
n = sample size for 1 group
p1 − p2 = clinically meaningful difference in dependent proportions
Zβ = corresponds to power (.84 = 80% power)
Zα/2 = corresponds to two-tailed significance level (1.96 for α = .05)
Compare: $n = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{(\text{diff})^2}$
Editor's notes
It turns out that if you were to go out and sample many, many times, most sample statistics that you could calculate would follow a normal distribution.
What are the 2 parameters (from last time) that define any normal distribution?
Remember that a normal curve is characterized by two parameters, a mean and a variability (SD)
What do you think the mean value of a sample statistic would be? The standard deviation?
Remember standard deviation is natural variability of the population
Standard error can be standard error of the mean or standard error of the odds ratio or standard error of the difference of 2 means, etc. The standard error of any sample statistic.