Parametric and non-parametric statistical methods
for the life sciences - Session I
Liesbeth Bruckers Geert Molenberghs
Interuniversity Institute for Biostatistics and statistical
Bioinformatics (I-Biostat)
Universiteit Hasselt
June 7, 2011
Doctoral School Medicine
Table of contents
1 Why nonparametric methods
   Introductory example
   Nonparametric test of hypotheses
2 What test to use?
   Two independent samples
   More than two independent samples
   Two dependent samples
   More than two dependent samples
   Ordered hypotheses
3 Rank Tests
   Wilcoxon Rank Sum Test
   Kruskal-Wallis Test
   Friedman Statistic
   Sign Test
   Jonckheere-Terpstra Test
Why nonparametric methods ?
Introductory Example
The paper "Hypertension in Terminal Renal Failure, Observations Pre and Post Bilateral Nephrectomy" (J. Chronic Diseases (1973): 471-501) gave blood pressure readings for five terminal renal patients before and 2 months after surgery (removal of the kidneys).
Patient            1     2     3     4     5
Before surgery   107   102    95   106   112
After surgery     87    97   101   113    80
Question: Does the mean blood pressure before surgery exceed the mean blood pressure two months after surgery?
Classical Approach
Paired t-test:
Patient            1     2     3     4     5
Before surgery   107   102    95   106   112
After surgery     87    97   101   113    80
Difference D_i    20     5    -6    -7    32
Hypotheses: $H_0: \mu_d = 0$ versus $H_1: \mu_d > 0$
$\mu_d$: mean difference in blood pressure
Test statistic:
$$ t = \frac{\bar{D}}{\sqrt{\frac{1}{n(n-1)} \sum_{i=1}^{n} (D_i - \bar{D})^2}} $$
Under $H_0$, t follows a t distribution with n − 1 d.f.
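As an illustration, the paired t statistic for the five patients can be computed directly from its definition. This is a minimal sketch using only the Python standard library; the function and variable names are illustrative and not part of the original slides, and the p-value is deliberately left out because it requires the t distribution.

```python
from math import sqrt

# Blood pressure readings from the table above
before = [107, 102, 95, 106, 112]
after = [87, 97, 101, 113, 80]

def paired_t_statistic(x, y):
    """Return (mean difference, t statistic, degrees of freedom) for paired data."""
    d = [a - b for a, b in zip(x, y)]            # differences D_i
    n = len(d)
    d_bar = sum(d) / n                           # mean difference
    ss = sum((di - d_bar) ** 2 for di in d)      # sum of squared deviations
    se = sqrt(ss / (n * (n - 1)))                # standard error of the mean difference
    return d_bar, d_bar / se, n - 1

d_bar, t_stat, df = paired_t_statistic(before, after)
print(f"mean difference = {d_bar:.1f}, t = {t_stat:.3f}, d.f. = {df}")
```

The statistic would then be compared with the upper tail of a t distribution with 4 d.f., since the alternative is one-sided.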
Assumptions
- The statistic follows a t-distribution if the differences are normally distributed ⇒ t-test = parametric method
- Observations are made independently: selection of a patient does not influence the chance of any other patient being included
- (Two-sample t-test): populations must have the same variance
- Variables must be measured on an interval scale, to interpret the results
These assumptions are often not tested, but accepted.
Normal probability plot
Normality is questionable !
Nonparametric Test of Hypotheses
Follow the same general procedure as parametric tests:
- State the null and alternative hypotheses
- Calculate the value of the appropriate test statistic (choice based on the design of the study)
- Decision rule: reject or do not reject $H_0$, depending on the magnitude of the statistic
  $P_{H_0}(T \ge c) = ?$
  - Exact distribution
  - Approximation to the exact distribution
When to use what test
What test to use ?
The choice of the appropriate test statistic depends on the design of the study:
- number of groups?
- independent or dependent samples?
- ordered alternative hypothesis?
Two Independent Samples
Permeability constants of the human chorioamnion (a placental membrane) are given in the table below for pregnancies at term (X) and for pregnancies between 12 and 26 weeks gestational age (Y). Investigate the alternative of interest that the permeability of the human chorioamnion is greater for a term pregnancy than for a pregnancy of 12 to 26 weeks gestational age.
X (at term)       0.83  1.89  1.04  1.45  1.38  1.91  1.64  1.46
Y (12-26 weeks)   1.15  0.88  0.90  0.74  1.21
Statistical Methods:
- t-test
- Wilcoxon Rank Sum Test
More Than Two Independent Samples
Protoporphyrin levels were determined for three groups of people: a control group of normal workers, a group of alcoholics with sideroblasts in their bone marrow, and a group of alcoholics without sideroblasts. The data are shown below. Do the data suggest that normal workers and alcoholics with and without sideroblasts differ with respect to protoporphyrin level?
Group                             Protoporphyrin level (mg)
Normal                            22   27   47   30   38   78   28   58   72   56
Alcoholics with sideroblasts      78  172  286   82  453  513  174  915   84  153
Alcoholics without sideroblasts   37   28   38   45   47   29   34   20   68   12
Statistical Methods:
ANOVA
Kruskal-Wallis Test
Two Dependent Samples
Twelve adult males were put on a liquid diet in a weight-reducing plan. Weights were recorded before and after the diet. The data are shown in the table below.
Subject     1    2    3    4    5    6    7    8    9   10   11   12
Before    186  171  177  168  191  172  177  191  170  171  188  187
After     188  177  176  169  196  172  165  190  165  180  181  172
Statistical Methods:
Paired t-test
Sign test; Signed-rank test
Randomized Blocked Design
Effect of Hypnosis:
Emotions of fear, happiness, depression and calmness were requested (in random order) from 8 subjects during hypnosis
Response: skin potential (in millivolts)
Subject        1     2     3     4     5     6     7     8
Fear        23.1  57.6  10.5  23.6  11.9  54.6  21.0  20.3
Happiness   22.7  53.2   9.7  19.6  13.8  47.1  13.6  23.6
Depression  22.5  53.7  10.8  21.1  13.7  39.2  13.7  16.3
Calmness    22.6  53.1   8.3  21.6  13.3  37.0  14.8  14.8
Statistical Methods:
- Mixed Models
- Friedman test
Ordered Treatments
Patients were treated with a drug at four dose levels (100mg, 200mg, 300mg and 400mg) and then monitored for toxicity.
          Drug toxicity
Dose      Mild   Moderate   Severe   Drug death
100mg      100          1        0            0
200mg       18          1        1            0
300mg       50          1        1            0
400mg       50          1        1            1
Statistical Methods:
Regression
Jonckheere-Terpstra Test
Wilcoxon Rank Sum Test
Wilcoxon Rank Sum Test
Detailed Example:
Data: GAF scores
Control     25  10  35
Treatment   36  26  40
Does treatment improve functioning?
Parametric Approach: t-test
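$$ t = \frac{\bar{X}_1 - \bar{X}_0}{S_{\bar{X}_1 - \bar{X}_0}}, \quad \text{where } S_{\bar{X}_1 - \bar{X}_0} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_0^2}{n_0}} $$
- t-test: tests whether the means of two normally distributed populations are equal
  $H_0: \mu_1 = \mu_0$
  $H_1: \mu_1 \ne \mu_0$ (one-sided test: $H_1: \mu_1 > \mu_0$)
- equal sample sizes
- the two distributions have the same variance
$\bar{X}_1 = 34.00$, $\bar{X}_0 = 23.33$, $s_1 = 7.21$, $s_0 = 12.58$
$t = 1.27$
$P_{H_0}(t \ge 1.27) = 0.1358$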
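The summary values for the GAF example can be reproduced from the raw scores. The following is a small sketch with the standard library only; `two_sample_t` is an illustrative helper name, and the tail probability is not computed here because it needs the t distribution.

```python
from math import sqrt
from statistics import mean, stdev, variance

control = [25, 10, 35]      # X_0
treatment = [36, 26, 40]    # X_1

def two_sample_t(x1, x0):
    """t = (mean(x1) - mean(x0)) / sqrt(s1^2/n1 + s0^2/n0)."""
    se = sqrt(variance(x1) / len(x1) + variance(x0) / len(x0))
    return (mean(x1) - mean(x0)) / se

t_stat = two_sample_t(treatment, control)
print(f"means: {mean(treatment):.2f}, {mean(control):.2f}")
print(f"standard deviations: {stdev(treatment):.2f}, {stdev(control):.2f}")
print(f"t = {t_stat:.2f}")
```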
Wilcoxon Rank Sum Test
Detailed Example:
Control     25  10  35
Treatment   36  26  40
Order the data: what is the position of the patients on treatment compared with the position of the patients in the control arm?
⇒ Ranks
Treatment is effective if treated patients rank sufficiently high in the combined ranking of all patients
Test statistic such that:
- treatment ranks are high ⇔ value of the test statistic is high
- treatment ranks are low ⇔ value of the test statistic is low
$W_S = S_1 + S_2 + \dots + S_n$ (n = 3, the number of patients in the treatment arm; $S_i$ is the rank of the i-th treated patient)
Ranks (observed scores in parentheses):
Control      2 (25)   1 (10)   4 (35)
Treatment    5 (36)   3 (26)   6 (40)
$W_S = 5 + 3 + 6 = 14$
Reject the null hypothesis when $W_S$ is sufficiently large: $W_S \ge c$
$P_{H_0}(W_S \ge c) = \alpha$ ($\alpha = 0.05$)
Distribution of $W_S$ under $H_0$?
Suppose there is no treatment effect ($H_0$):
- rank is solely determined by the patient's health status
- rank is independent of receiving treatment or placebo
- "rank is assigned to the patient before randomisation"
Random selection of patients for treatment ⇒ random selection of 3 ranks out of 6
Randomisation divides the ranks (1, 2, ..., 6) into two groups!
Number of possible combinations: $\binom{N}{n} = \frac{N!}{n!(N-n)!}$, here $\binom{6}{3} = 20$
All possibilities (each has probability 1/20 under $H_0$):
treatment ranks   (4,5,6)  (3,5,6)  (3,4,6)  (3,4,5)  (2,5,6)
w_s                  15       14       13       12       13
treatment ranks   (2,4,6)  (2,4,5)  (2,3,6)  (2,3,5)  (2,3,4)
w_s                  12       11       11       10        9
treatment ranks   (1,5,6)  (1,4,6)  (1,4,5)  (1,3,6)  (1,3,5)
w_s                  12       11       10       10        9
treatment ranks   (1,3,4)  (1,2,6)  (1,2,5)  (1,2,4)  (1,2,3)
w_s                   8        9        8        7        6
Distribution of $W_S$ under the null hypothesis:
w                      6     7     8     9    10    11    12    13    14    15
$P_{H_0}(W_S = w)$  1/20  1/20  2/20  3/20  3/20  3/20  3/20  2/20  1/20  1/20
$P_{H_0}(W_S \ge 14) = 2/20 = 0.1$
Do not reject $H_0$.
Conclusion: there is no evidence that treatment increases the GAF scores.
Power of this study ???
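The exact null distribution of the rank sum in the GAF example can be reproduced by enumerating every way of assigning 3 of the 6 ranks to the treatment arm. This is a minimal sketch with the standard library; the helper `ranks_of` is illustrative and assumes there are no ties.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

control = [25, 10, 35]
treatment = [36, 26, 40]

def ranks_of(values, pooled):
    """Ranks (1 = smallest) of `values` within `pooled`; assumes no ties."""
    order = sorted(pooled)
    return [order.index(v) + 1 for v in values]

pooled = control + treatment
w_obs = sum(ranks_of(treatment, pooled))          # observed rank sum W_S

N, n = len(pooled), len(treatment)
# Under H0 every choice of n ranks out of 1..N is equally likely
rank_sums = [sum(c) for c in combinations(range(1, N + 1), n)]
null_dist = Counter(rank_sums)
p_value = Fraction(sum(1 for w in rank_sums if w >= w_obs), len(rank_sums))

for w in sorted(null_dist):
    print(w, Fraction(null_dist[w], len(rank_sums)))
print("W_S =", w_obs, " P(W_S >= W_S observed) =", p_value)
```

With three patients per arm the smallest attainable one-sided p-value is 1/20 (when the treatment ranks are 4, 5 and 6), which is one way to see why the power of such a small study is limited.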
Large Sample Size Case
$\binom{N}{n}$ increases rapidly with N and n:
$\binom{20}{10} = 184756$
$\binom{12}{6} = 924$
Asymptotic null distribution: Central Limit Theorem
The sum T of a large number of independent random variables is approximately normally distributed:
$$ P\left( \frac{T - E(T)}{\sqrt{\mathrm{Var}(T)}} \le a \right) \approx \Phi(a) $$
where $\Phi(a)$ is the area to the left of a under a standard normal curve
If both n and m are sufficiently large (n treated patients, m controls, N = n + m):
$W_S \approx N(E(W_S), \mathrm{Var}(W_S))$
$E(W_S) = \frac{1}{2} n (N + 1)$
$\mathrm{Var}(W_S) = \frac{1}{12} n m (N + 1)$
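To see what the normal approximation does in practice, the sketch below applies the mean and variance formulas to the small GAF example (n = m = 3, W_S = 14). The function names are illustrative, no continuity correction is applied, and the normal CDF is obtained from `math.erf`.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def rank_sum_normal_p(w, n, m):
    """Approximate P(W_S >= w) with a normal distribution (no continuity correction)."""
    N = n + m
    mean_w = n * (N + 1) / 2
    var_w = n * m * (N + 1) / 12
    z = (w - mean_w) / sqrt(var_w)
    return 1.0 - normal_cdf(z)

# Number of rank assignments grows quickly with the sample size
print(comb(12, 6), comb(20, 10))

p_approx = rank_sum_normal_p(14, 3, 3)
print(f"normal approximation: {p_approx:.3f}")
```

For this example the approximation gives a value of roughly 0.06, noticeably below the exact 0.1, which illustrates that the approximation should be reserved for larger n and m.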
Kruskal-Wallis Test
Kruskal-Wallis Test
Example:
The following data represent corn yields per acre from three different fields where different farming methods were used.
Method 1   Method 2   Method 3
      92         94        101
      91         90        100
      84         81         93
      89                   102
Question: do the yields differ between the 3 methods?
Parametric Approach: One-way ANOVA
Statistical test of whether or not the means of several groups are all equal
Assumptions:
- Independence of cases
- The residuals are normally distributed: $\varepsilon_i \sim N(0, \sigma^2)$
- Homoscedasticity
$$ F = \frac{\text{variance between groups}}{\text{variance within groups}} = \frac{MSTR}{MSE} $$
The statistic follows an F distribution with s − 1 and n − s d.f. (s groups, n observations in total)
Small F:
Large F:
One-Way ANOVA results
$\bar{X}_1 = 89$, $\bar{X}_2 = 88.33$, $\bar{X}_3 = 99$
$\sigma_1 = 3.56$, $\sigma_2 = 6.65$, $\sigma_3 = 4.08$
MSTR = 135.03, MSE = 22.08
F = 6.11
$P_{H_0}(F \ge 6.11) = 0.0245$
Ranks:
                 Method 1   Method 2   Method 3
                        6          8         10
                        5          4          9
                        1          2          7
                        3                    11
Mean rank R_i.       3.75       4.67       9.25
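Hypotheses:
$H_0$: No difference between the treatments
$H_1$: Any difference between the treatments
($R_{i.}$: mean rank in group i; $R_{..}$: overall mean rank)
If treatments do not differ widely ($H_0$):
- the $R_{i.}$ are close to each other
- the $R_{i.}$ are close to $R_{..}$
If treatments do differ ($H_1$):
- the $R_{i.}$ differ substantially
- the $R_{i.}$ are not close to $R_{..}$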
Evaluate the null hypothesis by investigating:
$$ K = \frac{12}{N(N+1)} \sum_{i=1}^{s} n_i (R_{i.} - R_{..})^2 $$
$P_{H_0}(K \ge c) = ?$
Exact distribution of K under $H_0$:
- ranks are determined before assignment to treatment
- random assignment → all possibilities have the same chance of being observed
Number of possible combinations: multinomial coefficient
$$ \binom{11}{4,3,4} = \binom{11}{4}\binom{7}{3}\binom{4}{4} = 11550 $$
In general:
$$ \binom{N}{n_1, n_2, \dots, n_s} = \binom{N}{n_1}\binom{N-n_1}{n_2}\cdots\binom{N-n_1-\dots-n_{s-1}}{n_s} $$
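The statistic K and the number of rank configurations can be checked with a few lines of Python. This is a sketch with illustrative function names, assuming no ties so that the overall mean rank is (N + 1)/2. Note that ranking the yields exactly as tabulated assigns rank 1 to 81 (Method 2) and rank 2 to 84 (Method 1), which gives K ≈ 6.05; the value 6.16 used in the slides corresponds to the rank configuration (1,3,5,6), (2,4,8), (7,9,10,11).

```python
from math import comb

def kruskal_wallis_k(rank_groups):
    """K = 12 / (N (N + 1)) * sum_i n_i (mean rank of group i - overall mean rank)^2."""
    n_total = sum(len(g) for g in rank_groups)
    overall_mean = (n_total + 1) / 2                 # mean of the ranks 1..N (no ties)
    spread = sum(len(g) * (sum(g) / len(g) - overall_mean) ** 2 for g in rank_groups)
    return 12.0 / (n_total * (n_total + 1)) * spread

def rank_groups_from_data(groups):
    """Replace every observation by its rank in the pooled sample (assumes no ties)."""
    pooled = sorted(v for g in groups for v in g)
    return [[pooled.index(v) + 1 for v in g] for g in groups]

yields = [[92, 91, 84, 89], [94, 90, 81], [101, 100, 93, 102]]

k_data = kruskal_wallis_k(rank_groups_from_data(yields))
k_slide = kruskal_wallis_k([(1, 3, 5, 6), (2, 4, 8), (7, 9, 10, 11)])
n_configs = comb(11, 4) * comb(7, 3) * comb(4, 4)    # multinomial coefficient

print(f"K from the tabulated yields: {k_data:.2f}")
print(f"K for the slide's rank configuration: {k_slide:.2f}")
print("number of configurations:", n_configs)
```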
A few possible configurations:
Method 1     Method 2   Method 3       K
(1,2,3,4)    (5,6,7)    (8,9,10,11)    8.91
(1,2,3,5)    (4,6,7)    (8,9,10,11)    8.32
(1,2,3,6)    (4,5,7)    (8,9,10,11)    7.84
(1,2,3,7)    (4,5,6)    (8,9,10,11)    7.48
. . .
(1,3,5,6)    (2,4,8)    (7,9,10,11)    6.16
. . .
Each configuration has probability 1/11550 under $H_0$.
Exact Distribution of K:
$P_{H_0}(K \ge 6.16) = 0.0306$
Conclusion: reject $H_0$: there is a difference between the farming methods
Large-sample approximation: $\chi^2$ distribution with s − 1 d.f.
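The exact tail probability can be obtained by brute force, since 11550 configurations are easy to enumerate. The sketch below (illustrative names, standard library only) does this for group sizes 4, 3, 4 and the slide's observed configuration, and compares the result with the chi-square approximation; for 2 d.f. the chi-square upper tail is simply exp(−x/2).

```python
from itertools import combinations
from math import exp

def kruskal_wallis_k(rank_groups):
    """Kruskal-Wallis statistic computed from groups of ranks (no ties)."""
    n_total = sum(len(g) for g in rank_groups)
    overall_mean = (n_total + 1) / 2
    spread = sum(len(g) * (sum(g) / len(g) - overall_mean) ** 2 for g in rank_groups)
    return 12.0 / (n_total * (n_total + 1)) * spread

def exact_p_value(sizes, k_observed, tol=1e-9):
    """P(K >= k_observed) under H0 for three groups, by full enumeration."""
    n1, n2, _ = sizes
    ranks = set(range(1, sum(sizes) + 1))
    hits = total = 0
    for g1 in combinations(sorted(ranks), n1):
        rest = ranks - set(g1)
        for g2 in combinations(sorted(rest), n2):
            g3 = tuple(rest - set(g2))               # remaining ranks form group 3
            total += 1
            # small tolerance guards against floating-point rounding
            if kruskal_wallis_k([g1, g2, g3]) >= k_observed - tol:
                hits += 1
    return hits / total, total

k_observed = kruskal_wallis_k([(1, 3, 5, 6), (2, 4, 8), (7, 9, 10, 11)])
p_exact, n_configs = exact_p_value((4, 3, 4), k_observed)
p_chi2 = exp(-k_observed / 2)                        # chi-square tail with 2 d.f.

print(f"K = {k_observed:.2f}, configurations = {n_configs}")
print(f"exact p-value = {p_exact:.4f}, chi-square approximation = {p_chi2:.4f}")
```

The enumerated value can be compared with the 0.0306 quoted above; the chi-square approximation is somewhat larger for samples this small.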
Friedman Test
Friedman Statistic
Setting 1: complete randomisation:
  Kruskal-Wallis test p-value = 0.8611
  Treatment effect is blurred by the variability between subjects
Setting 2: randomisation within age groups:
  p-value = 0.0411
  Conclusion: reject $H_0$
Procedure
- Divide subjects into homogeneous subgroups (BLOCKS)
- Compare subjects within the blocks w.r.t. treatment effects
(Generalisation of the paired comparison design)
Example
Data:
            Age group
Treatment   20-30 y   30-40 y   40-50 y   50-60 y
A                19        21        43        46
B                17        20        37        44
C                23        22        39        42
Rank subjects within a block:
            Age group
Treatment   20-30 y   30-40 y   40-50 y   50-60 y
A                 2         2         3         3
B                 1         1         1         2
C                 3         3         2         1
Mean of ranks for:
treatment A: $R_{A.} = 10/4 = 2.5$
treatment B: $R_{B.} = 5/4 = 1.25$
treatment C: $R_{C.} = 9/4 = 2.25$
If these mean ranks are different → reject $H_0$
If these mean ranks are close → do not reject $H_0$
Measure for closeness of the mean ranks:
if the $R_{i.}$ are all close to each other
↓
then they are close to the overall mean $R_{..}$
and $\sum_{i=1}^{s} (R_{i.} - R_{..})^2$ will be close to zero
Friedman Statistic
$$ Q = \frac{12N}{s(s+1)} \sum_{i=1}^{s} (R_{i.} - R_{..})^2 $$
$P_{H_0}(Q \ge c) = ?$
Exact distribution of Q under $H_0$:
A few possible configurations (ranks within each age group):
            Age group
Treatment   20-30 y   30-40 y   40-50 y   50-60 y     Q
A                 1         1         1         1     8
B                 2         2         2         2
C                 3         3         3         3

A                 3         3         3         3     8
B                 2         2         2         2
C                 1         1         1         1

A                 1         3         1         3     0
B                 2         2         2         2
C                 3         1         3         1
. . .
A                 2         2         3         3     3.5
B                 1         1         1         2
C                 3         3         2         1
Exact Distribution of Q:
Q      Pr
---------------
0.0    0.0694
0.5    0.2778
1.5    0.2222
2.0    0.1574
3.5    0.1481
4.5    0.0556
6.0    0.0278
6.5    0.0370
8.0    0.0046
Number of possibilities for the rank combinations:
age group 20-30 years: 3! = 6
age groups are independent
↓
total number of possible combinations: $(3!)^4 = 1296$
Under the null hypothesis these are all equally likely: each has probability 1/1296
In general: $(s!)^N$, with s = number of treatments and N = number of blocks
$P_{H_0}(Q \ge 3.5) = 0.2731$
Do not reject $H_0$
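The exact Friedman calculation for the age-group example can be sketched as follows. All 1296 within-block rank arrangements are enumerated and exact fractions are used so that comparisons with the observed Q are not affected by rounding; the function names are illustrative, and ties within a block are assumed absent.

```python
from fractions import Fraction
from itertools import permutations, product

# Rows: treatments A, B, C; columns: age groups (blocks)
data = {"A": [19, 21, 43, 46], "B": [17, 20, 37, 44], "C": [23, 22, 39, 42]}

def within_block_ranks(rows):
    """Rank the treatments within each block (column); assumes no ties."""
    blocks = list(zip(*rows))
    ranked_blocks = [[sorted(b).index(v) + 1 for v in b] for b in blocks]
    return [list(r) for r in zip(*ranked_blocks)]    # back to one row per treatment

def friedman_q(rank_rows):
    """Q = 12 N / (s (s + 1)) * sum_i (mean rank of treatment i - (s + 1)/2)^2."""
    s, n_blocks = len(rank_rows), len(rank_rows[0])
    overall = Fraction(s + 1, 2)
    spread = sum((Fraction(sum(r), n_blocks) - overall) ** 2 for r in rank_rows)
    return Fraction(12 * n_blocks, s * (s + 1)) * spread

rank_rows = within_block_ranks(list(data.values()))
q_obs = friedman_q(rank_rows)

s, n_blocks = 3, 4
hits = total = 0
# Under H0 each block independently carries one of the s! rank permutations
for blocks in product(permutations(range(1, s + 1)), repeat=n_blocks):
    total += 1
    if friedman_q(list(zip(*blocks))) >= q_obs:
        hits += 1
p_exact = Fraction(hits, total)

print("rank sums:", [sum(r) for r in rank_rows])
print("Q =", float(q_obs), " P(Q >= Q observed) =", round(float(p_exact), 4))
```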
Sign Test
Sign Test
Special case of the Friedman test: blocks of size 2
- subjects matched on e.g. age, gender, ...
- twins
- two eyes (hands) of a person
- subject serves as own control: e.g. blood pressure before and after treatment
Example: Pain scores for lower back pain, before and after having acupuncture
Patient   Pain score before   Pain score after   Sign
 1                5                  6             -
 2                6                  7             -
 3                7                  6             +
 4                9                  4             +
 5                6                  7             -
 6                5                  4             +
 7                4                  8             -
 8                7                  6             +
 9                6                  5             +
10                5                  7             -
11                8                  6             +
12                8                  4             +
13                7                  3             +
14                8                  5             +
15                6                  7             -
9 pairs out of 15 where treatment comes out ahead (reduction in pain scores)
Sign Test: $S_N = 9$
$P_{H_0}(S_N \ge 9) = ?$
Exact distribution of $S_N$ under $H_0$ is binomial:
- N trials, N = number of 'pairs'
- Success probability: 1/2
$$ P_{H_0}(S_N = a) = \binom{N}{a} \frac{1}{2^N} $$
$$ P_{H_0}(S_N \ge 9) = \left[ \binom{15}{9} + \binom{15}{10} + \dots + \binom{15}{15} \right] \frac{1}{2^{15}} \approx 0.30 $$
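The binomial tail for the sign test is easy to evaluate exactly. The following sketch counts the positive signs in the acupuncture data and sums the binomial probabilities with `math.comb`; variable names are illustrative, and pairs with equal scores (none occur here) would need separate handling.

```python
from math import comb

# Pain scores for patients 1..15, from the table above
before = [5, 6, 7, 9, 6, 5, 4, 7, 6, 5, 8, 8, 7, 8, 6]
after = [6, 7, 6, 4, 7, 4, 8, 6, 5, 7, 6, 4, 3, 5, 7]

n_pairs = len(before)
# A "+" sign means the pain score went down after treatment
s_n = sum(1 for b, a in zip(before, after) if b > a)

# Under H0 the number of "+" signs is Binomial(N, 1/2)
p_value = sum(comb(n_pairs, k) for k in range(s_n, n_pairs + 1)) / 2 ** n_pairs

print(f"S_N = {s_n}, P(S_N >= {s_n}) = {p_value:.4f}")
```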
Jonckheere-Terpstra Test
Jonckheere-Terpstra Test
To be used when $H_1$ is ordered.
Ordinal data for the responses and an ordering in the treatments/groups.
Example:
Data:
Three diets for rats
Response: growth
$H_1$: Growth rate decreases from A to C: A ≥ B ≥ C
A   133  139  149  160  184
B   111  125  143  148  157
C    99  114  116  127  146
Parametric Approach: Regression
Models the relationship between a dependent and an independent variable
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
Assumptions:
- $\varepsilon_i \sim N(0, \sigma^2)$, the $\varepsilon_i$ are independent
- homoscedasticity
- $x_i$ is measured without error
$\hat{\beta}_0 = 169$, p-value < 0.0001
$\hat{\beta}_1 = -16$, p-value = 0.0133
R-square = 0.3866
Jonckheere-Terpstra Test
Based on Mann-Whitney statistics for two treatments
Comparing the treatment groups two by two:
- if $W_{BA}$ is large: growth A > growth B ($W_{BA} = 18$)
- if $W_{CB}$ is large: growth B > growth C ($W_{CB} = 18$)
- if $W_{CA}$ is large: growth A > growth C ($W_{CA} = 23$)
JT statistic: $W = \sum_{i<j} W_{ij}$
Reject $H_0$ when W is sufficiently large
$W = 18 + 18 + 23 = 59$
$P_{H_0}(W \ge 59) = 0.0120$
Compare with the result of a Kruskal-Wallis test: p-value = 0.072
For large samples, W is approximately normally distributed
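The pairwise Mann-Whitney counts and the Jonckheere-Terpstra statistic for the rat diet data can be reproduced with a short sketch. The helper name `mann_whitney_count` and the dictionary keys are illustrative; each count is the number of pairs in which the observation from the group expected to be higher actually exceeds the one from the group expected to be lower (there are no ties in these data).

```python
diets = {
    "A": [133, 139, 149, 160, 184],
    "B": [111, 125, 143, 148, 157],
    "C": [99, 114, 116, 127, 146],
}

def mann_whitney_count(lower, higher):
    """Number of pairs (x, y), x from `lower` and y from `higher`, with y > x."""
    return sum(1 for x in lower for y in higher if y > x)

# Groups listed from the lowest to the highest expected growth (C <= B <= A)
ordered = ["C", "B", "A"]
pairwise = {}
for i in range(len(ordered)):
    for j in range(i + 1, len(ordered)):
        low, high = ordered[i], ordered[j]
        pairwise[low + high] = mann_whitney_count(diets[low], diets[high])

jt_statistic = sum(pairwise.values())
print(pairwise, "W =", jt_statistic)
```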
Parametric versus nonparametric tests
Parametric tests:
- Assumptions about the distribution in the population
- Conditions are often not tested
- Test depends on the validity of the assumptions
- Most powerful test if all assumptions are met
Nonparametric tests:
- Fewer assumptions about the distribution in the population
- In case of small sample sizes often the only alternative (unless the nature of the population distribution is known exactly)
- Less sensitive to measurement error (uses ranks)
- Can be used for data which are inherently in ranks, even for data measured on a nominal scale
- Easier to learn