1. Kruskal-Wallis test
Friedman test
Spearman's Correlation
Dr. S. A. Rizwan, M.D.,
Public Health Specialist,
Saudi Board of Preventive Medicine,
Riyadh, Kingdom of Saudi Arabia.
With thanks to Dr. Tajali Nazir Shora
2. Checklist of four questions
• Q1. What scale of measurement has been used?
• Q2. Which hypothesis is being tested?
• Q3. If a hypothesis of difference is being tested, are the samples independent or dependent?
• Q4. How many sets of measures are involved?
4. KW test
• The Kruskal-Wallis test is a nonparametric test that can be used to determine whether three or more independent samples were selected from populations having the same distribution.
  H0: There is no difference in the group medians
  Ha: There is a difference in the group medians
5. KW test
• This test is appropriate for use under the following circumstances:
  – we have three or more conditions that we want to compare;
  – each condition is performed by a different group of participants;
  – the data do not meet the requirements for a parametric test.
6. Requirements
• Use the Kruskal-Wallis test when:
  – the data are not normally distributed;
  – the variances for the different conditions are markedly different; or
  – the data are measurements on an ordinal scale.
• If the data do meet the requirements for a parametric test, it is better to use a one-way independent-measures Analysis of Variance (ANOVA).
7. KW test
• Given three or more independent samples, the test statistic H for the Kruskal-Wallis test is

  H = [12 / (N(N + 1))] × Σ(Tc² / nc) − 3(N + 1)

  Where: N represents the total number of participants,
  Tc is the rank total for each group, and
  nc is the number of participants in each group.
• Reject the null hypothesis when H is greater than the critical value (always use a right-tailed test).
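A direct transcription of this formula into Python (the helper name kruskal_h is ours, not from the slides):

```python
def kruskal_h(rank_totals, group_sizes):
    """Kruskal-Wallis H from per-group rank totals (Tc) and
    group sizes (nc); no tie correction is applied."""
    N = sum(group_sizes)  # total number of participants
    sum_term = sum(t ** 2 / n for t, n in zip(rank_totals, group_sizes))
    return 12 / (N * (N + 1)) * sum_term - 3 * (N + 1)

# Example: the rank totals used later in this deck (8 participants per group)
print(kruskal_h([76.5, 79.5, 144], [8, 8, 8]))   # -> approx. 7.27
```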
8. Procedure
1. Combine the observations of the various groups.
2. Arrange them in order of magnitude from lowest to highest.
3. Assign ranks to each of the observations and replace the observations with their ranks in each group.
4. The original ratio data have thereby been converted into ordinal (ranked) data.
5. The ranks assigned to observations in each group are added separately to give rank sums.
6. The rank sums are used to compute the test statistic, H (a code sketch follows).
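A minimal sketch of the whole procedure, using made-up data; scipy.stats.kruskal is used as a cross-check (scipy applies a tie correction, so its H can differ slightly from the plain formula when ranks are tied):

```python
import numpy as np
from scipy.stats import rankdata, kruskal

# Hypothetical scores for three independent groups (no ties here)
groups = [np.array([12, 9, 15, 10]),
          np.array([14, 18, 13, 17]),
          np.array([20, 16, 22, 19])]

# Steps 1-3: combine all observations and rank them from lowest to highest
sizes = [len(g) for g in groups]
ranks = rankdata(np.concatenate(groups))   # tied scores would share the mean rank

# Steps 4-5: split the ranks back into their groups and sum them
rank_sums = [r.sum() for r in np.split(ranks, np.cumsum(sizes)[:-1])]

# Step 6: compute H from the rank sums (formula from the previous slide)
N = sum(sizes)
H = 12 / (N * (N + 1)) * sum(t**2 / n for t, n in zip(rank_sums, sizes)) - 3 * (N + 1)

H_scipy, p = kruskal(*groups)              # cross-check with scipy
print(H, H_scipy, p)
```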
9. Example
• Does physical exercise alleviate depression?
  – We find some depressed people and check that they are all equivalently depressed to begin with. Then we allocate each person randomly to one of three groups:
    • no exercise;
    • 20 minutes of jogging per day; or
    • 60 minutes of jogging per day.
  – At the end of a month, we ask each participant to rate how depressed they now feel, on a rating scale that runs from 1 ("totally miserable") through to 100 ("ecstatically happy").
10. Appropriate test?
• We have three separate groups of participants.
• In each group, each participant gives us a single score on a rating scale.
• (Ratings are examples of an ordinal scale of measurement, and so the data are not suitable for a parametric test.)
11. KW test
• The Kruskal-Wallis test will tell us if the differences between the groups are so large that they are unlikely to have occurred by chance.
15. Step 2: Find Tc, the total of the ranks for each group
• Just add together all of the ranks for each group in turn.
• Tc1 (the rank total for the "no exercise" group) is 76.5.
• Tc2 (the rank total for the "20 minutes" group) is 79.5.
• Tc3 (the rank total for the "60 minutes" group) is 144.
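With 8 participants per group (N = 24, as stated on a later slide), these totals give H via the formula from earlier:

  H = [12 / (24 × 25)] × (76.5²/8 + 79.5²/8 + 144²/8) − 3 × 25
    = 0.02 × 4113.56 − 75
    ≈ 7.27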
19. Step 4: Calculate the degrees of freedom
• The number of degrees of freedom is the number of groups minus one:
  Degrees of Freedom = Number of Groups − 1
• Here we have three groups, and so we have 2 d.f.
20. Step 5: Assess the significance of H
• Assessing the significance of H depends on
  – the number of participants and
  – the number of groups.
21. KW test
• With three groups and ≤ 5 participants in each group, use the special table of critical values for small sample sizes.
• For more than five participants per group, treat H as Chi-Square (a code sketch follows).
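Where Chi-Square tables are not to hand, the same critical values (and an exact p-value) can be obtained in Python; this sketch assumes scipy is installed:

```python
from scipy.stats import chi2

df = 2                      # number of groups - 1
print(chi2.ppf(0.95, df))   # critical value at p = .05 -> approx. 5.99
print(chi2.ppf(0.99, df))   # critical value at p = .01 -> approx. 9.21
print(chi2.sf(7.27, df))    # exact p for H = 7.27 -> approx. 0.026
```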
24. KW test
• H is statistically significant if it is equal to or larger than the critical value of Chi-Square for the particular df.
• Here, we have 8 participants per group, and so we treat H as Chi-Square.
• H is 7.27, with 2 df.
25. KW test
• Look along the row of the table that corresponds to the number of degrees of freedom; here, the row for 2 df.
• Compare our obtained value of H to each of the critical values in that row, starting on the left-hand side and stopping once our value of H is no longer equal to or larger than the critical value.
26. KW test
• So here, we start by comparing our H of 7.27 to 5.99. With 2 degrees of freedom, a value of Chi-Square as large as 5.99 is likely to occur by chance only 5 times in a hundred: i.e., it has a p of .05.
• Our obtained value of 7.27 is even larger than this, and so this tells us that our value of H is even less likely to occur by chance.
• Our H will occur by chance with a probability of less than 0.05.
27. KW test
• Move on, and compare our H to the next value in the row, 9.21. A Chi-Square of 9.21 will occur by chance one time in a hundred, i.e., with a p of .01. However, our H of 7.27 is less than 9.21, not bigger than it.
• This tells us that our value of H is not so large that it would occur by chance with a probability of only 0.01.
28. KW test
• The likelihood of obtaining a value of H as large as the one we've found, purely by chance, is somewhere between 0.05 and 0.01 (i.e., pretty unlikely), and so we would conclude that there is a difference of some kind between our three groups.
29. KW test
• Note that the Kruskal-Wallis test merely tells you that the groups differ in some way: you need to inspect the group means or medians to see where the difference lies.
• However, in this particular case the interpretation seems fairly straightforward: exercise does seem to reduce self-reported depression, but only for the participants who are doing 60 minutes per day.
• There seems to be no difference between the participants who took 20 minutes of exercise per day and those who did not exercise at all.
30. KW test
• We could write this up as follows:
• "A Kruskal-Wallis test revealed that there was a significant effect of exercise on depression levels (H(2) = 7.27, p < .05). Inspection of the group means suggests that, compared to the 'no exercise' control condition, depression was significantly reduced by 60 minutes of daily exercise, but not by 20 minutes of exercise."
31. KW test
• The post hoc test following a significant KW result is Dunn's test, with a Bonferroni correction for multiple testing (a code sketch follows).
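As an illustration, the third-party scikit-posthocs package provides Dunn's test; the ratings below are hypothetical, invented for this sketch (the deck's raw scores are not in this extract):

```python
import scikit_posthocs as sp

# Hypothetical depression ratings (1-100, higher = happier) for the three groups
no_exercise = [23, 26, 30, 32, 34, 37, 40, 44]
jog_20_min  = [22, 27, 29, 31, 35, 38, 41, 43]
jog_60_min  = [45, 48, 51, 55, 58, 60, 62, 65]

# Pairwise Dunn's test with Bonferroni correction for multiple testing
p_matrix = sp.posthoc_dunn([no_exercise, jog_20_min, jog_60_min],
                           p_adjust='bonferroni')
print(p_matrix)   # matrix of adjusted pairwise p-values
```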
33. Friedman Test
• This is similar to the Wilcoxon signed-rank test, except that we can use it with three or more conditions.
• Each subject participates in all of the different conditions of the experiment.
34. FT
• Does background music affect the performance of laboratory workers?
• We take a group of five workers, and measure each worker's productivity (in terms of the number of stool microscopies per hour) three times:
  – once while the worker is listening to "easy-listening" music,
  – once while the worker is listening to "marching band" music, and
  – once while the same worker is working in silence.
35. Step 1:
Rank each subject's scores individually

Worker | No Music (Raw / Rank) | Easy Listening (Raw / Rank) | Marching Band (Raw / Rank)
   1   |        4 / 1          |         5 / 2               |         6 / 3
   2   |        2 / 1          |         7 / 2.5             |         7 / 2.5
   3   |        6 / 1.5        |         6 / 1.5             |         8 / 3
   4   |        3 / 1          |         7 / 3               |         5 / 2
   5   |        3 / 1          |         8 / 2               |         9 / 3
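A sketch of this per-subject ranking in Python (scores copied from the table above; rankdata's axis argument needs scipy >= 1.4):

```python
import numpy as np
from scipy.stats import rankdata

# Rows = workers 1-5; columns = No Music, Easy Listening, Marching Band
scores = np.array([[4, 5, 6],
                   [2, 7, 7],
                   [6, 6, 8],
                   [3, 7, 5],
                   [3, 8, 9]])

ranks = rankdata(scores, axis=1)   # rank each worker's three scores separately
print(ranks)                       # tied scores share the mean rank (e.g. 2.5)
print(ranks.sum(axis=0))           # rank totals per condition: [ 5.5 11.  13.5]
```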
36. Step 2:
Find the rank total for each condition

Worker | No Music rank | Easy Listening rank | Marching Band rank
   1   |      1        |        2            |       3
   2   |      1        |        2.5          |       2.5
   3   |      1.5      |        1.5          |       3
   4   |      1        |        3            |       2
   5   |      1        |        2            |       3
 Total |      5.5      |        11           |       13.5
37. Step 3:
Work out χr²

  χr² = [12 / (N × C × (C + 1))] × ΣTc² − 3N(C + 1)

• Where:
  – C is the number of conditions,
  – N is the number of subjects, and
  – ΣTc² is the sum of the squared rank totals for each condition.
38. FT
• To get ΣTc², take each rank total from the previous slide and square it:

                      No Music | Easy Listening | Marching Band
  Rank total (Tc)  :     5.5   |      11        |     13.5
  Squared (Tc²)    :    30.25  |     121        |    182.25

• ΣTc² = 30.25 + 121 + 182.25 = 333.5
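The result slide is not in this extract, but plugging the deck's own numbers into the formula above (N = 5 workers, C = 3 conditions) gives:

  χr² = [12 / (5 × 3 × 4)] × 333.5 − 3 × 5 × 4
      = 0.2 × 333.5 − 60
      = 66.7 − 60
      = 6.7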
40. Step 4: Calculate degrees of freedom
• The degrees of freedom are given by the number of conditions minus one:
  C − 1 = 3 − 1 = 2.
41. Step 5:
Assess the statistical significance of χr²
• This depends on the number of subjects and the number of groups:
  – with more than nine subjects, we can use a Chi-Square table;
  – with fewer than nine subjects, special tables of critical values are available.
• Compare the obtained χr² value to the critical value of Chi-Square for the df (a code sketch follows).
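scipy can run the whole test in one call; note that scipy applies a tie correction, so with tied ranks (as in this example) its statistic can differ slightly from the plain-formula value of 6.7, and its p-value uses the Chi-Square approximation rather than the small-sample tables:

```python
from scipy.stats import friedmanchisquare

# One list per condition, one entry per worker (from the earlier table)
no_music       = [4, 2, 6, 3, 3]
easy_listening = [5, 7, 6, 7, 8]
marching_band  = [6, 7, 8, 5, 9]

stat, p = friedmanchisquare(no_music, easy_listening, marching_band)
print(stat, p)
```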
43. FT
• If your obtained value of χr² is bigger than the critical Chi-Square value, the difference between conditions is statistically significant.
• As with the Kruskal-Wallis test, Friedman's test only tells you that some kind of difference exists; inspection of the median score for each condition will usually be enough to see where it lies.
• The post hoc test for the FT is also called Dunn's test.
45. Types of correlation
• Pearson's
• Spearman's
• Hoeffding's D
• Distance correlation
• Mutual Information and the Maximal Information Coefficient
46. Correlation coefficient
• A correlation coefficient is a succinct (single-number) measure of the strength of association between two variables.
• There are various types of correlation coefficient for different purposes.
• Commonly used correlation coefficients are:
  – Pearson's r
  – Spearman's rho
47. Difference between Pearson's r and Spearman's rho
• Spearman's rho makes fewer assumptions about the nature of the data on which the correlation is to be performed: the data need only be measured on an ordinal scale.
• Pearson's r is a measure of the strength of the linear relationship between two variables.
• Spearman's rho is a measure of the strength of the monotonic relationship between them.
48. Difference between Pearson's r and Spearman's rho
• A monotonic relationship simply means that one variable consistently increases (or consistently decreases) as the value of the other variable increases.
• A linear relationship is thus a special case of a monotonic relationship.
• Thus, if there is a monotonic but non-linear relationship between two variables, it's better to use Spearman's rho, because Pearson's r will tend to underestimate the strength of the relationship.
49. Difference between Pearson's r and Spearman's rho
• Spearman's rho won't be "fooled" by the fact that the relationship isn't a linear one (see the sketch below).
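A small demonstration with made-up data: y below is a monotonic but strongly non-linear function of x, so Spearman's rho is a perfect 1.0 while Pearson's r falls well short of it:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.arange(1, 21)
y = np.exp(0.5 * x)          # monotonic, but far from linear

r, _ = pearsonr(x, y)        # well below 1: the linearity assumption hurts
rho, _ = spearmanr(x, y)     # exactly 1.0: the rank order is perfectly preserved
print(r, rho)
```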
51. Monotonic and non-monotonic
• Monotonic variables increase (or decrease) in the same direction, but not always at the same rate.
• Linear variables increase (or decrease) in the same direction at the same rate.
52. How to decide on a correlation test?
• Before running a test, make a scatter plot first to view the overall pattern of your data.
• The strength and direction of a monotonic relationship between two variables can be measured by the Spearman rank-order correlation.
• If your variables are monotonic and linear, a more appropriate test might be Pearson's correlation, as long as the assumptions for Pearson's are met. For example, if your data are highly skewed, have high kurtosis, or are heteroscedastic, you cannot use Pearson's. You can, however, still use Spearman's, which is a non-parametric test.
53. Example for Spearman's rho
• Does vitamin treatment enhance the memory of people who take it?
• We ask if there is a correlation between the number of vitamin treatments that a subject has had and their performance on some memory test. We would have two scores for each subject: the number of vitamin treatments, and that person's memory-test score.
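A sketch of the computation with scipy (the scores are entirely hypothetical, since the slides give no data):

```python
from scipy.stats import spearmanr

# Hypothetical per-subject data
treatments = [1, 2, 2, 3, 4, 5, 6, 7]         # number of vitamin treatments
memory     = [42, 45, 44, 50, 49, 56, 60, 63] # memory-test scores

rho, p = spearmanr(treatments, memory)
print(rho, p)   # rho near +1 suggests a strong positive monotonic association
```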
56. Problems in interpreting correlations
• Correlation does not imply causality.
• Factors affecting the size of the correlation:
  – The smaller the sample, the more likely it is to show sampling variation, and the more variable the correlation can be.
  – Thus, when a correlation coefficient is calculated, it does not represent the one-and-only correlation between those two variables for that population.
  – Different samples might produce higher or lower correlations.
57. Sample and population correlations
• Here are the limits within which 80% of sample r's will fall, when the true correlation (i.e., in the population) is zero.
59. Linearity of the relationship
• Pearson's r is a measure of the linear relationship between two variables, and Spearman's rho is a measure of the monotonic relationship between them.
• Two variables may have a very strong relationship to each other, but of a non-linear or non-monotonic kind (for example, they might have a curvilinear relationship).
• If so, the correlation coefficient will underestimate the strength of the relationship between the two variables.
• This is why you should always make a scatterplot of the data, so that you can look at the trend before calculating a correlation.
60. Homoscedasticity
• Homoscedasticity (equal variability): for Pearson's r to be valid, scores should have a reasonably constant amount of variability at all points in their distribution.
62. Take home messages
• The KW test is the nonparametric equivalent of the one-way ANOVA, for >2 independent groups.
• The FT is the nonparametric equivalent of the repeated-measures ANOVA, for >2 paired groups.
• Spearman's rank-order correlation is the nonparametric equivalent of Pearson's correlation, for monotonic relationships.