A complete guide to non-parametric statistical tests for hypothesis testing, with relevant examples. It covers the meaning of non-parametric tests, the types of non-parametric test, the sign test, rank sum tests, the chi-square test, the Wilcoxon signed-ranks test, the McNemar test and Spearman’s rank correlation.
Subscribe to Vision Academy for Video assistance
https://www.youtube.com/channel/UCjzpit_cXjdnzER_165mIiw
2. Meaning of non-parametric test
Types of non-parametric test
• Sign test
• Rank sum test
• Chi-square test
• Wilcoxon signed-ranks test
• McNemar test
• Spearman’s rank correlation
Conclusion
Bibliography
3. Non-parametric statistics is a branch of statistics. It refers to
statistical methods in which the data are not required to fit a normal
distribution. Non-parametric statistics often uses data that is
ordinal, meaning it relies not on numerical values but on a ranking
or ordering.
For example: a survey conveying consumer preferences ranging
from like to dislike would be considered ordinal data.
4. Non-parametric statistics does not assume that data is drawn from
a normal distribution. Instead, the shape of the distribution is
estimated from the data. The approach covers descriptive statistics,
statistical tests, statistical inference and models. No assumption is
made about sample size; the methods work with the data as observed.
This type of statistics can be used when the mean, sample size,
standard deviation or estimates of other parameters are not available.
5. Sign test
Rank sum test
Chi-square test
Wilcoxon signed-ranks test
McNemar test
Spearman’s rank correlation
6. The sign test is one of the non-parametric tests. As its name
suggests, it is based on the direction, plus (+) or minus (-), of the
signs of the observations in a sample rather than on their magnitudes.
The sign test may be classified into two types:
One sample sign test
Two sample sign test
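7. The one sample sign test is a very simple non-parametric
test, and the data need not be symmetric. The one sample sign
test assesses the statistical significance of a hypothesized
median value for a single data set.
For example
H0 : population median = 63
H1 : population median > 63
Each observation is marked + if it is above 63, - if it is below 63
and 0 if it equals 63. Observations marked 0 are discarded.
Value | Sign
64 | +
69 | +
40 | -
64 | +
65 | +
71 | +
82 | +
59 | -
64 | +
74 | +
63 | 0
+ = 8
- = 2
Total sample = 10 (the value equal to 63 is discarded)
Under H0, the number of + signs follows a binomial distribution with n = 10 and p = 1/2.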
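The one sample example above can be checked with a short Python sketch. It assumes the one-sided alternative stated on the slide (median > 63) and computes an exact binomial tail probability; the function name `sign_test_greater` is purely illustrative.

```python
from math import comb

def sign_test_greater(values, median0):
    """One-sided sign test of H1: population median > median0."""
    plus = sum(1 for v in values if v > median0)
    minus = sum(1 for v in values if v < median0)
    n = plus + minus  # values equal to median0 are discarded
    # P(X >= plus) when X ~ Binomial(n, 1/2), i.e. under H0
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, minus, p_value

data = [64, 69, 40, 64, 65, 71, 82, 59, 64, 74, 63]
plus, minus, p_value = sign_test_greater(data, 63)
print(plus, minus, round(p_value, 4))  # 8 2 0.0547
```

With 8 plus signs out of 10, the tail probability is 56/1024, so H0 would narrowly fail to be rejected at the 5% level.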
8. The sign test has important applications in problems where we deal
with paired data. Each pair of values is replaced with a plus (+)
sign if the value from the first sample (say X) is greater than the
corresponding value from the second sample (say Y), and with a minus (-)
sign if the X value is less than the Y value. If the two values are
equal, the pair is discarded.
For example
By X | 1 | 0 | 2 | 3 | 1 | 0 | 2 | 2 | 3 | 0
By Y | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 1 | 1 | 2
Sign (X-Y) | + | 0 | + | + | - | 0 | + | + | + | -
Total number of + signs = 6
Total number of - signs = 2
Hence, the sample size is 8, since there are 2 zeros in the sign row
and those 2 pairs are discarded (10 - 2 = 8).
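The paired example can be reproduced with a minimal Python sketch. The slide does not state the alternative hypothesis, so the two-sided p-value computed here is an assumption made for illustration.

```python
from math import comb

x = [1, 0, 2, 3, 1, 0, 2, 2, 3, 0]
y = [0, 0, 1, 0, 2, 0, 0, 1, 1, 2]

diffs = [a - b for a, b in zip(x, y)]
plus = sum(1 for d in diffs if d > 0)
minus = sum(1 for d in diffs if d < 0)
n = plus + minus  # pairs with equal values are discarded

# Assumed two-sided test: double the smaller tail of Binomial(n, 1/2), capped at 1
k = min(plus, minus)
tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * tail)

print(plus, minus, n, round(p_two_sided, 4))  # 6 2 8 0.2891
```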
10. Rank sum tests are
U test (Wilcoxon-Mann-Whitney test)
H test (Kruskal-Wallis test)
U test: It is a non-parametric test used to determine
whether two independent samples have been drawn from
the same population. It requires data that can be ranked, i.e.,
ordered from lowest to highest (ordinal data).
11. For example
The values of one sample: 53, 38, 69, 57, 46
The values of another sample: 44, 40, 61, 53, 32
We assign ranks to all observations, ranking from
low to high as if all the items belonged to a single
sample. Tied values share the average of the ranks
they occupy, so the two 53s each receive rank 6.5.
Formula
$U_1 = n_1 n_2 + \frac{1}{2} n_1 (n_1 + 1) - \sum r_1$
$n_1$ = number of readings in the first sample.
$n_2$ = number of readings in the second sample.
$\sum r_1$ = sum of the ranks of the readings in the first sample.
Values in ascending order | Rank
32 | 1
38 | 2
40 | 3
44 | 4
46 | 5
53 | 6.5
53 | 6.5
57 | 8
61 | 9
69 | 10
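As a rough check of the ranking and the U formula, the following Python sketch ranks the pooled values and evaluates U for each sample. The helper `average_ranks` is an illustrative name, and the symmetric formula used for U2 is the standard counterpart of the U1 formula rather than something shown on the slide.

```python
def average_ranks(values):
    """Rank from low to high; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 correspond to ranks i+1..j
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

sample1 = [53, 38, 69, 57, 46]
sample2 = [44, 40, 61, 53, 32]

rank_of = average_ranks(sample1 + sample2)
n1, n2 = len(sample1), len(sample2)
r1 = sum(rank_of[v] for v in sample1)  # rank sum of the first sample
r2 = sum(rank_of[v] for v in sample2)  # rank sum of the second sample

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2  # assumed symmetric counterpart

print(r1, u1, u2)  # 31.5 8.5 16.5
```

The two values always satisfy U1 + U2 = n1 n2 (here 25), which is a quick sanity check on the ranking.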
12. H test: The Kruskal-Wallis H test (also called the “one-way
ANOVA on ranks”) is a rank-based non-parametric test that can be
used to determine whether there are statistically significant differences
between two or more groups of an independent variable on a
continuous or ordinal dependent variable.
For example: an H test could be used to understand whether exam
performance, measured on a continuous scale from 0-100, differed based
on test anxiety level (i.e., the dependent variable would be “exam
performance” and the independent variable would be “test anxiety level”,
which has three independent groups: students with “low”, “medium” and
“high” test anxiety levels).
13. Formula
$H = \frac{12}{n(n+1)} \sum_i \frac{T_i^2}{n_i} - 3(n+1)$
Where $n_i$ = sample size for the i-th population
$T_i$ = rank sum for the i-th population
$n$ = total number of observations.
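The H formula can be illustrated with a small Python sketch. The exam scores below are hypothetical values invented for this illustration (the slides give no data for this test), and no tie correction is applied.

```python
def average_ranks(values):
    """Rank from low to high; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

# Hypothetical exam scores for three test anxiety groups (not from the slides)
groups = {
    "low": [80, 85, 90],
    "medium": [70, 75, 78],
    "high": [50, 60, 65],
}

pooled = [score for scores in groups.values() for score in scores]
rank_of = average_ranks(pooled)
n = len(pooled)

# Sum over groups of T_i^2 / n_i, where T_i is the rank sum of group i
total = 0.0
for scores in groups.values():
    t_i = sum(rank_of[s] for s in scores)
    total += t_i ** 2 / len(scores)

h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
print(round(h, 2))  # 7.2
```

For moderately large groups, H is usually compared with a chi-square distribution with (number of groups - 1) degrees of freedom.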
14. The chi-square test is a non-parametric test. It is used mainly
when dealing with a nominal variable. The chi-square test is
used mainly in 2 ways: as a test of goodness of fit and as a test of
independence.
Goodness of fit: Goodness of fit refers to whether a
significant difference exists between an observed number and
an expected number of responses, people or other objects.
For example: suppose that we flip a fair coin 20 times and record
the frequency of occurrence of heads and tails. Then we should
expect 10 heads and 10 tails.
Let us suppose our coin-flipping experiment yielded 12 heads
and 8 tails. Our expected frequencies are then (10, 10) and our
observed frequencies are (12, 8).
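For the coin example, the goodness-of-fit statistic is the sum of (O - E)² / E over the categories. A minimal Python sketch follows; the p-value line relies on the closed form that holds only for 1 degree of freedom.

```python
from math import erfc, sqrt

observed = [12, 8]   # heads, tails
expected = [10, 10]  # fair coin, 20 flips

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Valid for 1 degree of freedom only: P(chi-square > x) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi2 / 2))

print(round(chi2, 2), df)  # 0.8 1
```

A statistic of 0.8 is well below the 5% critical value for 1 degree of freedom (about 3.84), so 12 heads and 8 tails is consistent with a fair coin.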
15. Independence: the test of independence examines whether
the frequencies of occurrence in two or more categories
differ across two or more groups. For example:
If educational attainment is classified as UG or PG and
income as low, middle or high, then we could use
the chi-square test for independence.
Formula
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where O = observed frequency and E = expected frequency
Educational attainment | Low | Middle | High | Total
UG | 13 | 16 | 1 | 30
PG | 43 | 51 | 60 | 154
Total | 56 | 67 | 61 | 184
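A sketch of the independence test on the table above is given below in Python. Expected frequencies are computed as (row total × column total) / grand total, which is the usual convention; the p-value line uses a closed form that holds only for 2 degrees of freedom.

```python
from math import exp

# Observed counts: rows are educational attainment, columns are income (low, middle, high)
table = {"UG": [13, 16, 1], "PG": [43, 51, 60]}

rows = list(table.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
grand_total = sum(row_totals)

chi2 = 0.0
for row, row_total in zip(rows, row_totals):
    for observed, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        chi2 += (observed - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)

# Valid for 2 degrees of freedom only: P(chi-square > x) = exp(-x / 2)
p_value = exp(-chi2 / 2)

print(round(chi2, 2), df)  # 14.39 2
```

The statistic exceeds the 5% critical value for 2 degrees of freedom (about 5.99), which would suggest that income category and educational attainment are not independent in this table.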
16. In various research situations involving
two related samples, when we can determine
both the direction and the magnitude of the difference
between matched values, we can use an
important non-parametric test, viz., the Wilcoxon
matched-pairs test. While applying this test, we
first find the difference between each pair of
values and assign ranks to the differences from the
smallest to the largest without regard to sign.
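The ranking step described above can be sketched in Python. The paired values are hypothetical, invented for illustration only. The later steps (re-attaching the signs, summing the positive and negative ranks, and taking the smaller sum as the statistic T) follow the usual textbook procedure and are not spelled out on the slide.

```python
def average_ranks(values):
    """Rank from low to high; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

# Hypothetical matched pairs (not from the slides)
first = [10, 12, 9, 15, 11, 14]
second = [8, 13, 9, 11, 8, 9]

# Differences, dropping pairs with no difference
diffs = [a - b for a, b in zip(first, second) if a != b]

# Rank the absolute differences from smallest to largest
rank_of = average_ranks([abs(d) for d in diffs])

w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
t_stat = min(w_plus, w_minus)

print(diffs, w_plus, w_minus, t_stat)  # [2, -1, 4, 3, 5] 14.0 1.0 1.0
```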
18. The McNemar test is one of the important non-parametric tests,
often used when the data are nominal and come from
two related samples. As such, this test is especially useful
with before-and-after measurements of the same subjects.
Example: a researcher wanted to compare the attitudes of
medical students toward confidence in statistical analysis
before and after an intensive statistics course.
Formula
$\chi^2 = \frac{(b - c)^2}{b + c}$ (1 df)
where b and c are the numbers of subjects whose response changed
in one direction and in the other between the two measurements.
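The McNemar formula can be tried out with a few lines of Python. The counts are hypothetical, since the slides give no data for the attitude example; b and c are the two discordant cells of the 2×2 before/after table, and the formula is used without a continuity correction, as written on the slide.

```python
from math import erfc, sqrt

# Hypothetical before/after counts (not from the slides):
# a = confident both times, b = confident before only,
# c = confident after only, d = not confident either time
a, b, c, d = 20, 5, 15, 10

# Only the discordant pairs (b and c) enter the statistic
chi2 = (b - c) ** 2 / (b + c)

# Valid for 1 degree of freedom only: P(chi-square > x) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi2 / 2))

print(chi2)  # 5.0
```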
19. Spearman’s rank correlation is a measure of association that is
based on the ranks of the observations and not on the numerical values
of the data. It was developed by Charles Spearman in the early 1900s,
and as such it is also known as Spearman’s rank correlation
coefficient.
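20. For example
Formula
$r_s = 1 - \frac{6 \sum D^2}{N (N^2 - 1)}$
D = R1 - R2
Where
R1 = rank in rating one
R2 = rank in rating two
N = number of pairs
English (marks) | Maths (marks) | Rank (English) | Rank (Maths) | Difference of ranks |D|
56 | 66 | 9 | 4 | 5
75 | 70 | 3 | 2 | 1
45 | 40 | 10 | 10 | 0
71 | 60 | 4 | 7 | 3
62 | 65 | 6 | 5 | 1
64 | 56 | 5 | 9 | 4
58 | 59 | 8 | 8 | 0
80 | 77 | 1 | 1 | 0
76 | 67 | 2 | 3 | 1
61 | 63 | 7 | 6 | 1
Here ∑D² = 54, so $r_s = 1 - \frac{6 \times 54}{10 \times 99} \approx 0.67$.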
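The rank correlation for the marks table can be verified with a short Python sketch. Ranks are assigned from the highest mark (rank 1) downward, as in the table; the helper `descending_ranks` is an illustrative name and assumes there are no tied marks, which holds for this data.

```python
english = [56, 75, 45, 71, 62, 64, 58, 80, 76, 61]
maths = [66, 70, 40, 60, 65, 56, 59, 77, 67, 63]

def descending_ranks(marks):
    """Rank 1 goes to the highest mark. Assumes no tied marks."""
    ordered = sorted(marks, reverse=True)
    return [ordered.index(m) + 1 for m in marks]

r1 = descending_ranks(english)
r2 = descending_ranks(maths)

d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
n = len(english)
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(d_squared, round(r_s, 4))  # 54 0.6727
```

A coefficient of about 0.67 indicates a fairly strong positive association between the English and Maths rankings.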
21. Non-parametric tests are called “distribution-free” tests since
they make no assumptions about the form of the population distribution.
They may be applied to ranked data. They are easier to explain
and easier to understand, but one should not forget that
they are usually less efficient/powerful than parametric tests,
because they make fewer assumptions about the data.
A non-parametric test is valid under a wide range of conditions, but it
is not always the most efficient choice.
22. Parametric | Non-parametric
Information about the population is completely known | No information about the population is available
Specific assumptions are made regarding the population | No assumptions are made regarding the population
The null hypothesis is made on parameters of the population distribution | The null hypothesis is free from parameters
The test statistic is based on the distribution | The test statistic is arbitrary
Parametric tests are applicable only to variables | Non-parametric tests are applied to both variables and attributes
No parametric test exists for nominal scale data | Non-parametric tests do exist for nominal and ordinal scale data
A parametric test is powerful, if it exists | A non-parametric test is not as powerful as a parametric test