Parametric & non parametric
1. SEMINAR ON RESEARCH METHODOLOGY
NON-PARAMETRIC TESTS,
CORRELATION & REGRESSION
ANALYSES
MULTIVARIATE ANALYSES
&
FACTOR ANALYSES
2. PARAMETER
A parameter is any numerical quantity that
characterizes a given population or some
aspect of it. The most common statistical
parameters are the mean, median, mode and
standard deviation.
3. PARAMETRIC TEST
A test that is based on population constants
(parameters) such as the mean, SD, standard error,
correlation coefficient or proportion, and in which
the data are assumed to follow an established
distribution such as the normal, binomial or Poisson.
4. ASSUMPTIONS OF PARAMETRIC TESTS
The general assumptions of parametric tests
are
o The populations are normally distributed
(follow the normal distribution curve)
o The selected sample is representative of the
general population
o The data are on an interval or ratio scale
5. NON-PARAMETRIC TEST
A test in which no population constant is
used. The data are not assumed to follow any specific
distribution, so no distributional assumptions are made.
E.g. to classify good, better and best we simply
allocate arbitrary numbers or marks to each
category.
6. ASSUMPTIONS OF NON-PARAMETRIC
TESTS
Non-parametric tests can be applied when:
o The data do not follow any specific
distribution and no assumptions about the
population are made
o The data are measured on any scale
7. TESTING NORMALITY
Normality: this assumption is violated only if
there is a large and obvious departure from
normality.
This can be checked by
1. Inspecting a histogram
2. Skewness and Kurtosis
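As a rough illustration of the second check, the sketch below computes skewness and excess kurtosis from the sample moments. The function name and the sample values are made up for illustration; both measures are close to zero for normally distributed data.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) based on population moments."""
    n = len(values)
    mean = sum(values) / n
    # Second, third and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical sample, invented for this sketch (not from the seminar).
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6]
skewness, excess_kurtosis = skewness_and_excess_kurtosis(sample)
print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```

Large absolute values of either measure suggest an obvious departure from normality; what counts as "large" depends on the sample size and is not fixed by this sketch.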
8. COMMONLY USED
NON-PARAMETRIC TESTS
Chi-Square test
McNemar test
The Sign Test
Wilcoxon Signed-Ranks Test
Mann-Whitney U or Wilcoxon rank sum test
The Kruskal Wallis or H test
Friedman ANOVA
The Spearman rank correlation test
Cochran’s Q test
9. CHI-SQUARE TEST
The chi-square test offers an alternative method of
testing the significance of the difference between two
proportions.
The test involves the calculation of the chi-square
statistic.
Its name comes from the Greek letter chi (χ).
10. CON…
Chi-square was developed by Karl Pearson.
Chi-square test is a non-parametric test.
It follows a specific distribution known as Chi-
square distribution.
11. CON…
The three essential requirements for Chi-square
test are:
A random sample
Qualitative data
Lowest expected frequency not less than 5
12. Important Terms
Degrees of freedom
This denotes the extent of independence (freedom)
enjoyed by a given set of observed frequencies.
Suppose we are given a set of “n” observed
frequencies which are subject to “k”
independent constraints (restrictions); then,
13. d.f. = (no. of frequencies) - (no. of
independent constraints on them) = n - k
For a contingency table,
d.f. = (r - 1)(c - 1)
where
r = the no. of rows
c = the no. of columns
14. Contingency Table
When a table is prepared by
enumerating qualitative data and
entering the actual frequencies, and
that table represents the occurrence of
two sets of events, it is called a contingency
table; it is also called an association table.
15. IMPORTANT CHARACTERISTICS OF A
CHI-SQUARE TEST
It is based on frequencies and not on the
parameters like mean and SD.
The test is used for testing the hypothesis and is
not useful for estimation
It can also be applied to a complex contingency
table with several classes and, as such, is a very
useful test in research work.
16. CON…
No rigid assumptions are necessary regarding
the type of population, parameter values are not
needed, and relatively few mathematical details are
involved.
17. CHI SQUARE DISTRIBUTION
If $X_1, X_2, \ldots, X_n$ are independent normal variates,
each distributed normally with mean zero
and SD unity, then
$X_1^2 + X_2^2 + \cdots + X_n^2 = \sum_{i=1}^{n} X_i^2$
is distributed as chi-square ($\chi^2$) with n degrees of
freedom (d.f.).
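This definition can be checked informally by simulation. The sketch below (function name, seed and number of draws are arbitrary choices for illustration) sums the squares of n independent standard normal values many times over; the average of those sums should come out close to n, which is the mean of a chi-square distribution with n d.f.

```python
import random


def simulate_chi_square(n, draws, seed=0):
    """Draw `draws` values of X1^2 + ... + Xn^2 with each Xi ~ N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        for _ in range(draws)
    ]


samples = simulate_chi_square(n=5, draws=20000)
sample_mean = sum(samples) / len(samples)
print(f"mean of simulated values = {sample_mean:.2f} (theory: 5)")
```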
19. CALCULATION OF CHI-SQUARE VALUE
The calculation of the chi-square value is as follows:
Make the contingency table.
Note the frequencies observed (O) in each class
of one event row-wise and the number in each
group of the other event column-wise.
Determine the expected number (E) in each group
of the sample, i.e. in each cell of the table, on the
assumption of the null hypothesis.
20. CON…
The null hypothesis is the hypothesis that there is
no difference between the observed and the expected
frequencies; it is then tested in quantitative
terms.
Find the difference between the observed and the
expected frequencies in each cell (O – E).
Calculate the chi-square value of each cell by the formula
$(O - E)^2 / E$.
Sum up the chi-square values of all the cells to
get the total chi-square value.
21. CON…
Calculate the degrees of freedom which are
related to the number of categories in both the
events.
The formula adopted in case of contingency table
is Degrees of freedom
(d.f.) = (c – 1 ) (r – 1)
Where c is the number of columns and r is the
number of rows
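The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name and the 2×2 counts are invented, and the expected frequency of each cell is taken as (row total × column total) / grand total, the usual choice under the null hypothesis of no association.

```python
def chi_square_contingency(observed):
    """Return (chi-square, d.f.) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency under the null hypothesis.
            e = row_totals[i] * col_totals[j] / grand_total
            chi_square += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (r - 1)(c - 1)
    return chi_square, df


# Made-up counts (rows: two groups, columns: two outcomes).
observed = [[20, 30],
            [30, 20]]
chi_square, df = chi_square_contingency(observed)
CRITICAL_5PCT_1DF = 3.841  # tabulated chi-square for 1 d.f. at the 5% level
print(chi_square, df, chi_square > CRITICAL_5PCT_1DF)
```

With these invented counts every expected frequency is 25, so each cell contributes 1 and the calculated value exceeds the tabulated value for 1 d.f.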
22. ALTERNATIVE FORMULA
In the case of a (2×2) table, we write the cell
frequencies and marginal totals thus:
a       b       (a+b)
c       d       (c+d)
(a+c)   (b+d)   N
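The formula itself is not legible in this transcript. The standard shortcut for a table laid out this way is $\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$, and the sketch below assumes that this is the formula intended. The function name and the counts are illustrative only.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for a 2x2 table using the cell and marginal totals."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


# Same invented counts as a 2x2 table with rows (20, 30) and (30, 20).
value = chi_square_2x2(20, 30, 30, 20)
print(value)
```

The result agrees with the cell-by-cell calculation of the sum of (O − E)²/E for the same table, which is the point of the shortcut.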
23. APPLICATIONS OF A CHI SQUARE TEST
This test can be used for:
Goodness of fit of distributions
Test of independence of attributes
Test of homogeneity
24. TEST OF GOODNESS OF FIT OF
DISTRIBUTIONS:
This test enables us to see how well an
assumed theoretical distribution (such as the
binomial, Poisson or normal distribution)
fits the observed data.
The Chi Square test formula for goodness of fit
is:
$\chi^2 = \sum \frac{(o - e)^2}{e}$
Where, o = observed frequency
e = expected frequency
25. CON…
If χ² (calculated) > χ² (tabulated)
with (n - 1) d.f., the null hypothesis is rejected;
otherwise it is accepted.
If the null hypothesis is accepted, it can be
concluded that the observed data follow the
theoretical distribution.
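A minimal sketch of the goodness-of-fit calculation follows. The observed counts and the 9:3:3:1 theoretical ratio are invented purely for illustration, and the function name is hypothetical; 7.815 is the tabulated value for 3 d.f. at the 5% level.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return (chi-square, d.f.) with d.f. = number of classes - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, len(observed) - 1


# Hypothetical data: four classes expected in the ratio 9:3:3:1.
observed = [90, 30, 28, 12]
ratio = [9, 3, 3, 1]
total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]  # 90, 30, 30, 10

chi_square, df = chi_square_goodness_of_fit(observed, expected)
CRITICAL_5PCT_3DF = 7.815  # tabulated chi-square for 3 d.f. at the 5% level
null_rejected = chi_square > CRITICAL_5PCT_3DF
print(f"chi-square = {chi_square:.3f}, d.f. = {df}, rejected = {null_rejected}")
```

Here the calculated value is well below the tabulated one, so the null hypothesis is accepted and the invented counts are consistent with the assumed ratio.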
26. TEST OF INDEPENDENCE OF ATTRIBUTES
This test enables us to determine whether or not two
attributes are associated.
For instance, if we are interested in knowing
whether a new medicine is effective in controlling
fever, the chi-square test is useful.
In such a situation, we proceed with the null
hypothesis that the two attributes (viz., new
medicine and control of fever) are independent,
which means that the new medicine is not effective in
controlling fever.
27. CON…
If χ² (calculated) > χ² (tabulated) at a certain level
of significance for the given degrees of freedom, the
null hypothesis is rejected,
i.e. the two variables are dependent (i.e., the new
medicine is effective in controlling the fever).
28. CON…
If χ² (calculated) < χ² (tabulated), the null
hypothesis is accepted,
i.e. the two variables are independent (i.e., the new
medicine is not effective in controlling the fever).
When the null hypothesis is rejected, it can be
concluded that there is a significant association
between the two attributes.
29. TEST OF HOMOGENEITY
This test can also be used to test whether the
occurrence of events is uniform or not, e.g. whether the
admission of patients to a government hospital is
uniform across all days of the week can be tested with the
help of the chi-square test.
If χ² (calculated) < χ² (tabulated), the null hypothesis
is accepted, and it can be concluded that there is
uniformity in the occurrence of the events (uniformity
in the admission of patients throughout the week).
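The hospital example can be sketched as follows. The daily admission counts are invented for illustration; under the null hypothesis of uniformity, each day is expected to receive the same share of the weekly total. The tabulated value 12.592 is for 6 d.f. at the 5% level.

```python
# Hypothetical admissions for Monday..Sunday (made-up numbers).
admissions = [20, 18, 22, 19, 21, 20, 20]

# Under uniformity every day has the same expected frequency.
expected_per_day = sum(admissions) / len(admissions)

chi_square = sum(
    (o - expected_per_day) ** 2 / expected_per_day for o in admissions
)
df = len(admissions) - 1  # 7 days - 1

CRITICAL_5PCT_6DF = 12.592  # tabulated chi-square for 6 d.f. at the 5% level
uniform = chi_square < CRITICAL_5PCT_6DF
print(f"chi-square = {chi_square:.2f}, d.f. = {df}, uniform = {uniform}")
```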
30. CONDITIONS FOR THE APPLICATION OF CHI
SQUARE TEST
The data must be in the form of frequencies
The frequency data must have a precise
numerical value and must be organised into
categories or groups.
Observations recorded and used are collected on
a random basis.
All the items in the sample must be independent.
31. CON…
No group should contain very few items, say less
than 10. In case where the frequencies are less
than 10, regrouping is done by combining the
frequencies of adjoining groups so that the new
frequencies become greater than 10. (Some
statisticians take this number as 5, but 10 is
regarded as better by most of the statisticians.)
The overall number of items must also be
reasonably large. It should normally be at least
50.
32. YATES’ CORRECTION
If, in a 2×2 contingency table, the expected
frequencies are small, say less than 5, the chi-square
test cannot be used directly. In that case, the direct
formula of the chi-square test is modified
by Yates’ correction for continuity.
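The corrected formula is not shown in this transcript. A common way of writing it for a 2×2 table with cells a, b, c, d and total N is $\chi^2 = \frac{N(|ad - bc| - N/2)^2}{(a+b)(c+d)(a+c)(b+d)}$, which is equivalent to reducing each |o − e| by 0.5. The sketch below assumes that form; the function name and counts are illustrative.

```python
def chi_square_2x2_yates(a, b, c, d):
    """Chi-square for a 2x2 table with Yates' continuity correction."""
    n = a + b + c + d
    # Reduce |ad - bc| by N/2, but never let the result go below zero.
    adjusted = max(abs(a * d - b * c) - n / 2, 0.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * adjusted ** 2 / denominator


# Made-up small table in which some expected frequencies fall below 5.
a, b, c, d = 3, 7, 8, 2
n = a + b + c + d
uncorrected = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
corrected = chi_square_2x2_yates(a, b, c, d)
print(f"uncorrected = {uncorrected:.3f}, corrected = {corrected:.3f}")
```

The corrected value is always the smaller of the two, so the correction makes the test more conservative when frequencies are small.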
33. ADDITIVE PROPERTY
Several values of χ² can be added together, and if
the degrees of freedom are also added, this
number gives the d.f. of the total value of χ².
Thus, if a number of χ² values are obtained from a
number of samples of similar data, then because
of the additive nature of χ² we can combine the
various values of χ² by simply adding them.
34. Such addition of various values of χ² gives one
value of χ², which helps in forming a better idea
about the significance of the problem under
consideration.
E.g. the table shows the values of χ² from
different investigations carried out to examine the
effectiveness of a recently invented medicine for
checking malaria.
35. [Table: values of χ² and d.f. from the individual investigations]
36. By adding all the values of χ², we obtain a value
equal to 18.0.
Adding the various d.f. as given in the table, we
obtain the value 5.
Now we can state that the χ² value for 5 d.f. is
18.0.
37. Let us take the hypothesis that the new medicine
is not effective.
The table value of χ² for 5 d.f. at the 5% level of
significance is 11.071.
39. Our calculated value is higher than this table
value, which means the difference is significant and is
not due to chance.
As such, the hypothesis is rejected.
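The pooling step can be sketched as below. Only the totals (18.0 with 5 d.f.) and the tabulated value 11.071 come from the text; the individual per-investigation values are invented placeholders chosen to add up to those totals, since the original table is not available here. The function name is hypothetical.

```python
def pool_chi_square(results):
    """Add chi-square values and their d.f. from several investigations."""
    total_chi_square = sum(chi_square for chi_square, _ in results)
    total_df = sum(df for _, df in results)
    return total_chi_square, total_df


# PLACEHOLDER values (chi-square, d.f.), not the figures from the original table.
results = [(2.5, 1), (3.5, 1), (4.0, 1), (3.0, 1), (5.0, 1)]

total_chi_square, total_df = pool_chi_square(results)
TABLE_VALUE_5DF_5PCT = 11.071  # table value quoted in the text
hypothesis_rejected = total_chi_square > TABLE_VALUE_5DF_5PCT
print(total_chi_square, total_df, hypothesis_rejected)
```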
40. CONVERSION OF CHI-SQUARE INTO PHI
COEFFICIENT
Chi-square tells about the significance of a
relation between variables.
It provides no answer regarding the magnitude of
the relation.
In this case we use phi coefficient, which is a
non-parametric measure of coefficient of
correlation, as under:
$\phi = \sqrt{\frac{\chi^2}{N}}$
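As a quick numerical illustration (the χ² value and sample size below are made up, and the function name is hypothetical), the phi coefficient follows directly from the chi-square value and the total number of observations N:

```python
import math


def phi_coefficient(chi_square, n):
    """Phi = sqrt(chi-square / N), a measure of strength of association."""
    return math.sqrt(chi_square / n)


# Invented example: chi-square of 4.0 from a 2x2 table with N = 100.
phi = phi_coefficient(4.0, 100)
print(f"phi = {phi:.2f}")
```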
41. CONVERSION OF CHI-SQUARE INTO
COEFFICIENT OF CONTINGENCY (C)
In case of a contingency table of higher order
than 2×2 table to study the magnitude of the
relation or the degree of association between two
attributes, we convert the chi-square into
coefficient of contingency.
$C = \sqrt{\frac{\chi^2}{\chi^2 + N}}$
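A short sketch of the same conversion in code, using invented numbers and a hypothetical function name:

```python
import math


def contingency_coefficient(chi_square, n):
    """C = sqrt(chi-square / (chi-square + N))."""
    return math.sqrt(chi_square / (chi_square + n))


# Invented example: chi-square of 25.0 from a table with N = 75 observations.
c_value = contingency_coefficient(25.0, 75)
print(f"C = {c_value:.2f}")
```

Because χ² + N is always larger than χ², C stays below 1 however strong the association is.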
42. CON…
While finding out the value of C we proceed on
the assumption of null hypothesis that the two
attributes are independent and exhibit no
association.
It is also known as coefficient of Mean Square
contingency
This measure comes under the category of non-
parametric measure of relationship.
43. CAUTION IN USING THE χ² TEST
Neglect of frequencies of non-occurrence
Failure to equalise the sum of the observed and the
sum of the expected frequencies
Wrong determination of the degrees of freedom
Wrong computations
44. LIMITATIONS OF A CHI SQUARE TEST
The data must come from a random sample.
This test, applied to a fourfold (2×2) table, will not give a
reliable result with one degree of freedom if the
expected value in any cell is less than 5. In such a case,
Yates’ correction is necessary, i.e. reduction of the
absolute value of (o – e) by half (0.5).
Even with Yates’ correction, the test may be misleading
if any expected frequency is much below 5. In that
case another appropriate test should be applied.
45. CON…
In contingency tables larger than 2×2, Yates’
correction cannot be applied.
This test doesn’t indicate cause and effect; it
only tells the probability of occurrence of the
association by chance.
This test tells the presence or absence of an
association between the events but doesn’t
measure the strength of association.
46. CON…
The test is to be applied only when the individual
observations of sample are independent which
means that the occurrence of one individual
observation (event) has no effect upon the
occurrence of any other observation (event) in the
sample under consideration.