Biostatistics

HYPOTHESIS TESTING
(The basis of statistical reasoning)
All research, whether it is a trial of a new technique or a
comparison of two groups or two different techniques or the
assessment of the relationship between two different variables,
begins with a research question. To answer this question, data are
obtained from a sample drawn from the population of interest.
A] The hypothesis: The research question is posed in the form of
a declarative statement.
For example, if the research study is ‘The effect of fluoride varnish
on the decalcification around orthodontic brackets’, the research
question will be ‘Does fluoride varnish have any effect on the
decalcification of enamel around orthodontic brackets?’

Before we commence the study we have to formulate a
hypothesis from the research question. So convert ‘Does fluoride
varnish have any effect on the decalcification of enamel around
orthodontic brackets?’ into a statement:
‘Fluoride varnish has an effect on decalcification of enamel around
orthodontic brackets’
This is the research hypothesis. If the data drawn from the
sample can be shown, beyond reasonable doubt, to be consistent with the
hypothesis, then the hypothesis is accepted. Otherwise it is rejected. Due
to random variation, even an unbiased sample may not represent the
population as a whole, so the pattern seen in the sample may have arisen
by chance. The probability of obtaining such data by chance alone is
calculated using statistical tests of significance.

THE STATISTICAL REASONING PROCESS
Example: A study is conducted to determine whether the
average ANB reading for people with a retrognathic profile is
greater than that of people with an orthognathic profile. From
previous work we know that the average ANB angle of the
orthognathic profile is 2° and the standard deviation of these
values is 3°.
STEP 1: Formulating the research hypothesis
The research question in this study is ‘ Is the average
ANB reading for people with retrognathic profiles greater than
the ANB reading for people with orthognathic profile?’

This can be framed as two differing hypotheses:
1. H0 states that there is no difference between the ANB readings of
people with orthognathic and retrognathic profiles. This can be
symbolically written as
H0: µr = µo
or H0: µr = 2° in this example.
This hypothesis of no difference is called the ‘Null hypothesis’.
2. HA states that the ANB readings of people with retrognathic
profiles are greater than the ANB readings of people with
orthognathic profiles. This can be written as
HA: µr > µo
or HA: µr > 2° in this example.
This hypothesis is called the ‘Alternate hypothesis’.

STEP II: Obtaining information about the population from the
sample
The ANB angles of 100 randomly selected people having
retrognathic profiles were measured from their cephalograms. It
was found that the sample mean of these 100 readings was
x̄r = 5°. Conclusions regarding the population having retrognathic
profiles will be based on this sample data.
STEP III: Setting the level of significance
The investigator sets α = 0.01. In other words, HA will be
accepted as true if and only if there is a 1% or smaller chance of
obtaining a sample mean of 5° or larger by random chance from a
population whose mean is 2°.

The investigator’s decision rule can be represented as:
If P ≤ 0.01, reject H0 and conclude that the sample data support
HA; report that the results are statistically significant.
If P > 0.01, do not reject H0; report that the results are not
statistically significant and could plausibly be due to chance.
STEP IV: Measuring the disparity (difference) between the
population parameter and the sample estimate
The test statistic measures the disparity between the population
mean and the observed sample mean in units of the standard
error.
Test statistic = (sample estimate – population value) / standard error.

Where the sample estimate is x̄r, the mean of the sample of
retrognathic profiles = 5°,
the population value is µr (assuming H0 is true) = 2°,
and the standard error of the estimate is σr/√n, with σr = 3° and
n = 100.
So test statistic = (5 – 2) / (3/√100)
= 3 / (3/10)
= 3 / 0.3
= 10.
In other words, the sample mean x̄r is 10 standard errors
greater than the population mean postulated by H0.

STEP V: Evaluating the evidence against H0
The disparity between the sample estimate and the population
mean can be explained in one of two ways:
HA is true, that is, the sample is drawn from a population of
retrognathic profiles whose mean ANB > 2°.
OR
H0 is true, and the disparity between the sample mean and the
population mean is due to random chance.
The test statistic of 10 is simply the z score of the sample mean
when H0 is true (µr = 2°). Looking up the P value for a z score
of 10 gives P < 0.0001. That is, the probability that a sample
randomly drawn from a population where H0 is true will have a
mean of 5° or larger is less than 0.0001, or less than 1 in 10,000.
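
The arithmetic of Steps IV and V can be checked with a short script. This is only a sketch: it assumes the example values are a sample mean of 5°, a null mean of 2°, a standard deviation of 3° and n = 100 (as in the Step IV arithmetic), and the variable names are illustrative. The tail probability is taken from the standard normal distribution through the complementary error function.

```python
import math

# Values from the worked example; variable names are illustrative.
sample_mean = 5.0   # mean ANB of the retrognathic sample (degrees)
null_mean = 2.0     # population mean postulated by H0 (degrees)
sigma = 3.0         # standard deviation from previous work (degrees)
n = 100             # sample size
alpha = 0.01        # chosen level of significance

# Standard error of the mean and the z statistic
standard_error = sigma / math.sqrt(n)
z = (sample_mean - null_mean) / standard_error

# One-sided upper-tail probability of a standard normal variable:
# P(Z >= z) = 0.5 * erfc(z / sqrt(2))
p_value = 0.5 * math.erfc(z / math.sqrt(2))

reject_h0 = p_value <= alpha
print(f"z = {z:.2f}, P = {p_value:.3g}, reject H0: {reject_h0}")
```

The computed probability is far below 0.0001, so the decision to reject H0 does not depend on how finely the z table is printed.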

STEP VI: Drawing a conclusion
Since P ≤ 0.01, the chosen level of significance, H0 is rejected in
favor of HA. This result is statistically significant at the 1% level
of significance.
Clinically, it can be concluded that the average ANB value of
people having retrognathic profiles is greater than that of
people with orthognathic profiles.
The chance of obtaining such a result if H0 were true, and hence
of committing a Type I error by rejecting H0, is less than 0.0001.

TESTS OF SIGNIFICANCE
These are the statistical tests that are conducted to
check the significance of the data obtained from the sample.
They assess whether the results obtained can reasonably be
attributed to chance or point to an underlying cause. The
result of a test is judged against the chosen level of
significance, α.

CLASSIFICATION OF THE TESTS OF SIGNIFICANCE
A. Based on the utility:
I. Tests of Mean and Variation
1. Paired ‘t’ test
2. Analysis of Variance
3. Bartlett’s test
II. Tests of association
1. Pearson’s co-efficient test
2. Regression analysis
3. Spearman’s co-efficient test

III. Tests of Ranks
1. Wilcoxon test
2. Mann-Whitney test
3. Kruskal-Wallis test
IV. Post Hoc tests
1. Newman Keuls test
2. Scheffe test
3. Duncan test
4. Dunnett test

B. Based on the parameters:
I. Parametric tests
1. ‘t’ tests
2. ANOVA
II. Non parametric tests
1. Wilcoxon test
2. Chi-square test

Selecting the appropriate test
[Figure: decision trees used to choose an appropriate statistical
test when the research question is concerned with (A) evaluating
an association between two variables or (B) analyzing differences
among comparison groups. ANOVA = Analysis of Variance; OR = odds
ratio; RR = relative risk; RRt = relative risk with restrictive
conditions; RCB = randomized complete block design; CRD =
completely random design.]

CHI-SQUARE TEST
The chi-square (χ²) test is a test of statistical significance
used for analyzing qualitative (categorical) data when the observations
are not matched.
Example:
Step I: Formulating the research hypothesis
Consider the research hypothesis “Thumb sucking leads to Class II
malocclusion”.
The null hypothesis would be H0: “Thumb sucking is not related to
Class II malocclusion”.
The alternate hypothesis would be HA: “Thumb sucking leads to Class
II malocclusion”.
A decision rule is formulated with the level of significance α = 0.01.
That is expressed as: Reject H0 if P ≤ 0.01,
Accept H0 if P > 0.01.

Step II: Collecting the data
One hundred subjects with Class I and one hundred with
Class II malocclusion were randomly selected. All 200 subjects
were examined, a thorough case history was taken, and persons
with thumb sucking habits were identified in both the Class II
(C+) and Class I (C-) categories. They were divided into T+
(thumb sucking) and T- (not thumb sucking) groups.
Step III: Tabulating the data
A 2×2 table was used for tabulating the results. They were as
follows:

          C+     C-     Totals
T+        30     10       40
T-        70     90      160
Totals   100    100      200
The row in this table is denoted by r and the column by c. So
the individual squares will be r1c1 (having the number 30), r1c2
(having the number 10), r2c1 (having the number 70) and r2c2
(having the number 90).

Step IV: Calculating the expected frequencies
Expected frequency is the number that would be expected in a
square if the proportion seen in the sample as a whole applied equally
to every group, that is, if H0 were true. For example, consider the first
square r1c1. It holds the number of people who are both C+ and T+, i.e.
subjects with both Class II and a thumb sucking habit. Of the grand total
of 200 subjects, 40 have a thumb-sucking habit. Applying the same
proportion to the number of Class II subjects gives the expected
frequency of square r1c1.
For the 200 subjects (grand total), 40 have a thumb-sucking habit
(r, the row total).
So for the 100 Class II subjects (c, the column total), how many thumb
sucking patients would be expected?

(cross-multiplying)
Er1c1 = column total×row total/ grand total.
Er1c1 = 100×40/ 200.
= 20.
Similarly - Er1c2 = 100×40 /200
= 20.
Er2c1 = 100×160/ 200
= 80
Er2c2 = 100×160/ 200
= 80.

Step V: Calculating the chi-square (χ²) value for
each square
The χ² value for each square is given by the formula:
(Observed freq – expected freq)² / expected freq.
So the χ² values are:
- square r1c1: (30 – 20)²/20 = 5.
- square r1c2: (10 – 20)²/20 = 5.
- square r2c1: (70 – 80)²/80 = 1.25.
- square r2c2: (90 – 80)²/80 = 1.25.

The χ² value for the whole sample is the sum of these
individual χ² values
= 5 + 5 + 1.25 + 1.25
= 12.5
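
The expected frequencies and the χ² total can be reproduced with a few lines of code. This is a minimal sketch using only the observed counts from the 2×2 table; the variable names are illustrative and not part of the original example.

```python
# Observed counts from the 2x2 table: rows are T+ / T-, columns are C+ / C-.
observed = [[30, 10],
            [70, 90]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

expected = []
chi_square = 0.0
for i, row in enumerate(observed):
    expected_row = []
    for j, obs in enumerate(row):
        # Expected frequency = row total x column total / grand total
        exp = row_totals[i] * col_totals[j] / grand_total
        expected_row.append(exp)
        # Contribution of this square: (observed - expected)^2 / expected
        chi_square += (obs - exp) ** 2 / exp
    expected.append(expected_row)

print("expected:", expected)
print("chi-square:", chi_square)
```

Working from the row and column totals rather than hard-coding 20 and 80 means the same loop applies to any table of counts.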
Step VI: Calculating the degrees of freedom
The degrees of freedom depend on the number of rows and
columns in the table and are given by:
d.f = (r-1) (c-1), where r is the number of
rows and c is the number of columns.
So d.f = (2-1) (2-1)
= 1.

Step VII: Statistical inference
The critical χ² value is read from the χ² table for the desired
level of significance, in the row pertaining to 1 degree of
freedom.
Here
The calculated χ2 value is 12.5.
The degree of freedom is 1.
The decided level of significance is α = 0.01.

To consult the χ² table, the column to look at is
given by 1 – α, which is 1 – 0.01 = 0.99.
The χ² table is consulted as follows:
Look down the first column for the degrees of freedom, which in
this case is 1. Then follow that row and locate the position of
the calculated test statistic (χ²), which is 12.5. It is greater
than the last value given, i.e. 7.879 in the 0.995 column. This
indicates that the probability is less than 0.005,
i.e. P < 0.005.
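
Instead of reading a bound from the table, the P value for this particular case can be computed directly. The sketch below relies on the fact that, for 1 degree of freedom only, a χ² variable is the square of a standard normal variable; it does not generalise to larger tables, and the variable names are illustrative.

```python
import math

chi_square = 12.5   # calculated test statistic from the example
alpha = 0.01        # chosen level of significance

# Valid for 1 degree of freedom only:
# P(chi-square >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

reject_h0 = p_value <= alpha
print(f"P = {p_value:.6f}, reject H0: {reject_h0}")
```

The result is consistent with the table lookup, which can only say that P lies below 0.005.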

Compare this with our decision rule.
Reject Ho if P ≤ 0.01,
Accept Ho if P > 0.01.
Here the obtained P < 0.005, so reject H0 and accept HA.
HA = “Thumb sucking leads to Class II malocclusion”
Conclusion is:
Statistically: The results obtained in the sample are
statistically significant, as P < 0.005, well below our level
of significance (α = 0.01).
Clinically: “Thumb sucking leads to Class II malocclusion”.

CORRELATION AND REGRESSION ANALYSIS
Correlation and regression analyses are two procedures used to
analyze associations involving continuous data (interval/ratio scale).
Correlation analysis:
A] Assessing the strength of the association between the two variables:
The correlation co-efficient, denoted as r, defines both the strength
and the direction of the linear relationship between the two
variables.

Characteristics of the correlation co-efficient:
a. The correlation co-efficient varies from –1 to +1.
When r = +1, the two variables have a perfect positive linear
relationship.
When r = -1, the two variables have a perfect negative linear
relationship.
When r = 0, there is no linear relationship between the two variables.
The relationship may be non-linear or there may not be any
relationship at all.
b. The more closely the points in the scatter diagram approximate a
straight line, the greater is the magnitude of r.
c. The correlation co-efficient r of a sample drawn from a population
is an estimate of the population correlation co-efficient ρ.

Calculation:
The correlation co-efficient is given by the formula
$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n-1)\, S_x S_y} $$
Where X bar is the mean of the first variable,
Xi are the individual observations of the first variable.
Y bar is the mean of the second variable,
Yi are the individual observations of the second variable.
Sx and Sy are the standard deviations of the first and second variable
respectively.
n is the number of pairs of observations.
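
As an illustration of how the quantities defined above combine, here is a minimal sketch of the calculation. It assumes Sx and Sy are the usual sample standard deviations (divisor n – 1); the function name and the data are hypothetical and are not taken from any study.

```python
import math

def pearson_r(xs, ys):
    """Correlation co-efficient from paired observations (sketch)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations, divisor n - 1
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Sum of cross-products of deviations from the two means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)

# Hypothetical paired observations
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x_values, y_values)
print(f"r = {r:.4f}")
```

Points lying exactly on a rising straight line give r = +1, and points on a falling line give r = -1, matching the characteristics listed earlier.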

Regression Analysis:
It is possible to formulate an equation relating the two
variables being studied such that, for any given value of ‘x’, it is
possible to find ‘y’.
This is expressed as y = f(x), where x is the independent variable and
y is the dependent variable.
This is done using regression analysis, which relates the
two variables. The goal of regression analysis is to derive a linear
equation that best fits a set of data pairs (x, y) represented as points
on a scatter diagram. This equation can be used to predict the value of
y for given values of x.

The form of the equation:
The general form of the equation is:
y = b0 + b1x
where b1 = Sxy / Sxx
and b0 = ȳ – b1 x̄, with ȳ and x̄ the means of y and x.
The formulae for Sxy and Sxx are:
$$ S_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}), \qquad S_{xx} = \sum_i (x_i - \bar{x})^2 $$
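
A short sketch of the fitting step follows. It assumes the usual least-squares definitions, Sxy as the sum of cross-products of deviations and Sxx as the sum of squared deviations of x; the function name and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x (sketch)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data lying exactly on the line y = 1 + 2x
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])

# Use the fitted equation to predict y for a new value of x
predicted = b0 + b1 * 4
print(f"b0 = {b0}, b1 = {b1}, predicted y at x = 4: {predicted}")
```

With real data the points scatter around the line, and the same two formulas give the line that minimises the squared vertical distances.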

The T-Test
The t-test assesses whether the means of two
groups are statistically different from each other. This
analysis is appropriate whenever you want to compare
the means of two groups, and especially appropriate as
the analysis for the posttest-only two-group
randomized experimental design.

Figure below shows the distributions for the treated (blue)
and control (green) groups in a study. Actually, the figure shows
the idealized distribution -- the actual distribution would usually
be depicted with a histogram or bar graph. The figure indicates
where the control and treatment group means are located. The
question the t-test addresses is whether the means are statistically
different.
What does it mean to say that the averages for two groups
are statistically different?

Consider the three situations shown. The first thing to
notice about the three situations is that the difference between
the means is the same in all three. But you should also notice
that the three situations don't look the same. The top example
shows a case with moderate variability of scores within each
group. The second situation shows the high variability case. The
third shows the case with low variability. Clearly, we would
conclude that the two groups appear most different or distinct in
the bottom or low-variability case. Why? Because there is
relatively little overlap between the two bell-shaped curves. In
the high variability case, the group difference appears least
striking because the two bell-shaped distributions overlap so
much.

Statistical Analysis of the t-test
The formula for the t-test is a ratio. The top part of the
ratio is just the difference between the two means or averages.
The bottom part is a measure of the variability or dispersion of
the scores. This formula is essentially another example of the
signal to noise metaphor in research: the difference between the
means is the signal that, in this case, we think our program or
treatment introduced into the data; the bottom part of the formula
is a measure of variability that is essentially noise that may make
it harder to see the group difference.

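The top part of the formula is easy to compute: just
find the difference between the means. The bottom part is
called the standard error of the difference (SED). To compute
it, we take the VARIANCE for each group and divide it by the
number of people in that group. We add these two values and
then take their square root. The SED is given by:
$$ SE(\bar{x}_T - \bar{x}_C) = \sqrt{\frac{var_T}{n_T} + \frac{var_C}{n_C}} $$
so that the t-value is
$$ t = \frac{\bar{x}_T - \bar{x}_C}{SE(\bar{x}_T - \bar{x}_C)} $$
where the subscripts T and C denote the treated and control groups.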

The t-value will be positive if the first mean is larger than
the second and negative if it is smaller. Once you compute the t-
value you have to look it up in a table of significance to test
whether the ratio is large enough to say that the difference
between the groups is not likely to have been a chance finding.
In the t-test, the degrees of freedom is the sum of the
persons in both groups minus 2. Given the alpha level, the df, and
the t-value, you can look the t-value up in a standard table of
significance (available as an appendix in the back of most
statistics texts) to determine whether the t-value is large enough
to be significant. If it is, you can conclude that the means of
the two groups differ (even given the variability).
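
The calculation described above can be sketched as follows. The scores are hypothetical, the function name is illustrative, and the degrees of freedom follow the rule given in the text (sum of the group sizes minus 2).

```python
import math
import statistics

def two_sample_t(group_1, group_2):
    """t-value and degrees of freedom for two independent groups (sketch)."""
    n1, n2 = len(group_1), len(group_2)
    # Top part of the ratio: difference between the two means
    mean_diff = statistics.mean(group_1) - statistics.mean(group_2)
    # Bottom part: standard error of the difference (SED)
    sed = math.sqrt(statistics.variance(group_1) / n1
                    + statistics.variance(group_2) / n2)
    return mean_diff / sed, n1 + n2 - 2

# Hypothetical post-test scores
treated = [12, 14, 16]
control = [8, 10, 12]
t_value, df = two_sample_t(treated, control)
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

The t-value is positive here because the first group has the larger mean; it would then be compared with the tabulated value for the chosen alpha level and df.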

The paired t-test is used when there is one
measurement variable and two nominal variables. One of the
nominal variables has only two values. The most common
design is that one nominal variable represents different
individuals, while the other is "before" and "after" some
treatment. Sometimes the pairs are spatial rather than
temporal, such as left vs. right, e.g. canine retraction with coil
spring on right quadrant compared with canine retraction
using e-chain on left quadrant.

The paired t-test is only appropriate when there is just
one observation for each combination of the nominal values.
For the canine retraction example, that would be one
measurement of rate of retraction in millimeters per month. If
we have multiple measurements of retraction on each canine,
like millimeters moved after 1 week (T1), after one month (T2)
and after six months (T3) and so on, we would do a two way
ANOVA.
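
For the paired design, the analysis is carried out on the within-pair differences. The sketch below uses the standard paired formula (mean difference divided by its standard error, with n – 1 degrees of freedom), which is not spelled out in the text above; the retraction rates are hypothetical values in millimeters per month, one pair per patient.

```python
import math
import statistics

def paired_t(first, second):
    """Paired t-value and degrees of freedom from matched measurements (sketch)."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_d = statistics.mean(differences)
    # Standard error of the mean difference
    se_d = statistics.stdev(differences) / math.sqrt(n)
    return mean_d / se_d, n - 1

# Hypothetical rates of canine retraction (mm per month), one pair per patient
coil_spring_right = [1.2, 1.0, 1.4, 1.1]
e_chain_left = [1.0, 0.9, 1.1, 1.0]
t_paired, df_paired = paired_t(coil_spring_right, e_chain_left)
print(f"t = {t_paired:.3f} with {df_paired} degrees of freedom")
```

Because each patient acts as his or her own control, variation between patients drops out of the differences.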

LIMITATIONS OF T TEST
1. Fails to gauge the magnitude of the difference between two means
2. Only compares 2 groups
(solution: if more than 2 groups, use ANOVA)
3. Doesn’t control a number of other variables in a simple
pre-post design
4. In many studies a pre-test is not possible
- e.g. mortality studies
5. Within-subject variation is introduced twice
- e.g. in pain ratings

ANOVA
Two variables: 1 Categorical, 1 Quantitative
Question: does the mean of the quantitative variable depend on
which group (given by the categorical variable) the individual is in?
If categorical variable has only 2 values:
• 2-sample t-test
ANOVA allows for 3 or more groups

AN EXAMPLE ANOVA SITUATION
Subjects: 25 patients with oral ulcers
Treatments: Treatment A, Treatment B, Placebo
Measurement: # of days until ulcers heal
Data [and means]:
• A: 5,6,6,7,7,8,9,10 [7.25]
• B: 7,7,8,9,9,10,10,11 [8.875]
• P: 7,9,9,10,10,10,11,12,13 [10.11]
Are these differences significant?

Informal Investigation
Graphical investigation:
• side-by-side box plots
• multiple histograms
Whether the differences between the groups are significant
depends on
• the difference in the means
• the standard deviations of each group
• the sample sizes
ANOVA determines p-value from the F statistic

Assumptions of ANOVA
• each group is approximately normal
  - check this by looking at histograms and/or normal
    quantile plots, or rely on assumptions about the population
  - can handle some nonnormality, but not severe outliers
• standard deviations of each group are approximately
equal
  - rule of thumb: ratio of largest to smallest sample st.
    dev. must be less than 2:1

Normality Check
We should check for normality using:
• assumptions about population
• histograms for each group
• normal quantile plot for each group
With such small data sets, there isn’t a really good way
to check normality from the data, but we make the common
assumption that physical measurements of people tend to be
normally distributed.

Standard Deviation Check
Compare largest and smallest standard deviations:
• largest: 1.764
• smallest: 1.458
• 1.458 x 2 = 2.916 > 1.764
Note: variance ratio of 4:1 is equivalent.

Variable  treatment  N   Mean    Median  StDev
days      A          8   7.250   7.000   1.669
          B          8   8.875   9.000   1.458
          P          9  10.111  10.000   1.764
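
The standard deviations in the table and the 2:1 rule of thumb can be checked directly from the raw data of the example. This is a small sketch; the dictionary and variable names are illustrative.

```python
import statistics

# Days until healing for each treatment, from the example data
days = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

# Sample standard deviation (divisor n - 1) for each group
std_devs = {name: statistics.stdev(values) for name, values in days.items()}

# Rule of thumb: largest SD should be less than twice the smallest SD
ratio = max(std_devs.values()) / min(std_devs.values())
equal_spread_ok = ratio < 2

for name, sd in std_devs.items():
    print(f"{name}: StDev = {sd:.3f}")
print(f"largest/smallest = {ratio:.2f}, assumption acceptable: {equal_spread_ok}")
```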

Notations for ANOVA
• n = number of individuals all together
• I = number of groups
• $\bar{x}$ = mean for entire data set
Group i has
• $n_i$ = # of individuals in group i
• $x_{ij}$ = value for individual j in group i
• $\bar{x}_i$ = mean for group i
• $s_i$ = standard deviation for group i

How ANOVA works
ANOVA measures two sources of variation in the data and
compares their relative sizes
• variation BETWEEN groups
• for each data value look at the difference between its
group mean and the overall mean
• variation WITHIN groups
• for each data value we look at the difference between
that value and the mean of its group

The ANOVA F-statistic is the ratio of the Between
Group Variation to the Within Group Variation:
$$ F = \frac{\text{Between}}{\text{Within}} = \frac{MSG}{MSE} $$
A large F is evidence against H0, since it indicates
that there is more difference between groups than
within groups.

We want to measure the amount of variation due to
BETWEEN group variation and WITHIN group
variation.
For each data value, we calculate its contribution to:
• BETWEEN group variation: $(\bar{x}_i - \bar{x})^2$
• WITHIN group variation: $(x_{ij} - \bar{x}_i)^2$

Computing ANOVA F statistic
                          WITHIN difference:     BETWEEN difference:
                          data - group mean      group mean - overall mean
data  group  group mean   plain    squared       plain    squared
5.3   1      6.00         -0.70    0.490         -0.4     0.194
6.0   1      6.00          0.00    0.000         -0.4     0.194
6.7   1      6.00          0.70    0.490         -0.4     0.194
5.5   2      5.95         -0.45    0.203         -0.5     0.240
6.2   2      5.95          0.25    0.063         -0.5     0.240
6.4   2      5.95          0.45    0.203         -0.5     0.240
5.7   2      5.95         -0.25    0.063         -0.5     0.240
7.5   3      7.53         -0.03    0.001          1.1     1.188
7.2   3      7.53         -0.33    0.109          1.1     1.188
7.9   3      7.53          0.37    0.137          1.1     1.188
TOTAL                              1.757                  5.106
TOTAL/df                           0.25095714             2.55275
overall mean: 6.44
F = MS Between / MS Within = 10.21575 when unrounded group means are
used; the rounded entries shown give 2.55275/0.25095714 ≈ 10.17.
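
The table can be reproduced from the ten data values. The sketch below works with unrounded group means, so its totals differ slightly from the rounded table entries; the variable names are illustrative.

```python
# Data values by group, from the worked table
groups = {
    1: [5.3, 6.0, 6.7],
    2: [5.5, 6.2, 6.4, 5.7],
    3: [7.5, 7.2, 7.9],
}

all_values = [x for values in groups.values() for x in values]
n = len(all_values)
num_groups = len(groups)
overall_mean = sum(all_values) / n

ss_between = 0.0
ss_within = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    # Every value in the group contributes (group mean - overall mean)^2
    ss_between += len(values) * (group_mean - overall_mean) ** 2
    # Each value contributes (value - group mean)^2
    ss_within += sum((x - group_mean) ** 2 for x in values)

ms_between = ss_between / (num_groups - 1)   # df = number of groups - 1
ms_within = ss_within / (n - num_groups)     # df = n - number of groups
f_statistic = ms_between / ms_within
print(f"overall mean = {overall_mean:.2f}, F = {f_statistic:.4f}")
```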

So How big is F?
Since F is
Mean Square Between / Mean Square Within
= MSG / MSE
A large value of F indicates relatively more
difference between groups than within groups
(evidence against H0)
To get the P-value, we compare to F(I-1,n-I)-distribution
• I-1 degrees of freedom in numerator (# groups -1)
• n - I degrees of freedom in denominator (rest of df)

Connections between SST, MST, and
standard deviation
If we ignore the groups for a moment and just compute
the standard deviation of the entire data set, we see
$$ s^2 = \frac{\sum_{i,j} (x_{ij} - \bar{x})^2}{n - 1} = \frac{SST}{n - 1} = MST $$
So SST = (n-1) s², and MST = s². That is, SST and
MST measure the TOTAL variation in the data set.

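Connections between SSE, MSE, and standard
deviation
Remember:
$$ s_i^2 = \frac{\sum_j (x_{ij} - \bar{x}_i)^2}{n_i - 1} = \frac{SS[\text{Within Group } i]}{df_i} $$
So SS[Within Group i] = (s_i²)(df_i)
This means that we can compute SSE from the
standard deviations and sizes (df) of each group:
$$ SSE = SS[\text{Within}] = \sum_i s_i^2 \, (n_i - 1) = \sum_i s_i^2 \, df_i $$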

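Pooled estimate for standard deviation
One of the ANOVA assumptions is that all
groups have the same standard deviation. We can
estimate this with a weighted average:
$$ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 + \dots + (n_I - 1) s_I^2}{n - I} = \frac{SSE}{n - I} = MSE $$
so that $s_p = \sqrt{MSE}$.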
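
As a rough illustration, the weighted average can be computed from nothing more than the group sizes and standard deviations in the summary table of the healing-time example. It assumes the weights are the group degrees of freedom (n_i – 1), so that the pooled variance equals MSE; the variable names are illustrative.

```python
import math

# (sample size, standard deviation) per treatment, from the summary table
summary = {"A": (8, 1.669), "B": (8, 1.458), "P": (9, 1.764)}

n_total = sum(size for size, _ in summary.values())
num_groups = len(summary)

# SSE: each group's variance weighted by its degrees of freedom (n_i - 1)
sse = sum((size - 1) * sd ** 2 for size, sd in summary.values())

# MSE is the pooled variance; its square root is the pooled standard deviation
mse = sse / (n_total - num_groups)
pooled_sd = math.sqrt(mse)
print(f"SSE = {sse:.2f}, MSE = {mse:.3f}, pooled SD = {pooled_sd:.3f}")
```

The pooled value lies between the smallest and largest group standard deviations, as a weighted average should.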

R² Statistic
R² gives the percent of variance due to between
group variation:
$$ R^2 = \frac{SS[\text{Between}]}{SS[\text{Total}]} $$
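
For the healing-time example, R² can be computed from the raw data as the between-group sum of squares divided by the total sum of squares. This is a sketch under that definition; the names are illustrative.

```python
# Days until healing for each treatment, from the example data
days = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

all_values = [x for values in days.values() for x in values]
overall_mean = sum(all_values) / len(all_values)

# Total variation: every value against the overall mean
ss_total = sum((x - overall_mean) ** 2 for x in all_values)

# Between-group variation: every group mean against the overall mean
ss_between = sum(
    len(values) * (sum(values) / len(values) - overall_mean) ** 2
    for values in days.values()
)

r_squared = ss_between / ss_total
print(f"R^2 = {r_squared:.3f}")
```

On these data a little over a third of the total variation in healing time is associated with the treatment group.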

Multiple Comparisons
Once ANOVA indicates that the groups do not all
have the same means, we can compare them two
by two using the 2-sample t test
• We need to adjust our p-value threshold because we are
doing multiple tests with the same data.
• There are several methods for doing this.
• If we really just want to test the difference between one pair of
treatments, we should set the study up that way.
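
One simple way to adjust the threshold is the Bonferroni correction, which divides the overall significance level by the number of comparisons. The text does not say which method is intended, so this is only one possible choice, and the overall level of 0.05 is an assumed value.

```python
from itertools import combinations

treatments = ["A", "B", "P"]
family_alpha = 0.05   # assumed overall significance level (hypothetical)

# Every pair of treatments is compared once
pairs = list(combinations(treatments, 2))

# Bonferroni: each individual test uses alpha / number of tests
per_test_alpha = family_alpha / len(pairs)

for first, second in pairs:
    print(f"{first} vs {second}: declare significant only if P <= {per_test_alpha:.4f}")
```

With three groups there are three pairwise tests, so each one is judged against a threshold one third as large as the overall level.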

Wilcoxon test and Spearman’s Correlation
• The Wilcoxon test can be used in various research situations
involving 2 related samples when we can determine
both the direction and the magnitude of the difference between
matched values.
• Spearman’s rank correlation is used when the data are not
available in numerical form for correlation analysis but
the information is sufficient to rank the data as I, II, III,
and so forth.
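
The rank correlation can be sketched with the standard shortcut formula, rho = 1 – 6Σd² / (n(n² – 1)), where d is the difference between the two ranks of each subject. The formula is not given in the text above and holds only when there are no tied ranks; the ranks below are hypothetical.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation for untied ranks (sketch)."""
    n = len(rank_x)
    # Sum of squared differences between the paired ranks
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks given to five cases by two examiners
examiner_1 = [1, 2, 3, 4, 5]
examiner_2 = [2, 1, 4, 3, 5]
rho = spearman_rho(examiner_1, examiner_2)
print(f"rho = {rho:.2f}")
```

Identical rankings give rho = +1 and exactly reversed rankings give rho = -1, mirroring the behaviour of r.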

Conclusion
Biostatistics is a tool which has to be thoroughly
understood by any postgraduate student before it is used
in his/her own specialty thesis. A lack of understanding
of the methods of data collection and presentation, and
improper sample selection, have led to more failures than any
inherent problem in the method of analysis.
Not only for the thesis, but also while perusing
journal articles, a knowledge of biostatistics can make the
difference between a correct and an incorrect interpretation of results.
“What the mind does not know, the eyes can’t see”
So a knowledge of statistics can open our eyes wider to
assimilating facts appositely from the journals.
