Sample size calculations
1. Sample size calculations
Dr Vinodh Kumar O.R
Division of Epidemiology
ICAR-Indian Veterinary Research Institute
Izatnagar, Bareilly-243 122
2. NEED FOR SAMPLE SIZE CALCULATION
• Sample-size determination is often an important step in
planning an epidemiological study
• An adequate sample size helps ensure that the study will yield
reliable information.
• Conducting a study with an inadequate sample size is not
only futile, it is also unethical.
• Different study designs need different methods of sample size
calculation; no single formula can be used for all
designs.
• Determining sample size is a very important issue because
samples that are too large may waste time, resources and
money, while samples that are too small may lead to
inaccurate results.
3. • Sampling frame: It is a complete
enumeration of the sampling units in
the study population, which may be a
list, directory, map or aerial
configuration.
• Sampling unit: It may be an
individual, a household or a school.
• Non-representativeness of the study population results in lowered accuracy.
• A small sample size leads to low precision.
6. Knowledge of the population
parameters
• By pilot surveys
• By use of results of previous surveys
• By intelligent guess
7. α and confidence level
• Alpha (α ): The
significance level of a
test: the probability of
rejecting the null
hypothesis when it is true
(or the probability of
making a Type I error).
• Confidence level: The
probability that an
estimate of a population
parameter is within
certain specified limits of
the true value; commonly
denoted by “1- α”.
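The link between a confidence level and the z value used in the later formulas can be sketched in a few lines of Python. This is only an illustration: the function name `z_for_confidence` is a hypothetical helper, and it assumes a two-sided interval based on the standard normal distribution.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value z(1 - alpha/2) for a confidence level (1 - alpha)."""
    alpha = 1.0 - confidence
    # inv_cdf returns the point below which the given probability mass lies
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_for_confidence(level):.3f}")
```

For 90% and 95% confidence this gives the familiar values of about 1.645 and 1.96.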
8. • Beta (β): The probability of
failing to reject the null
hypothesis when it is false (or the
probability of making a Type II
error).
• Power: The probability of
correctly rejecting the null
hypothesis when it is false;
commonly denoted by “1- β”
• Precision: A measure of how
close an estimate is to the true
value of a population parameter.
It may be expressed in absolute
terms or relative to the estimate.
• Degree of precision is the margin
of permissible error between the
estimated value and the
population value.
9. Basis for determining the size of sample
• Specification of a precision level.
• Specification of level of confidence.
• Power: The likelihood of rejecting the null
hypothesis when the null hypothesis is false.
10. Margin of error/sampling error
• The margin of error is a statistic expressing the amount of
random sampling error in a survey's results
• The larger the margin of error, the less confidence one can have that the results are close to the true population value.
• The difference between the sample statistic and the related
population parameter is called the sampling error.
Figure: margin of error versus sample size
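The relationship between margin of error and sample size can be illustrated with a short sketch. It assumes the usual normal-approximation margin of error for a proportion, z·√(p(1 − p)/n); the function name and the values p = 0.5 and z = 1.96 are illustrative assumptions, not figures from the slides.

```python
from math import sqrt


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for an estimated proportion."""
    return z * sqrt(p * (1.0 - p) / n)


# Hypothetical scenario: p = 0.5 (the most conservative choice), 95% confidence
for n in (100, 400, 1600):
    print(f"n = {n:5d}  margin of error = {margin_of_error(0.5, n):.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.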
12. Sample size
• The choosing of sample size depends on non-
statistical and statistical considerations.
• Nonstatistical: availability of manpower and
sampling frames.
• Statistical considerations : Precision of the
estimate of prevalence and the expected
prevalence of the disease.
13. Sample size required for estimating
population mean
• Suppose we want an interval that extends d units on either side of the
estimator
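d = (reliability coefficient) × (standard error)
• If sampling is from a population of sufficiently large size, the equation is:
$$ d = z \frac{\sigma}{\sqrt{n}} $$
• When solved for n, this gives:
$$ n = \frac{z^2 \sigma^2}{d^2} $$
where
d = half-width of the confidence interval (the interval extends d units on either side of the estimator)
z = reliability coefficient for the chosen level of confidence
σ² = population variance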
14. Sample size for population mean
• A farm has 1000 young pigs with an initial weight of about 50 kg. They are put
on a new diet for 3 weeks, and we want to know how many pigs to sample
in order to estimate the average weight gain. We want the result to be
within 2 kg at a 90% confidence level.
• We have no idea of σ (the SD).
• 90% confidence level: z = 1.645
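The calculation for this example could be sketched as below. Since σ is unknown, the value `assumed_sigma = 8.0` kg is purely a hypothetical placeholder (in practice it would come from a pilot survey, previous surveys or an informed guess); the function name is likewise an assumption made for illustration.

```python
from math import ceil


def sample_size_mean(z: float, sigma: float, d: float) -> int:
    """n = z^2 * sigma^2 / d^2, rounded up to the next whole animal."""
    return ceil((z * sigma / d) ** 2)


assumed_sigma = 8.0  # HYPOTHETICAL standard deviation of weight gain (kg), not from the slides
n = sample_size_mean(z=1.645, sigma=assumed_sigma, d=2.0)
print(n)  # 44 pigs under this assumed sigma
```

A different guess for σ changes the answer substantially, because n grows with the square of σ.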
15. Sample size required for estimating
proportions
• Same as for population mean.
• Assuming random sampling and approximate
normality in the distribution of p, brings us to the
formula for n if sampling is with replacement, from a
population sufficiently large to warrant ignoring the
finite population correction:
$$ n = \frac{z^2 p q}{d^2} $$
where q = 1 – p
16. What Sample Size for proportion
• A researcher wants to estimate the true FMD immunization coverage in the cattle population of a
village.
• According to the literature, the immunization coverage should be somewhere around 80%.
• Precision (absolute): we would like the result to be within 4% of the true value.
• Confidence level: conventional = 95% = 1 − α; therefore α = 0.05 and z(1−α/2) = 1.96, the value of
the standard normal distribution corresponding to a significance level of 0.05 (1.96 for a
two-sided test at the 0.05 level).
• d = absolute precision = 0.04
• p = expected proportion in the population = 0.80
$$ n = \frac{z_{1-\alpha/2}^2 \, p \, (1 - p)}{d^2} = \frac{(1.96)^2 (0.80)(0.20)}{(0.04)^2} = 384.16 \approx 384 $$
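The arithmetic of this example can be checked with a minimal Python sketch. The function name `sample_size_proportion` is a hypothetical helper; the inputs are the values given in the example.

```python
def sample_size_proportion(p: float, d: float, z: float = 1.96) -> float:
    """n = z^2 * p * (1 - p) / d^2 for estimating a proportion (no finite population correction)."""
    return (z ** 2) * p * (1.0 - p) / (d ** 2)


n = sample_size_proportion(p=0.80, d=0.04)
print(round(n, 2))  # 384.16
```

The slide reports this as 384; rounding up to the next whole animal, which is the more cautious convention, would give 385.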
17. Descriptive studies
• In general, these studies can only identify
patterns or trends in disease occurrence over
time or in different geographical locations, but
cannot ascertain the causal agent or degree of
exposure.
• To calculate the required sample size in a
descriptive study, we need to know the level of
precision, level of confidence or risk and
degree of variability.
18. Finite population correction factor
• When population sizes are less than 10 times the
estimated sample size, it is possible to use a finite
population correction factor.
• The finite population correction factor measures how
much extra precision we achieve when the sample size
becomes close to the population size.
$$ \mathrm{fpc} = \sqrt{\frac{N - n}{N - 1}} $$
where N is the size of the population and n is the size of
the sample.
• If fpc is close to 1, there is almost no effect.
• When fpc is much smaller than 1, sampling a
large fraction of the population does improve
precision.
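A small sketch can show how the correction behaves. It assumes the commonly used form fpc = √((N − n)/(N − 1)); the function name and the population and sample sizes below are chosen for illustration.

```python
from math import sqrt


def fpc(N: int, n: int) -> float:
    """Finite population correction factor, assuming fpc = sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))


for n, N in [(50, 10_000), (50, 10_000_000), (4_000, 10_000), (4_000, 10_000_000)]:
    print(f"n = {n:5d}, N = {N:10d} -> fpc = {fpc(N, n):.4f}")
```

With n = 50 the factor stays very close to 1 for both population sizes, while with n = 4,000 and N = 10,000 it drops to roughly 0.77, i.e. the standard error shrinks by about 23%.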
19. Independent case-control studies
• α = alpha, β = 1 – power, ψ = odds ratio
• m = number of control subjects per case subject
• p1 = probability of exposure in controls; p0 can be estimated as the
population prevalence of exposure
• nc = continuity-corrected sample size
• Zp = standard normal deviate for probability p
26. Sample size calculation for testing a hypothesis
(Clinical trials or clinical interventional studies)
27. Resource equation method
• It depends on the size of the whole experiment and
the number of treatment groups, not the individual
group sizes.
• E is the total number of animals minus the number of treatment
groups. If E is less than 10, more animals should be included; if it
is more than 20, the sample size should be decreased.
• The resource equation method is useful when there is
no previous estimate of the standard deviation.
28. Resource equation method example
• For example, if a factorial experiment is planned
with both sexes and three dose levels then there
will be six treatment groups. If it is proposed that
there should be eight animals in each treatment
group (as is common), there will be 48 animals in
total and E = 48 – 6 = 42. This experiment is
unnecessarily large.
• Redesigning it with four animals per group, E =
24 – 6 = 18, which is within the suggested limits of
10 – 20.
• A power analysis should be used in preference to
the resource equation method wherever possible.
• Unfortunately, power analysis is not so easy to use
when there are more than two groups because it
is more difficult (but not impossible) to specify
the effect size of interest.
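The resource equation arithmetic in this example can be sketched as follows, taking E as the total number of animals minus the number of treatment groups (as in the worked figures above). The function names are hypothetical helpers.

```python
def resource_equation_e(groups: int, animals_per_group: int) -> int:
    """E = total number of animals - number of treatment groups."""
    total_animals = groups * animals_per_group
    return total_animals - groups


def within_suggested_limits(e: int, low: int = 10, high: int = 20) -> bool:
    """True when E falls inside the suggested 10-20 range."""
    return low <= e <= high


# Two sexes x three dose levels = six treatment groups
for per_group in (8, 4):
    e = resource_equation_e(groups=6, animals_per_group=per_group)
    print(per_group, e, within_suggested_limits(e))
```

Eight animals per group gives E = 42 (too large), whereas four per group gives E = 18, inside the suggested range.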
29. What factors affect the power of a
test?
To increase the power of your test, you may do
any of the following:
1. Increase the effect size (the difference between
the null and alternative values) to be detected
2. Increase the sample size(s)
3. Decrease the variability in the sample(s)
4. Increase the significance level (alpha) of the test
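The effect of each of these four factors can be illustrated with a rough sketch. It assumes a two-sided one-sample z-test with known standard deviation, which is a simplification chosen for illustration; the function name and all numeric values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test to detect a true mean shift of `effect`."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)   # critical value for the chosen alpha
    shift = effect * sqrt(n) / sigma          # standardised true difference
    # Probability that the test statistic falls in either rejection region
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


baseline = dict(effect=1.0, sigma=4.0, n=50, alpha=0.05)  # hypothetical scenario
print("baseline      ", round(power_two_sided_z(**baseline), 3))
print("larger effect ", round(power_two_sided_z(**{**baseline, "effect": 2.0}), 3))
print("larger n      ", round(power_two_sided_z(**{**baseline, "n": 200}), 3))
print("smaller sigma ", round(power_two_sided_z(**{**baseline, "sigma": 2.0}), 3))
print("larger alpha  ", round(power_two_sided_z(**{**baseline, "alpha": 0.10}), 3))
```

Each of the four changes raises the power above the baseline, although raising alpha does so at the cost of a higher Type I error rate.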
For example, if the study is in a village (with a
population of, say, 500) and the objective is to
determine the prevalence of some unusual events or
factors among the villagers, the selection unit ideally
should be individuals residing in the village. In this
case, the list of the names of all inhabitants will be the
reference sampling frame. But there are situations
where the sampling frame cannot be worked out so
easily. Taking the example of a similar study covering a
state, it is almost impossible to draw up a list of all
inhabitants residing in the state. So here, simple
random sampling would not be appropriate; one has to
make use of a more practical approach.
1. Specification of a precision level: A decision on the tolerable limits of error is made, i.e. the researcher states that it does not
matter if the sample estimate differs from the true population value by up to a certain amount. For example, suppose a paediatrician plans a study to
estimate the proportion of malnourished children in a village, and suppose that the true proportion of malnourished children is 10%. He is satisfied
if his estimate does not differ from the true value of 10% by more than 5% in relative terms, i.e. he accepts the result of his study if his estimate is within 9.5% to 10.5% (i.e. 10 ± 0.5%).
2. Specification of level of confidence: This is expressed through the degree of uncertainty, i.e. the probability that a sample value lies outside the stated limits (i.e. 10 ± 0.5%).
If this probability is 5%, the investigator has to accept that in 1 in 20 cases the sample result falls outside the desired limits;
if it is 1%, the chance that the sample result falls outside the desired limits is 1 in 100. By convention, the most commonly used levels are 5% and 1% (confidence levels of 95% and 99%), but nothing stops the investigator from tolerating 10%, 2.5%, etc.
When the sample size is 50, it does not matter much
whether the population is 10 thousand or 10 million.
When the sample size is four thousand, then we
have about 23% more precision with a population of
ten thousand than we would for a population of ten
million.