This presentation addresses sample size determination for the social sciences. A simple example is provided so that everyone can understand and explain how a sample size is determined.
2. Let us assume that you have chosen an organization or a
phenomenon in order to study human resources effectiveness or
organizational behavior.
Quantitative and qualitative research are the two research paths
you may choose between, based on your objectives and budget
availability.
Organizational behavior originated from fact-based
scientific experimentation, which requires manipulating
independent variables and studying their impact on the
dependent variables.
However, not all phenomena can be studied through scientific
experimentation. Therefore there is a need for qualitative research.
3. One of the most important tasks is the determination of
sample size.
Quantitative sample size determination is substantially
different from that of qualitative methods.
We will be discussing sample size determination for
quantitative studies.
4. A review of research articles indicates that
inappropriate, inadequate, or excessive sample
sizes continue to affect the quality of
inferences and inflate the cost and effort of
doing research.
In this lecture, procedures for
determining sample size are
addressed to help researchers
select an appropriate sample size.
5. Sample size determination
for quantitative Research
Research starts with defining the population, which consists of sampling units. The
process of defining sampling units and the sampling frame will be discussed in another
lecture.
Our objective is to select a sample that is representative of
the population, make inferences from the sample, and generalize those
inferences to the population.
Since we are using samples to make inferences about the population, there will
be sampling error.
Thus the objective of sample size determination is to minimize sampling error.
6. Sampling Error
The sampling error is the difference between a sample statistic used to
estimate a population parameter and the actual but unknown value of the
parameter.
For example, suppose you want to estimate the average happiness of the
1000 employees in an organization. You determine your
sample size as 118 employees and obtain 3.5 as the average score on
a scale of 1-5.
7. However, if you survey the entire population and perform the same analysis,
you obtain an average of 3.8. Thus there is an error of 3.8 - 3.5 = 0.3.
This is called the sampling error.
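The idea can be sketched with a small simulation. The population scores below are randomly generated purely for illustration (they are not data from any organization), and the names `population` and `sample` are assumptions. The point is only that the sample mean differs from the population mean, and that difference is the sampling error.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: happiness scores (1-5) for 1000 employees.
population = [random.randint(1, 5) for _ in range(1000)]

# Simple random sample of 118 employees, drawn without replacement.
sample = random.sample(population, 118)

population_mean = statistics.mean(population)  # normally unknown to the researcher
sample_mean = statistics.mean(sample)          # what the survey actually gives you

sampling_error = sample_mean - population_mean
print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
print(f"sampling error  = {sampling_error:+.2f}")
```

In a real study only the sample mean is observed, so the sampling error itself stays unknown and can only be bounded through the margin of error.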
The effects of sampling error and of response and nonresponse bias are usually
not considered by researchers when choosing the sample size.
8. What should be the sample size for the following instances?
• Studying voting preferences of Indians; the population for
study is approximately 914.5 million voters in India.
• Attitude of employees of an automobile manufacturing unit
with a workforce of 20,000 dispersed across different locations
in India.
• Measuring the happiness index for 600 employees working in an
IT organization.
• Measuring stress levels of eighty employees working in night
shifts in an information technology enabled service firm.
What is your observation?
9. You will observe that the population ranges from about 914.5 million units to
80 units in the four examples.
What is the relationship between sample size and population size? Should
the researcher be concerned about the size of the population?
The sample size does not depend on the population size. This is a crucial
statistical insight that is counterintuitive. Please do go through the first
self-instruction material given by Journal of Statistical Education.
Observation and Questions
10. Assertion and Discussion
Let us make a strong assertion;
the standard error of an
estimator depends on the size of
the sample, but not on the size of
the population.
Some management researchers
do talk about sampling 5% to 10% of
the population.
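A quick numerical check of this assertion, as a hedged sketch: the function below computes the standard error of a sample mean as s / sqrt(n). The optional finite population correction sqrt(1 - n / N) is not part of this lecture; it is included only as an assumption-labelled check of how little the population size N matters. The values s = 1.167 and n = 118 are the ones used later in this lecture, and the three population sizes come from the instances listed earlier.

```python
import math

def standard_error(s, n, population_size=None):
    """Standard error of a sample mean under simple random sampling.

    Without population_size this is s / sqrt(n).  With it, the finite
    population correction sqrt(1 - n / N) is applied as well.
    """
    se = s / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt(1 - n / population_size)
    return se

s, n = 1.167, 118  # standard deviation and sample size used later in this lecture

for N in (600, 20_000, 914_500_000):
    print(f"N = {N:>11,}: SE = {standard_error(s, n, N):.4f}")
print(f"ignoring N     : SE = {standard_error(s, n):.4f}")
```

Under these assumptions the correction only becomes noticeable when the sample is a sizable fraction of the population (as with N = 600); for the two larger populations the standard error is practically the same as when N is ignored.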
11. Alpha Error and Beta Error
Survey designs try to minimize both
alpha error (finding a difference that does
not actually exist in the population) and
beta error (failing to find a difference
that exists in the population).
12. What is the way to go about calculating the
sample size?
Let us start with the variables that are
measured in a hypothetical job satisfaction
study that is to be conducted.
You know that job satisfaction is a continuous
variable and may be measured to whatever
precision you wish.
13. Primary variables of measurement
Job satisfaction is a continuous variable, and you are going to use a 7-
point scale in the instrument. Thus we will consider a continuous
variable measured on a 7-point scale.
We know that job satisfaction is influenced by variables such as
gender and race (nominal variables) and number of years of service, age
group, and educational qualification (categorical variables).
The question is: which variable should be used in the formula that we are going to
build?
14. Step one of our starting point
Recommendation given by Cochran (1977). He posited that:
• “One method of determining
sample size is to specify margins of error for the
items that are regarded as most vital to the survey.
• An estimation of the sample size needed is first
made separately for each of these important items.”
Reference: Cochran, W. G. (1977). Sampling Techniques (3rd Ed.). New York: John Wiley &
If gender is a primary variable that is likely to affect job satisfaction
(itself a primary variable, measured on the 7-point scale), it is likely to call for a larger
sample size.
Thus the researcher will have a range of n’s (n = sample size), one for each
variable. We should find the n for every primary variable and select the highest
n so that we obtain the lowest likely sampling error.
15. Error Estimation –
Your Decision….
Factors to be considered in error estimation are:
The margin of error: the amount of error
the researcher is willing to accept in the
estimates made about the population.
The alpha level: the level of
risk the researcher is willing to accept
that the true margin of error exceeds the
acceptable margin of error; i.e., the
probability that differences revealed by
statistical analyses really do not exist.
This is also known as Type I error. A Type I
error means inferring a difference that does
not exist.
16. Type II error, which is not addressed in this
lecture
Another type of error is beta error, also called Type II error.
Type II error occurs when statistical procedures result
in a judgment of no significant differences when
these differences do indeed exist.
In other words, a Type II error is a failure to detect
differences that really exist.
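The two kinds of error can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a simple two-sided z-test with a known standard deviation rather than any procedure from this lecture, and the constants `MU0`, `SIGMA`, `N`, and the true difference of 0.3 are hypothetical values chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable

Z_CRIT = 1.96   # two-sided critical value for alpha = 0.05
SIGMA = 1.167   # assumed known standard deviation (hypothetical)
MU0 = 4.0       # hypothesised population mean (hypothetical)
N = 30          # sample size per simulated study (hypothetical)
TRIALS = 2000   # number of simulated studies

def rejects_null(true_mean):
    """Run one simulated study and report whether H0: mean = MU0 is rejected."""
    sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (statistics.mean(sample) - MU0) / (SIGMA / N ** 0.5)
    return abs(z) > Z_CRIT

# Type I error: H0 is true, yet the test reports a difference.
type_1_rate = sum(rejects_null(MU0) for _ in range(TRIALS)) / TRIALS

# Type II error: a real difference of 0.3 exists, yet the test misses it.
type_2_rate = sum(not rejects_null(MU0 + 0.3) for _ in range(TRIALS)) / TRIALS

print(f"estimated Type I error rate : {type_1_rate:.3f}")
print(f"estimated Type II error rate: {type_2_rate:.3f}")
```

With these assumed settings the Type I rate should land close to the chosen alpha of 0.05, while the Type II rate is much larger because 30 observations are too few to detect a difference of 0.3 reliably.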
17. Sample Size Determination and the
Thinking Process
Think on what should be there!
$n = \dfrac{\text{numerator}}{\text{denominator}}$,
Where n is the sample size.
We will not go into the mathematical aspects but try to intuitively
understand the complete process.
To determine n, the sample size, what factors should be in the
numerator and the denominator? That is, how should we design the formula?
18. Alpha Level in Practice
The alpha level used in determining the sample size in most social
sciences research is 0.05 or 0.01.
We will be using Cochran’s formula, which we will discuss in later
slides, together with the Student's t-value.
We have learned to use the Student's t-distribution for small samples
and the normal distribution for large samples. Yet if you read
most articles and statistical output, you will observe a t-value
being reported by SPSS or R even for larger samples.
I want you to reflect on this point for some time by reading the
self-instruction material.
19. I have discussed the history of the t-
distribution in another self-
instruction manual.
The t-value adapts to the sample size:
for sample sizes below 60 it
differs noticeably from the normal value, and for more
than 120 it approaches the normal
distribution value.
The t-value for an α level of 0.05
is 1.96 for sample sizes above
120 sample units.
You may be wondering what the
standard normal table is and where the value
1.96 comes from. Why does it have no units?
For this you should go through the
self-instruction manual discussion
on the standard normal distribution.
t-value
20. Alpha Levels to be used for different studies
An alpha level of 0.05 is acceptable for most of your research reports
and for publication in journals; however, you may use an alpha level of 0.10
or even 0.20 to quickly identify statistical phenomena such as relationships
and differences.
You will use this output for further study.
For critical studies you should use an alpha level of 0.01. This
alpha level is used if the decisions have large financial and
social implications.
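To see how the choice of alpha changes the value that enters the sample size formula, the sketch below computes two-sided critical values from the standard normal distribution using Python's standard library. It assumes a large sample, where the t-value is close to the normal value; the function name `two_sided_critical_value` is introduced here only for illustration.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha):
    """Standard normal value that leaves alpha / 2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.20, 0.10, 0.05, 0.01):
    value = two_sided_critical_value(alpha)
    print(f"alpha = {alpha:.2f} -> critical value = {value:.2f}")
```

A smaller alpha gives a larger critical value, which in turn pushes the required sample size up.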
21. The acceptable margins of error for continuous data and categorical data are
different.
Suppose you want to detect variations in variables such as quickness of
grasping or ability to solve problems, based on the reading material prepared by
SWAYAM.
Education levels such as +2, graduation, and post-graduation are likely to
influence these variables.
In this case the education level influences the other variables, and it
will be our primary variable.
Acceptable Margin of Error
22. Continuous data
If you are measuring job satisfaction
on a 7-point scale and assuming it to
be a continuous variable, a 3%
margin of error is generally acceptable. That
means the researcher is confident
that the true mean of the 7-point
scale is within ± 0.21 of the mean calculated from the sample.
Reflect on why 3% gives us an error
of 0.21.
The answer is that you multiply
0.03 by 7 to obtain 0.21.
Acceptable Margin for Continuous Variables
23. The other critical component of the sample size
formula is the estimation of variance in the
primary variables.
As a researcher you cannot control variance, but
you can incorporate a variance estimate in your
research design.
Four ways are suggested for estimating population
variance for sample size determination.
Variance Estimation
24. Four ways to estimate variance for sample size determination
1. Take the sample in two stages: use the results of the first stage to determine how many
additional samples are needed in the second stage, based on the
variance observed in the first stage.
2. Use pilot studies.
3. Use previous studies with a similar population.
4. Use an estimator based on some mathematical logic.
As a social science researcher you will be
estimating the variance of scaled variables and of
categorical variables.
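As a hedged illustration of the pilot-study route, the sketch below estimates the standard deviation from a handful of pilot responses and plugs it into n = t² s² / d², the formula used in this lecture for continuous variables. The pilot scores are invented for illustration only; they are not data from any study.

```python
import statistics

# Hypothetical pilot responses on a 7-point job satisfaction scale.
pilot_scores = [4, 5, 3, 6, 5, 4, 7, 2, 5, 4, 6, 3, 5, 4, 5]

s = statistics.stdev(pilot_scores)  # sample standard deviation (n - 1 divisor)
t = 1.96                            # value for alpha = 0.05 with a large sample
d = 7 * 0.03                        # 3% margin of error on a 7-point scale

n_required = (t ** 2) * (s ** 2) / (d ** 2)
print(f"pilot standard deviation = {s:.3f}")
print(f"required sample size     = {n_required:.1f}")
```

The same calculation applies to the two-stage approach: the first-stage sample plays the role of the pilot, and the second stage collects whatever is still missing to reach the required n.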
25. The sample standard deviation S
To estimate the standard deviation of a scaled variable we use the following
formula:
$$S = \frac{7\ (\text{number of points on the scale})}{6\ (\text{number of standard deviations})} = \frac{7}{6} \approx 1.167$$
Six standard deviations (three on each side of the mean) cover
almost all of the variation, about 99.7% for a normally distributed variable.
26. Basic Sample Size Determination
Let us assume that a continuous variable is likely to play a role in measurement of job
satisfaction.
Suppose you and your organization have decided to set the alpha level at 0.05 (t = 1.96), plan to use a 7-point scale,
have set the level of acceptable error at 3% (d = 7 × 0.03 = 0.21), and have estimated the standard deviation of the scale
as S = 1.167. Then the formula is
$$n = \frac{t^2 S^2}{d^2} = \frac{(1.96)^2 (1.167)^2}{(7 \times 0.03)^2} \approx 118.6$$
Therefore the required sample size is taken as 118 (rounding up to the next whole unit would give 119).
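The arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `cochran_continuous` is an assumption introduced here, and the inputs are the values stated in this slide.

```python
import math

def cochran_continuous(t, s, d):
    """Sample size for a continuous variable: n = t^2 * s^2 / d^2."""
    return (t ** 2) * (s ** 2) / (d ** 2)

points_on_scale = 7
s = round(points_on_scale / 6, 3)  # estimated standard deviation, 7 / 6 = 1.167
d = points_on_scale * 0.03         # acceptable margin of error, 7 * 0.03 = 0.21
t = 1.96                           # value for alpha = 0.05

n = cochran_continuous(t, s, d)
print(f"n = {n:.2f}")                    # unrounded result
print(f"rounded down: {math.floor(n)}")  # the figure used in this lecture
print(f"rounded up  : {math.ceil(n)}")   # more conservative choice
```

Rounding up is the more cautious convention, since a fractional respondent cannot be sampled and rounding down slightly widens the margin of error.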
27. Suitability of the sample size
This sample size will be suitable if
the participants are a captive
audience or if you have control
over selecting random samples from
the population.
28. Categorical Data
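The sample size formula for categorical
data is similar but not identical:
$$n = \frac{t^2\, p\, q}{d^2} = \frac{(1.96)^2 (0.5)(0.5)}{(0.05)^2} \approx 384 \text{ units}$$
Here p and q = 1 − p are the estimated proportions in the two categories (0.5 each gives the largest variance), and d = 0.05 is the acceptable margin of error.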
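A matching sketch for the categorical case follows. The function name `cochran_categorical` is an assumption, and q is taken as 1 − p, which the slide does not state explicitly. The last step applies the rule given earlier in this lecture: compute n for each primary variable and keep the highest.

```python
def cochran_categorical(t, p, d):
    """Sample size for a proportion: n = t^2 * p * q / d^2, with q = 1 - p."""
    q = 1 - p
    return (t ** 2) * p * q / (d ** 2)

n_categorical = cochran_categorical(t=1.96, p=0.5, d=0.05)
n_continuous = 118  # result of the continuous-variable calculation in this lecture

# Keep the largest n across the primary variables, as suggested earlier.
n_final = max(round(n_categorical), n_continuous)
print(f"categorical n = {n_categorical:.2f}, final n = {n_final}")
```

When a categorical variable is among the primary variables, it will therefore usually drive the sample size, because p = 0.5 is the worst case for variance.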