3. DATA
a. Data is a gathered body of facts
b. Data is the central thread of any activity
Understanding the nature of data is most
fundamental for proper and effective use
of statistical skills
4. Sources of Data
Sources of Data
Internal Sources External Sources
Primary Data Secondary Data
7. Definitions
i) Subpopulation
It is a subset of the population that
inherits the characteristics of the population
but also has some unique characteristics
that are not present in the other distinct
subpopulations within the population.
Example:
All males and all females form two subpopulations.
8. Definitions…
ii)Sampling frame
It is the listing of all items in the population under
study.
Example:
Telephone directory, enrollment form, census, patients
list, etc.
9. Example…
We may use a telephone directory of Kerala as a
sampling frame to represent the population defined
as "the adult residents of Kerala".
Obviously, there would be a number of
elements (people) who fit our population definition,
but do not figure in the telephone directory. Similarly,
some who have moved out of Kerala recently would
still be listed.
Thus, a sampling frame is usually a practical
listing of the population, or a definition of the
elements or areas which can be used for the sampling
exercise.
10. iii) Sample
A finite subset of the population, selected from it with
the objective of investigating its properties, is called
a sample.
Example:
When we want to study the life of
electric bulbs produced by a
company, we select some electric
bulbs and study their length of
life.
11. iv) Sample Size
The number of units or subjects sampled for
inclusion in the study is called the sample size.
It is not a formula alone that determines
sample size. Sampling in practice is based on
science, but is also an art.
12. The sample size is decided based on
a) use of formulae,
b) experience of similar studies,
c) time and budget constraints,
d) output or analysis requirements,
e) number of segments of the target population,
f) number of centres where the study is conducted,
etc.
13. Methods of data collection
1. Census Method
Under this method each and every item or unit
constituting the universe is selected for data
collection.
Eg: The population Census conducted in India once
every ten years.
14. 2. Sample Method
Selection of some part of an aggregate on the
basis of which a judgment or inference about the
aggregate is made.
15. Census Vs Sampling
Size of population
Amount of Funds for the study
Facilities
Time
17. Stages in Sampling
Define the population
Select a sampling frame
Selection of the sample
Collection of information about the population
Making an inference about the population
18. Types of Sampling Technique
Probability sampling techniques
Non-probability sampling techniques
19. Probability Sampling
Every unit in the population has a known, non-zero
chance of being selected in the sample, whether that
chance is small or large, and this chance can be
measured statistically.
If the probability is equal for each unit in
the population, it is called Equal Probability of
Selection.
20. Non Probability Sampling
In this method some units of the population
either have no chance of being selected in the sample,
or their chance of selection cannot be known in advance.
21. SAMPLING TECHNIQUES
Probability Sampling Techniques
• Simple Random sampling
• Stratified sampling
• Systematic sampling
• Probability Proportional to Size sampling (PPS)
• Cluster sampling
• Multi-stage sampling
Non-probability Sampling Techniques
• Judgmental sampling
• Convenience sampling
• Quota sampling
• Snowball sampling
22. Probability Sampling Tech.
1. Simple Random Sampling (SRS)
A sample is selected from a population in
such a way that every member of the population
has an equal chance of being selected and the
selection of any individual does not influence
the selection of any other.
It can be done with or without replacement:
With replacement: possibility of selecting the
same item more than once
Without replacement: more convenient, more
precise results
23. SRS with replacement (SRSWR)
One unit is randomly selected from the
population as the first sampled unit
The sampled unit is then replaced in the population
The second unit is drawn with equal probability
The procedure is repeated until the required n sample
units are drawn
The probability of selection of an element remains
unchanged after each draw
The same unit could be selected more than once
24. Number of possible samples in
SRSWR = $N^n$
Example: 2 elements from 4 (A, B, C, D)
In how many ways can we draw 2 elements from a
population of size 4?
$4^2 = 16$ ways
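The count for this example can be checked by enumeration. The following is a minimal sketch, assuming the four population units are labelled A to D; the variable names are illustrative only.

```python
from itertools import product

population = ["A", "B", "C", "D"]  # N = 4 units
n = 2                               # sample size

# With replacement, repeats are allowed and the order of draws matters,
# so every ordered n-tuple of units is a possible sample.
srswr_samples = ["".join(draw) for draw in product(population, repeat=n)]

print(len(srswr_samples))  # 16, which equals N**n = 4**2
print(srswr_samples)
```

The enumeration grows as N to the power n, so it is only practical for very small populations such as this one.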
26. SRS without replacement (SRSWOR)
Once an element is selected as a sample unit, it is not
replaced in the population
The selected sample units are distinct
Number of possible samples in
SRSWOR = $\binom{N}{n} = \dfrac{N!}{n!\,(N-n)!}$
where $n! = 1 \times 2 \times 3 \times \dots \times n$
e.g. $5! = 1 \times 2 \times 3 \times 4 \times 5 = 120$
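27. Example: 2 elements from 4 (A, B, C, D)
In how many ways can we draw 2 elements from a
population of size 4 using SRSWOR?
With replacement there would be 16 ordered samples:
AA, AB, AC, AD,
BA, BB, BC, BD,
CA, CB, CC, CD,
DA, DB, DC, DD
Without replacement, repeats (AA, BB, CC, DD) are excluded
and order is ignored, leaving $\binom{4}{2} = 6$ samples:
AB, AC, AD, BC, BD, CD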
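For comparison, the samples that are possible without replacement can be enumerated in the same way. This is a small sketch, assuming samples are treated as unordered sets of distinct units, which is what the combination formula counts.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D"]  # N = 4 units
n = 2                               # sample size

# Without replacement, a unit cannot repeat; order is ignored here,
# so each sample is one combination of n distinct units.
srswor_samples = ["".join(units) for units in combinations(population, n)]

print(srswor_samples)             # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
print(comb(len(population), n))   # 6 = 4! / (2! * 2!)
```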
29. Random samples may be selected by
Lottery method: The name or identifying number of each
item in the population is recorded on a slip of paper
and placed in a box. The slips are shuffled, and the
required number of slips is drawn at random from the box.
Random number table: Each item is numbered and a table
of random numbers is used to select the members of the
sample.
30. Table of random numbers…
Suppose your college has 500 students (population) and you
need to conduct a short survey on the quality of the food
served in the cafeteria. You decide that a sample of 70
students (sample) should be sufficient for your purposes.
In order to get your sample, you:
a. Assign a number from 001 to
500 to each student.
b. Use a table of randomly
generated numbers (random
number table).
31. Table of random numbers…
c. Randomly pick a starting point in the table, and look at
the random number that appears there.
d. Because the student numbers run to three digits (up to
500), each random number must also contain three digits.
e. Ignore 000 and all random numbers greater than 500, because
they do not correspond to any of the students in the
college.
Remember: the sample is drawn without replacement, so if a
number recurs, skip over it and use the next random
number.
The first 70 different numbers between 001 and 500 make
up your sample.
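The procedure above can be imitated in code. This is only a sketch: a pseudo-random generator stands in for the printed random number table, and the function name `draw_with_random_digits` and its `seed` parameter are hypothetical.

```python
import random

def draw_with_random_digits(population_size, sample_size, seed=None):
    """Pick distinct unit numbers the way a random number table is read."""
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the population")
    rng = random.Random(seed)             # stands in for the printed table
    digits = len(str(population_size))    # 500 -> read 3-digit numbers
    chosen, seen = [], set()
    while len(chosen) < sample_size:
        number = rng.randrange(10 ** digits)   # 000 .. 999
        if number == 0 or number > population_size:
            continue                      # no student has this number
        if number in seen:
            continue                      # without replacement: skip repeats
        seen.add(number)
        chosen.append(number)
    return chosen

sample = draw_with_random_digits(500, 70, seed=1)
print(len(sample), len(set(sample)))  # 70 70
```

In practice `random.sample(range(1, 501), 70)` gives a simple random sample without replacement directly; the loop is written out only to mirror the table-reading steps.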
33. Merits and Demerits
Merits
Fair way of selecting a sample
Requires minimum knowledge about the population in advance
It is an unbiased probability method
Demerits
It requires a complete and up-to-date list of all the
members of the population.
Does not make use of knowledge about the population
which the investigator may already have.
Many procedures need to be completed before sampling
Expensive and time-consuming
34. 2.Stratified Random Sampling
A population is divided into homogeneous,
mutually exclusive subgroups called strata, and a
sample is selected from each stratum.
Goal: To guarantee that all groups in the
population are adequately represented.
Within a stratum: uniformity (homogeneous)
Between strata: differences (heterogeneous)
35. For example, a group of 200 college teachers can
be first divided into teachers in Arts faculty,
Commerce Faculty and Science Faculty.
After dividing the entire population of teachers into
such classes called strata, a sample is selected from
each stratum of teachers at random. These samples
are put together to form a single sample.
36. Contd… (Allocation proportional to size of strata)
Population size = 500 students
Number of females = 350, so number of males = 500 - 350 = 150
Sample size = 70
Stratify the population by gender (male and female).
Calculate the sample size for each stratum:
Male = (150/500) × 70 = 21 male students
Female = (350/500) × 70 = 49 female students
Total sample = 21 + 49 = 70 students
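The proportional allocation in this example can be written as a short function. This is a sketch under the assumption that each stratum's share is simply rounded to the nearest whole unit; the function name `proportional_allocation` is illustrative.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    # Share of each stratum = (stratum size / population size) * sample size.
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }

allocation = proportional_allocation({"male": 150, "female": 350}, 70)
print(allocation)  # {'male': 21, 'female': 49}
```

With other figures the rounded shares may not add up exactly to the sample size, in which case one or two strata need a manual adjustment.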
37. Merits and Demerits
Merits
It represents all groups in a population
Comparative analysis of data becomes possible
Offers reliable as well as meaningful results
Demerits
It requires accurate information on the proportion
of the population in each stratum.
Possibility of faulty classification
38. 3.Systematic sampling
It is a modification of simple random sampling. It is
also called quasi-random sampling, because it lies
between probability and non-probability sampling.
39. Steps
The procedure begins with finding the sampling
interval k, which is the ratio of the population
size to the sample size (k = N/n). A random start r
is then selected between 1 and k, and every k-th
unit after it is included in the sample.
40. Contd…
• The market researcher might select every 5th person
who enters a particular store, after selecting the
first person at random.
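Linear systematic sampling can be sketched as follows, assuming the units are numbered 1 to N and the interval is taken as the whole-number part of N/n. The function name `linear_systematic_sample` and the `seed` parameter are hypothetical.

```python
import random

def linear_systematic_sample(population_size, sample_size, seed=None):
    """Return the unit numbers chosen by linear systematic sampling."""
    rng = random.Random(seed)
    k = population_size // sample_size   # sampling interval
    start = rng.randint(1, k)            # random start between 1 and k
    # Take the start, then every k-th unit after it.
    return [start + i * k for i in range(sample_size)]

units = linear_systematic_sample(100, 10, seed=0)
print(units)
```

For the store example, the same idea applies with k = 5: after a random first person, every 5th person entering is selected.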
41. Circular systematic sampling
In this case, the end of the list is connected to the
beginning of the list, making the list circular.
This allows the random start r to lie anywhere between 1 and N
(1 ≤ r ≤ N), rather than between 1 and k as in linear
systematic sampling.
42. Example:
Say we want to take a sample of size 10 from a
population of 100, so the interval is k = 100/10 = 10.
We select the first element at random, say the 85th element.
So, our sample will consist of the following
elements:
85, 95, 5, 15, 25, 35, 45, 55, 65, 75
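The wrap-around in this example is modular arithmetic on the unit numbers. A minimal sketch, assuming units are numbered 1 to N and the interval is N // n; the function name is illustrative.

```python
def circular_systematic_sample(population_size, sample_size, start):
    """Select every k-th unit from `start`, wrapping past the end of the list."""
    k = population_size // sample_size   # sampling interval
    # Shift to 0-based positions, step by k, wrap with modulo, shift back.
    return [
        (start - 1 + i * k) % population_size + 1
        for i in range(sample_size)
    ]

print(circular_systematic_sample(100, 10, 85))
# [85, 95, 5, 15, 25, 35, 45, 55, 65, 75]
```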
43. Merits and Demerits
Merits
Convenient and simple to carry out.
The sample is spread evenly over the entire population.
Less cumbersome, less time-consuming, and cheaper
Demerits
If the first subject is not randomly selected, it
becomes a non-random sampling technique
Not every combination of items in the universe gets an
equal chance of being selected
44. 4. Probability Proportional to
size sampling(PPS)
If there is more than one subpopulation, each with a different
number of entities, PPS sampling ensures that the
probability of an entity being selected in the sample is
proportional to the size of its subpopulation.
Example
Suppose we have to select a sample of size 10 from 100
students in 4 colleges
45. Contd…
Colleges                Size   Cumulative size   Range
University College-A    10     10                1 to 10
Arts College-B          20     30                11 to 30
MG College-C            15     45                31 to 45
Kariavattom Campus-D    55     100               46 to 100
K = N/n = 100/10 = 10
Select a random start r between 1 and K; suppose r = 8
46. Contd…
Sample   Number   College
1        8        A
2        18       B
3        28       B
4        38       C
5        48       D
6        58       D
7        68       D
8        78       D
9        88       D
10       98       D

Subpopulation           Sample size
University College-A    1
Arts College-B          2
MG College-C            1
Kariavattom Campus-D    6
Total                   10
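The PPS selection tabulated above can be reproduced with cumulative sizes and a systematic pass. This is a sketch using the college sizes from the example; the function name `pps_systematic` is hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

sizes = {"A": 10, "B": 20, "C": 15, "D": 55}  # colleges and their sizes

def pps_systematic(sizes, sample_size, start):
    """Systematic selection along the cumulative size scale."""
    names = list(sizes)
    cumulative = list(accumulate(sizes.values()))  # [10, 30, 45, 100]
    k = cumulative[-1] // sample_size              # interval K = N/n
    numbers = [start + i * k for i in range(sample_size)]
    # A number belongs to the first college whose cumulative size reaches it.
    return [(num, names[bisect_left(cumulative, num)]) for num in numbers]

selections = pps_systematic(sizes, 10, 8)
print(selections[:4])  # [(8, 'A'), (18, 'B'), (28, 'B'), (38, 'C')]
print(Counter(college for _, college in selections))
# Counter({'D': 6, 'B': 2, 'A': 1, 'C': 1})
```

Larger colleges cover a longer stretch of the cumulative scale, so they are hit by more of the equally spaced numbers.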
47. 5.Cluster Sampling
Cluster means group, therefore, sampling
units are selected in groups.
Cluster sampling is an improvement over
stratified sampling. Both simple random and
stratified random sampling are not suitable
while dealing with large and geographically
scattered populations. Therefore, large-scale
sample surveys are conducted on cluster
sampling basis.
48. Steps:
• Divide the population into groups or clusters
- Within a cluster: differences (heterogeneous)
- Between clusters: uniformity (homogeneous)
• Select clusters at random
49. Cluster Sampling…
Suppose a researcher wants to study the learning
habits of college students in Kerala. He
may select the sample as follows:
1) First prepare a list of all colleges in Kerala.
2) Then select a sample of colleges at random.
Suppose there are 200 colleges in Kerala;
he may select 20 colleges by a random method.
3) From the 20 sampled colleges, prepare a list of
all students. From these lists select the
required number of, say, 1000 students at
random.
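The three steps can be sketched in code. The data below is entirely made up for illustration (200 colleges with 100 students each), and the function name `two_stage_sample` and its parameters are hypothetical.

```python
import random

def two_stage_sample(students_by_college, n_colleges, n_students, seed=None):
    """Pick colleges at random, then pick students from the chosen colleges."""
    rng = random.Random(seed)
    # Steps 1 and 2: list all colleges and select some at random.
    chosen_colleges = rng.sample(sorted(students_by_college), n_colleges)
    # Step 3: pool the student lists of the chosen colleges and sample from them.
    pooled = [s for c in chosen_colleges for s in students_by_college[c]]
    return chosen_colleges, rng.sample(pooled, n_students)

# Hypothetical population: 200 colleges, 100 students in each.
colleges = {
    f"college_{i:03d}": [f"c{i:03d}_s{j:03d}" for j in range(100)]
    for i in range(200)
}
chosen, students = two_stage_sample(colleges, 20, 1000, seed=42)
print(len(chosen), len(students))  # 20 1000
```

Only the 20 chosen colleges need student lists, which is what makes the method cheaper for scattered populations.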
50. Cluster Sampling…
Cluster Formation in EARAS: Key Plot Selection
N = number of survey subdivisions as per BTR
n = number of subdivisions to be selected
Interval, I = N/n (rounded to the nearest
integer)
R = random start, which is less than or equal to
N
The subdivisions with sampling serial numbers
R, R+I, R+2I, R+3I, …, R+(n-1)I will be the key
plots selected.
If any of these exceeds N, N is subtracted
from it to get the serial number of the survey
subdivision to be selected.
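The key plot rule can be sketched as below. The values of N, n and R in the usage line are made up for illustration, the function name `key_plot_serials` is hypothetical, and rounding halves upward is an assumption, since the slide only says "nearest integer".

```python
def key_plot_serials(N, n, R):
    """Serial numbers R, R+I, ..., R+(n-1)I, wrapped back into 1..N."""
    I = int(N / n + 0.5)        # interval rounded to the nearest integer
    serials = []
    for j in range(n):
        serial = R + j * I
        while serial > N:       # exceeds N: subtract N, as the rule states
            serial -= N
        serials.append(serial)
    return serials

print(key_plot_serials(20, 4, 17))  # [17, 2, 7, 12]
```

The `while` loop covers the case where rounding the interval upward pushes a serial number past N more than once.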
51. For the formation of clusters, 100 survey numbers
are selected randomly from the Basic Tax Register;
these are known as key plots.
100 clusters from each Investigator Zone
52. 6.Multistage Sampling
As the name suggests, multistage sampling is
carried out in steps. This method is regularly used
in conducting national surveys on large scale. It is
an economical and time saving method of
selecting a sample out of widely spread
population.
In this method the population is first
divided by state, then by district, city,
locality and ward, and units are sampled
at each stage until the final sampling
units (individuals) are reached.
53. Multistage…
Involves selecting a sample in at least two stages
e.g: i. Stage 1: Stratified Sampling
Stage 2: Systematic Sampling
e.g: ii. Stage 1: Cluster Sampling
Stage 2: Stratified Sampling
Stage 3: Simple Random Sampling
54. Multistage…
A stratified multi-stage design
rural sector: The first stage units (FSU) are panchayath
wards
urban sector: The FSU are Urban Frame Survey (UFS) blocks
In case of large FSUs, one intermediate stage of sampling is the
selection of two hamlet-groups (hgs)/ sub-blocks (sbs) from
each rural/ urban FSU
The ultimate stage units (USU) are households in both the
sectors
55. Sampling Frame for First Stage Units
rural sector
the list of 2001 panchayath wards
For the urban sector
the list of latest available UFS blocks
58. Non Probability Sampling
Unequal chance of being included in the sample (non-
random)
Non random or non - probability sampling refers to
the sampling process in which, the samples are
selected for a specific purpose with a pre-determined
basis of selection.
59. 1.Judgemental Sampling
In this method, the sample selection is purely based on
the judgement of the investigator or the researcher.
This may be because the researcher lacks information
about the population from which he has to collect
the sample. Population characteristics or qualities may
not be known, but a sample has to be selected.
60. contd….
For example, suppose 100 boys are to be selected from
a college with 1000 boys. If nothing is known about
the students in this college, then the investigator may
visit the college and choose the first 100 boys he meets.
Or he may select 100 boys all belonging to III Year. Or
he might select 25 boys from Commerce course, 25
from Science courses, 25 boys from Arts courses and 25
from Fine arts courses. Hence, when only the sample
size is known, the investigator uses his discretion and
selects the sample.
61. 2.Convenience sampling
This method of sampling involves selecting the sample
elements using some convenient method without going
through the rigour (extremes) of a sampling method.
The researcher may make use of any convenient base to
select the required number of samples.
It involves the sample being drawn from that part of the
population which is close to hand, that is, readily
available and convenient.
62. contd….
For example, suppose 100 car owners are to be selected.
Then we may collect from the RTO's office the list of
car owners and then make a selection of 100 from that
to form the sample.
63. 3.Quota Sampling
In this method, the sample size is determined first and
then quota is fixed for various categories of
population, which is followed while selecting the
sample.
In this method the quota has to be determined in
advance and intimated to the investigator. The quota
for each segment of the population may be fixed at
random or with a specific basis. Normally such a
sampling method does not ensure representativeness
of the population.
64. Contd….
Example:
Suppose we want to select 100 students. We might say
that the sample should follow the quota given below:
Boys 50%, Girls 50%. Then among the boys: 20% college
students, 40% plus two students, 30% high school
students and 10% elementary school students. A different
or the same quota may be fixed for the girls.
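The quota arithmetic in this example can be worked out as follows. This is a sketch only; the function name `quota_counts` is illustrative, and the quota for girls is left out because the slide does not fix it.

```python
def quota_counts(total, percentages):
    """Turn percentage quotas into head counts for a group of size `total`."""
    return {
        category: round(total * pct / 100)
        for category, pct in percentages.items()
    }

by_gender = quota_counts(100, {"boys": 50, "girls": 50})
boys_quota = quota_counts(
    by_gender["boys"],
    {"college": 20, "plus two": 40, "high school": 30, "elementary school": 10},
)
print(by_gender)   # {'boys': 50, 'girls': 50}
print(boys_quota)  # {'college': 10, 'plus two': 20, 'high school': 15, 'elementary school': 5}
```

The counts only tell the investigator how many respondents to find in each category; who is picked within a category is still left to the investigator.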
65. 4.Snowball Sampling
It begins by identifying someone who meets the criteria
for inclusion in the study.
Selection of additional respondents is then based on
referrals from the initial respondents.
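The referral chain can be pictured as a walk over a "who refers whom" list. The sketch below uses a made-up referral list and a hypothetical function name `snowball_sample`; it assumes referrals are followed in the order they are received.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Start from initial respondents and follow referrals until enough are found."""
    sample = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        # Each respondent refers further people who meet the criteria.
        for referred in referrals.get(person, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referrals: P1 is the initial respondent.
referrals = {"P1": ["P2", "P3"], "P2": ["P4"], "P3": ["P4", "P5"], "P5": ["P6"]}
print(snowball_sample(referrals, ["P1"], 4))  # ['P1', 'P2', 'P3', 'P4']
```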