This document discusses various topics related to sampling and data collection, including:
1. It describes different types of data sources like primary data collected by the researcher and secondary data collected by others. It notes the advantages and disadvantages of each.
2. It discusses different levels of measurement for data like nominal, ordinal, interval, and ratio scales.
3. It covers sampling techniques including probability methods like simple random sampling, systematic sampling, stratified random sampling, and cluster sampling as well as non-probability methods like purposive sampling, quota sampling, snowball sampling, and convenience sampling.
4. It provides an overview of scale construction techniques for developing measurement scales.
1. Unit 3
By: Dr. Monu Singh,
Assistant Professor,
Dept. Of Management & Commerce,
SRM University, Sikkim, India
2. Types of data –sources
Measurement level- concepts.
Level of measurement- nominal, ordinal, interval and ratio.
Sampling techniques- meaning, types and scale construction techniques.
Population and Sample.
Sampling theory- concepts, methods, sample frame and error.
Sample size, characteristics of good sample, parametric and statistic.
Types of sample design- probability and non-probability sample.
3. Types of data –sources
• There are many ways of classifying data.
• A common classification is based upon who collected the data.
• Primary data: Data collected by the investigator himself/ herself for a
specific purpose. (first hand data)
• Examples: Data collected by a student for his/her thesis or research project.
4. • Secondary data: Data collected by someone else for some other purpose
(but being utilized by the investigator for another purpose).
• Examples: Census data being used to analyze the impact of education on
career choice and earning.
5. Advantages of using Primary data:
The investigator collects data specific to the problem under study.
There is no doubt about the quality of the data collected (for the
investigator).
If required, it may be possible to obtain additional data during the study
period.
6. Disadvantages of using Primary data
Expensive mode of data collection
Risk of errors during collection
Time consuming.
7. Advantages of using Secondary data
• The data’s already there- no hassles of data collection
• Saving in resources, in particular your time and money.
• The investigator is not personally responsible for the quality of data (“I
didn’t do it”)
8. Disadvantages of using Secondary data:
• May be collected for a purpose that does not match your need.
• One can only hope that the data is of good quality
• Obtaining additional data (or even clarification) about something is not
possible (most often).
• Where data have been collected for commercial reasons, gaining access
may be difficult or costly.
9. SOURCE/ Methods OF DATA COLLECTION:
PRIMARY DATA
• Questionnaire
• Interview
• Observation
• Survey
SECONDARY DATA
• Books published
• Journals
• Newspaper
• Articles
• Reports
• Internet
10. Measurement level- concepts.
Level of measurement- nominal, ordinal, interval and
ratio.
• There are four levels of measurement:
• Nominal,
• Ordinal,
• Interval,
• Ratio.
11. Nominal scales
• Nominal scales are used for labeling variables, without
any quantitative value.
• “Nominal” scales could simply be called “labels.”
12. ORDINAL SCALE
• Ordinal scales rank individuals, attitudes or items along the continuum of
the characteristic being scaled, but the size of the difference between
ranks is unknown.
• Example: a researcher asks farmers to rank 5 brands of pesticide in order of
preference.
Order of preference Brand
1 Pesticide 1
2 Pesticide 2
3 Pesticide 3
4 Pesticide 4
5 Pesticide 5
13. INTERVAL SCALE
• The interval scale is a quantitative measurement scale where
the difference between two values is meaningful.
• In other words, values are measured in actual units rather than in relative
terms, and the zero point is arbitrary.
• Example: Please indicate your views on golden apples by ticking the
appropriate response below. (A rating scale like this is strictly ordinal, but it
is commonly treated as interval when the response steps are assumed to be
equally spaced.)
Content            Excellent  Very good  Good  Fair  Poor
Value for money
Attractiveness
14. Ratio
• The ratio scale is a quantitative measurement scale.
• Like the interval scale, it allows the researcher to compare intervals or differences.
• In addition, the ratio scale has a true zero point (point of origin). This is the unique feature
of the ratio scale, and it is what makes statements such as "twice as heavy" meaningful.
• Examples of ratio-scale variables are weight, length and time.
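The four levels can be contrasted by the summaries each one supports. The sketch below is only illustrative: the data values are made up, and temperature in Celsius is an assumed interval-scale example that does not appear in the slides.

```python
import statistics

# Hypothetical example data, one list per level of measurement.
blood_groups = ["A", "B", "O", "O", "AB", "O"]   # nominal: labels only
brand_ranks = [1, 2, 3, 4, 5]                    # ordinal: order known, spacing unknown
temps_celsius = [20.0, 25.0, 30.0]               # interval: zero point is arbitrary
weights_kg = [40.0, 60.0, 80.0]                  # ratio: true zero point

# Nominal data support counting, so the mode is meaningful.
most_common_group = statistics.mode(blood_groups)

# Ordinal data support ordering, so the median is meaningful.
middle_rank = statistics.median(brand_ranks)

# Interval data support differences and means.
mean_temp = statistics.mean(temps_celsius)
temp_difference = temps_celsius[2] - temps_celsius[0]

# Ratio data also support ratios ("twice as heavy").
weight_ratio = weights_kg[2] / weights_kg[0]

print(most_common_group, middle_rank, mean_temp, temp_difference, weight_ratio)
```

Each level keeps the operations of the level before it, which is why a ratio between two Celsius temperatures is not meaningful while a ratio between two weights is.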
15. Sampling techniques- meaning, types and scale
construction techniques.
• Sampling is the easiest method of social investigation.
• According to Goode and Hatt, “a sample is a smaller representation of the
larger whole”.
16. A sample should possess the following essential
characteristics to provide accurate results.
1. Representativeness: it should be representative, to give a true picture of the population.
2. Adequacy: the size of the sample should be adequate to provide reliability.
3. Independence: all items of the sample should be selected independently of one another.
4. Homogeneity: there should be no basic difference in the nature of units between the universe
and the sample.
5. Lack of bias: it should be unbiased.
6. Accuracy and completeness: a sample should never give incomplete information and it
should not omit any unit included in the sample.
17. Advantages of sampling
1. Time saving: studying only a sample, that is a small number of units,
requires much less time than studying the whole universe.
2. Economic: sampling makes the study less expensive.
3. Huge scope: as the units covered are small in number, there is much scope for a
detailed study.
4. Result accuracy: by proper selection of the sample, the accuracy of results can be
ensured.
5. Best method: if the universe is vast and scattered and all units can’t be
contacted, sampling is the best method.
18. Disadvantages of sampling
1. Faulty method of sampling may lead to biased selection and hence false generalization.
2. Results will be accurate only if the sample is representative. But selection of a
representative sample is difficult.
3. Sampling demands knowledge in sampling techniques, statistical analysis and calculation
of errors. Otherwise results may be misleading.
4. It is not easy to stick to the sample because of non-response and inaccessibility. In
such cases the results may be biased.
5. Sometimes the universe is small and heterogeneous. In such cases it is not possible to
draw a representative sample.
19. Types of sampling techniques
Probability Sampling
Methods are:
1. Simple Random Sampling
2. Systematic Sampling
3. Cluster sampling
4. Stratified random Sampling
Non probability sampling methods
are:
1. Convenience Sampling
2. Purposive/ Judgment Sampling
3. Quota Sampling
4. Snowball sampling
21. 1. Simple Random Sampling
• A simple random sample is one where the method of selection assures that each
individual element in the universe has an equal chance of being chosen.
• This method is most suitable when the universe is homogeneous and large.
• If the universe is heterogeneous, this method is not suitable.
22. Simple random sample has the following features:
1. Selection of a unit under this method has absolutely no connection with the
others. Items in the sample are independent of each other.
2. Each item in the universe has an equal chance of representation.
3. It is not a haphazard selection. Deliberate methods are used to ensure
chance selection.
4. Selection of sample units in this method is free from bias.
23. Certain principles are to be followed
in the selection of a simple random sample. They
are as follows:
1. The universe should consist of a large number of small units.
2. There should be a ready list of the universe.
3. Methods of selection should be independent.
4. The sample units should be accessible for investigation.
5. Once selected, a unit should not be discarded.
6. All the units must be clearly defined.
7. The units should be equal in size.
8. All the elements should be independent of each other.
24. While drawing a simple random
sample, certain precautions are to be taken.
1. The population to be sampled is to be clearly defined.
2. The units in the sample should be of equal size.
3. They should be independent of each other.
4. They should be easily accessible.
25. Advantages of simple random sampling:
1. It is simple.
2. It is more representative.
3. It is free from bias.
4. The sampling error can be easily assessed.
26. Disadvantages of simple random sampling:
1. It is very difficult to catalogue the entire universe.
2. Cases selected may be widely dispersed, making it hard or impossible for the
investigators to contact them.
3. Random sampling is unsuitable if the units of the universe vary in size.
4. Sometimes this method proves to be expensive and time consuming.
27. Simple random samples are drawn using the
following methods.
1. Lottery method: the numbers or names of the various units of the
universe are written on chits, put in a bowl and mixed thoroughly. Then the
needed number of chits is drawn.
2. Selection from sequential units: the units are arranged in some order
(serial, alphabetical or geographical). Out of this list every 5th or 10th unit, or a unit at
any other interval, may be drawn.
3. Grid system: a map of the entire area is prepared and a
screen with squares is placed upon the map. Some squares are selected at random,
and the areas falling in the selected squares are
taken as samples.
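The lottery method can be imitated in software. This is a minimal sketch, assuming the universe is available as a Python list; the function name `lottery_sample` and the unit labels are hypothetical.

```python
import random

def lottery_sample(units, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every unit has the same chance of selection, like chits
    mixed in a bowl. `seed` only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(units, sample_size)

# Hypothetical universe of 100 listed units.
universe = [f"unit_{i}" for i in range(1, 101)]
sample = lottery_sample(universe, 10, seed=42)
print(sample)
```

Note that the draw needs a ready list of the universe, which is the practical obstacle listed among the disadvantages.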
28. 2. Systematic Sampling
• A systematic sample is drawn by selecting every n-th item from the
population, where n refers to the sampling interval. The sampling interval is
determined by dividing the size of the population by the size of the sample
to be selected.
• For example, if you want to draw a sample of 420 bills from 42,000 bills, the
sampling interval would be 42,000 / 420 = 100. Hence every 100th bill will have to be
selected. We can make a random start anywhere between the first and the
100th bill.
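The bill example can be worked through in code. This is a sketch under the assumption that the population size is an exact multiple of the sample size; `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every k-th item after a random start within the first interval."""
    interval = len(population) // sample_size   # sampling interval k
    rng = random.Random(seed)
    start = rng.randrange(interval)             # 0-based random start
    return [population[start + i * interval] for i in range(sample_size)]

# 42,000 bills numbered 1..42000, sample of 420, so the interval is 100.
bills = list(range(1, 42001))
sample = systematic_sample(bills, 420, seed=1)
print(sample[:5])
```

Only the starting point is random. Once it is fixed, every other selected bill is determined by the interval.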
29. Merits of
systematic random sampling
1. It is simple to follow.
2. The sample is distributed evenly over the
population.
Demerits of
systematic random sampling
1. This method is not really random, as the items
are predetermined by the constant
interval once the start is chosen.
2. It leaves scope for bias, for example when the list has a
periodic pattern that matches the interval.
30. 3. Stratified random sampling
• Stratified random sampling is a combination of random sampling and
purposive sampling.
• Under this system the universe is first divided into a number of strata or
groups based on a single criterion.
• Then a number of items is selected randomly from each group.
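The two steps (divide into strata, then draw randomly within each) can be sketched as follows. The student strata and the 10% sampling fraction are made-up illustration values, and proportional allocation is one possible choice rather than the only one.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:                      # step 1: divide the universe into strata
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):             # step 2: random selection within each stratum
        members = strata[name]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical universe: 600 UG, 300 PG and 100 PhD students.
students = ([("UG", i) for i in range(600)]
            + [("PG", i) for i in range(300)]
            + [("PhD", i) for i in range(100)])
sample = stratified_sample(students, lambda s: s[0], 0.10, seed=7)
counts = Counter(level for level, _ in sample)
print(counts)
```

Because each stratum contributes in proportion to its size, no group is left out of the sample.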
31. While constructing strata, the following points
are to be kept in mind
1. The different variables involved in the study of the problem should be classified into
different groups.
2. The size of each stratum in the universe should be large enough to allow selection
of items on a random basis.
3. There should be perfect homogeneity among the units within each stratum.
4. The number of items selected from each stratum should be in proportion to that
stratum's share of the whole universe.
5. Strata should be clear cut and free from overlapping.
32. Advantages and disadvantages
• Advantages
1. The sample is fully representative and no
essential group is omitted.
2. With proper stratification a representative
character can be achieved with fewer items.
If a stratum is perfectly homogeneous,
selection of even a few items from it is
enough.
3. If the original case is not accessible
for study, replacement of cases can be resorted
to easily. If a person refuses to cooperate
with the survey, he or she may be replaced by
another person from the same stratum.
• Disadvantages
1. When the stratification is improper there is
much scope for bias. If the strata
are overlapping, unsuitable for the problem
under study, or disproportionate, the selected
sample may not be representative.
2. A sample, in order to be representative, must be
proportionate. Proportion is attained in random
sampling automatically; in stratified sampling a
deliberate attempt has to be made in this
respect. Attaining proportion is very difficult
through deliberate means, especially when the sizes
of different strata are extremely unequal.
3. Disproportionate stratification requires
weighting, which again introduces selective factors into the
sample. A sample may become unrepresentative
when such weighting is applied.
33. • Advantages (continued)
4. Through stratification the sample can be so
selected that most of the units are geographically
localised. In a purely random sample there is no
such control, and the cases actually selected
may be very widely dispersed. Concentration of
units saves time and cost of survey.
• Disadvantages (continued)
4. There are also difficulties in placing a particular
case in a stratum. If the strata are not very clear
cut, it may be difficult to decide in which stratum
a particular case needs to be placed.
34. 4. Cluster sampling
• Cluster sampling refers to the method of dividing the population into groups called clusters and drawing a sample of
clusters to represent the population.
• The primary sampling units or elementary sampling units constitute the cluster.
• In a selected cluster, either all the units are selected or a few of them are chosen by any sampling method.
• For instance, if a study has to be made on the industrial workers of a district, the industrial units are the primary sampling
units. If the primary sampling units clustered in one particular locality are selected, they form a first-stage cluster; and
when workers employed in one or two of those firms are selected, they form the second-stage cluster. A plan to select
clusters within clusters is called multistage cluster sampling.
• Thus, one significant factor differentiates cluster sampling from other methods of sampling: in the other
types of sample each element in the population is separately selected, whereas in cluster sampling groups are selected.
A cluster may be anything: a school, a municipal ward, an industry or a cooperative society.
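A rough sketch of one-stage and two-stage cluster selection, using the industrial-workers setting from the slide. The factory names, worker labels and the `cluster_sample` helper are all hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Select whole clusters at random, then all or some units inside each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # first stage: pick clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(members):
            sample[name] = list(members)                # take every unit in the cluster
        else:
            sample[name] = rng.sample(members, units_per_cluster)  # second stage
    return sample

# Hypothetical district with six factories of 50 workers each.
factories = {f"factory_{c}": [f"worker_{c}_{i}" for i in range(50)] for c in "ABCDEF"}

one_stage = cluster_sample(factories, 2, seed=3)
two_stage = cluster_sample(factories, 2, units_per_cluster=10, seed=3)
print(sorted(one_stage), [len(v) for v in two_stage.values()])
```

The selection acts on groups first, so workers in factories that are not chosen have no chance of entering the sample at the later stage.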
35. Advantages and disadvantages
• Advantages
1. It provides a significant cost gain.
2. It is an easy and more practical method which
facilitates the fieldwork.
3. More units can be included because of the
geographical contiguity of the sample units.
• Disadvantages
1. The probability basis and representativeness of the
sample are sometimes affected.
2. The results are likely to be less precise and
accurate.
38. 1. Purposive/ judgemental sampling
• Purposive sampling: when certain units in the universe are purposively
selected, it is called purposive sampling.
• The units are selected because the researcher judges them to be representative of the universe.
• The researcher, by exercising good judgement and strategy, should pick the
cases to be included in the sample.
39. Advantages and disadvantages
• Advantages
1. It is economical and quick.
2. It is a practical method, particularly in fields
where randomisation is not possible.
• Disadvantages
1. It requires prior knowledge of the
population.
2. In this method it is very difficult to estimate
the sampling error.
40. 2. Quota sampling
• Quota sampling is a non-probability sampling method in which the population is classified into a number of groups based on
some criterion, say the age of the members of the population, viz. old age, middle age and young age.
• Let the proportion of persons in the population under the old age category be 20%, and that of the middle age and young age categories be
50% and 30% respectively.
• In quota sampling, the proportions of sampling units selected from these categories are the same as in the population.
• If n is the sample size, then the numbers of sampling units to be selected from the old age, middle age and young age
categories will be 20%, 50% and 30% of the sample size respectively (i.e. 0.2n, 0.5n and 0.3n respectively).
• Later, while selecting the required number of sampling units from each category, one can use any of the other non-probability sampling
methods, viz. convenience sampling or judgement sampling.
• Though this method comes under non-probability sampling, proportionate selection of sampling
units from the different strata of the population is made in the first place, which makes the sample a representative sub-population of the main population.
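The quota arithmetic (0.2n, 0.5n, 0.3n) and the later convenience-based filling can be sketched as below. The sample size of 200, the arrival list and both function names are assumptions made for illustration.

```python
def quota_sizes(sample_size, proportions):
    """Turn population proportions into whole-number quotas that sum to n."""
    raw = {group: sample_size * p for group, p in proportions.items()}
    quotas = {group: int(value) for group, value in raw.items()}
    # Hand any units lost to rounding down to the largest fractional remainders.
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(raw, key=lambda g: raw[g] - quotas[g], reverse=True)
    for group in by_remainder[:leftover]:
        quotas[group] += 1
    return quotas

def fill_quotas(arrivals, quotas):
    """Accept respondents in the order met (convenience) until each quota is full."""
    remaining = dict(quotas)
    chosen = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            chosen.append(person)
            remaining[group] -= 1
    return chosen

quotas = quota_sizes(200, {"old": 0.20, "middle": 0.50, "young": 0.30})
print(quotas)

# Tiny hypothetical stream of respondents with a quota of one per group.
arrivals = [("p1", "young"), ("p2", "old"), ("p3", "old"), ("p4", "middle")]
picked = fill_quotas(arrivals, {"old": 1, "middle": 1, "young": 1})
print(picked)
```

The proportions are fixed in advance, but who fills each quota is left to the interviewer, which is why the method remains non-probability sampling.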
41. ADVANTAGES OF QUOTA SAMPLING
• It is a practical as well as a convenient sampling method
• Economic in nature
• It is useful when no sampling frame is available
42. 3. Snowball sampling
• Snowball sampling is a restrictive multistage sampling method in which initially a certain number
of sampling units are randomly selected.
• Later, additional sampling units are selected based on a referral process.
• This means that the initially selected respondents provide the addresses of additional
respondents to the interviewers.
• Initial respondents may be randomly selected, for example, from the information
contained in telephone directories.
• Later, additional respondents can be included in the sample based on the references made
by those initial respondents.
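The referral process can be sketched as a walk over a "who refers whom" table. The table, the respondent labels and the target size below are invented for illustration; the initial respondents are passed in directly so the run is repeatable, although they could equally be drawn at random from a directory.

```python
from collections import deque

def snowball_sample(referrals, initial_respondents, target_size):
    """Grow a sample by following referrals from the initial respondents."""
    sample = []
    seen = set()
    queue = deque(initial_respondents)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        if person in seen:
            continue                               # already interviewed
        seen.add(person)
        sample.append(person)
        queue.extend(referrals.get(person, []))    # addresses this respondent supplies
    return sample

# Hypothetical referral table: each respondent names further respondents.
referrals = {
    "A": ["C", "D"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": ["A"],
    "F": [],
}
sample = snowball_sample(referrals, ["A", "B"], target_size=5)
print(sample)
```

No sampling frame is needed beyond the initial respondents, which is the situation this method is meant for.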
43. Advantage of snowball sampling
• It is an inexpensive and convenient non-probability sampling method which suits
situations where developing a sampling frame is a difficult and
time-consuming task.
44. 4. Convenience sampling
• This is the non-probability sampling method in which the interviewers
choose sampling units based on their own convenience.
In most situations, the following may be true:
1. The sampling units may be distributed sparsely.
2. Many respondents refuse to fill in the questionnaire.
3. Some respondents do not cooperate in filling in the questionnaire.
4. Some of the interviewers may not be serious about selecting the sampling units as
per the assumed sampling plan.
45. • Though probability sampling gives better accuracy in terms of the
confidence level of the inferences of the study, there are many practical
difficulties in fully executing probability sampling because of the limitations
stated earlier.
• So, naturally, interviewers resort to convenience sampling to
overcome such difficulties. The sampling units for this type of sampling are
selected from a telephone directory, a newspaper subscribers list,
departmental stores, etc.
46. Advantages of convenience sampling method
Convenience sampling is useful when:
• 1. the universe is not clearly defined
• 2. the sampling unit is not clear
• 3. a complete source list is not available.
49. Before we proceed further it will be worthwhile
to understand the following two terms:
• (a) Measurement, and
• (b) Scaling
• Measurement: Measurement is the process of observing and recording the observations that are collected as part of
research.
• Observations are recorded by assigning numbers or other symbols to characteristics of objects according to
certain prescribed rules.
• The respondents’ characteristics are feelings, attitudes, opinions, etc.
• The most important aspect of measurement is the specification of rules for assigning numbers to characteristics.
• The rules for assigning numbers should be standardized and applied uniformly. They must not change over time or objects.
50. • Scaling: Scaling is the assignment of objects to numbers or semantics according to a rule.
• In scaling, the objects are text statements, usually statements of attitude, opinion, or
feeling.
• When a researcher is interested in measuring the attitudes, feelings or opinions of
respondents he/she should be clear about the following:
• a) What is to be measured?
• b) Who is to be measured?
• c) The choices available in data collection techniques
51. • The level of measurement refers to the relationship among the values that
are assigned to the attributes, feelings or opinions for a variable.
• Typically, there are four levels of measurement scales or methods of
assigning numbers: (a) Nominal scale, (b) Ordinal scale, (c) Interval scale,
and (d) Ratio scale.
52. Population and Sample.
• A population is a complete set of people with a specialized set of
characteristics, and a sample is a subset of the population.
• The usual criteria we use in defining a population are geographic,
for example, “the population of Sikkim”. ...The study sample is
the sample chosen from the study population.
• Another example: I want to measure the attitude of SRM students
towards COVID-19. Here, my population is the students of SRM University, Sikkim,
and the sample will be the students I have chosen to collect the data from.
54. Sampling theory- concepts
• Sampling theory is a study of relationships existing between a population
and samples drawn from the population. Sampling theory is applicable only
to random samples. For this purpose the population or a universe may be
defined as an aggregate of items possessing a common trait or traits.
55. Methods of sampling
• There are two types of sampling methods:
• Probability sampling: this covers the simple random sample, systematic
sample, stratified sample and cluster sample.
• Non-probability sampling: this covers the convenience sample,
purposive sample, snowball sample, quota sample and voluntary response
sample.
• Note: these methods, apart from the voluntary response sample, have already been discussed.
56. Sample frame and error.
• Sampling frame: the sampling frame is the actual list of individuals that the sample will be
drawn from. Ideally, it should include the entire target population (and nobody who is not
part of that population).
• Example
• You are doing research on working conditions at Company X. Your population is all 1000
employees of the company. Your sampling frame is the company’s HR database, which lists
the names and contact details of every employee.
• Another example: I am doing research on the attitude of SRM University Sikkim students
towards COVID-19. Here, my sampling frame would be the university's list of enrolled students,
from which the students who take part in the research are drawn.
57. Sample size-
• Sample size- The number of individuals in your sample depends on the size
of the population, and on how precisely you want the results to represent
the population as a whole.
• You can use a sample size calculator to determine how big your sample
should be. In general, the larger the sample size, the more accurately and
confidently you can make inferences about the whole population
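What a sample size calculator does can be sketched in a few lines. The slides do not say which formula such a calculator uses; the version below assumes Cochran's formula for a proportion with a finite population correction, and the defaults (5% margin of error, 95% confidence, p = 0.5) are conventional assumptions rather than values from the slides.

```python
import math
from statistics import NormalDist

def sample_size(population_size, margin_of_error=0.05, confidence=0.95, proportion=0.5):
    """Approximate sample size for estimating a proportion (assumed formula)."""
    # z-value for the chosen two-sided confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's formula for a very large population
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction shrinks the requirement for small populations
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

print(sample_size(1000))                        # e.g. 1000 employees of a company
print(sample_size(1000, margin_of_error=0.03))  # tighter precision needs more people
print(sample_size(10 ** 9))                     # very large population
```

The required size rises quickly as the margin of error is tightened, but it levels off as the population grows, so very large populations do not need proportionally larger samples.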
58. Characteristics of a good sample
• (1) Goal-oriented: A sample design should be goal oriented. It should be
oriented to the research objectives and fitted to the survey conditions.
• (2) Accurate representative of the universe: A sample should be an
accurate representative of the universe from which it is taken. There are
different methods for selecting a sample. It will be truly representative only
when it represents all types of units or groups in the total population in fair
proportions. In brief, the sample should be selected carefully, as improper
sampling is a source of error in the survey.
59. • (3) Proportional: A sample should be proportional. It should be large
enough to represent the universe properly. The sample size should be
sufficiently large to provide statistical stability or reliability. The sample size
should give the accuracy required for the purpose of the particular study.
• (4) Random selection: A sample should be selected at random. This means
that any item in the group has a full and equal chance of being selected and
included in the sample. This makes the selected sample truly representative
in character.
60. • (5) Economical: A sample should be economical. The objectives of the
survey should be achieved with minimum cost and effort.
• (6) Practical: A sample design should be practical. The sample design should
be simple, i.e. it should be capable of being understood and followed in the
fieldwork.
• (7) Actual information provider: A sample should be designed so as to
provide the actual information required for the study and also provide an
adequate basis for the measurement of its own reliability.
61. Parametric and non-parametric statistics.
• Parametric statistics was first mentioned by R.A. Fisher in his work
“Statistical Methods for Research Workers” in 1925, which created
the foundation for modern statistics.
• In the literal meaning of the terms, a parametric statistical test is one that
makes assumptions about the parameters (defining properties) of the
population distribution(s) from which one's data are drawn, while a non-
parametric test is one that makes no such assumptions.
62. Parametric and Non-Parametric.
• Parametric statistics is a branch of statistics which assumes that sample
data come from a population that can be adequately modeled by
a probability distribution that has a fixed set of parameters.
• Conversely a non-parametric model differs precisely in that the parameter
set (or feature set in machine learning) is not fixed and can increase, or even
decrease, if new relevant information is collected.
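The contrast can be seen on a small estimation problem: what share of the population scores above a threshold? The scores and the threshold below are made up, and the choice of a normal distribution for the parametric route is an assumption of this sketch.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of ten test scores.
scores = [52, 61, 58, 47, 66, 55, 59, 63, 50, 57]
threshold = 60

# Parametric route: assume the population is normal, so the whole
# distribution is described by two parameters (mean and standard deviation).
model = NormalDist(mean(scores), stdev(scores))
p_parametric = 1 - model.cdf(threshold)

# Non-parametric route: make no assumption about the distribution
# and simply use the proportion observed in the sample.
p_nonparametric = sum(x > threshold for x in scores) / len(scores)

print(round(p_parametric, 3), p_nonparametric)
```

The parametric answer depends on the normality assumption being reasonable, while the non-parametric answer depends only on the observed data.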
69. Sampling Error
• Sampling error is the deviation of the selected sample from the true
characteristics, traits, behaviours, qualities or figures of the entire
population.
• Survey error is of 2 types: sampling error and non-sampling error.
70. Sampling error
• Sampling error is caused by the act of taking a sample.
• It causes sample results to differ from the results of a census.
• It consists of differences between the sample and the population that exist only because
of the observations that happened to be selected for the sample.
• Statistical errors are sampling errors.
• We have no direct control over them.
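A small simulation shows the sample result drifting away from the census result purely through chance selection. The population here is synthetic (10,000 values generated for the purpose), and the sample sizes of 25 and 400 are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(0)

# Synthetic population; in a real survey this would be the full universe.
population = [rng.gauss(50, 10) for _ in range(10_000)]
census_mean = mean(population)

def sampling_error(sample_size):
    """Difference between one sample mean and the census mean."""
    sample = rng.sample(population, sample_size)
    return mean(sample) - census_mean

def average_abs_error(sample_size, repeats=200):
    """Typical size of the sampling error over many repeated samples."""
    return mean([abs(sampling_error(sample_size)) for _ in range(repeats)])

small = average_abs_error(25)
large = average_abs_error(400)
print(round(small, 3), round(large, 3))

# A complete enumeration (census) leaves no sampling error.
print(abs(sampling_error(len(population))) < 1e-9)
```

The error for any single sample cannot be predicted, but its typical size shrinks as the sample grows.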
71. Non-sampling errors
• Non-sampling errors are errors which are not controlled by sample size.
• Non Response Error
• Response error
72. • Non-response error: a non-response error occurs when units selected as part of the sampling
procedure do not respond in whole or in part.
• Response error: a response or data error is any systematic bias that occurs during data
collection, analysis or interpretation. Sources include:
• Respondent error (e.g., lying, forgetting, etc.)
• Interviewer bias
• Recording errors
• Poorly designed questionnaires
• Measurement error
73. Respondent error occurs when:
• the respondent gives an incorrect answer, e.g. due to prestige or competence
implications, or due to the sensitivity or social undesirability of the question.
• the respondent misunderstands the requirements.
• the respondent lacks motivation to give an accurate answer.
• a “lazy” respondent gives an “average” answer.
• the question requires memory/recall.
• proxy respondents are used, i.e. answers are taken from someone other than the
respondent.
74. Interviewer bias error occurs when:
• Different interviewers administer a survey in different ways
• Differences occur in reactions of respondents to different interviewers, e.g.
to interviewers of their own sex or own ethnic group
• Interviewers are inadequately trained
• Inadequate attention is paid to the selection of interviewers
• The workload for the interviewer is too high
75. Measurement Error occurs when-
• The question is unclear, ambiguous or difficult to answer
• The list of possible answers suggested in the recording instrument is
incomplete
• Requested information assumes a framework unfamiliar to the respondent
• The definitions used by the survey are different from those used by the
respondent (e.g. how many part-time employees do you have?)