This document discusses biostatistics and data collection techniques. It defines primary and secondary data. Methods for collecting primary data include direct interviews, indirect oral interviews, using correspondents, mailed questionnaires, and schedules sent through enumerators. Secondary data is collected from published sources like journals and government reports or unpublished sources like research studies. Data can be collected via census or sampling. Sampling methods include probability sampling (simple random, stratified, systematic, and cluster sampling) and non-probability sampling (judgement, quota, and convenience sampling). The document outlines the advantages and disadvantages of sampling.
1. Biostatistics: Collection of Data and Sampling Techniques
Dr. Saji Mariam George
Associate Professor (Retired)
Assumption College Autonomous
Changanacherry
2. Biostatistics: Collection of Data and Sampling Techniques
• Data
Data is a collection of facts from which conclusions may be
drawn (Statistical data). The term data means groups of
information that represent the qualitative or quantitative
attributes of a variable or set of variables.
Based on sources, data can be classified into primary and
secondary.
I. Primary Data
The original data collected by an investigator for the first
time to study a particular problem.
3. Methods of Collection of Primary Data
1. Direct Personal Observations (Interviews)
The data is collected by the investigator by personal
observation (interview) of the objects under study. This
method is suitable when the field of study is small and
greater accuracy is needed.
2. Indirect Oral Interview
In this case, instead of directly approaching the informants,
the investigator interviews people who can provide the
necessary information about the problem under study. For
example, studies regarding addiction to drugs, alcohol etc.
In such cases, the informants may be reluctant to answer
the questions of the interviewer. The success of this
method depends on the wise choice of the persons to be
interviewed.
3. Information from Correspondents / Agencies
This method is useful in cases where regular information is to be
collected from a wide area. In this method, local agents or
correspondents collect information and transmit to the central office
where the data are processed.
4. Mailed Questionnaire Method
In this method, a questionnaire containing relevant questions related
to the problem under study is sent along with a covering letter
requesting the informants to fill up the questionnaire and send it
back within a specified time. This method is suitable for extensive
surveys, where informants are spread over a wide area.
5. Schedules sent through Enumerators
In this method, a number of trained enumerators personally meet the
informants with standardized questionnaires. Each enumerator explains
the purpose of the enquiry and records the replies to the questions in
the schedule. This is the most widely used method of collection of
primary data.
II. Secondary Data
Secondary data is not originally collected by the investigator. It was
originally collected by some other person for a particular study. The
investigator obtains secondary data either from published or
unpublished sources.
a) Published Sources
i) National and international research journals and other publications.
ii) Official publications of Central and State governments.
iii) Semi-Official publications of local bodies such as Municipal
corporation.
iv) Reports of various committees and commissions appointed by the
government.
v) Newspapers etc.
b) Unpublished Sources
i) Studies made by Research institutes, research scholars etc.
ii) Records maintained by various government and private offices.
6. Collection of Data
Data may be collected by two methods:
1. Census Method
In Census or Complete Enumeration Survey Method, data
are collected for each and every unit of the population or
universe. For example, Population Census. The census
method is not commonly used in practice because it needs
much time, effort and money.
2. Sampling Method
In Sampling, instead of every unit of the population, only a
part of the population is studied and conclusions are drawn
on that basis for the entire population. In other words,
Sampling is a method that helps to know the
characteristics of the population by examining only a small
part of it.
7. Methods of Sampling
There are two major sampling methods viz. Probability Sampling
(Random Sampling) and Non-probability Sampling (Non-random
Sampling).
1. Probability Sampling (Random Sampling)
In this method, every item in the population has a known
chance or probability of being selected in the sample. That is,
the items will be chosen strictly at random. There are two types
of Random Sampling Methods: Simple or Unrestricted Random
Sampling and Restricted Random Sampling.
a) Simple or Unrestricted Random Sampling
In this method, each and every unit of the population has an
equal chance of being selected in the sample. The randomness
of the sample selection can be ensured either by the use of
Lottery Method or the Table of Random Numbers.
i) Lottery Method
This is a very popular method of selecting a random sample,
when the population is relatively small. In this method, all
items of the population are numbered or named on separate
slips of paper of identical size and shape. These slips are then
folded and mixed well in a container. Then the required
number of slips are picked up at random. Thus the selection
of items depends entirely on chance.
ii) Table of Random Numbers
When the population is large, a Table of Random Numbers
can be used for Random Sampling. Several standard Tables
of Random Numbers are available. These numbers, prepared
by using certain randomizing machines and then arranged in
rows and columns, can be used to select single-digit,
two-digit, three-digit or four-digit numbers. The
randomness of the sample is ensured by the proper use of
the table.
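Both the lottery method and a table of random numbers give every unit the same chance of selection. A minimal sketch of that idea, assuming Python's standard pseudo-random generator stands in for the slips of paper or the printed table (the function name `simple_random_sample` and the class of 90 roll numbers are illustrative assumptions):

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every unit has an equal chance of selection."""
    rng = random.Random(seed)  # seed is optional and only makes the draw repeatable
    return rng.sample(list(population), n)  # sampling without replacement

roll_numbers = range(1, 91)  # hypothetical population: roll numbers 1 to 90
sample = simple_random_sample(roll_numbers, 10, seed=1)
print(sorted(sample))
```

Drawing without replacement mirrors the lottery method, where a slip that has been picked is not returned to the container.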
b) Restricted Random Sampling
There are three types of Restricted Random Sampling viz.
Stratified Sampling, Systematic Sampling and Cluster
Sampling.
i) Stratified Sampling
This method is used when the population is heterogeneous
with respect to the variable or characteristic under
investigation. In this method, the heterogeneous population
is first sub-divided into several homogeneous groups or
classes called strata. Then from each stratum or sub-
population, a small sample called sub-sample is selected at
random. All the sub-samples are then combined together to
form the stratified sample.
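The procedure can be sketched as follows. This is only an illustration: the function `stratified_sample`, the toy data and the choice to take the same fraction from every stratum (proportional allocation) are assumptions, since the method itself only requires a random sub-sample from each stratum.

```python
import random

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Split units into strata, draw a random sub-sample from each, and combine them."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    combined = []
    for members in strata.values():
        # Assumption: proportional allocation, with at least one unit per stratum
        size = max(1, round(fraction * len(members)))
        combined.extend(rng.sample(members, size))
    return combined

# Hypothetical population of 60 people: 20 children and 40 adults
people = [(i, "child" if i % 3 == 0 else "adult") for i in range(1, 61)]
chosen = stratified_sample(people, stratum_of=lambda p: p[1], fraction=0.1, seed=7)
print(chosen)
```

With a fraction of 0.1, the sketch draws 2 children and 4 adults, so each stratum is represented in proportion to its size.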
ii) Systematic Sampling (Quasi-random Sampling)
A Systematic sample is formed by selecting one unit at
random and then selecting additional units at evenly spaced
intervals. This method is popularly used in those cases
where a complete list of the population from which sample
is to be drawn is available. The items are serially numbered.
The first item is selected at random generally by following
the lottery method. Subsequent items are selected by taking
every kth item from the list, where k is the sampling interval
(also called the sampling ratio), that is, the population size
divided by the required sample size.
Example:
There are 90 students in a class with roll numbers 1 to 90.
Use Systematic Sampling method to select a sample of 10
students.
Solution:
In this case, the sampling interval k is 90/10 = 9. So the first student
is to be selected at random from roll numbers 1 to k, that is, from 1 to 9.
Then we have to select every kth student. Suppose the first
student randomly selected is the one with roll number 2; the
sample would then consist of students with roll numbers 2, 11, 20,
29, 38, 47, 56, 65, 74 and 83.
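The same selection can be reproduced with a short sketch. The function name `systematic_sample` is an illustrative assumption, and the sketch assumes the population size is an exact multiple of the sample size, as in the example above.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick a random start between 1 and k, then take every kth unit."""
    k = population_size // sample_size  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start, 1 to k inclusive
    return [start + i * k for i in range(sample_size)]

result = systematic_sample(90, 10, start=2)
print(result)  # [2, 11, 20, 29, 38, 47, 56, 65, 74, 83]
```

Passing `start=2` fixes the random start so that the output matches the worked example; leaving it out lets the sketch choose the start at random.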
iii) Cluster Sampling (Multi-stage Sampling)
In this method, the population is divided into a number of groups
called clusters or primary sampling units. The units in each cluster
which are actually observed are the elementary sampling units. The
method includes single stage, double stage and multi-stage cluster
sampling. In single stage cluster sampling, one or more clusters are
randomly selected and all the elementary sampling units contained in
these clusters are observed. In double stage cluster sampling, a
sample is selected at random from the elementary units contained in
one or more randomly selected clusters. If the sample selection
passes through more than two stages of sampling, it is known as
multi-stage cluster sampling.
Example:
Select a sample of 4000 students from Kerala.
Solution:
In this case, we can select Universities at the first stage, then
colleges within the selected Universities at the second stage, and
finally students from the selected colleges at the third stage.
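A three-stage selection of this kind can be sketched as below. The sampling frame, the names in it and the numbers drawn at each stage are all hypothetical and much smaller than a real survey of 4000 students; the point is only the nesting of the three random draws.

```python
import random

def three_stage_sample(frame, n_universities, n_colleges, n_students, seed=None):
    """Randomly select universities, then colleges within them, then students."""
    rng = random.Random(seed)
    selected = []
    for university in rng.sample(sorted(frame), n_universities):       # stage 1
        colleges = frame[university]
        for college in rng.sample(sorted(colleges), n_colleges):       # stage 2
            selected.extend(rng.sample(colleges[college], n_students))  # stage 3
    return selected

# Hypothetical frame: 4 universities, 5 colleges each, 50 students per college
frame = {
    f"U{u}": {
        f"U{u}-C{c}": [f"U{u}-C{c}-S{s}" for s in range(1, 51)]
        for c in range(1, 6)
    }
    for u in range(1, 5)
}
students = three_stage_sample(frame, n_universities=2, n_colleges=3, n_students=10, seed=3)
print(len(students))  # 60
```

Only the selected colleges need a full list of students, which is the practical advantage of multi-stage selection over a simple random sample from the whole population.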
2. Non-Probability Sampling (Non-Random Sampling)
In this method, the sample is selected not on the basis of
probability but on considerations such as expert judgement or
convenience. Non-Random Sampling methods include Judgement
Sampling, Quota Sampling and Convenience Sampling.
a) Judgement or Purposive or Deliberate Sampling
In this method, the investigator selects a sample based on
his judgement. This method is popularly used by auditors to
check the accounts. There is no set procedure for the
selection of the sample. Hence one cannot draw any
inferences about the whole population.
b) Quota Sampling
In this method, the population is divided into quotas based on some
characteristics such as age, religion etc. Then each enumerator
interviews a certain number of persons in his quota. Within the quota,
the selection of the sample depends on personal judgement. The success
of this method depends on the efficiency and skill of the investigator.
This method is useful in public opinion studies and consumer
research.
c) Convenience Sampling (Chunk Sampling)
A chunk refers to the part of the population being investigated. It is
selected neither by probability nor by judgement but by convenience
of time, place, availability of resources etc. A sample obtained from
readily available lists such as telephone directories, automobile
registrations etc. is a Convenience sample and not a random sample
even if the sample is drawn at random from the lists. This method is
also known as Accidental, Accessibility or Haphazard Sampling. The
results of Convenience Sampling are not representative of the
population. However, this method is often used in preliminary studies.
17. Merits of Sampling
• Sampling is the only method that can be used when the
population under study is infinite.
• Sampling saves time and money, and it is possible to collect
more detailed information in a sample survey.
• Sampling is a scientific method. It is possible to obtain
accurate and reliable results if expert and trained persons
are employed for scientific processing and analysis of data.
• Sampling error, if present, can be estimated and controlled.
• The organization and administration of sample surveys are
easy.
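As an illustration of how sampling error can be estimated, the sketch below computes the standard error of a sample mean (the sample standard deviation divided by the square root of the sample size). The measurements are simulated and hypothetical, and the standard error is only one common way of expressing sampling error under simple random sampling; it is offered as an assumption, not as the method intended in the slides.

```python
import math
import random
import statistics

def mean_with_standard_error(sample):
    """Return the sample mean and its estimated standard error, s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, standard_error

rng = random.Random(0)
# Hypothetical population of 10,000 simulated measurements
population = [rng.gauss(120, 15) for _ in range(10_000)]

small = rng.sample(population, 25)
large = rng.sample(population, 400)
print(mean_with_standard_error(small))
print(mean_with_standard_error(large))
```

The larger sample gives a smaller standard error, which is one sense in which sampling error can be controlled by the choice of sample size.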
18. Demerits of Sampling
• Sampling is not suitable if one is interested in the
characteristics of individual constituents of the population.
• Sampling generally requires the services of experts and needs
careful planning and execution. Otherwise, the results will
be inaccurate and misleading.
• Selection of an appropriate method of sampling is also very
important. If the sample is not truly representative of the
population, the results will be inaccurate.
• If the size of the sample is not appropriate, it may not truly
represent the population and will not reflect the true
characteristics of the population.