2. Introduction: In statistics, a population is the entire
group of individuals in whom we are
interested.
Most research studies involve observing
a subset of some predefined population
of interest. This subset is
known as a sample.
3. Sampling techniques:
1) Non-probability sampling:
Here selection does not depend on any law of probability.
It is non-random sampling.
Examples are purposive sampling, convenience sampling, self-selection
sampling, snowball sampling and quota sampling.
4. A. Purposive sampling: Here the researcher deliberately selects
participants from whom the required information can be obtained easily.
B. Convenience sampling: Here participants are selected on the basis of
easy accessibility. For example, in a district, some primary schools
were selected because they were located beside a main road.
C. Self-selection sampling: Participants take part in the research on their
own initiative, as volunteers.
5. D. Snowball sampling: This type of sampling is applied when the
target population is hidden and/or hard to reach, such as
drug addicts or commercial sex workers. Participants who have already
been recruited help to identify further participants.
E. Quota sampling: Here researchers are given quotas to fill from
different strata of the population, keeping the proportions of the quotas the same as
those observed in the population. For example, if the Hindu and Muslim
populations of a village are 60% and 40% respectively, a researcher using
quota sampling can select participants of his or her own choice in
the same ratio of 6:4 (Hindu : Muslim).
6. 2) Probability sampling: This is superior to non-probability
sampling. It obeys the laws of probability and is based on the
concept of random selection; this type of sampling is also
known as random sampling.
A. Simple random sampling: This is the most common and simplest of the
sampling methods. This method is applicable when the
population is small. First, all the sampling units are assigned
numbers; then the sample is selected either by the random
number table method or by the lottery method.
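The lottery method can be sketched in a few lines of Python. This is only an illustration: the function name, the population of 200 units and the sample of 10 are assumptions, not values from the slides.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    if sample_size > population_size:
        raise ValueError("sample_size cannot exceed population_size")
    rng = random.Random(seed)
    # Every sampling unit is numbered 1..N; drawing without replacement
    # gives each unit the same chance of selection, like a lottery.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Hypothetical example: 10 units out of a numbered population of 200.
chosen = simple_random_sample(population_size=200, sample_size=10, seed=1)
print(chosen)
```

Fixing the seed only makes the draw reproducible; in practice it would normally be left unset.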
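7. B. Systematic random sampling: This is done for large, scattered
and heterogeneous populations when a complete list of the population is
available. As in simple random sampling, all the sampling units are first
assigned numbers. Then we calculate the sampling interval (population size
divided by the required sample size), choose a random starting unit within the
first interval, and select every unit at that interval thereafter.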
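A minimal sketch of systematic sampling, assuming the sampling interval is the population size divided by the sample size (integer division here) and that the first unit is picked at random within the first interval. The function name and the example sizes are hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Select every k-th numbered unit after a random start, k = N // n."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    rng = random.Random(seed)
    interval = population_size // sample_size   # sampling interval k
    start = rng.randint(1, interval)            # random start in 1..k
    return [start + i * interval for i in range(sample_size)]

# Hypothetical example: 50 units from a list of 1000, so the interval is 20.
units = systematic_sample(population_size=1000, sample_size=50, seed=7)
print(units[:5])
```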
C. Stratified random sampling: This type of sampling is used for
heterogeneous populations and when we want information about
the distribution of a particular variable. First, the entire heterogeneous
population is divided into small homogeneous groups called strata. Then,
from each stratum, the required number of study subjects is selected by
simple or systematic random sampling in proportion to the stratum's size.
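Proportional allocation followed by simple random sampling within each stratum might look like the sketch below. The stratum names and sizes are invented for illustration, and the rule for handing out leftover units after rounding (largest fractional share first) is an assumption, since the slides do not specify one.

```python
import random

def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    raw = {name: total_sample * size / total for name, size in stratum_sizes.items()}
    alloc = {name: int(share) for name, share in raw.items()}
    # Assumed rounding rule: leftover units go to the largest fractional shares.
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc

def stratified_sample(strata, total_sample, seed=None):
    """Simple random sample inside each stratum, sized proportionally."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    alloc = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(units, alloc[name]) for name, units in strata.items()}

# Hypothetical strata of 30, 20 and 50 numbered units.
strata = {
    "stratum_a": list(range(1, 31)),
    "stratum_b": list(range(31, 51)),
    "stratum_c": list(range(51, 101)),
}
sample = stratified_sample(strata, total_sample=10, seed=3)
print({name: len(units) for name, units in sample.items()})
# {'stratum_a': 3, 'stratum_b': 2, 'stratum_c': 5}
```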
8. D. Cluster sampling: Cluster sampling involves dividing the
population of interest into geographically distinct
groups or clusters, such as neighbourhoods or families.
A cluster is a randomly selected group; this method
is used when the units of the population form natural groups or
clusters such as blocks, wards, villages, schools etc.
E. Multistage sampling: This is carried out in several stages, as in
large country-wide surveys, e.g. an anemia survey or a hookworm
survey.
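As a rough illustration of cluster sampling, the sketch below picks whole clusters at random and takes every unit inside each chosen cluster. Including all units of a chosen cluster (one-stage cluster sampling) is an assumption, and the village names and household labels are made up.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters whole clusters; all their units are studied."""
    if n_clusters > len(clusters):
        raise ValueError("cannot choose more clusters than exist")
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical villages (clusters) with their households (units).
villages = {
    "village_1": ["h1", "h2", "h3"],
    "village_2": ["h4", "h5"],
    "village_3": ["h6", "h7", "h8", "h9"],
    "village_4": ["h10"],
}
selected = cluster_sample(villages, n_clusters=2, seed=5)
print(sorted(selected))
```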
9. F. Multiphase sampling: Part of the information is collected
from the whole sample and part from a sub-sample.
G. Sequential sampling: The ultimate sample size is not fixed in
advance, but is determined by decision rules on the basis of the information
yielded as the survey progresses.
H. Lot quality assurance sampling: This technique was
originally developed in the 1920s. It involves taking a small
independent random sample of a manufactured batch and
testing the sample items for quality.
10. DATA COLLECTION METHOD: Data are facts expressed in
numerical terms. When a data set undergoes
statistical processing, it becomes information. Intelligence,
derived by transforming the information, is
for decision makers or policy makers.
DATA → INFORMATION → INTELLIGENCE
11. Classification of data:
A. Continuous and discrete data:
I. Continuous data: Data for which an unlimited number
of possible values exist. An example of continuous data is an
individual's weight.
II. Discrete data: Data for which a limited number of
possible values exist, i.e. they are always expressed in whole
numbers. An example is the number of people participating in a
cricket match.
12. B. Qualitative and quantitative data:
I. Qualitative data: When a particular characteristic
can't be measured but can be expressed as a
frequency, the data are known as qualitative data. Eg
sex, religion etc.
II. Quantitative data: When both the characteristic
and the frequency of a variable can be measured, the data are
known as quantitative data. They are always numerical. Eg age.
13. C. Primary and secondary data:
I. Primary data: Data which are collected by the
researchers themselves are called primary data. Examples are
data collected from key informants, study subjects, focus
group discussions and experiments.
II. Secondary data: Data which have already been
collected by someone else and are used by other
researchers are called secondary data. Eg census
data, hospital data etc.
14. D. Grouped and ungrouped data:
I. Grouped data: These data are presented after being
organized into different groups or categories, e.g. the
weights of 8 men can be presented as 50-55 kg (2 men), 55-60
kg (3 men), 60-65 kg (2 men), 65-70 kg (1 man).
II. Ungrouped data: These are presented individually, rather
than in groups. They can be arranged in either ascending or
descending order. Eg the weights of 8 men are 52.5 kg, 53.5 kg,
56.4 kg, 57 kg, 58.5 kg, 61.4 kg, 63.5 kg and 68.5 kg.
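The step from ungrouped to grouped data can be reproduced with the eight weights above. The sketch assumes 5 kg class intervals starting at 50 kg and, as a convention not stated in the slides, places a value that falls exactly on a boundary in the upper class.

```python
from collections import Counter

# Ungrouped data: the eight weights (kg) from the example.
weights = [52.5, 53.5, 56.4, 57, 58.5, 61.4, 63.5, 68.5]

def group_values(values, start=50, width=5):
    """Count values per class interval [lower, lower + width)."""
    counts = Counter()
    for value in values:
        lower = start + int((value - start) // width) * width
        counts[(lower, lower + width)] += 1
    return dict(sorted(counts.items()))

grouped = group_values(weights)
for (low, high), n in grouped.items():
    print(f"{low}-{high} kg: {n}")
# 50-55 kg: 2
# 55-60 kg: 3
# 60-65 kg: 2
# 65-70 kg: 1
```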
15. E. Hard and soft data:
I. Hard data: Data that are usually displayed on a
continuous scale as a digital readout or a computer printout
taken from modern mechanical instruments are called hard
data.
II. Soft data: Any subjective measurement which has more
potential for bias or variability on the part of the observer is
known as soft data. For example, in evaluating the pain of a
cancer patient, his mood and his ability to work, the data
that are generated are soft and subjective.
16. Another authority classifies data into two broad categories:
A. Continuous data (quantitative data): Data that are
expressed in integers, fractions or decimals, in which equal
gaps exist between successive intervals, are known as
continuous data. Eg systolic or diastolic BP, pulse rate etc.
B. Discrete/categorical data (qualitative data): Data that are
expressed in either dichotomous or polychotomous categories
are known as qualitative data. They are always expressed in whole
numbers. Dichotomous: male/female, yes/no etc.
Polychotomous: Hindu/Muslim/Christian, no
pneumonia/pneumonia/severe pneumonia etc.
17. SAMPLE SIZE CALCULATION: Sample size calculation
provides the number of study subjects needed to carry out
the study.
The calculation depends on the type of
epidemiological study: different formulae are used for
descriptive, case-control and cohort studies and for
RCTs.
18. A. Sample size calculation for descriptive study
1) For qualitative data: $N = Z_\alpha^2 \, p \, q / L^2$
$Z_\alpha$ = standard normal deviate at the desired confidence
level (95% or 99%), p = previous prevalence (in %), q = 100 - p, L =
allowable error, taken as 5%, 10% or 20% of p.
At the 95% confidence level $Z_\alpha$ = 1.96, while at the 99%
confidence level $Z_\alpha$ = 2.58.
2) For quantitative data: $N = Z_\alpha^2 \, \sigma^2 / L^2$
$\sigma$ = standard deviation
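The two descriptive-study formulas can be checked numerically as below. The function names, the example prevalence of 20%, the standard deviation of 10 and the allowable errors are all hypothetical, and rounding the result up to the next whole subject is an assumption.

```python
import math

def sample_size_qualitative(p, relative_error, z=1.96):
    """N = Z^2 * p * q / L^2 with p in percent, q = 100 - p, L = relative_error * p."""
    q = 100 - p
    allowable_error = relative_error * p
    return math.ceil(z ** 2 * p * q / allowable_error ** 2)

def sample_size_quantitative(sd, allowable_error, z=1.96):
    """N = Z^2 * sd^2 / L^2, with L in the same units as the standard deviation."""
    return math.ceil(z ** 2 * sd ** 2 / allowable_error ** 2)

# Hypothetical inputs: prevalence 20%, allowable error 10% of p, 95% confidence.
print(sample_size_qualitative(p=20, relative_error=0.10))   # 1537
# Hypothetical inputs: standard deviation 10, allowable error 2, 95% confidence.
print(sample_size_quantitative(sd=10, allowable_error=2))   # 97
```

Raising the confidence level to 99% (z=2.58) increases both results, which is why the confidence level must be fixed before the calculation.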
19. B. Sample size calculation for analytical studies (case-control, cohort,
RCT)
Here the prerequisites are:
1) For a case-control study: the anticipated probability of exposure for
people with the disease and without the disease separately, the anticipated odds
ratio, the confidence level and the precision.
2) For a cohort study: the anticipated probability of disease in people
exposed to the factor of interest, the anticipated probability of disease in
people not exposed to the factor of interest, the anticipated relative risk,
the confidence level and the precision.
20. 3) For RCT: The prerequisites are
$Z_\alpha$ = Z value for alpha error (type 1 error), $Z_\beta$ = Z value for
beta error (type 2 error), the standard deviation, the proportion of events
and the mean difference to be detected. The formulas here apply to RCTs
with one experimental group and one control group,
considering either the alpha error only or both alpha and beta errors.
I. RCTs using Student's t test and considering the alpha error only:
$N = 2 \, Z_\alpha^2 \, S^2 / d^2$
II. RCTs using Student's t test and considering both alpha and
beta errors:
21. $N = 2 \, (Z_\alpha + Z_\beta)^2 \, S^2 / d^2$
$Z_\alpha$ = Z value for alpha error (at the 95% confidence level it is 1.96,
two-tailed); $Z_\beta$ = Z value for beta error (for 20% beta error, i.e. 80%
power, it is 0.84, one-tailed); S = standard deviation;
d = mean difference to be detected.
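Both RCT formulas can be evaluated with one small function, since setting the beta term to zero reduces the second formula to the first. The function name and the example values (standard deviation 10, mean difference 5) are hypothetical, rounding up is an assumption, and the slides do not say whether N refers to each group or to the whole trial.

```python
import math

def rct_sample_size(sd, mean_difference, z_alpha=1.96, z_beta=0.0):
    """N = 2 * (Z_alpha + Z_beta)^2 * S^2 / d^2; z_beta=0 gives the alpha-only formula."""
    if mean_difference == 0:
        raise ValueError("mean_difference must be non-zero")
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / mean_difference ** 2
    return math.ceil(n)

# Hypothetical inputs: S = 10, d = 5.
alpha_only = rct_sample_size(sd=10, mean_difference=5)
alpha_and_beta = rct_sample_size(sd=10, mean_difference=5, z_beta=0.84)
print(alpha_only)       # 31
print(alpha_and_beta)   # 63
```

Accounting for the beta error roughly doubles the requirement in this example, which shows how much the choice of power matters.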
22. VARIABLES: Characteristics that are measured in research are called
variables. They vary from person to person, place to place and object to
object.
Types of variables:
1. Simple variable and composite variable: A simple variable has only
one main component, e.g. weight, height, age etc. A composite variable,
also known as a multi-component variable, has more than one
component. This type of variable is derived from two or more
variables. Body mass index, which is the quotient of weight (in kg)
divided by the square of height (in meters), is an example of a
composite variable.
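As a small illustration of a composite variable, body mass index can be derived from the two simple variables weight and height. The function name and the example person (70 kg, 1.75 m) are hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Composite variable: weight (kg) divided by the square of height (m)."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2

# Hypothetical person: 70 kg and 1.75 m tall.
bmi = body_mass_index(70, 1.75)
print(round(bmi, 1))  # 22.9
```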
23. 2. Dependent variable and independent variable: If one variable
depends upon or is a consequence of other variable(s), it is termed
a dependent variable. Thus a dependent variable is an outcome of
interest. Examples are health status, use of health services and cost of
care. A variable that is presumed to influence the outcome is termed an
independent variable.
3. Latent variable: A variable which cannot be measured directly, but
is assumed to be related to a number of observable variables, is known
as a latent variable. Eg bright student, efficient worker etc.
4. Random variable: If the value of a variable can't be predicted in
advance, the variable is referred to as a random variable. Eg in
tossing a coin, the outcome may be either head or tail.
24. “prakṛteḥ kriyamāṇāni guṇaiḥ karmāṇi sarvaśaḥ
ahaṅkāravimūḍhātmā kartāhamiti manyate”
“In fact, all actions are being performed by the modes of
prakruti (primordial nature).
The fool, whose mind is deluded by egoism, thinks: ‘I am the
doer.’”
Thank you