Complex sampling design & analysis

Complex sampling design & analysis: a revision
Assoc. Prof. Dr. Jamalludin Ab Rahman MD MPH
Department of Community Medicine
Kulliyyah of Medicine

Content
• Sampling method & sample size for survey
• What is a complex sampling method?
• Sampling weight
• Complex sampling analysis
6-7th April 2016 (C) Jamalludin Ab Rahman. All rights reserved.

About sampling
• It is rarely feasible to measure the ENTIRE population
• A good sample should represent the population
• Sampling error occurs when statistics ≠ parameters
• Sampling error is not sampling bias
• Sampling error is random; sampling bias is systematic (predictable in direction)
• Sampling design affects sampling error
• Standard error measures sampling error
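A minimal simulation sketch of the last two points, assuming a hypothetical population of 10,000 blood pressure readings (the population, the sample size of 100 and the 2,000 replicates are all illustrative choices, not from the slides). Each sample mean misses the population mean by a random amount, and the spread of those sample means is what the standard error describes.

```python
import random
import statistics

random.seed(2016)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 systolic blood pressure readings (mmHg)
population = [random.gauss(120, 15) for _ in range(10_000)]
parameter = statistics.mean(population)  # true population mean
sigma = statistics.pstdev(population)    # true population SD

n = 100             # size of each simple random sample
replicates = 2_000  # number of repeated samples

# Draw many simple random samples (without replacement), keep each mean
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(replicates)
]

# Sampling error: statistic minus parameter, random from sample to sample
errors = [m - parameter for m in sample_means]

# The spread of the sample means is what the standard error estimates
empirical_se = statistics.stdev(sample_means)
theoretical_se = sigma / n ** 0.5  # ignores the small finite population correction

print(f"mean sampling error: {statistics.mean(errors):+.3f}")
print(f"empirical SE: {empirical_se:.3f}  sigma/sqrt(n): {theoretical_se:.3f}")
```

The errors average out close to zero (random, not biased), while their spread matches sigma/sqrt(n).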

The aim of any sampling plan is to reduce sampling error and to avoid sampling bias.

Describe the sample
• Target population – the population about which inferences are made
• Study population – representative of the target population
• Sampling frame – list of sampling units
• Sampling unit – unit to be sampled
• Observation unit – unit to be observed/measured

Sampling method
• Random vs. non-random
• Random selection supports representativeness
• Simple vs. complex
• SRS = every unit has an equal chance of being selected, i.e. equal probability of selection
• Anything that is not SRS is complex sampling

[Figure: simple random sampling, systematic random sampling and stratified random sampling]
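The three selection methods named above can be sketched as follows. This is an illustrative sketch only: the frame of 100 unit IDs, the odd/even strata and the function names are hypothetical, and the systematic sampler assumes the frame size is at least n.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: every unit has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit (k = sampling interval)."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Independent SRS of n_per_stratum units inside every stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {s: rng.sample(units, n_per_stratum) for s, units in strata.items()}

rng = random.Random(7)       # seeded generator so the sketch is repeatable
frame = list(range(1, 101))  # hypothetical sampling frame of 100 unit IDs

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample(frame, lambda u: "odd" if u % 2 else "even", 5, rng)

print("SRS:", sorted(srs))
print("Systematic:", sys_sample)
print("Stratified:", {s: sorted(units) for s, units in strat.items()})
```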

Stratified versus cluster sampling
• Stratified for heterogeneous groups, e.g. male-female, age groups
• Cluster for homogeneous groups, e.g. schools, districts (clusters are rarely homogeneous; only in an ideal situation)

Cluster
• There are clusters not selected at all
• Large variance
Stratified
• All strata selected
• Smaller variance

Design Effect (deff)
• Design effect (deff) = variance estimate (complex) / variance estimate (SRS)
• How much the variance under the complex design differs from the variance of an SRS of the same size
• Different value for different variables
• Usually deff for a complex survey is > 1
• If deff = 1.5, the complex sample must be 50% larger than an SRS to reach the same precision (effective sample size = n/deff)
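A small worked sketch of the ratio above. The two variance estimates and the sample size of 1,000 are hypothetical numbers chosen for illustration; the effective sample size n/deff is the standard companion quantity.

```python
def design_effect(var_complex, var_srs):
    """deff = variance under the complex design / variance under SRS."""
    return var_complex / var_srs

def effective_sample_size(n, deff):
    """Size of an SRS that would give the same precision as the complex sample."""
    return n / deff

# Hypothetical variance estimates for one prevalence estimate
deff = design_effect(var_complex=0.00036, var_srs=0.00024)
n_eff = effective_sample_size(1000, deff)

print(f"deff = {deff:.2f}")                    # about 1.5
print(f"effective sample size = {n_eff:.0f}")  # about 667 out of 1,000
```

Because deff differs by variable, the same calculation would be repeated for each key estimate.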

Design Factor (deft)
• Design factor (deft) = sqrt(deff), the effect of the sampling design on the standard error
• If deft = 2, the SE is twice as large as it would be under SRS
• deff or deft is used as a guide to inflate the sample size at the planning stage (a priori) or to check whether the achieved sample size is adequate (post hoc)
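A short sketch of both uses of deft and deff. The SRS standard error of 0.010, the SRS sample size of 385 and the planning deff of 1.5 are hypothetical values, not figures from the slides.

```python
import math

def design_factor(deff):
    """deft = sqrt(deff): the multiplier applied to the SRS standard error."""
    return math.sqrt(deff)

def inflate_sample_size(n_srs, deff):
    """A priori use: sample size needed under the complex design."""
    return math.ceil(n_srs * deff)

deft = design_factor(4.0)   # deff = 4 gives deft = 2
se_srs = 0.010              # hypothetical SE if the design were SRS
se_complex = deft * se_srs  # SE is twice as large under the complex design

n_required = inflate_sample_size(385, deff=1.5)  # 385 * 1.5 = 577.5, rounded up

print(f"deft = {deft}, SE under complex design = {se_complex:.3f}")
print(f"required sample size = {n_required}")
```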

Sampling Weight
• aka probability weight (PW)
• N/n (inverse of the sampling fraction)
• Two-stage = (N1/n1)*(N2/n2)
• The sum of the weights = population size
• Weighting can increase the standard error
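The last two points can be checked numerically. The sketch below uses Kish's approximation for the design effect due to unequal weighting, n * sum(w^2) / (sum(w))^2, which is a swapped-in technique and not something stated on the slide; both weight vectors are hypothetical.

```python
def kish_weighting_deff(weights):
    """Kish's approximate design effect from unequal weights alone."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

# Hypothetical sample of 1,000 with equal weights (N = 100,000)
equal = [100.0] * 1000

# Hypothetical sample of 1,000 where half the sample carries weight 300
unequal = [100.0] * 500 + [300.0] * 500

print("sum of equal weights:", sum(equal))              # the population size
print("deff, equal weights:", kish_weighting_deff(equal))
print("deff, unequal weights:", kish_weighting_deff(unequal))
```

Equal weights give a value of 1 (no loss), while the more the weights vary, the larger the factor and the larger the standard error.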

Sampling weight…
• Why? There is always imperfection in sampling
• Weighting tries to correct for:
1. Unequal probability of selection – base/design weight
2. Non-response bias
3. Differences between the sample and the true population structure, e.g. by sex, ethnicity etc. – post-stratification

Example
• N = 100,000 people
• Sample (n) = 1,000
• Therefore, SW = 100,000/1,000 = 100
• Every sampled person represents 100 people in that region

Example – two stage
Grade   Class         Students      SW1      SW2      SW
        N1    n1      N2    n2      N1/n1    N2/n2    SW1*SW2
1       5     3       150   30      1.7      5.0      8.3
2       6     3       180   30      2.0      6.0      12.0
3       6     3       175   30      2.0      5.8      11.7
4       7     3       185   30      2.3      6.2      14.4
5       4     3       170   30      1.3      5.7      7.6
* Non-proportionate distribution
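The table above can be reproduced with a few lines. This sketch simply applies SW = (N1/n1)*(N2/n2) to the figures shown on the slide; the function and variable names are illustrative.

```python
# grade: (N1 classes, n1 classes sampled, N2 students, n2 students sampled)
grades = {
    1: (5, 3, 150, 30),
    2: (6, 3, 180, 30),
    3: (6, 3, 175, 30),
    4: (7, 3, 185, 30),
    5: (4, 3, 170, 30),
}

def two_stage_weight(N1, n1, N2, n2):
    """Overall weight = first-stage weight * second-stage weight."""
    return (N1 / n1) * (N2 / n2)

# Round to one decimal place, as in the slide
weights = {g: round(two_stage_weight(*row), 1) for g, row in grades.items()}

for grade, sw in weights.items():
    print(f"Grade {grade}: SW = {sw}")
```

Note that the product is taken before rounding, so it can differ slightly from multiplying the rounded SW1 and SW2 columns.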

Example – stratified, one-stage
            Population size                 Sample size                     Sampling weight
            District 1      District 2      District 1      District 2      District 1      District 2
Age         Urban   Rural   Urban   Rural   Urban   Rural   Urban   Rural   Urban   Rural   Urban   Rural
Under 18    10000   13000   20000   15000   100     100     100     100     100     130     200     150
18-60       30000   25000   60000   45000   100     100     100     100     300     250     600     450
Above 60    5000    7000    5000    10000   100     100     100     100     50      70      50      100
Total       45000   45000   85000   70000   300     300     300     300
Among those under 18, 1 sample from District 1 urban represents 100 people
Among those under 18, 1 sample from District 2 urban represents 200 people
* Non-proportionate distribution
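A sketch of the same calculation in code, using the population and sample sizes from the table above. It also checks the earlier rule that the weights of all sampled units add up to the population size; the dictionary layout is an illustrative choice.

```python
# Population size per stratum cell: (district, location, age group)
population = {
    ("District 1", "Urban", "Under 18"): 10000,
    ("District 1", "Rural", "Under 18"): 13000,
    ("District 2", "Urban", "Under 18"): 20000,
    ("District 2", "Rural", "Under 18"): 15000,
    ("District 1", "Urban", "18-60"): 30000,
    ("District 1", "Rural", "18-60"): 25000,
    ("District 2", "Urban", "18-60"): 60000,
    ("District 2", "Rural", "18-60"): 45000,
    ("District 1", "Urban", "Above 60"): 5000,
    ("District 1", "Rural", "Above 60"): 7000,
    ("District 2", "Urban", "Above 60"): 5000,
    ("District 2", "Rural", "Above 60"): 10000,
}
n_per_cell = 100  # every cell has a sample of 100 in the table

# One-stage stratified weight: N/n within each cell
weights = {cell: N / n_per_cell for cell, N in population.items()}

# Each of the 100 respondents in a cell carries that cell's weight
sum_of_weights = sum(w * n_per_cell for w in weights.values())

print(weights[("District 2", "Urban", "18-60")])  # weight for one cell
print(sum_of_weights, sum(population.values()))   # both give the population size
```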

Complex sampling analysis
• Accommodate the sampling weight
• Adjust the standard error for the sampling design

Estimating standard error
• Linearization method (Taylor series) – approximates the estimator by a linear function to estimate its variance
• Replication method – draw repeated sub-samples, re-estimate on each and calculate the variance across the replicates – e.g. BRR (Balanced Repeated Replication), jackknife, bootstrapping
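Both approaches can be sketched for a weighted prevalence from a one-stage cluster sample. This is a minimal sketch under stated assumptions: the 12 respondents, 4 clusters and weights are hypothetical, there is no stratification, clusters are treated as sampled with replacement, and the jackknife shown is the delete-one-cluster variant. Survey software handles far more general designs.

```python
# (cluster id, sampling weight, y); y = 1 if the outcome is present
data = [
    ("A", 100, 1), ("A", 100, 0), ("A", 100, 1),
    ("B", 100, 0), ("B", 100, 0), ("B", 100, 1),
    ("C", 200, 1), ("C", 200, 1), ("C", 200, 1),
    ("D", 200, 0), ("D", 200, 1), ("D", 200, 0),
]

def weighted_mean(rows):
    """Weighted prevalence: sum(w*y) / sum(w)."""
    return sum(w * y for _, w, y in rows) / sum(w for _, w, _ in rows)

def jackknife_variance(rows):
    """Replication: drop one cluster at a time and re-estimate."""
    clusters = sorted({c for c, _, _ in rows})
    g = len(clusters)
    full = weighted_mean(rows)
    reps = [weighted_mean([r for r in rows if r[0] != c]) for c in clusters]
    return (g - 1) / g * sum((rep - full) ** 2 for rep in reps)

def linearization_variance(rows):
    """Taylor linearization of the ratio sum(w*y)/sum(w), by cluster totals."""
    clusters = sorted({c for c, _, _ in rows})
    g = len(clusters)
    full = weighted_mean(rows)
    total_w = sum(w for _, w, _ in rows)
    # Linearized cluster totals; they sum to zero by construction
    z = [sum(w * (y - full) for c, w, y in rows if c == k) / total_w
         for k in clusters]
    return g / (g - 1) * sum(v * v for v in z)

estimate = weighted_mean(data)
var_jk = jackknife_variance(data)
var_lin = linearization_variance(data)

print(f"weighted prevalence = {estimate:.4f}")
print(f"SE (jackknife)      = {var_jk ** 0.5:.4f}")
print(f"SE (linearization)  = {var_lin ** 0.5:.4f}")
```

With so few clusters the two standard errors differ noticeably; they tend to agree more closely as the number of clusters grows.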

Practical Session

Practical
• Sampling distribution
• Calculating sampling weight
• Preparing data for analysis
• Complex sample analysis (using SPSS)

Sampling distribution
• Using the 2016 adult household population by location (urban/rural) in Malaysia, prepare a sampling distribution that is representative down to the national urban/rural level, given a calculated sample size of 10,000 respondents
• Take 12 living quarters (LQ) per enumeration block (EB) and 2 adults per LQ
• Proportionate to size

Population Size by census ('000)*
No.  State              Urban    Rural    Total
1    Johor              1,682    537      2,219
2    Kedah              905      433      1,338
3    Kelantan           508      543      1,050
4    Melaka             537      47       584
5    Negeri Sembilan    492      198      690
6    Pahang             564      427      991
7    Perak              1,260    394      1,653
8    Perlis             102      66       167
9    Pulau Pinang       1,069    69       1,138
10   Sabah              1,064    597      1,661
11   Sarawak            1,009    694      1,703
12   Selangor           3,583    274      3,857
13   Terengganu         450      250      700
14   WP Kuala Lumpur    1,133    -        1,133
15   WP Labuan          50       6        57
16   WP Putrajaya       46       -        46
     Total              14,454   4,533    18,987
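One possible way to start the exercise, sketched under assumptions: the 10,000 respondents are split between urban and rural in proportion to the census totals above, each EB yields 12 LQ x 2 adults = 24 respondents, and the number of EBs is rounded up. These rounding choices and the function name are illustrative, not a prescribed answer.

```python
import math

# National totals ('000) from the last row of the census table
census = {"Urban": 14_454, "Rural": 4_533}

total_n = 10_000             # calculated sample size (respondents)
respondents_per_eb = 12 * 2  # 12 LQ per EB, 2 adults per LQ

def allocate_proportionally(total_n, sizes):
    """Split total_n across groups in proportion to their population size."""
    total = sum(sizes.values())
    # Simple rounding; in general the parts may not add back to total_n exactly
    return {k: round(total_n * v / total) for k, v in sizes.items()}

respondents = allocate_proportionally(total_n, census)
ebs = {k: math.ceil(n / respondents_per_eb) for k, n in respondents.items()}

for location in census:
    print(f"{location}: {respondents[location]} respondents, {ebs[location]} EBs")
```

The same function could be applied state by state within each location to complete the distribution.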

Calculating sampling weight
                    PSU (Kindergarten)                        SSU (Children)
                    Urban               Rural                 Urban                 Rural
State               Total*    Visited   Total*    Visited     Total*     Examined   Total*     Examined
FT Kuala Lumpur     471       34        -         -           10,940     687        -          -
Perlis              65        5         222       7           1,007      97         2,557      113
Kedah               164       19        757       69          1,913      203        9,154      846
Penang              297       21        316       24          4,845      402        4,496      366
Perak               356       19        1,040     55          6,382      412        12,627     819
Selangor            1,051     93        607       55          22,951     2,204      7,994      815
Negeri Sembilan     206       15        420       30          2,924      253        4,850      373
Melaka              131       8         384       22          1,941      125        5,111      316
Johor               586       42        1,121     80          9,389      779        13,594     1,163
Pahang              235       13        873       45          4,188      224        12,092     642
Terengganu          400       21        813       35          6,979      336        9,308      427
Kelantan            144       9         1,042     58          2,924      178        14,882     934
FT Putrajaya        71        4         -         -           2,170      127        -          -
Sabah               395       32        1,230     101         10,330     998        13,837     1,006
Sarawak             590       30        1,493     67          13,395     644        14,936     725
FT Labuan           74        8         -         -           1,400      135        -          -
Total               5,236     373       10,318    648         103,678    7,804      125,438    8,545
Total* = total population *

Preparing data for analysis
• Merge SW into the dataset
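The merge step can be sketched without any statistical package: look up each respondent's stratum and attach the matching weight. The respondent records, field names and the `attach_weight` helper are hypothetical; the weights are taken from the stratified one-stage example earlier in the slides.

```python
# Sampling weights keyed by stratum cell (district, location, age group)
weights = {
    ("District 1", "Urban", "Under 18"): 100.0,
    ("District 1", "Rural", "Under 18"): 130.0,
    ("District 2", "Urban", "Under 18"): 200.0,
    ("District 2", "Rural", "Under 18"): 150.0,
}

# Hypothetical respondent-level dataset
respondents = [
    {"id": 1, "district": "District 1", "location": "Urban",
     "age_group": "Under 18", "outcome": 1},
    {"id": 2, "district": "District 2", "location": "Urban",
     "age_group": "Under 18", "outcome": 0},
    {"id": 3, "district": "District 1", "location": "Rural",
     "age_group": "Under 18", "outcome": 0},
]

def attach_weight(rows, weights):
    """Return copies of rows with an 'sw' field; fail loudly on unmatched strata."""
    merged = []
    for row in rows:
        key = (row["district"], row["location"], row["age_group"])
        if key not in weights:
            raise KeyError(f"no sampling weight for stratum {key}")
        merged.append({**row, "sw": weights[key]})
    return merged

dataset = attach_weight(respondents, weights)
for row in dataset:
    print(row["id"], row["sw"])
```

Failing on an unmatched stratum, rather than silently leaving the weight blank, makes coding mismatches visible before analysis.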

Complex sample analysis
• Preparing the CS plan
• Analysis using SPSS
