Complex sampling design & analysis. A revision
Assoc. Prof. Dr. Jamalludin Ab Rahman MD MPH
Department of Community Medicine
Kulliyyah of Medicine
Content
• Sampling method & sample size for survey
• What is complex sampling method
• Sampling weight
• Complex sampling analysis
6-7th April 2016 (C) Jamalludin Ab Rahman. All rights reserved.
About sampling
• It is not feasible to select the WHOLE population
• The best sampling should be able to represent the population
• Sampling error occurs when statistics ≠ parameters
• Sampling error is not sampling bias
• Sampling error is random; sampling bias is predictable (systematic)
• Sampling design affects sampling error
• Standard error measures sampling error
The aim of any sampling plan is to reduce sampling error and to avoid sampling bias
Describe the sample
• Target population – the population to which results are inferred
• Study population – representative of the target population
• Sampling frame – list of sampling units
• Sampling unit – unit to be sampled
• Observation unit – unit to be observed/measured
Sampling method
• Random vs. non-random
  • Random selection helps ensure representativeness
• Simple vs. complex
  • SRS = every unit has an equal chance of being selected, i.e. equal probability of selection
  • Anything that is not SRS is complex sampling
[Figure: simple random sampling, systematic random sampling, stratified random sampling]
Stratified versus cluster sampling
• Stratified: for heterogeneous groups, e.g. male-female, age groups
• Cluster: for homogeneous groups (rarely truly homogeneous, only in an ideal situation), e.g. schools, districts
Cluster
• There are clusters not selected at all
• Larger variance
Stratified
• All strata selected
• Smaller variance
Design Effect (deff)
• Design effect = variance estimate (complex) / variance estimate (SRS)
• How much the variance under the complex design differs from the variance under SRS with the same sample size
• Different value for different variables
• Usually deff for a complex survey > 1
• If deff = 1.5, the variance is 50% larger than under SRS, so about 50% more sample is needed to match the precision of an SRS (effective sample size = n/deff)
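The deff ratio and its effect on sample size can be sketched in a few lines. This is a minimal illustration only: the function names and the variance values are hypothetical, not taken from the slides.

```python
def design_effect(var_complex: float, var_srs: float) -> float:
    """Ratio of the variance under the complex design to the variance under SRS."""
    if var_srs <= 0:
        raise ValueError("var_srs must be positive")
    return var_complex / var_srs


def effective_sample_size(n: int, deff: float) -> float:
    """Size of the SRS that would give the same precision as the complex sample."""
    return n / deff


# Hypothetical variance estimates for one survey variable (assumed values).
deff = design_effect(var_complex=0.00030, var_srs=0.00020)
n_eff = effective_sample_size(n=1000, deff=deff)
print(f"deff = {deff:.2f}, effective sample size = {n_eff:.0f}")
```

With these assumed values deff is 1.5, so 1,000 respondents carry roughly the information of 667 respondents drawn by SRS.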
Design Factor (deft)
• Design factor (deft) is sqrt(deff), i.e. the effect of the sampling design on the standard error
• If deft = 2, the SE is twice as large as it would be under SRS
• deff or deft is used as a guide to calculate sample size (a priori) or to check whether the achieved sample size is adequate (post hoc)
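A short sketch of the a priori use: inflate an SRS sample size by deff, and inflate an SRS standard error by deft. The helper names and the input numbers (384, 1.5, 0.015) are assumptions chosen for illustration.

```python
import math


def design_factor(deff: float) -> float:
    """deft is the square root of deff."""
    return math.sqrt(deff)


def inflate_sample_size(n_srs: int, deff: float) -> int:
    """Sample size needed under the complex design to match SRS precision."""
    return math.ceil(n_srs * deff)


def adjust_standard_error(se_srs: float, deff: float) -> float:
    """Standard error under the complex design, given the SRS standard error."""
    return se_srs * design_factor(deff)


# Hypothetical planning numbers.
print(design_factor(4.0))                 # deft for deff = 4
print(inflate_sample_size(384, 1.5))      # SRS size of 384 inflated by deff = 1.5
print(adjust_standard_error(0.015, 4.0))  # SRS SE of 0.015 inflated by deft = 2
```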
Sampling Weight
• aka probability weight (PW)
• N/n (inverse of sampling fraction)
• Two stage = (N1/n1)*(N2/n2)
• The sum of PW = population size
• Weighting can increase standard error
Sampling weight…
• Why? There is always imperfection in sampling
• Weighting tries to correct for:
  1. Unequal probability of selection – base/design weight
  2. Non-response bias
  3. Stratification in population, i.e. trying to represent the true characteristics of the population (e.g. by sex, ethnicity) – post-stratification
Example
• N = 100,000 people
• Sample (n) = 1,000
• Therefore, SW = 100,000/1,000 = 100
• Every sampled person represents 100 people in that region
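The same arithmetic as a small sketch, which also checks that the weights sum back to the population size. The function name is hypothetical.

```python
def sampling_weight(population_size: int, sample_size: int) -> float:
    """Inverse of the sampling fraction: N / n."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_size / sample_size


N, n = 100_000, 1_000
sw = sampling_weight(N, n)

# Every respondent carries the same weight under this simple design.
weights = [sw] * n
print(f"SW = {sw:.0f}, sum of weights = {sum(weights):.0f}")
```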
Example – two stage

Grade   Class           Students        SW1      SW2      SW
        N1      n1      N2      n2      N1/n1    N2/n2    SW1*SW2
1       5       3       150     30      1.7      5.0      8.3
2       6       3       180     30      2.0      6.0      12.0
3       6       3       175     30      2.0      5.8      11.7
4       7       3       185     30      2.3      6.2      14.4
5       4       3       170     30      1.3      5.7      7.6
* Non-proportionate distribution
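The table can be reproduced with a few lines of code. This sketch takes the counts from the table and multiplies the unrounded stage weights, which appears to be how the SW column was obtained; the variable names are illustrative.

```python
# (grade, N1 classes, n1 classes sampled, N2 students, n2 students sampled)
rows = [
    (1, 5, 3, 150, 30),
    (2, 6, 3, 180, 30),
    (3, 6, 3, 175, 30),
    (4, 7, 3, 185, 30),
    (5, 4, 3, 170, 30),
]


def two_stage_weight(N1: int, n1: int, N2: int, n2: int) -> float:
    """Product of the stage-1 and stage-2 weights: (N1/n1) * (N2/n2)."""
    return (N1 / n1) * (N2 / n2)


final_weights = {}
for grade, N1, n1, N2, n2 in rows:
    sw1, sw2 = N1 / n1, N2 / n2
    final_weights[grade] = two_stage_weight(N1, n1, N2, n2)
    print(f"Grade {grade}: SW1 = {sw1:.1f}, SW2 = {sw2:.1f}, SW = {final_weights[grade]:.1f}")
```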
Example – stratified, one-stage

Population size
Age group   District 1 Urban   District 1 Rural   District 2 Urban   District 2 Rural
Under 18    10,000             13,000             20,000             15,000
18-60       30,000             25,000             60,000             45,000
Above 60    5,000              7,000              5,000              10,000
Total       45,000             45,000             85,000             70,000

Sample size
Age group   District 1 Urban   District 1 Rural   District 2 Urban   District 2 Rural
Under 18    100                100                100                100
18-60       100                100                100                100
Above 60    100                100                100                100
Total       300                300                300                300

Sampling weight (population size / sample size)
Age group   District 1 Urban   District 1 Rural   District 2 Urban   District 2 Rural
Under 18    100                130                200                150
18-60       300                250                600                450
Above 60    50                 70                 50                 100

1 sampled person under 18 from District 1 urban represents 100 people
1 sampled person under 18 from District 2 urban represents 200 people
* Non-proportionate distribution
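A sketch of the same calculation, with one weight per stratum cell (district × location × age group). The population counts come from the table; the dictionary layout and names are assumptions made for this illustration.

```python
# Population size per cell, copied from the table.
population = {
    ("District 1", "Urban"): {"Under 18": 10_000, "18-60": 30_000, "Above 60": 5_000},
    ("District 1", "Rural"): {"Under 18": 13_000, "18-60": 25_000, "Above 60": 7_000},
    ("District 2", "Urban"): {"Under 18": 20_000, "18-60": 60_000, "Above 60": 5_000},
    ("District 2", "Rural"): {"Under 18": 15_000, "18-60": 45_000, "Above 60": 10_000},
}
SAMPLE_PER_CELL = 100  # the table samples 100 people in every cell

# Sampling weight per cell = population size / sample size.
weights = {
    (district, location, age): size / SAMPLE_PER_CELL
    for (district, location), ages in population.items()
    for age, size in ages.items()
}

# Check: the weighted sample should add up to the total population.
weighted_total = sum(w * SAMPLE_PER_CELL for w in weights.values())
print(weights[("District 2", "Urban", "18-60")], weighted_total)
```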
Complex sampling analysis
• Accommodate sampling weight
• Adjust the standard error for the sampling design
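To see why the weight has to be accommodated, compare an unweighted and a weighted prevalence. The data below are entirely hypothetical: four respondents from a stratum with weight 300 and four from a stratum with weight 50.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# 1 = has the condition, 0 = does not (hypothetical responses).
values = [1, 1, 1, 0, 0, 0, 0, 1]
weights = [300] * 4 + [50] * 4

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
print(f"unweighted = {unweighted:.3f}, weighted = {weighted:.3f}")
```

The weighted estimate moves toward the stratum that represents more people, which the unweighted estimate ignores.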
Estimating standard error
• Linearization method (Taylor series) – approximates the estimator by a linear function
• Replication method – draw sub-samples (replicates), compute the estimate for each, and calculate the variance across replicates, e.g. BRR (Balanced Repeated Replication), jackknife, bootstrap
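As one illustration of the replication idea, here is a minimal delete-one-cluster jackknife for a weighted mean. It is a sketch under simplifying assumptions (a single stratum, clusters treated as PSUs, no finite population correction), not the procedure any particular package uses; all names and data are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_se(values, weights, clusters):
    """Delete-one-cluster jackknife standard error of the weighted mean."""
    cluster_ids = sorted(set(clusters))
    g = len(cluster_ids)
    if g < 2:
        raise ValueError("need at least two clusters")
    replicates = []
    for dropped in cluster_ids:
        # Re-estimate the mean with one whole cluster left out.
        kept = [(v, w) for v, w, c in zip(values, weights, clusters) if c != dropped]
        replicates.append(weighted_mean([v for v, _ in kept], [w for _, w in kept]))
    centre = sum(replicates) / g
    variance = (g - 1) / g * sum((r - centre) ** 2 for r in replicates)
    return variance ** 0.5


# Hypothetical data: eight respondents in four clusters.
values = [1, 0, 1, 1, 0, 0, 1, 0]
weights = [100, 100, 130, 130, 200, 200, 150, 150]
clusters = ["A", "A", "B", "B", "C", "C", "D", "D"]
print(f"jackknife SE = {jackknife_se(values, weights, clusters):.4f}")
```

When every respondent is its own cluster and all weights are equal, this reduces to the usual SRS standard error s/sqrt(n).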
Practical Session
Practical
• Sampling distribution
• Calculating sampling weight
• Preparing data for analysis
• Complex sample analysis (using SPSS)
Sampling distribution
• Using the 2016 adult household data by location (urban/rural) in Malaysia, prepare a sampling distribution that is representative at the Malaysian urban/rural level, given a calculated sample size of 10,000 respondents
• Take 12 living quarters (LQ) per enumeration block (EB) and 2 adults per LQ
• Proportionate to size
Population Size by census ('000)*
No.  State             Urban    Rural    Total
1    Johor             1,682    537      2,219
2    Kedah             905      433      1,338
3    Kelantan          508      543      1,050
4    Melaka            537      47       584
5    Negeri Sembilan   492      198      690
6    Pahang            564      427      991
7    Perak             1,260    394      1,653
8    Perlis            102      66       167
9    Pulau Pinang      1,069    69       1,138
10   Sabah             1,064    597      1,661
11   Sarawak           1,009    694      1,703
12   Selangor          3,583    274      3,857
13   Terengganu        450      250      700
14   WP Kuala Lumpur   1,133    -        1,133
15   WP Labuan         50       6        57
16   WP Putrajaya      46       -        46
     Total             14,454   4,533    18,987
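One possible way to work the exercise at the national urban/rural level is sketched below. It uses the table's total row, allocates the 10,000 respondents proportionately, and rounds the number of EBs up so that each stratum reaches its target. The rounding rule and all names are assumptions, not part of the slides.

```python
import math

# National totals from the census table ('000).
populations = {"Urban": 14_454, "Rural": 4_533}
TARGET_RESPONDENTS = 10_000
RESPONDENTS_PER_EB = 12 * 2  # 12 LQ per EB, 2 adults per LQ


def allocate(populations, target, per_eb):
    """Proportionate-to-size allocation, expressed in whole enumeration blocks."""
    total = sum(populations.values())
    plan = {}
    for stratum, size in populations.items():
        wanted = target * size / total      # proportionate share of respondents
        ebs = math.ceil(wanted / per_eb)    # round up to whole EBs (assumed rule)
        plan[stratum] = {"wanted": wanted, "ebs": ebs, "respondents": ebs * per_eb}
    return plan


plan = allocate(populations, TARGET_RESPONDENTS, RESPONDENTS_PER_EB)
for stratum, row in plan.items():
    print(f"{stratum}: {row['wanted']:.0f} wanted, {row['ebs']} EBs, {row['respondents']} respondents")
```

The same function can be applied state by state by passing each state's urban and rural counts as separate strata.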
Calculating sampling weight

                   PSU (Kindergarten)                            SSU (Children)
                   Urban                 Rural                   Urban                  Rural
State              Total pop.*  Visited  Total pop.*  Visited    Total pop.*  Examined  Total pop.*  Examined
FT Kuala Lumpur    471          34       -            -          10,940       687       -            -
Perlis             65           5        222          7          1,007        97        2,557        113
Kedah              164          19       757          69         1,913        203       9,154        846
Penang             297          21       316          24         4,845        402       4,496        366
Perak              356          19       1,040        55         6,382        412       12,627       819
Selangor           1,051        93       607          55         22,951       2,204     7,994        815
Negeri Sembilan    206          15       420          30         2,924        253       4,850        373
Melaka             131          8        384          22         1,941        125       5,111        316
Johor              586          42       1,121        80         9,389        779       13,594       1,163
Pahang             235          13       873          45         4,188        224       12,092       642
Terengganu         400          21       813          35         6,979        336       9,308        427
Kelantan           144          9        1,042        58         2,924        178       14,882       934
FT Putrajaya       71           4        -            -          2,170        127       -            -
Sabah              395          32       1,230        101        10,330       998       13,837       1,006
Sarawak            590          30       1,493        67         13,395       644       14,936       725
FT Labuan          74           8        -            -          1,400        135       -            -
Total              5,236        373      10,318       648        103,678      7,804     125,438      8,545
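A sketch of how the two-stage formula (N1/n1)*(N2/n2) might be applied to this table, one weight per state and location. Using the state-level totals directly as N1 and N2 is an assumption about how the exercise is meant to be worked; only three cells are copied here for brevity.

```python
def two_stage_weight(psu_total, psu_sampled, ssu_total, ssu_sampled):
    """(N1/n1) * (N2/n2): kindergarten-stage weight times child-stage weight."""
    return (psu_total / psu_sampled) * (ssu_total / ssu_sampled)


# (kindergartens total, kindergartens visited, children total, children examined)
cells = {
    ("Perlis", "Urban"): (65, 5, 1_007, 97),
    ("Perlis", "Rural"): (222, 7, 2_557, 113),
    ("Selangor", "Urban"): (1_051, 93, 22_951, 2_204),
}

sw = {key: two_stage_weight(*counts) for key, counts in cells.items()}
for (state, location), weight in sw.items():
    print(f"{state} {location}: SW = {weight:.1f}")
```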
Preparing data for analysis
• Merge SW into dataset
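Merging means attaching the stratum's weight to every respondent record. A minimal sketch with plain dictionaries follows; the field names (`state`, `location`, `sw`) and the weight values are hypothetical.

```python
# Sampling weight per stratum (hypothetical values).
weight_table = {
    ("Perlis", "Urban"): 135.0,
    ("Perlis", "Rural"): 717.6,
}

# Respondent-level dataset (hypothetical records).
records = [
    {"id": 1, "state": "Perlis", "location": "Urban", "caries": 1},
    {"id": 2, "state": "Perlis", "location": "Rural", "caries": 0},
    {"id": 3, "state": "Perlis", "location": "Urban", "caries": 0},
]


def merge_weights(records, weight_table):
    """Return copies of the records with the stratum weight added as 'sw'."""
    merged = []
    for rec in records:
        key = (rec["state"], rec["location"])
        if key not in weight_table:
            # Fail loudly: an unweighted record would silently bias the analysis.
            raise KeyError(f"no sampling weight for stratum {key}")
        merged.append({**rec, "sw": weight_table[key]})
    return merged


dataset = merge_weights(records, weight_table)
for rec in dataset:
    print(rec)
```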
Complex sample analysis
• Preparing the complex samples (CS) plan
• Analysis using SPSS


Editor's notes

  1. For cluster samples, the main components of deff are the intraclass correlation (rho) and the number of units within each cluster. Rho is a statistical estimate of within-cluster homogeneity. It represents the probability that two units drawn randomly from the same cluster will have the same value on the variable in question, relative to two units drawn at random from the population as a whole. Thus, a rho of 0.10 indicates that two units randomly selected from within the same cluster are 10% more likely to have the same value than are two randomly selected units in the population as a whole.
  2. For example, a deft value of 2 indicates that the standard errors are twice as large as they would have been had the design been a simple random sample.
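Note 1 names rho and cluster size as the main components of deff. A commonly used approximation for equal-sized clusters is deff ≈ 1 + (m − 1) × rho, where m is the number of units per cluster. The notes do not state this formula, so the sketch below brings it in as an assumption; the example values (m = 24, rho = 0.10) are hypothetical.

```python
import math


def deff_from_rho(cluster_size: int, rho: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1 + (cluster_size - 1) * rho


deff = deff_from_rho(cluster_size=24, rho=0.10)
deft = math.sqrt(deff)
print(f"deff = {deff:.2f}, deft = {deft:.2f}")
```

With one unit per cluster, or rho = 0, the approximation gives deff = 1, i.e. no loss relative to SRS.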