Computer Intensive Models
Daniele and Steph
Playing Card Example
• 52 cards are population
▫ 25-card sample: what are the mean, median and SE?
• Randomization
▫ Randomly reallocate cards to groups (e.g. diamonds, spades…)
▫ 4 people divide up the cards
▫ Do 1000+ times, randomly allocating to the 4 groups
▫ Graph the distribution of the parameter
▫ If your original parameter is outside the 95% CI, then
it is significantly different from random
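For the 25-card sample on this slide, the mean, median and SE could be computed as in the minimal sketch below. It assumes each card is scored by its rank only (ace = 1 through king = 13), which the slide does not specify, and uses a fixed seed so the draw is repeatable.

```python
import random
import statistics

# Assumption: a card's value is its rank (ace = 1 ... king = 13), four suits each.
population = [rank for rank in range(1, 14) for _suit in range(4)]  # 52 cards

rng = random.Random(0)               # fixed seed so the draw is repeatable
sample = rng.sample(population, 25)  # 25 cards drawn without replacement

sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)
# Standard error of the mean: sample standard deviation / sqrt(n)
sample_se = statistics.stdev(sample) / len(sample) ** 0.5

print(f"mean={sample_mean:.2f} median={sample_median} SE={sample_se:.2f}")
```

These three numbers are the "original parameters" that the resampling methods on the following slides are compared against.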
Playing Card Example
• Jackknife
▫ Take 24 cards (leave 1 out) and compute the test
statistics
▫ Redo for all possible leave-one-out subsets (25 times)
• Bootstrapping
▫ Pick out 25 cards, but put each one back after drawing it
▫ Calculate new parameters
◦ Redo 1000+ times
▫ If your sample parameter falls within the 95% CI of
the distribution, then it isn’t statistically different
from random
Playing Card Example
• Monte Carlo
▫ Find a model that would fit the card trend
◦ Relative frequency plot, examine shape
▫ Randomly select values for model parameters or data (cards
picked)
◦ Complete 1000+ times
◦ Analyze the parameters fitted to your data vs. the randomly generated
parameters
http://www.vertex42.com/ExcelArticles/mc/MonteCarloSimulation.html
COMPARISON
Methods, in column order: Randomize | Jackknife | Bootstrap | Monte Carlo
• With replacement: No | No | Yes | Yes
• Exact P-values: Yes | Likely no | Yes | Yes
• Resample a theoretical PDF: Parametric
• Resample an empirical distribution: Yes | Yes | Non-parametric | Non-parametric
• Good to…: Deal with unknown distribution | Detect bias, calc. SE, good for biased parameters | Calc. sample size for exp. design, CI, SE and test hypot. | Flexible, generic, SE, CI, test hypot.
• Good for sparse data sets
• Limitations: Can’t calc. SE or CI (weak) | Bad CI
4 Methods
• Randomization
▫ Ho: each group of obs. is a random sample from one population
▫ Must be characterized by a test statistic
▫ Combine all groups, then reallocate, and compare
◦ Repeat 1000+ times
▫ Compare the observed test statistic with the empirical distribution of
that test statistic given the available data
◦ A significant difference is when the observed test statistic falls in the
extreme tail of (or beyond) the empirical distribution
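The combine, reallocate and compare loop above can be sketched for any number of groups as follows. The test statistic (largest minus smallest group mean), the function names and the three small data sets are illustrative assumptions, not taken from the slides.

```python
import random
import statistics

def spread_of_means(groups):
    """Hypothetical test statistic: largest group mean minus smallest."""
    means = [statistics.mean(g) for g in groups]
    return max(means) - min(means)

def randomization_test(groups, stat=spread_of_means, n_iter=1000, seed=0):
    """Pool all groups, reallocate at random, and compare the statistic."""
    rng = random.Random(seed)
    observed = stat(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    null_stats = []
    for _ in range(n_iter):
        rng.shuffle(pooled)
        # Cut the shuffled pool back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        null_stats.append(stat(shuffled))
    # Share of reallocations at least as extreme as the observed value.
    p_value = sum(s >= observed for s in null_stats) / n_iter
    return observed, p_value

# Made-up measurements for three groups (illustrative only).
groups = [[4.1, 5.0, 4.6, 5.2], [5.9, 6.4, 6.1, 7.0], [4.8, 5.1, 5.5, 4.9]]
observed, p_value = randomization_test(groups)
print(observed, p_value)
```

A small `p_value` means few random reallocations produced a spread as large as the observed one, which is the sense in which the observed statistic sits in the tail of the empirical distribution.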
4 Methods
• Jackknife
▫ Sample could be from one arm of the distribution
▫ Subset data (all combinations of the data minus 1
pt; n subsets of n-1 points each)
▫ Calculate pseudo-values; the difference between their mean and the obs.
estimate = estimate of bias
▫ Good when estimating something other than the mean
▫ Calculate jackknife SE and parameter of interest
◦ CI can be fitted but issues with DFs
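A minimal sketch of the leave-one-out procedure, using the standard pseudo-value formulas. The function name and the data are hypothetical; the plug-in (divide-by-n) variance is chosen as the estimator because it is biased, which gives the bias estimate something to detect.

```python
import statistics

def jackknife(data, estimator):
    """Return (jackknife estimate, bias estimate, jackknife SE)."""
    n = len(data)
    theta_hat = estimator(data)
    # n leave-one-out estimates, each computed from n - 1 points.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    # Pseudo-values: n * theta_hat - (n - 1) * theta_(-i)
    pseudo = [n * theta_hat - (n - 1) * t for t in leave_one_out]
    jack_estimate = statistics.mean(pseudo)
    bias = theta_hat - jack_estimate           # estimated bias of theta_hat
    se = statistics.stdev(pseudo) / n ** 0.5   # jackknife standard error
    return jack_estimate, bias, se

# Made-up sample (illustrative only).
data = [12.0, 15.5, 9.8, 14.1, 11.3, 13.7, 10.9, 16.2]
jack_estimate, bias, se = jackknife(data, statistics.pvariance)
print(jack_estimate, bias, se)
```

For this particular estimator, the bias-corrected jackknife estimate coincides with the usual unbiased sample variance, which makes it a convenient sanity check.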
4 Methods
• Bootstrapping
▫ Random samples of the observations (with
replacement)
◦ Each treated as a separate random sample
▫ Should approximate the distribution you would get if you had repeatedly
sampled the original population
▫ Provides better CI than the jackknife
◦ Can determine SE, CI and test hypotheses
▫ Has been used for multiple regression and stratified
sampling
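One way to sketch this is a percentile bootstrap, shown below for the median. The percentile interval is only one of several bootstrap CI constructions and is an assumption here, as are the function name and the data.

```python
import random
import statistics

def bootstrap(data, estimator, n_boot=1000, seed=0):
    """Return (bootstrap SE, approximate 95% percentile interval)."""
    rng = random.Random(seed)
    n = len(data)
    # Each replicate: resample n values with replacement, then re-estimate.
    reps = sorted(
        estimator(rng.choices(data, k=n))
        for _ in range(n_boot)
    )
    se = statistics.stdev(reps)
    # Approximate 2.5th and 97.5th percentiles of the sorted replicates.
    lower = reps[round(0.025 * n_boot)]
    upper = reps[round(0.975 * n_boot) - 1]
    return se, (lower, upper)

# Made-up sample (illustrative only).
data = [12.0, 15.5, 9.8, 14.1, 11.3, 13.7, 10.9, 16.2, 12.8, 14.9]
se, (lower, upper) = bootstrap(data, statistics.median)
print(se, lower, upper)
```

Passing a different `estimator` (for example `statistics.mean`) reuses the same resampling loop, which is the flexibility the slide refers to.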
4 Methods
• Monte Carlo
▫ Mathematical model of the situation + model
parameters
▫ Randomly select variable, parameter or data values
and then use them to determine the model output
◦ Do 1000+ times and use to test hypotheses or
determine confidence intervals
▫ *Resampling from a theoretical distribution
▫ Compare observations with data from a model of the
system; also used in risk assessment
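As a rough illustration, the sketch below runs a toy model 1000 times with parameters drawn from theoretical distributions. The model, the distributions and every numeric value are hypothetical and chosen only to show the mechanics.

```python
import random

def model(growth_rate, initial_size):
    """Hypothetical model: size after one time step."""
    return initial_size * (1 + growth_rate)

rng = random.Random(0)
outputs = []
for _ in range(1000):
    # Draw each input from an assumed theoretical distribution.
    growth_rate = rng.gauss(0.10, 0.03)   # assumed normal, mean 0.10, sd 0.03
    initial_size = rng.uniform(90, 110)   # assumed uniform on [90, 110]
    outputs.append(model(growth_rate, initial_size))

outputs.sort()
# Approximate 95% interval for the model output from the sorted runs.
lower, upper = outputs[25], outputs[974]
print(lower, upper)
```

An observed value falling outside `(lower, upper)` would suggest the data are unusual under the assumed model, which is how such runs feed into hypothesis tests or risk assessment.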
Chapter 5: Randomization Tests
• ANOVA (e.g.) vs. Randomization
▫ Assumptions
• Hypothesis testing
▫ Determinations of likelihood that observations in
nature could have arisen by chance
▫ Relativity: group 1 vs. group 2, hypothesis
Standard Significance Testing
• Test statistic (e.g. t-test)
• Significant difference
• Statistically, 3 things are needed to test a hypothesis:
1. Formally stated hypothesis (Ho and Ha)
2. Test statistic: t-test, F-ratio, r correlation, etc.
3. Means of generating the probability distribution of the
test statistic under the assumption that Ho is true
• Observed vs. expected values (d.f., etc.)
• Determine how likely the observed values are,
assuming Ho to be true
• α-value: 0.05, 0.01, 0.001
[Figure: probability (0 to 1) vs. χ² statistic (0 to 25); critical value at α = 0.05 is ~15]
However…
• Test statistics are not valid if assumptions are
falsely made
• Can, therefore, reject a real difference in data or
accept a non-existent difference
• Problem with theoretically derived PDF:
▫ If test statistic is not significant, cannot tell without
further analyses if the test failed b/c:
1. Samples are not independently distributed (thing being tested)
or,
2. The data failed to conform to the assumptions necessary for the
validity of the test (e.g. samples were not normally distributed)
This is where randomization tests come into play…
Significance Testing by Randomization
Independent of any determined parametric
PDF
Generates empirical PDF for the test-statistic
Given a null hypothesis, the expected PDF for a
test statistic can be generated by repeatedly
randomizing the data with respect to sample
group membership and recalculating the test
statistic
1. Done many times (min. 1000)
2. Test statistic values tabulated
3. Compared with original value from un-randomized data
4. If original value is unusual relative to
permutations, Ho can be rejected
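For the two-sample case, steps 1 to 4 might look like the sketch below. The absolute mean difference is assumed as the test statistic, and the function name and both data sets are hypothetical.

```python
import random
import statistics

def mean_difference_test(group_a, group_b, n_iter=1000, seed=0):
    """Randomization test on |mean(a) - mean(b)|."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    replicates = []
    for _ in range(n_iter):                      # step 1: done many times
        rng.shuffle(pooled)                      # randomize group membership
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        replicates.append(diff)                  # step 2: tabulate values
    replicates.sort()
    # Step 3: count replicates at least as large as the original value.
    n_extreme = sum(r >= observed for r in replicates)
    # Step 4: a small proportion means the original value is unusual under Ho.
    return observed, n_extreme / n_iter, replicates

# Made-up lengths for two samples (illustrative only).
sample_1 = [102, 118, 95, 110, 121, 99, 107]
sample_2 = [131, 125, 140, 119, 136, 128, 144]
observed, p_value, replicates = mean_difference_test(sample_1, sample_2)
print(observed, p_value)
```

The sorted `replicates` list is the empirical distribution of the test statistic. Plotting it against the observed value gives the kind of graph used in the worked example later in the deck.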
The Three R’s
sharonlanegetaz.v2efoliomn.mnscu.edu/Research
Key Points
• Essentially, the null-hypothesis is that the
groups being compared are random samples
from the same population
• Thus, a test of significance is an attempt to
determine whether observed samples could
have been randomly drawn from the same
population
• Answer = probability
▫ PROBABILITY = possibly from same pop. (never
claim definitively)
▫ PROBABILITY = not likely from same pop.
Ex: Fish length for in-shore fish vs. off-shore fish
• Ha: In-shore fish are smaller on average than off-shore fish
[Figure: fork length (axis 0 to 300) for off-shore and in-shore fish]
Ex. 5.2
• Randomization can be used to test mean
difference
• A mean difference as large as the original occurred 25 times out of
1000 randomizations
• What does this tell us about the data?
• Weight of evidence, not significant difference
(e.g. p=0.05)
[Figure: sorted randomization replicates (0 to 1000) vs. Abs(Mean Difference) (0 to 55); p=0.025]
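The p-value of 0.025 in Ex. 5.2 follows from the counts on the slide, assuming it is taken as the fraction of the 1000 randomization replicates whose absolute mean difference is at least as large as the original:

$$
p = \frac{\text{replicates at least as extreme as the original}}{\text{total replicates}} = \frac{25}{1000} = 0.025
$$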
Selection of a Test Statistic
• Chosen based on sensitivity to the hypothetical
property being checked or compared (e.g. mean
vs. median (ex.5.3))
• Precaution should be taken when selecting a
non-standard test statistic
• Multivariate comparisons
• Determine exactly which hypothesis is being
tested by the test statistic
▫ “When in doubt, be conservative”
• Ex 5.3
Ideal Test Statistics
• Greatest statistical power
• Significance – probability of making a Type I
error
• Power – probability of correctly rejecting a false null hypothesis
(1 − probability of making a Type II error)
• Unbiased – using a test that is more likely to
reject a false hypothesis than a true one
                 No difference (null true) | Difference exists (null false)
Null accepted    OK                        | TYPE II ERROR
Null rejected    TYPE I ERROR              | OK
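The Type I error rate and power of a test can themselves be estimated by simulation, as in the rough sketch below. It assumes normally distributed data, a two-sample randomization test on the absolute mean difference, and deliberately small simulation counts to keep the run time short. All names and settings are hypothetical.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def rejects_null(a, b, rng, n_iter=200, alpha=0.05):
    """Randomization test on |mean difference|; True if Ho is rejected."""
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            extreme += 1
    return extreme / n_iter <= alpha

def rejection_rate(shift, n_sim=200, n=10, seed=0):
    """Fraction of simulated data sets in which Ho is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(shift, 1) for _ in range(n)]   # shift = true difference
        rejections += rejects_null(a, b, rng)
    return rejections / n_sim

type_i_rate = rejection_rate(shift=0.0)  # Ho true: every rejection is a Type I error
power = rejection_rate(shift=1.5)        # Ho false: every rejection is correct
print(type_i_rate, power)
```

With Ho true, the rejection rate should sit near the chosen significance level. With a real difference, it estimates power, and one minus that value estimates the Type II error rate.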
Randomization of Structured Data
• Restricted to comparison tests (cannot be used
for parameter estimation)
• Differences in variation – randomize residuals
instead of data values
• Basic rule: for unbalanced and highly non-normal
data, a randomization procedure should be used
• Question: what should be randomized?
▫ Raw data,
▫ Sub-set of raw data, or
▫ residuals from model
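As one possible reading of "randomize residuals instead of data values", the sketch below compares variation between two groups whose means differ. Each value is first replaced by its deviation from its own group mean, and those residuals are then reallocated. The test statistic, the function names and the data are hypothetical.

```python
import random
import statistics

def variance_gap(a, b):
    """Hypothetical test statistic: absolute difference in sample variance."""
    return abs(statistics.variance(a) - statistics.variance(b))

def residual_randomization(a, b, n_iter=1000, seed=0):
    rng = random.Random(seed)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Residuals from the simplest model: each group has its own mean.
    res_a = [x - mean_a for x in a]
    res_b = [x - mean_b for x in b]
    observed = variance_gap(res_a, res_b)
    pooled = res_a + res_b
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)   # reallocate residuals, not raw values
        if variance_gap(pooled[:len(a)], pooled[len(a):]) >= observed:
            extreme += 1
    return observed, extreme / n_iter

# Made-up data: similar spread in neither group, very different means.
group_a = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]
group_b = [20.5, 17.0, 23.1, 18.2, 21.9, 19.4]
observed, p_value = residual_randomization(group_a, group_b)
print(observed, p_value)
```

Shuffling the raw values here would mix the difference in means into the variance comparison. Removing the group means first keeps the test focused on variation.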
Take Home Message…
With structured or non-linear data, care needs to
be taken in choosing which components to
randomize
Summary
• Randomization requires fewer assumptions than
standard parametric stats
• Significance tests test whether the observed
samples could be from the same pop.
• State hypotheses and determine significance
level
• Test statistics that yield the greatest power
should be utilized

Editor's notes

  1. For this example, pick out 3 cards at a time
  2. For this example, pick out 3 cards at a time
  3. Fewer assumptions are needed for randomization tests, giving extra flexibility.
  4. Parametric stats are good when assumptions are known