Computer Intensive Models
Daniele and Steph
Playing Card Example
• 52 cards are population
▫ 25 card sample…. What is the mean, median, SE
• Randomization
▫ Randomly reallocate cards to groups (e.g. diamonds, spades, ...)
▫ 4 people divide up the cards
▫ Repeat 1000+ times, randomly allocating to the 4 groups each time
▫ Graph the distribution of the parameter
▫ If your original parameter falls outside the central 95% of that distribution, it is significantly different from random
Playing Card Example
• Jackknife
▫ Take 24 cards (leave 1 out) and compute the test statistics
▫ Redo for every leave-one-out subset (25 times)
• Bootstrapping
▫ Pick out 25 cards, putting each card back before the next draw
▫ Calculate new parameters
◦ Redo 1000+ times
▫ If your sample parameter falls within the central 95% of the bootstrap distribution, it isn't statistically different from random
Playing Card Example
• Monte Carlo
▫ Find a model that would fit the card trend
◦ Relative frequency plot, examine shape
▫ Randomly select values for model parameters or data (cards picked)
◦ Complete 1000+ times
◦ Compare the parameters fitted to the data against the randomly generated parameters
http://www.vertex42.com/ExcelArticles/mc/MonteCarloSimulation.html
COMPARISON
Values in each row are listed in the order: Randomize / Jackknife / Bootstrap / Monte Carlo
• With replacement: No / No / Yes / Yes
• Exact P-values: Yes / Likely no / Yes / Yes
• Resample a theoretical PDF: Parametric
• Resample an empirical distribution: Yes / Yes / Non-parametric / Non-parametric
• Good to…: Deal with unknown distribution / Detect bias, calc. SE, good for biased parameters / Calc. sample size for exp. design, CI, SE and test hypotheses / Flexible, generic, SE, CI, test hypotheses
• Good for sparse data sets
• Limitations: Can't calc. SE or CI (weak) / Bad CI
4 Methods
• Randomization
▫ Ho: each group of obs. is a random sample from one pop.
▫ Must be characterized by a test stat
▫ Combine all groups, then reallocate, and compare
◦ Repeat 1000+ times
▫ Compare the obs. test stat with the empirical distribution of that test stat given the available data
◦ A sig. diff. is when the obs. test stat falls in the extreme tail of the empirical distribution
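The randomization steps above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: the function name `permutation_test`, the choice of the absolute difference in means as the test stat, and the two small data sets are all assumptions made for the example.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_perm=1000, seed=0):
    """Two-sided randomization test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)   # combine all groups
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):                  # repeat 1000+ times
        rng.shuffle(pooled)                  # reallocate obs. to groups at random
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:                 # as extreme as the observed stat
            count += 1
    return observed, count / n_perm          # empirical p-value

# Hypothetical measurements, for illustration only
a = [12.1, 14.3, 11.8, 13.5, 12.9]
b = [15.2, 16.8, 14.9, 17.1, 15.5]
obs, p = permutation_test(a, b)
print(f"observed |mean diff| = {obs:.2f}, p = {p:.3f}")
```

A small p means few random reallocations produced a difference as large as the observed one, which is evidence against Ho.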
4 Methods
• Jackknife
▫ Sample could be from one arm of the distribution
▫ Subset the data: every combination of all data minus 1 pt (n subsets, each of size n-1)
▫ Calculate pseudo-values; the diff. btw. the obs. estimate and the mean pseudo-value is the estimate of bias
▫ Good when estimating something other than the mean
▫ Calculate the jackknife SE and the parameter of interest
◦ CI can be fitted, but there are issues with the DFs
4 Methods
• Bootstrapping
▫ Random samples of the observations (with replacement)
◦ Each treated as a separate random sample
▫ Should approximate the distribution you would get by repeatedly sampling the original population
▫ Provides better CI than the jackknife
◦ Can determine SE, CI and test hypotheses
▫ Has been used for multiple regression and stratified sampling
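The bootstrap can be sketched as follows. This assumes a simple percentile CI (one of several bootstrap CI methods); the function name `bootstrap` and the 25-card sample of face values are hypothetical.

```python
import random
from statistics import mean, stdev

def bootstrap(data, estimator, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap SE and percentile confidence interval for an estimator."""
    rng = random.Random(seed)
    n = len(data)
    # each replicate: n draws from the sample, with replacement
    reps = sorted(estimator(rng.choices(data, k=n)) for _ in range(n_boot))
    se = stdev(reps)                          # bootstrap standard error
    lo = reps[round(alpha / 2 * n_boot)]      # lower percentile bound
    hi = reps[round((1 - alpha / 2) * n_boot) - 1]  # upper percentile bound
    return se, (lo, hi)

# Hypothetical 25-card sample (face values 1..13), for illustration only
cards = [1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 7, 7, 7, 8, 8,
         9, 9, 10, 10, 11, 11, 12, 12, 13, 13]
se, (lo, hi) = bootstrap(cards, mean)
print(f"mean = {mean(cards):.2f}, SE = {se:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The same function works for the median or any other estimator that takes a list, which is where the bootstrap's flexibility comes from.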
4 Methods
• Monte Carlo
▫ Mathematical model of the situation + model parameters
▫ Randomly select variable, parameter or data values and then use them to determine the model output
◦ Do 1000+ times and use to test hypotheses or determine confidence intervals
▫ *Resampling from a theoretical distribution
▫ Compare observations with data from a model of the system; also used in risk assessment
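A minimal Monte Carlo sketch in the spirit of the playing-card example. The model assumed here is a standard 52-card deck with face values 1..13 dealt without replacement; the function name `monte_carlo_mean_test` and the observed mean of 8.4 are hypothetical.

```python
import random
from statistics import mean

def monte_carlo_mean_test(observed_mean, n_cards=25, n_sim=1000, seed=0):
    """Compare an observed sample mean with means simulated from a deck model."""
    rng = random.Random(seed)
    deck = [v for v in range(1, 14) for _ in range(4)]  # theoretical model
    expected = mean(deck)                               # 7 for this deck
    # simulate the model output many times
    sims = [mean(rng.sample(deck, n_cards)) for _ in range(n_sim)]
    # share of simulated means at least as far from expected as the observed one
    extreme = sum(abs(s - expected) >= abs(observed_mean - expected) for s in sims)
    return sims, extreme / n_sim

sims, p = monte_carlo_mean_test(observed_mean=8.4)  # 8.4 is a made-up value
print(f"p = {p:.3f}")
```

Unlike the bootstrap, the random draws here come from the assumed model (the full deck), not from the observed sample.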
Chapter 5: Randomization Tests
• ANOVA (e.g.) vs. Randomization
▫ Assumptions
• Hypothesis testing
▫ Determination of the likelihood that observations in nature could have arisen by chance
▫ Relativity: group 1 vs. group 2, hypothesis
Standard Significance Testing
• Test statistic (e.g. t-test)
• Significant difference
• Statistically, 3 things are needed to test a hypothesis:
1. Formally stated hypotheses (Ho and Ha)
2. Test statistic: t, F-ratio, correlation r, etc.
3. Means of generating the probability distribution (PD) of the test statistic under the assumption that Ho is true
• idrc.ca
• Observed vs. expected values (d.f., etc.)
• Determine how likely observed values,
assuming Ho to be true
• α-value: 0.05, 0.01, 0.001
[Figure: probability (0 to 1) vs. χ² statistic (0 to 25); α 0.05 = ~15]
However…
• Test statistics are not valid if assumptions are
falsely made
• Can, therefore, reject a real difference in data or
accept a non-existent difference
• Problem with a theoretically derived PDF:
▫ If the test statistic is not significant, cannot tell without further analyses if the test failed b/c:
1. Samples are not independently distributed (thing being tested), or
2. The data failed to conform to the assumptions necessary for the validity of the test (e.g. samples were not normally distributed)
This is where randomization tests come into play…
Significance Testing by Randomization
Independent of any determined parametric
PDF
Generates empirical PDF for the test-statistic
Given a null hypothesis, the expected PDF for a
test statistic can be generated by repeatedly
randomizing the data with respect to sample
group membership and recalculating the test
statistic
1. Done many times (min. 1000)
2. Test statistic values tabulated
3. Compared with original value from unrandomized data
4. If original value is unusual relative to
permutations, Ho can be rejected
The Three R’s
sharonlanegetaz.v2efoliomn.mnscu.edu/Research
Key Points
• Essentially, the null-hypothesis is that the
groups being compared are random samples
from the same population
• Thus, a test of significance is an attempt to
determine whether observed samples could
have been randomly drawn from the same
population
• Answer = probability
▫ High probability = possibly from same pop. (never claim definitively)
▫ Low probability = not likely from same pop.
Ex: Fish length for in-shore fish vs. off-shore fish
• Ha: In-shore fish are smaller on average than off-shore fish
[Figure: fork length (axis 0 to 300) for off-shore and in-shore fish]
Ex. 5.2
• Randomization can be used to test a mean difference
• A mean difference as large as the original occurred 25 times out of 1000
• What does this tell us about the data?
• Weight of evidence, not a pass/fail verdict at a fixed cutoff (e.g. p=0.05)
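As a worked version of that count, assuming "25 out of 1000" means that 25 of the N = 1000 randomized absolute mean differences were at least as large as the observed one. The symbols d*_i, d_obs and N are notation introduced here, not taken from the slides:

```latex
\[
p \;=\; \frac{\#\{\, i : |d_i^{*}| \ge |d_{\mathrm{obs}}| \,\}}{N}
  \;=\; \frac{25}{1000} \;=\; 0.025
\]
```

So a difference this large arises by random reallocation alone about 2.5% of the time.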
[Figure: sorted randomization replicates (0 to 1000) vs. Abs(Mean Difference) (0 to 55); p=0.025]
Selection of a Test Statistic
• Chosen based on sensitivity to the hypothetical
property being checked or compared (e.g. mean
vs. median (ex.5.3))
• Precaution should be taken when selecting a
non-standard test statistic
• Multivariate comparisons
• Determine exactly which hypothesis is being
tested by the test statistic
▫ “When in doubt, be conservative”
• Ex 5.3
Ideal Test Statistics
• Greatest statistical power
• Significance: probability of making a Type I error
• Power: probability of avoiding a Type II error (1 minus the probability of making one)
• Unbiased: using a test that is more likely to reject a false hypothesis than a true one
                 No difference (Null true)    Difference exists (Null false)
Null accepted    OK                           TYPE II ERROR
Null rejected    TYPE I ERROR                 OK
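The two error rates in the table can be estimated by simulation. The sketch below is only an illustration under stated assumptions: normally distributed data with known sigma, a two-sample z statistic with a 1.96 cutoff, and a hypothetical function name `rejection_rate`. With no true difference, the rejection rate estimates the Type I error rate; with a true difference, it estimates power, and 1 minus power is the Type II error rate.

```python
import random
from math import sqrt

def rejection_rate(true_diff, n=20, sigma=1.0, n_sim=5000, z_crit=1.96, seed=0):
    """Share of simulated experiments in which the null is rejected."""
    rng = random.Random(seed)
    se = sigma * sqrt(2 / n)          # SE of the difference in means
    rejections = 0
    for _ in range(n_sim):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(true_diff, sigma) for _ in range(n)]
        z = (sum(b) / n - sum(a) / n) / se
        if abs(z) > z_crit:           # two-sided test at roughly alpha = 0.05
            rejections += 1
    return rejections / n_sim

alpha_hat = rejection_rate(true_diff=0.0)   # null true: Type I error rate
power_hat = rejection_rate(true_diff=1.0)   # null false: power
beta_hat = 1 - power_hat                    # Type II error rate
print(f"Type I = {alpha_hat:.3f}, power = {power_hat:.3f}, Type II = {beta_hat:.3f}")
```

Swapping in a different test statistic and comparing `power_hat` at the same `alpha_hat` is one way to see which statistic has the greater power.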
Randomization of Structured Data
• Restricted to comparison tests (cannot be used
for parameter estimation)
• Differences in variation – randomize residuals
instead of data values
• Basic rule: with unbalanced and highly non-normal data, a randomization procedure should be used
• Question: what should be randomized?
▫ Raw data,
▫ Sub-set of raw data, or
▫ Residuals from model
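One way to randomize residuals instead of raw values is sketched below, assuming the question is whether two groups differ in variation while their means are allowed to differ. Each observation is replaced by its residual from the group mean before shuffling. The function name `variance_randomization_test`, the variance-ratio statistic and the data are hypothetical choices for illustration.

```python
import random
from statistics import mean, variance

def variance_randomization_test(group_a, group_b, n_perm=1000, seed=0):
    """Randomization test for a difference in variation, using residuals."""
    rng = random.Random(seed)
    # residuals from the simplest model: each group has its own mean
    res_a = [x - mean(group_a) for x in group_a]
    res_b = [x - mean(group_b) for x in group_b]

    def stat(a, b):
        # ratio of larger to smaller variance (assumes neither variance is zero)
        va, vb = variance(a), variance(b)
        return max(va, vb) / min(va, vb)

    observed = stat(res_a, res_b)
    pooled = res_a + res_b
    n_a = len(res_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)           # reallocate residuals, not raw values
        if stat(pooled[:n_a], pooled[n_a:]) >= observed:
            count += 1
    return observed, count / n_perm

# Hypothetical data: different means and clearly different spread
a = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7]
b = [20.0, 25.0, 15.0, 22.0, 18.0, 27.0, 13.0, 20.0]
ratio, p = variance_randomization_test(a, b)
print(f"variance ratio = {ratio:.1f}, p = {p:.3f}")
```

Shuffling the raw values here would mix the difference in means into the variance comparison, which is why the choice of what to randomize matters.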
Take Home Message…
With structured or non-linear data, care needs to
be taken in what components should be
randomized
Summary
• Randomization requires fewer assumptions than standard parametric stats
• Significance tests test whether the observed
samples could be from the same pop.
• State hypotheses and determine significance
level
• Test statistics that yield the greatest power
should be utilized

Editor's notes

  1. FOR THIS EXAMPLE PICK OUT 3 cards at a time
  2. FOR THIS EXAMPLE PICK OUT 3 cards at a time
  3. Fewer assumptions are needed for randomization tests, giving extra flexibility.
  4. Parametric stats are good when assumptions are known