Analysis 101
Brian Wells
Today’s Objectives
• Not to teach you the mathematics involved
• Not to make you an expert statistician
• Not to make you an expert in picking tests and
designing studies
• Is to highlight different analytic and statistical
methods in research
• Is to help facilitate communication between
investigators and biostatisticians by establishing a
common vocabulary
Data Types
• Numerical data (quantitative)
• Measurements or counts
• Weight, blood pressure, number of medications
• Categorical data (qualitative)
• Patients sorted into categories
• Diabetic/non-diabetic
• Adherent/non-adherent
• Smoking/non-smoking
Categorical Data
• Nominal
• No explicit ordering to categories
• Blood types – A/B/AB/O
• Race/Ethnicity
• Called binary or dichotomous if 2 categories
• Gender – M/F
• Ordinal
• Defined ordering
• Cancer stage I, II, III, IV
• Non-smoker/ex-smoker/smoker
• NYHA Class
Numerical Data
• Can be further subdivided into discrete and
continuous
• Discrete variables
• Have a limited number of possible values (finite or
countably infinite)
• Gaps between possible values (e.g., whole numbers)
• Ex: Number of CHF episodes, number of medications
• Continuous variables
• No gaps between possible values
• Ex: Duration of seizure, body mass index, height
Determining Data Types
• Ordinal (Categorical) v. Discrete (Numerical)
• Ordinal
• Cancer Stage I, II, III, IV
• Cancer Stage II is not 2*Stage I
• Discrete
• Number of children: 0, 1, 2…
• 4 children = 2 times 2 children
So Why Spend Time On This?
• The data types help determine which analysis to
use
• It helps determine how best to summarize and
display the data
• Categorical – percentages, fractions, counts in
categories
• Numerical – mean, median, mode, standard
deviation, variance, interquartile range
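As a minimal sketch of how the data type drives the summary, the snippet below summarizes one categorical and one numerical variable using only the Python standard library. The data and the function names (`summarize_categorical`, `summarize_numerical`) are invented for illustration and are not from the slides.

```python
from collections import Counter
import statistics

# Hypothetical data, invented for illustration only.
smoking_status = ["smoker", "non-smoker", "non-smoker", "smoker", "non-smoker"]
num_medications = [0, 2, 3, 3, 7]


def summarize_categorical(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


def summarize_numerical(values):
    """Return common summary statistics for a numerical variable."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,                  # interquartile range
    }


print(summarize_categorical(smoking_status))
print(summarize_numerical(num_medications))
```

Counts and percentages make sense for categories; a mean of category labels would not.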
Data Summaries
• Be careful of overreliance on numbers – Keep the
big picture in mind (more on this next time)
• Both means = 2, SD = 1.9, n = 1000
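The figure for this slide is not part of the text, so the following is only a sketch of the general point, assuming it compared two differently shaped data sets. It builds one symmetric and one skewed sample (n = 1000 each), rescales both to mean 2 and SD 1.9, and shows that the medians still differ. The random seed and the choice of distributions are assumptions made for illustration.

```python
import random
import statistics


def rescale(values, target_mean, target_sd):
    """Linearly rescale values to have the target mean and sample SD."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [target_mean + target_sd * (v - m) / s for v in values]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two samples with different shapes, forced to share mean and SD.
symmetric = rescale([rng.gauss(0, 1) for _ in range(1000)], 2, 1.9)
skewed = rescale([rng.expovariate(1.0) for _ in range(1000)], 2, 1.9)

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    print(
        name,
        "mean:", round(statistics.mean(data), 2),
        "sd:", round(statistics.stdev(data), 2),
        "median:", round(statistics.median(data), 2),
    )
```

Identical means and SDs do not imply identical data, which is why plotting the data alongside the summary numbers matters.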
Statistical Inference
• Estimation of quantity of interest
• Estimate itself
• Quantify how good an estimate it is
• Ex: If you took more and more samples, how much
would the estimate vary?
• Hypothesis testing
Statistical Inference Example
• Proportion of people in a population who have diabetes.
N = 800
• Sample 1: 200/800 = 0.25
• We conclude that the estimated % of people with
diabetes is 25%
• But how variable is our estimate?
• We need to know the sampling distribution!
• Option 1: Take lots and lots of samples
• Sample 2: 215/800 = 26.9%
• Sample 3: 194/800 = 24.25%
• Not practical!
Statistical Inference Example
• Statistical theory
• Sampling distributions for means and proportions are
approximately normal (“bell-shaped”) when N is large
• From a single sample, we calculate the standard error
(variability) of our estimated mean or proportion
• Standard error measures the variability of the sample
statistic. Small SE means more precise estimate.
• SE ≠ Standard Deviation
• SD = variability of the sample data
• SE = variability of the statistic
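To make the SD/SE distinction concrete, here is a small sketch that computes the standard error for the diabetes example (200 of 800) using the usual formula for a sample proportion, SE = √(p(1 − p)/n), and the standard error of a mean, SE = SD/√n. The function names are illustrative, not from the slides.

```python
import math


def proportion_se(successes, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    p = successes / n
    return math.sqrt(p * (1 - p) / n)


def mean_se(sd, n):
    """Standard error of a sample mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


p_hat = 200 / 800
se = proportion_se(200, 800)
print(f"estimate = {p_hat:.3f}, SE = {se:.4f}")

# The SE shrinks as n grows even though the SD of the data does not.
print(mean_se(14.0, 16), mean_se(14.0, 100))
```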
Distributions
• Sample means follow a t-distribution only if
• Underlying data is approximately normal OR
• N is large
• A standardized sample mean from a sample of size n follows
a t-distribution with n-1 degrees of freedom (t_{n-1})
Confidence Intervals
• Assume we use our t15 distribution with n = 16, mean SBP
= 123.4 mm Hg, and SD = 14.0 mm Hg
• SE of mean = SD / √n = 14.0 / 4 = 3.5
• 95% CI for the population mean is then
• Mean ± 2.131 (for t15 distribution) * SE
• = 123.4 ± 2.131 * 3.5
• = (115.9, 130.9) mm Hg
• As N gets larger, the critical t value gets smaller (t99 = 1.984)
and the SE shrinks; with the same mean and SD but N = 100,
SE = 1.4 and the CI narrows to (120.6, 126.2)
• Note: It’s never incorrect to use a t-distribution as long as
the underlying population is normal or N is large
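The interval arithmetic on this slide can be sketched in a few lines. The critical values 2.131 (t15) and 1.984 (t99) are taken from the slide and passed in directly rather than looked up; the function name `t_confidence_interval` is an assumption for illustration.

```python
import math


def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * SD / sqrt(n)."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se     # half-width of the interval
    return mean - margin, mean + margin


# n = 16 with the t15 critical value from the slide
low16, high16 = t_confidence_interval(123.4, 14.0, 16, 2.131)
# n = 100 with the t99 critical value from the slide
low100, high100 = t_confidence_interval(123.4, 14.0, 100, 1.984)

print(f"n = 16:  ({low16:.1f}, {high16:.1f}) mm Hg")
print(f"n = 100: ({low100:.1f}, {high100:.1f}) mm Hg")
```

Most of the narrowing comes from the smaller SE (3.5 versus 1.4); the smaller critical value contributes comparatively little.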
Hypothesis Testing
• Confidence intervals told us the best estimate and the
variability of the best estimate
• Hypothesis testing tells us if there really is a difference
between an observed value and another value
• From our earlier example: N = 800, we estimated that
25% of people had diabetes
• Let’s say a study 10 years prior estimated that 12% of
people had diabetes
• Has the percent of people with diabetes really changed?
Hypothesis Testing
• Suppose the true percent of people with diabetes is 12%
• Called the null hypothesis or H0
• How likely is it that we would observe a result as or more
extreme than 25% given the true percent is 12%?
• This is the p-value, computed using normal distributions for
sample proportions and t-distribution for sample means
• If the probability is small, conclude the supposition may not be
right
• Reject the null hypothesis in favor of the alternative
hypothesis Ha
• If the probability is not small, conclude that there is
insufficient evidence to reject the null hypothesis
• This is NOT the same as accepting the null or showing the
null hypothesis is true
Hypothesis Testing
• H0: True proportion is 12%
• Ha: True proportion is not 12%
• If P < 0.05, we would conclude it is not likely to observe
our data if the true proportion were 12%
• We conclude that this is sufficient evidence that the
proportion with diabetes is not 12%
• Test can be one-sided or two-sided
• One-sided ONLY ok if the direction is specified in advance
(e.g., previous research suggests that the proportion is larger)
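As a rough sketch of the test described above, the following uses the normal approximation for a sample proportion to test H0: p = 0.12 against the observed 200/800. The standard error is computed under the null, and the two-sided p-value comes from the standard normal tail via `math.erfc`. The function name is illustrative; a statistics package would normally do this for you.

```python
import math


def one_sample_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: true proportion == p0 (normal approximation)."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p_value = one_sample_proportion_z_test(200, 800, 0.12)
print(f"z = {z:.2f}, two-sided p = {p_value:.3g}")
if p_value < 0.05:
    print("Reject H0: the data are unlikely if the true proportion were 12%")
else:
    print("Insufficient evidence to reject H0")
```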
Misinterpreting the p-value
• A p-value of 0.32 (or > 0.05) DOES NOT mean:
• We accept the null
• There is a 32% chance the null is true
• It only lets us reject the null in favor of the alternative or
fail to reject the null
• If you fail to reject, it DOES NOT mean the alternative isn’t
true. It may mean your N is too small or the study is
underpowered.
Decision-Making
Other Statistics
• Some statistics are distribution-free
• Recall that t-tests/distributions depend on normality or
large N’s
• What if we don’t have one or both of these, e.g., skewed
data, N is small
• We can use nonparametric methods that look at ranks,
not means
• The median is a nonparametric estimate
Nonparametric Methods
• Don’t require a particular distribution
• Well-suited to hypothesis testing
• Not as useful for point estimates or CIs
• Especially useful if data is ranks or scores – Apgar scores,
Vision (20/20, 20/40)
• Do inferences on median values
• One hypothesis test is the Sign Test
• Assumes hypothesized value of median is correct, so we
expect to observe about half the sample above and
half below
• Computes probability for proportion above median
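The sign test outlined above can be sketched with an exact binomial calculation: under H0 each observation is equally likely to fall above or below the hypothesized median. The Apgar-like scores below are invented for illustration, and dropping values equal to the hypothesized median is a common convention assumed here, not something stated on the slide.

```python
from math import comb


def sign_test(values, median0):
    """Exact two-sided sign test of H0: population median == median0."""
    above = sum(v > median0 for v in values)
    below = sum(v < median0 for v in values)
    n = above + below          # values equal to median0 are dropped
    k = min(above, below)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)


# Hypothetical scores, invented for illustration only.
scores = [8, 9, 7, 9, 10, 8, 6, 9, 8, 9, 10, 8]
above, below, p_value = sign_test(scores, median0=7)
print(f"above = {above}, below = {below}, p = {p_value:.4f}")
```

Only the direction of each observation relative to the hypothesized median is used, which is why no distributional shape has to be assumed.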
Parametric v. Nonparametric
• Nonparametric methods are almost always ok to use
• Nonparametric methods are generally more conservative than
parametric ones
• In fact, 95% CIs for medians are sometimes twice as
wide as those for the mean
• If your N is fairly large, or if you know your data is normal,
parametric is usually the better choice (more power)
How To Select A Test
• Start by asking, “Am I testing for a difference or a
relationship in my data?”
Difference Testing
• Am I testing one sample or more than one sample?
• One sample – Is my data parametric?
• Yes – One sample t-test
• No – Wilcoxon Signed Rank Test
Difference Testing
• More than one sample – Is my data nominal, or
ordinal/interval/ratio?
• Nominal – Chi-Squared test
• Ordinal/interval/ratio – How many dependent
variables are there?
• Two or more – Multivariate Analysis of Variance
(MANOVA)
Difference Testing
• One – Are the measures repeated, independent, or
mixed?
• Mixed – Mixed Model ANOVA
• Independent
• How many conditions are there?
• Two conditions
• Parametric data – Independent samples t-test
• Non-parametric data – Mann-Whitney U test
• More than two
• Parametric – Between Participants (One-Way)
ANOVA
• Non-parametric – Kruskal-Wallis
• Repeated
Difference Testing
• One – Are the measures repeated, independent, or
mixed?
• Repeated
• Two Conditions
• Parametric – Paired Samples t-test
• Non-parametric – Wilcoxon Matched Pairs
• More than two conditions
• Parametric – Within Participants ANOVA
• Non-parametric – Friedman’s ANOVA
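The difference-testing slides form a decision tree, which can be written down as a small lookup function. This is only a sketch of the slides' flow: the function name, parameter names, and exact test labels are assumptions made for illustration, and it does not check whether the data actually meet a test's assumptions.

```python
def select_difference_test(one_sample, parametric, nominal=False,
                           dependent_vars=1, design="independent",
                           conditions=2):
    """Walk the difference-testing decision tree from the slides.

    All parameter names are hypothetical; they mirror the slide questions.
    """
    if one_sample:
        return "One-sample t-test" if parametric else "Wilcoxon signed-rank test"
    if nominal:
        return "Chi-squared test"
    if dependent_vars >= 2:
        return "MANOVA"
    if design == "mixed":
        return "Mixed-model ANOVA"
    if design not in ("independent", "repeated"):
        raise ValueError(f"unknown design: {design!r}")
    if conditions < 2:
        raise ValueError("need at least two conditions")

    # Key: (design, exactly two conditions?, parametric?)
    table = {
        ("independent", True, True): "Independent-samples t-test",
        ("independent", True, False): "Mann-Whitney U test",
        ("independent", False, True): "One-way (between-participants) ANOVA",
        ("independent", False, False): "Kruskal-Wallis test",
        ("repeated", True, True): "Paired-samples t-test",
        ("repeated", True, False): "Wilcoxon matched-pairs test",
        ("repeated", False, True): "Within-participants ANOVA",
        ("repeated", False, False): "Friedman's ANOVA",
    }
    return table[(design, conditions == 2, bool(parametric))]


print(select_difference_test(False, True, design="repeated", conditions=2))
print(select_difference_test(False, False, design="independent", conditions=4))
```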
Relationship Testing
• Single Independent variable
• Parametric – Pearson’s Correlation
• Non-parametric – Spearman’s Correlation
• Multiple Independent variables
• Parametric – Multiple Regression
• Non-parametric – Logistic Regression
• Multiple Factors Correlation Matrix
• Factor analysis
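For the single-variable case, the difference between Pearson's and Spearman's correlation can be sketched with the standard library alone: Spearman's coefficient is Pearson's coefficient computed on ranks. The helper names and the toy data are assumptions for illustration; tied values receive average ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data: a perfectly monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson = {pearson(x, y):.3f}, Spearman = {spearman(x, y):.3f}")
```

Because Spearman's coefficient uses only ranks, it reaches 1 for any strictly increasing relationship, while Pearson's measures how close the points are to a straight line.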
Model Information
• The specifics of each model (how they differ, how they’re
calculated, etc.) are not important for our purposes
• What is important is to be able to select the correct test
• Selecting the wrong test WILL lead to wrong conclusions
(failing to reject the null, inappropriately rejecting the
null)
Going Further
• There are many, many more tests we did not cover
• Durbin-Watson
• Kolmogorov-Smirnov
• Anderson-Darling
• Cox Proportional Hazards
• Kaplan-Meier Survival Analysis
• And so on…
• However, the tests presented will cover the majority of
basic studies done
