2. DATA ANALYSIS
Before the collected data can be utilized,
appropriate analytic methods must be
applied to meet the users' need for
information
Main consideration – OBJECTIVES for
which the data were collected
(Mendoza, et al Foundations of statistical
analysis for the health sciences 2009)
3. Organization and Presentation
of Data
Collected data – questionnaires, examination
papers, rating scales, interview transcription,
secondary data
Start by thoroughly reviewing all accomplished
instruments and other data
- have respondents answered all questions?
- are there any inconsistencies?
- verify identification numbers
- systematize your coding system
4. Coding
Assign numerical values to research
variables
Enter the RAW data into your
computer software
Encoding data into a computer
facilitates computation of statistical
testing
Ex. Microsoft Excel
8. After entering the data you are now
ready to process them
Sample table entry
9. BIOSTATISTICS deals with both qualitative and
quantitative data; either constants or variables
CONSTANT - phenomenon whose value remains the same from person to person, from time to time, from place to place
Examples: # minutes in an hour, pull of gravity, speed of light
VARIABLE - phenomenon whose values/categories cannot be predicted with certainty
Examples: age of gestation, smoking habit, attitudes towards certain issues, weight, educational attainment
(Mendoza, et al Foundations of statistical
analysis for the health sciences 2009)
10. VARIABLES
QUANTITATIVE - categories can be measured and ordered according to quantity/amount; values can be expressed numerically (discrete or continuous)
Examples: birth weight, hospital bed capacity, arm circumference, population size
QUALITATIVE - categories are used as labels to distinguish one group from another
Examples: sex, urban-rural, religion, region, disease status, occupation
(Mendoza, et al Foundations of statistical
analysis for the health sciences 2009)
11. Types of scales
Nominal (QL) - numbers refer to categories, groups, labels of data
Ex. measurement scale set for data collection
Ordinal (QL/QN) - can be ranked or ordered
Ex. disease severity (mild, moderate, severe)
Interval - exact distance between 2 categories can be determined; zero point is arbitrary
Ex. temperature, IQ
Ratio - zero point is fixed
Ex. weight, money
12. It is important to distinguish the type
of variable one is dealing with
- major determinant of type of
statistical technique
- type of graph that can be
constructed
- statistical measure that can be
computed
(Mendoza, et al Foundations of statistical
analysis for the health sciences 2009)
13. DESCRIPTIVE STATISTICS
Describe the characteristics of the
members of one group
No attempt to compare or relate these to
the characteristics of another group
Measures of central tendency and
variation
14. MEASURES OF CENTRAL
TENDENCY
Method of compressing a mass of
numerical data for better comprehension
and description of what it tends to portray
MEAN, MEDIAN, MODE – “typical” or
average values which may be utilized to
represent a series of observations
(Mendoza, et al Foundations of statistical
analysis for the health sciences 2009)
15. MEASURES OF CENTRAL
TENDENCY
A. MEAN (X̄) – arithmetic mean that
represents a set of scores with a single
number
Computed by dividing the sum of all scores
by the number of scores
16. MEASURES OF CENTRAL
TENDENCY
B. MEDIAN (Md)- 50th percentile
- Point above and below which half
of the scores fall
- Better choice than MEAN if there
are extreme values
18. MEASURES OF SPREAD OR
DISPERSION
A. RANGE
– difference between the highest
and lowest scores plus 1
19. MEASURES OF SPREAD OR
DISPERSION
B. VARIANCE – average of the
squared deviations from the MEAN
Computing for VARIANCE
1. Get the deviation score for each score by
subtracting the mean from it
2. Square each resulting deviation
3. Get the sum of all squared deviations
4. Divide the result by the number of subjects for the
population (N) or the number of subjects minus 1
(n-1) for a sample
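The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `variance` and its `sample` flag are assumptions made for the sketch, and the scores are the Group 1 scores used in the worked example in these slides.

```python
def variance(scores, sample=True):
    """Average of the squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    deviations = [x - mean for x in scores]      # step 1: score minus mean
    squared = [d ** 2 for d in deviations]       # step 2: square each deviation
    total = sum(squared)                         # step 3: sum of squared deviations
    # step 4: divide by n - 1 for a sample, or by N for a population
    divisor = len(scores) - 1 if sample else len(scores)
    return total / divisor


group1 = [46, 60, 65, 65, 70, 80, 90]
sample_var = variance(group1)                    # divides by n - 1
population_var = variance(group1, sample=False)  # divides by N
print(round(sample_var, 2), round(population_var, 2))
```

Dividing by n - 1 gives a slightly larger value than dividing by N, which matters most when the number of subjects is small.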
21. MEASURES OF SPREAD OR
DISPERSION
C. STANDARD DEVIATION
- indicates how much scores are
spread around the mean
- Square root of the variance
22. Ex. Scores of 2 groups of students:
Grp 1 : 46 60 65 65 70 80 90
Grp 2 : 62 66 68 70 70 70 70
          GROUP 1            GROUP 2
MEAN      68                 68
MEDIAN    65                 70
MODE      65                 70
RANGE     90-46+1 = 45       70-62+1 = 9
S.D.      14.13              3.06
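The summary figures for the two groups can be recomputed with Python's standard `statistics` module. This is a sketch for checking the arithmetic; the helper name `describe` is an assumption, and the range follows the slides' definition (highest minus lowest plus 1).

```python
import statistics


def describe(scores):
    """Central tendency and spread for one group of scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores) + 1,       # inclusive range, as defined in the slides
        "sd": round(statistics.stdev(scores), 2),     # sample S.D. (divides by n - 1)
    }


group1 = [46, 60, 65, 65, 70, 80, 90]
group2 = [62, 66, 68, 70, 70, 70, 70]
for name, scores in (("GROUP 1", group1), ("GROUP 2", group2)):
    print(name, describe(scores))
```

Both groups share the same mean, so only the measures of spread show how differently the scores are distributed.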
23. Variances and standard deviation in the sample distribution of scores
SCORES (Group 1)   Deviation of score from the mean (X̄)   Square of the deviation
46                 46 – 68 = -22                           484
60                 60 – 68 = -8                            64
65                 65 – 68 = -3                            9
65                 65 – 68 = -3                            9
70                 70 – 68 = 2                             4
80                 80 – 68 = 12                            144
90                 90 – 68 = 22                            484
VARIANCE = sum of squared deviations / (n – 1)
= (484 + 64 + 9 + 9 + 4 + 144 + 484) / (7 – 1) = 1198 / 6 = 199.67
STD. DEVIATION = square root of variance = 14.13
24. TESTS
- Also used to make inferences
- PARAMETRIC tests - for interval and
ratio variables, assuming that:
  - the sample was drawn from a
    normally distributed population
  - if two groups are analyzed, they
    have the same variance
25. TESTS COMPARING GROUPS
1. Tests to determine the difference
between TWO groups
2. Tests to determine the difference
among THREE or more groups
26. TESTS COMPARING GROUPS
1. Tests to determine the difference
between TWO groups
a. T-test for independent groups
b. T-test for paired data
27. TESTS COMPARING GROUPS
1. Tests to determine the difference
between TWO groups
a. T-test for independent groups
- detects statistically significant
differences between means
- for static group comparison or
randomized control group design
(compare scores of 2 unmatched groups)
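As an illustration of what the test computes, here is a minimal sketch of the pooled-variance t statistic for two unmatched groups, written with the standard library only. The function name `independent_t` and the scores are hypothetical; statistical software would also report the p-value, which this sketch leaves out.

```python
import math


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two unmatched groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)   # squared deviations, group A
    ss_b = sum((x - mean_b) ** 2 for x in group_b)   # squared deviations, group B
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df                  # assumes equal variances
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical post-instruction scores for two unmatched groups
group_a = [70, 72, 74, 76, 78]
group_b = [74, 76, 78, 80, 82]
t_stat, df = independent_t(group_a, group_b)
print(t_stat, df)
```

The statistic is compared against the t distribution with `df` degrees of freedom to decide whether the difference between the means is statistically significant.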
28. TESTS COMPARING GROUPS
a. T-test for independent groups
ex. 2 groups of slow learners
- Oral instruction group: mean post-instruction scores
- Videotape instruction group: mean post-instruction scores
T test for independent groups: which is more effective?
Dominguez (1985) “A comparative study of the
achievement of slow learners taught by oral tutorials with
those taught by self-instructional programmed videotapes.”
29. TESTS COMPARING GROUPS
1. Tests to determine the difference
between TWO groups
b. T-test for paired data
- identify statistically significant
changes in a single group
- or between matched groups
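For paired data the test works on the difference within each pair. The sketch below uses only the standard library; the function name `paired_t` and the pre-test and post-test scores are hypothetical.

```python
import math


def paired_t(before, after):
    """t statistic and degrees of freedom for paired (before/after or matched) scores."""
    diffs = [a - b for b, a in zip(before, after)]   # one difference per pair
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n), n - 1


# Hypothetical pre-test and post-test scores of the same five students
pretest = [10, 12, 9, 11, 13]
posttest = [12, 14, 10, 14, 15]
t_stat, df = paired_t(pretest, posttest)
print(round(t_stat, 4), df)
```

Because each subject serves as their own control, the degrees of freedom are the number of pairs minus 1, not the total number of scores.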
30. TESTS COMPARING GROUPS
2. Tests to determine the difference
among THREE or more groups
a. Univariate analysis of variance
(ANOVA)
b. Analysis of Covariance
(ANCOVA)
c. Multivariate analysis of variance
(MANOVA)
31. a. ANOVA
Univariate analysis of variance
- to determine significant difference
among 3 or more group means
(1 variable)
Ex. Posttest scores of students to compare
effectiveness of 3 instructional strategies
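A minimal sketch of the one-way ANOVA F ratio follows, assuming three hypothetical sets of posttest scores. The function name `one_way_anova` is an assumption for the sketch; it returns only F and its degrees of freedom, not the p-value.

```python
def one_way_anova(*groups):
    """F ratio with between-group and within-group degrees of freedom."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # variation of the scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical posttest scores under three instructional strategies
strategy_1 = [71, 72, 73]
strategy_2 = [72, 73, 74]
strategy_3 = [75, 76, 77]
f_ratio, df_b, df_w = one_way_anova(strategy_1, strategy_2, strategy_3)
print(f_ratio, df_b, df_w)
```

A large F ratio means the group means differ more than the spread within the groups would suggest.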
32. ANOVA Univariate analysis of variance
N = 168 2nd yr med students + 23 2nd yr physician assistant students, in 4 groups:
- Written matl’s (n = 47)
- Written + videotape (n = 46)
- Written + small group practice (SGP)
- Written + video + SGP
(n = 43 and n = 55 for the last two groups)
Students’ knowledge and skills were assessed after instruction to
determine any significant difference among the groups through
ONE-WAY ANOVA
“Teaching a screening musculoskeletal examination: A randomized
control trial of different instructional methods.” Lawry et al. (1999)
33. ANOVA Univariate analysis of variance
One-way ANOVA
Two-way ANOVA
4 X 2 ANOVA
Three-way ANOVA
34. b. ANCOVA Analysis of Covariance
- Used to control differences among
groups that existed before the study
- Usually used in quasi-experimental
designs
- Ex. When Pre-test means of groups are significantly
different from each other ANCOVA can be used to
adjust pretest scores so they can be treated as
identical
35. c. MANOVA
Multivariate analysis of variance
- Groups are compared with respect
to 2 or more dependent variables
36. TESTS TO DETERMINE THE RELATIONSHIP
AMONG VARIABLES IN A GROUP
1. Pearson product moment
correlation coefficient
(interval / ratio variables)
2. Regression
37. TESTS TO DETERMINE THE RELATIONSHIP
AMONG VARIABLES IN A GROUP
1. Pearson product moment correlation
coefficient (interval / ratio variables)
- when there are 2 scores per
subject
- study intends to determine how
these scores are related
Ex. Survey of pharmacists to determine work patterns
and whether other factors (age, gender, # years
in work force) affected the work patterns
(Knapp et al. 1992)
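The coefficient itself is straightforward to compute when each subject has two scores. The sketch below is an illustration with hypothetical data; the function name `pearson_r` and the variable names are assumptions and are not taken from the cited survey.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two paired lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: two scores per subject
score_1 = [1, 2, 3, 4, 5]
score_2 = [2, 4, 5, 4, 5]
r = pearson_r(score_1, score_2)
print(round(r, 4))
```

Values of r close to +1 or -1 indicate a strong linear relationship; values near 0 indicate little or none.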
38. TESTS TO DETERMINE THE RELATIONSHIP
AMONG VARIABLES IN A GROUP
1. Pearson product moment correlation coefficient (interval /
ratio variables)
2. Regression
- Simple regression – predicting one
variable from another variable
- Multiple regression – predicting values
of 1 variable on the basis of the values
of 2 or more variables
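Simple regression can be sketched as a least-squares line fitted through paired scores. The function name `simple_regression` and the data are hypothetical; multiple regression follows the same idea with more than one predictor and is usually left to statistical software.

```python
def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical predictor and outcome scores
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]
slope, intercept = simple_regression(predictor, outcome)
predicted = intercept + slope * 6      # predicted outcome for a new predictor value of 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```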
39. TESTS TO DETERMINE THE RELATIONSHIP
AMONG VARIABLES IN A GROUP
1. Pearson product moment correlation coefficient
2. Regression
Ex. Study to identify predictors of dental skill development -
whether commonly examined fine motor ability tests
(steadiness tester, mirror trace test) and maturational
tests (hand length, index finger length, wrist width)
were associated with early scaling and root-planing
skills in 120 dental students (Wilson, Waldman and McDonald 1991)
40. COMMONLY USED PARAMETRIC TESTS
USES: APPROPRIATE TESTS
Determining the differences among groups
- Between 2 related or matched groups: T-test for paired data
- Between 2 independent groups: T-test for independent groups
- Among 3 or more groups: ANOVA (1 dependent variable); ANCOVA (1 dep variable; quasi-exptl design); MANOVA (2 or more dep variables)
Determining the relationship among variables in a group
- Pearson product moment correlation coefficient
- Regression
41. NONPARAMETRIC TESTS
- nominal and ordinal variables
- when underlying assumptions for
parametric tests are not met
- for small sample size
42. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference
between TWO groups
(1) McNemar Change test
(2) Wilcoxon matched-pairs
signed- ranks test
(3) Permutation test for paired
replicates
43. NONPARAMETRIC TESTS
(4) Fisher exact test for 2 X 2 table
(5) Chi-square test (χ² test)
(6) Wilcoxon-Mann-Whitney test
(7) Robust rank-order test
(8) Kolmogorov-Smirnov two-
sample test
(9) Permutation test for two
independent samples
44. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(1) McNemar Change test
- For 2 related/matched nominal variables
- (Ex. Responses of a group of students on which 2 types
of instructional methods they prefer when asked before
and after being exposed to such methods)
- Observed frequencies of students’ preferred instructional method
Preferred instructional method   Preferred instructional method after exposure
before exposure                  Method A     Method B     Total
Method A
Method B
Total
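The McNemar test uses only the two cells of such a table in which students changed their preference. A minimal sketch follows, assuming hypothetical counts and including the usual continuity correction as an option; the function name `mcnemar_chi_square` is an assumption.

```python
def mcnemar_chi_square(a_to_b, b_to_a, continuity=True):
    """McNemar chi-square (1 degree of freedom) from the two 'changed' cells."""
    changed = a_to_b + b_to_a
    if changed == 0:
        raise ValueError("no subject changed category")
    diff = abs(a_to_b - b_to_a)
    if continuity:
        diff = max(diff - 1, 0)        # continuity correction
    return diff ** 2 / changed


# Hypothetical counts: 5 students moved from Method A to B, 15 moved from B to A
corrected = mcnemar_chi_square(5, 15)
uncorrected = mcnemar_chi_square(5, 15, continuity=False)
print(corrected, uncorrected)
```

Students who gave the same answer before and after exposure do not enter the statistic at all.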
45. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(2) Wilcoxon matched-pairs signed-
ranks test
- For 2 related samples; ordinal data
- Determines the direction of
differences within pairs or related
samples and relative magnitude of
those differences
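The ranking step behind this test can be sketched as follows: differences are ranked by size regardless of sign, and the ranks are then summed separately for positive and negative differences. The function name `signed_rank_sums` and the ratings are hypothetical, and the sketch stops at the rank sums rather than computing a p-value.

```python
def signed_rank_sums(first, second):
    """Sum of ranks for positive and for negative within-pair differences."""
    diffs = [b - a for a, b in zip(first, second) if b != a]   # zero differences are dropped
    ordered = sorted(diffs, key=abs)
    t_plus = t_minus = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2          # tied differences share the average of ranks i+1..j
        for d in ordered[i:j]:
            if d > 0:
                t_plus += avg_rank
            else:
                t_minus += avg_rank
        i = j
    return t_plus, t_minus


# Hypothetical paired ordinal ratings from six subjects
rating_1 = [3, 2, 4, 3, 2, 5]
rating_2 = [5, 4, 4, 4, 1, 5]
t_plus, t_minus = signed_rank_sums(rating_1, rating_2)
print(t_plus, t_minus)
```

The smaller of the two sums is the value looked up in a table of critical values.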
46. NONPARAMETRIC TESTS
(2) Wilcoxon matched-pairs signed- ranks
test
Ex. To determine whether there is a significant difference
in perceptions of graduates on their degree of
preparedness in various aspects of training during their
clinical fellowship and degree of importance in clinical
practice of those same aspects. (Atienza 2001)
Perceived degree of preparedness and importance of graduates
Graduates Perceived degree of Perceived degree of Difference
preparedness importance
Graduates A
Graduates B
etc.
47. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(3) Permutation test for paired
replicates
- one of most powerful tests for
paired observation
- variables on interval scale
- small sample size
48. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(4) Fisher exact test for 2 X 2 table
- nominal or ordinal data
- two independent samples
- sample size is small (n < 20)
- subjects fall in one of two classes
- Number of students who passed and failed
Variable    Group I    Group II    Combined
Pass
Fail
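For a 2 X 2 table like this one, the exact probability comes from the hypergeometric distribution. The sketch below computes a two-sided p-value with the standard library; the function name `fisher_exact_two_sided` and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p-value for the 2 X 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # probability of a table with x in the top-left cell, margins held fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # add up every table that is as likely as, or less likely than, the observed one
    return sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_observed + 1e-12)


# Hypothetical counts: Pass = 3 (Group I), 1 (Group II); Fail = 1 (Group I), 3 (Group II)
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```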
49. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(5) Chi-square test (χ² test)
- nominal or ordinal data
- to determine the difference between
2 independent groups (n > 20; each
of the expected frequencies is 5 or more)
- for examining the differences among
3 or more groups and
- for testing association between 2 or more
categorical variables
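The statistic compares each observed frequency with the frequency expected if the groups did not differ. A minimal sketch, assuming a hypothetical table of counts and a function name (`chi_square`) chosen for illustration:

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # expected frequency for this cell
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows are two independent groups, columns are two response categories
observed = [[10, 20],
            [20, 10]]
statistic, df = chi_square(observed)
print(round(statistic, 4), df)
```

Here every expected frequency is 15, so the condition on expected frequencies stated above is met.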
50. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(5) Chi-square test (χ² test)
Ex. Cross sectional survey of 545 doctors to
examine young physicians’ views on
professional issues (professional regulation,
multidisciplinary teamwork, priority setting,
clinical autonomy, private practice)
These variables were tested against
demographic variables like sex.
Specialty choice revealed marked sex
bias
51. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(6) Wilcoxon-Mann-Whitney test
- One of most powerful tests for data in
ordinal scale
- alternative to t-test
- Used to predict the difference between 2
independent samples from same
population or from populations with the
same/equal variances
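The U statistic behind this test can be obtained by counting, for every pair of scores drawn one from each sample, which score is higher. The sketch below is an illustration with hypothetical ordinal scores; the function name `mann_whitney_u` is an assumption, and no p-value is computed.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistics for two independent samples (ties count as one half)."""
    u_a = sum(
        1.0 if x > y else (0.5 if x == y else 0.0)
        for x in sample_a
        for y in sample_b
    )
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical ordinal scores from two independent groups
group_a = [3, 4, 2, 6]
group_b = [5, 7, 8, 9]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

The smaller U is the one compared against a table of critical values; a very small U means one group's scores are almost all lower than the other's.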
52. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(7) Robust rank-order test
- Does not assume that the 2
independent samples come from
the same population
- Does not require equal variances
for the 2 populations from which the
sample was taken
53. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(8) Kolmogorov-Smirnov two-sample
test
- 2 independent samples drawn from
the same population or populations
with the same distributions
- Powerful for small samples
54. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
a. Tests to determine the difference between TWO groups
(9) Permutation test for two
independent samples
- Powerful for testing the difference
between the means of two
independent samples when their
sample sizes are small
- Requires interval measurement
- No special assumptions about the
distributions of the populations
55. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
b. Tests to determine the difference
between THREE or more groups
(1) Cochran Q test
(2) Friedman two-way analysis of
variance by ranks
(3) Kruskal-Wallis one-way analysis
of variance
56. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
b. Tests to determine the difference between THREE or more
groups
(1) Cochran Q test
- Extension of McNemar test used for
2 or more related samples (nominal
variables)
- Used to analyze responses to a test
or questionnaire
57. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
b. Tests to determine the difference between THREE or more
groups
(2) Friedman two-way analysis of
variance by ranks
- For ordinal data
- to test if a number of repeated
measures or matched groups come
from the same population or
populations with the same median
58. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
b. Tests to determine the difference between THREE or more
groups
(2) Friedman two-way analysis of
variance by ranks
Three groups of subjects in four conditions
Group        Conditions / Variables
             Variable A    Variable B    Variable C    Variable D
Group I
Group II
Group III
59. NONPARAMETRIC TESTS
1. TESTS TO COMPARE GROUPS
b. Tests to determine the difference between THREE or more
groups
(3) Kruskal-Wallis one-way analysis
of variance
- for testing 3 or more independent
groups for ordinal data
- Ex. Testing for significant differences of
socioeconomic scores or attitudinal
scores based on specified criteria of
students from different regions in the
country
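A minimal sketch of the Kruskal-Wallis H statistic follows: all scores are ranked together, and the rank sums of the groups are compared. The function name `kruskal_wallis_h` and the scores are hypothetical, and the sketch omits the correction for ties that statistical software normally applies.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for 3 or more independent groups."""
    pooled = sorted(x for g in groups for x in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2     # tied scores share the average rank
        i = j
    n = len(pooled)
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical attitudinal scores of students from three regions
region_1 = [1, 2, 3]
region_2 = [4, 5, 6]
region_3 = [7, 8, 9]
h = kruskal_wallis_h(region_1, region_2, region_3)
print(round(h, 2))
```

H is referred to the chi-square distribution with (number of groups - 1) degrees of freedom when the groups are not too small.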
60. NONPARAMETRIC TESTS
2. MEASURES OF ASSOCIATION
(a) Pearson product moment correlation
coefficient (interval, ratio)
(b) Phi coefficient (nominal)
(c) Kappa coefficient of agreement
(nominal)
(d) Spearman rank-order correlation
coefficient (ordinal)
(e) Kendall coefficient (ordinal)
(f) Gamma statistic (ordinal)
61. USES BY LEVEL OF MEASUREMENT (NOMINAL, ORDINAL, INTERVAL)
Determining the difference among groups
Between 2 related/matched groups
- Nominal: McNemar change test
- Ordinal: Wilcoxon signed ranks test
- Interval: Permutation test for paired replicates
Between 2 independent groups
- Nominal: Fisher exact test for 2X2 table; Chi-square test
- Ordinal: Wilcoxon-Mann-Whitney test; Robust rank order test; Kolmogorov-Smirnov two-sample test
- Interval: Permutation test for 2 independent samples
Among 3 or more related groups
- Nominal: Cochran Q test
- Ordinal: Friedman 2-way analysis of variance by ranks
Among 3 or more independent groups
- Nominal: Chi-square test
- Ordinal: Kruskal-Wallis one-way analysis of variance
Determining association
- Nominal: Cramer coefficient; Phi coefficient; Kappa coefficient of agreement
- Ordinal: Spearman rank-order correlation coefficient; Gamma statistic
So we have already collected our data. But all these are just raw information. For it to be of any use to us, we have to apply analytic methods to the data. What is our main consideration? Of course, it is the OBJECTIVES of our study.
CODING - Assigning numerical values to research variables. After coding, you are ready to enter the RAW data into your computer software. Encoding research data into a computer facilitates computation of statistical tests (through selected software).
This is an example of data encoding using the 2004 study by Salvacion on the stress profile of students in the UP College of Dentistry. The researcher used questionnaires, tests, and inventories. The questionnaire asked 149 students for basic demographic data, like ID #, year level, sex, civil status and residence. These were some of the variables the researcher hypothesized to be related to the stress profile of the students. Since ID #s and year levels are real numbers, they could be entered into Excel without any coding. Sex can be coded as 1 for male, 2 for female; civil status is coded as 1 for single, 2 for married, etc. These numbers can now be entered in the Excel spreadsheet. So let's take the entry for the respondent with ID # 1, who is a 3rd yr student, male, single and lives in a dormitory within Ermita.
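The coding step described here can be sketched as a simple lookup. The dictionary and function names below are assumptions made for illustration; only the codes stated in the notes (sex and civil status) are used, and residence is left out because its codes are not given.

```python
# Codes taken from the notes: sex 1 = male, 2 = female; civil status 1 = single, 2 = married
SEX_CODES = {"male": 1, "female": 2}
CIVIL_STATUS_CODES = {"single": 1, "married": 2}


def encode_respondent(id_number, year_level, sex, civil_status):
    """Turn one respondent's answers into a row of numbers for a spreadsheet."""
    # ID number and year level are already numeric, so they are entered as they are
    return [id_number, year_level, SEX_CODES[sex], CIVIL_STATUS_CODES[civil_status]]


# Respondent with ID # 1: 3rd yr student, male, single
row = encode_respondent(1, 3, "male", "single")
print(row)
```

Keeping the codes in one place makes it easier to apply them consistently to every questionnaire.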
QUANTITATIVE VARIABLES - categories can be measured and ordered according to quantity/amount; values can be expressed NUMERICALLY (discrete - whole #s, or continuous - fractions and decimals)
QUALITATIVE VARIABLES - categories are used as labels to distinguish one group from another (not a basis for saying that one group is greater or less, higher or lower, better or worse than another)
It is important to distinguish the type of variable one is dealing with. It is a major determinant of the type of statistical technique applied to the data. It also determines the type of graph that can be constructed, as well as the statistical measure that can be computed from a given set of data.
In the next slide we will review the formula for getting the variance for a population and for a sample.
MEAN (X̄) – computed by dividing the sum of all scores by the number of scores
MEDIAN (Md) – 50th percentile; point above and below which half of the scores fall
MODE (Mo) – most frequently occurring score in the distribution
RANGE – difference between the highest and lowest scores plus 1
STANDARD DEVIATION – indicates how much scores are spread around the mean; square root of the variance
n – 1 because we are using a sample not a population
Normally distributed population
b. T-test for paired data - identifies statistically significant changes in a single group (e.g. pre-test and post-test) or between matched groups (e.g. pre-test scores of matched members of 2 groups, experimental and comparison)
Randomized post-test only control design. N = 168 2nd yr med students + 23 2nd yr physician assistant students, randomly divided into 4 groups given the different instructional methods. Students’ knowledge and skills were assessed after instruction to determine any significant difference among the groups through ONE-WAY ANOVA.
Quasi-experimental design
For nominal and ordinal variables. Applicable when underlying assumptions for parametric tests are not met. PARAMETRIC tests – for interval and ratio variables, assuming that: - sample was drawn from a normally distributed population - if two groups are to be analyzed they have the same variance. Useful for small sample size.