Lesson 23: Planning Data Analyses Using Statistics


INTRODUCTION
When the necessary data have been collected, the next step
is to organize the raw data for analysis.
The researcher must first check the data for accuracy,
consistency, and completeness, and arrange them
systematically to facilitate coding and tabulation.

Purpose of Data Analysis Plan
1. Describe data sets.
2. Determine the degree of relationship of variables.
3. Determine the differences between variables.
4. Predict outcomes.
5. Compare variables.

DATA ANALYSIS STRATEGIES
1. Exploratory Data Analysis
◦This type of data analysis is used when it is not clear
what to expect from the data. This strategy uses
numerical and visual presentations such as graphs.

2. Descriptive Data Analysis
◦This type of data analysis is used to describe, show, or
summarize data in a meaningful way, leading to a simple
interpretation of data.
◦The commonly used data analysis tools for descriptive
statistics are frequency, percentage, measures of central
tendency and measures of dispersion.

3. Inferential Data Analysis
◦Inferential statistics tests hypotheses about a set of data to
reach conclusions or make generalizations beyond merely
describing the data.
◦Inferential statistics include tests of significant difference,
such as the t-test and analysis of variance (ANOVA), and tests
of relationship, such as the Pearson product-moment
correlation, Spearman rho, linear regression, and the
chi-square test.

DESCRIPTIVE DATA ANALYSIS
1. MEASURES OF CENTRAL TENDENCY
◦A. Mean – often called the arithmetic average of a set of
data.
◦The symbol x̄ (x bar) is used to denote the arithmetic
mean.
◦Formula: $\bar{x} = \frac{\sum x}{n}$

DESCRIPTIVE DATA ANALYSIS
MEAN FOR UNGROUPED DATA
◦1. Find the mean of the measurements 18, 26, 27, 29, 30.
◦Formula: $\bar{x} = \frac{\sum x}{n} = \frac{18+26+27+29+30}{5} = \frac{130}{5} = 26$

Example 2: Find the mean of the following.
SCORES IN NATIONAL ACHIEVEMENT TEST
90 95 96 87 110
102 95 98 87 117
115 96 91 95 95
93 105 86 103 106
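The mean of ungrouped data can be checked with a few lines of Python. This is only a sketch: the function name `mean_ungrouped` and the variable names are illustrative and not part of the lesson, and the data are the two sets of values listed above.

```python
import statistics


def mean_ungrouped(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Measurements from the first worked example.
measurements = [18, 26, 27, 29, 30]

# National Achievement Test scores from Example 2.
nat_scores = [
    90, 95, 96, 87, 110,
    102, 95, 98, 87, 117,
    115, 96, 91, 95, 95,
    93, 105, 86, 103, 106,
]

print(mean_ungrouped(measurements))
print(mean_ungrouped(nat_scores))
# Cross-check against the standard library.
print(statistics.mean(nat_scores))
```

Running the sketch is a quick way to confirm a hand computation before moving on.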

Mean of Grouped Data
When the observations are grouped into classes, the mean is
$\bar{x} = \frac{\sum fx}{n}$
Where: f = frequency
x = numerical value or item in a set of data (the class
midpoint when the classes are intervals)
n = number of observations in the data set ($n = \sum f$).

Example 1: Find the mean height of 50 senior high school
students, summarized as follows.
Height (in inches)   Frequency (f)   Height × frequency (fx)
56                   6               336
57                   15              855
58                   12              696
59                   8               472
60                   5               300
61                   2               122
62                   2               124
TOTAL                50              2905

Solution
$\bar{x} = \frac{\sum fx}{n} = \frac{2905}{50} = 58.1$ inches

Example 2: Solve for the mean of the data below.
Class     Frequency (f)   Class Midpoint (x)   fx
76-80     3               78                   234
71-75     5               73                   365
66-70     6               68                   408
61-65     8               63                   504
56-60     10              58                   580
51-55     7               53                   371
46-50     7               48                   336
41-45     3               43                   129
36-40     1               38                   38
TOTAL     50                                   2965

Solution
$\bar{x} = \frac{\sum fx}{n} = \frac{2965}{50} = 59.3$
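The grouped-data computation (multiply each value or class midpoint by its frequency, add the products, divide by the total frequency) can be sketched in Python as follows. The function and variable names are illustrative assumptions; the data are the two frequency tables from the examples.

```python
def mean_grouped(values, frequencies):
    """Mean of grouped data: sum of f*x divided by n, where n is the sum of f."""
    n = sum(frequencies)
    total_fx = sum(f * x for f, x in zip(frequencies, values))
    return total_fx / n


# Example 1: heights in inches with their frequencies.
heights = [56, 57, 58, 59, 60, 61, 62]
height_freq = [6, 15, 12, 8, 5, 2, 2]

# Example 2: class intervals; the midpoint of each class stands in for x.
class_limits = [(76, 80), (71, 75), (66, 70), (61, 65), (56, 60),
                (51, 55), (46, 50), (41, 45), (36, 40)]
class_freq = [3, 5, 6, 8, 10, 7, 7, 3, 1]
midpoints = [(low + high) / 2 for low, high in class_limits]

print(mean_grouped(heights, height_freq))
print(mean_grouped(midpoints, class_freq))
```

Using class midpoints is an approximation: it treats every observation in a class as if it sat at the middle of that class.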

Measures of Dispersion
Measures of dispersion describe how widely the scores are
spread. They are the range, average deviation, standard
deviation, and variance.

The RANGE
It is the difference between the highest score and the
lowest score in a set of data.
Example: Find the range of the following data:
6, 10, 12, 15, 18, 18, 20, 23, 25, 28
The range is 28 − 6 = 22.

AVERAGE DEVIATION
This measure of spread is defined as the sum of the absolute
differences (deviations) between the values in a set of data
and the mean, divided by the total number of values in the
set of data.

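AVERAGE DEVIATION (UNGROUPED DATA)
Formula:
◦$AD = \frac{\sum |x - \bar{x}|}{n}$
◦Example: Find the average deviation of the following
scores: 20, 25, 35, 40, 45.
◦The mean is $\bar{x} = \frac{20+25+35+40+45}{5} = 33$.
◦$AD = \frac{|20-33| + |25-33| + |35-33| + |40-33| + |45-33|}{5}$
◦$AD = \frac{|-13| + |-8| + |2| + |7| + |12|}{5} = \frac{42}{5} = 8.4$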

EXAMPLE 2
A set of observations consists of 22, 60, 75, 85, 98.
Find the average deviation.
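A short Python sketch of the average deviation, assuming the usual definition (the mean of the absolute deviations from the mean). The function name `average_deviation` is illustrative; the data are the two sets of scores given in the examples.

```python
def average_deviation(values):
    """Average of the absolute deviations of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


scores_example_1 = [20, 25, 35, 40, 45]
scores_example_2 = [22, 60, 75, 85, 98]

print(average_deviation(scores_example_1))
print(average_deviation(scores_example_2))
```

The absolute value matters: without it the positive and negative deviations cancel and the sum is always zero.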

STANDARD DEVIATION
The standard deviation (SD) is a measure of the spread
or variation of data about the mean.
SD is the square root of the average squared deviation from
the mean; roughly, the typical distance of a score from the
mean.

STANDARD DEVIATION (UNGROUPED DATA)
Formula: $SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}}$
Example 1: Find the standard deviation of the following scores:
6, 10, 12, 15, 18, 18, 20, 23, 25, 28.
Step 1: Solve for the mean.
$\bar{x} = \frac{6+10+12+15+18+18+20+23+25+28}{10} = \frac{175}{10} = 17.5$

Step 2: Subtract the mean from each score: $x - \bar{x}$
Step 3: Square each difference from Step 2: $(x - \bar{x})^2$

SCORE (x)   x − x̄                  (x − x̄)²
6           6 − 17.5 = −11.5       132.25
10          10 − 17.5 = −7.5       56.25
12          12 − 17.5 = −5.5       30.25
15          15 − 17.5 = −2.5       6.25
18          18 − 17.5 = 0.5        0.25
18          18 − 17.5 = 0.5        0.25
20          20 − 17.5 = 2.5        6.25
23          23 − 17.5 = 5.5        30.25
25          25 − 17.5 = 7.5        56.25
28          28 − 17.5 = 10.5       110.25
TOTAL                              428.5

Computation
$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\frac{428.5}{10-1}} \approx 6.900$
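The three steps (find the mean, subtract it from each score, square and sum the differences) can be traced in Python. This is a sketch with illustrative names; it uses the n − 1 divisor from the formula and compares the result with `statistics.stdev` from the standard library, which uses the same divisor.

```python
import math
import statistics


def sample_sd(scores):
    """Sample standard deviation: sqrt of the sum of squared deviations over n - 1."""
    n = len(scores)
    mean = sum(scores) / n                                # Step 1
    deviations = [x - mean for x in scores]               # Step 2
    squared = [d ** 2 for d in deviations]                # Step 3
    return math.sqrt(sum(squared) / (n - 1))


scores = [6, 10, 12, 15, 18, 18, 20, 23, 25, 28]
mean_score = sum(scores) / len(scores)

# Print one row per score: the score, its deviation, and the squared deviation.
for x in scores:
    print(x, x - mean_score, (x - mean_score) ** 2)

sum_of_squares = sum((x - mean_score) ** 2 for x in scores)
print(sum_of_squares)
print(sample_sd(scores))
print(statistics.stdev(scores))  # library cross-check
```

Printing each row makes it easy to compare a hand-built table against the computed deviations line by line.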

INTERPRETATION OF THE STANDARD
DEVIATION
For scores that are approximately normally distributed:
1. Approximately 68% of the scores in the sample fall within one
standard deviation of the mean.
2. Approximately 95% of the scores in the sample fall within two
standard deviations of the mean.
3. Approximately 99.7% of the scores in the sample fall within three
standard deviations of the mean.
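To see how these percentages behave on real data, the share of scores within k standard deviations of the mean can be counted directly. The sketch below uses the ten scores from the standard deviation example; the function name `share_within` is an illustrative assumption. With so few scores, the observed shares should only be expected to match the rule roughly.

```python
import statistics


def share_within(scores, k):
    """Fraction of scores lying within k sample standard deviations of the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    inside = [x for x in scores if abs(x - mean) <= k * sd]
    return len(inside) / len(scores)


scores = [6, 10, 12, 15, 18, 18, 20, 23, 25, 28]

for k in (1, 2, 3):
    print(k, share_within(scores, k))
```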

INFERENTIAL STATISTICS
It refers to statistical measures and techniques that
allow us to use samples to make generalizations about
the population from which the samples were drawn.

1. Test of Significant Difference (t-test)
Between means
For independent samples (when the respondents consist
of two different groups, such as boys and girls, working
mothers and non-working mothers, healthy and malnourished
children, and the like).

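Case 1: $\sigma_1$, $\sigma_2$ known, or unknown with $n_1 \ge 30$ and $n_2 \ge 30$
Formula
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
With large samples and unknown $\sigma_1$, $\sigma_2$, the sample
standard deviations $s_1$, $s_2$ are used in their place.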

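Case 2: $\sigma_1 \ne \sigma_2$, both unknown, and $n_1 < 30$, $n_2 < 30$
Formula
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
(df = smaller of $n_1 - 1$ or $n_2 - 1$)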

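Case 3: $\sigma_1 = \sigma_2$, both unknown, and $n_1 < 30$, $n_2 < 30$
Formula
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$
where $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled variance
(df = $n_1 + n_2 - 2$)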
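The three cases can be sketched as small Python functions that work from summary statistics. Several things here are assumptions rather than part of the lesson: the function names, the use of the sample standard deviations s1 and s2 in Cases 2 and 3, the standard pooled-variance form for Case 3, and the input numbers, which are made up purely for illustration. The hypothesized difference `mu_diff` defaults to 0, the usual null hypothesis of equal means.

```python
import math


def z_statistic(mean1, mean2, sigma1, sigma2, n1, n2, mu_diff=0.0):
    """Case 1: z statistic for large samples or known standard deviations."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((mean1 - mean2) - mu_diff) / se


def t_unequal_variances(mean1, mean2, s1, s2, n1, n2, mu_diff=0.0):
    """Case 2: t statistic, unequal variances; df = smaller of n1 - 1 and n2 - 1."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    df = min(n1 - 1, n2 - 1)
    return ((mean1 - mean2) - mu_diff) / se, df


def t_pooled(mean1, mean2, s1, s2, n1, n2, mu_diff=0.0):
    """Case 3: t statistic with pooled variance; df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return ((mean1 - mean2) - mu_diff) / se, df


# Hypothetical summary statistics, for illustration only.
z = z_statistic(10, 8, 3, 4, 36, 64)
t2, df2 = t_unequal_variances(10, 8, 3, 4, 9, 16)
t3, df3 = t_pooled(10, 8, 2, 2, 10, 10)

print(z)
print(t2, df2)
print(t3, df3)
```

The computed statistic is then compared with the critical value from a z or t table at the chosen significance level, using the degrees of freedom shown.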

ANALYSIS OF VARIANCE
ANOVA is used to test, in a single analysis, whether the
means of two or more groups differ significantly.

2. Test of Relationship
1. Spearman rank-order correlation, or Spearman rho.
◦This is used when the data available are expressed in terms of ranks
(ordinal variable).
◦Formula: $\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
◦Where: D = difference between the paired ranks
◦ N = number of pairs
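A minimal Python sketch of the Spearman rho formula, taking two lists of ranks as input. The function name and the ranks below are hypothetical, chosen only to illustrate the arithmetic; the shortcut formula also assumes there are no tied ranks.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rho from paired ranks: 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    n = len(ranks_x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks given to five students by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
print(rho)
```

A value near +1 means the two rankings largely agree, a value near −1 means they are nearly reversed, and a value near 0 means little relationship.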

2. Chi-Square Test for Independence.
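◦This is used when data are expressed in terms of frequencies or
percentages (nominal variable).
◦Formula:
◦$\chi^2 = \sum \frac{(O - E)^2}{E}$, with df = (r − 1)(c − 1)
◦Where: O = observed frequency, E = expected frequency,
r = number of rows, c = number of columns, and
$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$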
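The chi-square computation can be sketched in Python for a table of observed frequencies. The function name and the 2 × 2 table are hypothetical and serve only to show how each expected frequency comes from the row total, column total, and grand total.

```python
def chi_square_independence(observed):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency under independence.
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Hypothetical counts: rows are two groups, columns are two responses.
table = [[10, 20],
         [20, 10]]

chi2, df = chi_square_independence(table)
print(chi2, df)
```

The statistic is compared with the critical chi-square value for the given degrees of freedom to decide whether the two variables are independent.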

3. Pearson Product-Moment Coefficient of Correlation
◦This is used when data are expressed in terms of scores, such as
weights, heights, or test scores (ratio or interval variables).
◦Formula:
◦$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}}$

t-test for the significance of Pearson r
◦Formula:
◦$t = r \sqrt{\frac{n - 2}{1 - r^2}}$ (df = n − 2)
◦Where: r = correlation coefficient
◦ n = number of paired observations
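The raw-score formula for Pearson r and the t statistic for its significance can be sketched in Python as below. The function names and the five (x, y) pairs are hypothetical, used only to show the sums the formula needs.

```python
import math


def pearson_r(x, y):
    """Pearson r from raw scores using the computational (sum-based) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


def t_for_r(r, n):
    """t statistic for testing whether r differs from zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical paired scores.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

r = pearson_r(x_scores, y_scores)
t = t_for_r(r, len(x_scores))
print(r, t)
```

Note that `t_for_r` is undefined when r is exactly 1 or −1, since the denominator 1 − r² becomes zero.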
