Measures of Variation and Standard Deviation

Measures of variation are used to determine
the scatter of values in a distribution. In this
chapter, we will consider six
measures of variation: the range,
quartile deviation, mean deviation,
variance, standard deviation and
the coefficient of variation.

o Range
The difference between the highest and
lowest values in the distribution.
RANGE = H - L
Where: H = the highest value
L = the lowest value

 Ungrouped Data
Subtract the lowest score from the highest
score.
Example: Find the range of a distribution if the
highest score is 100 and the lowest score is 21.
Solution:
Range = highest score - lowest score
= 100 - 21
= 79

Grouped Data
To find the range for a frequency
distribution, take the difference
between the upper boundary of the highest
class interval and the lower boundary of the lowest
class interval.

Example: Find the range for the frequency
distribution below.

Class interval    Frequency
100-104            4
105-109            6
110-114           10
115-119           13
120-124            8
125-129            6
130-134            3
                  N = 50

Range = Upper Boundary of Highest Class - Lower Boundary of Lowest Class
= 134.5 - 99.5
= 35
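
The grouped-data rule above can be sketched in a few lines of Python. This is only an illustration: the function name `grouped_range` is made up here, and it assumes integer class limits, so that each class boundary lies 0.5 beyond the stated limit.

```python
def grouped_range(class_limits):
    """Range of a frequency distribution, measured between class boundaries.

    class_limits: list of (lower_limit, upper_limit) pairs.
    Assumption: limits are whole numbers, so boundaries sit 0.5 outside them.
    """
    lowest_boundary = min(lower for lower, _ in class_limits) - 0.5
    highest_boundary = max(upper for _, upper in class_limits) + 0.5
    return highest_boundary - lowest_boundary


classes = [(100, 104), (105, 109), (110, 114), (115, 119),
           (120, 124), (125, 129), (130, 134)]
print(grouped_range(classes))  # 35.0
```

The frequencies are not needed: the range depends only on the outermost class boundaries.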

Quartile Deviations and Mean Deviations

o Quartile Deviations

The quartile deviation is a measure that describes the
dispersion in terms of the distance between selected
observation points. The smaller the quartile
deviation, the greater the concentration in the middle
half of the observations in the data set.
Quartile deviations are measures of variation which use
percentiles, deciles, or quartiles.
Quartile Deviation (QD) means half the
difference between the upper quartile (Q3) and the lower
quartile (Q1) in a distribution. Q3 - Q1 is referred to as
the interquartile range.

Formula:

$$QD = \frac{Q_3 - Q_1}{2}$$

where Q1 and Q3 are the first and third quartiles
and Q3 - Q1 is the interquartile range.

A. Ungrouped Data
Example: Given the data below:
33, 52, 58, 41, 56, 71, 77, 74, 85, 45, 82, 50, 62,
51, 67, 79, 48, 83, 43, 81, 38, 79, 65, 68, 59

Solution: Arrange the 25 entries (n = 25) from lowest to highest.
33, 38, 41 (3rd entry), 43, 45, 48 (6th entry), 50, 51, 52, 56,
58, 59, 62, 65, 67, 68, 71, 74, 77 (19th entry), 79,
79, 81, 82 (23rd entry), 83, 85

For the semi-interquartile range:
Since Q3 = P75 and Q1 = P25, we use P75 and P25.
For P75:
Cum. freq. of P75 = (75/100) x 25 = 18.75, or 19
This means that P75 is the 19th entry.
Therefore, P75 = 77

For P25:
Cum. freq. of P25 = (25/100) x 25 = 6.25, or 6, which means that P25 is the 6th entry.
P25 = 48
Semi-interquartile range = (Q3 - Q1)/2 = (P75 - P25)/2 = (77 - 48)/2 = 29/2
Hence the semi-interquartile range = 14.5
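
The position rule used in this solution can be sketched in Python as follows. The helper name `percentile_entry` is hypothetical, and the rounding rule (round the position j x n / 100 to the nearest whole entry) is an assumption read off the worked example. Library routines such as NumPy's percentile functions use other conventions and may return different quartiles.

```python
def percentile_entry(sorted_values, j):
    """Return P_j as the entry at position j*n/100, rounded to the nearest entry.

    Positions are 1-based, matching the "19th entry" wording of the example.
    """
    n = len(sorted_values)
    position = int(j * n / 100 + 0.5)      # round half up
    position = max(1, min(n, position))    # keep the position inside the list
    return sorted_values[position - 1]


data = [33, 52, 58, 41, 56, 71, 77, 74, 85, 45, 82, 50, 62,
        51, 67, 79, 48, 83, 43, 81, 38, 79, 65, 68, 59]
ordered = sorted(data)
p75 = percentile_entry(ordered, 75)
p25 = percentile_entry(ordered, 25)
quartile_deviation = (p75 - p25) / 2
print(p25, p75, quartile_deviation)  # 48 77 14.5
```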

B. Grouped Data
Example:
Class Intervals     f    <cf
21-23               3     3
24-26               4     7
27-29               6    13
30-32              10    23
33-35               5    28
36-38               2    30
                 n = 30

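Solution:
Note that Q3 - Q1 = P75 - P25. For grouped data, each percentile is found by interpolation:
$$P_j = L + \frac{\frac{jn}{100} - F}{f} \, c$$
where L is the lower boundary of the class containing P_j, F is the cumulative frequency below that class, f is the frequency of that class and c is the class width.
For P75:
Cum. freq. of P75 = (75/100) x 30 = 22.5, which falls in the class 30-32
L = 29.5, f = 10, F = 13, c = 3, j = 75
P75 = 29.5 + ((22.5 - 13)/10) x 3 = 32.35
For P25:
Cum. freq. of P25 = (25/100) x 30 = 7.5, which falls in the class 27-29
L = 26.5, f = 6, F = 7, c = 3, j = 25
P25 = 26.5 + ((7.5 - 7)/6) x 3 = 26.75
Finally, the interquartile range is P75 - P25 = 32.35 - 26.75 = 5.6, so the quartile deviation is 5.6/2 = 2.8.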
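
As a rough check of the grouped-data solution, the interpolation can be sketched in Python. The function name `grouped_percentile` and the tuple layout are illustrative choices, and the code assumes integer class limits (boundaries 0.5 outside the limits, class width = upper - lower + 1).

```python
def grouped_percentile(classes, j):
    """Interpolated percentile P_j = L + ((j*n/100 - F) / f) * c.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    """
    n = sum(freq for _, _, freq in classes)
    target = j * n / 100            # cumulative frequency of P_j
    cumulative = 0                  # F: cumulative frequency below the class
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            boundary = lower - 0.5          # L: lower class boundary
            width = upper - lower + 1       # c: class width
            return boundary + (target - cumulative) / freq * width
        cumulative += freq
    raise ValueError("j must be between 0 and 100")


table = [(21, 23, 3), (24, 26, 4), (27, 29, 6),
         (30, 32, 10), (33, 35, 5), (36, 38, 2)]
p75 = grouped_percentile(table, 75)
p25 = grouped_percentile(table, 25)
print(round(p75, 2), round(p25, 2), round(p75 - p25, 2))  # 32.35 26.75 5.6
```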

o Mean Deviation
The mean deviation or average
deviation is the arithmetic mean of
the absolute deviations from the mean, written here as MD:
$$MD = \frac{\sum_{i=1}^{N} |x_i - \bar{x}|}{N}$$

Example:
Calculate the mean deviation of the following distribution:
9, 3, 8, 8, 9, 8, 9, 18
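
One way to work this example is the short Python sketch below, which takes the mean of the absolute deviations from the arithmetic mean. The function name `mean_deviation` is illustrative only.

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


values = [9, 3, 8, 8, 9, 8, 9, 18]
print(sum(values) / len(values))  # 9.0
print(mean_deviation(values))     # 2.25
```

Here the mean is 72/8 = 9, the absolute deviations add up to 18, and 18/8 = 2.25.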

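Mean Deviation for Grouped Data
If the data is grouped in a frequency table, the expression of the mean deviation (MD) is:
$$MD = \frac{\sum_{i=1}^{k} |x_i - \bar{x}| \, f_i}{N}$$
where x_i is the midpoint of class i, f_i is its frequency and N is the sum of the frequencies.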

Example:
Calculate the mean deviation of the following distribution:

Class       xi      fi    xi · fi    |xi - x̄|    |xi - x̄| · fi
[10, 15)    12.5     3      37.5       9.286        27.858
[15, 20)    17.5     5      87.5       4.286        21.43
[20, 25)    22.5     7     157.5       0.714         4.998
[25, 30)    27.5     4     110         5.714        22.856
[30, 35)    32.5     2      65        10.714        21.428
Total               21     457.5                    98.57

Mean: x̄ = 457.5 / 21 = 21.786
Mean deviation = 98.57 / 21 = 4.69
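
The table can be reproduced with a small Python sketch. It assumes, as the table does, that the class midpoint stands in for every value in the class. The function name `grouped_mean_deviation` is made up for this illustration.

```python
def grouped_mean_deviation(classes):
    """Return (mean, mean deviation) for a grouped frequency table.

    classes: list of (lower, upper, frequency); the midpoint (lower + upper) / 2
    represents each class.
    """
    n = sum(freq for _, _, freq in classes)
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    freqs = [freq for _, _, freq in classes]
    mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
    total_abs_dev = sum(abs(m - mean) * f for m, f in zip(midpoints, freqs))
    return mean, total_abs_dev / n


table = [(10, 15, 3), (15, 20, 5), (20, 25, 7), (25, 30, 4), (30, 35, 2)]
mean, md = grouped_mean_deviation(table)
print(round(mean, 3), round(md, 3))  # 21.786 4.694
```

The third decimal differs slightly from a hand calculation that rounds each deviation to three places before summing.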

o Variance
In probability theory and statistics,
variance measures how far a set of numbers
is spread out. A variance of zero indicates that
all the values are identical. Variance is always
non-negative: a small variance indicates that
the data points tend to be very close to
the mean (expected value) and hence to each
other, while a high variance indicates that the
data points are very spread out around the
mean and from each other.

• It is important to distinguish
between the variance of a
population and the variance of a
sample. They have different
notation, and they are computed
differently.
• The variance of a population is
denoted by $\sigma^2$, and the variance of a
sample by $s^2$.

The variance of a population is
defined by the following formula:
$$\sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}$$
where $\sigma^2$ is the population variance, $\bar{X}$ is
the population mean, $X_i$ is the ith
element from the population, and N is
the number of elements in the
population.

The variance of a sample is defined by a
slightly different formula:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
where $s^2$ is the sample variance, $\bar{x}$ is the sample
mean, $x_i$ is the ith element from the sample, and
n is the number of elements in the sample. Using
this formula, the variance of the sample is an
unbiased estimate of the variance of the
population.
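
The two definitions differ only in the divisor, which a minimal Python sketch makes visible. The function names and the data list are illustrative only and are not taken from the slides.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


illustrative = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data, mean 5
print(population_variance(illustrative))  # 4.0  (32 / 8)
print(sample_variance(illustrative))      # 32 / 7, slightly larger
```

Dividing by n - 1 always gives a slightly larger value than dividing by N, and the gap shrinks as the number of observations grows.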

For example, suppose you want to find the
variance of scores on a test. Suppose the
scores are 67, 72, 85, 93 and 98.
1. Write down the formula for variance:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
2. There are five scores in total, so N = 5.
$$\sigma^2 = \frac{\sum (x - \mu)^2}{5}$$
3. Find the mean (µ) of the five scores (67, 72, 85, 93, 98):
µ = (67 + 72 + 85 + 93 + 98) / 5 = 415 / 5 = 83.
$$\sigma^2 = \frac{\sum (x - 83)^2}{5}$$
4. Now, compare each score (x = 67, 72, 85, 93,
98) to the mean (µ = 83):
$$\sigma^2 = \frac{(67-83)^2 + (72-83)^2 + (85-83)^2 + (93-83)^2 + (98-83)^2}{5}$$
5. Carry out the subtraction in each parenthesis.
67 - 83 = -16
72 - 83 = -11
85 - 83 = 2
93 - 83 = 10
98 - 83 = 15
The formula will look like this:
$$\sigma^2 = \frac{(-16)^2 + (-11)^2 + (2)^2 + (10)^2 + (15)^2}{5}$$
6. Then, square each parenthesis. We get 256,
121, 4, 100 and 225.
This is how:
$$\sigma^2 = \frac{(-16)\times(-16) + (-11)\times(-11) + (2)\times(2) + (10)\times(10) + (15)\times(15)}{5}$$
which equals:
$$\sigma^2 = \frac{256 + 121 + 4 + 100 + 225}{5}$$
7. Then sum the numbers inside the
brackets:
$$\sigma^2 = \frac{706}{5}$$
8. To get the final answer, divide the sum by
5 (because there were five scores). This is the
variance for the dataset:
$$\sigma^2 = 141.2$$
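
The result can be checked with Python's standard `statistics` module, where `pvariance` divides by N (population) and `variance` divides by n - 1 (sample). The variable name `scores` is just a label for the five test scores.

```python
import statistics

scores = [67, 72, 85, 93, 98]

population_var = statistics.pvariance(scores)   # divides by N = 5
sample_var = statistics.variance(scores)        # divides by n - 1 = 4

print(population_var)  # 141.2
print(sample_var)      # 176.5
```

Treating the same five scores as a sample rather than a whole population would give 706 / 4 = 176.5 instead of 141.2.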

Standard Deviation and Coefficient of Variation

o STANDARD DEVIATION
The Standard Deviation is a measure of how spread out
numbers are.
The symbol for Standard Deviation is σ (the Greek letter sigma).
This is the formula for Standard Deviation:
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
Say we have a bunch of numbers like 9, 2, 5, 4, 12, 7, 8, 11.
To calculate the standard deviation of those numbers:
1. Work out the Mean (the simple average of the numbers)
2. Then for each number: subtract the Mean and square the result
3. Then work out the mean of those squared differences.
4. Take the square root of that and we are done!
First, let us have some example values to work on:
Example: Sam has 20 rose bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4

Work out the Standard Deviation.
Step 1. Work out the mean
In the formula above, μ (the Greek letter "mu") is the mean of all our
values.
Example: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
$$\mu = \frac{9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4}{20} = \frac{140}{20} = 7$$
So: μ = 7
Step 2. Then for each number: subtract the Mean and square the
result
This is the part of the formula that says: $(x_i - \mu)^2$
So what is $x_i$? They are the individual x values 9, 2, 5, 4, 12, 7, etc...
In other words $x_1 = 9$, $x_2 = 2$, $x_3 = 5$, etc.

So it says "for each value, subtract the mean and square the result", like this:
Example (continued):
$(9 - 7)^2 = (2)^2 = 4$
$(2 - 7)^2 = (-5)^2 = 25$
$(5 - 7)^2 = (-2)^2 = 4$
$(4 - 7)^2 = (-3)^2 = 9$
$(12 - 7)^2 = (5)^2 = 25$
$(7 - 7)^2 = (0)^2 = 0$
$(8 - 7)^2 = (1)^2 = 1$
... etc ...
Step 3. Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by how
many.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use "Sigma": Σ
The handy Sigma Notation says to sum up as many terms as we want.
We want to add up all the values from 1 to N, where N=20 in our case
because there are 20 values:
$$\sum_{i=1}^{20} (x_i - 7)^2$$
Which means: Sum all values from $(x_1 - 7)^2$ to $(x_N - 7)^2$
We already calculated $(x_1 - 7)^2 = 4$ etc. in the previous step, so just sum them
up:
= 4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9 = 178
But that isn't the mean yet, we need to divide by how many, which is
simply done by multiplying by "1/N":
Example (continued):
Mean of squared differences = (1/20) × 178 = 8.9
(Note: this value is called the "Variance")

Step 4. Take the square root of that:
Example (concluded):
σ = √(8.9) = 2.983...
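
The four steps can be traced in a short Python sketch, using Sam's 20 rose bushes as the whole population. The variable names are illustrative.

```python
import math

flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

mu = sum(flowers) / len(flowers)                 # Step 1: the mean
squared_diffs = [(x - mu) ** 2 for x in flowers] # Step 2: squared differences
variance = sum(squared_diffs) / len(flowers)     # Step 3: mean of those squares
sigma = math.sqrt(variance)                      # Step 4: square root

print(mu, sum(squared_diffs), variance, round(sigma, 3))  # 7.0 178.0 8.9 2.983
```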
Sample Standard Deviation
Sometimes our data is only a sample of the whole population.
Example: Sam has 20 rose bushes, but what if Sam only counted the
flowers on 6 of them?
The "population" is all 20 rose bushes,
and the "sample" is the 6 he counted. Let us say they are:
9, 2, 5, 4, 12, 7
We can still estimate the Standard Deviation.

But when we use the sample as an estimate of the whole
population, the Standard Deviation formula changes to this:
The formula for Sample Standard Deviation:
$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$
The important change is "N-1" instead of "N" (which is called
"Bessel's correction").
The symbols also change to reflect that we are working on a sample
instead of the whole population:
The mean is now $\bar{x}$ (for sample mean) instead of μ (the population
mean),
And the answer is s (for Sample Standard Deviation) instead of σ.
But that does not affect the calculations. Only N-1 instead of N
changes the calculations.

OK, let us now calculate the Sample Standard Deviation:
Step 1. Work out the mean
Example 2: Using sampled values 9, 2, 5, 4, 12, 7
The mean is (9+2+5+4+12+7) / 6 = 39/6 = 6.5
So: $\bar{x} = 6.5$
Step 2. Then for each number: subtract the Mean and square the result
Example 2 (continued):
$(9 - 6.5)^2 = (2.5)^2 = 6.25$
$(2 - 6.5)^2 = (-4.5)^2 = 20.25$
$(5 - 6.5)^2 = (-1.5)^2 = 2.25$
$(4 - 6.5)^2 = (-2.5)^2 = 6.25$
$(12 - 6.5)^2 = (5.5)^2 = 30.25$
$(7 - 6.5)^2 = (0.5)^2 = 0.25$

Step 3. Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by how many.
But hang on ... we are calculating the Sample Standard Deviation, so
instead of dividing by how many (N), we will divide by N-1
Example 2 (continued):
Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5
Divide by N-1: (1/5) × 65.5 = 13.1
(This value is called the "Sample Variance")
Example 2 (concluded):
s = √(13.1) = 3.619...
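
The sample calculation can be cross-checked with Python's standard `statistics` module, whose `variance` and `stdev` functions divide by N-1. The list name `sample` is illustrative.

```python
import statistics

sample = [9, 2, 5, 4, 12, 7]   # the 6 bushes Sam counted

sample_mean = statistics.mean(sample)
sample_variance = statistics.variance(sample)   # divides by N - 1 = 5
sample_sd = statistics.stdev(sample)            # square root of the above

print(sample_mean, sample_variance, round(sample_sd, 3))  # 6.5 13.1 3.619
```

Using `statistics.pstdev(sample)` instead would divide by N and give a smaller value, which is the distinction that Bessel's correction addresses.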

Comparing
When we used the whole population we got: Mean = 7, Standard
Deviation = 2.983...
When we used the sample we got: Sample Mean = 6.5, Sample Standard
Deviation = 3.619...
Our Sample Mean was wrong by 7%, and our Sample Standard Deviation
was wrong by 21%.
Why Would We Take a Sample?
Mostly because it is easier and cheaper.
Imagine you want to know what the whole country thinks ... you
can't ask millions of people, so instead you ask maybe 1,000 people.

"You don't have to eat the whole ox to know that the meat is
tough."
This is the essential idea of sampling. To find out information
about the population (such as mean and standard deviation), we do not
need to look at all members of the population; we only need a sample.
But when we take a sample, we lose some accuracy.
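Summary
The Population Standard Deviation:
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$
The Sample Standard Deviation:
$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$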

o Coefficient of Variation (CV)
Refers to a statistical measure of the
dispersion of data points in a data series
around the mean. It represents the ratio of
the standard deviation to the mean. The
coefficient of variation is a helpful statistic for
comparing the degree of variation from one
data series to another, even when the means
are considerably different from each other.

In investing, the CV helps determine how much
volatility (risk) is assumed in comparison to the amount
of return expected from an investment. Put
simply, a lower ratio of standard deviation to
mean return indicates a better risk-return
trade-off.

Coefficient of Variation Formula
The coefficient of variation is expressed as the ratio
of the standard deviation to the mean. It is often abbreviated
as CV. The coefficient of variation is a measure of the
variability of the data. When the value of the coefficient of
variation is higher, the data has high
variability and less stability. When the value of the
coefficient of variation is lower, the data has
less variability and high stability.
The formula for the coefficient of variation is given below:
$$CV = \frac{\text{Standard Deviation}}{\text{Mean}}$$

Question: Find the coefficient of variation of 5,
10, 15, 20.
Formula for the mean:
$$\bar{x} = \frac{\sum x}{n} = \frac{50}{4} = 12.5$$

x       x - x̄     (x - x̄)^2
5       -7.5       56.25
10      -2.5        6.25
15       2.5        6.25
20       7.5       56.25
∑x = 50            ∑(x - x̄)^2 = 125

Formula for population standard deviation:
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{125}{4}} = 5.59$$
Coefficient of variation:
$$CV = \frac{\text{standard deviation}}{\text{mean}} = \frac{5.59}{12.5} = 0.447$$
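
The same calculation can be sketched in Python with the standard `statistics` module. The helper name `coefficient_of_variation` is made up for this illustration, and it uses the population standard deviation (`pstdev`), as the worked example does.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


values = [5, 10, 15, 20]
print(round(statistics.pstdev(values), 2))         # 5.59
print(round(coefficient_of_variation(values), 3))  # 0.447
```

Multiplying the result by 100 expresses the CV as a percentage (about 44.7% here), a form that is also commonly used.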
