1. CORRELATION ANALYSIS
Correlation is a statistical measure that shows the relationship between two or more variables that move in the same or in opposite directions.
08/01/12 anilmishra5555@rediffmail.com 1
2. Visual Displays and Correlation Analysis
• Correlation Analysis
• The sample correlation coefficient (r) measures the
degree of linearity in the relationship between X and
Y.
-1 ≤ r ≤ +1
• r near -1 indicates a strong negative relationship; r near +1 indicates a strong positive relationship
• r = 0 indicates no linear relationship
• In Excel, use =CORREL(array1,array2),
where array1 is the range for X and array2 is the
range for Y.
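As a rough Python counterpart to the Excel call, the sketch below computes the sample correlation coefficient from deviations about the means. The function name `correl` and the two small data sets are illustrative assumptions, not taken from the slides.

```python
import math

def correl(xs, ys):
    """Sample correlation coefficient r for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: perfectly increasing and perfectly decreasing lines
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]

r_up = correl(x, y_up)      # perfect positive linear relationship
r_down = correl(x, y_down)  # perfect negative linear relationship
print(r_up, r_down)
```

The two extreme cases illustrate the bounds of r: an exact rising line gives the upper bound and an exact falling line gives the lower bound.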
3. Types of correlation
• Positive & negative
• Simple, multiple & partial
• Linear & non-linear
4. Methods of correlation
• Scatter diagram
• Product moment or covariance
• Rank correlation
• Concurrent deviation
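16. Solution (continued)
$$\bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6$$
$$\bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8$$
$$r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{\sqrt{800}} \approx -0.92$$
where $x = X - \bar{X}$ and $y = Y - \bar{Y}$ are deviations from the means.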
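The arithmetic in this step can be checked with a few lines of Python. Only the sums quoted in the solution are used; the variable names are assumptions made for this sketch.

```python
import math

# Sums quoted in the worked solution
n = 5
sum_X, sum_Y = 30, 40
sum_xy, sum_x2, sum_y2 = -26, 40, 20  # deviations taken from the means

mean_X = sum_X / n
mean_Y = sum_Y / n
r = sum_xy / math.sqrt(sum_x2 * sum_y2)

print(mean_X, mean_Y, round(r, 2))
```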
17. Direct method
$$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[N \sum X^2 - (\sum X)^2\right]\left[N \sum Y^2 - (\sum Y)^2\right]}}$$
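A minimal sketch of the direct method in Python follows. The data set is illustrative only: it is not taken from the slides, but was chosen so that its sums (ΣX = 30, ΣY = 40, Σxy = -26, Σx² = 40, Σy² = 20) match those of the earlier worked solution. The function name `r_direct` is likewise an assumption.

```python
import math

def r_direct(xs, ys):
    """Correlation coefficient from raw sums (direct method)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Illustrative data (hypothetical, see the note above)
X = [2, 4, 6, 8, 10]
Y = [11, 9, 7, 8, 5]

r = r_direct(X, Y)
print(round(r, 2))
```

Because no deviations from the mean are needed, this form is convenient when only the raw totals are available.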
18. Short-cut method
$$r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{\left[N \sum d_x^2 - (\sum d_x)^2\right]\left[N \sum d_y^2 - (\sum d_y)^2\right]}}$$
19. Where
$$d_x = X - A \quad \text{and} \quad d_y = Y - A$$
A = assumed mean (it may be chosen separately for the X series and the Y series)
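The short-cut method can be sketched in Python as below. The function name `r_shortcut`, the parameters `a_x` and `a_y` (one assumed mean per series), and the data are all assumptions for illustration. The point of the sketch is that the result does not depend on which assumed means are picked.

```python
import math

def r_shortcut(xs, ys, a_x, a_y):
    """Correlation coefficient from deviations about assumed means."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    numerator = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    denominator = math.sqrt(
        (n * sum(p * p for p in dx) - sum(dx) ** 2)
        * (n * sum(q * q for q in dy) - sum(dy) ** 2)
    )
    return numerator / denominator

# Hypothetical data, not taken from the slides
X = [2, 4, 6, 8, 10]
Y = [11, 9, 7, 8, 5]

r_a = r_shortcut(X, Y, a_x=4, a_y=9)  # arbitrary assumed means
r_b = r_shortcut(X, Y, a_x=6, a_y=8)  # assumed means equal to the true means
print(round(r_a, 4), round(r_b, 4))
```

Choosing assumed means close to the data keeps the deviations small, which is what makes the hand calculation shorter.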
20. Product moment method
$$r = \pm\sqrt{b_{xy} \cdot b_{yx}}$$
where
$$b_{xy} = \frac{\sum xy}{\sum y^2}, \qquad b_{yx} = \frac{\sum xy}{\sum x^2}$$
and r takes the common sign of the two regression coefficients.
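As a quick check, the sums from the earlier worked solution (Σxy = -26, Σx² = 40, Σy² = 20) can be fed through the two regression coefficients. This is a sketch; the variable names are assumptions.

```python
import math

# Sums of deviations from the means, as quoted in the worked solution
sum_xy, sum_x2, sum_y2 = -26, 40, 20

b_xy = sum_xy / sum_y2  # regression coefficient of X on Y
b_yx = sum_xy / sum_x2  # regression coefficient of Y on X

# Geometric mean of the two coefficients, carrying their common sign
r = math.copysign(math.sqrt(b_xy * b_yx), b_xy)
print(round(r, 2))
```

The result agrees with the value obtained directly from the deviations, about -0.92.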
21. Spearman's rank correlation
$$R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
where
$$D = R_x - R_y$$
$R_x$ = rank of X
$R_y$ = rank of Y
22. Problem
Calculate Spearman's rank correlation coefficient between advertising cost and sales from the following data:
Advt. cost:    39 65 62 90 82 75 25 98 36 78
Sales (lakhs): 47 53 58 86 62 68 60 91 51 84
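24. Solution (continued)
$$R = 1 - \frac{6 \sum D^2}{N^3 - N}$$
$$\Rightarrow R = 1 - \frac{6 \times 30}{10^3 - 10}$$
$$\Rightarrow R = 1 - \frac{2}{11}$$
$$\Rightarrow R = \frac{9}{11} \approx 0.82$$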
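The whole calculation can be reproduced from the problem data with a short Python sketch. Ranking the largest value as 1 is an assumption (ranking the smallest as 1 gives the same ΣD²), and the helper name `ranks_desc` is hypothetical.

```python
def ranks_desc(values):
    """Rank each value with 1 = largest; assumes there are no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

advt_cost = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
sales = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

rank_x = ranks_desc(advt_cost)
rank_y = ranks_desc(sales)

# Sum of squared rank differences
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))

n = len(advt_cost)
R = 1 - 6 * sum_d2 / (n ** 3 - n)
print(sum_d2, round(R, 2))
```

This reproduces ΣD² = 30 and R = 9/11, the values used in the solution.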
25. In case of equal ranks
$$R = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \dots\right]}{N(N^2 - 1)}$$
where
m = number of times an item is repeated
26. Problem
A psychologist wanted to compare two methods, A and B, of teaching. He selected a random sample of 22 students and grouped them into 11 pairs so that the students in each pair had approximately equal scores in an intelligence test. In each pair, one student was taught by method A and the other by method B, and both were examined after the course. The marks obtained were as follows:
Pair: 1  2  3  4  5  6  7  8  9  10 11
A:    24 29 19 14 30 19 27 30 20 28 11
B:    37 35 16 26 23 27 19 20 16 11 21
28. Solution (continued)
In the A series the items 19 and 30 are each repeated twice, and in the B series 16 is repeated twice. Therefore
$$R = 1 - \frac{6\left[\sum D^2 + \frac{2(4-1)}{12} + \frac{2(4-1)}{12} + \frac{2(4-1)}{12}\right]}{11(121 - 1)}$$
$$\Rightarrow R = -0.0225$$
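The tied-rank formula can be sketched in Python as follows, using the marks from the problem. Tied values are given the average of the positions they occupy, which is the usual convention and is assumed here; the helper names `average_ranks` and `tie_correction` are hypothetical. Run on these marks, the sketch gives ΣD² = 225 and R of about -0.03, close to but not identical with the -0.0225 shown above. The rank table behind that figure is not part of this text, so the source of the difference cannot be checked here.

```python
from collections import Counter

def average_ranks(values):
    """Rank with 1 = largest; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

marks_a = [24, 29, 19, 14, 30, 19, 27, 30, 20, 28, 11]
marks_b = [37, 35, 16, 26, 23, 27, 19, 20, 16, 11, 21]

rank_a = average_ranks(marks_a)
rank_b = average_ranks(marks_b)
sum_d2 = sum((ra - rb) ** 2 for ra, rb in zip(rank_a, rank_b))
correction = tie_correction(marks_a) + tie_correction(marks_b)

n = len(marks_a)
R = 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))
print(sum_d2, correction, round(R, 4))
```

Either way the coefficient is very close to zero, so the ranks under the two teaching methods show almost no association.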
29. Properties of correlation coefficient
• r always lies between +1 and -1,
i.e. -1 ≤ r ≤ +1
• Two independent variables are uncorrelated, but the converse is not true
• r is independent of change of origin and scale
• r is the geometric mean (G.M.) of the two regression coefficients
• r is symmetric: r_xy = r_yx
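Two of the properties above, independence of origin and scale and symmetry, can be checked numerically. The sketch below uses hypothetical data and an assumed helper `pearson_r`. The scale factors are both positive; a change of scale with opposite signs would flip the sign of r.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, not taken from the slides
x = [2, 4, 6, 8, 10]
y = [11, 9, 7, 8, 5]

# Change of origin (subtract a constant) and scale (positive factor)
u = [(a - 6) / 2 for a in x]
v = [(b - 8) * 10 for b in y]

r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)  # unchanged by origin and scale
r_yx = pearson_r(y, x)  # unchanged when X and Y are swapped
print(round(r_xy, 4), round(r_uv, 4), round(r_yx, 4))
```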
30. Probable error
$$SE(r) = \frac{1 - r^2}{\sqrt{n}}$$
$$PE(r) = 0.6745 \times SE(r)$$
or
$$PE(r) = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$
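A minimal Python sketch of these two formulas is given below. The function names are assumptions, and the inputs r = 9/11 with n = 10 are borrowed from the rank correlation problem purely for illustration.

```python
import math

def standard_error(r, n):
    """Standard error of r: (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

# Illustrative inputs (assumed for this example)
r, n = 9 / 11, 10
se = standard_error(r, n)
pe = probable_error(r, n)
print(round(se, 4), round(pe, 4))
```

Both quantities shrink as n grows or as r moves toward ±1.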
31. The coefficient of determination
It is the primary way we can measure the extent or strength of the association that exists between two variables x and y. Because we have used a sample of points to develop the regression lines, it is referred to as the sample coefficient of determination.
It is denoted by $r^2$.
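As a worked illustration, the sums from the earlier solution (Σxy = -26, Σx² = 40, Σy² = 20) give the value below. The reading of the result as a share of explained variation is the standard interpretation and is not stated in the slides.

```latex
r^2 = \frac{\left(\sum xy\right)^2}{\sum x^2 \cdot \sum y^2}
    = \frac{(-26)^2}{40 \times 20}
    = \frac{676}{800}
    = 0.845
```

On that reading, roughly 84.5% of the variation in Y is accounted for by its linear relationship with X.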