04-CORRELATION AND REGRESSION (THEORY)


2.2.1 Introduction .
“If it is proved true that in a large number of instances two variables tend always to fluctuate in the same or
in opposite directions, we consider that the fact is established and that a relationship exists. This relationship is
called correlation.”
(1) Univariate distribution : These are the distributions in which there is only one variable such as the
heights of the students of a class.
(2) Bivariate distribution : A distribution involving two discrete variables is called a bivariate distribution. For example, the heights and the weights of the students of a class in a school.
(3) Bivariate frequency distribution : Let x and y be two variables. Suppose x takes the values $x_1, x_2, \ldots, x_n$ and y takes the values $y_1, y_2, \ldots, y_n$; then we record our observations in the form of ordered pairs $(x_i, y_j)$, where $1 \le i \le n$, $1 \le j \le n$. If a certain pair occurs $f_{ij}$ times, we say that its frequency is $f_{ij}$.
The function which assigns the frequencies $f_{ij}$ to the pairs $(x_i, y_j)$ is known as a bivariate frequency distribution.
Example: 1 The following table shows the frequency distribution of age (x) and weight (y) of a group of 60 individuals
              x (age, yrs)
y (weight)    40 – 45   45 – 50   50 – 55   55 – 60   60 – 65
45 – 50           2         5         8         3         0
50 – 55           1         3         6        10         2
55 – 60           0         2         5        12         1
Then find the marginal frequency distribution for x and y.
Solution: Marginal frequency distribution for x
x   40 – 45   45 – 50   50 – 55   55 – 60   60 – 65
f      3        10        19        25         3
Marginal frequency distribution for y
y   45 – 50   50 – 55   55 – 60
f     18        22        20
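The marginal totals are simply the column and row sums of the joint frequency table. A minimal Python sketch of this computation, using the frequencies of Example 1 (the variable names are only illustrative):

```python
# Joint frequency table of Example 1: rows are weight (y) classes, columns are age (x) classes.
x_classes = ["40-45", "45-50", "50-55", "55-60", "60-65"]
y_classes = ["45-50", "50-55", "55-60"]
freq = [
    [2, 5, 8, 3, 0],    # y in 45-50
    [1, 3, 6, 10, 2],   # y in 50-55
    [0, 2, 5, 12, 1],   # y in 55-60
]

# Marginal frequency of each x class: sum down each column.
marginal_x = [sum(row[j] for row in freq) for j in range(len(x_classes))]
# Marginal frequency of each y class: sum across each row.
marginal_y = [sum(row) for row in freq]

print(dict(zip(x_classes, marginal_x)))   # {'40-45': 3, '45-50': 10, '50-55': 19, '55-60': 25, '60-65': 3}
print(dict(zip(y_classes, marginal_y)))   # {'45-50': 18, '50-55': 22, '55-60': 20}
```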
2.2.2 Covariance .
Let $(x_i, y_i)$, $i = 1, 2, \ldots, n$ be a bivariate distribution, where $x_1, x_2, \ldots, x_n$ are the values of variable x and $y_1, y_2, \ldots, y_n$ those of y. Then the covariance Cov(x, y) between x and y is given by
$\mathrm{Cov}(x, y) = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$  or  $\mathrm{Cov}(x, y) = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}$,
where $\bar{x} = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} x_i$ and $\bar{y} = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} y_i$ are the means of variables x and y respectively.
Covariance is not affected by the change of origin, but it is affected by the change of scale.
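The two forms of Cov(x, y) above are algebraically equivalent, which is easy to confirm numerically. A minimal Python sketch (the data values are arbitrary and only for illustration):

```python
# Two equivalent ways to compute Cov(x, y).
x = [4, 7, 8, 3, 4]
y = [5, 8, 6, 3, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Form 1: mean of the products of deviations from the means.
cov1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
# Form 2: mean of the products minus the product of the means.
cov2 = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar

print(cov1, cov2)   # both give 2.52 for this data (up to floating-point rounding)
```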
Example: 2 Covariance Cov(x, y) between x and y, if $\sum x = 15$, $\sum y = 40$, $\sum x\,y = 110$, $n = 5$, is
(a) 22 (b) 2 (c) – 2 (d) None of these
Solution: (c) Given, $\sum x = 15$, $\sum y = 40$, $\sum x\,y = 110$, $n = 5$.
We know that, $\mathrm{Cov}(x, y) = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} x_i y_i - \left(\dfrac{1}{n}\displaystyle\sum_{i=1}^{n} x_i\right)\left(\dfrac{1}{n}\displaystyle\sum_{i=1}^{n} y_i\right) = \dfrac{1}{n}\sum x\,y - \dfrac{1}{n}\sum x \cdot \dfrac{1}{n}\sum y$
$= \dfrac{110}{5} - \left(\dfrac{15}{5}\right)\left(\dfrac{40}{5}\right) = 22 - 3 \times 8 = -2$.
2.2.3 Correlation .
The relationship between two variables such that a change in one variable results in a positive or negative
change in the other variable is known as correlation.
(1) Types of correlation
(i) Perfect correlation : If the two variables vary in such a manner that their ratio is always constant, then
the correlation is said to be perfect.
(ii) Positive or direct correlation : If an increase or decrease in one variable corresponds to an increase or
decrease in the other, the correlation is said to be positive.
(iii) Negative or indirect correlation : If an increase or decrease in one variable corresponds to a decrease
or increase in the other, the correlation is said to be negative.
(2) Karl Pearson's coefficient of correlation : The correlation coefficient $r(x, y)$ between two variables x and y is given by
$r(x, y) = \dfrac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}}$  or  $r(x, y) = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y}$,
$r(x, y) = \dfrac{\displaystyle\sum_{i=1}^{n} x_i y_i - \dfrac{1}{n}\left(\displaystyle\sum_{i=1}^{n} x_i\right)\left(\displaystyle\sum_{i=1}^{n} y_i\right)}{\sqrt{\displaystyle\sum_{i=1}^{n} x_i^2 - \dfrac{1}{n}\left(\displaystyle\sum_{i=1}^{n} x_i\right)^2}\ \sqrt{\displaystyle\sum_{i=1}^{n} y_i^2 - \dfrac{1}{n}\left(\displaystyle\sum_{i=1}^{n} y_i\right)^2}}$,
$r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\ \sqrt{\sum (y - \bar{y})^2}} = \dfrac{\sum dx\,dy}{\sqrt{\sum dx^2}\ \sqrt{\sum dy^2}}$.
(3) Modified formula : $r = \dfrac{\sum dx\,dy - \dfrac{\sum dx\,\sum dy}{n}}{\sqrt{\sum dx^2 - \dfrac{\left(\sum dx\right)^2}{n}}\ \sqrt{\sum dy^2 - \dfrac{\left(\sum dy\right)^2}{n}}}$, where $dx = x - \bar{x}$, $dy = y - \bar{y}$.
Also $r_{xy} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y} = \dfrac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{var}(x)\cdot\mathrm{var}(y)}}$.
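As a numerical illustration of the formulas above, here is a minimal Python sketch that evaluates r by the raw-sum form and again after a change of origin (the assumed-mean shift); it uses the data of Example 3 below, and both routes give the same value:

```python
import math

x = [4, 7, 8, 3, 4]
y = [5, 8, 6, 3, 5]

def pearson_from_sums(a, b):
    """r from raw sums: (Σab - Σa·Σb/n) / sqrt((Σa² - (Σa)²/n)(Σb² - (Σb)²/n))."""
    n = len(a)
    sa, sb = sum(a), sum(b)
    sab = sum(p * q for p, q in zip(a, b))
    saa = sum(p * p for p in a)
    sbb = sum(q * q for q in b)
    return (sab - sa * sb / n) / math.sqrt((saa - sa**2 / n) * (sbb - sb**2 / n))

# The same data shifted by assumed means A = B = 5 (change of origin only).
u = [xi - 5 for xi in x]
v = [yi - 5 for yi in y]

print(pearson_from_sums(x, y))   # ≈ 0.7998
print(pearson_from_sums(u, v))   # identical: r is unaffected by change of origin
print(63 / math.sqrt(94 * 66))   # the closed form obtained in Example 3
```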
Example: 3 For the data
x: 4 7 8 3 4
y: 5 8 6 3 5
the Karl Pearson's coefficient is
(a) $\dfrac{63}{\sqrt{94 \times 66}}$ (b) 63 (c) $\dfrac{63}{94}$ (d) $\dfrac{63}{66}$
Solution: (a) Take $A = 5$, $B = 5$
$x_i$   $y_i$   $u_i = x_i - 5$   $v_i = y_i - 5$   $u_i^2$   $v_i^2$   $u_i v_i$
 4       5          –1                 0               1         0         0
 7       8           2                 3               4         9         6
 8       6           3                 1               9         1         3
 3       3          –2                –2               4         4         4
 4       5          –1                 0               1         0         0
Total           $\sum u_i = 1$    $\sum v_i = 2$   $\sum u_i^2 = 19$   $\sum v_i^2 = 14$   $\sum u_i v_i = 13$

$r(x, y) = \dfrac{\sum u_i v_i - \dfrac{1}{n}\left(\sum u_i\right)\left(\sum v_i\right)}{\sqrt{\sum u_i^2 - \dfrac{1}{n}\left(\sum u_i\right)^2}\ \sqrt{\sum v_i^2 - \dfrac{1}{n}\left(\sum v_i\right)^2}} = \dfrac{13 - \dfrac{(1)(2)}{5}}{\sqrt{19 - \dfrac{1}{5}}\ \sqrt{14 - \dfrac{4}{5}}} = \dfrac{63}{\sqrt{94 \times 66}}$.
Example: 4 Coefficient of correlation between observations (1, 6),(2, 5),(3, 4), (4, 3), (5, 2), (6, 1) is
(a) 1 (b) – 1 (c) 0 (d) None of these
Solution: (b) Since there is a linear relationship between x and y, i.e. $x + y = 7$,
∴ Coefficient of correlation = – 1.
Example: 5 The value of covariance of two variables x and y is $-\dfrac{148}{3}$, the variance of x is $\dfrac{272}{3}$ and the variance of y is $\dfrac{131}{3}$. The coefficient of correlation is
(a) 0.48 (b) 0.78 (c) 0.87 (d) None of these
Solution : (d) We know that coefficient of correlation $= \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y}$.
Since the covariance is negative, the correlation coefficient must be negative. Hence (d) is the correct answer.
Example: 6 The coefficient of correlation between two variables x and y is 0.5, their covariance is 16. If the S.D of x is 4, then the S.D.
of y is equal to
(a) 4 (b) 8 (c) 16 (d) 64
Solution: (b) We have $r_{xy} = 0.5$, $\mathrm{Cov}(x, y) = 16$, S.D. of x i.e. $\sigma_x = 4$, $\sigma_y = ?$
We know that $r(x, y) = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y}$
$\therefore\ 0.5 = \dfrac{16}{4\,\sigma_y}$;  $\therefore\ \sigma_y = 8$.
Example: 7 For a bivariate distribution (x, y), if $\sum x = 50$, $\sum y = 60$, $\sum xy = 350$, $\bar{x} = 5$, $\bar{y} = 6$, variance of x is 4 and variance of y is 9, then $r(x, y)$ is
(a) 5/6 (b) 5/36 (c) 11/3 (d) 11/18
Solution: (a) $\bar{x} = \dfrac{\sum x}{n}$  $\Rightarrow$  $5 = \dfrac{50}{n}$  $\Rightarrow$  $n = 10$.
$\therefore\ \mathrm{Cov}(x, y) = \dfrac{\sum x\,y}{n} - \bar{x}\,\bar{y} = \dfrac{350}{10} - (5)(6) = 5$.
$\therefore\ r(x, y) = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y} = \dfrac{5}{\sqrt{4}\,\sqrt{9}} = \dfrac{5}{6}$.
Example: 8 A, B, C, D are non-zero constants, such that
(i) both A and C are negative. (ii) A and C are of opposite sign.
If the coefficient of correlation between x and y is r, then that between $AX + B$ and $CY + D$ is
(a) r (b) – r (c) $\dfrac{A}{C}r$ (d) $-\dfrac{A}{C}r$
Solution : (a, b) (i) Both A and C are negative.
Now $\mathrm{Cov}(AX + B, CY + D) = AC\,\mathrm{Cov}(X, Y)$
$\sigma_{AX + B} = |A|\,\sigma_x$ and $\sigma_{CY + D} = |C|\,\sigma_y$
Hence $\rho(AX + B, CY + D) = \dfrac{AC\,\mathrm{Cov}(X, Y)}{(|A|)(|C|)\,\sigma_x\,\sigma_y} = \dfrac{AC}{|AC|}\,\rho(X, Y) = \rho(X, Y) = r$,  $(\because AC > 0)$
(ii) $\rho(AX + B, CY + D) = \dfrac{AC}{|AC|}\,\rho(X, Y)$,  $(\because AC < 0)$
$= -\dfrac{AC}{AC}\,\rho(X, Y) = -\rho(X, Y) = -r$.
2.2.4 Rank Correlation .
Let us suppose that a group of n individuals is arranged in order of merit or proficiency in possession of two characteristics A and B.
These ranks in the two characteristics will, in general, be different.
For example, if we consider the relation between intelligence and beauty, it is not necessary that a beautiful individual is intelligent also.
Rank Correlation : $\rho = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$, which is Spearman's formula for the rank correlation coefficient,
where $\sum d^2$ = sum of the squares of the differences of two ranks and n is the number of pairs of observations.
Note : • We always have $\sum d_i = \sum (x_i - y_i) = \sum x_i - \sum y_i = n\bar{x} - n\bar{y} = 0$,  $(\because \bar{x} = \bar{y})$.
If all d's are zero, then $r = 1$, which shows that there is perfect rank correlation between the variables and which is the maximum value of r.
• If however some values of $x_i$ are equal, then the coefficient of rank correlation is given by
$r = 1 - \dfrac{6\left[\sum d^2 + \dfrac{1}{12}\sum (m^3 - m)\right]}{n(n^2 - 1)}$
where m is the number of times a particular $x_i$ is repeated.
Positive and Negative rank correlation coefficients
Let r be the rank correlation coefficient then, if
• $r > 0$, it means that if the rank of one characteristic is high, then that of the other is also high or if the rank of one characteristic is low, then that of the other is also low. e.g., if the two characteristics be height and weight of persons, then $r > 0$ means that the tall persons are also heavy in weight.
• $r = 1$, it means that there is perfect correlation in the two characteristics i.e., every individual is getting the same ranks in the two characteristics. Here the ranks are of the type (1, 1), (2, 2), ....., (n, n).
• $r < 0$, it means that if the rank of one characteristic is high, then that of the other is low or if the rank of one characteristic is low, then that of the other is high. e.g., if the two characteristics be richness and slimness in persons, then $r < 0$ means that the rich persons are not slim.
• $r = -1$, it means that there is perfect negative correlation in the two characteristics i.e., an individual getting the highest rank in one characteristic is getting the lowest rank in the second characteristic. Here the ranks, in the two characteristics in a group of n individuals, are of the type (1, n), (2, n – 1), ....., (n, 1).
• $r = 0$, it means that no relation can be established between the two characteristics.
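A minimal Python sketch of Spearman's formula for untied ranks (the two rank lists below are illustrative, not taken from the text):

```python
# Spearman's rank correlation for untied ranks: rho = 1 - 6*Σd² / (n(n² - 1)).
rank_A = [1, 2, 3, 4, 5]      # ranks under characteristic A (illustrative)
rank_B = [2, 1, 4, 3, 5]      # ranks of the same individuals under characteristic B

n = len(rank_A)
d2 = sum((a - b) ** 2 for a, b in zip(rank_A, rank_B))
rho = 1 - 6 * d2 / (n * (n * n - 1))
print(rho)                    # 0.8 for these ranks
```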
Important Tips
• If $r = 0$, the variables x and y are said to be uncorrelated or independent.
• If $r = -1$, the correlation is said to be negative and perfect.
• If $r = +1$, the correlation is said to be positive and perfect.
• Correlation is a pure number and hence unitless.
• Correlation coefficient is not affected by change of origin and scale.
• If two variates are connected by the linear relation $x + y = K$, then x, y are in perfect indirect correlation. Here $r = -1$.
• If x, y are two independent variables, then $\rho(x + y,\ x - y) = \dfrac{\sigma_x^2 - \sigma_y^2}{\sigma_x^2 + \sigma_y^2}$.
• $r(x, y) = \dfrac{\sum u_i v_i - \dfrac{1}{n}\sum u_i \sum v_i}{\sqrt{\sum u_i^2 - \dfrac{1}{n}\left(\sum u_i\right)^2}\ \sqrt{\sum v_i^2 - \dfrac{1}{n}\left(\sum v_i\right)^2}}$, where $u_i = x_i - A$, $v_i = y_i - B$.
Example: 9 Two numbers within the bracket denote the ranks of 10 students of a class in two subjects
(1, 10), (2, 9), (3, 8), (4, 7), (5, 6), (6, 5), (7, 4), (8, 3), (9, 2), (10, 1). The rank correlation coefficient is
(a) 0 (b) – 1 (c) 1 (d) 0.5
Solution: (b) Rank correlation coefficient is $r = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$, where $d = y - x$ for the pair (x, y).
$\sum d^2 = 9^2 + 7^2 + 5^2 + 3^2 + 1^2 + (-1)^2 + (-3)^2 + (-5)^2 + (-7)^2 + (-9)^2 = 330$
Also $n = 10$;  $\therefore\ r = 1 - \dfrac{6 \times 330}{10(100 - 1)} = -1$.
Example : 10 Let $x_1, x_2, x_3, \ldots, x_n$ be the ranks of n individuals according to character A and $y_1, y_2, \ldots, y_n$ the ranks of the same individuals according to the other character B, such that $x_i + y_i = n + 1$ for $i = 1, 2, 3, \ldots, n$. Then the coefficient of rank correlation between the characters A and B is
(a) 1 (b) 0 (c) – 1 (d) None of these
Solution: (c) $x_i + y_i = n + 1$ for all $i = 1, 2, 3, \ldots, n$
Let $x_i - y_i = d_i$. Then $d_i = 2x_i - (n + 1)$
$\displaystyle\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n}\left[2x_i - (n + 1)\right]^2 = \sum_{i=1}^{n}\left[4x_i^2 - 4x_i(n + 1) + (n + 1)^2\right]$
$\displaystyle\sum_{i=1}^{n} d_i^2 = 4\sum_{i=1}^{n} x_i^2 - 4(n + 1)\sum_{i=1}^{n} x_i + n(n + 1)^2 = \dfrac{4n(n + 1)(2n + 1)}{6} - \dfrac{4(n + 1)\,n(n + 1)}{2} + n(n + 1)^2$
$\therefore\ \displaystyle\sum_{i=1}^{n} d_i^2 = \dfrac{n(n^2 - 1)}{3}$
$\therefore\ r = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \dfrac{6\,n(n^2 - 1)}{3\,n(n^2 - 1)} = 1 - 2 = -1$,  i.e., $r = -1$.
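The conclusions of Examples 9 and 10 (completely reversed ranks give $r = -1$) can be checked with a short sketch reusing the untied-rank formula:

```python
n = 10
rank_x = list(range(1, n + 1))     # ranks 1, 2, ..., 10 in the first subject
rank_y = list(range(n, 0, -1))     # ranks 10, 9, ..., 1 in the second subject
# These are exactly the pairs of Example 9, and they satisfy x_i + y_i = n + 1 as in Example 10.

d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))   # Σd² = 330
r = 1 - 6 * d2 / (n * (n * n - 1))
print(d2, r)                       # 330 -1.0
```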
Regression
2.2.5 Linear Regression .
If a relation between two variates x and y exists, then the dots of the scatter diagram will more or less be
concentrated around a curve which is called the curve of regression. If this curve be a straight line, then it is
known as line of regression and the regression is called linear regression.
Line of regression: The line of regression is the straight line which in the least square sense gives the best fit
to the given frequency distribution.
2.2.6 Equations of lines of Regression .
(1) Regression line of y on x : If the value of x is known, then the value of y can be found from
$y - \bar{y} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x^2}(x - \bar{x})$  or  $y - \bar{y} = r\dfrac{\sigma_y}{\sigma_x}(x - \bar{x})$
(2) Regression line of x on y : It estimates x for the given value of y as
$x - \bar{x} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_y^2}(y - \bar{y})$  or  $x - \bar{x} = r\dfrac{\sigma_x}{\sigma_y}(y - \bar{y})$
(3) Regression coefficient : (i) Regression coefficient of y on x is $b_{yx} = r\dfrac{\sigma_y}{\sigma_x} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x^2}$
(ii) Regression coefficient of x on y is $b_{xy} = r\dfrac{\sigma_x}{\sigma_y} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_y^2}$.
2.2.7 Angle between Two lines of Regression .
Equations of the two lines of regression are $y - \bar{y} = b_{yx}(x - \bar{x})$ and $x - \bar{x} = b_{xy}(y - \bar{y})$.
We have, $m_1$ = slope of the line of regression of y on x $= b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$
$m_2$ = slope of the line of regression of x on y $= \dfrac{1}{b_{xy}} = \dfrac{\sigma_y}{r\,\sigma_x}$
$\therefore\ \tan\theta = \pm\dfrac{m_2 - m_1}{1 + m_1 m_2} = \pm\dfrac{\dfrac{\sigma_y}{r\,\sigma_x} - \dfrac{r\,\sigma_y}{\sigma_x}}{1 + \dfrac{r\,\sigma_y}{\sigma_x}\cdot\dfrac{\sigma_y}{r\,\sigma_x}} = \pm\dfrac{(1 - r^2)\,\sigma_x\,\sigma_y}{r\,(\sigma_x^2 + \sigma_y^2)}$.
Here the positive sign gives the acute angle $\theta$, because $r^2 \le 1$ and $\sigma_x$, $\sigma_y$ are positive.
$\therefore\ \tan\theta = \dfrac{1 - r^2}{r}\cdot\dfrac{\sigma_x\,\sigma_y}{\sigma_x^2 + \sigma_y^2}$  .....(i)
Note : • If $r = 0$, from (i) we conclude $\tan\theta = \infty$ or $\theta = \pi/2$, i.e., the two regression lines are at right angles.
• If $r = \pm 1$, $\tan\theta = 0$, i.e., $\theta = 0$, since $\theta$ is acute; i.e., the two regression lines coincide.
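A minimal Python sketch pulling together sections 2.2.6 and 2.2.7: it computes $b_{yx}$, $b_{xy}$ and the two regression lines from small illustrative data, then checks the angle formula (i) against the slopes $m_1$ and $m_2$ (all data values are assumptions chosen only for illustration):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]          # illustrative data
n = len(x)
xb, yb = sum(x) / n, sum(y) / n

cov = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / n
var_x = sum((a - xb) ** 2 for a in x) / n
var_y = sum((b - yb) ** 2 for b in y) / n
sx, sy = math.sqrt(var_x), math.sqrt(var_y)
r = cov / (sx * sy)

b_yx = cov / var_x           # regression coefficient of y on x
b_xy = cov / var_y           # regression coefficient of x on y
print(f"y - {yb:.2f} = {b_yx:.2f} (x - {xb:.2f})")   # regression line of y on x
print(f"x - {xb:.2f} = {b_xy:.2f} (y - {yb:.2f})")   # regression line of x on y
print(math.sqrt(b_yx * b_xy), abs(r))                # r² = b_yx · b_xy

# Acute angle between the two lines: from the slopes, and from formula (i).
m1, m2 = b_yx, 1 / b_xy
tan_theta_slopes = abs((m2 - m1) / (1 + m1 * m2))
tan_theta_formula = (1 - r**2) / r * (sx * sy) / (var_x + var_y)
print(tan_theta_slopes, tan_theta_formula)           # the two agree
```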
2.2.8 Important points about Regression coefficients bxy and byx .
(1) $r = \sqrt{b_{yx}\cdot b_{xy}}$, i.e., the coefficient of correlation is the geometric mean of the coefficients of regression.
(2) If $b_{yx} > 1$, then $b_{xy} < 1$, i.e., if one of the regression coefficients is greater than unity, the other will be less than unity.
(3) If the correlation between the variables is not perfect, then the regression lines intersect at $(\bar{x}, \bar{y})$.
(4) $b_{yx}$ is called the slope of the regression line of y on x and $\dfrac{1}{b_{xy}}$ is called the slope of the regression line of x on y.
(5) $b_{yx} + b_{xy} \ge 2\sqrt{b_{yx}\,b_{xy}}$ or $b_{yx} + b_{xy} \ge 2r$, i.e., the arithmetic mean of the regression coefficients is greater than the correlation coefficient.
(6) Regression coefficients are independent of change of origin but not of scale.
(7) The product of the gradients of the lines of regression is given by $\dfrac{\sigma_y^2}{\sigma_x^2}$.
(8) If both the lines of regression coincide, then the correlation will be perfect linear.
(9) If both $b_{yx}$ and $b_{xy}$ are positive, then r will be positive, and if both $b_{yx}$ and $b_{xy}$ are negative, then r will be negative.
Important Tips
• If $r = 0$, then $\tan\theta$ is not defined, i.e., $\theta = \dfrac{\pi}{2}$. Thus the regression lines are perpendicular.
• If $r = +1$ or $-1$, then $\tan\theta = 0$, i.e., $\theta = 0$. Thus the regression lines are coincident.
• If the regression lines are $y = ax + b$ and $x = cy + d$, then $\bar{x} = \dfrac{bc + d}{1 - ac}$ and $\bar{y} = \dfrac{ad + b}{1 - ac}$.
• If $b_{yx}$, $b_{xy}$ and $r > 0$, then $\dfrac{1}{2}(b_{xy} + b_{yx}) \ge r$, and if $b_{xy}$, $b_{yx}$ and $r < 0$, then $\dfrac{1}{2}(b_{xy} + b_{yx}) \le r$.
• Correlation measures the relationship between variables while regression measures only the cause-and-effect relationship between the variables.
• If the line of regression of y on x makes an angle $\alpha$ with the positive direction of the X-axis, then $\tan\alpha = b_{yx}$.
• If the line of regression of x on y makes an angle $\beta$ with the positive direction of the X-axis, then $\cot\beta = b_{xy}$.
Example : 11 The two lines of regression are $2x - 7y + 6 = 0$ and $7x - 2y + 1 = 0$. The correlation coefficient between x and y is
(a) – 2/7 (b) 2/7 (c) 4/49 (d) None of these
Solution: (b) The two lines of regression are $2x - 7y + 6 = 0$ .....(i) and $7x - 2y + 1 = 0$ ......(ii)
If (i) is the regression equation of y on x, then (ii) is the regression equation of x on y.
We write these as $y = \dfrac{2}{7}x + \dfrac{6}{7}$ and $x = \dfrac{2}{7}y - \dfrac{1}{7}$
$\therefore\ b_{yx} = \dfrac{2}{7}$, $b_{xy} = \dfrac{2}{7}$;  $\therefore\ b_{yx}\cdot b_{xy} = \dfrac{4}{49} < 1$, so our choice is valid.
$\therefore\ r^2 = \dfrac{4}{49}$  $\Rightarrow$  $r = \dfrac{2}{7}$.  $[\because\ b_{yx} > 0,\ b_{xy} > 0]$
Example: 12 Given that the regression coefficients are – 1.5 and 0.5, the value of the square of correlation coefficient is
(a) 0.75 (b) 0.7
(c) – 0.75 (d) – 0.5
Solution: (c) The square of the correlation coefficient is given by $r^2 = b_{yx}\cdot b_{xy} = (-1.5)(0.5) = -0.75$.
Example: 13 In a bivariate data, $\sum x = 30$, $\sum y = 400$, $\sum x^2 = 196$, $\sum xy = 850$ and $n = 10$. The regression coefficient of y on x is
(a) – 3.1 (b) – 3.2 (c) – 3.3 (d) – 3.4
Solution: (c) $\mathrm{Cov}(x, y) = \dfrac{1}{n}\sum xy - \dfrac{1}{n^2}\sum x\,\sum y = \dfrac{1}{10}(850) - \dfrac{1}{100}(30)(400) = -35$
$\mathrm{Var}(x) = \dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2 = \dfrac{196}{10} - \left(\dfrac{30}{10}\right)^2 = 10.6$
$\therefore\ b_{yx} = \dfrac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)} = \dfrac{-35}{10.6} = -3.3$.
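The arithmetic of Example 13 is easy to reproduce directly from the given totals; a minimal sketch:

```python
# Totals given in Example 13.
n = 10
sum_x, sum_y = 30, 400
sum_x2, sum_xy = 196, 850

cov_xy = sum_xy / n - (sum_x / n) * (sum_y / n)      # = 85 - 120 = -35
var_x = sum_x2 / n - (sum_x / n) ** 2                # = 19.6 - 9 = 10.6
b_yx = cov_xy / var_x
print(cov_xy, var_x, round(b_yx, 1))                 # -35.0 10.6 -3.3
```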
Example: 14 If the two lines of regression are $8x - 10y + 66 = 0$ and $40x - 18y = 214$, then $(\bar{x}, \bar{y})$ is
(a) (17, 13) (b) (13, 17) (c) (– 17, 13) (d) (– 13, – 17)
Solution: (b) Since the lines of regression pass through $(\bar{x}, \bar{y})$, the equations give $8\bar{x} - 10\bar{y} + 66 = 0$ and $40\bar{x} - 18\bar{y} = 214$.
On solving the above equations, we get the required answer $\bar{x} = 13$, $\bar{y} = 17$.
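Because both regression lines pass through the mean point, $(\bar{x}, \bar{y})$ in Example 14 is just the solution of a 2×2 linear system; a minimal sketch using Cramer's rule:

```python
# Rewrite the lines as a1*x + b1*y = c1 and a2*x + b2*y = c2.
a1, b1, c1 = 8, -10, -66       # from 8x - 10y + 66 = 0
a2, b2, c2 = 40, -18, 214      # from 40x - 18y = 214

det = a1 * b2 - a2 * b1
x_bar = (c1 * b2 - c2 * b1) / det
y_bar = (a1 * c2 - a2 * c1) / det
print(x_bar, y_bar)            # 13.0 17.0, i.e. option (b)
```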
Example: 15 The regression coefficient of y on x is $\dfrac{2}{3}$ and of x on y is $\dfrac{4}{3}$. If the acute angle between the regression lines is $\theta$, then $\tan\theta =$
(a) $\dfrac{1}{18}$ (b) $\dfrac{1}{9}$ (c) $\dfrac{2}{9}$ (d) None of these
Solution: (a) $b_{yx} = \dfrac{2}{3}$, $b_{xy} = \dfrac{4}{3}$. Therefore,
$\tan\theta = \dfrac{\dfrac{1}{b_{xy}} - b_{yx}}{1 + \dfrac{b_{yx}}{b_{xy}}} = \dfrac{\dfrac{3}{4} - \dfrac{2}{3}}{1 + \dfrac{2}{3}\cdot\dfrac{3}{4}} = \dfrac{1}{18}$.
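The same value can be obtained numerically from the two slopes implied by the regression coefficients; a minimal sketch:

```python
b_yx = 2 / 3                   # slope of the line of y on x
b_xy = 4 / 3                   # so the line of x on y has slope 1/b_xy in the xy-plane

m1, m2 = b_yx, 1 / b_xy
tan_theta = abs((m2 - m1) / (1 + m1 * m2))
print(tan_theta)               # 0.0555... = 1/18
```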
Example: 16 If the lines of regression of y on x and x on y make angles 30° and 60° respectively with the positive direction of the X-axis, then the correlation coefficient between x and y is
(a) $\dfrac{1}{\sqrt{2}}$ (b) $\dfrac{1}{2}$ (c) $\dfrac{1}{\sqrt{3}}$ (d) $\dfrac{1}{3}$
Solution: (c) Slope of regression line of y on x $= b_{yx} = \tan 30° = \dfrac{1}{\sqrt{3}}$
Slope of regression line of x on y $= \dfrac{1}{b_{xy}} = \tan 60° = \sqrt{3}$,  $\therefore\ b_{xy} = \dfrac{1}{\sqrt{3}}$.
Hence, $r = \sqrt{b_{xy}\cdot b_{yx}} = \sqrt{\dfrac{1}{\sqrt{3}}\cdot\dfrac{1}{\sqrt{3}}} = \dfrac{1}{\sqrt{3}}$.
Example: 17 If two random variables x and y are connected by the relationship $2x + y = 3$, then $r_{xy} =$
(a) 1 (b) – 1 (c) – 2 (d) 3
Solution: (b) Since $2x + y = 3$,  $y = -2x + 3$;  $\therefore\ y - \bar{y} = -2(x - \bar{x})$. So $b_{yx} = -2$
Also $x - \bar{x} = -\dfrac{1}{2}(y - \bar{y})$,  $\therefore\ b_{xy} = -\dfrac{1}{2}$
$\therefore\ r_{xy}^2 = b_{yx}\cdot b_{xy} = (-2)\left(-\dfrac{1}{2}\right) = 1$  $\Rightarrow$  $r_{xy} = -1$.  $(\because$ both $b_{yx}$, $b_{xy}$ are negative$)$
2.2.9 Standard error and Probable error.
(1) Standard error of prediction : The deviation of the predicted value from the observed value is known
as the standard error of prediction and is defined as
$S_y = \sqrt{\dfrac{\sum (y - y_p)^2}{n}}$, where y is the actual value and $y_p$ is the predicted value.
In relation to the coefficient of correlation, it is given by
(i) Standard error of estimate of x is $S_x = \sigma_x\sqrt{1 - r^2}$   (ii) Standard error of estimate of y is $S_y = \sigma_y\sqrt{1 - r^2}$.
(2) Relation between probable error and standard error : If r is the correlation coefficient in a sample of
n pairs of observations, then its standard error is S.E.$(r) = \dfrac{1 - r^2}{\sqrt{n}}$ and its probable error is P.E.$(r)$ = 0.6745 (S.E.) = $0.6745\left(\dfrac{1 - r^2}{\sqrt{n}}\right)$.
The probable error or the standard error are used for interpreting the coefficient of correlation.
(i) If $r <$ P.E.$(r)$, there is no evidence of correlation.
(ii) If $r > 6\,$P.E.$(r)$, the existence of correlation is certain.
The square of the coefficient of correlation for a bivariate distribution is known as the "Coefficient of determination".
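A minimal Python sketch of the standard error and probable error of r as defined above (the sample values of r and n are illustrative):

```python
import math

r, n = 0.8, 25                      # illustrative correlation coefficient and sample size

se = (1 - r**2) / math.sqrt(n)      # standard error of r
pe = 0.6745 * se                    # probable error of r
print(se, pe)                       # 0.072 and ≈ 0.0486

# Interpretation rules quoted above.
if r > 6 * pe:
    print("existence of correlation is certain")
elif r < pe:
    print("no evidence of correlation")
```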
Example: 18 If $\mathrm{Var}(x) = \dfrac{21}{4}$, $\mathrm{Var}(y) = 21$ and $r = 1$, then the standard error of y is
(a) 0 (b) $\dfrac{1}{2}$ (c) $\dfrac{1}{4}$ (d) 1
Solution: (a) $S_y = \sigma_y\sqrt{1 - r^2} = \sigma_y\sqrt{1 - 1} = 0$.