Report based on material on regression and correlation from an SPSS course. This presentation is suitable for students and lecturers majoring in production management, marketing, industrial engineering, and related fields.
REPORT ON REGRESSION & CORRELATION
Jeri Oktari
Tokat Gaziosmanpaşa University, Tokat, Turkey
INTRODUCTION
• Regression analysis is used to study and measure the statistical
relationship that occurs between two or more variables
• The variables are X (the independent variable: the variable that
influences, or the known variable) and Y (the dependent variable:
the variable that is influenced, or the unknown variable)
Basically, the relationship between two variables can be one of three kinds:
1. Unidirectional/positive relationship
2. Inverse/negative relationship
3. No relationship
Unidirectional/positive relationship
A relationship is unidirectional (positive) if a change in the independent
variable (x) moves the dependent variable (y) in the same direction.
Example:
• The relationship between ad spend (X) and sales amount (Y)
• The relationship between income (X) and consumption
expenditure (Y)
Inverse/negative relationship
Two variables are said to have an inverse or negative relationship if a
change in the independent variable (x) affects the dependent variable
(y) in the opposite direction.
This means that if the x variable increases, the y variable decreases,
and vice versa: if the x variable decreases, the y variable increases.
Example:
• The relationship between the age of the vehicle (X) and the price
level (Y).
• The relationship between the price of goods (x) and the quantity
demanded (Y)
No relationship
Two variables are said to have no relationship if changes in the
independent variable (x) do not affect changes in the dependent variable
(y).
Example:
The relationship between food consumption (x) and building height (y).
Regression Line Drawing
There are 2 ways to draw a regression line:
1. The scatter diagram method
2. The least squares method
The scatter diagram method
Once it has been established that there is a logical relationship between
the variables, the next step in supporting further analysis is to plot the
data on a graph.
This graph is called a scatter diagram. Each point on it represents one
observation, positioned by its value of the independent variable and its
value of the dependent variable.
The scatter diagram has two benefits:
1. It helps to show whether there is a useful relationship between
the two variables.
2. It helps to identify the type of equation that describes the
relationship between the two variables.
Figure 1. Scatter diagram forms
The least squares method
Regression is a measuring tool that can also be used to assess whether a
correlation exists between variables. The term regression itself means
forecast or estimate. The equation of the regression line fitted to the
scatter diagram data is called the regression equation.
The least squares method is used to fit the regression line to the observed
data, giving a regression equation of the following form:
Y' = a + b X
Where:
Y': estimated value of dependent variable
a: the point where the regression line intersects the y-axis (the estimated
value of Y' when X = 0)
b: gradient of the regression line (the estimated change in Y' per unit
change in X)
X: value of independent variable
The values of a and b in the regression equation can be calculated with the
following formulas:
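$$b = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$$

$$a = \bar{Y} - b\,\bar{X}$$

Equivalently, with deviations from the means $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$:

$$b = \frac{\sum x_i y_i}{\sum x_i^2}$$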
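The least squares formulas for a and b can be tried out with a short Python sketch. The function name `least_squares_fit` and the sample data are hypothetical and serve only as illustration; the calculation follows the sum formulas b = (nΣXY − ΣXΣY) / (nΣX² − (ΣX)²) and a = Ȳ − bX̄.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the regression line Y' = a + b X."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    # b = (n*sum(XY) - sum(X)*sum(Y)) / (n*sum(X^2) - (sum(X))^2)
    b = (n * sum_xy - sum_x * sum_y) / denom
    # a = mean(Y) - b * mean(X)
    a = sum_y / n - b * (sum_x / n)
    return a, b


# Hypothetical data, invented for illustration only
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

a, b = least_squares_fit(ad_spend, sales)
print(f"Y' = {a:.2f} + {b:.2f} X")  # Y' = 2.20 + 0.60 X
```

In practice a statistics package such as SPSS performs this calculation; the sketch only makes the arithmetic behind the formulas visible.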
Regression line reading
Example: If the regression line is shown by the equation:
$$y' = 2.94 + 0.95\,x$$
then it can be interpreted as follows: estimated sales increase by 0.95
units for each 1-unit increase in advertising expenditure.
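The reading of the slope can be checked numerically. The sketch below assumes the example line has intercept 2.94 and slope 0.95; the function name `predict_sales` and the spend levels 10 and 11 are hypothetical.

```python
def predict_sales(ad_spend, a=2.94, b=0.95):
    """Estimated sales Y' = a + b * X for a given advertising spend X."""
    return a + b * ad_spend


# Raising advertising spend by one unit (from 10 to 11, chosen arbitrarily)
# changes the estimate by exactly the slope b.
increase = predict_sales(11) - predict_sales(10)
print(round(increase, 2))
```

The same difference is obtained for any starting spend level, because the line has a constant gradient.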
Regression Coefficient
• The regression coefficient is the gradient of the regression line (the
b value).
• A positive value of b indicates that the relationship between the
variables x and y is unidirectional (positive).
• A negative value of b indicates that the relationship between the
variables x and y is in the opposite direction (negative).
• The size of the change in y for a given change in x is determined by
the magnitude of the regression coefficient.
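As a small illustration of the sign rule, the following sketch maps a regression coefficient to the direction of the relationship. The function name and the returned labels are hypothetical, and treating b = 0 as "no linear relationship" is an assumption added here.

```python
def relationship_direction(b):
    """Classify the direction of the x-y relationship from the sign of b."""
    if b > 0:
        return "positive (unidirectional)"
    if b < 0:
        return "negative (inverse)"
    # Assumption: a zero slope is read as no linear relationship.
    return "no linear relationship"


print(relationship_direction(0.95))
print(relationship_direction(-1.3))
```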
Coefficient of Determination
The coefficient of determination is the main tool for determining the
extent of the relationship between the variables x and y.
$$r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$
The coefficient of determination lies in the range 0 ≤ r² ≤ 1.
A coefficient of determination of 1 indicates a perfect relationship.
A coefficient of determination of 0 indicates that there is no
relationship.
r² = 0.81 means that 81% of the variation in variable y is explained by
variable x.
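A minimal Python sketch of the coefficient of determination, using the formula r² = (aΣY + bΣXY − nȲ²) / (ΣY² − nȲ²) with a and b taken from the least squares fit. The function name and the sample data are hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """Return r^2 for the least squares line fitted to (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")

    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    mean_y = sum_y / n

    slope_denom = n * sum_x2 - sum_x ** 2
    total_variation = sum_y2 - n * mean_y ** 2
    if slope_denom == 0 or total_variation == 0:
        raise ValueError("X or Y is constant; r^2 is undefined")

    # Least squares slope and intercept
    b = (n * sum_xy - sum_x * sum_y) / slope_denom
    a = mean_y - b * (sum_x / n)

    # r^2 = (a*sum(Y) + b*sum(XY) - n*mean(Y)^2) / (sum(Y^2) - n*mean(Y)^2)
    return (a * sum_y + b * sum_xy - n * mean_y ** 2) / total_variation


# Hypothetical data, invented for illustration only
r2 = coefficient_of_determination([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r^2 = {r2:.2f}")  # r^2 = 0.60
```

For this sample, about 60% of the variation in y is explained by x.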
Correlation Analysis
Correlation analysis measures the strength, or degree of closeness, of the
relationship between variables.
• The correlation coefficient (KK) takes values in the range -1 ≤ KK ≤ +1
• To judge the closeness of the correlation between variables, the
following benchmarks for KK are used:
• 0 < KK ≤ 0.2: very weak correlation
• 0.2 < KK ≤ 0.4: weak but definite correlation
• 0.4 < KK ≤ 0.7: significant correlation
• 0.7 < KK ≤ 0.9: strong correlation
• 0.9 < KK < 1: very strong correlation
• KK = 1: perfect correlation
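The KK benchmarks can be expressed as a simple lookup. This is a sketch under stated assumptions: the function name is hypothetical, the bands are applied to the absolute value of KK so that negative correlations are graded by the same scale, KK = 0 is reported as "no correlation", and the 0.7 to 0.9 band is labelled "strong" to keep it distinct from the band above it.

```python
def correlation_strength(kk):
    """Map a correlation coefficient KK to a verbal strength label."""
    magnitude = abs(kk)  # assumption: grade negative values by magnitude
    if magnitude > 1:
        raise ValueError("KK must lie between -1 and +1")
    if magnitude == 0:
        return "no correlation"  # assumption: not listed in the benchmarks
    if magnitude <= 0.2:
        return "very weak"
    if magnitude <= 0.4:
        return "weak but definite"
    if magnitude <= 0.7:
        return "significant"
    if magnitude <= 0.9:
        return "strong"
    if magnitude < 1:
        return "very strong"
    return "perfect"


for kk in (0.15, 0.55, -0.95, 1.0):
    print(kk, "->", correlation_strength(kk))
```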
Coefficient of Determination:
$$r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$
Correlation coefficient:
$$r = \sqrt{r^2}$$
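A short sketch of obtaining the correlation coefficient from the coefficient of determination. The square root alone is never negative, so the sketch gives r the sign of the regression coefficient b. That sign rule is an assumption not stated in the slides, although it is the usual convention for simple linear regression; the function name is hypothetical.

```python
import math


def correlation_from_r2(r2, b):
    """Return r = sqrt(r^2), signed like the regression coefficient b."""
    if not 0 <= r2 <= 1:
        raise ValueError("r^2 must lie between 0 and 1")
    # Assumption: r takes the sign of the slope b.
    return math.copysign(math.sqrt(r2), b)


print(correlation_from_r2(0.81, 0.95))   # upward-sloping line
print(correlation_from_r2(0.81, -0.95))  # downward-sloping line
```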
Types of correlation coefficient
1. Pearson correlation coefficient
2. Spearman rank correlation coefficient
3. Contingency correlation coefficient
4. Coefficient of determination
Difference between Regression and Correlation
• Regression shows the relationship between one variable and another
variable.
• In regression, the nature of the relationship can be explained: one
variable is the cause, the other variable is the effect.
• Correlation does not show a causal relationship; it only shows the
association between one variable and another.