Ph.D Islamia College Peshawar Chap 12-1
Lecture No 3
Simple Linear Regression
Chapter Goals
After completing this chapter, you should be
able to:
 Explain the simple linear regression model
 Obtain and interpret the simple linear regression
equation for a set of data
 Apply the main tests used in regression analysis, such as the t-test and the F-test
 Explain measures of variation and determine whether
the independent variable is significant
Introduction to
Regression Analysis
 Regression analysis is used to:
 Predict the value of a dependent variable based on the
value of at least one independent variable
 Explain the impact of changes in an independent
variable on the dependent variable
Dependent variable: the variable we wish to explain
Independent variable: the variable used to explain
the dependent variable
Simple Linear Regression
Model
 Only one independent variable, X
 Relationship between X and Y is
described by a linear function
 Changes in Y are assumed to be caused
by changes in X
Simple Linear Regression Model
The population regression model:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $Y_i$ is the dependent variable, $\beta_0$ is the population Y intercept, $\beta_1$ is the population slope coefficient, $X_i$ is the independent variable, and $\varepsilon_i$ is the random error term.
Linear component: $\beta_0 + \beta_1 X_i$
Random error component: $\varepsilon_i$
Simple Linear Regression Model (continued)
[Figure: the population regression line $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ plotted with Y against X, with intercept = β0 and slope = β1. For a given Xi, the random error εi is the vertical distance between the observed value of Y and the predicted value of Y on the line.]
Simple Linear Regression Equation
The simple linear regression equation provides an estimate of the population regression line:
$\hat{Y}_i = b_0 + b_1 X_i$
where $\hat{Y}_i$ is the estimated (or predicted) Y value for observation i, $b_0$ is the estimate of the regression intercept, $b_1$ is the estimate of the regression slope, and $X_i$ is the value of X for observation i.
The individual random error terms ei have a mean of zero
Types of Relationships
[Figure: scatter plots of Y against X showing linear relationships and curvilinear relationships]
[Figure: scatter plots of Y against X showing strong relationships and weak relationships]
[Figure: scatter plots of Y against X showing no relationship]
The Multiple Regression
Model
Idea: Examine the linear relationship between
1 dependent (Y) & 2 or more independent variables (Xi)
Multiple Regression Model with k Independent Variables:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
where $\beta_0$ is the Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and $\varepsilon_i$ is the random error.
Multiple Regression Equation
The coefficients of the multiple regression model are
estimated using sample data
Multiple regression equation with k independent variables:
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$
where $\hat{Y}_i$ is the estimated (or predicted) value of Y, $b_0$ is the estimated intercept, and $b_1, \ldots, b_k$ are the estimated slope coefficients.
In this chapter we will always use Excel to obtain the
regression slope coefficients and other regression
summary measures.
Important components of Regression
 Intercept coefficient
 Slope coefficient(s)
 t-value
 F-value
 R-square
 Adjusted R-square
Finding the Least Squares
Equation
 The coefficients b0 and b1 , and other
regression results in this chapter, will be
found using Excel and Stata software
Formulas are shown in the text at the end of
the chapter for those who are interested
 b0 is the estimated average value of Y
when the value of X is zero
 b1 is the estimated change in the
average value of Y as a result of a
one-unit change in X
Interpretation of the
Slope and the Intercept
Simple Linear Regression
Example
 A real estate agent wishes to examine the
relationship between the selling price of a home
and its size (measured in square feet)
 A random sample of 10 houses is selected
 Dependent variable (Y) = house price in $1000s
 Independent variable (X) = square feet
Sample Data for House Price Model
House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
Graphical Presentation
 House price model: scatter plot
[Figure: scatter plot of house price ($1000s) against square feet]
Regression Using Excel
 Tools / Data Analysis / Regression
Excel Output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

The regression equation is:
$\widehat{\text{house price}} = 98.24833 + 0.10977\,(\text{square feet})$
Graphical Presentation
 House price model: scatter plot and regression line
[Figure: scatter plot of house price ($1000s) against square feet with the fitted line $\widehat{\text{house price}} = 98.24833 + 0.10977\,(\text{square feet})$; slope = 0.10977, intercept = 98.248]
Interpretation of the
Intercept, b0
 b0 is the estimated average value of Y when the
value of X is zero (if X = 0 is in the range of
observed X values)
 Here, no houses had 0 square feet, so b0 = 98.24833
just indicates that, for houses within the range of
sizes observed, $98,248.33 is the portion of the
house price not explained by square feet
$\widehat{\text{house price}} = 98.24833 + 0.10977\,(\text{square feet})$
Interpretation of the
Slope Coefficient, b1
 b1 measures the estimated change in the
average value of Y as a result of a one-
unit change in X
 Here, b1 = .10977 tells us that the average value of a
house increases by .10977($1000) = $109.77, on
average, for each additional one square foot of size
$\widehat{\text{house price}} = 98.24833 + 0.10977\,(\text{square feet})$
Predictions using Regression Analysis
Predict the price for a house with 2000 square feet:
$\widehat{\text{house price}} = 98.25 + 0.1098\,(\text{sq. ft.}) = 98.25 + 0.1098(2000) = 317.85$
The predicted price for a house with 2000 square feet is 317.85 ($1,000s) = $317,850
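The prediction above is plain arithmetic on the fitted equation. A minimal Python sketch of it follows; the function name `predict_price` is an illustrative assumption, and the rounded coefficients 98.25 and 0.1098 are the ones used on this slide.

```python
def predict_price(square_feet, intercept=98.25, slope=0.1098):
    """Predicted house price in $1000s for a given size in square feet."""
    return intercept + slope * square_feet


predicted = predict_price(2000)
# Convert from $1000s to dollars for display.
print(f"{predicted:.2f} ($1000s) = ${predicted * 1000:,.0f}")
```

Using the unrounded Excel coefficients (98.24833 and 0.10977) would give a slightly different figure, so the rounding should be kept consistent when comparing results.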
Interpolation vs. Extrapolation
 When using a regression model for prediction, only predict within the relevant range of data
[Figure: scatter plot of house price ($1000s) against square feet, marking the relevant range for interpolation. Do not try to extrapolate beyond the range of observed X's.]
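One way to enforce the interpolation rule in practice is to check the requested X value against the observed range before predicting. The sketch below is only an illustration: the function name `predict_within_range` and the choice to raise an error are assumptions, while the data and coefficients come from the house price example.

```python
# Square footage of the 10 sampled houses from the example.
OBSERVED_SQFT = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def predict_within_range(square_feet, observed_x=OBSERVED_SQFT,
                         intercept=98.24833, slope=0.10977):
    """Predict price in $1000s, refusing to extrapolate beyond observed X values."""
    lowest, highest = min(observed_x), max(observed_x)
    if not lowest <= square_feet <= highest:
        raise ValueError(
            f"{square_feet} sq. ft. is outside the observed range "
            f"[{lowest}, {highest}]; the fitted line is not reliable there"
        )
    return intercept + slope * square_feet


print(predict_within_range(2000))  # inside the relevant range
```

For this sample the relevant range is 1100 to 2450 square feet, so a request for a 3000 square foot house would be rejected rather than extrapolated.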
Least Squares Method
 b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared differences between Y and Ŷ:
$\min \sum (Y_i - \hat{Y}_i)^2 = \min \sum \left(Y_i - (b_0 + b_1 X_i)\right)^2$
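 The process of differentiation yields the following equations for estimating β1 and β2 (in this derivation β1 denotes the intercept and β2 the slope, corresponding to b0 and b1 elsewhere in these slides):
$\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$   (3.1.4)
$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i$   (3.1.5)
 where n is the sample size. These simultaneous equations are known as the normal equations. Solving the normal equations simultaneously, we obtain
$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$
 where X̄ and Ȳ are the sample means of X and Y and where we define xi = (Xi − X̄) and yi = (Yi − Ȳ). Henceforth we adopt the convention of letting the lowercase letters denote deviations from mean values.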
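The solution of the normal equations can be checked against the house price sample. The following is a minimal sketch in plain Python using the standard deviation-form result (slope = Σxy / Σx², intercept = Ȳ − slope · X̄); the function name `fit_simple_ols` is an illustrative assumption.

```python
# House price sample from the slides: Y in $1000s, X in square feet.
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def fit_simple_ols(x, y):
    """Return (intercept, slope) from the deviation-form least squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Lowercase deviations from the sample means.
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sum_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum_xy / sum_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_simple_ols(sqft, prices)
print(f"intercept = {b0:.5f}, slope = {b1:.5f}")
```

Run on the ten houses, this should agree with the coefficients in the Excel output (about 98.248 for the intercept and 0.10977 for the slope).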
T-Tests
 The t test checks whether the sample beta (the estimated coefficient) is statistically significantly different from zero, that is, whether it gives reliable evidence about the population beta.
 A significant test result shows that there is sufficient evidence that the X variable (square footage) affects the Y variable (house price).
 This is done by comparing the calculated t value with the critical t value at the 5% or 1% level of significance (95% or 99% confidence).
 As a rule of thumb for large samples and a two-tailed test, the critical t value is about 1.96 at the 5% level and about 2.58 at the 1% level (2.33 is the one-tailed 1% value). For small samples, use the t table with n - 2 degrees of freedom.
Inference about the Slope:
t Test
 t test for a population slope
 Is there a linear relationship between X and Y?
 Null and alternative hypotheses
H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)
 Test statistic
$t = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad \text{d.f.} = n - 2$
where:
b1 = regression slope coefficient
β1 = hypothesized slope
Sb1 = standard error of the slope
Inferences about the Slope:
t Test Example
H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
               Coefficients    Standard Error    t Stat     P-value
Intercept      98.24833        58.03348          1.69296    0.12892
Square Feet    0.10977         0.03297           3.32938    0.01039
Here b1 = 0.10977, Sb1 = 0.03297 and t = 3.32938:
$t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938$
Inferences about the Slope:
t Test Example
(continued)
H0: β1 = 0
H1: β1 ≠ 0
From Excel output: b1 = 0.10977, Sb1 = 0.03297, t Stat = 3.32938, P-value = 0.01039
Test Statistic: t = 3.329
Decision: Reject H0 (the P-value of 0.01039 is below 0.05)
Conclusion: There is sufficient evidence that square footage affects house price
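The t test example above can be reproduced from the raw sample. The sketch below assumes the standard textbook formulas, which these slides do not show: S_YX = sqrt(SSE / (n - 2)) for the standard error of the estimate, and Sb1 = S_YX / sqrt(Σ(Xi − X̄)²) for the standard error of the slope. The critical value 2.306 (two-tailed, 5%, 8 degrees of freedom) is taken from a t table rather than computed.

```python
import math

# House price sample from the slides: Y in $1000s, X in square feet.
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n
sum_xx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / sum_xx
b0 = y_bar - b1 * x_bar

# Error sum of squares and standard error of the estimate (n - 2 d.f.).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, prices))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic for H0: beta1 = 0.
s_b1 = s_yx / math.sqrt(sum_xx)
t_stat = (b1 - 0) / s_b1

T_CRITICAL_5PCT_8DF = 2.306  # table value, assumed rather than computed
reject_h0 = abs(t_stat) > T_CRITICAL_5PCT_8DF
print(f"Sb1 = {s_b1:.5f}, t = {t_stat:.5f}, reject H0: {reject_h0}")
```

The result should match the Excel columns "Standard Error" and "t Stat" for Square Feet, up to rounding.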
How to compute Standard Error
$t = \frac{b_1 - \beta_1}{S_{b_1}}$
F-Test / ANOVA for Significance
 The F-test for significance shows whether or not the overall model is statistically significant.
 In multiple regression, some variables may be significant and others not, as measured by their t-statistics.
 However, the t-statistics say nothing about the overall model.
 To check whether the overall model is statistically significant, we use the F test / ANOVA (analysis of variance).
F-Test for Significance
 If the calculated F value is greater than the critical F value (about 4 as a rule of thumb for one independent variable and a reasonably large sample; the exact value depends on the degrees of freedom), the overall model is statistically significant and can be used for prediction.
 If the P-value of the F-statistic is less than the chosen significance level (5% or 1%), the overall model is considered statistically significant.
Excel Output
From the ANOVA table of the house price regression:
$F = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848$
with 1 and 8 degrees of freedom. Significance F = 0.01039 is the P-value for the F-test.
Measures of Variation
 Total variation is made up of two parts:
$SST = SSR + SSE$
Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares
$SST = \sum (Y_i - \bar{Y})^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
where:
$\bar{Y}$ = Average value of the dependent variable
$Y_i$ = Observed values of the dependent variable
$\hat{Y}_i$ = Predicted value of Y for the given $X_i$ value
 SST = total sum of squares
 Measures the variation of the Yi values around their
mean Y
 SSR = regression sum of squares
 Explained variation attributable to the relationship
between X and Y
 SSE = error sum of squares
 Variation attributable to factors other than the
relationship between X and Y
(continued)
Measures of Variation
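The three measures of variation can be computed directly from the house price sample to confirm that SST = SSR + SSE. This is a minimal sketch in plain Python; the variable names are illustrative assumptions.

```python
# House price sample from the slides: Y in $1000s, X in square feet.
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least squares coefficients for the fitted line.
sum_xx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / sum_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in sqft]

sst = sum((y - y_bar) ** 2 for y in prices)               # total variation
ssr = sum((f - y_bar) ** 2 for f in fitted)               # explained variation
sse = sum((y - f) ** 2 for y, f in zip(prices, fitted))   # unexplained variation

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

The values should match the SS column of the ANOVA table in the Excel output (Total, Regression and Residual rows).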
F-Test for Significance
 F Test statistic:
$F = \frac{MSR}{MSE}$
where
$MSR = \frac{SSR}{k}, \qquad MSE = \frac{SSE}{n - k - 1}$
where F follows an F distribution with k numerator and (n - k - 1) denominator degrees of freedom
(k = the number of independent variables in the regression model)
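As a small worked check of the F formula, the sketch below recomputes F from the sums of squares reported in the two ANOVA tables in these slides (the house price model with k = 1 and the multiple regression model with k = 2). The function name `f_statistic` is an illustrative assumption.

```python
def f_statistic(ssr, sse, n, k):
    """Return (F, numerator d.f., denominator d.f.) from the sums of squares."""
    msr = ssr / k              # mean square regression
    mse = sse / (n - k - 1)    # mean square error
    return msr / mse, k, n - k - 1


# House price model: 10 observations, 1 independent variable.
f_simple, df1_simple, df2_simple = f_statistic(18934.9348, 13665.5652, n=10, k=1)
# Multiple regression output: 15 observations, 2 independent variables.
f_multi, df1_multi, df2_multi = f_statistic(29460.027, 27033.306, n=15, k=2)

print(f"simple:   F = {f_simple:.4f} with {df1_simple} and {df2_simple} d.f.")
print(f"multiple: F = {f_multi:.4f} with {df1_multi} and {df2_multi} d.f.")
```

The P-value (Significance F) still has to come from an F table or statistical software; this sketch only reproduces the F column.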
Coefficient of Determination, r2
 The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
 The coefficient of determination is also called r-squared and is denoted as r2
$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
note: $0 \le r^2 \le 1$
Examples of Approximate r2 Values
[Figure: two scatter plots of Y against X with r2 = 1]
Perfect linear relationship between X and Y: 100% of the variation in Y is explained by variation in X
[Figure: two scatter plots of Y against X with 0 < r2 < 1]
Weaker linear relationships between X and Y: Some but not all of the variation in Y is explained by variation in X
[Figure: scatter plot of Y against X with r2 = 0]
No linear relationship between X and Y: The value of Y does not depend on X. (None of the variation in Y is explained by variation in X)
Excel Output
From the regression output for the house price model:
$r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082$
58.08% of the variation in house prices is explained by variation in square feet
Excel Output
From the regression output for the house price model, the standard error of the estimate is
$S_{YX} = 41.33032$
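The summary measures reported by Excel (R Square, Multiple R and Standard Error) can be recovered from the ANOVA sums of squares. The sketch below assumes the usual definitions, with S_YX = sqrt(SSE / (n - k - 1)) and Multiple R taken as the square root of r2; the variable names are illustrative.

```python
import math

# Sums of squares from the ANOVA table of the house price model.
ssr, sse, sst = 18934.9348, 13665.5652, 32600.5
n, k = 10, 1  # observations and number of independent variables

r_squared = ssr / sst                   # share of variation explained
multiple_r = math.sqrt(r_squared)       # reported by Excel as "Multiple R"
s_yx = math.sqrt(sse / (n - k - 1))     # standard error of the estimate

print(f"r2 = {r_squared:.5f}, Multiple R = {multiple_r:.5f}, S_YX = {s_yx:.5f}")
```

S_YX is in the units of Y, so here it is roughly 41.33 thousand dollars of typical deviation of observed prices around the fitted line.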
Estimating a Multiple Linear
Regression Equation
 Excel will be used to generate the coefficients
and measures of goodness of fit for multiple
regression
 Excel:
 Tools / Data Analysis... / Regression
 Stata:
 Type the command: reg y x1 x2 (y is the dependent variable; x1 and x2 are the independent variables)
Multiple Regression Output

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA
              df    SS           MS           F          Significance F
Regression    2     29460.027    14730.013    6.53861    0.01201
Residual      12    27033.306    2252.776
Total         14    56493.333

               Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
Intercept      306.52619       114.25389         2.68285     0.01993    57.58835     555.46404
Price          -24.97509       10.83213          -2.30565    0.03979    -48.57626    -1.37392
Advertising    74.13096        25.96732          2.85478     0.01449    17.55303     130.70888

$\widehat{\text{Sales}} = 306.526 - 24.975\,(\text{Price}) + 74.131\,(\text{Advertising})$
Adjusted r2
 r2 never decreases when a new X variable is
added to the model
 This can be a disadvantage when comparing
models
 What is the net effect of adding a new variable?
 We lose a degree of freedom when a new X
variable is added
 Did the new X variable add enough
explanatory power to offset the loss of one
degree of freedom?
Adjusted r2 (continued)
 Shows the proportion of variation in Y explained by all X variables adjusted for the number of X variables used
$r^2_{adj} = 1 - \left[ \left(1 - r^2_{Y.12..k}\right) \left( \frac{n - 1}{n - k - 1} \right) \right]$
(where n = sample size, k = number of independent variables)
 Penalize excessive use of unimportant independent variables
 Smaller than r2
 Useful in comparing among models
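The adjusted r2 formula is easy to verify against the two regression outputs in these slides. The following is a minimal sketch; the function name `adjusted_r_squared` is an illustrative assumption.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust r2 for sample size n and number of independent variables k."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Multiple regression output: r2 = 0.52148, n = 15, k = 2.
adj_multi = adjusted_r_squared(0.52148, n=15, k=2)
# House price model: r2 = 0.58082, n = 10, k = 1.
adj_simple = adjusted_r_squared(0.58082, n=10, k=1)

print(f"multiple: {adj_multi:.5f}, simple: {adj_simple:.5f}")
```

Both results should agree with the "Adjusted R Square" rows of the Excel output, and each is smaller than the corresponding r2, as the slide states.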
Adjusted r2 (continued)
From the multiple regression output (15 observations, 2 independent variables):
$r^2_{adj} = 0.44172$
44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables
Residuals in Multiple Regression
Two variable model:
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
[Figure: fitted regression plane over the X1 and X2 axes, showing a sample observation Yi at (x1i, x2i) and its fitted value Ŷi]
Residual = $e_i = (Y_i - \hat{Y}_i)$
The best fit equation, Ŷ, is found by minimizing the sum of squared errors, $\sum e^2$
More Related Content

Similar to lecture No. 3a.ppt

Regression analysis
Regression analysisRegression analysis
Regression analysisRavi shankar
 
Newbold_chap12.ppt
Newbold_chap12.pptNewbold_chap12.ppt
Newbold_chap12.pptcfisicaster
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Neeraj Bhandari
 
Lesson07_new
Lesson07_newLesson07_new
Lesson07_newshengvn
 
Presentasi Tentang Regresi Linear
Presentasi Tentang Regresi LinearPresentasi Tentang Regresi Linear
Presentasi Tentang Regresi Lineardessybudiyanti
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regressiondessybudiyanti
 
Group 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdfGroup 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdffahlevet40
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptxnikshaikh786
 
regression and correlation
regression and correlationregression and correlation
regression and correlationPriya Sharma
 
TTests.ppt
TTests.pptTTests.ppt
TTests.pptMUzair21
 
regressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdfregressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdfAdikesavaperumal
 

Similar to lecture No. 3a.ppt (20)

Chap12 simple regression
Chap12 simple regressionChap12 simple regression
Chap12 simple regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Newbold_chap12.ppt
Newbold_chap12.pptNewbold_chap12.ppt
Newbold_chap12.ppt
 
Simple Linear Regression
Simple Linear RegressionSimple Linear Regression
Simple Linear Regression
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
 
Lesson07_new
Lesson07_newLesson07_new
Lesson07_new
 
Stat sample test ch 10
Stat sample test ch 10Stat sample test ch 10
Stat sample test ch 10
 
Chapter 12
Chapter 12Chapter 12
Chapter 12
 
Presentasi Tentang Regresi Linear
Presentasi Tentang Regresi LinearPresentasi Tentang Regresi Linear
Presentasi Tentang Regresi Linear
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
Group 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdfGroup 5 - Regression Analysis.pdf
Group 5 - Regression Analysis.pdf
 
An Overview of Simple Linear Regression
An Overview of Simple Linear RegressionAn Overview of Simple Linear Regression
An Overview of Simple Linear Regression
 
chap12.pptx
chap12.pptxchap12.pptx
chap12.pptx
 
Module 2_ Regression Models..pptx
Module 2_ Regression Models..pptxModule 2_ Regression Models..pptx
Module 2_ Regression Models..pptx
 
regression and correlation
regression and correlationregression and correlation
regression and correlation
 
SPSS
SPSSSPSS
SPSS
 
TTests.ppt
TTests.pptTTests.ppt
TTests.ppt
 
Regression
RegressionRegression
Regression
 
33851.ppt
33851.ppt33851.ppt
33851.ppt
 
regressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdfregressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdf
 

Recently uploaded

Darshan Hiranandani [News About Next CEO].pdf
Darshan Hiranandani [News About Next CEO].pdfDarshan Hiranandani [News About Next CEO].pdf
Darshan Hiranandani [News About Next CEO].pdfShashank Mehta
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfJos Voskuil
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCRashishs7044
 
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxThe-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxmbikashkanyari
 
TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024Adnet Communications
 
Cyber Security Training in Office Environment
Cyber Security Training in Office EnvironmentCyber Security Training in Office Environment
Cyber Security Training in Office Environmentelijahj01012
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessSeta Wicaksana
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
 
Chapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditChapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditNhtLNguyn9
 
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort ServiceCall US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Servicecallgirls2057
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCRashishs7044
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Americas Got Grants
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...ictsugar
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menzaictsugar
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...ssuserf63bd7
 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyotictsugar
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Kirill Klimov
 
Guide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFGuide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFChandresh Chudasama
 

Recently uploaded (20)

Darshan Hiranandani [News About Next CEO].pdf
Darshan Hiranandani [News About Next CEO].pdfDarshan Hiranandani [News About Next CEO].pdf
Darshan Hiranandani [News About Next CEO].pdf
 
Digital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdfDigital Transformation in the PLM domain - distrib.pdf
Digital Transformation in the PLM domain - distrib.pdf
 
Corporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information TechnologyCorporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information Technology
 
8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR8447779800, Low rate Call girls in Rohini Delhi NCR
8447779800, Low rate Call girls in Rohini Delhi NCR
 
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxThe-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
 
TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024
 
Cyber Security Training in Office Environment
Cyber Security Training in Office EnvironmentCyber Security Training in Office Environment
Cyber Security Training in Office Environment
 
Organizational Structure Running A Successful Business
Organizational Structure Running A Successful BusinessOrganizational Structure Running A Successful Business
Organizational Structure Running A Successful Business
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
 
Chapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditChapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal audit
 
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort ServiceCall US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
Call US-88OO1O2216 Call Girls In Mahipalpur Female Escort Service
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...
 
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...Global Scenario On Sustainable  and Resilient Coconut Industry by Dr. Jelfina...
Global Scenario On Sustainable and Resilient Coconut Industry by Dr. Jelfina...
 
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu MenzaYouth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
Youth Involvement in an Innovative Coconut Value Chain by Mwalimu Menza
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...
 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyot
 
Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024Flow Your Strategy at Flight Levels Day 2024
Flow Your Strategy at Flight Levels Day 2024
 
Guide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFGuide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDF
 

lecture No. 3a.ppt

  • 1. Ph.D Islamia College Peshawar Chap 12-1 Lecture No 3 Simple Linear Regression
  • 2. Ph.D Islamia College Peshawar Chap 12-2 Chapter Goals After completing this chapter, you should be able to:  Explain the simple linear regression model  Obtain and interpret the simple linear regression equation for a set of data  Various Test in Regression Analysis such as T-test, F- test etc  Explain measures of variation and determine whether the independent variable is significant
  • 3. Ph.D Islamia College Peshawar Chap 12-3 Introduction to Regression Analysis  Regression analysis is used to:  Predict the value of a dependent variable based on the value of at least one independent variable  Explain the impact of changes in an independent variable on the dependent variable Dependent variable: the variable we wish to explain Independent variable: the variable used to explain the dependent variable
  • 4. Ph.D Islamia College Peshawar Chap 12-4 Simple Linear Regression Model  Only one independent variable, X  Relationship between X and Y is described by a linear function  Changes in Y are assumed to be caused by changes in X
  • 5. Simple Linear Regression Model. The population regression model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i$ is the dependent variable, $\beta_0$ is the population Y intercept, $\beta_1$ is the population slope coefficient, $X_i$ is the independent variable, and $\varepsilon_i$ is the random error term. $\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ is the random error component.
  • 6. Simple Linear Regression Model (continued). [Figure: the line $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ on X and Y axes, with intercept = $\beta_0$ and slope = $\beta_1$; at a given $X_i$, the random error $\varepsilon_i$ separates the observed value of Y from the predicted value of Y.]
  • 7. Simple Linear Regression Equation. The simple linear regression equation provides an estimate of the population regression line: $\hat{Y}_i = b_0 + b_1 X_i$, where $\hat{Y}_i$ is the estimated (or predicted) Y value for observation i, $b_0$ is the estimate of the regression intercept, $b_1$ is the estimate of the regression slope, and $X_i$ is the value of X for observation i. The individual random error terms $e_i$ have a mean of zero.
  • 8. Types of Relationships. [Figure: four scatter plots of Y against X illustrating linear relationships and curvilinear relationships.]
  • 9. Types of Relationships (continued). [Figure: four scatter plots of Y against X illustrating strong relationships and weak relationships.]
  • 10. Types of Relationships (continued). [Figure: two scatter plots of Y against X illustrating no relationship.]
  • 11. The Multiple Regression Model. Idea: examine the linear relationship between one dependent variable (Y) and two or more independent variables ($X_1, \ldots, X_k$). Multiple regression model with k independent variables: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$, where $\beta_0$ is the Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and $\varepsilon_i$ is the random error.
  • 12. Multiple Regression Equation. The coefficients of the multiple regression model are estimated using sample data. Multiple regression equation with k independent variables: $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$, where $\hat{Y}_i$ is the estimated (or predicted) value of Y, $b_0$ is the estimated intercept, and $b_1, \ldots, b_k$ are the estimated slope coefficients. In this chapter we will always use Excel to obtain the regression slope coefficients and other regression summary measures.
  • 13. Important components of Regression: intercept coefficient; slope coefficient(s); t-value; F-value; R-square; adjusted R-square.
  • 14. Finding the Least Squares Equation. The coefficients b0 and b1, and other regression results in this chapter, will be found using Excel and Stata software. Formulas are shown in the text at the end of the chapter for those who are interested.
  • 15. Interpretation of the Slope and the Intercept. b0 is the estimated average value of Y when the value of X is zero. b1 is the estimated change in the average value of Y as a result of a one-unit change in X.
  • 16. Simple Linear Regression Example. A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet). A random sample of 10 houses is selected. Dependent variable (Y) = house price in $1000s. Independent variable (X) = square feet.
  • 17. Sample Data for House Price Model
      House Price in $1000s (Y)    Square Feet (X)
      245                          1400
      312                          1600
      279                          1700
      308                          1875
      199                          1100
      219                          1550
      405                          2350
      324                          2450
      319                          1425
      255                          1700
  • 18. Graphical Presentation. House price model: scatter plot. [Figure: house price ($1000s), axis from 0 to 450, against square feet, axis from 0 to 3000.]
  • 19. Regression Using Excel: Tools / Data Analysis / Regression
  • 20. Excel Output
      Regression Statistics
        Multiple R           0.76211
        R Square             0.58082
        Adjusted R Square    0.52842
        Standard Error       41.33032
        Observations         10
      ANOVA
                      df    SS            MS            F          Significance F
        Regression    1     18934.9348    18934.9348    11.0848    0.01039
        Residual      8     13665.5652    1708.1957
        Total         9     32600.5000
      Coefficients
                       Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
        Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
        Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580
      The regression equation is: house price = 98.24833 + 0.10977 (square feet)
  • 21. Graphical Presentation. House price model: scatter plot and regression line, house price = 98.24833 + 0.10977 (square feet). Slope = 0.10977, intercept = 98.248. [Figure: house price ($1000s) against square feet with the fitted line.]
  • 22. Interpretation of the Intercept, b0. b0 is the estimated average value of Y when the value of X is zero (if X = 0 is in the range of observed X values). Here, no houses had 0 square feet, so b0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet. Regression equation: house price = 98.24833 + 0.10977 (square feet).
  • 23. Interpretation of the Slope Coefficient, b1. b1 measures the estimated change in the average value of Y as a result of a one-unit change in X. Here, b1 = 0.10977 tells us that the value of a house increases by 0.10977 ($1000) = $109.77, on average, for each additional square foot of size. Regression equation: house price = 98.24833 + 0.10977 (square feet).
  • 24. Predictions using Regression Analysis. Predict the price for a house with 2000 square feet: house price = 98.25 + 0.1098 (sq. ft.) = 98.25 + 0.1098 (2000) = 317.85. The predicted price for a house with 2000 square feet is 317.85 ($1,000s) = $317,850.
  • 25. Interpolation vs. Extrapolation. When using a regression model for prediction, only predict within the relevant range of data (the relevant range for interpolation). Do not try to extrapolate beyond the range of observed X's. [Figure: house price ($1000s) against square feet, with the relevant range of observed X values marked.]
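The prediction and the interpolation rule can be sketched together in a few lines of Python. This is only an illustration: the function name `predict_price` and the choice to raise an error outside the observed range are assumptions, not something the slides prescribe. The coefficients are the ones reported in the Excel output, and the range limits are the smallest and largest square footage in the 10-house sample.

```python
# Fitted coefficients from the Excel output (house price in $1000s).
B0 = 98.24833
B1 = 0.10977

# Smallest and largest square footage observed in the sample.
X_MIN, X_MAX = 1100, 2450


def predict_price(square_feet):
    """Return the predicted price in $1000s, refusing to extrapolate."""
    if not X_MIN <= square_feet <= X_MAX:
        raise ValueError(
            f"{square_feet} sq. ft. is outside the observed range "
            f"[{X_MIN}, {X_MAX}]; the fitted line is not reliable there"
        )
    return B0 + B1 * square_feet


price_2000 = predict_price(2000)
print(round(price_2000, 2))
```

With the unrounded coefficients the prediction comes to about 317.79, slightly below the 317.85 on the slide, which uses the rounded values 98.25 and 0.1098.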
  • 26. Least Squares Method. b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared differences between $Y_i$ and $\hat{Y}_i$: $\min \sum (Y_i - \hat{Y}_i)^2 = \min \sum (Y_i - (b_0 + b_1 X_i))^2$
  • 27. 27
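  • 28. The process of differentiation yields the following equations for estimating β1 and β2: $\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$ (3.1.4) and $\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i$ (3.1.5), where n is the sample size. These simultaneous equations are known as the normal equations. (In this notation $\hat{\beta}_1$ is the intercept and $\hat{\beta}_2$ is the slope, corresponding to b0 and b1 on the other slides.) Solving the normal equations simultaneously, we obtain $\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$ and $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$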
  • 29. where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y and where we define $x_i = (X_i - \bar{X})$ and $y_i = (Y_i - \bar{Y})$. Henceforth we adopt the convention of letting the lowercase letters denote deviations from mean values.
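The deviation-form solution of the normal equations is a standard result: the slope is the sum of $x_i y_i$ divided by the sum of $x_i^2$, and the intercept is $\bar{Y}$ minus the slope times $\bar{X}$. As a rough check, the sketch below applies it to the house price sample. The variable names are illustrative; the data are the ten observations from the sample data slide.

```python
# House price sample: price in $1000s (Y) and size in square feet (X).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Lowercase convention: deviations from the sample means.
x_dev = [x - x_bar for x in sqft]
y_dev = [y - y_bar for y in prices]

# Slope = sum(x_i * y_i) / sum(x_i ** 2); intercept = Y-bar - slope * X-bar.
slope = sum(x * y for x, y in zip(x_dev, y_dev)) / sum(x * x for x in x_dev)
intercept = y_bar - slope * x_bar

print(round(intercept, 5), round(slope, 5))
```

The result should agree with the Excel coefficients (intercept 98.24833, slope 0.10977) up to rounding.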
  • 30. T-Tests. The t-test checks whether the sample beta provides statistically significant evidence about the population beta. A significant result indicates that there is sufficient evidence that the X variable (square footage) affects the Y variable (house price). The test compares the calculated t value with the critical t value at the 5% or 1% significance level (95% or 99% confidence). As a large-sample rule of thumb, the two-tailed critical value is about 1.96 at the 5% level and about 2.58 at the 1% level (2.33 is the one-tailed 1% value). For small samples, use the t distribution with n - 2 degrees of freedom, for example 2.306 at the 5% level with 8 degrees of freedom.
  • 31. Inference about the Slope: t Test. t test for a population slope: is there a linear relationship between X and Y? Null and alternative hypotheses: H0: β1 = 0 (no linear relationship); H1: β1 ≠ 0 (linear relationship does exist). Test statistic: $t = \frac{b_1 - \beta_1}{S_{b_1}}$ with d.f. = n - 2, where $b_1$ = regression slope coefficient, $\beta_1$ = hypothesized slope, and $S_{b_1}$ = standard error of the slope.
  • 32. Inferences about the Slope: t Test Example. H0: β1 = 0; H1: β1 ≠ 0. From Excel output: Intercept: coefficient 98.24833, standard error 58.03348, t Stat 1.69296, P-value 0.12892. Square Feet: coefficient 0.10977 ($b_1$), standard error 0.03297 ($S_{b_1}$), t Stat 3.32938, P-value 0.01039. $t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938$
  • 33. Inferences about the Slope: t Test Example (continued). H0: β1 = 0; H1: β1 ≠ 0. Test statistic from Excel output: t = 3.329 (Square Feet: coefficient 0.10977, standard error 0.03297, P-value 0.01039). Decision: reject H0. Conclusion: there is sufficient evidence that square footage affects house price.
  • 34. How to compute Standard Error. The standard error of the slope, $S_{b_1}$, is the denominator of the test statistic $t = \frac{b_1 - \beta_1}{S_{b_1}}$
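The transcript does not preserve the formula for the standard error itself. A common textbook expression, assumed here rather than taken from the slide, is $S_{b_1} = S_{YX} / \sqrt{\sum (X_i - \bar{X})^2}$ with $S_{YX} = \sqrt{SSE / (n - 2)}$. The sketch below applies it to the house price sample and forms the t statistic for H0: β1 = 0; all variable names are illustrative.

```python
import math

# House price sample: price in $1000s (Y) and size in square feet (X).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least squares estimates of slope and intercept.
ssx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / ssx
b0 = y_bar - b1 * x_bar

# Error sum of squares and standard error of the estimate (n - 2 d.f.).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, prices))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope (assumed textbook formula) and t statistic.
s_b1 = s_yx / math.sqrt(ssx)
t_stat = (b1 - 0) / s_b1

print(round(s_yx, 5), round(s_b1, 5), round(t_stat, 5))
```

These values should match the Standard Error (41.33032), the slope's standard error (0.03297), and the t Stat (3.32938) in the Excel output up to rounding.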
  • 35. F-Test / ANOVA for Significance. The F-test for significance shows whether the overall model is statistically significant. In multiple regression, some variables may be significant and others not, as measured by their t-statistics. However, t-statistics say nothing about the overall model. To check whether the overall model is statistically significant, we use the F-test / ANOVA (analysis of variance).
  • 37. F-Test for Significance. If the calculated F value exceeds the critical F value (roughly 4 as a rule of thumb at the 5% level; the exact value depends on the degrees of freedom), the overall model is statistically significant and can be used for prediction. If the P-value of the F-statistic is less than the chosen significance level (5% or 1%), the overall model is considered statistically significant.
  • 38. Excel Output (same regression output as slide 20). From the ANOVA table: $F = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848$, with 1 and 8 degrees of freedom. Significance F = 0.01039 is the P-value for the F-test.
  • 39. Measures of Variation. Total variation is made up of two parts: $SST = SSR + SSE$, where $SST = \sum (Y_i - \bar{Y})^2$ is the total sum of squares, $SSR = \sum (\hat{Y}_i - \bar{Y})^2$ is the regression sum of squares, and $SSE = \sum (Y_i - \hat{Y}_i)^2$ is the error sum of squares. Here $\bar{Y}$ = average value of the dependent variable, $Y_i$ = observed values of the dependent variable, and $\hat{Y}_i$ = predicted value of Y for the given $X_i$ value.
  • 40. Measures of Variation (continued). SST = total sum of squares: measures the variation of the $Y_i$ values around their mean $\bar{Y}$. SSR = regression sum of squares: explained variation attributable to the relationship between X and Y. SSE = error sum of squares: variation attributable to factors other than the relationship between X and Y.
  • 41. F-Test for Significance. F test statistic: $F = \frac{MSR}{MSE}$, where $MSR = \frac{SSR}{k}$ and $MSE = \frac{SSE}{n - k - 1}$. F follows an F distribution with k numerator and (n - k - 1) denominator degrees of freedom (k = the number of independent variables in the regression model).
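As a small worked check of the F statistic, the sketch below plugs the sums of squares reported in the two Excel outputs of this deck into MSR = SSR / k and MSE = SSE / (n - k - 1). The helper name `f_statistic` is illustrative only.

```python
def f_statistic(ssr, sse, n, k):
    """Return (MSR, MSE, F) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse, msr / mse


# Simple regression of house price on square feet (n = 10, k = 1).
msr_house, mse_house, f_house = f_statistic(18934.9348, 13665.5652, n=10, k=1)

# Multiple regression of sales on price and advertising (n = 15, k = 2).
msr_sales, mse_sales, f_sales = f_statistic(29460.027, 27033.306, n=15, k=2)

print(round(f_house, 4), round(f_sales, 5))
```

Both values should reproduce the F column of the corresponding ANOVA tables (11.0848 and 6.53861) up to rounding.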
  • 43. Coefficient of Determination, r2. The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable. It is also called r-squared and is denoted as r2: $r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$. Note: $0 \le r^2 \le 1$.
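The decomposition SST = SSR + SSE and the ratio r2 = SSR / SST can be verified directly on the house price sample. The sketch below is a minimal illustration with made-up variable names; it refits the line by least squares and then computes each sum of squares from its definition.

```python
# House price sample: price in $1000s (Y) and size in square feet (X).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least squares fit.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / sum(
    (x - x_bar) ** 2 for x in sqft
)
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in sqft]

# Sums of squares from their definitions.
sst = sum((y - y_bar) ** 2 for y in prices)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(prices, fitted))

r_squared = ssr / sst

print(round(sst, 4), round(ssr, 4), round(sse, 4), round(r_squared, 5))
```

The three sums of squares should match the SS column of the ANOVA table, and the ratio should match R Square = 0.58082 up to rounding.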
  • 44. Examples of Approximate r2 Values. r2 = 1: perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X. [Figure: two scatter plots of Y against X, each with r2 = 1.]
  • 45. Examples of Approximate r2 Values. 0 < r2 < 1: weaker linear relationships between X and Y; some but not all of the variation in Y is explained by variation in X. [Figure: two scatter plots of Y against X.]
  • 46. Examples of Approximate r2 Values. r2 = 0: no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by variation in X). [Figure: scatter plot of Y against X with r2 = 0.]
  • 47. Excel Output (same regression output as slide 20). $r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082$: 58.08% of the variation in house prices is explained by variation in square feet.
  • 48. Excel Output (same regression output as slide 20). The Standard Error entry in the regression statistics is the standard error of the estimate: $S_{YX} = 41.33032$
  • 49. Estimating a Multiple Linear Regression Equation. Excel will be used to generate the coefficients and measures of goodness of fit for multiple regression. Excel: Tools / Data Analysis... / Regression. Stata: type the command reg y x1 x2
  • 50. Multiple Regression Output
      Regression Statistics
        Multiple R           0.72213
        R Square             0.52148
        Adjusted R Square    0.44172
        Standard Error       47.46341
        Observations         15
      ANOVA
                      df    SS           MS           F          Significance F
        Regression    2     29460.027    14730.013    6.53861    0.01201
        Residual      12    27033.306    2252.776
        Total         14    56493.333
      Coefficients
                       Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
        Intercept      306.52619       114.25389         2.68285     0.01993    57.58835     555.46404
        Price          -24.97509       10.83213          -2.30565    0.03979    -48.57626    -1.37392
        Advertising    74.13096        25.96732          2.85478     0.01449    17.55303     130.70888
      The regression equation is: Sales = 306.526 - 24.975 (Price) + 74.131 (Advertising)
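To show how the fitted multiple regression equation is used, the sketch below evaluates it for one combination of inputs. The input values (a price of 5.50 and advertising of 3.5) are hypothetical and chosen only for illustration; the transcript does not give the underlying data or the units of either variable. The coefficients are those in the output above.

```python
# Estimated coefficients from the multiple regression output.
B0 = 306.52619
B_PRICE = -24.97509
B_ADVERTISING = 74.13096


def predict_sales(price, advertising):
    """Evaluate Sales = b0 + b1 * Price + b2 * Advertising."""
    return B0 + B_PRICE * price + B_ADVERTISING * advertising


# Hypothetical inputs, not taken from the slides.
example_sales = predict_sales(price=5.50, advertising=3.5)
print(round(example_sales, 2))
```

The negative price coefficient lowers predicted sales as price rises, while the positive advertising coefficient raises them, holding the other variable fixed.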
  • 51. Adjusted r2. r2 never decreases when a new X variable is added to the model. This can be a disadvantage when comparing models. What is the net effect of adding a new variable? We lose a degree of freedom when a new X variable is added. Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?
  • 52. Adjusted r2 (continued). Shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used: $r^2_{adj} = 1 - \left[ (1 - r^2_{Y.12..k}) \left( \frac{n - 1}{n - k - 1} \right) \right]$ (where n = sample size, k = number of independent variables). Penalizes excessive use of unimportant independent variables. Smaller than r2. Useful in comparing among models.
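The adjusted r2 formula is easy to check against both Excel outputs in this deck. The sketch below is a minimal illustration; the function name `adjusted_r_squared` is made up, and the inputs are the rounded R Square values as reported, so the results agree with Excel only to about four decimal places.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# House price model: r2 = 0.58082, n = 10 observations, k = 1 variable.
adj_house = adjusted_r_squared(0.58082, n=10, k=1)

# Sales model: r2 = 0.52148, n = 15 observations, k = 2 variables.
adj_sales = adjusted_r_squared(0.52148, n=15, k=2)

print(round(adj_house, 4), round(adj_sales, 4))
```

Both results fall below the corresponding r2, as the slide states, because the factor (n - 1) / (n - k - 1) exceeds 1 whenever k is at least 1.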
  • 53. Adjusted r2 (continued). Multiple regression output (same as slide 50). $r^2_{adj} = 0.44172$: 44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables.
  • 54. Residuals in Multiple Regression. Two variable model: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$. For a sample observation $Y_i$ at $(x_{1i}, x_{2i})$, the residual is $e_i = (Y_i - \hat{Y}_i)$. The best fit equation, $\hat{Y}$, is found by minimizing the sum of squared errors, $\sum e^2$. [Figure: fitted plane over the X1 and X2 axes with a sample observation and its residual.]