Multiple Linear Regression
The following case study and data set are taken from the book:
Chatterjee, S. and Hadi, A.S., 2015. Regression analysis by example. John Wiley & Sons.
The authors have used the data from a study in Industrial Psychology (Management).
An exploratory study was carried out in a large financial organization in an attempt to explain
specific supervisor characteristics/traits and overall satisfaction with supervisors as perceived by
the employees. Data were collected from 30 departments in the organization. Table 1 below
provides details of the variables (all can be treated as continuous) used in the study.
Table 1.
Variable Description
Y Overall rating of job being done by supervisor (0-100)
X1 Handles employee complaints (0-100)
X2 Does not allow special privileges (0-100)
X3 Opportunity to learn new things (0-100)
X4 Raises based on performance (0-100)
X5 Too critical of employee’s poor performance (0-100)
X6 Rate of advancing to better job (0-100)
A multiple linear regression model is fitted on the data, and the output is presented in Table 2.
Answer the following questions:
a) Write down the equation for the full MLR model.
b) Use the model to predict a supervisor’s performance (overall rating of job being done)
when his scores for X1 – X6 for a department are all 50.
c) Explain the coefficient value ‘0.6132’ and standard error ‘0.1609’ corresponding to X1
given in the output.
d) Explain R2 for this model.
e) Find the missing values ‘a’, ‘b’, ‘c’ and ‘d’ from the output.
f) Is the model significant in predicting or explaining Y? Justify with proper explanation.
g) Do you find any individual factor being significant for Y? Justify with proper explanation.
• For any test of hypothesis, consider 5% as the level of significance if nothing
is specifically mentioned.
Table 2.
Model answers:
a) The full multiple linear regression (MLR) equation is given by:
Y = 10.787 + 0.613 X1 − 0.073 X2 + 0.320 X3 + 0.082 X4 + 0.038 X5 − 0.217 X6    (1)
b) Y = 10.787 + 0.613 × 50 − 0.073 × 50 + 0.320 × 50 + 0.082 × 50 + 0.038 × 50
− 0.217 × 50 = 48.937
The supervisor’s predicted overall rating of job being done is 48.937.
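The arithmetic in (b) can be checked with a short Python sketch. It uses the rounded coefficients from equation (1); the dictionary keys and the helper name `predict` are illustrative choices for this sketch, not part of the original output.

```python
# Rounded coefficients from equation (1); names are labels chosen for this sketch.
intercept = 10.787
coefficients = {
    "X1": 0.613, "X2": -0.073, "X3": 0.320,
    "X4": 0.082, "X5": 0.038, "X6": -0.217,
}

def predict(scores):
    """Return the fitted value of Y for a dict of predictor scores."""
    return intercept + sum(coefficients[name] * scores[name] for name in coefficients)

# All six predictor scores set to 50, as in question (b).
all_fifty = {name: 50 for name in coefficients}
y_hat = predict(all_fifty)
print(round(y_hat, 3))
```

With the unrounded coefficients from Table 2, the same calculation gives about 48.963; the small difference comes only from rounding the coefficients to three decimals.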
c) The value of the partial regression coefficient for X1 is 0.6132. It means
that if the score X1 (handling employee complaints) increases by 1 unit, then Y
(supervisor’s overall rating of job being done) increases by 0.613 units on average,
when all other predictors (X2–X6) are held constant. The standard error of the
estimate 0.6132 is 0.1609, which indicates the degree of imprecision or variability
involved in the estimation: the lower the standard error, the more precise the
estimated value.
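As a rough illustration of (c), the sketch below shows the change in the prediction when X1 rises by one point and the other scores stay fixed, and relates the coefficient to its standard error. The "two standard errors" band is only a common rule of thumb used here for illustration, not an exact confidence interval, and the variable names are assumptions of this sketch.

```python
# Estimate and standard error for X1, as reported in Table 2.
b1 = 0.613187608
se_b1 = 0.160983115

# Raising X1 from 50 to 51 with X2..X6 unchanged shifts the prediction by b1.
change_in_y = b1 * (51 - 50)

# Ratio of the estimate to its standard error (the 't Stat' column in Table 2).
t_stat = b1 / se_b1

# Rule-of-thumb band of two standard errors around the estimate (illustrative only).
rough_band = (b1 - 2 * se_b1, b1 + 2 * se_b1)

print(round(change_in_y, 3), round(t_stat, 3))
print(tuple(round(v, 3) for v in rough_band))
```

A smaller standard error would narrow this band, which is the sense in which the estimate becomes more precise.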
d) R² = SSR/SST is a measure of adequacy for a fitted linear regression model. R²
indicates the proportion (%) of the variability in Y that is explained by the linear
regression model. In this case, R² is 0.7326, which means that 73.26% of the total
variation in Y (supervisor’s overall rating of job being done) is explained by
the linear regression model given in equation (1), and the remaining
100 − 73.26 = 26.74% of the total variability in Y could not be explained by the
model. Based on the R² value alone, we might conclude that the model fit is not fully
adequate, but in practice other measures (Adjusted R², PRESS, AIC, BIC) also need to
be checked before commenting on model adequacy.

Table 2 (SUMMARY OUTPUT)

Regression Statistics
Multiple R           0.855921721
R Square             0.732601993
Adjusted R Square    0.662845991
Standard Error       7.067993765
Observations         30

ANOVA
              df    SS             MS             F    Significance F
Regression    6     b              c              d    1.24041E-05
Residual      a     1149.000325    49.95653586
Total         29    4296.966667

             Coefficients     Standard Error    t Stat          P-value
Intercept    10.78707639      11.58925724       0.930782375     0.361633721
X1           0.613187608      0.160983115       3.809018158     0.000902868
X2           -0.073050143     0.13572469        -0.53822295     0.595593921
X3           0.320332116      0.168520319       1.900851595     0.069925346
X4           0.081732134      0.221477677       0.369031022     0.715480088
X5           0.038381447      0.146995442       0.261106377     0.796334264
X6           -0.217056682     0.178209471       -1.217986229    0.235577049
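The "Regression Statistics" entries can be reproduced from the sums of squares in the ANOVA part of Table 2. The sketch below assumes the usual definitions R² = 1 − SSE/SST (equivalent to SSR/SST) and Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), with n = 30 observations and p = 6 predictors; the variable names are chosen for this sketch.

```python
import math

# Values read from the ANOVA part of Table 2.
ss_residual = 1149.000325   # SSE
ss_total = 4296.966667      # SST
ms_residual = 49.95653586   # MSE
n, p = 30, 6                # observations, predictors

r_squared = 1 - ss_residual / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
multiple_r = math.sqrt(r_squared)
standard_error = math.sqrt(ms_residual)   # residual standard error

print(round(r_squared, 4), round(adjusted_r_squared, 4))
print(round(multiple_r, 4), round(standard_error, 4))
```

Adjusted R² is lower than R² because it penalises the six predictors relative to the 30 observations, which is one reason it is listed among the additional measures to check.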
e) a = Residual df = Total df - Regression df = 29-6 = 23
b = SS Regression = SS Total – SS Residual = 4296.967 – 1149.000 = 3147.967
c = MS Regression = (SS Regression) / (Regression df) = 3147.967/6 = 524.661
d = F statistic value = (MS Regression)/(MS Residual) = 524.661/49.956 = 10.502
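The four missing ANOVA entries follow mechanically from the known cells of Table 2, as this small sketch shows. The variable names a, b, c and d mirror the placeholders in the table; everything else is named for illustration.

```python
# Known cells of the ANOVA table (Table 2).
total_df = 29
regression_df = 6
ss_total = 4296.966667
ss_residual = 1149.000325
ms_residual = 49.95653586

a = total_df - regression_df      # residual degrees of freedom
b = ss_total - ss_residual        # regression sum of squares
c = b / regression_df             # regression mean square
d = c / ms_residual               # F statistic

print(a, round(b, 3), round(c, 3), round(d, 3))

# Consistency check: the residual mean square should equal SS Residual / a.
print(round(ss_residual / a, 4))
```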
f) To check if the model is significant in predicting or explaining Y, we carry out the
following test of hypothesis:
H0: β1 = β2 = ⋯ = β6 = 0 vs. H1: at least one βj ≠ 0 (j = 1, …, 6)
The test statistic for this test is F = MSR/MSE, which follows an F distribution with
6 and 23 degrees of freedom under H0 (its sampling distribution). The value of F is
10.502, obtained from the ANOVA table given in Table 2. The p-value for this test is
given in the column ‘Significance F’ of the ANOVA table. The p-value is
1.2404 × 10^−5 < 0.05. Hence, we reject the null hypothesis at the 5% level of
significance, which means at least one β is significantly different from 0 and the
model is significant in explaining or predicting Y.
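The ‘Significance F’ value can be approximated without a statistics package. The sketch below is one possible approach, not necessarily how the spreadsheet computes it: it uses the standard identity P(F > f) = I_x(df2/2, df1/2) with x = df2/(df2 + df1·f), where I_x is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name and the number of integration steps are assumptions of this sketch.

```python
import math

def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) variable.

    Integrates the Beta(df2/2, df1/2) density from 0 to x = df2/(df2 + df1*f)
    with Simpson's rule. Assumes df2 > 2 so the density is 0 at the origin.
    """
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps                      # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

# F statistic rebuilt from the ANOVA table in Table 2.
f_stat = ((4296.966667 - 1149.000325) / 6) / 49.95653586
p_value = f_upper_tail(f_stat, df1=6, df2=23)

print(round(f_stat, 3), p_value)
print("reject H0 at 5%:", p_value < 0.05)
```

The result should be close to the 1.24041E-05 reported in Table 2, far below the 0.05 threshold.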
g) The significance of individual predictors is tested one coefficient at a time:
H0: β1 = 0 vs. H1: β1 ≠ 0
OR
H0: β2 = 0 vs. H1: β2 ≠ 0
:
:
H0: β6 = 0 vs. H1: β6 ≠ 0
The test statistic values for these tests are given in the column ‘t Stat’, and the
p-values in the column ‘P-value’, against each predictor. We note that the p-value
for the test
H0: β1 = 0 vs. H1: β1 ≠ 0
is 0.0009 < 0.05. Hence, we can reject the null hypothesis in favor of the
alternative and say that X1 is a significant predictor of Y when all other
factors are adjusted for. No other p-value is less than 0.05. Hence, X1 (Handles
employee complaints) is found to be the only significant predictor of Y
(supervisor’s overall rating of job being done).
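The individual tests in (g) can be reproduced from the coefficient and standard error columns of Table 2. In the sketch below, each t statistic is the coefficient divided by its standard error, and the two-tailed p-value is approximated by integrating the t density with 23 residual degrees of freedom using Simpson's rule. The helper name and the step count are assumptions of this sketch rather than anything given in the source.

```python
import math

def t_two_tailed_p(t_value, df, steps=2000):
    """Approximate two-tailed p-value for a t statistic with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t_value)
    h = upper / steps
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_half = total * h / 3       # P(0 < T < |t|)
    return 2 * (0.5 - central_half)

# (coefficient, standard error) pairs from Table 2.
estimates = {
    "X1": (0.613187608, 0.160983115),
    "X2": (-0.073050143, 0.13572469),
    "X3": (0.320332116, 0.168520319),
    "X4": (0.081732134, 0.221477677),
    "X5": (0.038381447, 0.146995442),
    "X6": (-0.217056682, 0.178209471),
}
residual_df = 23

results = {}
for name, (coef, se) in estimates.items():
    t_stat = coef / se
    results[name] = (t_stat, t_two_tailed_p(t_stat, residual_df))

significant = [name for name, (_, p) in results.items() if p < 0.05]
for name, (t_stat, p) in results.items():
    print(name, round(t_stat, 3), round(p, 4))
print("significant at 5%:", significant)
```

X3 comes closest to the threshold among the remaining predictors, with a p-value of about 0.07, but it still does not reach significance at the 5% level.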
Testing of hypothesis
Based on Mini Case 9.2:
Lisa has been working at a beauty counter in a department store for 5 years. In her spare time, she has
also been creating lotions and fragrances using all natural products. After receiving positive feedback from
her friends and family about her beauty products, Lisa decides to open her own store. Lisa knows that
convincing a bank to help fund her new business will require more than a few positive testimonials from
family. Based on her experience working at the department store, Lisa believes women in her area spend
more than the national average on fragrance products. This fact could help her make her business successful.
Lisa would like to be able to support her belief with data to include in a business plan proposal that she
would then use to obtain a small business loan. Lisa took a business statistics course while in college and
decides to use the hypothesis tool she learned. After conducting research, she learns that the national
average spending by women on fragrance products is $59 every 3 months. Lisa takes a random sample of
25 women from the local area and finds that the sample mean is $68.10, and the sample standard deviation
is $14.46. Assume the amount spent every 3 months on fragrance products by women in the area follows
a normal distribution. Find the corresponding output in Table 3.
Table 3
Use the above case study to answer the following questions:
a) What is the population considered in the study?
- Amount of money spent every 3 months by women on fragrance products in
the local area (not the country)
b) What is the population parameter?
- Population mean (μ) of the amount spent every 3 months by women on fragrance
products in the local area
c) What is the related test of hypothesis?
- H0: μ = $59 vs. H1: μ > $59
d) Which test of hypothesis should Lisa use?
- One sample t-test because the population distribution is normal, the population
SD is unknown, and the sample size (of 25) is not large
e) What is the sampling distribution related to the test statistic?
- t-distribution with n − 1 = 25 − 1 = 24 df
f) What is the test statistic value?
- 3.1484
Table 3 (t-test output)
                                Variable 1     Variable 2
Mean                            68.1055132     0
Variance                        209.0951071    0
Observations                    25             25
Hypothesized Mean Difference    59
df                              24
t Stat                          3.148491299
P(T<=t) one-tail                0.002174776
t Critical one-tail             1.71088208
P(T<=t) two-tail                0.004349552
t Critical two-tail             2.063898562
g) What is the p-value for this test?
- 0.0021
- 0.0043 is NOT the p-value for this test; it is the p-value for the test H0: μ = $59
vs. H1: μ ≠ $59
h) What is the critical value related to this test?
- 1.7108
- 2.0638 and −2.0638 are NOT the critical values for this test; they are the critical
values for the test H0: μ = $59 vs. H1: μ ≠ $59
i) How can you calculate the critical value using Excel?
- T.INV(0.95, 24) = 1.7108
- T.INV(0.975, 24) = 2.0638 and T.INV(0.025, 24) = −2.0638 for H0: μ = $59 vs.
H1: μ ≠ $59
- Also, T.INV.2T(0.05, 24) = 2.0638 gives the critical value for H0: μ = $59 vs.
H1: μ ≠ $59
j) What is the area above the critical value 1.7108 under the curve of t-distribution known as?
- Rejection region
- Rejection regions for H0: μ = $59 vs. H1: μ ≠ $59 would be the areas below
−2.0638 and above 2.0638
k) What is the area above the test statistic value 3.1484 under the curve of t-distribution known
as?
- P-value = 0.0021
- P-value = 0.0043 for H0: μ = $59 vs. H1: μ ≠ $59 would be the combined area
below −3.1484 and above 3.1484
l) What can you infer from the test at 5% level of significance?
- The p-value is less than 0.05, so we reject H0 at the 5% level: there is
sufficient evidence to believe that the average amount spent on fragrance
products by women in the area every 3 months is indeed significantly
greater than $59.
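The test statistic, one-tail p-value and one-tail critical value in Table 3 can be reproduced with the Python sketch below. It uses the unrounded sample mean and variance shown in Table 3. The tail probability is approximated by Simpson's rule on the t density, and the critical value is found by bisection; both helper functions are illustrative stand-ins written for this sketch, not the spreadsheet's own routines.

```python
import math

def t_upper_tail(t_value, df, steps=2000):
    """Approximate P(T > t_value) for t_value >= 0 and df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = t_value / steps
    total = density(0.0) + density(t_value)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Find t with P(T > t) = alpha by bisection (one-tail critical value)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Sample summary as reported in Table 3, and the hypothesized mean.
sample_mean = 68.1055132
sample_var = 209.0951071
n = 25
mu0 = 59.0
df = n - 1

t_stat = (sample_mean - mu0) / math.sqrt(sample_var / n)
p_one_tail = t_upper_tail(t_stat, df)
crit = t_critical(0.05, df)
reject_h0 = t_stat > crit and p_one_tail < 0.05

print(round(t_stat, 4), round(p_one_tail, 4), round(crit, 4))
print("reject H0 at 5%:", reject_h0)
```

The critical-value rule and the p-value rule give the same decision here, as they always do for the same test and significance level.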
m) Provide a 95% CI for the population mean.
( X̄ − t_{n−1; α/2} × S/√n , X̄ + t_{n−1; α/2} × S/√n )
= ( 68.10 − T.INV(0.975, 24) × 14.46/√25 , 68.10 + T.INV(0.975, 24) × 14.46/√25 )
= (62.13, 74.06)
n) How would you interpret this confidence interval?
- We would say that we are 95% confident that the true value of the population
mean (average amount spent on fragrance products by women every 3
months in the area) lies within (62.13, 74.06).
- Confidence intervals are random: if 100 random samples were drawn and a 95%
confidence interval computed from each, approximately 95 of those intervals would
contain the true population mean.
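The interval in (m) and the repeated-sampling interpretation in (n) can be illustrated with the sketch below. The first part recomputes the interval from the sample summary and the two-tail critical value in Table 3; it agrees with the interval above up to rounding in the last digit. The second part is a simulation under a purely hypothetical population (normal, mean 65, standard deviation 14.46, chosen only for illustration) to show roughly what "95% of intervals contain the true mean" looks like.

```python
import math
import random
import statistics

# Sample summary from the case study; t_crit is the two-tail critical value in Table 3.
sample_mean, sample_sd, n = 68.10, 14.46, 25
t_crit = 2.063898562

margin = t_crit * sample_sd / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin
print(round(lower, 2), round(upper, 2))

# Hypothetical population, used only to illustrate coverage (not from the case study).
true_mean, true_sd = 65.0, 14.46
rng = random.Random(0)   # fixed seed so the run is repeatable
trials = 4000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    m = statistics.mean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    if m - half_width <= true_mean <= m + half_width:
        hits += 1

coverage = hits / trials
print("share of intervals containing the true mean:", coverage)
```

The coverage share should land near 0.95. Any single interval either contains the true mean or does not; the 95% refers to the long-run behaviour of the procedure.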
