2. The main types of research questions that
multiple regression can be used to address are:
1. How well is a set of variables able to
predict a particular outcome?
2. Which variable in a set of variables is
the best predictor of an outcome?
3. Is a particular predictor variable
still able to predict an outcome
when the effects of another variable
are controlled for?
3. To recap:
Multiple regression explores the relationship between one
continuous dependent variable and a number of
independent variables or predictors (usually
continuous).
Multiple regression is based on correlation.
It allows a more sophisticated exploration of the
interrelationship among a set of variables.
Multiple regression will provide you with
information about the model as a whole (all
subscales) and the relative contribution of each
of the variables that make up the model
(individual subscales).
4. Homework 5:
To investigate the relationship between
anxiety, motivation and writing performance, a
researcher conducted a study with 50 learners.
Anxiety and motivation were measured on separate
20-question questionnaires using a 5-point Likert scale.
The index for writing (out of 25) was the average of
two raters' scores for an essay written
under timed conditions.
Predictors: anxiety and motivation
Dependent variable: writing performance
Relationship between them (predictive power)
5. Research question and research hypothesis:
How well can anxiety and motivation levels
predict writing performance? How much variance
in writing performance can be explained by
scores on the anxiety and motivation scales?
Which variable is the best predictor of writing
performance?
H0: anxiety and motivation have no significant
predictive relationship with the dependent
variable of the study (writing performance).
6. Step one: checking the assumptions
sample size
It is recommended that ‘for social science
research, about 15 participants per predictor
are needed for a reliable equation’.
In this study there are 2 independent variables,
so the sample should include at least 30 participants.
The sample here is 50, which is quite acceptable.
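The rule of thumb above is simple arithmetic and can be sketched as follows. This is a minimal illustration, not part of the original analysis; the function names are made up for this example, and treating exactly 15 per predictor as sufficient is an assumption.

```python
def required_sample_size(n_predictors: int, per_predictor: int = 15) -> int:
    """Minimum sample size under the 'about 15 participants per predictor' rule."""
    return n_predictors * per_predictor


def sample_is_adequate(n_participants: int, n_predictors: int,
                       per_predictor: int = 15) -> bool:
    """True if the sample meets the rule of thumb (assumption: >= counts as enough)."""
    return n_participants >= required_sample_size(n_predictors, per_predictor)


# Values from the homework study: 2 predictors, 50 learners.
print(required_sample_size(2))
print(sample_is_adequate(50, 2))
```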
7. Step one: checking the assumptions
Multicollinearity
Multicollinearity exists when the independent
variables are highly correlated (r=.9 and
above). Singularity occurs when one
independent variable is actually a combination
of other independent variables.
The correlation between independent variables should
be smaller than 0.7.
The correlation between each independent variable and
the dependent one should be larger than 0.3.
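The two correlation checks above could be screened with a short script. This is only a sketch: the function names are hypothetical, the thresholds are the .7 and .3 values from this slide, and comparing the absolute value of r (so that strong negative correlations also count) is an assumption. The example scores are invented and are not the study data.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_predictors(predictors, outcome, max_between=0.7, min_with_dv=0.3):
    """Return a list of warnings based on the two rules of thumb."""
    flags = []
    # Rule 1: predictors should not correlate too highly with each other.
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= max_between:
            flags.append(f"{name_a} and {name_b} correlate too highly (r = {r:.2f})")
    # Rule 2: each predictor should show some relationship with the DV.
    for name, x in predictors.items():
        r = pearson_r(x, outcome)
        if abs(r) <= min_with_dv:
            flags.append(f"{name} correlates weakly with the DV (r = {r:.2f})")
    return flags


# Invented scores, for illustration only.
example_predictors = {
    "anxiety": [55, 62, 48, 70, 66, 51],
    "motivation": [80, 71, 85, 60, 64, 78],
}
example_writing = [14, 17, 12, 20, 18, 13]
for warning in screen_predictors(example_predictors, example_writing):
    print(warning)
```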
9. Step one: checking the assumptions
outliers, normality, linearity, homoscedasticity
Multiple regression is very sensitive to outliers (very high or
very low scores). Tabachnick and Fidell (2007, p. 128) define
outliers as those with standardised residual values above
about 3.3 (or less than –3.3).
Residuals are the differences between the obtained and the
predicted dependent variable (DV) scores. The residuals
scatterplots allow you to check:
• normality: the residuals should be normally distributed
about the predicted DV scores
• linearity: the residuals should have a straight-line
relationship with predicted DV scores
• homoscedasticity: the variance of the residuals about
predicted DV scores should be the same for all predicted
scores.
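The outlier rule above can be sketched in a few lines. This assumes that a standardised residual is the raw residual divided by the standard error of the estimate, with n - k - 1 degrees of freedom; the function names and the example scores are invented for illustration and are not the study data.

```python
import math


def standardised_residuals(observed, predicted, n_predictors):
    """Residual (observed - predicted) divided by the standard error of the estimate."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    df_residual = len(observed) - n_predictors - 1
    std_error = math.sqrt(sum(e * e for e in residuals) / df_residual)
    return [e / std_error for e in residuals]


def flag_outliers(observed, predicted, n_predictors, cutoff=3.3):
    """Indices of cases whose standardised residual is above 3.3 or below -3.3."""
    z = standardised_residuals(observed, predicted, n_predictors)
    return [i for i, value in enumerate(z) if abs(value) > cutoff]


# Invented example: 20 cases, every prediction is 15, one observed score is far off.
predicted = [15.0] * 20
observed = [15.5 if i % 2 == 0 else 14.5 for i in range(20)]
observed[7] = 25.0
print(flag_outliers(observed, predicted, n_predictors=2))
```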
10. Step one: checking the assumptions
scatter plot and normal probability plot
There were no outliers.
11. Step two:
evaluating the model
R square shows how much of the variance in the dependent variable is explained by the
model. In this study R square is 0.152, so the model explains about 15 per cent of the
variance in writing performance. This is not a respectable result.
The Adjusted R square statistic ‘corrects’ this value to provide a better
estimate of the true population value (here .116). As the sample in this study is fairly
small, it is better to report the adjusted value in the interpretation stage of the
study.
To assess the statistical significance of the result, it is necessary to look
in the table labelled ANOVA. This tests the null hypothesis that multiple
R in the population equals 0. The model in this example reaches
statistical significance (Sig. = .021; this really means p<.05).
12. Model summary table and ANOVA
ANOVA (b)
Model 1    | Sum of Squares | df | Mean Square | F     | Sig.
Regression | 146.967        | 2  | 73.483      | 4.225 | .021 (a)
Residual   | 817.513        | 47 | 17.394      |       |
Total      | 964.480        | 49 |             |       |
a. Predictors: (Constant), motivation, anxiety
b. Dependent Variable: writing performance
Model Summary (b)
Model | R        | R Square | Adjusted R Square | Std. Error of the Estimate
1     | .390 (a) | .152     | .116              | 4.17060
a. Predictors: (Constant), motivation, anxiety
b. Dependent Variable: writing performance
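The Model Summary figures can be cross-checked from the ANOVA sums of squares. The sketch below uses the standard formulas (R square = SS regression / SS total, and the usual adjustment for sample size and number of predictors); the function name is made up for this example.

```python
import math


def model_summary(ss_regression, ss_residual, n, k):
    """Recompute Model Summary statistics from ANOVA sums of squares.

    n is the sample size and k the number of predictors.
    """
    ss_total = ss_regression + ss_residual
    df_regression, df_residual = k, n - k - 1
    r_square = ss_regression / ss_total
    adjusted = 1 - (1 - r_square) * (n - 1) / df_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "r": math.sqrt(r_square),
        "r_square": r_square,
        "adjusted_r_square": adjusted,
        "f": ms_regression / ms_residual,
        "std_error_of_estimate": math.sqrt(ms_residual),
    }


# Sums of squares from the ANOVA table; 50 learners, 2 predictors.
summary = model_summary(146.967, 817.513, n=50, k=2)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these values should agree with the SPSS output (R = .390, R square = .152, Adjusted R square = .116, F = 4.225).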
13. Step three: evaluating each of the
independent variables
We are interested in comparing the contribution of each
independent variable; therefore we will use the standardised beta
values.
In this case the largest beta coefficient (ignoring the sign) is 0.36, which is for
Anxiety. This means that this variable makes the
strongest unique contribution to explaining the dependent
variable, when the variance explained by all other
variables in the model is controlled for. The beta value for
Motivation was lower (–0.25), indicating that it
made less of a unique contribution.
In this case, only anxiety made a unique and statistically
significant contribution to the prediction of writing
performance (Sig. = .012); the contribution of motivation did not
reach statistical significance (Sig. = .076).
14. Step three: evaluating each of the independent
variables
In this example, Anxiety has a part correlation coefficient
of 0.35. If we square this (multiply it by itself) we get .12,
indicating that Anxiety uniquely explains 12 per cent of
the variance in writing performance.
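For motivation the value is –.24, which squared gives
us about .06, indicating a unique contribution of about 6 per cent to the
explanation of variance in writing performance.
Together the two squared part correlations come to about
18 per cent (12 + 6), which is more than the 15 per cent given
by R square. Squared part correlations do not have to add up
to R square when the predictors are correlated with each other.
Here anxiety and motivation relate to writing performance in
opposite directions, and each beta is larger in size than the
corresponding zero-order correlation, so each predictor's unique
contribution looks larger once the other is controlled for.
This is not in itself a sign that the assumptions were violated.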
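The arithmetic on this slide can be checked against the Part and Zero-order columns of the coefficient table. The sketch below squares the part correlations and also uses a standard identity for a two-predictor model: R square equals the squared zero-order correlation of one predictor plus the squared part correlation of the other. The variable names are made up for this example.

```python
# Values copied from the Zero-order and Part columns of the coefficient table.
zero_order = {"anxiety": 0.305, "motivation": -0.168}
part = {"anxiety": 0.352, "motivation": -0.244}

# Unique contribution of each predictor = squared part correlation.
unique = {name: r ** 2 for name, r in part.items()}
sum_unique = sum(unique.values())

# Two-predictor identity: R^2 = r(y, x1)^2 + part(x2)^2, in either order.
r_square_a = zero_order["anxiety"] ** 2 + part["motivation"] ** 2
r_square_b = zero_order["motivation"] ** 2 + part["anxiety"] ** 2

for name, value in unique.items():
    print(f"{name}: unique variance = {value:.3f}")
print(f"sum of unique contributions = {sum_unique:.3f}")
print(f"R square via identity = {r_square_a:.3f} / {r_square_b:.3f}")
```

Both versions of the identity should land close to the reported R square of .152, while the sum of the two squared part correlations comes out larger.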
15. Coefficient table
Coefficients (a)
Unstandardized coefficients (B, Std. Error), standardized coefficient (Beta), and 95% confidence interval for B:
Model 1    | B      | Std. Error | Beta  | t      | Sig. | Lower Bound | Upper Bound
(Constant) | 19.158 | 2.326      |       | 8.236  | .000 | 14.478      | 23.837
anxiety    | .056   | .021       | .362  | 2.623  | .012 | .013        | .098
motivation | -.058  | .032       | -.250 | -1.815 | .076 | -.122       | .006
Correlations and collinearity statistics:
Model 1    | Zero-order | Partial | Part  | Tolerance | VIF
anxiety    | .305       | .357    | .352  | .949      | 1.054
motivation | -.168      | -.256   | -.244 | .949      | 1.054
a. Dependent Variable: writing performance
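The unstandardized B values in the table define the regression equation, and VIF is the reciprocal of Tolerance. The sketch below shows both; the function names and the example scores (anxiety 60, motivation 70) are hypothetical and are not taken from the study.

```python
# Unstandardized coefficients from the coefficient table.
INTERCEPT = 19.158
B = {"anxiety": 0.056, "motivation": -0.058}


def predict_writing(anxiety: float, motivation: float) -> float:
    """Predicted writing score: constant + B_anxiety * anxiety + B_motivation * motivation."""
    return INTERCEPT + B["anxiety"] * anxiety + B["motivation"] * motivation


def vif_from_tolerance(tolerance: float) -> float:
    """Variance inflation factor is the reciprocal of tolerance."""
    return 1 / tolerance


# Hypothetical learner, for illustration only.
print(round(predict_writing(anxiety=60, motivation=70), 3))
print(round(vif_from_tolerance(0.949), 3))
```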
16. How to report and present the outcome
Our model, which includes anxiety and motivation,
explains 15 per cent of the variance in writing
performance (Question 1).
Of these two variables, anxiety makes the largest unique
contribution (beta = .36), and the contribution of motivation was
not statistically significant (Question 2).