Dr. Gaurav Kamboj
Deptt. of Community Medicine
PGIMS, Rohtak
Logistic
Regression
Introduction
Types of regression
Regression line and equation
Logistic regression
Relation between probability, odds ratio and logit
Purpose
Uses
Assumptions
Logistic regression equation
Interpretation of log odds and odds ratio
Example
CONTENTS
REGRESSION is the measure of the average
relationship between two or more variables in terms
of the original units of the data.
There are different types of regression.
Among many types of regression, the most common
in medical research is LOGISTIC REGRESSION.
Introduction
SIMPLE LINEAR REGRESSION uses one independent
variable to explain and/or predict the outcome of Y.
MULTIPLE LINEAR REGRESSION uses two or more
independent variables to predict the outcome. The general
form of each type of regression is:
Simple: Y = α + βX + e
Multiple: Y = α + β1X1 + β2X2 + ... + βtXt + e
Introduction
• The equation of the straight line
is given by the regression equation.
• Population regression equation:
Y = α + βX + e
• Sample regression equation:
Y = a + bx
where ‘α’ or ‘a’ is the intercept,
‘β’ or ‘b’ is the slope of the line,
which measures the amount of
change in y for a unit change in x, and
‘e’ is the regression residual/error.
Types of Regression Models
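As a minimal sketch of how the sample regression equation Y = a + bx can be estimated, the following applies the ordinary least squares formulas to a few made-up points. The data and the function name `fit_line` are illustrative assumptions and do not come from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b: covariation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept a: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}")  # a = 1.00, b = 2.00
```

Here b is the change in y for a unit change in x, and a is the value of y when x is 0.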
Used to analyze relationships between a CATEGORICAL
dependent variable and metric or categorical independent
variables.
Often chosen if the predictor/independent variables are a
mix of continuous and categorical variables
ln[p/(1-p)] = α + β1X1 + β2X2 + β3X3 + ... + βtXt
The estimated probability is:
p = 1/[1 + exp(-(α + β1X1 + β2X2 + β3X3 + ... + βtXt))]
• p is the probability that the event Y occurs, p(Y=1)
• p/(1-p) is the odds
• ln[p/(1-p)] is the log odds, or "logit"
Logistic Regression
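The two formulas on this slide are inverses of each other, which the short sketch below illustrates. The coefficient and predictor values are hypothetical, chosen only so that the linear predictor works out to 1; the function names are likewise assumptions.

```python
import math


def logit(p):
    """Log odds ln[p / (1 - p)] for a probability 0 < p < 1."""
    return math.log(p / (1 - p))


def inv_logit(z):
    """Probability p = 1 / (1 + exp(-z)) for a linear predictor z."""
    return 1 / (1 + math.exp(-z))


def predict_probability(alpha, betas, xs):
    """Estimated p(Y=1) from intercept alpha, coefficients and predictors."""
    z = alpha + sum(beta * x for beta, x in zip(betas, xs))
    return inv_logit(z)


# Hypothetical values: z = -1.0 + 0.5*2.0 + 0.25*4.0 = 1.0
alpha = -1.0
betas = [0.5, 0.25]
xs = [2.0, 4.0]

p = predict_probability(alpha, betas, xs)
print(f"p = {p:.4f}")                 # p = 0.7311
print(f"logit(p) = {logit(p):.4f}")   # logit(p) = 1.0000
```

Applying the logit to the predicted probability returns the linear predictor, so the model is linear on the log-odds scale even though it is not linear in p.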
Each predictor (IV) is given a coefficient ‘b’
which measures its independent contribution
to variations in the DV. The DV can only take
one of two values: 0 or 1.
What we want to predict from a knowledge of
relevant IVs and coefficients is therefore not a
numerical value of a DV as in linear
regression, but rather the probability (p) that it
is 1 rather than 0 (belonging to one group
rather than the other).
Logistic regression equation
When And Why
Used because having a categorical outcome
variable violates the assumption of linearity in
normal regression.
Does not assume a linear relationship between
DV and IV
Predictors do not have to be normally
distributed
Logistic regression does not make any
assumptions of normality, linearity, and
homogeneity of variance for the independent
variables.
[Figure: Linear regression (marks against study hours, with the passing marks indicated) compared with logistic regression (result, pass or fail, against study hours)]
Binary logistic regression model:
Used to model a binary response—e.g. yes or no.
Ordinal (ordered) logistic regression model (ordinal
multinomial logistic model.)
Used to model an ordered response—e.g. low,
medium, or high.
Nominal (unordered) logistic regression model
(polytomous, polychotomous, or multinomial)
Used to model a multilevel response with no
ordering—e.g. eye color with levels brown,
green, and blue.
Types Of Logistic Regression
Relation between
probability, odds ratio
and logit
Example: 100 participants are randomized to a new or
standard treatment (50 subjects in each treatment
group).
Are chances of success equal for each treatment
group?
Groups     New   Standard   Total
Success    20    10         30
Failure    30    40         70
Total      50    50         100
The probability of success:
Pnew = Pr(success | new treatment) = 20/50 = 40%
Pst = Pr(success | std. treatment) = 10/50 = 20%
The odds of success:
Onew = Pnew/(1-Pnew) = 20/30 = 0.67
Ost = Pst/(1-Pst) = 10/40 = 0.25
The natural logarithm of the odds of success (= LOGIT):
LOGITnew = ln(20/30) = ln(0.67) = -0.41 (new treatment)
LOGITst = ln(10/40) = ln(0.25) = -1.39 (std. treatment)
How to measure the chances of success?
OR = Onew/Ost = (20/30)/(10/40) = 0.67/0.25 = 2.67
If OR = 1 then the chances of success are the
same in each group, which means
Pnew = Pst or Onew = Ost
The null hypothesis is H0: OR = 1 vs the
alternative Ha: OR is not equal to 1
In this case, the odds of success are 2.67
times as high for the new treatment
as for the standard one.
The odds ratio is one way to capture inequality
in the chances of success.
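The arithmetic of this example can be reproduced with a few lines of code. This is only a sketch: the counts come from the table in the example, while the dictionary layout and the helper name `summarize` are assumptions made for illustration.

```python
import math

# Counts from the 2x2 table in the example.
success = {"new": 20, "standard": 10}
failure = {"new": 30, "standard": 40}


def summarize(group):
    """Return (probability, odds, logit) of success for one treatment group."""
    n = success[group] + failure[group]
    p = success[group] / n
    odds = p / (1 - p)
    return p, odds, math.log(odds)


p_new, odds_new, logit_new = summarize("new")
p_st, odds_st, logit_st = summarize("standard")
odds_ratio = odds_new / odds_st

print(f"new:      p = {p_new:.2f}, odds = {odds_new:.2f}, logit = {logit_new:.2f}")
print(f"standard: p = {p_st:.2f}, odds = {odds_st:.2f}, logit = {logit_st:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
# new:      p = 0.40, odds = 0.67, logit = -0.41
# standard: p = 0.20, odds = 0.25, logit = -1.39
# odds ratio = 2.67
```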
The probability of success can be represented via
odds or LOGITs of success
From above example
LOGITnew = -0.41 (new treatment)
LOGITst = -1.39 (standard treatment)
So the difference between the log odds = .98
We can combine these two log odds for different
groups into one formula
Log(odds) = -1.39 +0.98*(treatment is new)
(example of simple logistic regression)
Simple logistic regression
In this logistic regression -1.39 and 0.98 are
regression coefficients
-1.39 is called the model intercept
0.98 is the treatment effect or the difference
between LOGITs
Simple logistic regression
LOGIT = -1.39 + 0.98 (treatment is new)
If treatment is ‘standard’ then
LOGIT = -1.39 +0.98*0 = -1.39 and
odds = Ost = exp(-1.39) = 0.25 and
Pst = 20%
If treatment is ‘new’ then
LOGIT = -1.39 +0.98*1 = -0.41 and
odds = Onew = exp(-0.41) = 0.67 and
Pnew = 40%
Simple logistic regression
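The substitution on this slide can be sketched as a small function that turns the treatment code (0 or 1) into a logit, then odds, then a probability. The two coefficients are the rounded values from the slides; the names `INTERCEPT`, `EFFECT` and `predict` are assumptions. Because the coefficients are rounded to two decimals, the results agree with the slide values only approximately.

```python
import math

INTERCEPT = -1.39  # logit for the standard treatment (from the slides)
EFFECT = 0.98      # difference between the two logits (from the slides)


def predict(treatment_is_new):
    """Return (logit, odds, probability) for a treatment coded 0 or 1."""
    z = INTERCEPT + EFFECT * treatment_is_new
    odds = math.exp(z)         # antilog of the logit
    p = odds / (1 + odds)      # convert odds back to a probability
    return z, odds, p


results = {}
for label, code in [("standard", 0), ("new", 1)]:
    z, odds, p = predict(code)
    results[label] = (z, odds, p)
    print(f"{label}: logit = {z:.2f}, odds = {odds:.3f}, p = {p:.3f}")
```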
If we apply the antilog to 0.98 then exp(0.98) = 2.67,
the odds ratio!
This 2.67 is greater than 1, and the chi-square
test gave a p-value below 5%, so we have a
significant increase in the odds of treatment
success.
Simple logistic regression
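The chi-square result mentioned here can be checked by hand. The sketch below assumes the Pearson chi-square test for a 2x2 table without continuity correction, since the slides do not say which variant was used. It also relies on the fact that, for 1 degree of freedom, the upper-tail probability of a chi-square value x equals erfc(sqrt(x/2)).

```python
import math

# Observed counts from the treatment example.
a, b = 20, 10   # successes: new, standard
c, d = 30, 40   # failures:  new, standard
n = a + b + c + d

# Pearson chi-square for a 2x2 table (assumption: no continuity correction).
chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2))

# Odds ratio computed directly from the counts.
odds_ratio = (a / c) / (b / d)

print(f"chi-square = {chi_square:.2f}")  # chi-square = 4.76
print(f"p-value < 0.05: {p_value < 0.05}")
print(f"odds ratio = {odds_ratio:.2f}")  # odds ratio = 2.67
```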
The crucial limitation of linear regression is that it
cannot deal with DVs that are dichotomous or
categorical.
Logistic regression employs binomial probability
theory in which there are only two values to predict:
the probability (p) that the outcome is 1 rather than 0, i.e. that the
event/person belongs to one group rather than the
other.
Logistic regression forms a best fitting equation or
function using the maximum likelihood method, which
maximizes the probability of classifying the observed
data into the appropriate category given the
regression coefficients.
Purpose of logistic regression
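To make the maximum likelihood idea concrete, here is a minimal sketch that fits the one-predictor model from the treatment example by Newton-Raphson iteration. This is one common way to maximize the likelihood and is not necessarily what SPSS does internally; the function name, the grouped-data layout and the iteration count are assumptions.

```python
import math

# Grouped data from the treatment example: (x, successes, total),
# with x = 1 for the new treatment and x = 0 for the standard one.
groups = [(1, 20, 50), (0, 10, 50)]


def fit_logistic(groups, iterations=25):
    """Maximum likelihood estimates (b0, b1) for logit(p) = b0 + b1*x."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, successes, total in groups:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            resid = successes - total * p   # observed minus expected successes
            w = total * p * (1 - p)         # binomial variance weight
            g0 += resid                     # gradient of the log-likelihood
            g1 += x * resid
            h00 += w                        # information matrix entries
            h01 += x * w
            h11 += x * x * w
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(groups)
print(f"intercept = {b0:.3f}, treatment effect = {b1:.3f}")
# intercept = -1.386, treatment effect = 0.981
```

The fitted intercept and treatment effect match the two log odds computed by hand in the example (-1.39 and 0.98 after rounding).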
Like ordinary regression, logistic regression
provides a coefficient ‘b’, which measures each
IV’s partial contribution to variations in the DV.
To accomplish this goal, a model (i.e. an equation)
is created that includes all predictor variables that
are useful in predicting the response variable.
Variables can, if necessary, be entered into the
model in the order specified by the researcher or in a
stepwise fashion, as in ordinary regression.
Purpose of logistic regression
The first is the prediction of group membership.
Since logistic regression calculates the
probability of success over the probability of
failure, the results of the analysis are in the
form of an ODDS RATIO.
It also provides knowledge of the relationships
and strengths among the variables (e.g.
marrying the boss’s daughter puts you at a
higher probability for job promotion than
undertaking five hours unpaid overtime each
week).
Uses of logistic regression
Methods
Simultaneous method: in which all independents
are included at the same time
Hierarchical method: Variables entered in blocks.
Blocks should be based on past research, or theory
being tested. Good Method.
Stepwise method: (forward conditional in SPSS) in
which variables are selected in the order in which
they maximize the statistically significant
contribution to the model.
Binary Logistic Regression
The minimum number of cases per independent
variable is 10.
For preferred case-to-variable ratios, we will
use 20 to 1 for simultaneous and hierarchical
logistic regression and 50 to 1 for stepwise
logistic regression.
Sample size requirements
1. Assumes a linear relationship between the IVs and the
LOGIT of the DV.
However, does not assume a linear relationship
between the actual dependent and independent
variables.
2. The sample is ‘large’- reliability of estimation declines
when there are only a few cases. A minimum of 50
cases per predictor is recommended.
3. IVs are not linear functions of each other
4. Normal distribution is not necessary or assumed for
the dependent variable.
5. Homoscedasticity is not necessary for each level of the
independent variables.
Assumptions
 Logistic Distribution
 Transformed,
however, the “log
odds” are linear.
[Figure: P(Y=1) plotted against x, and ln[p/(1-p)] plotted against x]
In SPSS the b coefficients are located in column ‘B’ in
the ‘Variables in the Equation’ table.
Logistic regression calculates changes in the log
odds of the dependent, not changes in the
dependent value.
Odds value can range from 0 to infinity and tell you
how much more likely it is that an observation is a
member of the target group rather than a member
of the other group.
SPSS actually calculates the odds ratio (the exponential
of the B coefficient) for us and presents it as EXP(B) in the results
printout in the ‘Variables in the Equation’ table.
Interpreting log odds and the odds ratio
-2 log-likelihood: compares the fit of two models, i.e. how well one model fits as compared to the other (base model is better, alternative is better, or both models are the same). The lower the value, the better the fit.
Chi-square test: the difference between the base model and the proposed model. The larger the difference, the better; if P < 0.05, the proposed model is better.
Classification table: table showing how many observations have been predicted correctly. The higher the correct prediction, the better.
Diagnosis of LR
Likelihood Ratio Test
What is it? It checks whether the fuller model is better than the base model.
Based on: -2 log-likelihood, which measures the discrepancy between the observed and predicted values.
Interpretation: The lower the -2 log-likelihood value, the better.
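As an illustration of the likelihood ratio test, the sketch below computes -2 log-likelihood for a base (constant-only) model and for the model with treatment, reusing the counts from the treatment example earlier in these slides. The function name and data layout are assumptions. The p-value uses the 1-degree-of-freedom identity that the chi-square upper tail equals erfc(sqrt(x/2)).

```python
import math


def minus_two_ll(groups):
    """-2 log-likelihood for grouped binary data.

    Each group is (successes, failures, fitted probability of success).
    """
    ll = 0.0
    for successes, failures, p in groups:
        ll += successes * math.log(p) + failures * math.log(1 - p)
    return -2 * ll


# Base model: one overall success probability, 30/100, for both groups.
base = minus_two_ll([(20, 30, 0.30), (10, 40, 0.30)])
# Proposed model: a separate fitted probability for each treatment group.
proposed = minus_two_ll([(20, 30, 0.40), (10, 40, 0.20)])

# The drop in -2LL is the likelihood ratio chi-square (1 extra parameter).
lr_chi_square = base - proposed
p_value = math.erfc(math.sqrt(lr_chi_square / 2))

print(f"-2LL base = {base:.2f}, -2LL proposed = {proposed:.2f}")
print(f"LR chi-square = {lr_chi_square:.2f}, p < 0.05: {p_value < 0.05}")
```

The proposed model has the lower -2 log-likelihood, and the difference is large enough to be significant at the 5% level.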
Wald Test
What is it? Gives the “importance” of the contribution of each variable in the model.
Based on: Chi-square distribution at 1 df.
Interpretation: The higher the value, the more “important” it is.
Measure of the Proportion of Variance
What is it? Measure of the proportion of variation explained.
Based on: Comparison of the log-likelihood of the base and proposed models.
Measures: Cox & Snell’s R2 (does not attain 1 for the perfect model) and Nagelkerke’s R2 (attains 1 for the perfect model).
Interpretation: The higher the better (value is between 0 and 1).
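The two pseudo R2 measures can be written in terms of the -2 log-likelihood of the base and proposed models. The sketch below uses the standard definitions: Cox & Snell's R2 = 1 - exp(-(D0 - D1)/n), and Nagelkerke's R2 divides this by its maximum, 1 - exp(-D0/n). The input values are illustrative assumptions (roughly those of the 100-participant treatment example), not SPSS output.

```python
import math


def cox_snell_r2(dev_base, dev_model, n):
    """Cox & Snell R2 from the -2LL of the base and proposed models."""
    return 1 - math.exp(-(dev_base - dev_model) / n)


def nagelkerke_r2(dev_base, dev_model, n):
    """Nagelkerke R2: Cox & Snell R2 rescaled by its maximum possible value."""
    max_cox_snell = 1 - math.exp(-dev_base / n)
    return cox_snell_r2(dev_base, dev_model, n) / max_cox_snell


# Illustrative values: -2LL of 122.17 (base) and 117.34 (proposed), n = 100.
n = 100
dev_base, dev_model = 122.17, 117.34
cs = cox_snell_r2(dev_base, dev_model, n)
nk = nagelkerke_r2(dev_base, dev_model, n)
print(f"Cox & Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")

# A perfect model has -2LL = 0: Cox & Snell stays below 1, Nagelkerke reaches 1.
cs_perfect = cox_snell_r2(dev_base, 0.0, n)
nk_perfect = nagelkerke_r2(dev_base, 0.0, n)
print(f"perfect model: Cox & Snell = {cs_perfect:.3f}, Nagelkerke = {nk_perfect:.3f}")
```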
The Hosmer-Lemeshow Goodness-of-Fit Test
What is it? How well does your model fit the data?
Based on: Produces a p-value.
Interpretation: If it’s low (< .05), you reject the model. If it’s high, then your model passes the test.
Interpreting the Logistic Model
Logit: With a one-unit increase in x, the log odds of success increase by 1.3 units on average.
Odds ratio: With a one-unit increase in x, the odds of success are multiplied by e^1.3, i.e. by about 3.67.
Probability: The model gives the probability of success for a particular value of x.
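The three readings of the same coefficient can be compared in a short sketch. The slope of 1.3 is the value used on this slide; the intercept of -2.0 and the x values are hypothetical, chosen only to make the example runnable.

```python
import math

slope = 1.3        # coefficient of x on the logit scale (from the slide)
intercept = -2.0   # hypothetical intercept, not given in the slides

# Each one-unit increase in x multiplies the odds by exp(slope).
odds_multiplier = math.exp(slope)


def probability(x):
    """Probability of success at a given x under the assumed model."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))


def odds(x):
    """Odds of success at a given x."""
    p = probability(x)
    return p / (1 - p)


print(f"odds multiplier per unit of x = {odds_multiplier:.2f}")  # 3.67
for x in (0, 1, 2, 3):
    print(f"x = {x}: logit = {intercept + slope * x:+.1f}, "
          f"odds = {odds(x):.3f}, p = {probability(x):.3f}")
```

The logit rises by a constant 1.3 per unit of x and the odds by a constant factor, while the change in probability depends on where on the curve x lies.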
Data from a survey of home owners conducted by an electricity
company about an offer of roof solar panels with a 50% subsidy
from the state government as part of the state’s environmental
policy.
The variables involve household income measured in units of a
thousand dollars, age, monthly mortgage, size of family
household, and whether the householder would take or decline
the offer.
1. Click Analyze >>Regression >> Binary Logistic
2. Select the grouping variable (the variable to be predicted)
which must be a dichotomous measure and place it into the
Dependent box.
3. Enter your predictors (IV’s) into the Covariates box. These are
‘family size’ and ‘mortgage’.
SPSS Example
In SPSS, the model is always constructed to predict the
group with higher numeric code.
• If responses are coded 1 for Yes and 2 for No, SPSS will predict
membership in the No category.
• If responses are coded 1 for No and 2 for Yes, SPSS will predict
membership in the Yes category.
We will refer to the predicted event for a
particular analysis as the modeled event.
Logistic regression dialogue box
4. If there are any categorical predictor
variables, click the ‘Categorical’ button and enter them
(there are none in this example).
5. Click on the Options button and select Classification
plots, Hosmer-Lemeshow goodness-of-fit, Casewise
listing of residuals and select Outliers outside
2 sd.
Retain the default entries for probability for
stepwise, classification cutoff and maximum
iterations.
6. Click Continue, then OK.
Option dialogue box
The first one to take note of is the Classification table in
Block 0 Beginning Block.
Block 0: Beginning Block. Block 0 presents the results
with only the constant included before any coefficients
(i.e. those relating to family size and mortgage) are
entered into the equation.
The table suggests that if we knew nothing about our
variables and guessed that a person would take the
offer we would be correct 53.3% of the time.
Interpretation of printout tables
The variables not in the equation table tells us whether
each IV improves the model
The answer is yes for both variables, with family size
slightly better than mortgage size, as both are
significant and if included would add to the predictive
power of the model.
If they had not been significant and able to contribute
to the prediction, then termination of the analysis would
obviously occur at this point.
Variables not in the equation
The overall significance is tested using what SPSS calls
the Model Chi square, which is derived from the
likelihood of observing the actual data under the
assumption that the model that has been fitted is
accurate.
In our case model chi square has 2 degrees of freedom,
a value of 24.096 and a probability of p < 0.001.
Thus, the indication is that the model containing only
the constant has a poor fit, and that the predictors
do have a significant effect and
create essentially a different model.
So we need to look closely at the predictors and from
later tables determine if one or both are significant
predictors.
Model chi-square
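The p-value for the reported model chi-square can be checked by hand. This sketch relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability of a chi-square value x simplifies to exp(-x/2); other degrees of freedom would need a statistics library. Only the two numbers reported on the slide are used.

```python
import math

model_chi_square = 24.096  # value reported in the SPSS output
df = 2                     # two predictors: family size and mortgage

# Upper-tail probability of a chi-square variable with 2 degrees of freedom.
p_value = math.exp(-model_chi_square / 2)

print(f"p = {p_value:.2e}")
print(f"p < 0.001: {p_value < 0.001}")  # p < 0.001: True
```

SPSS displays such a small value as .000, which is best reported as p < 0.001.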
Cox and Snell’s R-Square attempts to imitate multiple
R-Square based on ‘likelihood’, but its maximum can
be (and usually is) less than 1.0
The Nagelkerke modification that does range from 0
to 1 is a more reliable measure of the relationship.
Nagelkerke’s R2 is part of SPSS output in the ‘Model
Summary’ table and is the most-reported of the
R-squared estimates.
In this case it is 0.737, indicating a moderately strong
relationship of 73.7% between the predictors and the
prediction.
Model Summary
R2 = 1: Perfect linear relationship between x and y. 100% of the variation in y is explained by variation in x.
0 < R2 < 1: Weaker linear relationship between x and y. Some but not all of the variation in y is explained by variation in x.
R2 = 0: No linear relationship between x and y. The value of y does not depend on x (none of the variation in y is explained by variation in x).
Examples of Approximate R2 Values
If the p-value of the H-L goodness-of-fit test is greater than .05,
as we want for well-fitting models, we fail to reject the
null hypothesis that there is no difference between
observed and model-predicted values, implying that
the model’s estimates fit the data at an acceptable
level.
That is, well-fitting models show non-significance on the
H-L goodness-of-fit test.
Hosmer and Lemeshow statistic
In the Classification table, the columns are the two
predicted values of the dependent, while the rows are the
two observed (actual) values of the dependent.
In this study, 87.5% were correctly classified for the take
offer group and 92.9% for the decline offer group.
Overall 90% were correctly classified.
This is a considerable improvement on the 53.3% correct
classification with the constant model so we know that
the model with predictors is a significantly better model.
The benchmark that we will use to characterize a logistic
regression model as useful is a 25% improvement over the
rate of accuracy achievable by chance alone.
Classification table
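The percentages in a classification table follow directly from its counts. The slides do not give the raw counts, so the table below is an assumption: 30 households, chosen only because these counts reproduce the reported 87.5%, 92.9%, 90% and 53.3%. Using the constant-only model as the "chance" baseline for the 25% benchmark is also an assumption, as the slides do not define how chance accuracy is computed.

```python
# Rows = observed group, columns = predicted group.
# Hypothetical counts consistent with the reported percentages.
table = {
    "take":    {"take": 14, "decline": 2},
    "decline": {"take": 1,  "decline": 13},
}


def percent_correct(table, group):
    """Percentage of one observed group that the model classified correctly."""
    row = table[group]
    return 100 * row[group] / sum(row.values())


total = sum(sum(row.values()) for row in table.values())
correct = sum(table[group][group] for group in table)
overall = 100 * correct / total

# Constant-only model: predict the larger observed group for everyone.
baseline = 100 * max(sum(row.values()) for row in table.values()) / total
# Benchmark: a 25% improvement over the baseline accuracy.
benchmark = 1.25 * baseline

print(f"take offer:    {percent_correct(table, 'take'):.1f}% correct")
print(f"decline offer: {percent_correct(table, 'decline'):.1f}% correct")
print(f"overall: {overall:.1f}%, baseline: {baseline:.1f}%, "
      f"benchmark: {benchmark:.1f}%")
print(f"model is useful by this benchmark: {overall > benchmark}")
```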
In this case, we note that family size contributed
significantly to the prediction (p = .013) but
mortgage did not (p = .075).
The EXP(B) value associated with family size is
11.007.
Hence when family size is raised by one unit (one
person) the odds are 11 times as large, i.e.
the odds of a householder belonging to the take offer group
are multiplied by about 11.
Variables in the Equation
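To see what an EXP(B) of 11.007 means in practice, the sketch below converts it back to the B coefficient and shows how multiplying the odds by EXP(B) changes the probability. The starting probabilities are hypothetical and the function name is an assumption.

```python
import math

exp_b = 11.007          # EXP(B) for family size, from the SPSS table
b = math.log(exp_b)     # corresponding B coefficient on the log-odds scale


def probability_after_one_more_person(p_before):
    """Probability of taking the offer after the odds are multiplied by EXP(B)."""
    odds_after = exp_b * p_before / (1 - p_before)
    return odds_after / (1 + odds_after)


print(f"B = ln({exp_b}) = {b:.3f}")
for p_before in (0.05, 0.20, 0.50):  # hypothetical starting probabilities
    p_after = probability_after_one_more_person(p_before)
    print(f"p before = {p_before:.2f} -> p after = {p_after:.3f}")
```

The odds are multiplied by the same factor each time, while the resulting change in probability depends on the starting value.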
The odds ratio is a measure of effect size.
The ratio of odds ratios of the
independents is the ratio of relative
importance of the independent variables in
terms of effect on the dependent variable’s
odds.
In this example family size is 11 times as
important as monthly mortgage in
determining the decision.
Effect size
Thank You

Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...kumargunjan9515
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 

Kürzlich hochgeladen (20)

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 

Logistic regression with SPSS examples

  • 1. Dr. Gaurav Kamboj Deptt. of Community Medicine PGIMS, Rohtak Logistic Regression
  • 2. Introduction Types of regression Regression line and equation Logistic regression Relation between probability, odds ratio and logit Purpose Uses Assumptions Logistic regression equation Interpretation of log odd and odds ratio Example CONTENTS
  • 3. REGRESSION is the measure of the average relationship between two or more variables in terms of the original units of the data. There are different types of regression. Among many types of regression, the most common in medical research is LOGISTIC REGRESSION. Introduction
  • 4. SIMPLE LINEAR REGRESSION uses one independent variable to explain and/or predict the outcome Y. MULTIPLE LINEAR REGRESSION uses two or more independent variables to predict the outcome. The general form of each type of regression is: simple, Y = α + βX + e; multiple, Y = α + β1X1 + β2X2 + ... + βkXk + e. Introduction
  • 5. The equation of the straight line is given by the regression equation. Population regression equation: Y = α + βX + e. Sample regression equation: Y = a + bX. Here ‘α’ or ‘a’ is the intercept; ‘β’ or ‘b’ is the slope of the line, which measures the amount of change in Y for a unit change in X; ‘e’ is the regression residual/error.
  • 7. Types of Regression Models. . .
  • 8. Used to analyze relationships between a CATEGORICAL dependent variable and metric or categorical independent variables. Often chosen if the predictor/independent variables are a mix of continuous and categorical variables. The model is: ln[p/(1-p)] = α + β1X1 + β2X2 + β3X3 + ... + βtXt. The estimated probability is: p = 1/[1 + exp(-(α + β1X1 + β2X2 + β3X3 + ... + βtXt))] • p is the probability that the event Y occurs, p(Y=1) • p/(1-p) is the odds of the event • ln[p/(1-p)] is the log odds, or "logit" Logistic Regression
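
The two equations on this slide are inverses of each other, which a few lines of Python can make concrete. This is only a sketch: the coefficients and predictor values below are hypothetical and chosen for illustration, not taken from the slides.

```python
import math

def logit(p):
    """Log odds ln[p / (1 - p)] for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Probability p = 1 / (1 + exp(-z)) for a linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept, coefficients and predictor values (illustration only).
alpha, betas = -1.0, [0.5, 0.25]
x = [2.0, 4.0]

# Linear predictor: alpha + beta1*x1 + beta2*x2
z = alpha + sum(b * xi for b, xi in zip(betas, x))
p = inverse_logit(z)

print("linear predictor:", z)
print("estimated probability:", round(p, 3))
print("logit of that probability:", round(logit(p), 3))
```

Applying `logit` to the estimated probability returns the linear predictor, which is why the model is described as linear on the log odds scale.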
  • 9. Each predictor (IV) is given a coefficient ‘b’ which measures its independent contribution to variations in the DV. The DV can only take on one of two values: 0 or 1. What we want to predict from a knowledge of relevant IVs and coefficients is therefore not a numerical value of a DV as in linear regression, but rather the probability (p) that it is 1 rather than 0 (belonging to one group rather than the other). Logistic regression equation
  • 10. When And Why: Used because having a categorical outcome variable violates the assumption of linearity in normal regression. Does not assume a linear relationship between DV and IV. Predictors do not have to be normally distributed. Logistic regression does not make any assumptions of normality, linearity, and homogeneity of variance for the independent variables.
  • 13. Binary logistic regression model: Used to model a binary response—e.g. yes or no. Ordinal (ordered) logistic regression model (ordinal multinomial logistic model.) Used to model an ordered response—e.g. low, medium, or high. Nominal (unordered) logistic regression model (polytomous, polychotomous, or multinomial) Used to model a multilevel response with no ordering—e.g. eye color with levels brown, green, and blue. Types Of Logistic Regression
  • 15. Example: 100 participants are randomized to a new or standard treatment (50 subjects to each treatment group). Are chances of success equal for each treatment group?
      Groups     New   Standard   Total
      Success     20         10      30
      Failure     30         40      70
      Total       50         50     100
  • 16. The probability of success: Pnew = Pr(success | new treatment) = 20/50 = 40%; Pst = Pr(success | std. treatment) = 10/50 = 20%. The odds of success: Onew = Pnew/(1-Pnew) = 20/30 = 0.67; Ost = Pst/(1-Pst) = 10/40 = 0.25. The natural logarithm of the odds of success (= LOGIT): LOGITnew = ln(20/30) = -0.41 (new treatment); LOGITst = ln(10/40) = ln(0.25) = -1.39 (std. treatment). How to measure the chances of success?
  • 17. OR = Onew/Ost = (20/30)/(10/40) = 0.67/0.25 = 2.67. If OR = 1 then the success chances are the same in each group, which means Pnew = Pst or Onew = Ost. The null hypothesis is H0: OR = 1 vs the alternative Ha: OR is not equal to 1. In this case, the odds of success are 2.67 times as high for the new treatment as for the standard one. Odds Ratio is a possible way to capture inequality in the chances of success
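
The arithmetic on the last three slides can be reproduced directly from the 2x2 table. The sketch below uses only the counts given in the example; the function and variable names are illustrative.

```python
import math

# Counts from the 2x2 table in the example (success/failure by treatment).
success = {"new": 20, "standard": 10}
failure = {"new": 30, "standard": 40}

def summarize(group):
    """Return (probability, odds, logit) of success for one treatment group."""
    s, f = success[group], failure[group]
    p = s / (s + f)
    odds = p / (1 - p)
    return p, odds, math.log(odds)

p_new, odds_new, logit_new = summarize("new")
p_std, odds_std, logit_std = summarize("standard")
odds_ratio = odds_new / odds_std

print(f"new:      p={p_new:.2f} odds={odds_new:.2f} logit={logit_new:.2f}")
print(f"standard: p={p_std:.2f} odds={odds_std:.2f} logit={logit_std:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
```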
  • 18. The probability of success can be represented via odds or LOGITs of success From above example LOGITnew = -0.41 (new treatment) LOGITst = -1.39 (standard treatment) So the difference between the log odds = .98 We can combine these two log odds for different groups into one formula Log(odds) = -1.39 +0.98*(treatment is new) (example of simple logistic regression) Simple logistic regression
  • 19. In this logistic regression -1.39 and 0.98 are regression coefficients -1.39 is called the model intercept 0.98 is the treatment effect or the difference between LOGITs Simple logistic regression
  • 20. LOGIT = -1.39 + 0.98*(treatment is new). If treatment is ‘standard’ then LOGIT = -1.39 + 0.98*0 = -1.39, odds = Ost = exp(-1.39) = 0.25 and Pst = 20%. If treatment is ‘new’ then LOGIT = -1.39 + 0.98*1 = -0.41, odds = Onew = exp(-0.41) = 0.67 and Pnew = 40%. Simple logistic regression
  • 21. If we apply the antilog to 0.98 then exp(0.98) = 2.67, the odds ratio! This 2.67 is different from 1, and the chi-square p-value was < 5%, which means we have a statistically significant increase in the odds of treatment success. Simple logistic regression
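
As a rough check of this example, the intercept, the treatment coefficient and a chi-square p-value can be recomputed from the four cell counts. The slides do not say which chi-square test was used; the sketch assumes a Pearson chi-square without continuity correction, and uses the identity that a chi-square upper tail with 1 df equals `erfc(sqrt(x / 2))`.

```python
import math

# Cell counts from the example table.
a, b = 20, 30   # new treatment: successes, failures
c, d = 10, 40   # standard treatment: successes, failures

intercept = math.log(c / d)           # logit for the standard treatment
slope = math.log((a / b) / (c / d))   # difference between the two logits
odds_ratio = math.exp(slope)

# Pearson chi-square for a 2x2 table (assumed test, no continuity correction).
n = a + b + c + d
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Under that assumption the p-value comes out below 5%, in line with the slide.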
  • 22. The crucial limitation of linear regression is that it cannot deal with DV’s that are dichotomous and categorical Logistic regression employs binomial probability theory in which there are only two values to predict: that probability (p) is 1 rather than 0, i.e. the event/person belongs to one group rather than the other. Logistic regression forms a best fitting equation or function using the maximum likelihood method, which maximizes the probability of classifying the observed data into the appropriate category given the regression coefficients. Purpose of logistic regression
  • 23. Like ordinary regression, logistic regression provides a coefficient ‘b’, which measures each IV’s partial contribution to variations in the DV. To accomplish this goal, a model (i.e. an equation) is created that includes all predictor variables that are useful in predicting the response variable. Variables can, if necessary, be entered into the model in the order specified by the researcher in a stepwise fashion like regression. Purpose of logistic regression
  • 24. The first is the prediction of group membership. Since logistic regression calculates the probability of success over the probability of failure, the results of the analysis are in the form of an ODDS RATIO. It also provides knowledge of the relationships and strengths among the variables (e.g. marrying the boss’s daughter puts you at a higher probability for job promotion than undertaking five hours unpaid overtime each week). Uses of logistic regression
  • 25. Methods Simultaneous method: in which all independents are included at the same time Hierarchical method: Variables entered in blocks. Blocks should be based on past research, or theory being tested. Good Method. Stepwise method: (forward conditional in SPSS) in which variables are selected in the order in which they maximize the statistically significant contribution to the model. Binary Logistic Regression
  • 26. The minimum number of cases per independent variable is 10. For preferred case-to-variable ratios, we will use 20 to 1 for simultaneous and hierarchical logistic regression and 50 to 1 for stepwise logistic regression. Sample size requirements
  • 27. 1. Assumes a linear relationship between the IVs and the LOGIT of the DV. However, does not assume a linear relationship between the actual dependent and independent variables. 2. The sample is ‘large’: reliability of estimation declines when there are only a few cases. A minimum of 50 cases per predictor is recommended. 3. IVs are not linear functions of each other. 4. Normal distribution is not necessary or assumed for the dependent variable. 5. Homoscedasticity is not necessary for each level of the independent variables. Assumptions
  • 28. [Figure: the logistic distribution, plotting P(Y=1) against x. Transformed, however, the “log odds” ln[p/(1-p)] are linear in x.]
  • 29. In SPSS the b coefficients are located in column ‘B’ in the ‘Variables in the Equation’ table. Logistic regression calculates changes in the log odds of the dependent, not changes in the dependent value. Odds can range from 0 to infinity and tell you how much more likely it is that an observation is a member of the target group rather than a member of the other group. Each B is the ln(odds ratio) for its predictor; SPSS exponentiates it for us and presents the resulting odds ratio as EXP(B) in the results printout in the ‘Variables in the Equation’ table. Interpreting log odds and the odds ratio
  • 30. Diagnosis of LR. -2 log likelihood: compares the fit of two models (how well one model fits compared to the other); the lower the value, the better the fit of the alternative, so either the base model is better or the alternative is better. Chi-square test: the difference between the base model and the proposed model; a larger difference is better; if p < 0.05 the proposed model is better, otherwise both models are the same. Classification table: a table showing how many observations have been predicted correctly; the higher the correct prediction, the better.
  • 31. Likelihood Ratio Test. What is it? It checks whether the fuller model is better than the base model. Based on: the log-likelihood function (-2 log likelihood), which measures the discrepancy between the observed and predicted values. Interpretation: the lower the -2 log likelihood, the better.
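
To illustrate the likelihood ratio test, the sketch below applies it to the 100-participant treatment example from the earlier slides: the base model uses one overall success probability, the fuller model uses one probability per treatment group. The helper names are illustrative, and the p-value uses the chi-square distribution with 1 df (one added parameter).

```python
import math

# Counts from the treatment example: (successes, failures) per group.
groups = {"new": (20, 30), "standard": (10, 40)}

def binomial_loglik(successes, failures, p):
    """Log-likelihood of the observed outcomes for success probability p."""
    return successes * math.log(p) + failures * math.log(1 - p)

# Base model: constant only, i.e. a single overall success probability.
total_s = sum(s for s, f in groups.values())
total_f = sum(f for s, f in groups.values())
ll_base = binomial_loglik(total_s, total_f, total_s / (total_s + total_f))

# Fuller model: treatment as predictor, i.e. one probability per group.
ll_model = sum(binomial_loglik(s, f, s / (s + f)) for s, f in groups.values())

# Difference in -2 log likelihood, referred to chi-square with 1 df.
lr_statistic = (-2 * ll_base) - (-2 * ll_model)
p_value = math.erfc(math.sqrt(lr_statistic / 2))

print(f"-2LL base = {-2 * ll_base:.2f}, -2LL model = {-2 * ll_model:.2f}")
print(f"LR chi-square = {lr_statistic:.2f}, p = {p_value:.3f}")
```

The fuller model has the lower -2 log likelihood, and the drop is large enough to be significant at the 5% level.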
  • 32. Wald Test. What is it? It gives the “importance” of the contribution of each variable in the model. Based on: the chi-square distribution with 1 df. Interpretation: the higher the value, the more “important” the variable is.
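
A Wald statistic is the squared ratio of a coefficient to its standard error. The sketch below computes it for the treatment coefficient of the earlier 2x2 example, assuming the usual large-sample standard error of a log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d); SPSS output for that model is not shown in the slides, so treat the numbers as illustrative.

```python
import math

# Cell counts from the treatment example.
a, b = 20, 30   # new treatment: successes, failures
c, d = 10, 40   # standard treatment: successes, failures

coef = math.log((a * d) / (b * c))        # log odds ratio, about 0.98
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # large-sample standard error

z = coef / se
wald = z ** 2                              # chi-square with 1 df
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"B = {coef:.3f}, SE = {se:.3f}")
print(f"z = {z:.2f}, Wald = {wald:.2f}, p = {p_value:.3f}")
```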
  • 33. Measure of the Proportion of Variance. What is it? A measure of the proportion of variation explained. Based on: comparison of the log-likelihood of the base and proposed models. Measures: Cox & Snell’s R2 (does not attain 1 for the perfect model) and Nagelkerke’s R2 (attains 1 for the perfect model). Interpretation: the higher the better (value is between 0 and 1).
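
Both measures can be written as short functions of the base and proposed log-likelihoods. The sketch uses the standard definitions (Cox & Snell: 1 - exp(2(LL0 - LL1)/n); Nagelkerke: Cox & Snell divided by its maximum, 1 - exp(2·LL0/n)). The input values are approximate log-likelihoods for the 100-participant treatment example and should be read as illustrative inputs.

```python
import math

def cox_snell_r2(ll_base, ll_model, n):
    """Cox & Snell R2 from the base and proposed model log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_base - ll_model) / n)

def nagelkerke_r2(ll_base, ll_model, n):
    """Cox & Snell R2 rescaled by its maximum so that it can reach 1."""
    max_cox_snell = 1.0 - math.exp(2.0 * ll_base / n)
    return cox_snell_r2(ll_base, ll_model, n) / max_cox_snell

# Illustrative inputs: approximate log-likelihoods, n = 100 cases.
n = 100
ll_base, ll_model = -61.09, -58.67

cs = cox_snell_r2(ll_base, ll_model, n)
nk = nagelkerke_r2(ll_base, ll_model, n)
print(f"Cox & Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")

# A perfect model has log-likelihood 0: Nagelkerke reaches 1, Cox & Snell does not.
print(cox_snell_r2(ll_base, 0.0, n), nagelkerke_r2(ll_base, 0.0, n))
```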
  • 34. The Hosmer-Lemeshow Goodness-of-Fit Test. What is it? It tests how well your model fits the data. Based on: it produces a p-value. Interpretation: if it’s low (< .05), you reject the model; if it’s high, your model passes the test.
  • 35. Interpreting the Logistic Model (example coefficient b = 1.3). Logit: with a one-unit increase in x, the log odds of success increase by 1.3 units on average. Odds ratio: with a one-unit increase in x, the odds of success are multiplied by e^1.3, i.e. by about 3.67. Probability: the model gives the probability of success for a particular value of x.
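
The three readings of the same coefficient (logit, odds, probability) can be tabulated for a few values of x. The coefficient 1.3 is the slide's example; the intercept of -2.0 is a hypothetical value added only so that probabilities can be computed.

```python
import math

b = 1.3           # example coefficient from the slide
intercept = -2.0  # hypothetical intercept, not from the source

def log_odds(x):
    return intercept + b * x

def odds(x):
    return math.exp(log_odds(x))

def probability(x):
    return odds(x) / (1 + odds(x))

odds_multiplier = math.exp(b)  # factor applied to the odds per unit of x
print(f"exp(b) = {odds_multiplier:.2f}")

for x in (0, 1, 2):
    print(x, round(log_odds(x), 2), round(odds(x), 3), round(probability(x), 3))
```

The log odds rise by 1.3 and the odds are multiplied by the same factor at every step, while the change in probability depends on where x starts.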
  • 36. Data from a survey of home owners conducted by an electricity company about an offer of roof solar panels with a 50% subsidy from the state government as part of the state’s environmental policy. The variables involve household income measured in units of a thousand dollars, age, monthly mortgage, size of family household, and whether the householder would take or decline the offer. 1. Click Analyze >>Regression >> Binary Logistic 2. Select the grouping variable (the variable to be predicted) which must be a dichotomous measure and place it into the Dependent box. 3. Enter your predictors (IV’s) into the Covariates box. These are ‘family size’ and ‘mortgage’. SPSS Example
  • 37. In SPSS, the model is always constructed to predict the group with higher numeric code. • If responses are coded 1 for Yes and 2 for No, SPSS will predict membership in the No category. • If responses are coded 1 for No and 2 for Yes, SPSS will predict membership in the Yes category. We will refer to the predicted event for a particular analysis as the modeled event.
  • 39. 4. If there are any categorical predictor variables, click the “Categorical” button and enter them (there are none in the example).
  • 40. 5. Click on the Options button and select Classification plots, Hosmer-Lemeshow goodness-of-fit, Casewise listing of residuals and Outliers outside 2 std. dev. Retain the default entries for probability for stepwise, classification cutoff and maximum iterations. 6. Click Continue, then OK.
  • 42. The first one to take note of is the Classification table in Block 0: Beginning Block. Block 0 presents the results with only the constant included, before any coefficients (i.e. those relating to family size and mortgage) are entered into the equation. The table suggests that if we knew nothing about our variables and guessed that a person would take the offer, we would be correct 53.3% of the time. Interpretation of printout tables
  • 44. The ‘Variables not in the Equation’ table tells us whether each IV improves the model. The answer is yes for both variables, with family size slightly better than mortgage size, as both are significant and if included would add to the predictive power of the model. If they had not been significant and able to contribute to the prediction, then termination of the analysis would obviously occur at this point. Variables not in the equation
  • 46. The overall significance is tested using what SPSS calls the Model Chi-square, which is derived from the likelihood of observing the actual data under the assumption that the model that has been fitted is accurate. In our case the model chi-square has 2 degrees of freedom, a value of 24.096 and a probability of p < .001 (printed by SPSS as .000). Thus, the indication is that the model containing only the constant has a poor fit, and that the predictors do have a significant effect and create essentially a different model. So we need to look closely at the predictors and from later tables determine if one or both are significant predictors. Model chi-square
  • 48. Cox and Snell’s R-Square attempts to imitate multiple R-Square based on ‘likelihood’, but its maximum can be (and usually is) less than 1.0. The Nagelkerke modification, which does range from 0 to 1, is a more reliable measure of the relationship. Nagelkerke’s R2 is part of SPSS output in the ‘Model Summary’ table and is the most-reported of the R-squared estimates. In this case it is 0.737, indicating a moderately strong relationship of 73.7% between the predictors and the prediction. Model Summary
  • 50. Examples of Approximate R2 Values. [Figure: two scatter plots of y against x, each with R2 = 1.] Perfect linear relationship between x and y: 100% of the variation in y is explained by variation in x.
  • 51. Examples of Approximate R2 Values. [Figure: two scatter plots of y against x, each with 0 < R2 < 1.] Weaker linear relationship between x and y: some but not all of the variation in y is explained by variation in x.
  • 52. Examples of Approximate R2 Values. [Figure: scatter plot of y against x with R2 = 0.] No linear relationship between x and y: the value of y does not depend on x (none of the variation in y is explained by variation in x).
  • 53. If the significance (p-value) of the H-L goodness-of-fit test is greater than .05, as we want for well-fitting models, we fail to reject the null hypothesis that there is no difference between observed and model-predicted values, implying that the model’s estimates fit the data at an acceptable level. That is, well-fitting models show non-significance on the H-L goodness-of-fit test. Hosmer and Lemeshow statistic
  • 55. In the Classification table, the columns are the two predicted values of the dependent, while the rows are the two observed (actual) values of the dependent. In this study, 87.5% were correctly classified for the take offer group and 92.9% for the decline offer group. Overall 90% were correctly classified. This is a considerable improvement on the 53.3% correct classification with the constant model, so we know that the model with predictors is a significantly better model. The benchmark that we will use to characterize a logistic regression model as useful is a 25% improvement over the rate of accuracy achievable by chance alone. Classification table
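
The percentages on this slide can be reproduced from a 2x2 classification table. The SPSS table itself is not in the transcript, so the cell counts below are an assumption: they are the smallest counts consistent with the reported 87.5%, 92.9%, 90% and 53.3% (30 cases in total). The benchmark is read here as 1.25 times the constant-only accuracy, which is also an interpretation rather than something the slide spells out.

```python
# Assumed cell counts (not from the source): rows are observed groups,
# inner keys are predicted groups.
table = {
    "take":    {"take": 14, "decline": 2},
    "decline": {"take": 1,  "decline": 13},
}

def percent_correct(observed):
    """Percentage of one observed group that the model classified correctly."""
    row = table[observed]
    return 100 * row[observed] / sum(row.values())

n = sum(sum(row.values()) for row in table.values())
overall = 100 * sum(table[g][g] for g in table) / n

# Constant-only model: always predict the larger observed group.
baseline = 100 * max(sum(row.values()) for row in table.values()) / n
benchmark = 1.25 * baseline  # assumed reading of "25% improvement"

print(f"take: {percent_correct('take'):.1f}%  decline: {percent_correct('decline'):.1f}%")
print(f"overall: {overall:.1f}%  baseline: {baseline:.1f}%  benchmark: {benchmark:.1f}%")
print("model is useful by this benchmark:", overall > benchmark)
```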
  • 57. In this case, we note that family size contributed significantly to the prediction (p = .013) but mortgage did not (p = .075). The EXP(B) value associated with family size is 11.007. Hence when family size is raised by one unit (one person), the odds of belonging to the take offer group are multiplied by about 11 (an odds ratio of 11.007). Variables in the Equation
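
An odds ratio of 11 multiplies the odds, not the probability, and the size of the resulting change in probability depends on the starting point. The sketch below recovers B from the reported EXP(B) and applies the odds ratio to a few hypothetical baseline probabilities; those baselines are illustrative and do not come from the survey data.

```python
import math

exp_b = 11.007         # EXP(B) for family size, as reported on the slide
b = math.log(exp_b)    # the corresponding B coefficient (log odds ratio)

def updated_probability(p_before, odds_ratio):
    """Probability after multiplying the odds of p_before by odds_ratio."""
    odds_after = odds_ratio * p_before / (1 - p_before)
    return odds_after / (1 + odds_after)

print(f"B = ln(EXP(B)) = {b:.3f}")

# Hypothetical baseline probabilities of taking the offer (illustration only).
for p_before in (0.05, 0.50, 0.90):
    p_after = updated_probability(p_before, exp_b)
    print(f"p before = {p_before:.2f} -> p after one more person = {p_after:.3f}")
```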
  • 59. The odds ratio is a measure of effect size. The ratio of odds ratios of the independents is the ratio of relative importance of the independent variables in terms of effect on the dependent variable’s odds. In this example family size is 11 times as important as monthly mortgage in determining the decision. Effect size

Editor's notes

  1. DISCRIMINANT FUNCTION ANALYSIS is usually employed with a categorical dependent variable when all of the predictors are continuous and nicely distributed; LOGIT ANALYSIS is usually employed if all of the predictors are categorical.
  2. Homoscedasticity: this assumption means that the variance around the regression line is the same for all values of the predictor variable (X).
  3. The likelihood-ratio test uses the ratio of the maximized value of the likelihood function for the full model (L1) over the maximized value of the likelihood function for the simpler model (L0). Taking twice the natural log of this ratio, 2 ln(L1/L0), yields a chi-squared statistic.
  4. A Wald test is used to test the statistical significance of each coefficient (β) in the model. A Wald test calculates a z statistic. This z value is then squared, yielding a Wald statistic with a chi-square distribution. Wald estimates give the “importance” of the contribution of each variable in the model. The higher the value, the more “important” it is.
  5. R2 is a measure of predictive power, that is, how well you can predict the dependent variable based on the independent variables.
  6. Hosmer DW, Lemeshow S. Applied Logistic Regression. Wiley & Sons, New York, 1989.