3. FAQ
• Which relevant statistical method can I use?
• How do I interpret the odds ratio?
• How is the logistic regression coefficient different
from the odds ratio?
• How do I select my choice of reference category in
logistic regression?
• What is this about collinearity issues in regression?
• Logit regression vs logistic regression: any
difference?
• Some Stata commands, though correct, return
error messages in my copy of Stata. Why?
• How do I install user-written commands in Stata?
• What Stata command produces stepwise regression?
4. Our Learning Objectives
• To present an overview of different statistical
methods at the multivariate level;
• To provide guidance in the choice of appropriate
multivariate statistical techniques; and
• To build capacity in the use of basic statistical
procedures and interpretation at the
multivariate level
5. If I have seen further, it is by
standing on the shoulders of
giants
[NEWTON, 1676; LETTER TO ROBERT HOOKE]
6. Acknowledgements
• Many slides were obtained from multiple sources
• Model data and Individual Recode
Data (NGIR6AFL.DTA) from MEASURE DHS:
www.measuredhs.com
• Special thanks to the Head of Demography & Social
Statistics and to the PG Anchor
• Many thanks to PG students
7. OUTLINE OF PRESENTATION
• Overview of Multivariate Analysis Techniques
• Practical Sessions with Selected Multivariate
Analytical Techniques using the Stata Software
8. MODE OF PRESENTATION
• Hands-on
• Group work/Assignment
• PowerPoint slides
• Data from DHS (demonstrated)
• Other data files: discrim2.dta, etc.
9. Choosing the Appropriate Statistic
Some factors to consider:
• Research design
• Number of groups
• Number of variables
• Level of measurement
(nominal, ordinal, interval/ratio)
• Normality
10. Choice of relevant statistical methods
• Single categorical variable (nominal/ordinal)
• Single numeric variable (count/continuous)
• Two categorical variables
• Two numeric variables
• One categorical and one numeric
• One categorical + 2 or more numeric
• One numeric + 2 or more categorical
• One numeric + 2 or more numeric
• 3 or more categorical variables
11. Single Categorical Variable (1)
• Type: Nominal
• Example: Residence (Rural/Urban)
• Statistic/graph:
– Frequency tabulation
– Bar chart
– Pie chart
– Histogram: not appropriate (use only for numeric variables)
– Mean: not appropriate (use only for numeric variables)
– Median: not appropriate (use only for numeric variables)
12. Single Numeric Variable (2)
• Type: Continuous
• Example: Age; weight; height; income
• Statistic/Graph
– Mean
– Median (preferred when the distribution is not normal)
– Mode
– Frequency tabulation (use only when grouped)
– Histogram
– Bar chart (only when grouped)
– Pie chart (only when grouped)
Research Question:
1. What is the average income of respondents engaged in impulse
buying?
2. What is the median age at first marriage of women who
experienced domestic violence?
13. Single Numeric Variable (3)
• Type: Count
• Example: children ever born (CEB); crime rate
• Statistic/graph
– Mean
– Median
– Mode
– Frequency tabulation (if grouped)
– Histogram
– Bar chart (If grouped)
– Pie chart (if grouped)
Research question: What is the mean CEB of women who
married before age 18? Of women who married above age 35?
14. Two variables (4)
• One numeric, one categorical
• Numeric dependent variable
• Statistic
– Independent t-test (with two categories)
– One-way ANOVA (with more than 2 categories)
• Mean of dependent variable compared across levels of
categorical variable
• Research questions: Is age at marriage in rural areas significantly
higher than in urban areas? (independent t-test)
• Are there differences in mean age at first intercourse by level of
education (4 levels)? (one-way ANOVA)
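The two tests above can be sketched in Stata as follows. This is a minimal illustration, not the DHS analysis itself: it assumes the nlsw88 example data shipped with Stata, with wage standing in for the numeric dependent variable and union (2 levels) and race (3 levels) standing in for the categorical variable.

```stata
* Illustration only: nlsw88 ships with Stata; the variables are stand-ins
* for age at marriage, residence and education in the research questions.
sysuse nlsw88, clear

* Independent t-test: mean wage compared across the two levels of union
ttest wage, by(union)

* One-way ANOVA: mean wage compared across the three levels of race;
* the tabulate option adds a table of group means and standard deviations
oneway wage race, tabulate
```

With a categorical variable of more than two levels, ttest is not applicable and oneway (or anova) is used instead.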
15. Two variables (5)
• Two categorical variables
Statistic:
• Chi-square test
• To test whether there is an association between
two categorical variables
Research questions: Is there any relationship
between residence and contraceptive use?
• Is marital stability independent of education?
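A chi-square test of association can be sketched in Stata as below. The example assumes the nlsw88 data shipped with Stata; union and collgrad are stand-ins for variables such as contraceptive use and residence, not the variables of the DHS file.

```stata
* Illustration only: two categorical variables from nlsw88
sysuse nlsw88, clear

* Cross-tabulation with row percentages and the Pearson chi-square test
tabulate union collgrad, chi2 row

* The test statistic and its p-value are left behind in r()
display "chi2 = " r(chi2) "   p = " r(p)
```

A small p-value suggests the two variables are not independent; the row percentages show the direction of the association.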
16. Two variables (6)
• Both variables are numeric
• Example: age and CEB; age at marriage and number
of sexual partners
Statistic
• Simple linear regression (one variable is dependent)
• Pearson correlation (neither variable is treated as dependent)
Research questions: Is there any relationship between
age at marriage and CEB?
• Does number of sexual partners depend on age at
marriage?
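Both statistics can be sketched in Stata as follows, assuming the nlsw88 example data shipped with Stata, with wage and tenure as stand-ins for the two numeric variables.

```stata
* Illustration only: two numeric variables from nlsw88
sysuse nlsw88, clear

* Pearson correlation with its significance level and number of observations
pwcorr wage tenure, sig obs

* Simple linear regression: wage is the dependent variable
regress wage tenure

* In simple regression, R-squared equals the squared Pearson correlation
quietly correlate wage tenure
display "r^2 = " r(rho)^2
```

Correlation treats the two variables symmetrically, whereas regression requires choosing which one is the dependent variable.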
17. Multivariate Level of Analysis
• Involves examining relationships among three or more
variables
• A few basic examples are:
– Analysis of variance
– Multiple regression
– Logistic regression (binary, multinomial, ordinal, ...)
– Probit regression (biprobit, oprobit, ...)
– Poisson regression
– Recursive models (path analysis)
– Factor analysis
– Discriminant analysis
– Analysis of covariance
– Cluster analysis, etc.
18. CASE 1
One numeric; two or more categorical (1)
• Numeric dependent variable:
Statistic:
• N-way ANOVA
• Compare mean differences across levels of
categorical independent variables
Research questions: Are there significant differences
in the mean number of children by education and
residence?
• What are the effects of education and ethnicity on
number of children ever born?
• Stata command: anova ceb education ethnicity
19. CASE 2
3 or more variables – all continuous
• Continuous dependent variable + 2 or more continuous
independent variables:
Statistic:
• Multiple Regression
• Effects of 2 or more independent variables on a single
dependent variable
• Research question: What are the effects of age of
mother, age at marriage and age at first sex on age at first
birth?
• Stata command: regress depvar indvar1 indvar2
indvar3
20. CASE 2 (contd)
MEASURING RELATIVE IMPORTANCE
• There are two types of regression coefficients
– Unstandardized regression coefficients
– Standardized regression coefficients
• The BETA COEFFICIENT is the standardized coefficient
and is a measure of relative importance
Stata command:
regress y x1 x2 x3, beta
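To see what the beta option reports, the sketch below reproduces one beta coefficient by hand as b × sd(x) / sd(y). It assumes the auto example data shipped with Stata; price, mpg and weight are stand-ins for y, x1 and x2.

```stata
* Illustration only: auto.dta ships with Stata
sysuse auto, clear

* Unstandardized coefficients plus a column of standardized (beta) coefficients
regress price mpg weight, beta

* Reproduce the beta for weight by hand: b * sd(x) / sd(y)
quietly summarize weight if e(sample)
scalar sd_x = r(sd)
quietly summarize price if e(sample)
scalar sd_y = r(sd)
scalar beta_weight = _b[weight] * sd_x / sd_y
display "beta(weight) = " beta_weight
```

Because betas are expressed in standard-deviation units, they can be compared across predictors measured on different scales.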
21. CASE 3
One numeric count variable + 2 or more numeric/categorical
variables
• Count dependent variable:
Statistic:
• Poisson Regression
• Effects of two or more independent variables on
count dependent variable
Research question: What are the effects of age of
mother, age at marriage and residence on number of
children ever born?
Stata Command:
xi: poisson ceb age agemar i.residence
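A self-contained sketch of the Poisson model follows. The data are simulated, so the variable names mirror the slide but the values and coefficients are invented for illustration. It assumes Stata 11 or later, where factor-variable notation (i.residence) works without the xi: prefix.

```stata
* Simulated data for illustration only; not the DHS file
clear
set seed 12345
set obs 500
generate age = 15 + floor(35*runiform())
generate agemar = min(age, 12 + floor(18*runiform()))
generate residence = runiform() < 0.5
generate ceb = rpoisson(exp(0.2 + 0.03*age - 0.02*agemar + 0.2*residence))

* Coefficients on the log-count scale
poisson ceb age agemar i.residence

* Same model displayed as incidence-rate ratios, i.e. exp(b)
poisson ceb age agemar i.residence, irr
```

An incidence-rate ratio above 1 means the expected count rises with the predictor; below 1 means it falls.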
22. CASE 4
One numeric continuous variable + 2 or more
categorical variables
• Continuous dependent variable:
• Statistic:
• Multiple Regression with dummy categorical
variables
• Effects of two or more categorical independent
variables on continuous dependent variable
Research question: What are the effects of residence
and ethnicity on age at marriage?
Stata command:
xi: regress agemar i.residence i.ethnicity
23. CASE 4 (contd)
REGRESSION (INTERACTION EFFECTS)
• xi: regress bmi age i.sex i.treat i.treat*i.sex
By default the first (lowest) category will be
omitted, i.e. it will be the reference group. You may
change this before the analysis. For example, select
agegrp 3 to be the reference by defining a
'characteristic':
• char agegrp[omit] 3
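In Stata 11 and later, the reference category can also be chosen with factor-variable notation, without xi: or char. The sketch below assumes the nlsw88 example data shipped with Stata, where race is coded 1, 2, 3; the variables are stand-ins, not those of the slide.

```stata
* Illustration only: nlsw88 ships with Stata
sysuse nlsw88, clear

* Default: the lowest category of race (1) is the reference group
regress wage i.race

* ib3. makes category 3 the reference group instead
regress wage ib3.race

* ## requests both main effects and their interaction
regress wage i.race##i.union
```

Changing the reference group changes the individual coefficients but not the overall fit of the model.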
24. CASE 5
All categorical variables – 3 or more
• Categorical dependent variable + 2 or more
categorical independent variables
Statistic:
• Log-linear Analysis
• Effects of two or more categorical independent
variables on categorical dependent variable
Research question: What are the effects of
residence and family type on wantedness of last
birth (wanted then, wanted later, not wanted at
all)?
25. CASE 6a (Logit and Probit)
One dichotomous dependent variable + 2 or more
independent numeric/categorical variables
• Dichotomous dependent variable:
Statistic:
• Binary Logistic Regression
• Effects of two or more independent variables
(numeric or categorical or both) on dichotomous
dependent variable
Research question: What are the effects of residence,
religion and age at marriage on modern contraceptive use?
Stata command:
xi: logistic contuse i.residence age i.relig agemar
xi: logit contuse i.residence age i.relig agemar
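The two commands fit the same model and differ only in what they display: logit shows coefficients (log odds), logistic shows odds ratios, and the odds ratio is exp(coefficient). A minimal sketch, assuming the nlsw88 example data shipped with Stata, with union as a stand-in dichotomous outcome:

```stata
* Illustration only: nlsw88 ships with Stata
sysuse nlsw88, clear

* logit reports coefficients on the log-odds scale
logit union age grade i.south
scalar b_grade = _b[grade]

* logistic fits the same model but displays odds ratios
logistic union age grade i.south

* The odds ratio is the exponentiated logit coefficient
display "Odds ratio for grade = " exp(b_grade)

* Equivalent: ask logit to display odds ratios
logit union age grade i.south, or
```

An odds ratio above 1 corresponds to a positive coefficient, and an odds ratio below 1 to a negative one.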
26. CASE 6b (Logit and Probit)
One dichotomous dependent variable + 2 or more
independent numeric/categorical variables
• Dichotomous dependent variable:
Alternative Statistic:
• Probit Regression
• Effects of two or more independent variables
(numeric or categorical or both) on dichotomous
dependent variable
Research question: What are the effects of residence,
religion and age at marriage on modern contraceptive use?
Stata command:
xi: probit contuse i.residence age i.relig agemar
27. CASE 7
One categorical variable + 2 or more numeric/categorical variables
• Categorical dependent variable (> 2 categories):
Statistic:
• Multinomial Logistic Regression Analysis
• Effects of two or more independent variables (numeric or
categorical or both) on categorical dependent variable (> 2 categories)
Research question: What are the effects of age of mother, age at
marriage, residence and years of schooling on wantedness of last
birth (wanted then, wanted later, not at all)?
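A multinomial logistic regression can be sketched with Stata's mlogit command. The example assumes the sysdsn1 dataset from the Stata manuals, which webuse downloads and so needs an internet connection. Its three-category outcome insure stands in for wantedness of last birth.

```stata
* Illustration only: sysdsn1 is downloaded from the Stata Press website
webuse sysdsn1, clear

* One set of coefficients per outcome category, relative to the base outcome
mlogit insure age male nonwhite i.site

* Same model displayed as relative-risk ratios, with outcome 1 as the base
mlogit insure age male nonwhite i.site, rrr baseoutcome(1)
```

Each set of results is interpreted relative to the base outcome, so the choice of base affects how the coefficients read, not the fit of the model.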
28. CASE 8
One categorical variable + 2 or more interval variables
• Categorical dependent variable (2 categories or
more):
Statistic:
• Discriminant Analysis
• Classify modern contraceptive users by selected
numeric background variables (age, age at marriage)
Research question: How do users and non-users of
modern contraceptives compare by age of mother and age
at marriage?
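A linear discriminant analysis can be sketched with Stata's discrim lda command (Stata 10 or later). The example assumes the nlsw88 data shipped with Stata; union membership stands in for contraceptive use, and age, grade and tenure stand in for the numeric background variables.

```stata
* Illustration only: nlsw88 ships with Stata
sysuse nlsw88, clear

* Linear discriminant analysis: classify union members vs non-members
* from numeric predictors
discrim lda age grade tenure, group(union)

* Classification table: how many cases in each true group were
* classified into each group
estat classtable
```

The classification table shows how well the numeric variables separate the groups.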
29. Cases 9 & 10 (Regression with more
than one outcome variable)
• More than one outcome variable + one predictor
variable
• Statistic: Multivariate Regression Analysis
• More than one outcome variable + two or more
predictor variables
• Statistic: Multivariate Multiple Regression
Analysis
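Both cases can be sketched with Stata's mvreg command, which takes the outcome variables before the equals sign and the predictors after it. The example assumes the auto data shipped with Stata; the variables are stand-ins for illustration only.

```stata
* Illustration only: auto.dta ships with Stata
sysuse auto, clear

* Case 9: several outcome variables, one predictor
mvreg headroom trunk turn = weight

* Case 10: several outcome variables, two or more predictors
mvreg headroom trunk turn = price mpg displacement gear_ratio length weight
```

The coefficients match those from separate regress commands for each outcome; mvreg also estimates the correlations between the residuals of the equations, which allows tests across equations.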
30. Appropriate techniques for problems with a distinction between independent and
dependent variables

No. of variables        Measurement level                             Analysis method
Dependent  Independent  Dependent              Independent
---------  -----------  ---------------------  ---------------------  ---------------
One        One          Nominal                Nominal                Non-parametric tests, Chi-square
One        One          Nominal (dichotomous)  Nominal                Multiple Classification Analysis
One        One          Nominal                Nominal (dichotomous)  Wilcoxon's two-sample test, Chi-square, Kolmogorov-Smirnov test
One        One          Interval-scale         Nominal (dichotomous)  t-test, Analysis of Variance
One        One          Interval-scale         Interval-scale         Regression Analysis
One        One          Interval-scale         Nominal                Analysis of Variance
One        More         Nominal                Interval-scale         Discriminant Analysis