12. Example of Correlation questions
Is there an association between:
Educational attainment and income
Children’s IQ and Parents’ IQ
Urban growth and air quality violations?
Number of police patrols and number of crimes
Grade on exam and time on exam
13. Scatterplot
The relationship between any two variables
can be portrayed graphically on an x- and
y-axis.
Each subject i has a pair of scores (xi, yi). When the scores
for an entire sample are plotted, the result
is called a scatterplot.
15. How is the Correlation Coefficient
Computed?
The conceptual formula for the
correlation coefficient is:

r = Σ(X − X̄)(Y − Ȳ) / √( [Σ(X − X̄)²] [Σ(Y − Ȳ)²] )

where X is a person’s or case’s score on the independent variable, Y is a person’s or case’s
score on the dependent variable, and X̄ and Ȳ are the means of the scores on the
independent and dependent variables, respectively. The quantity in the numerator is called the
sum of the cross-products (SP). The quantity in the denominator is the square root of the
product of the sums of squares for both variables (SSx and SSy).
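A minimal sketch of this formula, assuming Python with NumPy (not part of the slides); the scores are made up for illustration, and np.corrcoef is used only as a cross-check.

```python
import numpy as np

def pearson_r(x, y):
    """Correlation via the conceptual formula: SP / sqrt(SSx * SSy)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()           # deviations from the means
    sp = np.sum(dx * dy)                          # sum of cross-products (SP)
    ssx, ssy = np.sum(dx ** 2), np.sum(dy ** 2)   # sums of squares (SSx, SSy)
    return sp / np.sqrt(ssx * ssy)

# Hypothetical scores, not from the slides
x = [2, 4, 5, 7, 9]
y = [1, 3, 4, 6, 8]
print(pearson_r(x, y))             # conceptual formula
print(np.corrcoef(x, y)[0, 1])     # NumPy's built-in, should agree
```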
16. Direction of the relationship
Variables can be positively or negatively
correlated.
Positive correlation: as the value of one variable
increases, the value of the other variable increases.
Negative correlation: as the value of one variable
increases, the value of the other variable decreases.
Zero correlation: the two variables are not
related.
25. Types of Variables
Discrete variables:
Take only whole-number values; cannot be decimals
Number of children
Number of calls you make a day
26. Types of Variables
Continuous variables:
Always numeric
Can be any number, positive or negative
Examples: age in years, weight, blood pressure
readings, temperature, concentrations of
pollutants and other measurements
Categorical variables:
Information that can be sorted into categories
Types of categorical variables – ordinal, nominal
and dichotomous (binary)
27. Categorical Variables:
Ordinal Variables
Ordinal variable—a categorical variable with
some intrinsic order or numeric value
Examples of ordinal variables:
Education (no high school degree, HS degree,
some college, college degree)
Agreement (strongly disagree, disagree, neutral,
agree, strongly agree)
Rating (excellent, good, fair, poor)
Frequency (always, often, sometimes, never)
Any other scale (“On a scale of 1 to 5...”)
28. Categorical Variables:
Nominal Variables
Nominal variable – a categorical variable
without an intrinsic order
Examples of nominal variables:
Where a person lives in the U.S. (Northeast,
South, Midwest, etc.)
Sex (male, female)
Nationality (American, Mexican, French)
Race/ethnicity (African American, Hispanic, White,
Asian American)
Favorite pet (dog, cat, fish, snake)
29. Categorical Variables:
Dichotomous Variables
Dichotomous (or binary) variable – a
categorical variable with only two categories
(levels)
Often represents the answer to a yes or no
question
For example:
“Did you attend the church picnic on May 24?”
“Did you eat potato salad at the picnic?”
Anything with only 2 categories
30. DUMMY VARIABLES
Let’s say that we want to predict the salary a
customer service agent earns. We think that years of
experience is one of the variables (x1).
We would also like to include whether or not the person is a
college graduate. We will use a dummy
variable to include this information. Therefore x2 will
be:
x2 = 0, if the person is not a college graduate.
x2 = 1, if the person is a college graduate.
34. DUMMY VARIABLE EXAMPLE
Y: annual salary
X1: years of experience
X2: 1 if the person has a college degree, 0
otherwise.
Assume that the person has 5 years of experience.
What would his salary be if he is not a college
graduate? What would his salary be if he is a college
graduate?
Estimated regression equation: ŷ = 25 + 2.5x1 + 8x2
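A small sketch of the prediction, assuming Python; the coefficients are those of the fitted equation shown above.

```python
# Prediction from the fitted equation above: y_hat = 25 + 2.5*x1 + 8*x2
# (salary in the slide's units; x2 is the college-degree dummy).
def predict_salary(years_experience, college_grad):
    x1 = years_experience
    x2 = 1 if college_grad else 0   # dummy variable
    return 25 + 2.5 * x1 + 8 * x2

print(predict_salary(5, college_grad=False))  # 5 years, no degree -> 37.5
print(predict_salary(5, college_grad=True))   # 5 years, with degree -> 45.5
```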
42. How to obtain the
parameters
43. Least Squares Method 1
The regression line: E(Y) = Ŷ = a + bX
Error (residual): Y − Ŷ = Y − (a + bX)
Sum of squared errors: Σ(Y − Ŷ)² = Σ(Y − a − bX)²
Expanding one term: (Y − a − bX)² = Y² + a² + b²X² − 2aY − 2bXY + 2abX
Goal: Min Σ(Y − a − bX)²
How do we get the a and b that minimize the sum
of squares of errors?
44. Least Squares Method 2
• Linear algebraic solution
• Compute a and b so that the partial derivatives
with respect to a and b are equal to zero
Partial derivative with respect to a:
∂Σ(Y − a − bX)² / ∂a = −2Σ(Y − a − bX) = 0
ΣY − na − bΣX = 0
a = ΣY/n − b(ΣX/n) = Ȳ − bX̄
45. Least Squares Method 3
Take the partial derivative with respect to b and
plug in the a obtained above, a = Ȳ − bX̄:
∂Σ(Y − a − bX)² / ∂b = −2ΣX(Y − a − bX) = 0
bΣX² − ΣXY + aΣX = 0
bΣX² − ΣXY + (ΣY/n − b·ΣX/n)ΣX = 0
bΣX² − ΣXY + ΣXΣY/n − b(ΣX)²/n = 0
b[ΣX² − (ΣX)²/n] = ΣXY − ΣXΣY/n
b = [ΣXY − ΣXΣY/n] / [ΣX² − (ΣX)²/n] = [nΣXY − ΣXΣY] / [nΣX² − (ΣX)²]
46. Least Squares Method 4
The least squares method is an algebraic solution
that minimizes the sum of squares of errors
(the variance component of error).

b = [nΣXY − ΣXΣY] / [nΣX² − (ΣX)²] = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² = SPxy / SSx

a = ΣY/n − b(ΣX/n) = Ȳ − bX̄

a = [ΣYΣX² − ΣXΣXY] / [nΣX² − (ΣX)²]   ← Not recommended
47. OLS: Example 1
No    x     y      x−x̄     y−ȳ     (x−x̄)(y−ȳ)   (x−x̄)²
1     43    128    −14.5    −8.5     123.25       210.25
2     48    120     −9.5   −16.5     156.75        90.25
3     56    135     −1.5    −1.5       2.25         2.25
4     61    143      3.5     6.5      22.75        12.25
5     67    141      9.5     4.5      42.75        90.25
6     70    152     12.5    15.5     193.75       156.25
Mean  57.5  136.5
Sum   345   819                      541.5        561.5
b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² = SPxy / SSx = 541.5 / 561.5 = .9644
a = Ȳ − bX̄ = 136.5 − .9644(57.5) = 81.0481
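A sketch reproducing the slide's computation, assuming Python with NumPy (not part of the slides).

```python
import numpy as np

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)

sp  = np.sum((x - x.mean()) * (y - y.mean()))   # SPxy = 541.5
ssx = np.sum((x - x.mean()) ** 2)               # SSx  = 561.5

b = sp / ssx                  # slope     = .9644
a = y.mean() - b * x.mean()   # intercept = 81.0481
print(b, a)

# Cross-check with NumPy's least-squares polynomial fit
print(np.polyfit(x, y, deg=1))   # [slope, intercept]
```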
48. OLS: Example 10-5 (3)
[Scatterplot of y and the fitted values against x (x roughly 40–70, y roughly 120–150), with the fitted regression line]
Ŷ = 81.048 + .964X
49. Hypothesis Testing: regression
parameters
How reliable are the a and b we computed?
A t-test (a Wald test in general) can answer this.
The test statistic is the standardized effect size (effect size /
standard error).
The effect sizes are a − 0 and b − 0, assuming 0 is the
hypothesized value; H0: α=0, H0: β=0
Degrees of freedom is N−K, where K is the
number of regressors + 1
How do we compute the standard error (deviation)?
50. Illustration: Test b
How do we test whether beta is zero (no effect)?
Like y, α and β follow a normal distribution; a
and b follow the t distribution.
b = .9644, SE(b) = .2381, df = N−K = 6−2 = 4
Hypothesis testing:
1. H0: β=0 (no effect), Ha: β≠0 (two-tailed)
2. Significance level = .05, CV = 2.776, df = 6−2 = 4
3. TS = (.9644 − 0)/.2381 = 4.0510 ~ t(N−K)
4. TS (4.051) > CV (2.776), so reject H0
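A sketch of this test, assuming Python with NumPy/SciPy; SE(b) is computed with the usual simple-regression formula √(MSE/SSx), which reproduces the .2381 on the slide.

```python
import numpy as np
from scipy import stats

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2                       # K = number of regressors + 1

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
resid = y - (a + b * x)

mse  = np.sum(resid ** 2) / (n - k)                # SSE / (N-K) = 31.82
se_b = np.sqrt(mse / np.sum((x - x.mean()) ** 2))  # SE(b) = .2381

ts = (b - 0) / se_b                                # 4.051
cv = stats.t.ppf(0.975, df=n - k)                  # 2.776 for df = 4
print(ts, cv, ts > cv)                             # reject H0: beta = 0
```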
51. Illustration: Test a
How do we test whether alpha is zero?
Like y, α and β follow a normal distribution; a
and b follow the t distribution.
a = 81.0481, SE(a) = 13.8809, df = N−K = 6−2 = 4
Hypothesis testing:
1. H0: α=0, Ha: α≠0 (two-tailed)
2. Significance level = .05, CV = 2.776
3. TS = (81.0481 − 0)/13.8809 = 5.8388 ~ t(N−K)
4. TS (5.839) > CV (2.776), so reject H0
53. ANOVA Table: F-test
H0: all parameters are zero, β0 = β1 = 0
Ha: at least one parameter is not zero
CV is 12.22 for (1, 4) df; TS > CV, so reject H0

Sources    Sum of Squares   DF    Mean Squares       F
Model      SSM              K−1   MSM = SSM/(K−1)    MSM/MSE
Residual   SSE              N−K   MSE = SSE/(N−K)
Total      SST              N−1

Sources    Sum of Squares   DF    Mean Squares   F
Model      522.2124         1     522.2124       16.41047
Residual   127.2876         4     31.8219
Total      649.5000         5
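A sketch of the F test on this table, assuming Python with NumPy/SciPy (not part of the slides).

```python
import numpy as np
from scipy import stats

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

ssm = np.sum((y_hat - y.mean()) ** 2)   # model sum of squares    = 522.21
sse = np.sum((y - y_hat) ** 2)          # residual sum of squares = 127.29
sst = np.sum((y - y.mean()) ** 2)       # total sum of squares    = 649.5

F = (ssm / (k - 1)) / (sse / (n - k))   # MSM / MSE = 16.41
p = stats.f.sf(F, k - 1, n - k)         # p-value for F(1, 4)
print(F, p)
```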
54. R2 and Goodness-of-fit
Goodness-of-fit measures evaluate how well
a regression model fits the data.
The smaller the SSE, the better the model fits.
The F test examines whether all parameters are zero
(a large F and a small p-value indicate good fit).
R2 (the coefficient of determination) is SSM/SST;
it measures how much of the overall variance of Y
the model explains.
R2 = SSM/SST = 522.2/649.5 = .80
A large R square means the model fits the data well.
55. Myth and Misunderstanding in R2
R square is the Karl Pearson correlation coefficient
squared: r² = .8967² = .80
If a regression model includes many regressors, R2 is
less useful, if not useless.
Adding any regressor always increases R2,
regardless of the relevance of the regressor.
Adjusted R2 gives a penalty for adding regressors:
Adj. R2 = 1 − [(N−1)/(N−K)](1 − R2)
R2 is not a panacea although its interpretation is
intuitive; if the intercept is omitted, R2 is incorrect.
Check the specification, F, SSE, and the individual parameter
estimates to evaluate your model; a model with a
smaller R2 can be better in some cases.
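A short sketch of these two formulas, assuming Python; the numbers are the SSM, SST, N, and K from the example above.

```python
# R-squared and adjusted R-squared from the example's sums of squares
ssm, sst, n, k = 522.2124, 649.5, 6, 2

r2     = ssm / sst                            # about .80
adj_r2 = 1 - (n - 1) / (n - k) * (1 - r2)     # penalizes extra regressors
print(r2, adj_r2)
```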
113. Ordinary Least Squares (OLS)
The objective of OLS is to minimize the sum of
squared residuals:

min Σ êi²   (sum over i = 1, …, n)

where
Yi = β0 + β1X1i + β2X2i + … + βKXKi + εi
ei = Yi − Ŷi

Remember that OLS is not the only possible
estimator of the βs.
But OLS is the best estimator under certain
assumptions…
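A sketch of this objective, assuming Python with NumPy and hypothetical data: the β̂ vector that minimizes Σêi² can be computed with a standard least-squares solver.

```python
import numpy as np

# Hypothetical data: Y and two regressors X1, X2 (not from the slides)
rng = np.random.default_rng(0)
n = 100
X1, X2 = rng.normal(size=n), rng.normal(size=n)
eps = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 0.5 * X2 + eps

# Design matrix with a column of ones for the intercept (beta_0)
X = np.column_stack([np.ones(n), X1, X2])

# Coefficients minimizing sum(e_i^2), where e_i = Y_i - Y_hat_i
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
print(beta_hat, np.sum(resid ** 2))
```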
114. Classical Assumptions
1. Regression is linear in parameters
2. Error term has zero population mean
3. Error term is not correlated with X’s
4. No serial correlation
5. No heteroskedasticity
6. No perfect multicollinearity
and we usually add:
7. Error term is normally distributed
115. Assumption 1: Linearity
The regression model:
A) is linear
It can be written as
Yi = β0 + β1X1i + β2X2i + … + βKXKi + εi
This doesn’t mean that the theory must be linear.
For example… suppose we believe that CEO salary is
related to the firm’s sales and the CEO’s tenure.
We might believe the model is:
log(salaryi) = β0 + β1log(salesi) + β2tenurei + β3tenurei² + εi
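A sketch of this point, assuming Python with NumPy and hypothetical CEO data: the model is nonlinear in the variables, but it is still linear in the parameters, so the transformed regressors simply become columns of the design matrix.

```python
import numpy as np

# Hypothetical CEO data (not from the slides)
rng = np.random.default_rng(1)
n = 200
sales  = rng.lognormal(mean=8, sigma=1, size=n)
tenure = rng.uniform(0, 30, size=n)
eps    = rng.normal(scale=0.2, size=n)
log_salary = 4 + 0.3 * np.log(sales) + 0.05 * tenure - 0.001 * tenure**2 + eps

# log(sales), tenure, and tenure^2 enter the design matrix like any other X
X = np.column_stack([np.ones(n), np.log(sales), tenure, tenure**2])
beta_hat, *_ = np.linalg.lstsq(X, log_salary, rcond=None)
print(beta_hat)   # close to (4, 0.3, 0.05, -0.001)
```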
116. Assumption 1: Linearity
The regression model:
B) is correctly specified
The model must have the right variables
No omitted variables
The model must have the correct functional form
This is all untestable; we need to rely on economic
theory.
117. Assumption 1: Linearity
The regression model:
C) must have an additive error term
The model must include the additive term + εi
118. Assumption 2: E(εi)=0
Error term has a zero population mean
E(εi)=0
Each observation has a random error with
a mean of zero
What if E(εi)≠0?
This is actually fixed by adding a constant
(AKA intercept) term
119. Assumption 2: E(εi)=0
Example: Suppose instead the mean of εi
was -4.
Then we know E(εi+4)=0
We can add 4 to the error term and
subtract 4 from the constant term:
Yi =β0+ β1Xi+εi
Yi =(β0-4)+ β1Xi+(εi+4)
120. Assumption 2: E(εi)=0
Yi =β0+ β1Xi+εi
Yi =(β0-4)+ β1Xi+(εi+4)
We can rewrite:
Yi =β0*+ β1Xi+εi*
Where β0*= β0-4 and εi*=εi+4
Now E(εi*)=0, so we are OK.
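A small simulation sketch of this argument, assuming Python with NumPy; all numbers are hypothetical.

```python
import numpy as np

# The true error has mean -4, but a regression that includes an intercept
# simply absorbs the -4 into beta_0; the slope is unaffected.
rng = np.random.default_rng(2)
n = 10_000
x = rng.uniform(0, 10, size=n)
eps = rng.normal(loc=-4, scale=1, size=n)   # E(eps) = -4, not 0
y = 3 + 2 * x + eps                         # true beta_0 = 3, beta_1 = 2

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # roughly (-1, 2): intercept absorbs the -4, slope is fine
```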
121. Assumption 3: Exogeneity
Important!!
All explanatory variables are uncorrelated
with the error term
E(εi|X1i, X2i, …, XKi) = 0
Explanatory variables are determined
outside of the model (They are
exogenous)
122. Assumption 3: Exogeneity
What happens if assumption 3 is violated?
Suppose we have the model,
Yi =β0+ β1Xi+εi
Suppose Xi and εi are positively correlated
When Xi is large, εi tends to be large as
well.
126. Assumption 3: Exogeneity
Why would x and ε be correlated?
Suppose you are trying to study the
relationship between the price of a
hamburger and the quantity sold across a
wide variety of Ventura County
restaurants.
127. Assumption 3: Exogeneity
We estimate the relationship using the
following model:
salesi= β0+β1pricei+εi
What’s the problem?
128. Assumption 3: Exogeneity
What’s the problem?
What else determines sales of hamburgers?
How would you decide between buying a
burger at McDonald’s ($0.89) or a burger at TGI
Fridays ($9.99)?
Quality differs
In salesi = β0 + β1pricei + εi, quality isn’t an X
variable even though it should be.
It becomes part of εi.
129. Assumption 3: Exogeneity
What’s the problem?
But price and quality are highly positively
correlated
Therefore x and ε are also positively correlated.
This means that the estimate of β1 will be too
high.
This is called “Omitted Variables Bias” (More in
Chapter 6)
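A small simulation sketch of this story, assuming Python with NumPy; the price, quality, and sales numbers are hypothetical.

```python
import numpy as np

# Quality raises sales and is positively correlated with price; omitting it
# biases the estimated price coefficient upward relative to the true effect.
rng = np.random.default_rng(3)
n = 5_000
quality = rng.normal(size=n)
price   = 5 + 2 * quality + rng.normal(size=n)      # price and quality correlated
sales   = 100 - 3 * price + 8 * quality + rng.normal(size=n)

X_short = np.column_stack([np.ones(n), price])            # quality omitted
X_full  = np.column_stack([np.ones(n), price, quality])   # quality included

b_short, *_ = np.linalg.lstsq(X_short, sales, rcond=None)
b_full,  *_ = np.linalg.lstsq(X_full,  sales, rcond=None)
print(b_short[1])   # biased upward, well above the true -3
print(b_full[1])    # close to the true -3
```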
130. Assumption 4: No Serial Correlation
Serial Correlation: The error terms across
observations are correlated with each
other
i.e. ε1 is correlated with ε2, etc.
This is most important in time series
If errors are serially correlated, an
increase in the error term in one time
period affects the error term in the next.
131. Assumption 4: No Serial Correlation
The assumption that there is no serial
correlation can be unrealistic in time series
Think of data from a stock market…
132. Assumption 4: No Serial Correlation
[Figure: Real S&P 500 stock price index, 1870–2020]
Stock data is serially correlated!
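A sketch of serially correlated errors, assuming Python with NumPy; the AR(1) process here is hypothetical, not the S&P data.

```python
import numpy as np

# AR(1) errors: each period's error carries over part of the previous one,
# so eps_t is correlated with eps_{t-1}, violating the no-serial-correlation assumption.
rng = np.random.default_rng(4)
T, rho = 500, 0.9
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + rng.normal()

# Correlation between eps_t and eps_{t-1} is near rho, not 0
print(np.corrcoef(eps[1:], eps[:-1])[0, 1])
```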
133. Assumption 5: Homoskedasticity
Homoskedasticity: The error has a
constant variance
This is what we want…as opposed to
Heteroskedasticity: The variance of the
error depends on the values of Xs.
137. Assumption 6: No Perfect Multicollinearity
Two variables are perfectly collinear if one
can be determined perfectly from the other
(i.e. if you know the value of x, you can
always find the value of z).
Example: If we regress income on age,
and include both age in months and age in
years.
But age in years = age in months/12
e.g. if we know someone is 246 months old, we
also know that they are 20.5 years old.
138. Assumption 6: No Perfect Multicollinearity
What’s wrong with this?
incomei= β0 + β1agemonthsi +
β2ageyearsi + εi
What is β1?
It is the change in income associated with
a one unit increase in “age in months,”
holding age in years constant.
But if you hold age in years constant, age in
months doesn’t change!
139. Assumption 6: No Perfect Multicollinearity
β1 = Δincome/Δagemonths
Holding Δageyears = 0
If Δageyears = 0, then Δagemonths = 0
So β1 = Δincome/0
It is undefined!
140. Assumption 6: No Perfect Multicollinearity
When one independent variable
is a perfect linear combination of one or more of the other
independent variables, it is called Perfect
Multicollinearity
Example: Total Cholesterol, HDL and LDL
Total Cholesterol = LDL + HDL
Can’t include all three as independent
variables in a regression.
Solution: Drop one of the variables.
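A sketch of the age-in-months/age-in-years problem, assuming Python with NumPy; the ages are hypothetical.

```python
import numpy as np

# Age in years is an exact linear function of age in months, so the
# design matrix is rank deficient and the coefficients are not identified.
rng = np.random.default_rng(5)
n = 50
age_months = rng.integers(240, 600, size=n).astype(float)
age_years  = age_months / 12                      # perfectly collinear column

X = np.column_stack([np.ones(n), age_months, age_years])
print(np.linalg.matrix_rank(X))                   # 2, not 3: rank deficient
# X.T @ X is singular (or numerically so), so the normal equations have no
# unique solution; the fix is to drop one of the collinear variables.
```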
142. Assumption 7: Normally Distributed Error
This is not required for OLS, but it
is important for hypothesis testing.
More on this assumption next time.
143. Putting it all together
Last class, we talked about how to compare
estimators. We want:
1. β̂ is unbiased: E(β̂) = β
on average, the estimator is equal to the population
value
2. β̂ is efficient
The variance of the estimator is as small as possible
145. Gauss-Markov Theorem
Given OLS assumptions 1 through 6, the
OLS estimator of βk is the minimum
variance estimator from the set of all linear
unbiased estimators of βk for k=0,1,2,…,K
OLS is BLUE
The Best, Linear, Unbiased Estimator
146. Gauss-Markov Theorem
What happens if we add assumption 7?
Given assumptions 1 through 7, OLS is
the best unbiased estimator
Even among the non-linear estimators
OLS is BUE?
147. Gauss-Markov Theorem
With Assumptions 1-7, OLS is:
1. Unbiased: E(β̂) = β
2. Minimum Variance – the variance of the sampling
distribution is as small as possible
3. Consistent – as n → ∞, the estimators
converge to the true parameters
As n increases, the variance gets smaller, so each estimate
approaches the true value of β.
4. Normally Distributed. You can apply
statistical tests to them.