1. Quantitative Techniques in
Analysis (QTiA)
Introductory Lecture
Saturday, 02nd October 2010
M. Shahnawaz Adil
Assistant Professor & Course Advisor (Strategies & Management)
mshahnawazadil@yahoo.com
4. Common Statistical Packages
• S.P.S.S. (originally, Statistical Package for the
Social Sciences)
• S.A.S. (pronounced "sas", originally Statistical
Analysis System)
6. LoM: Key points
Objects, events, or people are measured in one of two ways:
• CATEGORICAL MEASUREMENT: assigning them to discrete
categories
– Nominal
– Ordinal
• METRIC MEASUREMENT: identifying their attributes on a
numerical scale
– Interval
– Ratio
(In SPSS, both metric levels are called Scale.)
7. 1) Categorical Measurement
1.1 Nominal Level
Categories must be homogeneous,
mutually exclusive and exhaustive.
Dichotomous / Binary
1.2 Ordinal Level
The categories are rank-ordered along some
dimension (high to low), e.g. social class (upper,
middle, lower)
e.g. Likert scale (strongly agree, agree, neither
agree nor disagree, disagree, strongly disagree)
8. 2) Metric Measurement
2.1 Interval Level
The categories or scores on a scale are the
same distance apart, whereas at the ordinal level the
numbers only indicate relative position.
It has an arbitrary zero.
e.g. An attitude scale from 10 to 50 (may be
taken from 5 responses). It could equally range from 0
to 40.
e.g. Temperature: 0 °C = 32 °F = 273.15 K (the zero points
of the Celsius and Fahrenheit scales are arbitrary)
9. 2) Metric Measurement
2.2 Ratio Level
Same as Interval level except it has an
absolute zero point.
e.g. height, volume, income/salary in PKR,
time, etc.
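The practical difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below uses two illustrative temperature readings (values chosen for this example, not taken from the lecture) to show that differences survive a change of scale, while ratios only make sense on a scale with an absolute zero, such as kelvin.

```python
# Interval vs. ratio level: a minimal sketch with illustrative values.

def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvin (0 K is absolute zero)."""
    return c + 273.15

def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

warm_c, cool_c = 20.0, 10.0  # hypothetical readings in degrees Celsius

# Differences are meaningful on an interval scale:
# a 10-degree Celsius gap is also a 10-kelvin gap.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: "20 C is twice as hot as 10 C" depends on where
# the arbitrary zero sits, so the ratio changes with the scale.
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print("difference:", diff_c, round(diff_k, 2))
print("ratio in C, F, K:", ratio_c, round(ratio_f, 2), round(ratio_k, 3))
```

Only the kelvin ratio reflects a physical "how many times as much", because only that scale starts at an absolute zero.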
10. Key Points to Ponder
• For ordinal string variables, the alphabetic order
of string values is assumed to reflect the true
order of the categories.
• For example, for a string variable with the values
of low, medium, high, the order of the categories
is interpreted as high, low, medium, which is not
the correct order. In general, it is more reliable to
use numeric codes to represent ordinal data.
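The ordering problem is easy to reproduce outside SPSS. The following Python sketch sorts the same three labels alphabetically and then by numeric codes; the codes 1, 2, 3 are an assumption made for illustration, not a scheme prescribed by SPSS.

```python
# Alphabetical vs. intended order of ordinal categories (illustrative sketch).

labels = ["low", "medium", "high"]

# Sorting the strings gives alphabetical order, not the true ranking.
alphabetical = sorted(labels)

# Hypothetical numeric codes that encode the intended ranking.
codes = {"low": 1, "medium": 2, "high": 3}
by_code = sorted(labels, key=codes.get)

print(alphabetical)  # ['high', 'low', 'medium']
print(by_code)       # ['low', 'medium', 'high']
```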
11. Key Points to Ponder (cont’d…)
• New numeric variables created during a session are
assigned the scale measurement level. For data read
from external file formats and SPSS data files that were
created prior to version 8.0, default assignment of
measurement level is based on the following rules:
• Numeric variables with fewer than 24 unique values
and string variables are set to nominal.
• Numeric variables with 24 or more unique values are
set to scale.
Refer: SPSS User Guide 16.0 for further details
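The default-assignment rule quoted above can be written as a small function. This is a sketch of the rule as stated on the slide, not SPSS code; the function name and the choice to ignore missing values (`None`) are assumptions made for the example.

```python
def default_measurement_level(values):
    """Return 'nominal' or 'scale' for one column, following the stated rule.

    Sketch only: string columns are nominal; numeric columns are nominal
    when they have fewer than 24 unique values, otherwise scale.
    Skipping None as "missing" is an assumption of this example.
    """
    non_missing = [v for v in values if v is not None]
    if any(isinstance(v, str) for v in non_missing):
        return "nominal"
    unique_count = len(set(non_missing))
    return "nominal" if unique_count < 24 else "scale"


# Hypothetical columns for illustration.
gender_codes = [1, 2, 2, 1, 1, 2]
ages = list(range(18, 60))          # 42 distinct values
cities = ["Karachi", "Lahore", "Karachi"]

print(default_measurement_level(gender_codes))  # nominal
print(default_measurement_level(ages))          # scale
print(default_measurement_level(cities))        # nominal
```

A rule based only on the count of unique values can misclassify variables, for example a ratio-level variable that happens to take few distinct values in a small sample, so the assigned level is worth checking by hand.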
12. Time Dimension in Research
• Three types of data may be available for
empirical analysis:
– Time series data, OR
– Cross-section data, OR
– Pooled data (time series and cross-section data
combined)
13. 1. Time Series data (TS)
one subject's changes over the course of time
• A time series is a set of observations on the values
that a variable takes at different times.
• Such data may be collected at regular time intervals
such as:
– Daily (e.g., stock prices, weather reports);
– Weekly (e.g., money supply figures);
– Monthly (e.g., the unemployment rate, Consumer
Price Index (CPI));
– Quarterly (e.g., GDP);
– Annually (e.g., government budgets);
– Quinquennially, i.e., every 5 years
(e.g., the census of manufactures); or
– Decennially (e.g., the census of population).
17. TS Data: Important Points
• Sometimes data are available both quarterly
and annually, as in the case of the data
on GDP and consumer expenditure.
• With the advent of high-speed computers,
data can be collected over extremely short
intervals of time, such as the data on stock
prices, which can be obtained literally
continuously (the so-called real-time quote).
• Application: Econometric studies
18. TS Data: Important Points
• Assumption: most empirical work based on
time series data assumes that the underlying
time series is stationary (loosely speaking, a time
series is stationary if its mean and variance do not
vary systematically over time).
• Example: on next slide…
19. M1 Money Supply: United States, 1947-97
The M1 money supply shows a steady upward trend as well as variability over the
years, suggesting that the M1 time series is NOT stationary.
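The loose definition of stationarity (mean and variance that do not vary systematically over time) suggests a quick informal check: split a series in two and compare the mean and variance of each half. The sketch below does this in Python on two synthetic series invented for the example; it does not use the M1 data, and it is an eyeball check rather than a formal statistical test.

```python
from statistics import mean, pvariance


def half_summaries(series):
    """Return (mean, variance) for the first and the second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))


# Synthetic data (hypothetical): the same wiggle, with and without a trend.
wiggle = [3 if t % 2 else -3 for t in range(20)]
trending = [100 + 5 * t + w for t, w in enumerate(wiggle)]
stable = [100 + w for w in wiggle]

trend_first, trend_second = half_summaries(trending)
stable_first, stable_second = half_summaries(stable)

# A large shift in the mean between halves hints at non-stationarity.
print("trending:", trend_first, trend_second)
print("stable:  ", stable_first, stable_second)
```

For the trending series the mean of the second half is far above that of the first, which is the same pattern the upward-sloping M1 chart displays.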
20. 2. Cross-Section data (CS)
• Data on one or more variables collected at the
same point in time or without regard to
differences in time, such as
– The census of population conducted by the Census
Bureau every 10 years (the latest being conducted
in the year 2000);
– The surveys of consumer expenditures conducted
by the University of Michigan; and
– Opinion polls by Gallup, etc.
21. CS Data – an example
We want to measure current obesity levels in a population. We
could draw a sample of 1,000 people randomly from that
population (also known as a cross section of that population), measure
their weight and height, and calculate what percentage of that
sample is categorized as obese (meaning: overweight). For example,
30% of our sample might be categorized as obese. This cross-
sectional sample provides us with a snapshot of that population,
at that one point in time. Note that we do not know based on
one cross-sectional sample if obesity is increasing or decreasing;
we can only describe the current proportion.
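The calculation described in this example can be sketched in a few lines of Python. The ten (weight, height) pairs are made-up values standing in for the sample of 1,000, and the body mass index cut-off of 30 is a commonly used definition of obesity that the slide itself does not specify.

```python
# Share of a cross-sectional sample classified as obese (illustrative sketch).

OBESE_BMI = 30.0  # assumed cut-off; the slide does not name one


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


# Hypothetical sample of (weight in kg, height in m), one row per person.
sample = [
    (70.0, 1.75), (95.0, 1.70), (82.0, 1.80), (110.0, 1.78), (60.0, 1.65),
    (88.0, 1.60), (75.0, 1.82), (102.0, 1.75), (68.0, 1.70), (90.0, 1.85),
]

obese_count = sum(1 for w, h in sample if bmi(w, h) >= OBESE_BMI)
obese_share = obese_count / len(sample)

print(f"{obese_share:.0%} of the sample is classified as obese")
```

The result is a single proportion for one point in time; a trend would require repeating the measurement on later samples.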
22. Panel Data
• In statistics and econometrics, the term panel
data refers to multi-dimensional data. Panel
data contains observations on multiple
phenomena observed over multiple time
periods for the same firms or individuals.
• TS and CS data are special cases of panel data
that have one dimension only.
23. Balanced vs. Unbalanced Data
In the example above, two data sets with a two-dimensional panel structure
are shown. Individual characteristics (income, age, sex, educ) are collected
for different persons and different years. In the left data set, two persons (1,
2) are observed over three years (2003, 2004, 2005). Because each person is
observed every year, the left-hand data set is called a balanced panel,
whereas the right-hand data set is called an unbalanced panel, since
Person 1 is not observed in year 2005 and Person 3 is not observed in 2003
or 2005.
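Whether a panel is balanced can be checked mechanically: every person must appear in every year. The Python sketch below encodes the two data sets as (person, year) pairs. The right-hand set follows the description above, with the added assumption that Person 2 is observed in all three years; the function names are invented for the example.

```python
from itertools import product


def is_balanced(observations):
    """True if every person is observed in every year present in the data."""
    obs = set(observations)
    persons = {p for p, _ in obs}
    years = {y for _, y in obs}
    return len(obs) == len(persons) * len(years)


def missing_cells(observations):
    """List the (person, year) combinations that are not observed."""
    obs = set(observations)
    persons = sorted({p for p, _ in obs})
    years = sorted({y for _, y in obs})
    return [cell for cell in product(persons, years) if cell not in obs]


left = [(1, 2003), (1, 2004), (1, 2005),
        (2, 2003), (2, 2004), (2, 2005)]

# Assumption: Person 2 is observed in all three years.
right = [(1, 2003), (1, 2004),
         (2, 2003), (2, 2004), (2, 2005),
         (3, 2004)]

print(is_balanced(left), missing_cells(left))
print(is_balanced(right), missing_cells(right))
```

Listing the missing cells, rather than only returning a yes or no answer, shows exactly which person-year observations make the panel unbalanced.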
24. Compulsory Home Reading Assignment
• Chapter No 2: Data Coding &
Exploratory Data Analysis from
• Chapter No. 1 and 02 from