This document summarizes research on the First-Order Integer-Valued Autoregressive (INAR(1)) process. It describes the INAR(1) model, including how it represents lag-one dependence between integer-valued random variables. It also discusses four estimation methods for the INAR(1) parameters (α and λ): Yule-Walker, Conditional Least Squares, Maximum Likelihood, and Whittle estimation. Simulation results show that Conditional Maximum Likelihood generally has the lowest bias, making it the best estimation method among the four.
Understanding Integer-Valued Autoregressive Models and Estimation Methods
Yan Wen Tan Assignment
A time series is a collection of observations, typically successive measurements made over a time interval. Time series data are categorized as either continuous or discrete. This work is about the First-Order Integer-Valued Autoregressive (INAR(1)) process introduced by Al-Osh and Alzaid (1987). In that paper, a stationary sequence of integer-valued random variables with lag-one dependence is given and is referred to as the integer-valued autoregressive process of first order. The paper covers the definition of INAR(1), estimation methods for the parameters α and λ, and simulations comparing the estimators. Furthermore, we use the Generalized Poisson (GP) distribution instead of the Poisson for the error terms to compare the properties of INAR(1), and the Whittle estimator is added to explore another way to estimate α and λ.
INAR(1). First of all, INAR(1) is a simple model for a stationary sequence of integer-valued random variables with lag-one dependence, referred to as the integer-valued autoregressive process of order one. It states that the components of the process at time t, X_t, are (i) survivors of the elements of the process at time t-1, X_{t-1}, each with probability of survival α, and (ii) elements which entered the system in the interval (t-1, t], represented by the term ε_t.
The INAR(1) process {X_t : t = 0, ±1, ±2, …} is defined by
X_t = α ∘ X_{t-1} + ε_t,  0 ≤ α < 1,   (1)
where α ∘ X = Σ_{i=1}^{X} Y_i is the binomial thinning operator (the Y_i are i.i.d. Bernoulli(α), independent of X) and {ε_t} is a sequence of uncorrelated non-negative integer-valued random variables having mean λ and finite variance σ_ε².
Besides that, the marginal distribution of X_t can be expressed in terms of the innovation sequence {ε_t} as
X_t = α^t ∘ X_0 + Σ_{i=0}^{t-1} α^i ∘ ε_{t-i},   (2)
where α^i ∘ denotes i successive thinnings.
Remarks: X_t depends on the innovation sequence, and this dependence decays exponentially with the time lag.
The same recursion appears in dam theory (Ali Khan, 1986):
X_t = X_{t-1} - D_t + ε_{t-1},  t = 1, 2, …   (3)
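The recursion (1) is straightforward to simulate. Below is a minimal Python sketch (the original study used R; the function names here are our own) that generates an INAR(1) path via binomial thinning with Poisson(λ) innovations:

```python
import math
import random

def poisson(mu, rng):
    """Poisson(mu) draw via Knuth's multiplication method (fine for small mu)."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def thin(x, alpha, rng):
    """Binomial thinning alpha o x: keep each of the x elements with prob alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def simulate_inar1(n, alpha, lam, seed=0):
    """X_t = alpha o X_{t-1} + eps_t with Poisson(lam) innovations;
    X_0 is drawn from the stationary Poisson(lam/(1-alpha)) marginal."""
    rng = random.Random(seed)
    x = [poisson(lam / (1 - alpha), rng)]
    for _ in range(1, n):
        x.append(thin(x[-1], alpha, rng) + poisson(lam, rng))
    return x

path = simulate_inar1(100, alpha=0.5, lam=1.0)
```

Because α ∘ X_{t-1} never exceeds X_{t-1}, the simulated path stays non-negative and integer-valued, unlike a Gaussian AR(1) path.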
where
~ X_t : the content of the dam at the end of time t,
~ ε_t : the input during the time interval [t, t+1],
~ D_t : the output at the end of time t.
Remarks: (3) is a dam model and an example of the applicability of the INAR(1) model in other disciplines. Thus, given X_{t-1}, if the output D_t removes each of the X_{t-1} units independently with probability 1-α, then X_{t-1} - D_t = α ∘ X_{t-1}, and (3) can be rewritten in terms of the thinning operator as
X_t = α ∘ X_{t-1} + ε_{t-1},   (3.1)
where ε_{t-1} is independent of α ∘ X_{t-1}.
Under the correlation and distributional properties of the INAR(1) model, we examine the mean and variance of X_t in the stationary regime. We can use second-order stationarity to derive the variance once the mean is known: taking expectations and variances on both sides of (1) and equating the moments at times t and t-1 gives
(i) E[X_t] = λ/(1-α), and
(ii) Var(X_t) = (αλ + σ_ε²)/(1-α²).
In the Poisson(λ) innovation case, σ_ε² = λ and both expressions reduce to λ/(1-α), so the marginal mean and variance are equal.
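As a quick sanity check, one can iterate the moment recursions E[X_t] = αE[X_{t-1}] + λ and Var(X_t) = α²Var(X_{t-1}) + α(1-α)E[X_{t-1}] + σ_ε² to their fixed points; with α = 0.5, λ = 1 and Poisson innovations (σ_ε² = λ), both the mean and the variance converge to λ/(1-α) = 2. A short sketch (the parameter values are our own choice):

```python
alpha, lam, sig2 = 0.5, 1.0, 1.0   # Poisson(1) innovations, so sig2 = lam

m, v = 0.0, 0.0                    # start the process at X_0 = 0
for _ in range(200):
    # mean recursion, then variance recursion: binomial thinning contributes
    # alpha*(1-alpha)*E[X_{t-1}] of variance on top of alpha^2 * Var(X_{t-1})
    m, v = alpha * m + lam, alpha ** 2 * v + alpha * (1 - alpha) * m + sig2
print(m, v)   # both fixed points equal lam/(1-alpha) = 2 in the Poisson case
```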
The following simulation graphs compare sample paths of INAR(1) and AR(1) with n = 100; the two series have the same mean and variance.
Figure 1: Integer-valued time series generated with Bernoulli(0.5) thinning and Poisson(1.0) innovations.
3. Yan Wen Tan Assignment 3
3
Figure 2 Continuous time series generated from normal(2,2)
Observations: The realization of INAR(1) in Figure 1 has many runs near the mean (2), while this pattern does not appear in Figure 2. We used R to simulate the models and plot them. (*R code is sent as attachment.)
Generalized Poisson. Next, we would like to compare the INAR(1) model when the innovations ε_t have a Generalized Poisson (GP) distribution. Consider a stationary process with GP marginal densities. One difference of this model from the previous one is that we employ the quasi-binomial thinning operator instead of binomial thinning (Brannas, 1994). Under quasi-binomial thinning, the probability of retaining an element is not constant: it increases for positive dispersion (κ > 0) and decreases for negative dispersion (κ < 0). When X_t is GP distributed with parameters θ and κ, written GP(θ, κ), it can be proven that the quasi-binomially thinned variable α ∘ X_t is distributed as GP(αθ, κ). Conversely, when X_t is distributed as GP(θ, κ), it can be shown that ε_t must be GP((1-α)θ, κ) for the marginal to remain stationary. Furthermore, the moments of the GP model (mean and variance) are given by: (i) E[X_t] = θ/(1-κ), (ii) Var(X_t) = θ/(1-κ)³, (iii) the autocorrelation of the GP model at lag k is α^k. It is definitely possible to make the GP model compatible with the Poisson specification; all we need to do is let the process have parameters κ = 0 and θ = λ/(1-α). In other words, the GP mean θ/(1-κ) reduces to the mean λ/(1-α) of the specific Poisson marginal when κ = 0.
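The GP moment formulas can be checked numerically. One standard way to draw a GP(θ, κ) variate (Consul's distribution) is as the total progeny of a branching process with a Poisson(θ) number of ancestors and Poisson(κ) offspring per individual; the sketch below uses that construction (function names are our own, and the parameter values are illustrative):

```python
import math
import random

def poisson(mu, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def rgenpois(theta, kappa, rng):
    """One GP(theta, kappa) draw, 0 <= kappa < 1, as total branching progeny."""
    current = total = poisson(theta, rng)
    while current > 0:
        current = poisson(kappa * current, rng)  # offspring of this generation
        total += current
    return total

rng = random.Random(1)
draws = [rgenpois(1.0, 0.3, rng) for _ in range(40000)]
mean = sum(draws) / len(draws)                            # theory: 1/0.7  ~ 1.43
var = sum((d - mean) ** 2 for d in draws) / len(draws)    # theory: 1/0.343 ~ 2.92
```

With κ = 0 the offspring step never fires and the draws are plain Poisson(θ), matching the compatibility remark above.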
Estimators. Next is the exploration of estimators for INAR(1). Estimation of the INAR(1) model is more complicated than for the AR(1) process because the conditional distribution of X_t given X_{t-1} is the convolution of the distribution of ε_t with a binomial distribution with scale parameter α and index parameter X_{t-1}. The following four estimators are explained: Yule-Walker, Conditional Least Squares (CLS), Maximum Likelihood (ML), and Whittle. We assume ε_t ~ Poisson(λ).
(i) Yule-Walker estimators (Y-W)
The estimator for α is the lag-one sample autocorrelation,
α̂ = Σ_{t=2}^{n} (x_t - x̄)(x_{t-1} - x̄) / Σ_{t=1}^{n} (x_t - x̄)²,
where x̄ is the sample mean. The estimator for λ is given by λ̂ = (1 - α̂) x̄.   (4)
Note that x̄ can be obtained by averaging x_t over t = 1, 2, …, n.
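Equation (4) translates directly to code; a small Python sketch (our own re-implementation, with a made-up data vector for illustration):

```python
def yule_walker(x):
    """Y-W estimates for INAR(1): alpha_hat is the lag-1 sample autocorrelation,
    lambda_hat = (1 - alpha_hat) * sample mean (equation (4))."""
    n = len(x)
    xbar = sum(x) / n
    num = sum((x[t] - xbar) * (x[t - 1] - xbar) for t in range(1, n))
    den = sum((v - xbar) ** 2 for v in x)
    alpha_hat = num / den
    return alpha_hat, (1 - alpha_hat) * xbar

a, lam = yule_walker([1, 2, 2, 1, 0, 1, 3, 2, 1, 1])
```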
(ii) Conditional least squares estimators (CLS). According to Klimko and Nelson (1978), the CLS estimator is based on minimizing the sum of squared deviations about the conditional expectation, Σ_{t=2}^{n} (x_t - αx_{t-1} - λ)². The estimators of α and λ are given by
α̂ = [Σ x_t x_{t-1} - (Σ x_t)(Σ x_{t-1})/(n-1)] / [Σ x_{t-1}² - (Σ x_{t-1})²/(n-1)] and
λ̂ = [Σ x_t - α̂ Σ x_{t-1}]/(n-1),   (5)
with all sums taken over t = 2, …, n. CLS estimators are strongly consistent (Klimko & Nelson, 1978).
Before proceeding to MLE, it is important to know that Yule-Walker and CLS are moment-based methods and are asymptotically equivalent. For moment-based estimators, the thinning parameter α is estimated from the autocorrelation function of the INAR(1) process. Once α is estimated by α̂, we obtain the estimator of λ through λ̂ = (1 - α̂) x̄.
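Equation (5) is ordinary least squares of x_t on x_{t-1}; a sketch (our own Python, with a toy sequence chosen so the fit is exact):

```python
def cls(x):
    """CLS for INAR(1) (equation (5)): regress x_t on x_{t-1}
    (Klimko & Nelson); the slope is alpha_hat, the intercept is lambda_hat."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)                       # number of (x_{t-1}, x_t) pairs
    sx, sy = sum(prev), sum(curr)
    sxy = sum(a * b for a, b in zip(prev, curr))
    sxx = sum(a * a for a in prev)
    alpha_hat = (sxy - sx * sy / n) / (sxx - sx * sx / n)
    return alpha_hat, (sy - alpha_hat * sx) / n

a, lam = cls([0, 1, 2, 3])   # exactly x_t = 1*x_{t-1} + 1, so a = 1, lam = 1
```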
(iii) Maximum likelihood estimators (MLE)
a) Conditional likelihood function (CML): Given X_0 = x_0, the conditional likelihood function is
L_c(α, λ) = Π_{t=1}^{n} p(x_t | x_{t-1}),
where the transition probability is the binomial-Poisson convolution
p(x_t | x_{t-1}) = e^{-λ} Σ_{i=0}^{min(x_t, x_{t-1})} [λ^{x_t-i}/(x_t-i)!] C(x_{t-1}, i) α^i (1-α)^{x_{t-1}-i}.
According to Sprott (1983), the derivatives of the log conditional likelihood with respect to λ and α can be written as ratios of such transition probabilities. We just need to set ∂log L_c/∂λ = 0 and ∂log L_c/∂α = 0 to obtain the conditional maximum likelihood estimator (CMLE).   (6)
In addition, combining the two score equations yields a relation (7) that can be used to eliminate either α or λ.
b) Unconditional likelihood function: Since the stationary marginal distribution is Poisson(λ/(1-α)), the likelihood has the form
L(α, λ) = exp(-λ/(1-α)) [λ/(1-α)]^{x_0} / x_0! · Π_{t=1}^{n} p(x_t | x_{t-1}),
where p(x_t | x_{t-1}) is the transition probability above, for t = 1, …, n, and x = (x_0, x_1, …, x_n). We can obtain α̂ and λ̂ by differentiating the log-likelihood with respect to λ and α. In short, we find the MLE by setting ∂log L/∂λ = 0 and ∂log L/∂α = 0.
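The transition probability and the conditional log-likelihood can be written out directly; below is a sketch that maximizes the conditional likelihood by a crude grid search (a real fit would use a numerical optimizer, and the data vector is made up for illustration):

```python
import math

def trans_prob(xt, xprev, alpha, lam):
    """p(x_t | x_{t-1}): convolution of Binomial(x_{t-1}, alpha) and Poisson(lam)."""
    return sum(math.comb(xprev, i) * alpha ** i * (1 - alpha) ** (xprev - i)
               * math.exp(-lam) * lam ** (xt - i) / math.factorial(xt - i)
               for i in range(min(xt, xprev) + 1))

def cond_loglik(x, alpha, lam):
    """Log conditional likelihood of the path, given x[0]."""
    return sum(math.log(trans_prob(x[t], x[t - 1], alpha, lam))
               for t in range(1, len(x)))

x = [1, 2, 1, 0, 1, 2, 3, 2, 1, 1, 0, 1]          # illustrative data
grid = ((a / 50, l / 10) for a in range(1, 50) for l in range(1, 40))
alpha_hat, lam_hat = max(grid, key=lambda p: cond_loglik(x, *p))
```

The grid search stands in for solving the score equations (6); it returns the same CML estimate up to the grid resolution.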
(iv) Whittle estimator (Maria & Oliveira, 2004)
The Whittle estimator was introduced by Whittle (1953) for Gaussian models. Its main motivation is to represent the likelihood of a stochastic process through its spectral properties, because the spectral density function of a process is often easier to obtain than the exact likelihood.
The Whittle estimator can also be used as an approximation for non-Gaussian mixing processes. There are three conditions for a stochastic process {X_t} to belong to the class of non-Gaussian mixing processes:
(i) {X_t} is strictly stationary.
(ii) X_t has finite absolute moments of all orders: E|X_t|^k < ∞ for every k ≥ 1.
(iii) The joint cumulants of the process are absolutely summable over all lags.
Note that condition (iii) is a mixing condition on {X_t} that guarantees a fast decrease of the statistical dependence between X_t and X_{t+k} as k grows.
The probability density of the sample is then asymptotically of the Whittle form (the exponential of minus the Whittle criterion). The discrete version of the Whittle estimator minimizes
l(α, λ) = Σ_j [ log f(ω_j; α, λ) + I(ω_j)/f(ω_j; α, λ) ]
over the Fourier frequencies ω_j = 2πj/n, where I(ω_j) is the periodogram.
In summary, for the RINAR process (Replicated INAR) we estimate (α, λ) by minimizing the same criterion with the periodogram ordinates replaced by their sample means across replicates:
(α̂, λ̂) = argmin Σ_j [ log f(ω_j; α, λ) + Ī(ω_j)/f(ω_j; α, λ) ].
Remarks:
i) f(ω_j) is the value of the spectral density function at the Fourier frequency ω_j = 2πj/n, j = 1, …, ⌊(n-1)/2⌋.
ii) Ī(ω_j) is the sample mean periodogram ordinate at the same frequency ω_j.
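Because the INAR(1) autocovariances satisfy γ(k) = α^k γ(0), with γ(0) = λ/(1-α) in the Poisson case, the spectral density has the familiar AR(1) shape, and the discrete Whittle criterion can be evaluated from the periodogram. A sketch in pure Python (single series rather than the replicated case; the data and the grid are our own illustration):

```python
import cmath
import math

def periodogram(x):
    """I(w_j) at the Fourier frequencies w_j = 2*pi*j/n, j = 1..floor((n-1)/2)."""
    n = len(x)
    xbar = sum(x) / n
    out = []
    for j in range(1, (n - 1) // 2 + 1):
        w = 2 * math.pi * j / n
        s = sum((x[t] - xbar) * cmath.exp(-1j * w * t) for t in range(n))
        out.append((w, abs(s) ** 2 / (2 * math.pi * n)))
    return out

def spec_density(w, alpha, lam):
    """Poisson-INAR(1) spectral density implied by gamma(k) = alpha^k * lam/(1-alpha)."""
    g0 = lam / (1 - alpha)
    return g0 * (1 - alpha ** 2) / (2 * math.pi * (1 - 2 * alpha * math.cos(w) + alpha ** 2))

def whittle_objective(pg, alpha, lam):
    """Discrete Whittle criterion: sum over Fourier freqs of log f + I/f."""
    return sum(math.log(spec_density(w, alpha, lam)) + i / spec_density(w, alpha, lam)
               for w, i in pg)

x = [1, 2, 1, 0, 1, 2, 3, 2, 1, 1, 0, 1, 2, 2, 1, 0]
pg = periodogram(x)
alpha_hat, lam_hat = min(((a / 50, l / 10) for a in range(1, 50) for l in range(1, 40)),
                         key=lambda p: whittle_objective(pg, *p))
```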
Simulation study:
In order to present the relative merits of Y-W, CLS, and CML, a simulation is carried out for the case where each element of X_{t-1} has probability of survival α and ε_t ~ Poisson(λ). The initial value of the process, X_0, is taken to be the expected value of the process, λ/(1-α). The following settings are used: sample sizes n = 50, 75, 100, 200, with 200 replications, for the parameter values α = 0.1, 0.3, 0.5, 0.7, 0.9 and λ = 1.0.
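The bias experiment can be sketched by combining a binomial-thinning simulator with the Y-W formula; the following (our own Python re-implementation; the study itself used R) estimates the bias of α̂ for one cell of the design:

```python
import math
import random

def poisson(mu, rng):
    """Knuth's multiplication method for a Poisson(mu) draw."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate(n, alpha, lam, rng):
    """INAR(1) path started from X_0 = expected value of the process."""
    x = [round(lam / (1 - alpha))]
    for _ in range(1, n):
        surv = sum(1 for _ in range(x[-1]) if rng.random() < alpha)
        x.append(surv + poisson(lam, rng))
    return x

def yw_alpha(x):
    """Yule-Walker estimate of alpha (lag-1 sample autocorrelation)."""
    xbar = sum(x) / len(x)
    return (sum((x[t] - xbar) * (x[t - 1] - xbar) for t in range(1, len(x)))
            / sum((v - xbar) ** 2 for v in x))

rng = random.Random(7)
alpha, lam, n, reps = 0.5, 1.0, 100, 200
bias = sum(yw_alpha(simulate(n, alpha, lam, rng)) - alpha for _ in range(reps)) / reps
```

Looping the same experiment over the (n, α) grid reproduces the structure of the table below.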
n     α     Bias(α): Y-W    Bias(α): CLS    Bias(λ): Y-W    Bias(λ): CLS
50    0.1   0.3624477       0.3766839       -0.3526426      -0.4096315
50    0.3   0.1624477       0.1766839       -0.1004148      -0.114651
50    0.5   -0.03755227     -0.02331606     0.1110138       0.0967776
50    0.7   -0.2375523      -0.2233161      0.3376805       0.3234443
50    0.9   -0.4375523      -0.4233161      0.6710138       0.6567776
75    0.1   0.4722144       0.4872945       -0.4818151      -0.4968953
75    0.3   0.2722144       0.2872945       -0.2775823      -0.2926625
75    0.5   0.07221436      0.08729454      -0.06996328     -0.08504345
75    0.7   -0.1277856      -0.1127055      0.1478145       0.1327343
75    0.9   -0.3277856      -0.3127055      0.4367034       0.4216232
100   0.1   0.4609967       0.465           -0.4480229      -0.4520261
100   0.3   0.2609967       0.265           -0.2448483      -0.2488515
100   0.5   0.06099672      0.065           -0.03913397     -0.04313725
100   0.7   -0.1390033      -0.135          0.1741994       0.1701961
100   0.9   -0.3390033      -0.335          0.440866        0.4368627
200   0.1   0.3682165       0.3788248       -0.376052       -0.3866603
200   0.3   0.1682165       0.1788248       -0.1744647      -0.185073
200   0.5   -0.03178354     -0.02117524     0.02839245      0.01778415
200   0.7   -0.2317835      -0.2211752      0.2350591       0.2244508
200   0.9   -0.4317835      -0.4211752      0.4683925       0.4577841
Table 1: Bias of the different estimation methods for the INAR(1) model (λ = 1.0), based on 200 replications. (The table was computed by us in R.)
Observations on the bias for n = 75, based on the research paper:
i. An inverse relationship between sample size and the bias of the parameters (α and λ) exists for the three estimators (equations (4), (5), and (6)). Out of 20 cases with different combinations of α and n, α̂ is biased down while λ̂ is biased up in 17 cases for CML, 19 cases for CLS, and 20 cases for Y-W. On the other hand, α̂ and λ̂ can have the same bias sign only if the value of λ̂ exceeds the mean of the process in many replications.
ii. The magnitude of the biases of α̂ and λ̂ increases with α for Y-W and CLS; the increase for λ̂ is about 8 times that for α̂.
iii. The amount of bias is reciprocally related to sample size for Y-W and CLS, and to a lesser extent for CML.
The results show that the CML method is better than Y-W and CLS because:
a) The gain in terms of bias of CML over Y-W increases with sample size.
b) When α is large, the gain (in bias and MSE) of Y-W diminishes with increasing sample size.
c) When α is large, the CLS and CML estimates outperform those of Y-W in terms of bias.
However, the reliability of the CML estimates decreases for n = 50, 75 when α is small. This can be handled by the following convention: since the mean of the process is near 1 when α is small, the generated sample path may contain many zero values, and in that case the zero solution of equation (6) is taken as the estimate.
The Monte Carlo study of AR(1) shows that when α is near zero, Yule-Walker may have a smaller bias than least squares or maximum likelihood (Dent & Min, 1978).
In short, Conditional Maximum Likelihood is the best of the three classical estimators (Whittle aside); Conditional Least Squares is the second best, while Yule-Walker is last.
References
Al-Osh, M. A. & Alzaid, A. A. (1987). First-order integer-valued autoregressive (INAR(1)) process. Journal of Time Series Analysis, 8(3), 261-274.
Ali Khan, M. S. (1986). An infinite dam with random withdrawal policy. Advances in Applied Probability, 18(4), 933-95.
Brannas, K. (1994). Estimation and testing in integer-valued AR(1) models. Retrieved from http://www.sprak.umu.se/digitalAssets/39/39107_ues335.pdf
Dent, W. & Min, A. S. (1978). Monte Carlo study of autoregressive integrated moving average processes. Journal of Econometrics, 7(1), 23-55. doi:10.1016/0304-4076(78)90004-0
Klimko, L. A. & Nelson, P. I. (1978). On conditional least squares estimation for stochastic processes. The Annals of Statistics, 6(3), 629-642. Retrieved from http://www.jstor.org/stable/2958566
Maria, E. D. & Oliveira, V. L. (2004). Difference equations for the higher-order moments and cumulants of the INAR(1) model. Journal of Time Series Analysis, 25(3), 317-3. doi:10.1111/j.1467-9892.2004.01685.x
Sprott, D. A. (1983). Estimating the parameters of a convolution by maximum likelihood. Journal of the American Statistical Association, 78, 460-467.