Two-Threshold-Variable Integer-Valued

1. Introduction

Integer-valued time series data often occur in real applications, such as in the number of births at a hospital for several consecutive months; in the number of workers that are fired from a factory each month; in the number of claims and in information transmission times of insurance companies every month; and, particularly, in health studies as the daily number of infected patients or deaths due to a virus. The binomial thinning operator proposed by Steutel and van Harn [1] has been widely used to construct the autoregressive model, i.e., the integer-valued autoregressive (INAR) model (Al-Osh and Alzaid [2], McKenzie [3]), which is a popular method to analyze the integer-valued time series data and is defined as follows:

$X_{t} = α \circ X_{t - 1} + ϵ_{t}, t = 1, 2 \dots,$

where $\{ϵ_{t}\}$ is a sequence of independent and identically distributed (i.i.d.) random variables that is independent of $X_{s}$ for all $s < t$, and "$α \circ$" is the binomial thinning operator with

$α \circ X : = \sum_{i = 1}^{X} B_{i}, \forall X \in N,$

where $\{B_{i}\}$ is a sequence of i.i.d. Bernoulli random variables, independent of X, with $P (B_{i} = 1) = 1 - P (B_{i} = 0) = α \in (0, 1)$. See Du and Li [4], Silva and Oliveira [5], Silva and Silva [6], and Zhang et al. [7] for more extensions of the INAR model, among others.
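To make the thinning operator concrete, the following is a minimal sketch of binomial thinning and of an INAR(1) path. The Poisson innovation distribution, the starting value, and the function names `thin`, `poisson` and `simulate_inar1` are assumptions made for illustration; the definition above leaves the distribution of the innovations open.

```python
import math
import random


def thin(alpha: float, x: int, rng: random.Random) -> int:
    """Binomial thinning: sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson(lam: float, rng: random.Random) -> int:
    """Draw from Poisson(lam) with Knuth's multiplication method (moderate lam only)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_inar1(alpha: float, lam: float, n: int, x0: int = 0, seed: int = 0) -> list[int]:
    """Simulate X_t = alpha o X_{t-1} + eps_t with Poisson(lam) innovations (an assumption)."""
    rng = random.Random(seed)
    path, x = [], x0
    for _ in range(n):
        x = thin(alpha, x, rng) + poisson(lam, rng)
        path.append(x)
    return path
```

With Poisson innovations the stationary mean is $λ / (1 - α)$, so a long path simulated with, say, `simulate_inar1(0.3, 2.0, 20000)` should have a sample mean close to $2 / 0.7$.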

However, the above INAR model and its extensions are designed for integer-valued time series with a linear structure and are not suited to integer-valued time series with a nonlinear structure. Scotto et al. [8] proposed a discrete counterpart of the conventional max-autoregressive process of order one. It is based on the binomial thinning operator and driven by a sequence of i.i.d. non-negative integer-valued random variables with either a regularly varying right tail or an exponential-type right tail. Aleksić and Ristić [9] introduced a new first-order minification INAR model to solve a problem that can arise when the binomial thinning operator or the negative binomial thinning operator is used: if one of these thinning operators is used in the construction of the minification model, the process may become constantly zero over time.

The threshold autoregressive (TAR) model proposed by Tong [10] provides an efficient method with which to handle continuous-valued time series data with a nonlinear structure. Boero and Marrocu [11] and Potter [12] applied threshold models to economics and finance. Dueker et al. [13] proposed a contemporaneous TAR model for the bond market. See Tong [14] for more discussion on continuous-valued threshold models. Li and Tong [15] proposed a faster approach (called the nested sub-sample search algorithm) to fit a threshold model using the least squares method. In analogy to the TAR model, Monteiro et al. [16] introduced an integer-valued self-exciting threshold autoregressive process (SETINAR(2,1)), which is driven by an independent Poisson-distributed random variable. Wang et al. [17] proposed a self-excited threshold Poisson autoregressive model, which assumes a two-regime structure of the conditional mean process according to the magnitude of the lagged observations. Yang et al. [18] proposed an integer-valued threshold autoregressive process driven by an independent negative-binomial distributed random variable and the negative binomial thinning operator. To explore the relationship between stock return autocorrelation and trading volume, Zhang et al. [19] proposed a multiple-threshold-variable autoregressive model and applied it to analyze quarterly U.S. real GNP data. However, the multiple-threshold-variable autoregressive model is restricted to continuous-valued time series, and few studies have discussed a similar model for nonnegative integer-valued time series. To fill this gap, we propose a new two-threshold-variable INAR (2-TINAR) model, which provides an alternative way to analyze nonnegative integer-valued time series with a nonlinear structure.

The paper is organized as follows. Section 2 defines the 2-TINAR model and establishes its stability properties. Section 3 considers conditional least squares (CLS) estimation. Section 4 gives a simulation study. Section 5 considers two real data applications to illustrate the effectiveness of the proposed model. Section 6 concludes.

2. Two-Threshold-Variable Integer-Valued Autoregressive Model

In this section, we first give the definition of the 2-TINAR model and then discuss some properties of the model.

Definition 1.

The 2-TINAR process $\{X_{t}\}$ is defined as

(1) $X_{t} = \sum_{j = 1}^{4} (α_{j 1} \circ X_{t - 1} + α_{j 2} \circ X_{t - 2} + ϵ_{j t}) I_{j t} (r, s), t = 1, 2, \dots,$

where

(1) $(r, s)$ are the threshold parameters and
$\begin{matrix} I_{1 t} (r, s) & = I (X_{t - 1} > r, X_{t - 2} > s) = I (X_{t - 1} > r) I (X_{t - 2} > s), \\ I_{2 t} (r, s) & = I (X_{t - 1} \leq r, X_{t - 2} > s) = I (X_{t - 1} \leq r) I (X_{t - 2} > s), \\ I_{3 t} (r, s) & = I (X_{t - 1} \leq r, X_{t - 2} \leq s) = I (X_{t - 1} \leq r) I (X_{t - 2} \leq s), \\ I_{4 t} (r, s) & = I (X_{t - 1} > r, X_{t - 2} \leq s) = I (X_{t - 1} > r) I (X_{t - 2} \leq s); \end{matrix}$
(2) $α_{j i} \in (0, 1)$ with $0 < \sum_{i = 1}^{2} α_{j i} < 1$ for $i = 1, 2$ and $j = 1, 2, 3, 4$; "$\circ$" is the binomial thinning operator, and the operators in $α_{j 1} \circ X_{t - 1}$ and $α_{j 2} \circ X_{t - 2}$ operate independently;
(3) for fixed j, $\{ϵ_{j t}\}$ is an i.i.d. sequence with $ϵ_{j t} \sim P o i s (λ_{j})$ for all t, and it is independent of $α_{j i} \circ X_{t - i}$ and $X_{t - i}$, $i = 1, 2$.
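Definition 1 can be read as a simulation recipe: find the active regime from $(X_{t - 1}, X_{t - 2})$, thin both lags with that regime's coefficients, and add a Poisson innovation. The sketch below follows this reading. The zero starting values, the burn-in length, and the names `regime` and `simulate_2tinar` are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def thin(alpha: float, x: int, rng: random.Random) -> int:
    """Binomial thinning: sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson(lam: float, rng: random.Random) -> int:
    """Draw from Poisson(lam) with Knuth's multiplication method (moderate lam only)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def regime(x1: int, x2: int, r: int, s: int) -> int:
    """Return the j in {1, 2, 3, 4} with I_{jt}(r, s) = 1, where x1 = X_{t-1}, x2 = X_{t-2}."""
    if x2 > s:
        return 1 if x1 > r else 2
    return 4 if x1 > r else 3


def simulate_2tinar(params, r: int, s: int, n: int, burn_in: int = 200, seed: int = 0) -> list[int]:
    """Simulate n values; params[j - 1] = (alpha_j1, alpha_j2, lambda_j) for regime j."""
    rng = random.Random(seed)
    x1 = x2 = 0  # assumed starting values for X_{t-1} and X_{t-2}
    path = []
    for t in range(n + burn_in):
        a1, a2, lam = params[regime(x1, x2, r, s) - 1]
        x = thin(a1, x1, rng) + thin(a2, x2, rng) + poisson(lam, rng)
        x2, x1 = x1, x
        if t >= burn_in:  # discard the burn-in so the start values matter less
            path.append(x)
    return path
```

For instance, `simulate_2tinar([(0.3, 0.2, 7), (0.2, 0.25, 6), (0.2, 0.3, 8), (0.3, 0.2, 6)], r=13, s=11, n=500)` draws a path under the parameter combination labelled (C1) in the simulation study.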

In the following, we consider the properties of the 2-TINAR model, including stationarity, ergodicity, mean and variance. These are given in the next three propositions, whose proofs are deferred to Appendix A.

Proposition 1.

Let $Y_{t} = {(X_{t}, X_{t - 1})}^{⊤}$ ; then,

(1) $\{Y_{t}\}$ is an irreducible, aperiodic, and positive recurrent Markov chain;
(2) $\{Y_{t}\}$ is an ergodic sequence, and a strictly stationary process satisfying (1) exists.

Proposition 2.

Assume that $\{X_{t}\}$ is generated from (1); then, $E (X_{t}^{k}) < \infty, k = 1, 2, 3 .$

Proposition 3.

Assume that $\{X_{t}\}$ is generated from (1) and $F_{t - 1} = σ \{X_{t - i}, i \geq 1\}$ . Then, the mean and variance are

(1) $E (X_{t} | F_{t - 1}) = \sum_{j = 1}^{4} (α_{j 1} X_{t - 1} + α_{j 2} X_{t - 2} + λ_{j}) I_{j t} (r, s);$
(2) $E (X_{t}) = \sum_{j = 1}^{4} p_{j} (α_{j 1} u_{j} + α_{j 2} u_{j}^{*} + λ_{j});$
(3) $V a r (X_{t} | F_{t - 1}) = \sum_{j = 1}^{4} (α_{j 1} (1 - α_{j 1}) X_{t - 1} + α_{j 2} (1 - α_{j 2}) X_{t - 2} + λ_{j}) I_{j t} (r, s);$
(4) $V a r (X_{t}) = \sum_{j = 1}^{4} (α_{j 1}^{2} (p_{j} (v_{j} + u_{j}^{2}) - p_{j}^{2} u_{j}^{2}) + α_{j 2}^{2} (p_{j} (v_{j}^{*} + {(u_{j}^{*})}^{2}) - p_{j}^{2} {(u_{j}^{*})}^{2}) + 2 (α_{j 1} α_{j 2} w_{j} p_{j} - α_{j 1} α_{j 2} p_{j}^{2} u_{j} u_{j}^{*}) + 2 (α_{j 1} λ_{j} p_{j} u_{j} - α_{j 1} λ_{j} p_{j}^{2} u_{j}) + 2 (α_{j 2} λ_{j} p_{j} u_{j}^{*} - α_{j 2} λ_{j} p_{j}^{2} u_{j}^{*}) + α_{j 1} (1 - α_{j 1}) p_{j} u_{j} + α_{j 2} (1 - α_{j 2}) p_{j} u_{j}^{*} + λ_{j} p_{j}) + \sum_{m = 1}^{6} C_{m}$ , where $p_{j}, u_{j}, v_{j}, u_{j}^{*}, v_{j}^{*}, w_{j}$ and $C_{m}$ , $m = 1, 2, \dots, 6$ , are given in Appendix A.
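The conditional moments in parts (1) and (3) are simple to evaluate once the active regime is known. The following sketch computes both for given lagged values; the function names and the layout of `params` are assumptions made for illustration.

```python
def regime(x1: int, x2: int, r: int, s: int) -> int:
    """Return the j in {1, 2, 3, 4} with I_{jt}(r, s) = 1, where x1 = X_{t-1}, x2 = X_{t-2}."""
    if x2 > s:
        return 1 if x1 > r else 2
    return 4 if x1 > r else 3


def conditional_moments(x1: int, x2: int, params, r: int, s: int) -> tuple[float, float]:
    """Return (E(X_t | F_{t-1}), Var(X_t | F_{t-1})) as in Proposition 3, parts (1) and (3).

    params[j - 1] = (alpha_j1, alpha_j2, lambda_j) for regime j.
    """
    a1, a2, lam = params[regime(x1, x2, r, s) - 1]
    mean = a1 * x1 + a2 * x2 + lam
    var = a1 * (1 - a1) * x1 + a2 * (1 - a2) * x2 + lam
    return mean, var
```

With the (C1) values used in the simulation study and $(X_{t - 1}, X_{t - 2}) = (20, 15)$, regime 1 is active and the sketch returns a conditional mean of 16 and a conditional variance of 13.6. Since $α (1 - α) < α$, the conditional variance never exceeds the conditional mean.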

3. Conditional Least Squares Estimation

In this section, we use the CLS method to estimate the parameters of the 2-TINAR model. We consider two cases: in the first, the threshold values r and s are known; in the second, they are unknown.

3.1. Known Case of $(r, s)$

In this part, we assume that $\{X_{t}, t = 1, \dots, n\}$ comprises observations generated by (1). Let $ψ_{j} = {(α_{j 1}, α_{j 2}, λ_{j})}^{⊤}$ , $j = 1, 2, 3, 4$ , be the vector of regression parameters of regime j, $ϕ = {(ψ_{1}^{⊤}, ψ_{2}^{⊤}, ψ_{3}^{⊤}, ψ_{4}^{⊤})}^{⊤} = {(ϕ_{1}, \dots, ϕ_{12})}^{⊤}$ , and $g (ϕ, X_{t - 1}, X_{t - 2}) = E_{ϕ} (X_{t} | F_{t - 1}) = \sum_{j = 1}^{4} (α_{j 1} X_{t - 1} + α_{j 2} X_{t - 2} + λ_{j}) I_{j t} (r, s) .$ Then, the CLS estimate ${\hat{ψ}}_{j, C L S} = {({\hat{α}}_{j 1, C L S}, {\hat{α}}_{j 2, C L S}, {\hat{λ}}_{j, C L S})}^{⊤}$ is obtained by minimizing the function

(2) $Q (ϕ) = \sum_{t = 3}^{n} {(X_{t} - g (ϕ, X_{t - 1}, X_{t - 2}))}^{2} = \sum_{t = 3}^{n} q_{t}^{2} (ϕ),$

where $q_{t} (ϕ) = X_{t} - \sum_{j = 1}^{4} (α_{j 1} X_{t - 1} + α_{j 2} X_{t - 2} + λ_{j}) I_{j t} (r, s) .$ Then, the closed form of ${\hat{ψ}}_{j, C L S}$ is obtained from the following equations:

$\left\{\begin{matrix} \frac{\partial Q (ϕ)}{\partial α_{j 1}} = - 2 \sum_{t = 3}^{n} (X_{t} - \sum_{k = 1}^{4} (α_{k 1} X_{t - 1} + α_{k 2} X_{t - 2} + λ_{k}) I_{k t} (r, s)) X_{t - 1} I_{j t} (r, s) = 0, \\ \frac{\partial Q (ϕ)}{\partial α_{j 2}} = - 2 \sum_{t = 3}^{n} (X_{t} - \sum_{k = 1}^{4} (α_{k 1} X_{t - 1} + α_{k 2} X_{t - 2} + λ_{k}) I_{k t} (r, s)) X_{t - 2} I_{j t} (r, s) = 0, \\ \frac{\partial Q (ϕ)}{\partial λ_{j}} = - 2 \sum_{t = 3}^{n} (X_{t} - \sum_{k = 1}^{4} (α_{k 1} X_{t - 1} + α_{k 2} X_{t - 2} + λ_{k}) I_{k t} (r, s)) I_{j t} (r, s) = 0, \end{matrix}\right.$ for $j = 1, 2, 3, 4$ .

Denote

$A_{j} = (\begin{matrix} \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 1}^{2} & \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 1} X_{t - 2} & \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 1} \\ \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 1} X_{t - 2} & \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 2}^{2} & \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 2} \\ \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 1} & \sum_{t = 3}^{n} I_{j t} (r, s) X_{t - 2} & \sum_{t = 3}^{n} I_{j t} (r, s) \end{matrix}),$

$D_{j} = {(\begin{matrix} \sum_{t = 3}^{n} I_{j t} (r, s) X_{t} X_{t - 1}, \sum_{t = 3}^{n} I_{j t} (r, s) X_{t} X_{t - 2}, \sum_{t = 3}^{n} I_{j t} (r, s) X_{t} \end{matrix})}^{⊤} .$

Then, provided that $A_{j}$ is invertible, ${\hat{ψ}}_{j, C L S} = A_{j}^{- 1} D_{j}$ .
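The closed form amounts to one ordinary least squares fit per regime, with regressors $(X_{t - 1}, X_{t - 2}, 1)$. A minimal sketch that accumulates $A_{j}$ and $D_{j}$ and solves the four $3 \times 3$ systems is given below. The helper `solve3`, the function names, and the choice to raise an error for a singular $A_{j}$ are assumptions made for illustration.

```python
def regime(x1: int, x2: int, r: int, s: int) -> int:
    """Return the j in {1, 2, 3, 4} with I_{jt}(r, s) = 1, where x1 = X_{t-1}, x2 = X_{t-2}."""
    if x2 > s:
        return 1 if x1 > r else 2
    return 4 if x1 > r else 3


def solve3(a, d):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, d)]
    for c in range(3):
        p = max(range(c, 3), key=lambda i: abs(m[i][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("A_j is singular: too few distinct observations in this regime")
        m[c], m[p] = m[p], m[c]
        for i in range(3):
            if i != c:
                f = m[i][c] / m[c][c]
                m[i] = [u - f * v for u, v in zip(m[i], m[c])]
    return [m[i][3] / m[i][i] for i in range(3)]


def cls_fit(x, r: int, s: int):
    """Return {j: [alpha_j1_hat, alpha_j2_hat, lambda_j_hat]} from psi_j = A_j^{-1} D_j."""
    a = {j: [[0.0] * 3 for _ in range(3)] for j in range(1, 5)}
    d = {j: [0.0] * 3 for j in range(1, 5)}
    for t in range(2, len(x)):  # 0-based t = 2 corresponds to t = 3 in the text
        z = (x[t - 1], x[t - 2], 1.0)
        j = regime(x[t - 1], x[t - 2], r, s)
        for u in range(3):
            d[j][u] += z[u] * x[t]
            for v in range(3):
                a[j][u][v] += z[u] * z[v]
    return {j: solve3(a[j], d[j]) for j in range(1, 5)}
```

Note that this unconstrained solution is not guaranteed to satisfy $α_{j i} \in (0, 1)$ in small samples, so the estimates should be checked against the parameter space.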

To study the asymptotic behaviour of the estimators, we make the following assumptions about the underlying process and the parameter space.

Assumption 1.

If $\{X_{t}\}$ is generated from (1), then the parameter space of ϕ is a compact subset of $D \times R_{+}^{4}$ , where $D = {(0, 1)}^{8} \subset R_{+}^{8}$ .

Assumption 2.

The model (1) is identifiable, i.e., $p_{ϕ} \neq p_{ϕ_{0}}$ if $ϕ \neq ϕ_{0}$ , where $p_{ϕ}$ denotes the marginal distribution of $\{X_{t}\}$ with parameter ϕ.

Assumption 1 requires the parameter space to be compact so that the asymptotic properties of the CLS estimator can be guaranteed; this is common for INAR models. The identifiability in Assumption 2 concerns whether the model parameters can be uniquely determined, which is the foundation of parameter estimation.

The following theorem establishes the asymptotic properties of the CLS estimator; its proof is given in Appendix A.

Theorem 1.

Under Assumptions 1 and 2, ${\hat{ϕ}}_{C L S}$ is strongly consistent and asymptotically normally distributed with

$\sqrt{T} ({\hat{ϕ}}_{C L S} - ϕ_{0}) \overset{d}{⟶} N (0, V^{- 1} W V^{- 1}),$

where $V = E (\frac{\partial E (X_{t} | F_{t - 1})}{\partial ϕ} \frac{\partial E (X_{t} | F_{t - 1})}{\partial ϕ^{⊤}})$ and $W = E ({(q_{t} (ϕ))}^{2} \frac{\partial E (X_{t} | F_{t - 1})}{\partial ϕ} \frac{\partial E (X_{t} | F_{t - 1})}{\partial ϕ^{⊤}}) .$
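Because the conditional mean is linear in the parameters within each regime, the matrices V and W have a simple structure. The derivation below is a sketch of that structure; the shorthand $Z_{t}$ and the blocks $V_{j}$, $W_{j}$ are notation introduced here for illustration and do not appear in the paper.

```latex
% Z_t is shorthand introduced here for the regressor vector; it is not the paper's notation.
\[
Z_t = (X_{t-1}, X_{t-2}, 1)^{\top}, \qquad
\frac{\partial E(X_t \mid F_{t-1})}{\partial \psi_j} = I_{jt}(r,s)\, Z_t , \quad j = 1, 2, 3, 4 .
\]
% Since I_{jt} I_{kt} = 0 for j \neq k and I_{jt}^2 = I_{jt}, all cross-regime blocks vanish:
\[
V = \operatorname{diag}(V_1, \dots, V_4), \qquad V_j = E\bigl( I_{jt}(r,s)\, Z_t Z_t^{\top} \bigr),
\]
\[
W = \operatorname{diag}(W_1, \dots, W_4), \qquad W_j = E\bigl( q_t^2(\phi)\, I_{jt}(r,s)\, Z_t Z_t^{\top} \bigr).
\]
```

Under this reading, the asymptotic covariance $V^{- 1} W V^{- 1}$ is block diagonal as well, and $V_{j}$ has the sample analogue $A_{j}$ divided by the number of terms in the sum, which suggests a plug-in estimate of the standard errors.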

3.2. Unknown Case of $(r, s)$

When $(r, s)$ is unknown, we first estimate $(r, s)$ by minimizing (2) through the following steps:

(1) For each candidate $(r, s)$ in CR, we estimate $\hat{ϕ}$ by minimizing $Q (ϕ, r, s)$ , i.e.,
$\hat{ϕ} = \underset{ϕ}{arg min} Q (ϕ, r, s) .$
(2) The estimator of the thresholds $(r, s)$ is obtained by searching over all of the candidates for $(r, s)$ in CR, i.e.,
$(\hat{r}, \hat{s}) = \underset{(r, s) \in C R}{arg min} Q (\hat{ϕ}, r, s),$

where CR is the set of candidates for $(r, s)$ , with CR = $\{X_{i}, X_{i + 1}\}$ , $i = (0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8)$ . This choice of CR keeps the computational burden low while preserving the accuracy of the estimates: $\{X_{i}, X_{i + 1}\}$ gives thirteen candidates for $(\hat{r}, \hat{s})$ . In effect, the estimators of $(r, s)$ are searched for from $X_{0.20}$ to $X_{0.85}$ , which is a sufficient search range for the thresholds. Hence, this method guarantees a reasonable and sufficient search range without too much pressure on computing.
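The two-step search can be sketched as follows. The sketch rests on one reading of CR that is an assumption: $X_{i}$ is taken to be the empirical quantile of the sample at level i, and $X_{i + 1}$ the quantile at the next level, $i + 0.05$, which yields thirteen pairs ranging from $X_{0.20}$ to $X_{0.85}$. The nearest-rank quantile rule and the callable `q_of`, which should return the minimized criterion $Q (\hat{ϕ}, r, s)$ for a given pair, are also hypothetical.

```python
def quantile(sorted_x, percent: int):
    """Nearest-rank empirical quantile; percent is an integer such as 20 for level 0.20."""
    n = len(sorted_x)
    k = (percent * n + 99) // 100  # integer ceil(percent * n / 100)
    return sorted_x[max(k, 1) - 1]


def candidate_thresholds(x, percents=range(20, 85, 5)):
    """Return the candidate pairs (X_i, X_{i+1}) for levels i = 0.20, 0.25, ..., 0.80."""
    xs = sorted(x)
    return [(quantile(xs, p), quantile(xs, p + 5)) for p in percents]


def search_thresholds(x, q_of):
    """Return the candidate (r, s) that minimises q_of(x, r, s)."""
    return min(candidate_thresholds(x), key=lambda rs: q_of(x, rs[0], rs[1]))
```

In practice `q_of` would fit the model by CLS for the given $(r, s)$ and return the residual sum of squares, so the cost of the search is thirteen CLS fits.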

Based on the initial setting of the parameter space, both of the thresholds r and s are integers. Therefore, the consistency of $\hat{r}$ means that $\hat{r} = r$ exactly with probability tending to one, and likewise for $\hat{s}$ . It is therefore feasible to estimate the other parameters as if the thresholds $(r, s)$ were known, similar to the discussion in Wang et al. [17], because the estimates of the other parameters obtained with unknown thresholds are asymptotically identical to those obtained with known $(r, s)$ . Hence, in this subsection, we treat the thresholds r and s as known parameters and consider the consistency of $ψ_{j} = {(α_{j 1}, α_{j 2}, λ_{j})}^{⊤}, j = 1, 2, 3, 4$ .

4. Simulation Study

In this section, we illustrate the finite-sample properties of the CLS estimates under the known and unknown cases of $(r, s)$ . In the simulation, we use m = 10,000 replications and set the sample size to $T = 500, 1000, 2000$ and 10,000.

4.1. Known Case of $(r, s)$

In the simulation study, we consider the following parameter combinations:

(C1) $ϕ = (0.3, 0.2, 7, 0.2, 0.25, 6, 0.2, 0.3, 8, 0.3, 0.2, 6)$ with $(r, s) = (13, 11)$ ,
(C2) $ϕ = (0.3, 0.35, 15, 0.3, 0.35, 20, 0.3, 0.4, 25, 0.35, 0.25, 15)$ with $(r, s) = (55, 53)$ .

As discussed in Li and Tong [15], if the proportion of observations in one regime is less than $5\%$ of the whole sample, the estimation results may not be reliable. To illustrate the reasonableness of $(r, s)$ given in (C1) and (C2), we give two sample paths generated by the 2-TINAR model with (C1) and (C2) in Figure 1, which shows that the proportion of the observations in each regime is no less than $20\%$ . In Figure 1, the circles are the sample points, the red dotted line marks the value of r, and the blue dotted line marks the value of s.
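The regime proportions can also be checked numerically rather than read off a plot. The following sketch counts how often each regime is active along a path and compares the smallest share with a minimum share. The function names are assumptions made for illustration, and the default of 0.05 mirrors the $5\%$ rule quoted above.

```python
def regime(x1: int, x2: int, r: int, s: int) -> int:
    """Return the j in {1, 2, 3, 4} with I_{jt}(r, s) = 1, where x1 = X_{t-1}, x2 = X_{t-2}."""
    if x2 > s:
        return 1 if x1 > r else 2
    return 4 if x1 > r else 3


def regime_shares(x, r: int, s: int) -> list[float]:
    """Return the share of time points t = 3, ..., n that fall in regimes 1 to 4."""
    counts = [0, 0, 0, 0]
    for t in range(2, len(x)):
        counts[regime(x[t - 1], x[t - 2], r, s) - 1] += 1
    total = len(x) - 2
    return [c / total for c in counts]


def regimes_populated(x, r: int, s: int, min_share: float = 0.05) -> bool:
    """True if every regime holds at least min_share of the usable observations."""
    return min(regime_shares(x, r, s)) >= min_share
```

For the short path `[0, 0, 5, 5, 0, 0]` with $r = s = 2$, each regime is visited exactly once, so every share is 0.25.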

The mean and standard deviation (SD) of the estimates are summarized in Table 1, which shows that the CLS method performs reasonably well when $(r, s)$ is known: as the sample size increases, the mean gradually approaches the true value of the parameter and the SD gradually decreases.

To further assess the CLS estimates, we present the boxplots for the parameter combination (C1) in Figure 2 (the boxplots for (C2) are similar, and we omit them). The QQ-plots for the parameter combinations (C1) and (C2) indicate the asymptotic normality of the CLS estimator. To save space, we omit the QQ-plots, which are available upon request. All of these highlight the good performance of the CLS estimator when $(r, s)$ is known.

4.2. Unknown Case of $(r, s)$

In this part, we first let (C3) $(0.2, 0.25, 3, 0.25, 0.35, 5, 0.3, 0.3, 4, 0.3, 0.25, 6, 13, 14)$ and (C4) $(0.30, 0.25, 6, 0.25, 0.35, 6, 0.4, 0.3, 9, 0.3, 0.35, 8, 30, 31)$ . Then, we use the approach given in Section 3.2 to obtain $(\hat{r}, \hat{s})$ .

To illustrate the reasonableness of $(r, s)$ given in (C3) and (C4), we give two sample paths generated by the 2-TINAR model with (C3) and (C4) in Figure 3, which shows that the proportion of the observations in each regime is no less than $20\%$ . In Figure 3, the circles and the red and blue dotted lines have the same meaning as in Figure 1.

The mean and SD of the estimates are summarized in Table 2, from which we can see that as the sample size increases, the mean gradually approaches the true value of the parameter and the SD gradually decreases. The boxplots for (C3) and (C4) are similar to those for (C1) and (C2). We present the boxplots for the parameter combination (C3) in Figure 4 (the boxplots for (C4) are similar, and we omit them).

From Figure 4, as the sample size increases, the median of the estimator moves closer to the true value and the interquartile range and overall range of the estimated values become narrower, both of which indicate the consistency of the estimators. The QQ-plots for the parameter combinations (C3) and (C4) indicate the asymptotic normality of the CLS estimator. For the same reason as before, we omit the QQ-plots. All of these highlight the good performance of the CLS estimator when $(r, s)$ is unknown.

5. Two Real Examples

In this section, we use 2-TINAR(2) models to study two datasets on stocks listed on the New York Stock Exchange (NYSE).

5.1. Siparex Croissance Stock

In this subsection, we consider the daily number of trades of a stock listed in the NYSE (Siparex Croissance). By computation, the mean is 10.0190 and the variance is 129.7295, which shows that this dataset is over-dispersed and implies that it may be better suited to the piecewise structure. Figure 5 shows the path of the data, whose autocorrelation (ACF) and partial autocorrelation functions (PACF) are presented in Figure 6.
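Over-dispersion can be quantified by the index of dispersion, the ratio of the sample variance to the sample mean, which equals one for a Poisson distribution. The sketch below computes it for a count series. The function name and the use of the unbiased sample variance (divisor $n - 1$) are assumptions, since the text does not state which variance estimator was used.

```python
def dispersion_index(x) -> float:
    """Return the sample variance divided by the sample mean of a count series."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)  # unbiased sample variance (assumed)
    return var / mean
```

With the moments reported above, the ratio is $129.7295 / 10.0190 \approx 12.9$, far above one.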

We compare the proposed model with the max-INAR(1) model with geometric innovations (Scotto et al. [8]), the min-INAR(1) model (Aleksić and Ristić [9]), the Poisson INAR(2) (P-INAR) model (Du and Li [4]), and the SETINAR(2,1) model (Monteiro et al. [16]) with $Z_{t} \sim P o i s (λ_{j})$ . Each model is fitted to the data set by the CLS method, and the models are compared by their mean squared error (MSE) and mean absolute deviation error (MADE), where

$M S E = \frac{1}{T - 3} \sum_{t = 3}^{T} {(x_{t} - {\hat{x}}_{t})}^{2}, M A D E = \frac{1}{T - 3} \sum_{t = 3}^{T} |x_{t} - {\hat{x}}_{t}| .$
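Both criteria can be computed directly from the observations and the one-step predictions. The sketch below follows the formulas exactly as printed, including the divisor $T - 3$ (the sums themselves run over $T - 2$ terms). The function name and the convention that `x_hat` has the same length as `x`, with its first two entries ignored, are assumptions made for illustration.

```python
def mse_made(x, x_hat) -> tuple[float, float]:
    """Return (MSE, MADE) over t = 3, ..., T with the divisor T - 3 used in the text."""
    T = len(x)
    errors = [x[t] - x_hat[t] for t in range(2, T)]  # 0-based t = 2 is t = 3 in the text
    mse = sum(e * e for e in errors) / (T - 3)
    made = sum(abs(e) for e in errors) / (T - 3)
    return mse, made
```

For example, `mse_made([1, 2, 3, 4, 5], [0, 0, 2, 6, 5])` uses the errors 1, -2 and 0 and returns `(2.5, 1.5)`.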

For each model, we report the CLS estimates of the parameters, the SD of the estimates, and the in-sample and out-of-sample MSE and MADE values.

For the in-sample values, all of the observations are used to estimate the parameters, while for the out-of-sample values, the first 3618 observations are used to estimate the parameters and the last $m = 15$ observations are predicted. The threshold r for the SETINAR(2,1) model and the thresholds $(r, s)$ for the 2-TINAR(2) model are obtained by the method described in Section 3.2.

The CLS estimates, MSE and MADE are summarized in Table 3, from which we can see that the max-INAR(1) model and the min-INAR(1) model do not fit well, so these two models are not suitable for this kind of data. The 2-TINAR(2) model attains the smallest MSE and MADE values; hence, 2-TINAR(2) is more appropriate for this data set.

5.2. Westar Energy Stock

In this subsection, we consider the number of trades in 5-min intervals between 9:45 a.m. and 4:00 p.m. of a stock listed in the NYSE (Westar Energy, Inc. (WR)), which belongs to the industry subsector conventional electricity. The time period covered is the first quarter of 2005 (3 January 2005–31 March 2005) with 61 trading days; the sample size is $T = 4575$ . Data are taken from the Trades and Quotes (TAQ) dataset. By computation, the mean is 9.6070 and the variance is 34.8908, which shows it is overdispersed and implies that the piecewise structure may be more suitable for this set of data. Figure 7 shows the path of the data, whose autocorrelation (ACF) and partial autocorrelation functions (PACF) are presented in Figure 8.

As in Section 5.1, we compare the proposed model with the max-INAR(1) model with geometric innovations (Scotto et al. [8]), the min-INAR(1) model (Aleksić and Ristić [9]), the Poisson INAR(2) (P-INAR) model (Du and Li [4]), and the SETINAR(2,1) model (Monteiro et al. [16]) with $Z_{t} \sim P o i s (λ_{j})$ . Each model is fitted to the data set by the CLS method, and the models are compared by their MSE and MADE. For each model, we report the CLS estimates of the parameters, the SD of the estimates, and the in-sample and out-of-sample MSE and MADE values.

For the in-sample values, all observations are used to estimate the parameters, while for the out-of-sample values, the first 4560 observations are used to estimate the parameters and the last $m = 15$ observations are predicted. The threshold r for the SETINAR(2,1) model and the thresholds $(r, s)$ for the 2-TINAR(2) model are obtained by the method described in Section 3.2. The CLS estimates, MSE and MADE are summarized in Table 4, from which we can see that the max-INAR(1) model and the min-INAR(1) model again do not fit well, so these two models are not suitable for this kind of data. The 2-TINAR(2) model attains the smallest MSE and MADE values; hence, 2-TINAR(2) is more appropriate for this data set.

One of the main novelties of the proposed model is that the regime is determined by two lagged observations through the thresholds r and s, so more past information is taken into account, which makes the model more practical. Compared with other models, the 2-TINAR(2) model allows different innovations in different regimes, which makes it more flexible, but it also increases the number of parameters and the complexity of the model. Therefore, the 2-TINAR(2) model is more suitable for the analysis of highly dispersed datasets, such as the two real examples in this paper, and is less suitable for data with small variance and little variation.

6. Conclusions

In this paper, we propose a new two-threshold-variable INAR(2) model, which is a generalization of existing INAR models. We consider the CLS estimate with known $(r, s)$ and unknown $(r, s)$ , respectively. To verify the asymptotic behaviour of the estimators, we give the results of the simulation in each case. The real examples demonstrate the superior performance of the proposed model.

In model (1), we use $X_{t - 1}$ and $X_{t - 2}$ as threshold variables, but other variables can also be used as threshold variables, i.e., the method considered here can be easily extended to other INAR models, such as the INAR models with explanatory variables or covariates defined in Enciso-Mora et al. [20],

$\begin{matrix} X_{t} = α \circ X_{t - 1} + Z_{t}, Z_{t} \sim P o i s (exp (w_{t} γ)), \end{matrix}$

$\begin{matrix} X_{t} = α_{t} \circ X_{t - 1} + Z_{t}, α_{t} = {[1 + exp (w_{t} δ)]}^{- 1}, Z_{t} \sim P o i s (exp (w_{t} γ)), \end{matrix}$

where $w_{t}$ is an explanatory variable or covariate; then, we can use $X_{t - 1}$ and $w_{t}$ as threshold variables.

Furthermore, there should be more efficient methods of determining the search range for the threshold estimates, and the possibility of extending this model to the high-dimensional situation is worthy of attention. Moreover, the results can be extended to the three-threshold-variable case. These remain topics of future study. Extensions to the models in Chen et al. [21], Qian and Zhu [22], Su and Zhu [23] and Zhang et al. [7] are similar. Details will be discussed in a future project.

Author Contributions

Conceptualization, F.Z.; methodology, F.Z. and J.Z.; software, J.Z. and H.C.; validation, J.Z. and H.C.; formal analysis, J.Z., F.Z. and H.C.; investigation, J.Z. and H.C.; resources, F.Z.; data curation, J.Z. and H.C.; writing—original draft preparation, J.Z.; writing—review and editing, F.Z. and H.C.; visualization, J.Z. and H.C.; supervision, F.Z. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

No new data were created in this review.

Acknowledgments

We thank three reviewers for their insightful and constructive comments, which greatly improved the overall presentation.

Conflicts of Interest

The authors declare no conflict of interest.

Figures and Tables

Figure 1. Sample paths for combinations (C1) and (C2).

Figure 2. Boxplots of (C1).

Figure 3. Sample paths of (C3) and (C4).

Figure 4. Boxplots of (C3).

Figure 5. Path of the stock data.

Figure 6. Daily number of trades of a stock data: (a). ACF (b). PACF.

Figure 7. Path of the stock data.

Figure 8. The number of trades of a stock data: (a). ACF (b). PACF.

Table 1

The mean and standard deviation (in brackets) of the estimates of 2-TINAR model with known $(r, s)$ .

T	$α_{11}$	$α_{12}$	$λ_{1}$	$α_{21}$	$α_{22}$	$λ_{2}$	$α_{31}$	$α_{32}$	$λ_{3}$	$α_{41}$	$α_{42}$	$λ_{4}$
	(C1) = (0.3, 0.2, 7, 0.2, 0.25, 6, 0.2, 0.3, 8, 0.3, 0.2, 6)
500	0.2913	0.1884	7.3424	0.2087	0.2495	6.0019	0.2090	0.3034	8.1320	0.3301	0.2921	6.5232
	(0.1106)	(0.0908)	(2.1188)	(0.1225)	(0.1116)	(2.0209)	(0.1285)	(0.1692)	(2.1968)	(0.2155)	(0.2147)	(4.1374)
1000	0.2951	0.1936	7.1725	0.2022	0.2493	5.9952	0.2000	0.2974	8.0702	0.3060	0.2373	6.1198
	(0.0781)	(0.0658)	(1.4782)	(0.0918)	(0.0802)	(1.4289)	(0.0990)	(0.1280)	(1.5499)	(0.1634)	(0.1635)	(3.1374)
2000	0.2978	0.1965	7.0862	0.2004	0.2497	5.9980	0.1991	0.2981	8.0362	0.3014	0.2110	6.0076
	(0.0548)	(0.0467)	(1.0395)	(0.0657)	(0.0565)	(0.9988)	(0.0728)	(0.0921)	(1.0918)	(0.1208)	(0.1281)	(2.3074)
10,000	0.2995	0.1994	7.0170	0.2001	0.2499	5.9998	0.1999	0.2997	8.0048	0.3001	0.1998	6.0013
	(0.0244)	(0.0209)	(0.4650)	(0.0294)	(0.0252)	(0.4464)	(0.0328)	(0.0413)	(0.4897)	(0.0537)	(0.0645)	(1.0270)
	(C2) = (0.3, 0.35, 15, 0.3, 0.35, 20, 0.3, 0.4, 25, 0.35, 0.25, 15)
500	0.2955	0.3395	17.3932	0.3014	0.3460	20.2152	0.2978	0.3985	25.3153	0.3506	0.2493	15.7572
	(0.1487)	(0.1278)	(11.5915)	(0.1017)	(0.1130)	(8.1557)	(0.1371)	(0.1181)	(6.2198)	(0.1245)	(0.1334)	(8.8644)
1000	0.2948	0.3448	15.8756	0.3007	0.3482	20.0721	0.2974	0.3996	25.1709	0.3503	0.2479	15.1737
	(0.1088)	(0.0898)	(8.7210)	(0.0713)	(0.0793)	(5.7577)	(0.1053)	(0.0891)	(5.2321)	(0.0870)	(0.0994)	(6.5817)
2000	0.2969	0.3475	15.3461	0.3001	0.3489	20.0642	0.2986	0.4002	25.0691	0.3502	0.2483	15.0749
	(0.0775)	(0.0629)	(6.4147)	(0.0502)	(0.0558)	(4.0647)	(0.0774)	(0.0661)	(4.1621)	(0.0611)	(0.0708)	(4.7424)
10,000	0.2995	0.3495	15.0594	0.3001	0.3499	20.0017	0.2999	0.4001	25.0010	0.3500	0.2497	15.0124
	(0.0343)	(0.0279)	(2.8665)	(0.0223)	(0.0250)	(1.8115)	(0.0363)	(0.0311)	(2.1124)	(0.0272)	(0.0317)	(2.1158)

Table 2

The mean and standard deviation (in brackets) of the estimates of 2-TINAR model with unknown $(r, s)$ .

T	$α_{11}$	$α_{12}$	$λ_{1}$	$α_{21}$	$α_{22}$	$λ_{2}$	$α_{31}$	$α_{32}$	$λ_{3}$	$α_{41}$	$α_{42}$	$λ_{4}$	r	s
	(C3) = (0.2, 0.25, 3, 0.25, 0.35, 5, 0.3, 0.3, 4, 0.3, 0.25, 6, 13, 14)
500	0.3165	0.3882	9.6383	0.2897	0.4308	6.7583	0.3092	0.3000	3.9558	0.3908	0.2547	6.6819	9.9851	10.5800
	(0.2458)	(0.3007)	(7.8787)	(0.2093)	(0.2878)	(5.0400)	(0.0655)	(0.0590)	(0.6470)	(0.2612)	(0.1715)	(4.5692)	(2.0175)	(2.1507)
1000	0.2482	0.2966	6.6049	0.2544	0.3860	5.6138	0.3041	0.3012	3.9727	0.3370	0.2404	6.2223	11.3103	12.0423
	(0.1764)	(0.2082)	(5.6639)	(0.1584)	(0.2204)	(3.8261)	(0.0471)	(0.0419)	(0.4652)	(0.1945)	(0.1337)	(3.4623)	(1.9055)	(2.0830)
2000	0.2156	0.2634	4.5700	0.2486	0.3584	5.1460	0.3014	0.3007	3.9891	0.3127	0.2427	6.0176	12.3400	13.2323
	(0.1327)	(0.1540)	(3.9383)	(0.1235)	(0.1626)	(2.8763)	(0.0325)	(0.0291)	(0.3225)	(0.1352)	(0.0971)	(2.3564)	(1.3604)	(1.5242)
10,000	0.2003	0.2486	3.0470	0.2503	0.3491	5.0088	0.2996	0.2998	4.0057	0.3004	0.2501	5.9923	12.9962	13.9960
	(0.0627)	(0.0725)	(1.4592)	(0.0569)	(0.0717)	(1.3053)	(0.0133)	(0.0127)	(0.1368)	(0.0537)	(0.0411)	(0.8949)	(0.1104)	(0.1166)
	(C4) = (0.30, 0.25, 6, 0.25, 0.35, 6, 0.4, 0.3, 9, 0.3, 0.35, 8, 30, 31)
500	0.5152	0.4893	22.8606	0.2621	0.3919	12.3190	0.3923	0.2849	9.5150	0.3138	0.3337	11.0724	26.5934	27.4885
	(0.4299)	(0.4219)	(19.7688)	(0.1613)	(0.2612)	(10.0337)	(0.0666)	(0.0647)	(1.8098)	(0.2115)	(0.1714)	(7.6731)	(3.9233)	(4.0751)
1000	0.3794	0.3437	14.3485	0.2502	0.3564	8.8832	0.3966	0.2933	9.2336	0.3006	0.3395	9.1092	28.5735	29.5242
	(0.2716)	(0.2547)	(11.4391)	(0.1216)	(0.1978)	(7.1729)	(0.0465)	(0.0458)	(1.3281)	(0.1619)	(0.1253)	(5.7439)	(3.0484)	(3.1956)
2000	0.3216	0.2811	10.0719	0.2491	0.3514	6.8234	0.3983	0.2983	9.0834	0.3009	0.3487	8.1454	29.6788	30.6630
	(0.2052)	(0.1911)	(7.6392)	(0.0876)	(0.1446)	(4.7774)	(0.0321)	(0.0307)	(0.9129)	(0.1163)	(0.0873)	(4.1778)	(1.5502)	(1.6476)
10,000	0.3007	0.2502	6.4417	0.2496	0.3498	6.0227	0.3996	0.2999	9.0131	0.3004	0.3492	8.0065	30.0000	31.0000
	(0.1036)	(0.0985)	(4.0326)	(0.0390)	(0.0638)	(2.2480)	(0.0143)	(0.0135)	(0.4048)	(0.0513)	(0.0382)	(1.8861)	(0.0000)	(0.0000)

Table 3

Fitting results of the Siparex Croissance data.

Model	Estimate												MSE	MADE
in sample
max-INAR(1)	${\hat{α}}_{1}$	$\hat{q}$
	0.4967	0.1208
	(1.8621)	(0.0627)											21.1591	3.9308
min-INAR(1)	${\hat{α}}_{1}$	$\hat{μ}$
	1.5074	8.7789
	(50.3267)	(414.5962)											37.8780	5.2950
P-INAR(2)	${\hat{α}}_{1}$	${\hat{α}}_{2}$	$\hat{λ}$
	0.3714	0.2228	4.0663
	(0.0316)	(0.0279)	(0.2790)										21.1386	3.8917
SETINAR(2) $r = 9$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	$λ_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	$λ_{2}$
	0.4015	0.2295	3.7634	0.3497	0.2155	5.0422
	(0.0373)	(0.0312)	(0.2895)	(0.0485)	(0.0392)	(1.1459)							21.3504	3.8975
2-TINAR(2) $r = 14, s = 17$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	$λ_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	$λ_{2}$	${\hat{α}}_{31}$	${\hat{α}}_{32}$	$λ_{3}$	${\hat{α}}_{41}$	${\hat{α}}_{42}$	$λ_{4}$
	0.3544	0.1413	9.0932	0.7353	0.1464	2.8976	0.3666	0.2699	3.7258	0.3490	0.2022	4.2198
	(0.0739)	(0.0583)	(2.4291)	(0.1802)	(0.0796)	(2.5865)	(0.0379)	(0.0314)	(0.2639)	(0.0630)	(0.1057)	(1.5101)	20.5577	3.7951
out of sample
max-INAR(1)	${\hat{α}}_{1}$	$\hat{q}$
	0.4967	0.1208
	(1.8660)	(0.0629)											25.3317	4.0658
min-INAR(1)	${\hat{α}}_{1}$	$\hat{μ}$
	1.5074	8.7789
	(49.6399)	(408.9382)											40.6800	4.4990
P-INAR(2)	${\hat{α}}_{1}$	${\hat{α}}_{2}$	$\hat{λ}$
	0.3717	0.2224	4.0767
	(0.0316)	(0.0279)	(0.2796)										22.3961	3.1670
SETINAR(2) $r = 9$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	$λ_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	$λ_{2}$
	0.4058	0.2282	3.7572	0.3497	0.2155	5.0422
	(0.0374)	(0.0313)	(0.2898)	(0.0485)	(0.0392)	(1.1459)							22.9152	3.2409
2-TINAR(2) $r = 14, s = 17$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	$λ_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	$λ_{2}$	${\hat{α}}_{31}$	${\hat{α}}_{32}$	$λ_{3}$	${\hat{α}}_{41}$	${\hat{α}}_{42}$	$λ_{4}$
	0.3544	0.1413	9.0932	0.7353	0.1464	2.8976	0.3710	0.2681	3.7213	0.3490	0.2022	4.2198
	(0.0737)	(0.0582)	(2.4241)	(0.1798)	(0.0795)	(2.5812)	(0.0380)	(0.0315)	(0.2636)	(0.0629)	(0.1055)	(1.5070)	17.3452	2.9123

Table 4

Fitting results of the WR data.

Model	Estimate												MSE	MADE
in sample
max-INAR(1)	${\hat{α}}_{1}$	$\hat{q}$
	0.2461	0.3982
	(1.2089)	(2.4770)											150.0196	11.5280
min-INAR(1)	${\hat{α}}_{1}$	$\hat{μ}$
	1.7625	1.6307
	(50.1349)	(27.6055)											175.1617	12.6286
P-INAR(2)	${\hat{α}}_{1}$	${\hat{α}}_{2}$	$\hat{λ}$
	0.3362	0.2187	4.2795
	(0.0192)	(0.0172)	(0.1804)										25.1577	4.4766
SETINAR(2) $r = 14$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	$λ_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	$λ_{2}$
	0.3325	0.2424	4.0595	0.2586	0.1499	6.9125
	(0.0227)	(0.0186)	(0.1909)	(0.0634)	(0.0396)	(1.1693)							23.9746	4.4016
2-TINAR(2) $r = 11, s = 12$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	$λ_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	$λ_{2}$	${\hat{α}}_{31}$	${\hat{α}}_{32}$	$λ_{3}$	${\hat{α}}_{41}$	${\hat{α}}_{42}$	$λ_{4}$
	0.1855	0.0458	10.8460	0.4619	0.0985	5.3952	0.3116	0.3009	3.8049	0.3660	0.1952	4.3789
	(0.0848)	(0.0919)	(2.2813)	(0.0962)	(0.0585)	(1.4106)	(0.0233)	(0.0225)	(0.1971)	(0.0884)	(0.0942)	(1.6380)	23.5336	4.4005
out of sample
max-INAR(1)	${\hat{α}}_{1}$	$\hat{q}$
	0.2461	0.3982
	(1.2109)	(2.4811)											179.5401	12.6882
min-INAR(1)	${\hat{α}}_{1}$	$\hat{μ}$
	1.7625	1.6307
	(50.2126)	(27.6483)											178.7166	12.7448
P-INAR(2)	${\hat{α}}_{1}$	${\hat{α}}_{2}$	$\hat{λ}$
	0.3359	0.2172	4.2877
	(0.0192)	(0.0172)	(0.1808)										73.5537	7.7076
SETINAR(2) $r = 14$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	${\hat{λ}}_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	${\hat{λ}}_{2}$
	0.3310	0.2412	4.0759	0.2630	0.1479	6.8355
	(0.0227)	(0.0186)	(0.1909)	(0.0637)	(0.0397)	(1.1741)							73.8454	7.7045
2-TINAR(2) $r = 11, s = 12$	${\hat{α}}_{11}$	${\hat{α}}_{12}$	${\hat{λ}}_{1}$	${\hat{α}}_{21}$	${\hat{α}}_{22}$	${\hat{λ}}_{2}$	${\hat{α}}_{31}$	${\hat{α}}_{32}$	${\hat{λ}}_{3}$	${\hat{α}}_{41}$	${\hat{α}}_{42}$	${\hat{λ}}_{4}$
	0.1885	0.0458	10.7683	0.4550	0.0984	5.4278	0.3105	0.3004	3.8135	0.3706	0.1892	4.3368
	(0.0849)	(0.0922)	(2.3067)	(0.0961)	(0.0583)	(1.4070)	(0.0233)	(0.0224)	(0.1969)	(0.0887)	(0.0948)	(1.6435)	71.6180	7.5889

Appendix A

Proof of Proposition 1.

(1) It is easy to see that $\{Y_{t}\}$ is a Markov chain with state space $\mathbb{N}_{0}^{2}$, and the transition probability of the 2-TINAR(2) process is $\begin{matrix} P (Y_{t} = {(i, l)}^{⊤} | Y_{t - 1} = {(l, m)}^{⊤}) = \sum_{k = 0}^{\min (m, i)} \binom{m}{k} α_{j 2}^{k} {(1 - α_{j 2})}^{m - k} P (R = i - k), \end{matrix}$ where $j$ denotes the regime selected by $(l, m)$ and $P (R = i - k) = \sum_{b = 0}^{\min (l, i - k)} \binom{l}{b} α_{j 1}^{b} {(1 - α_{j 1})}^{l - b} P (ϵ_{j t} = i - k - b)$.

Since $P (Y_{t} = {(i, l)}^{⊤} | Y_{t - 1} = {(l, m)}^{⊤}) > 0$ for all states, $\{Y_{t}\}$ is an irreducible, aperiodic Markov chain. To prove that $\{Y_{t}\}$ is positive recurrent, write $Y_{t} = A_{j} * Y_{t - 1} + {(ϵ_{j t}, 0)}^{⊤}$ with $ρ (A_{j}) < 1$, where $A_{j} = (\begin{matrix} α_{j 1} & α_{j 2} \\ 1 & 0 \end{matrix})$. The operation $*$ is defined by $(\begin{matrix} α_{1} & α_{2} \\ α_{3} & α_{4} \end{matrix}) * (\begin{matrix} X_{1} \\ X_{2} \end{matrix}) := (\begin{matrix} α_{1} \circ X_{1} + α_{2} \circ X_{2} \\ α_{3} \circ X_{1} + α_{4} \circ X_{2} \end{matrix});$ then, by Proposition 2.1 in Yang et al. [18], $\{Y_{t}\}$ is a positive recurrent Markov chain.

(2) This follows directly from (1), which guarantees the existence of a strictly stationary distribution of the process. □
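The recursion behind the Markov chain in (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes Poisson innovations with means $λ_{j}$, takes the regime ordering from the definitions of $p_{1}, \dots, p_{4}$ in the proof of Proposition 3, and the function names `regime` and `simulate_2tinar2` are hypothetical. The parameter values are the in-sample 2-TINAR(2) estimates for the WR data reported in Table 4.

```python
import numpy as np

def regime(x1, x2, r, s):
    """Regime index 0..3 for (X_{t-1}, X_{t-2}) = (x1, x2), ordered as p_1..p_4."""
    if x1 > r and x2 > s:
        return 0
    if x1 <= r and x2 > s:
        return 1
    if x1 <= r and x2 <= s:
        return 2
    return 3  # x1 > r and x2 <= s

def simulate_2tinar2(n, alpha, lam, r, s, x_init=(0, 0), seed=0):
    """Simulate n values; Poisson innovations are an assumption of this sketch."""
    rng = np.random.default_rng(seed)
    x = [int(x_init[0]), int(x_init[1])]  # two starting values
    for _ in range(n):
        x1, x2 = x[-1], x[-2]
        j = regime(x1, x2, r, s)
        # Binomial thinning: alpha o X is Binomial(X, alpha) given X.
        thinned = rng.binomial(x1, alpha[j][0]) + rng.binomial(x2, alpha[j][1])
        x.append(int(thinned + rng.poisson(lam[j])))
    return np.array(x[2:])

# In-sample 2-TINAR(2) estimates for the WR data (Table 4), r = 11, s = 12.
alpha_hat = [(0.1855, 0.0458), (0.4619, 0.0985), (0.3116, 0.3009), (0.3660, 0.1952)]
lam_hat = [10.8460, 5.3952, 3.8049, 4.3789]
path = simulate_2tinar2(500, alpha_hat, lam_hat, r=11, s=12, seed=1)
```

Because every thinning draw is bounded by the previous counts and the innovation can take any non-negative integer value, each state can be reached from any other in one step, which is the positivity used in part (1).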

Proof of Proposition 2.

Note that, at each time $t$, three of the four values $I_{1 t} (r, s)$, $I_{2 t} (r, s)$, $I_{3 t} (r, s)$, $I_{4 t} (r, s)$ are equal to 0; thus, following Silva and Oliveira [5], if $\max ρ (A_{j}) < 1$, $\max ρ (A_{j} \otimes A_{l}) < 1$, and $\max ρ (A_{j} \otimes A_{l} \otimes A_{k}) < 1$, then $\{X_{t}\}$ is a third-order stationary process. The third-order joint moment of $X_{t}, X_{t + s_{1}}, X_{t + s_{2}}$, for $s_{1}, s_{2} \in \mathbb{Z}$, is a function of two variables defined by $μ_{X} (s_{1}, s_{2}) = E (X_{t} X_{t + s_{1}} X_{t + s_{2}})$, with $μ_{X} = E (X_{t})$ and, analogously, $μ_{X} (s_{1}) = E (X_{t} X_{t + s_{1}})$ for the second-order moment. Similar to Silva and Silva [6], we let $μ_{j i} = E (B_{i}) = α_{j i}, σ_{j i}^{2} = \mathrm{Var} (B_{i}) = α_{j i} (1 - α_{j i}), γ_{j i} = E (B_{i}^{3}) = α_{j i}, γ_{ϵ_{j t}} = E (ϵ_{j t}^{3}), C_{j i} = \sum_{i = 1}^{2} (γ_{j i} - 3 α_{j i} σ_{j i}^{2} - α_{j i}^{3})$. Then, for $k > 0$, $\begin{matrix} μ_{X} (0, 0) & \leq \sum_{i = 1}^{2} \sum_{l = 1}^{2} \sum_{k = 1}^{2} \max (α_{j i}) \max (α_{j l}) \max (α_{j k}) μ_{X} (i - l, i - k) \\ + 3 \sum_{i = 1}^{2} \sum_{l = 1}^{2} \max (α_{j l}) \max (σ_{j i}^{2}) μ_{X} (i - l) + 3 λ_{j} \sum_{i = 1}^{2} \sum_{l = 1}^{2} \max (α_{j i}) \max (α_{j l}) μ_{X} (i - l) \\ + 3 λ_{j} \sum_{i = 1}^{2} \max (σ_{j i}^{2}) μ_{X} + \sum_{i = 1}^{2} \max (C_{j i}) μ_{X} + γ_{ϵ_{j t}}, \\ μ_{X} (0, k) & \leq \sum_{i = 1}^{2} \max (α_{j i}) μ_{X} (0, k - i) + λ_{j} μ_{X} (0), \\ μ_{X} (k, k) & \leq \sum_{i = 1}^{2} \sum_{l = 1}^{2} \max (α_{j i}) \max (α_{j l}) μ_{X} (k - i, k - l) + \sum_{i = 1}^{2} \max (σ_{j i}^{2}) μ_{X} (k - i) + 2 λ_{j} μ_{X} (k), \\ μ_{X} (k, m) & \leq \sum_{i = 1}^{2} \max (α_{j i}) μ_{X} (k, m - i) + λ_{j} μ_{X} (k), m > k . \end{matrix}$ Moreover, the second-order moment of $\{X_{t}\}$ satisfies $μ_{X} (0) \leq \sum_{i = 1}^{2} \max (α_{j i}) μ_{X} (i) + λ_{j} μ_{X} + \max (V_{j p}),$ where $V_{j p} = λ_{j} + μ_{X} \sum_{i = 1}^{2} σ_{j i}^{2}$. □
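The three spectral-radius conditions in this proof can be checked numerically for a given set of thinning parameters. The sketch below is only an illustration, assuming NumPy; the helper names `spectral_radius`, `companion`, and `moment_conditions_hold` are hypothetical and do not come from the paper.

```python
import numpy as np

def spectral_radius(m):
    """Largest absolute eigenvalue of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(m))))

def companion(a1, a2):
    """The matrix A_j = [[alpha_j1, alpha_j2], [1, 0]]."""
    return np.array([[a1, a2], [1.0, 0.0]])

def moment_conditions_hold(alphas):
    """Check max rho(A_j), max rho(A_j x A_l), max rho(A_j x A_l x A_k) < 1."""
    mats = [companion(a1, a2) for a1, a2 in alphas]
    r1 = max(spectral_radius(a) for a in mats)
    r2 = max(spectral_radius(np.kron(a, b)) for a in mats for b in mats)
    r3 = max(
        spectral_radius(np.kron(np.kron(a, b), c))
        for a in mats for b in mats for c in mats
    )
    return r1 < 1 and r2 < 1 and r3 < 1

# In-sample 2-TINAR(2) estimates for the WR data (Table 4).
alpha_hat = [(0.1855, 0.0458), (0.4619, 0.0985), (0.3116, 0.3009), (0.3660, 0.1952)]
ok = moment_conditions_hold(alpha_hat)
```

Since the eigenvalues of a Kronecker product are the products of the factors' eigenvalues, the second and third conditions follow from the first whenever every $ρ (A_{j}) < 1$; the explicit check is kept only to mirror the statement of the proof.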

Proof of Proposition 3.

(1) and (3) are obvious, so we just present the proofs of (2) and (4), which are obtained by similar arguments after some tedious calculations.

(2). Let $\begin{matrix} p_{1} & = P (X_{t - 1} > r, X_{t - 2} > s), p_{2} = P (X_{t - 1} \leq r, X_{t - 2} > s), \\ p_{3} & = P (X_{t - 1} \leq r, X_{t - 2} \leq s), p_{4} = P (X_{t - 1} > r, X_{t - 2} \leq s), \\ u_{1} & = E (X_{t - 1} | X_{t - 1} > r, X_{t - 2} > s), u_{2} = E (X_{t - 1} | X_{t - 1} \leq r, X_{t - 2} > s), \\ u_{3} & = E (X_{t - 1} | X_{t - 1} \leq r, X_{t - 2} \leq s), u_{4} = E (X_{t - 1} | X_{t - 1} > r, X_{t - 2} \leq s), \\ u_{1}^{*} & = E (X_{t - 2} | X_{t - 1} > r, X_{t - 2} > s), u_{2}^{*} = E (X_{t - 2} | X_{t - 1} \leq r, X_{t - 2} > s), \\ u_{3}^{*} & = E (X_{t - 2} | X_{t - 1} \leq r, X_{t - 2} \leq s), u_{4}^{*} = E (X_{t - 2} | X_{t - 1} > r, X_{t - 2} \leq s), \\ v_{1} & = \mathrm{Var} (X_{t - 1} | X_{t - 1} > r, X_{t - 2} > s), v_{2} = \mathrm{Var} (X_{t - 1} | X_{t - 1} \leq r, X_{t - 2} > s), \\ v_{3} & = \mathrm{Var} (X_{t - 1} | X_{t - 1} \leq r, X_{t - 2} \leq s), v_{4} = \mathrm{Var} (X_{t - 1} | X_{t - 1} > r, X_{t - 2} \leq s), \\ v_{1}^{*} & = \mathrm{Var} (X_{t - 2} | X_{t - 1} > r, X_{t - 2} > s), v_{2}^{*} = \mathrm{Var} (X_{t - 2} | X_{t - 1} \leq r, X_{t - 2} > s), \\ v_{3}^{*} & = \mathrm{Var} (X_{t - 2} | X_{t - 1} \leq r, X_{t - 2} \leq s), v_{4}^{*} = \mathrm{Var} (X_{t - 2} | X_{t - 1} > r, X_{t - 2} \leq s), \\ w_{1} & = E (X_{t - 1} X_{t - 2} | X_{t - 1} > r, X_{t - 2} > s), w_{2} = E (X_{t - 1} X_{t - 2} | X_{t - 1} \leq r, X_{t - 2} > s), \\ w_{3} & = E (X_{t - 1} X_{t - 2} | X_{t - 1} \leq r, X_{t - 2} \leq s), w_{4} = E (X_{t - 1} X_{t - 2} | X_{t - 1} > r, X_{t - 2} \leq s) . \end{matrix}$

Hence, $\begin{matrix} E (X_{t}) & = E [E (X_{t} | \mathcal{F}_{t - 1})] = E (\sum_{j = 1}^{4} (α_{j 1} X_{t - 1} + α_{j 2} X_{t - 2} + λ_{j}) I_{j t} (r, s)) \\ & = \sum_{j = 1}^{4} (α_{j 1} E (X_{t - 1} I_{j t} (r, s)) + α_{j 2} E (X_{t - 2} I_{j t} (r, s)) + λ_{j} E (I_{j t} (r, s))) = \sum_{j = 1}^{4} p_{j} (α_{j 1} u_{j} + α_{j 2} u_{j}^{*} + λ_{j}) . \end{matrix}$
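The regime decomposition of the mean can be illustrated on data by replacing $p_{j}$, $u_{j}$, and $u_{j}^{*}$ with their sample analogues; the average of the conditional means then equals the regime-weighted sum exactly. The following Python sketch is an illustration only; `regime_index` and `mean_two_ways` are hypothetical names, and the short series is made up for the example.

```python
import numpy as np

def regime_index(x1, x2, r, s):
    """Vectorised regime labels 0..3, ordered as p_1..p_4 in the proof."""
    hi1, hi2 = x1 > r, x2 > s
    return np.select([hi1 & hi2, ~hi1 & hi2, ~hi1 & ~hi2], [0, 1, 2], default=3)

def mean_two_ways(x, alpha, lam, r, s):
    """Return (average conditional mean, sum_j p_j (a_j1 u_j + a_j2 u_j* + lam_j))."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    lam = np.asarray(lam, dtype=float)
    x1, x2 = x[1:-1], x[:-2]  # sample values of X_{t-1} and X_{t-2}
    j = regime_index(x1, x2, r, s)
    direct = np.mean(alpha[j, 0] * x1 + alpha[j, 1] * x2 + lam[j])
    by_regime = 0.0
    for k in range(4):
        mask = j == k
        if mask.any():  # empty regimes contribute p_k = 0
            p, u, u_star = mask.mean(), x1[mask].mean(), x2[mask].mean()
            by_regime += p * (alpha[k, 0] * u + alpha[k, 1] * u_star + lam[k])
    return direct, by_regime

# Made-up counts, with the Table 4 in-sample estimates (r = 11, s = 12).
series = [3, 15, 9, 20, 14, 2, 13, 18, 11, 12, 30, 5]
alpha_hat = [(0.1855, 0.0458), (0.4619, 0.0985), (0.3116, 0.3009), (0.3660, 0.1952)]
lam_hat = [10.8460, 5.3952, 3.8049, 4.3789]
direct, by_regime = mean_two_ways(series, alpha_hat, lam_hat, r=11, s=12)
```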

(4). According to the variance formula, the variance of $X_{t}$ is (A1) $\begin{matrix} \mathrm{Var} (X_{t}) & = \mathrm{Var} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s) + (α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s) \\ + (α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s) + (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ = \mathrm{Var} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s)) + \mathrm{Var} ((α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s)) \\ + \mathrm{Var} ((α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s)) + \mathrm{Var} ((α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ + 2 \mathrm{Cov} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s), (α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s)) \\ + 2 \mathrm{Cov} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s), (α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s)) \\ + 2 \mathrm{Cov} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s), (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ + 2 \mathrm{Cov} ((α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s), (α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s)) \\ + 2 \mathrm{Cov} ((α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s), (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ + 2 \mathrm{Cov} ((α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s), (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ := V_{1} + V_{2} + V_{3} + V_{4} + C_{1} + C_{2} + C_{3} + C_{4} + C_{5} + C_{6} . \end{matrix}$

In the following, we compute the quantities in (A1). First, we have (A2) $\begin{matrix} V_{1} & = \mathrm{Var} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s)) \\ = \mathrm{Var} (E ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s) | \mathcal{F}_{t - 1})) \\ + E (\mathrm{Var} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s) | \mathcal{F}_{t - 1})) \\ = \mathrm{Var} (α_{11} X_{t - 1} I_{1 t} + α_{12} X_{t - 2} I_{1 t} + λ_{1} I_{1 t}) + E (α_{11} (1 - α_{11}) X_{t - 1} I_{1 t} + α_{12} (1 - α_{12}) X_{t - 2} I_{1 t} + λ_{1} I_{1 t}) \\ = α_{11}^{2} \mathrm{Var} (I_{1 t} X_{t - 1}) + α_{12}^{2} \mathrm{Var} (I_{1 t} X_{t - 2}) + 2 \mathrm{Cov} (α_{11} X_{t - 1} I_{1 t}, α_{12} X_{t - 2} I_{1 t}) + 2 \mathrm{Cov} (α_{11} X_{t - 1} I_{1 t}, λ_{1} I_{1 t}) \\ + 2 \mathrm{Cov} (α_{12} X_{t - 2} I_{1 t}, λ_{1} I_{1 t}) + α_{11} (1 - α_{11}) p_{1} u_{1} + α_{12} (1 - α_{12}) p_{1} u_{1}^{*} + λ_{1} p_{1} \\ = α_{11}^{2} (p_{1} (v_{1} + u_{1}^{2}) - p_{1}^{2} u_{1}^{2}) + α_{12}^{2} (p_{1} (v_{1}^{*} + {(u_{1}^{*})}^{2}) - p_{1}^{2} {(u_{1}^{*})}^{2}) \\ + 2 (α_{11} α_{12} w_{1} p_{1} - α_{11} α_{12} p_{1}^{2} u_{1} u_{1}^{*}) + 2 (α_{11} λ_{1} p_{1} u_{1} - α_{11} λ_{1} p_{1}^{2} u_{1}) + 2 (α_{12} λ_{1} p_{1} u_{1}^{*} - α_{12} λ_{1} p_{1}^{2} u_{1}^{*}) \\ + α_{11} (1 - α_{11}) p_{1} u_{1} + α_{12} (1 - α_{12}) p_{1} u_{1}^{*} + λ_{1} p_{1} . \end{matrix}$

By the same arguments as above, it follows that (A3) $\begin{matrix} V_{j} & = \mathrm{Var} ((α_{j 1} \circ X_{t - 1} + α_{j 2} \circ X_{t - 2} + λ_{j}) I_{j t} (r, s)) \\ = α_{j 1}^{2} (p_{j} (v_{j} + u_{j}^{2}) - p_{j}^{2} u_{j}^{2}) + α_{j 2}^{2} (p_{j} (v_{j}^{*} + {(u_{j}^{*})}^{2}) - p_{j}^{2} {(u_{j}^{*})}^{2}) \\ + 2 (α_{j 1} α_{j 2} w_{j} p_{j} - α_{j 1} α_{j 2} p_{j}^{2} u_{j} u_{j}^{*}) + 2 (α_{j 1} λ_{j} p_{j} u_{j} - α_{j 1} λ_{j} p_{j}^{2} u_{j}) + 2 (α_{j 2} λ_{j} p_{j} u_{j}^{*} - α_{j 2} λ_{j} p_{j}^{2} u_{j}^{*}) \\ + α_{j 1} (1 - α_{j 1}) p_{j} u_{j} + α_{j 2} (1 - α_{j 2}) p_{j} u_{j}^{*} + λ_{j} p_{j}, j = 2, 3, 4 . \end{matrix}$

We can see that $C_{j}$ takes the form (A4) $\begin{matrix} C_{1} & = 2 \mathrm{Cov} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s), (α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s)) \\ = - 2 E ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s)) E ((α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s)) \\ = - 2 (α_{11} u_{1} p_{1} + α_{12} u_{1}^{*} p_{1} + λ_{1} p_{1}) (α_{21} u_{2} p_{2} + α_{22} u_{2}^{*} p_{2} + λ_{2} p_{2}), \end{matrix}$ (A5) $\begin{matrix} C_{2} & = 2 \mathrm{Cov} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s), (α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s)) \\ = - 2 (α_{11} u_{1} p_{1} + α_{12} u_{1}^{*} p_{1} + λ_{1} p_{1}) (α_{31} u_{3} p_{3} + α_{32} u_{3}^{*} p_{3} + λ_{3} p_{3}), \end{matrix}$ (A6) $\begin{matrix} C_{3} & = 2 \mathrm{Cov} ((α_{11} \circ X_{t - 1} + α_{12} \circ X_{t - 2} + λ_{1}) I_{1 t} (r, s), (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ = - 2 (α_{11} u_{1} p_{1} + α_{12} u_{1}^{*} p_{1} + λ_{1} p_{1}) (α_{41} u_{4} p_{4} + α_{42} u_{4}^{*} p_{4} + λ_{4} p_{4}), \end{matrix}$ (A7) $\begin{matrix} C_{4} & = 2 \mathrm{Cov} ((α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s), (α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s)) \\ = - 2 (α_{21} u_{2} p_{2} + α_{22} u_{2}^{*} p_{2} + λ_{2} p_{2}) (α_{31} u_{3} p_{3} + α_{32} u_{3}^{*} p_{3} + λ_{3} p_{3}), \end{matrix}$ (A8) $\begin{matrix} C_{5} & = 2 \mathrm{Cov} ((α_{21} \circ X_{t - 1} + α_{22} \circ X_{t - 2} + λ_{2}) I_{2 t} (r, s), (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ = - 2 (α_{21} u_{2} p_{2} + α_{22} u_{2}^{*} p_{2} + λ_{2} p_{2}) (α_{41} u_{4} p_{4} + α_{42} u_{4}^{*} p_{4} + λ_{4} p_{4}), \end{matrix}$ (A9) $\begin{matrix} C_{6} & = 2 \mathrm{Cov} ((α_{31} \circ X_{t - 1} + α_{32} \circ X_{t - 2} + λ_{3}) I_{3 t} (r, s), (α_{41} \circ X_{t - 1} + α_{42} \circ X_{t - 2} + λ_{4}) I_{4 t} (r, s)) \\ = - 2 (α_{31} u_{3} p_{3} + α_{32} u_{3}^{*} p_{3} + λ_{3} p_{3}) (α_{41} u_{4} p_{4} + α_{42} u_{4}^{*} p_{4} + λ_{4} p_{4}) ; \end{matrix}$ then, (4) follows by substituting (A2)–(A9) into (A1). □
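The step from each covariance to a product of means in (A4)–(A9) rests on the regimes being mutually exclusive. A short worked version is given below, writing $Z_{j t}$ as a shorthand (introduced here only for illustration) for $α_{j 1} \circ X_{t - 1} + α_{j 2} \circ X_{t - 2} + λ_{j}$ and taking $j \neq k$:

```latex
\begin{aligned}
I_{jt}(r,s)\, I_{kt}(r,s) &= 0 \quad (j \neq k), \\
\mathrm{Cov}\bigl(Z_{jt} I_{jt}(r,s),\, Z_{kt} I_{kt}(r,s)\bigr)
  &= E\bigl(Z_{jt} Z_{kt} I_{jt}(r,s) I_{kt}(r,s)\bigr)
     - E\bigl(Z_{jt} I_{jt}(r,s)\bigr)\, E\bigl(Z_{kt} I_{kt}(r,s)\bigr) \\
  &= - E\bigl(Z_{jt} I_{jt}(r,s)\bigr)\, E\bigl(Z_{kt} I_{kt}(r,s)\bigr), \\
E\bigl(Z_{jt} I_{jt}(r,s)\bigr)
  &= p_{j} \bigl(α_{j1} u_{j} + α_{j2} u_{j}^{*} + λ_{j}\bigr).
\end{aligned}
```

The last line uses $E (X_{t - 1} I_{j t} (r, s)) = p_{j} u_{j}$ and $E (X_{t - 2} I_{j t} (r, s)) = p_{j} u_{j}^{*}$, as in the proof of (2).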

Proof of Theorem 2.

It is easy to verify the conditions in Klimko and Nelson [24]: the functions $g (ϕ, X_{t - 1}, X_{t - 2}), \frac{\partial g (ϕ, X_{t - 1}, X_{t - 2})}{\partial ϕ_{i}}, \frac{\partial^{2} g (ϕ, X_{t - 1}, X_{t - 2})}{\partial ϕ_{i} \partial ϕ_{j}}, \frac{\partial^{3} g (ϕ, X_{t - 1}, X_{t - 2})}{\partial ϕ_{i} \partial ϕ_{j} \partial ϕ_{k}}$ satisfy all the regularity conditions for $i, j, k = 1, \dots, 12$. Thus, the CLS estimator is strongly consistent. Moreover, to prove asymptotic normality, we first have to check the following conditions:

(1) $E (X_{t} | X_{t - 1}, \dots, X_{0}) = E (X_{t} | X_{t - 1}, X_{t - 2}), t \geq 3$, a.s.;
(2) $E (q_{t}^{2} (ϕ) \left| \frac{\partial g (ϕ, X_{t - 1}, X_{t - 2})}{\partial ϕ_{i}} \frac{\partial g (ϕ, X_{t - 1}, X_{t - 2})}{\partial ϕ_{j}} \right|) < \infty, i, j = 1, \dots, 12$;
(3) $V$ is nonsingular.

Then, by Klimko and Nelson [24], the CLS estimator is asymptotically normal with asymptotic variance $V^{- 1} W V^{- 1}$. □
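When the thresholds $r$ and $s$ are known, the conditional mean $g$ is linear in the three parameters of each regime and the regimes do not overlap, so the CLS criterion splits into four separate least-squares problems. The sketch below illustrates this with NumPy; it is not the authors' implementation, the function name `cls_by_regime` is hypothetical, and no constraint $α_{j i} \in (0, 1)$ is imposed.

```python
import numpy as np

def cls_by_regime(y, x1, x2, r, s):
    """CLS estimates (alpha_j1, alpha_j2, lambda_j) per regime, thresholds known.

    y, x1, x2 hold X_t, X_{t-1}, X_{t-2}; rows of the result follow p_1..p_4.
    Regimes with fewer than three observations are returned as NaN.
    """
    y, x1, x2 = (np.asarray(v, dtype=float) for v in (y, x1, x2))
    hi1, hi2 = x1 > r, x2 > s
    masks = [hi1 & hi2, ~hi1 & hi2, ~hi1 & ~hi2, hi1 & ~hi2]
    estimates = np.full((4, 3), np.nan)
    for j, m in enumerate(masks):
        if m.sum() >= 3:
            # Regress X_t on (X_{t-1}, X_{t-2}, 1) within regime j.
            design = np.column_stack([x1[m], x2[m], np.ones(m.sum())])
            estimates[j] = np.linalg.lstsq(design, y[m], rcond=None)[0]
    return estimates
```

For an observed series `x` stored as a NumPy array, a call such as `cls_by_regime(x[2:], x[1:-1], x[:-2], r, s)` would return a 4 × 3 array corresponding to the 12 components of $ϕ$.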

References

1. Steutel, F.W.; van Harn, K. Discrete analogues of self-decomposability and stability. Ann. Probab.; 1979; 7, pp. 893-899. [DOI: https://dx.doi.org/10.1214/aop/1176994950]

2. Al-Osh, M.A.; Alzaid, A.A. First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal.; 1987; 8, pp. 261-275. [DOI: https://dx.doi.org/10.1111/j.1467-9892.1987.tb00438.x]

3. McKenzie, E. Some simple models for discrete variate time series. Water Resour. Bull.; 1985; 21, pp. 645-650. [DOI: https://dx.doi.org/10.1111/j.1752-1688.1985.tb05379.x]

4. Du, J.; Li, Y. The integer-valued autoregressive (INAR(p)) model. J. Time Ser. Anal.; 1991; 12, pp. 129-142.

5. Silva, I.; Oliveira, V.L. Difference equations for the higher-order moments and cumulants of the INAR(p) model. J. Time Ser. Anal.; 2005; 26, pp. 17-36. [DOI: https://dx.doi.org/10.1111/j.1467-9892.2005.00388.x]

6. Silva, I.; Silva, M.E. Parameter estimation for INAR processes based on high-order statistics. REVSTAT Stat. J.; 2009; 7, pp. 105-117.

7. Zhang, J.; Zhu, F.; Mamode Khan, N. A new INAR model based on Poisson-BE2 innovations. Commun. Stat.-Theory Methods; 2023; 52, pp. 6063-6067. [DOI: https://dx.doi.org/10.1080/03610926.2021.2024571]

8. Scotto, M.G.; Weiß, C.H.; Möller, T.A.; Gouveia, S. The max-INAR(1) model for count processes. TEST; 2018; 27, pp. 850-870. [DOI: https://dx.doi.org/10.1007/s11749-017-0573-z]

9. Aleksić, M.S.; Ristić, R.M. A geometric minification integer-valued autoregressive model. Appl. Math. Model.; 2021; 90, pp. 265-280. [DOI: https://dx.doi.org/10.1016/j.apm.2020.08.047]

10. Tong, H. On a threshold model. Pattern Recognition and Signal Processing; Sijthoff and Noordhoff: Amsterdam, The Netherlands, 1978.

11. Boero, G.; Marrocu, E. The performance of SETAR models: A regime conditional evaluation of point, interval and density forecasts. Int. J. Forecast.; 2004; 20, pp. 305-320. [DOI: https://dx.doi.org/10.1016/j.ijforecast.2003.09.011]

12. Potter, S.M. A nonlinear approach to U.S. GNP. J. Appl. Econom.; 1995; 10, pp. 109-125. [DOI: https://dx.doi.org/10.1002/jae.3950100203]

13. Dueker, M.; Martin, S.; Spagnolo, F. Contemporaneous threshold autoregressive models: Estimation, testing and forecasting. J. Econom.; 2007; 141, pp. 517-547. [DOI: https://dx.doi.org/10.1016/j.jeconom.2006.10.022]

14. Tong, H. Threshold models in time series analysis 30 years on. Stat. Its Interface; 2011; 4, pp. 107-118. [DOI: https://dx.doi.org/10.4310/SII.2011.v4.n2.a1]

15. Li, D.; Tong, H. Nested sub-sample search algorithm for estimation of threshold models. Stat. Sin.; 2016; 26, pp. 1543-1554. [DOI: https://dx.doi.org/10.5705/ss.2013.394t]

16. Monteiro, M.; Scotto, M.G.; Pereira, I. Integer-valued self-exciting threshold autoregressive processes. Commun. Stat.-Theory Methods; 2012; 41, pp. 2717-2737. [DOI: https://dx.doi.org/10.1080/03610926.2011.556292]

17. Wang, C.; Liu, H.; Yao, J.; Davis, R.A.; Li, W.K. Self-excited threshold Poisson autoregression. J. Am. Stat. Assoc.; 2014; 109, pp. 776-787. [DOI: https://dx.doi.org/10.1080/01621459.2013.872994]

18. Yang, K.; Wang, D.; Jia, B.; Li, H. An integer-valued threshold autoregressive process based on negative binomial thinning. Stat. Pap.; 2018; 59, pp. 1131-1160. [DOI: https://dx.doi.org/10.1007/s00362-016-0808-1]

19. Zhang, X.; Li, D.; Tong, H. On the least squares estimation of multiple-threshold-variable autoregressive models. J. Bus. Econ. Stat.; 2023; pp. 1-14. [DOI: https://dx.doi.org/10.1080/07350015.2023.2174124]

20. Enciso-Mora, V.; Neal, P.; Subba Rao, T. Integer valued AR processes with explanatory variables. Sankhyā; 2009; 71, pp. 248-263.

21. Chen, H.; Li, Q.; Zhu, F. Two classes of dynamic binomial integer-valued ARCH models. Braz. J. Probab. Stat.; 2020; 34, pp. 685-711. [DOI: https://dx.doi.org/10.1214/19-BJPS452]

22. Qian, L.; Zhu, F. A new minification integer-valued autoregressive process driven by explanatory variables. Aust. N. Z. J. Stat.; 2022; 64, pp. 478-494. [DOI: https://dx.doi.org/10.1111/anzs.12379]

23. Su, B.; Zhu, F. Comparison of BINAR(1) models with bivariate negative binomial innovations and explanatory variables. J. Stat. Comput. Simul.; 2021; 91, pp. 1616-1634. [DOI: https://dx.doi.org/10.1080/00949655.2020.1863965]

24. Klimko, L.A.; Nelson, P.I. On conditional least squares estimation for stochastic processes. Ann. Stat.; 1978; 6, pp. 629-642. [DOI: https://dx.doi.org/10.1214/aos/1176344207]

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

In the past, most threshold models considered a single threshold variable. However, for some practical applications, models with two threshold variables may be needed. In this paper, we propose a two-threshold-variable integer-valued autoregressive model based on the binomial thinning operator and discuss some of its basic properties, including the mean, variance, strict stationarity, and ergodicity. We consider the conditional least squares (CLS) estimation and discuss the asymptotic normality of the CLS estimator under the known and unknown threshold values. The performances of the CLS estimator are compared via simulation studies. In addition, two real data sets are considered to underline the superior performance of the proposed model.

Details

Title

Two-Threshold-Variable Integer-Valued Autoregressive Model

Author

Zhang, Jiayue¹; Zhu, Fukang¹; Chen, Huaping²

¹ School of Mathematics, Jilin University, Changchun 130012, China
² School of Mathematics and Statistics, Henan University, Kaifeng 475004, China; [email protected]

First page

3586

Publication year

2023

Publication date

2023

Publisher

MDPI AG

e-ISSN

22277390

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/math11163586

ProQuest document ID

2857126257
