1. Introduction
Since the regression quantile has robustness similar to that of the sample quantile, the quantile regression model can characterize the conditional distribution of the response variable y given the covariates, thus characterizing the link between the two [1]. Given a response variable $y$ and $p$-dimensional covariates $x$, let $F(\cdot\mid x)$ denote the conditional distribution function of $y$ given $x$. The classical quantile regression model can be expressed as

$$Q_{y}(\tau\mid x)=F^{-1}(\tau\mid x)=x^{\top}\beta(\tau),\qquad \tau\in(0,1),\tag{1}$$
where $\beta(\tau)$ is the regression coefficient. Koenker, R. et al. [2,3] provided a detailed discussion of quantile regression modeling in terms of methodology, theory, and computation. Because the loss function of quantile regression is not smooth, its computational complexity increases dramatically. Considering that the loss function is not differentiable, Horowitz [4] used a smooth function to approximate the indicator function in order to smooth the objective function. This approach has also been applied to other quantile-regression-related problems: de Castro, L. et al. [5] used smoothed moment estimating equations to estimate the parameters in quantile regression; Chen et al. [6] used a smoothing method to study quantile regression under memory constraints; Galvao, A.F. and Kato, K. [7] studied the smoothed estimation of fixed effects in quantile regression models for panel data; and Whang [8] discussed smoothed empirical likelihood estimation of quantile regression models. Although the above literature solves the problem that the objective function is not differentiable, it cannot guarantee the convexity of the objective function, so there is no guarantee that the result is a global optimum. Recently, Fernandes et al. [9] proposed a convolutional smoothing method for estimating the fixed-dimensional parameters of the quantile regression model, under which the loss function is twice differentiable and convex, and the method outperforms the earlier smoothed estimation in terms of estimation accuracy. For fully observed data, He et al. [10] used the convolutional smoothing method to estimate the parameters of the high-dimensional quantile regression model, with a smoothed loss function that is twice differentiable and convex. When numerically solving for the minimizer of the smoothed objective function, the gradient-based algorithm of [10] replaces the quantile regression computation with least-squares-type updates, which effectively shortens the computation time and improves the estimation accuracy.

In empirical studies, it is frequently observed that variables of interest are subject to censoring. For instance, in the study [11] on the survival times of AIDS patients, it was found that 43% of the data were right-censored. For parameter estimation in quantile regression models for right-censored data, we refer to Ying, Z. et al. [12], Honoré, B. et al. [13], Portnoy, S. [14], Peng, L. [15], and Yuan, X. et al. [16]. Parameter estimation in quantile regression models for censored data [17,18,19], including right-censored data, has been extensively studied. In quantile regression for right-censored data, the problem of non-smooth loss functions persists and has already been studied [20,21,22,23]. For the right-censored quantile regression model with fixed-dimensional parameters, Peng, L. and Huang, Y. [20], Xu, G. et al. [21], Cai, Z. and Sit, T. [22], and Kim, K.H. [23] considered smoothed estimation methods. For the high-dimensional, large-sample case of the censored quantile regression model, Wu, Y. et al. [24], He, X. et al. [25], and Fei, Z. et al. [26] used methods similar to that of Fernandes et al. [9] to smooth the estimating equations, which improved the estimation accuracy over classical parameter estimation methods.
However, these smoothing estimation methods for high-dimensional censored quantile regression need to place a grid on the quantile levels to ensure that the approximation error of the estimating equation does not become too large. Furthermore, the estimates at the lower grid points are needed before the parameter at a given quantile level can be estimated, and the estimation accuracy depends on the number of grid points, which also increases the computational complexity.
Considering censored data, this paper extends the convolutional smoothing method [10] to the high-dimensional censored quantile regression model and proposes a coefficient estimator for the censored quantile regression model based on convolutional smoothing. Under certain conditions, the loss function of the smoothed censored quantile regression model is twice differentiable and globally convex, and a gradient-based iterative algorithm can be used to compute the regression parameters. In this paper, the bias of the smoothing estimation of the censored quantile regression is characterized under certain conditions, and the rate of convergence, the Bahadur–Kiefer representation, and the Berry–Esseen upper bound of the smoothing estimation are established under high-dimensional, large-sample conditions. Moreover, whereas the classical parameter estimation method has different estimation accuracies at different quantile levels, the smoothing estimation method maintains essentially the same estimation accuracy at each quantile level, and the proposed method greatly reduces the computation time under high-dimensional, large-sample conditions. In summary, the key contributions of this article are as follows: (1) To our knowledge, this article is the first to apply the convolutional smoothing method to high-dimensional censored regression analysis. In contrast to the non-differentiability of the objective function in classical censored quantile regression, the objective function of this method is twice differentiable, which greatly facilitates the construction of gradient-based algorithms and the study of theoretical properties. (2) For high-dimensional data scenarios, under certain regularity conditions, this paper establishes the asymptotic consistency, asymptotic normality, and other theoretical properties of the proposed smoothed estimator, ensuring good properties for statistical inference.
This paper is organized as follows. Section 2 proposes a convolutional smoothing estimation method for high-dimensional censored quantile regression models and gives the asymptotic properties of the smoothing estimation. In Section 3, numerical simulations of the smoothing estimation and the classical parameter estimation are carried out for the low- and high-dimensional cases, and the estimation accuracy and computational speed of the smoothing method in censored quantile regression analysis are discussed. The discussion and conclusion are given in Section 4 and Section 5, and a detailed proof of the asymptotic properties is given in Appendix A.
2. Methods
In this paper, we consider a right-censored quantile regression model; i.e., we assume that the response variable y in model (1) is right-censored, in which case the observed variables are $z=\min(y,c)$ and $\delta=\mathbb{1}\{y\le c\}$, where $c$ is the censoring variable. The observations are denoted as $\{(x_i,z_i,\delta_i)\}_{i=1}^{n}$. Then, the estimator of the parameters in the censored quantile regression model is defined as
$$\hat\beta(\tau)=\operatorname*{arg\,min}_{\beta}\ \frac{1}{n}\sum_{i=1}^{n}\frac{\delta_i}{1-\hat G(z_i)}\,\rho_\tau\!\left(z_i-x_i^{\top}\beta\right),\tag{2}$$

where $\rho_\tau(u)=u\,(\tau-\mathbb{1}\{u<0\})$ is the quantile loss function, $G$ is the distribution function of the censoring variable $c$, and $\hat G$ is the estimate of $G$. Convolutional smoothing estimation for high-dimensional quantile regression models has been studied by He et al. [10]. In this paper, we apply the method to censored data. Let $K$ be a kernel function that integrates to 1, and let $h>0$ be the window width. Denote $K_h(u)=h^{-1}K(u/h)$.
Then, the objective function for the smoothing estimation of the censored quantile regression can be written as

$$\hat Q_h(\beta)=\frac{1}{n}\sum_{i=1}^{n}\frac{\delta_i}{1-\hat G(z_i)}\,(\rho_\tau * K_h)\!\left(z_i-x_i^{\top}\beta\right),\tag{3}$$

where “∗” denotes the convolution operator. The convolutional smoothing estimate (denoted SCQ) of the censored quantile regression model is defined as the minimizer $\hat\beta_h(\tau)=\operatorname*{arg\,min}_{\beta}\hat Q_h(\beta)$. For any $u\in\mathbb{R}$, write

$$\ell_h(u)=(\rho_\tau * K_h)(u)=\int_{-\infty}^{\infty}\rho_\tau(v)\,K_h(v-u)\,dv.\tag{4}$$
For ease of presentation and without ambiguity, $\hat\beta_h(\tau)$ and $\beta(\tau)$ are abbreviated as $\hat\beta_h$ and $\beta$, respectively.
It is easy to see that the loss function in (3) is twice differentiable, and its gradient and Hessian matrix can be expressed, respectively, as

$$\nabla\hat Q_h(\beta)=\frac{1}{n}\sum_{i=1}^{n}\frac{\delta_i}{1-\hat G(z_i)}\left\{\bar K\!\left(\frac{x_i^{\top}\beta-z_i}{h}\right)-\tau\right\}x_i,\qquad \nabla^2\hat Q_h(\beta)=\frac{1}{nh}\sum_{i=1}^{n}\frac{\delta_i}{1-\hat G(z_i)}\,K\!\left(\frac{z_i-x_i^{\top}\beta}{h}\right)x_i x_i^{\top},\tag{5}$$

where $\bar K(u)=\int_{-\infty}^{u}K(v)\,dv$. As long as the kernel function is non-negative, for any window width $h>0$, $\hat Q_h(\beta)$ is a convex function, and the SCQ estimator satisfies the first-order condition $\nabla\hat Q_h(\hat\beta_h)=0$.
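To make these formulas concrete, the following sketch implements (3)–(5) in Python for the Gaussian kernel, for which the convolution admits the closed form $(\rho_\tau * K_h)(u)=h\,\phi(u/h)+u\,\{\tau-1+\Phi(u/h)\}$. The helper names and the inverse-probability weights `w` (our reading of $\delta_i/\{1-\hat G(z_i)\}$) are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

def smoothed_check_loss(u, tau, h):
    """Gaussian-kernel convolution of the check loss:
    (rho_tau * K_h)(u) = h*phi(u/h) + u*(tau - 1 + Phi(u/h))."""
    s = u / h
    return h * norm.pdf(s) + u * (tau - 1.0 + norm.cdf(s))

def objective(beta, X, z, w, tau, h):
    """Weighted smoothed objective -- a reading of Equation (3)."""
    u = z - X @ beta
    return np.mean(w * smoothed_check_loss(u, tau, h))

def gradient(beta, X, z, w, tau, h):
    """Gradient of the smoothed objective, cf. Equation (5)."""
    u = z - X @ beta
    psi = norm.cdf(u / h) - (1.0 - tau)          # derivative of the loss in u
    return -(X * (w * psi)[:, None]).mean(axis=0)

def hessian(beta, X, z, w, tau, h):
    """Hessian (1/(n h)) * sum_i w_i K((z_i - x_i'beta)/h) x_i x_i';
    positive semidefinite whenever the kernel and weights are non-negative."""
    u = z - X @ beta
    k = w * norm.pdf(u / h) / h
    return (X.T * k) @ X / len(z)
```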
When estimating the distribution of the censoring variable $c$, the Kaplan–Meier (KM) estimator can be used. Alternatively, one may assume a parametric form for the distribution of the censoring variable; even if the distributional form is misspecified, our subsequent simulations show that, in most cases, estimating the censoring distribution parametrically before smoothing the estimation of the regression parameters performs better than using the KM estimator. Thus, we write the distribution function of the censoring variable in a parametric form, with the parameter vector estimated by maximum likelihood. The parametric distribution form is used in both the proofs and the assumptions.
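For the nonparametric alternative mentioned above, a minimal sketch of the KM estimate of the censoring distribution follows; it reverses the roles of events and censorings. The helper name `km_censoring_cdf` is hypothetical, and ties between event and censoring times receive no special handling.

```python
import numpy as np

def km_censoring_cdf(z, delta):
    """Kaplan-Meier estimate of G(t) = P(c <= t) evaluated at each z_i,
    treating the censoring indicators (delta == 0) as the 'events'."""
    n = len(z)
    order = np.argsort(z)
    event = 1.0 - np.asarray(delta, dtype=float)[order]  # 1 = censoring time
    at_risk = n - np.arange(n)              # risk-set size at each ordered time
    surv_sorted = np.cumprod(1.0 - event / at_risk)      # estimated P(c > t)
    G = np.empty(n)
    G[order] = 1.0 - surv_sorted            # map back to the input ordering
    return G
```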
To give theoretical results, we assume that the covariates have been centered. Given vectors $a$ and $b$, $\langle a,b\rangle=a^{\top}b$ denotes their inner product. $\|\cdot\|_q$ denotes the $\ell_q$-norm, i.e., $\|v\|_q=\big(\sum_{i=1}^{p}|v_i|^q\big)^{1/q}$, where $v_i$ denotes the ith element of the p-dimensional real vector $v$. Given a positive semidefinite matrix $A$, define $\|v\|_A=\|A^{1/2}v\|_2$ for any vector $v$. For all real numbers $a$ and $b$, define $a\vee b=\max(a,b)$ and $a\wedge b=\min(a,b)$. For two non-negative sequences $\{a_n\}$ and $\{b_n\}$, $a_n\lesssim b_n$ denotes the existence of a constant $C$ independent of $n$ such that $a_n\le Cb_n$; $a_n\gtrsim b_n$ is equivalent to $b_n\lesssim a_n$; and $a_n\asymp b_n$ means that $a_n\lesssim b_n$ and $b_n\lesssim a_n$ hold simultaneously. The assumptions required for the theorems are as follows.
The non-negative kernel function satisfies , with upper bound , and , where .
The conditional density function of the regression error term satisfies the Lipschitz condition; i.e., there exists a constant such that holds almost everywhere. There exist real numbers such that holds almost everywhere for any .
. There exist positive constants and such that
(6)
Denote as the ith order statistic of , and as the corresponding indicator function. and satisfy
(7)
The covariates obey a subexponential distribution; i.e., there exists such that for any and , we have , where is positive definite with .
With and positive integer k, we define . The following theorems can be obtained.
Theorem 1 (Upper bound on the estimation error). Suppose that conditions – hold for any real number . If h satisfies the constraints , where is the uncensored proportion, then the convolutional smoothing estimate satisfies the bounds
(8)
where and R are positive constants. The upper bounds on the estimation error can be interpreted in terms of the smoothing bias and the rate of convergence. A smaller h leads to a smaller bias after smoothing, but an h that is too small could result in overfitting and a slow convergence rate. According to Theorem 1, h satisfies the condition . In order to obtain a non-asymptotic Bahadur representation of the smoothing estimation, we replace with .
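For intuition only, a bandwidth of the order commonly used in the smoothed quantile regression literature [10] can be coded as follows; the exponent 2/5 and the floor 0.05 are illustrative defaults borrowed from that literature, not this paper's prescription.

```python
import numpy as np

def default_bandwidth(n, p):
    """Bandwidth of order ((p + log n) / n)^{2/5}, floored away from zero
    to avoid degenerate smoothing -- an illustrative default."""
    return max(((p + np.log(n)) / n) ** 0.4, 0.05)
```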
The covariates obey a sub-Gaussian distribution; i.e., there exists such that for any and , we have , where is positive definite with .
Theorem 2 (Non-asymptotic Bahadur representation). Assume that conditions – and hold and that holds almost everywhere. For any real number , let h satisfy the constraint , and let ; then,
(9)
where the real number is a constant independent of p and n. Theorem 2 makes it possible to establish the limiting distribution of the estimators. Based on the non-asymptotic representation in Theorem 2, we establish the Berry–Esseen upper bound for the smoothing estimators.
Theorem 3 (Berry–Esseen upper bound). Assume that conditions – and hold, that holds almost everywhere, and that h satisfies the condition that for any real number , there exists . Then,
(10)
where denotes the standard normal distribution function. Further, if is twice continuously differentiable and satisfies for any real number , where the function satisfies for some positive constant C, then
(11)
Theorem 3 shows that when h is chosen in the appropriate range and , the linear combination of is asymptotically normal. According to Theorem 3, the optimal h is the one that minimizes the right-hand side of the bound, and the error is then approximated as . If , then for any given vector , is asymptotically normal.
The assumptions , , and are commonly used in the convolutional smoothing estimation of high-dimensional quantile regression models with fully observed data [10]. Condition concerns the distribution of the censoring variable c. Note that assuming is equivalent to —i.e., the probability that the largest observation equals the true variable of interest is not zero. This is a commonly used condition in statistical inference for censored data [27,28], and it avoids the situation where a large number of observations are censored. The assumption provides a local smoothness condition for in a neighborhood of θ, and its validity can be verified directly for many commonly used distribution functions.
3. Numerical Simulation
In this section, the smoothing estimation and the classical parameter estimation of the quantile regression model for censored data are numerically simulated, considering both the low- and high-dimensional cases. The estimator (2), as proposed in [17], is chosen as the classical parameter estimator for the censored quantile regression model. Notice that the objective function of the classical parameter estimation for the censored quantile regression model can be rewritten in weighted form, so that, when computing the regression parameters, the censoring problem is transformed into a non-censoring problem; the objective function of the smoothing estimation for censored quantile regression is rewritten accordingly, as in (3). The Gaussian kernel is taken as the kernel function, with a correspondingly chosen window width, for the smoothing estimation of the censored quantile regression; a sketch of the resulting gradient-based fit is given below.
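Assuming the `objective` and `gradient` helpers from the sketch in Section 2, the gradient-based fit can be mimicked as follows; the warm start at a weighted least-squares solution and the name `fit_scq` are our illustrative choices, not the exact algorithm of the paper.

```python
import numpy as np
from scipy.optimize import minimize

def fit_scq(X, z, w, tau, h):
    """Minimize the weighted smoothed objective with a quasi-Newton method,
    warm-started at the weighted least-squares fit."""
    sw = np.sqrt(w)
    beta0, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    res = minimize(objective, beta0, jac=gradient,
                   args=(X, z, w, tau, h), method="BFGS")
    return res.x
```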
3.1. Model Setting and Evaluation Indicators
In the simulation, the covariates are generated from different distributions to mimic the types of variables commonly found in real data. The error term is generated from three different distributions; specifically, we draw n independent and identically distributed random numbers and let , where obeys one of the following distributions: ; ; . Given the regression coefficient and the quantile level , the response variable is generated by
For both the low- and high-dimensional models, the right-censoring variable is set as , where are unknown parameters, which can take different values so that the censoring ratio of the response variable reaches the set levels of 15%, 30%, or 45%. In the actual simulation, in order to obtain the value of , the parameters are estimated by maximum likelihood in the simulations of Section 3.2 and Section 3.3. In Section 3.4, we discuss the smoothing estimation under misspecification of the distribution of the censoring variable; KM estimation is also taken into consideration. Let the number of simulation repetitions be K, and for the parameter estimates in the kth repetition, write
Then, we can use
to evaluate the performance of the classical parameter estimation for censored quantile regression models (CQ) and the smoothing estimation method (SCQ). In the actual simulation, we set .

3.2. Low-Dimensional Performance of Regression Smoothing Estimates for Censored Quantiles
In the low-dimensional numerical study of the smoothing estimation, the number of covariates is set to and the sample sizes are 100, 200, and 500. In order to assess the performance of the smoothing method in the low-dimensional case, the generation of the covariates is categorized into three cases; a sketch of the Case 1 Monte Carlo loop is given after the list.
Case 1: The p-dimensional covariates are generated from a multivariate uniform distribution on , and the covariance matrix is the identity matrix;
Case 2: The p-dimensional covariates are generated from a multivariate uniform distribution on , where is the covariance matrix;
Case 3: The p-dimensional covariates consist of a mixture of distributions: the first two dimensions are generated from a multivariate uniform distribution on with covariance matrix , and the remaining three dimensions are generated from a multivariate normal distribution with mean and covariance matrix .
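For concreteness, the sketch below wires the previous helpers (`km_censoring_cdf`, `fit_scq`) into a Case 1 Monte Carlo loop. The true coefficients, the error and censoring laws, the bandwidth, and the reading of DMSE as the average squared estimation error over the K repetitions are illustrative stand-ins for settings whose exact values are elided in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def dmse(estimates, beta_true):
    """Average squared estimation error over the repetitions -- our reading
    of the DMSE criterion."""
    diffs = np.asarray(estimates) - beta_true
    return np.mean(np.sum(diffs ** 2, axis=1))

def simulate_case1(n=200, p=5, tau=0.5, h=0.3, K=100):
    """One Monte Carlo experiment for Case 1 covariates."""
    beta_true = np.ones(p)                        # illustrative coefficients
    estimates = []
    for _ in range(K):
        X = rng.uniform(-1.0, 1.0, size=(n, p))   # Case 1: independent uniforms
        y = X @ beta_true + rng.standard_t(df=3, size=n)   # illustrative errors
        c = rng.uniform(0.0, 6.0, size=n)         # illustrative censoring variable
        z = np.minimum(y, c)
        delta = (y <= c).astype(float)
        G = km_censoring_cdf(z, delta)            # or a parametric estimate of G
        w = delta / np.clip(1.0 - G, 1e-3, None)  # IPCW-type weights
        estimates.append(fit_scq(X, z, w, tau, h))
    return dmse(estimates, beta_true)
```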
Table 1, Table 2 and Table 3 show the simulation results when the covariates are generated according to the three cases, where CP denotes the censoring ratio of the response variable, n is the sample size, and columns 3–12 show the results of CQ and SCQ at different quantile levels. From the estimation results, when the regression errors are generated by symmetric distributions, i.e., the t and Laplace distributions, SCQ has higher accuracy than CQ, especially at the lower and higher quantiles. When the regression error term is generated by the asymmetric distribution, the estimation accuracy of CQ decreases globally as τ increases, and the estimation accuracy of SCQ is much better than that of CQ at the higher quantiles, although CQ is better than SCQ at the lower quantiles. This may be because the density function of that distribution is skewed and the observations are excessively clustered at the lower quantiles, so that the estimation accuracy of CQ decreases as the quantile level increases, while SCQ maintains good estimation accuracy. It can be seen from Table 1, Table 2 and Table 3 that SCQ is more stable than CQ, regardless of whether the error terms follow a symmetric or an asymmetric distribution. Specifically, the estimation accuracy of SCQ is almost the same at all quantile levels, while that of CQ fluctuates with τ, especially when the error terms are asymmetrically distributed. Overall, the estimation accuracy of CQ depends greatly on the value of τ, the size of the censoring ratio, and the distribution of the error term, while SCQ is minimally affected by these factors and shows good robustness.
3.3. High-Dimensional Performance of Smoothing Estimators of Censored Quantile Regression
In the high-dimensional, large-sample numerical study, the ratio of sample size to dimension is fixed at , and the sample size increases from 1000 to 5000 in steps of 500. In order to study the smoothing estimation of censored quantile regression as the dimension and sample size change, the covariate generation is categorized into three cases.
Case 1: The p-dimensional covariate is generated from a multivariate uniform distribution on , with covariance matrix .
Case 2: The p-dimensional covariate is generated from a multivariate uniform distribution on , with covariance matrix , where
Case 3: The p-dimensional covariates consist of a mixture of distributions, where the first -dimensional covariates are generated from a multivariate uniform distribution on with covariance matrix , and the remaining -dimensional covariates are generated from a multivariate normal distribution with mean , whose jth component is , and covariance matrix , where
The ratio of the DMSE of CQ to that of SCQ is first calculated, and the simulation results for covariates under Case 1 are displayed in Figure 1. Since the results of the three covariate generation cases are very similar, we do not show the results of Case 2 and Case 3. The results in Figure 1 show that the DMSE ratio of CQ to SCQ is not significantly affected by changes in sample size and dimensionality. When the regression error terms are generated by the symmetric distributions, i.e., the t and Laplace distributions, the DMSE ratios of the regression coefficient estimators between CQ and SCQ remain above one. This indicates that SCQ has higher precision than CQ, especially at the lower and upper quantiles, where the difference is more pronounced and the DMSE ratio between CQ and SCQ can reach three. When the error term is generated by the asymmetric distribution, the DMSE ratio between CQ and SCQ is less than one at the low quantile levels, and CQ is superior to SCQ in this case. However, the ratio between the two methods increases with τ and reaches around 10 at the highest quantile level considered. In order to clarify the reason for this phenomenon, we calculate the DMSE of the two estimation methods when the error term is generated by the asymmetric distribution. As shown in Figure 2, CQ performs better when τ = 0.1. As τ increases, the DMSE of CQ grows at an increasing rate, and it is significantly higher than that of SCQ at the upper quantiles, whereas the DMSE of SCQ remains essentially flat across quantile levels. This phenomenon aligns with its low-dimensional counterpart, underscoring that the estimation accuracy of CQ depends on the magnitude of τ, the censoring ratio of the response variable, and the distribution of the error term. In contrast, the estimation accuracy of SCQ is robust to variations in these factors.
To assess the computational efficiency of the smoothing estimation, we compare not only the DMSE of the CQ and SCQ estimators in the high-dimensional simulations but also the time expenditure of each estimation method. In Figure 3, the computational time ratios of CQ to SCQ are all greater than 1 and tend to increase with the dimensionality and the sample size; the ratios exceed 10 in all cases when the sample size is . The results for covariate generation under Case 2 and Case 3, which are similar to those of Case 1, are not shown here. Combined with the preceding results, it is clear that, compared with CQ, SCQ significantly decreases the computation time and increases the estimation accuracy in the majority of cases.
3.4. Robustness of Smoothing Estimates
In order to compare the effects on the parameter smoothing estimation of misspecifying the distribution of the censoring variable and of estimating the censoring distribution by KM estimation, we choose Case 1 of the covariate generation in both the low- and high-dimensional simulations for the numerical study.
When the sample size is 200, the model coefficients are estimated using convolutional smoothing after the distribution of the censoring variable has been misspecified as Normal, Weibull, or Lognormal, and also after estimating G by KM estimation, under different quantile levels, censoring proportions, and error terms. The simulations in Table 4 show that the smoothing estimation based on a misspecified parametric censoring distribution is more robust than the smoothing estimation that uses the KM estimate of the censoring distribution. Similar results are obtained for a sample size of 3000, as shown in Table 5.
The simulations show that although the parametric estimation of the distribution of the censoring variable may be misspecified, this approach is better and more robust than smoothing the regression model after estimating the censoring distribution by KM estimation. Tables 4 and 5 also show that when the censoring variable is fitted with different distributional forms, the smoothing estimation errors of the model coefficients vary only minimally, and the estimation accuracy is higher than that obtained by first estimating the censoring distribution with the KM estimator and then carrying out the smoothing estimation. Regarding computational efficiency, the smoothing estimation method under distributional misspecification has a running time ratio to the SCQ that fluctuates between 0.9 and 1.1, whereas the smoothing estimation preceded by KM estimation incurs a significantly longer running time.
4. Discussion
In the smoothing estimation of censored quantile regression models, the distribution of the censoring variable is usually unknown. This problem can be addressed by estimating the density of the censoring variable with existing nonparametric methods, determining the type of the censoring distribution using a goodness-of-fit test, and then estimating the unknown parameters of the distribution by maximum likelihood. In the numerical simulations, this paper also uses such a procedure to fit the unknown censoring distribution and then estimate the regression parameters; a sketch of the selection step is given below.
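The sketch below illustrates this selection step, under the simplifying assumption that realized censoring times are available to fit against; the candidate families mirror those of Section 3.4, and `select_censoring_family` is a hypothetical helper.

```python
from scipy import stats

def select_censoring_family(c_obs):
    """Fit candidate parametric families by maximum likelihood and keep the
    one with the best Kolmogorov-Smirnov goodness-of-fit p-value.
    Weibull and Lognormal require positive data."""
    candidates = {"Normal": stats.norm,
                  "Weibull": stats.weibull_min,
                  "Lognormal": stats.lognorm}
    best_name, best_pval, best_law = None, -1.0, None
    for name, family in candidates.items():
        params = family.fit(c_obs)                 # maximum likelihood fit
        frozen = family(*params)
        pval = stats.kstest(c_obs, frozen.cdf).pvalue
        if pval > best_pval:
            best_name, best_pval, best_law = name, pval, frozen
    return best_name, best_law
```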
We also discuss the parametric smoothing estimation method under misspecification of the distribution of the censoring variable, and compare it with the smoothing estimation of the parameters after estimating the censoring distribution by KM estimation. The simulation results show that the smoothing estimation method is still more robust than the KM-based approach even when the parametric form of the censoring distribution is misspecified. Meanwhile, the smoothing estimation method is more robust than the classical censored quantile regression estimation.
Our research has certain limitations, and some issues need further exploration. First, we have analyzed parameter smoothing estimation for quantile linear models; further research could address parameter estimation and interval estimation for more complex models, such as generalized linear models. Second, our proofs rest on the assumption that the form of the censoring distribution is known, and a theoretical treatment that estimates the censoring distribution nonparametrically remains a challenging task, requiring tools from nonparametric statistics and probability limit theory.
5. Conclusions
In this paper, a convolutional smoothing estimation method for the censored quantile regression model is proposed to address the problem that the loss function is not differentiable. Our method associates convolutional smoothing with the loss function of censored quantile regression, yielding a twice differentiable objective, in contrast to the classical censored quantile regression estimation. Moreover, the smoothing estimation method for censored quantile regression models improves the estimation accuracy, computational speed, and robustness over the classical parameter estimation method. The contribution and significance of this paper can be summarized as follows:
The method establishes links between the convolutional smoothing method and the loss function of the censored quantile regression model, and the use of a non-negative kernel function ensures that the smoothed loss function is twice differentiable and globally convex, so that a gradient-based iterative algorithm can be used to improve the computational speed.
Theoretically, we characterize the bias of the smoothing estimation for censored quantile regression and establish the convergence rate, Bahadur–Kiefer representation, and Berry–Esseen upper bound of the smoothing estimation under high-dimensional and large-sample conditions.
The numerical simulations show that, compared with the classical parameter estimation, the smoothing estimation method greatly reduces the computation time and improves the estimation accuracy in most cases. In addition, the accuracy of the CQ estimator is highly dependent on τ, the censoring ratio (CP), and the distribution of the error term, whereas the SCQ estimator is robust to these factors.
Author Contributions: Conceptualization, M.W.; methodology, M.W.; software, M.W.; validation, M.W., X.Z. and Q.G.; formal analysis, M.W.; investigation, M.W.; resources, M.W.; data curation, M.W.; writing—original draft preparation, M.W.; writing—review and editing, X.W., X.M. and J.W.; visualization, M.W.; supervision, X.Z. and Q.G.; project administration, X.Z. and Q.G.; funding acquisition, X.W., X.Z. and Q.G. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: The datasets used and analyzed in this study are available from the corresponding author(s) on reasonable request.
Conflicts of Interest: The authors declare no conflicts of interest.
Figure 1. Estimation results for the high-dimensional model when covariate X is generated from Case 1. The horizontal coordinates of the plots indicate the sample size (in thousands), and the vertical coordinates indicate the ratio of DMSE for regression coefficients’ estimators between CQ and SCQ.
Figure 2. The DMSE of CQ and SCQ for the three high-dimensional covariate generation cases when the error terms obey the asymmetric distribution, with the horizontal coordinates denoting the different quantile levels and the vertical axes denoting the DMSEs of the CQ and SCQ estimations scaled up by a factor of 10; the solid line denotes SCQ, and the dashed line denotes CQ.
Figure 3. Simulation results under the high-dimensional model when the covariates are generated from Case 1, where the horizontal coordinate denotes the sample size (in thousands), and the vertical coordinate denotes the ratio of estimated running time between CQ and SCQ.
The DMSE of CQ and SCQ estimators for the low-dimensional model with covariate X generated from Case 1, where the values in columns 3–12 are DMSE ×
| CP (%) | n | CQ τ=0.1 | SCQ τ=0.1 | CQ τ=0.3 | SCQ τ=0.3 | CQ τ=0.5 | SCQ τ=0.5 | CQ τ=0.7 | SCQ τ=0.7 | CQ τ=0.9 | SCQ τ=0.9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| *Error distribution 1* | | | | | | | | | | | |
| 15 | 100 | 397 | 116 | 143 | 112 | 112 | 109 | 137 | 116 | 387 | 114 |
| 15 | 200 | 180 | 58 | 63 | 54 | 51 | 56 | 66 | 52 | 179 | 56 |
| 15 | 500 | 71 | 24 | 28 | 23 | 21 | 22 | 27 | 21 | 73 | 21 |
| 30 | 100 | 489 | 151 | 190 | 134 | 144 | 133 | 170 | 131 | 518 | 135 |
| 30 | 200 | 228 | 74 | 85 | 70 | 65 | 69 | 82 | 67 | 230 | 65 |
| 30 | 500 | 95 | 34 | 30 | 27 | 25 | 27 | 32 | 26 | 91 | 25 |
| 45 | 100 | 687 | 200 | 224 | 179 | 181 | 178 | 220 | 178 | 628 | 173 |
| 45 | 200 | 302 | 97 | 103 | 91 | 84 | 88 | 107 | 81 | 291 | 77 |
| 45 | 500 | 118 | 45 | 38 | 39 | 34 | 35 | 39 | 31 | 118 | 32 |
| *Error distribution 2* | | | | | | | | | | | |
| 15 | 100 | 2 | 128 | 13 | 135 | 62 | 126 | 212 | 122 | 1103 | 123 |
| 15 | 200 | 0 | 78 | 6 | 78 | 29 | 79 | 118 | 75 | 591 | 77 |
| 15 | 500 | 0 | 39 | 2 | 40 | 13 | 39 | 48 | 39 | 236 | 37 |
| 30 | 100 | 2 | 164 | 17 | 155 | 77 | 164 | 270 | 155 | 1397 | 144 |
| 30 | 200 | 0 | 97 | 7 | 98 | 37 | 97 | 136 | 95 | 718 | 90 |
| 30 | 500 | 0 | 51 | 3 | 54 | 15 | 52 | 55 | 48 | 305 | 44 |
| 45 | 100 | 3 | 208 | 23 | 198 | 89 | 204 | 321 | 189 | 1746 | 194 |
| 45 | 200 | 1 | 127 | 10 | 120 | 46 | 124 | 162 | 117 | 860 | 107 |
| 45 | 500 | 0 | 68 | 4 | 69 | 20 | 62 | 73 | 59 | 366 | 55 |
| *Error distribution 3* | | | | | | | | | | | |
| 15 | 100 | 603 | 235 | 246 | 230 | 213 | 235 | 258 | 223 | 564 | 234 |
| 15 | 200 | 304 | 119 | 125 | 115 | 109 | 114 | 129 | 114 | 299 | 111 |
| 15 | 500 | 119 | 52 | 50 | 49 | 43 | 46 | 52 | 45 | 109 | 46 |
| 30 | 100 | 743 | 322 | 343 | 302 | 271 | 291 | 336 | 288 | 762 | 270 |
| 30 | 200 | 365 | 171 | 159 | 149 | 124 | 147 | 163 | 136 | 385 | 289 |
| 30 | 500 | 140 | 74 | 62 | 66 | 53 | 60 | 59 | 54 | 138 | 54 |
| 45 | 100 | 983 | 441 | 410 | 374 | 370 | 364 | 435 | 361 | 964 | 330 |
| 45 | 200 | 461 | 210 | 200 | 184 | 165 | 171 | 200 | 175 | 456 | 163 |
| 45 | 500 | 195 | 101 | 76 | 80 | 65 | 81 | 76 | 71 | 187 | 64 |
The DMSE of CQ and SCQ estimators for the low-dimensional model with covariate X generated from Case 2, where the values in columns 3–12 are DMSE ×
| CP (%) | n | CQ τ=0.1 | SCQ τ=0.1 | CQ τ=0.3 | SCQ τ=0.3 | CQ τ=0.5 | SCQ τ=0.5 | CQ τ=0.7 | SCQ τ=0.7 | CQ τ=0.9 | SCQ τ=0.9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| *Error distribution 1* | | | | | | | | | | | |
| 15 | 100 | 661 | 173 | 241 | 168 | 180 | 179 | 245 | 183 | 667 | 173 |
| 15 | 200 | 323 | 88 | 113 | 80 | 89 | 80 | 108 | 78 | 303 | 79 |
| 15 | 500 | 128 | 33 | 46 | 32 | 36 | 32 | 47 | 34 | 119 | 33 |
| 30 | 100 | 840 | 230 | 266 | 219 | 226 | 205 | 288 | 208 | 847 | 209 |
| 30 | 200 | 390 | 110 | 137 | 105 | 112 | 99 | 145 | 103 | 413 | 106 |
| 30 | 500 | 160 | 47 | 53 | 42 | 44 | 42 | 55 | 42 | 151 | 40 |
| 45 | 100 | 1101 | 310 | 423 | 302 | 344 | 295 | 422 | 265 | 1210 | 275 |
| 45 | 200 | 521 | 150 | 180 | 143 | 145 | 136 | 175 | 135 | 506 | 125 |
| 45 | 500 | 186 | 62 | 69 | 53 | 55 | 52 | 74 | 50 | 210 | 52 |
| *Error distribution 2* | | | | | | | | | | | |
| 15 | 100 | 3 | 190 | 21 | 195 | 97 | 199 | 352 | 194 | 1889 | 196 |
| 15 | 200 | 1 | 118 | 10 | 113 | 50 | 114 | 195 | 110 | 996 | 115 |
| 15 | 500 | 0 | 60 | 4 | 59 | 21 | 58 | 81 | 59 | 450 | 60 |
| 30 | 100 | 4 | 252 | 29 | 253 | 124 | 243 | 465 | 236 | 2219 | 236 |
| 30 | 200 | 1 | 146 | 13 | 145 | 60 | 142 | 221 | 139 | 1181 | 136 |
| 30 | 500 | 0 | 78 | 5 | 74 | 25 | 73 | 94 | 72 | 521 | 68 |
| 45 | 100 | 9 | 321 | 41 | 330 | 157 | 303 | 571 | 301 | 2653 | 307 |
| 45 | 200 | 1 | 179 | 17 | 184 | 75 | 193 | 271 | 187 | 1578 | 179 |
| 45 | 500 | 0 | 93 | 7 | 92 | 31 | 91 | 121 | 93 | 671 | 90 |
| *Error distribution 3* | | | | | | | | | | | |
| 15 | 100 | 1027 | 363 | 434 | 363 | 365 | 361 | 462 | 347 | 1028 | 356 |
| 15 | 200 | 506 | 179 | 227 | 168 | 179 | 183 | 218 | 174 | 484 | 172 |
| 15 | 500 | 195 | 75 | 86 | 67 | 76 | 68 | 85 | 70 | 204 | 71 |
| 30 | 100 | 1267 | 449 | 546 | 453 | 481 | 414 | 583 | 437 | 1366 | 443 |
| 30 | 200 | 560 | 224 | 270 | 208 | 215 | 209 | 273 | 221 | 574 | 196 |
| 30 | 500 | 241 | 93 | 112 | 85 | 94 | 86 | 100 | 86 | 244 | 83 |
| 45 | 100 | 1656 | 615 | 702 | 586 | 646 | 608 | 784 | 584 | 1642 | 564 |
| 45 | 200 | 783 | 302 | 389 | 280 | 296 | 277 | 348 | 266 | 830 | 255 |
| 45 | 500 | 305 | 122 | 133 | 115 | 114 | 116 | 136 | 105 | 313 | 101 |
The DMSE of CQ and SCQ estimators for the low-dimensional model with covariate X generated from Case 3, where the values in columns 3–12 are DMSE ×
| CP (%) | n | CQ τ=0.1 | SCQ τ=0.1 | CQ τ=0.3 | SCQ τ=0.3 | CQ τ=0.5 | SCQ τ=0.5 | CQ τ=0.7 | SCQ τ=0.7 | CQ τ=0.9 | SCQ τ=0.9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| *Error distribution 1* | | | | | | | | | | | |
| 15 | 100 | 863 | 259 | 315 | 255 | 257 | 253 | 302 | 252 | 841 | 249 |
| 15 | 200 | 405 | 131 | 144 | 128 | 118 | 126 | 148 | 125 | 442 | 122 |
| 15 | 500 | 166 | 52 | 56 | 50 | 43 | 49 | 54 | 48 | 164 | 46 |
| 30 | 100 | 1082 | 331 | 371 | 321 | 311 | 315 | 390 | 311 | 104 | 304 |
| 30 | 200 | 490 | 168 | 172 | 159 | 141 | 153 | 181 | 149 | 528 | 144 |
| 30 | 500 | 203 | 73 | 71 | 67 | 54 | 64 | 66 | 61 | 187 | 58 |
| 45 | 100 | 1443 | 467 | 518 | 442 | 421 | 431 | 522 | 425 | 1372 | 415 |
| 45 | 200 | 653 | 226 | 233 | 209 | 187 | 202 | 247 | 196 | 712 | 186 |
| 45 | 500 | 260 | 101 | 91 | 89 | 70 | 83 | 89 | 78 | 248 | 73 |
| *Error distribution 2* | | | | | | | | | | | |
| 15 | 100 | 3 | 286 | 30 | 286 | 127 | 284 | 468 | 282 | 2529 | 277 |
| 15 | 200 | 1 | 180 | 12 | 180 | 65 | 179 | 262 | 177 | 1290 | 172 |
| 15 | 500 | 0 | 87 | 5 | 87 | 26 | 86 | 101 | 85 | 535 | 82 |
| 30 | 100 | 5 | 358 | 36 | 357 | 154 | 354 | 566 | 348 | 2992 | 335 |
| 30 | 200 | 1 | 218 | 15 | 216 | 76 | 212 | 295 | 208 | 1551 | 199 |
| 30 | 500 | 0 | 118 | 6 | 116 | 32 | 114 | 122 | 109 | 688 | 100 |
| 45 | 100 | 9 | 467 | 51 | 465 | 210 | 457 | 709 | 450 | 382 | 431 |
| 45 | 200 | 2 | 281 | 21 | 279 | 97 | 274 | 374 | 265 | 1866 | 246 |
| 45 | 500 | 0 | 154 | 8 | 152 | 41 | 148 | 161 | 139 | 907 | 124 |
| *Error distribution 3* | | | | | | | | | | | |
| 15 | 100 | 1372 | 553 | 619 | 545 | 537 | 539 | 615 | 535 | 1337 | 527 |
| 15 | 200 | 638 | 259 | 276 | 252 | 229 | 247 | 269 | 244 | 649 | 241 |
| 15 | 500 | 260 | 117 | 103 | 112 | 89 | 109 | 107 | 106 | 255 | 103 |
| 30 | 100 | 1757 | 718 | 780 | 693 | 655 | 678 | 766 | 664 | 1675 | 639 |
| 30 | 200 | 780 | 358 | 334 | 337 | 295 | 325 | 336 | 315 | 819 | 304 |
| 30 | 500 | 311 | 170 | 133 | 154 | 114 | 146 | 137 | 139 | 318 | 130 |
| 45 | 100 | 2274 | 985 | 1007 | 939 | 865 | 907 | 945 | 874 | 2145 | 822 |
| 45 | 200 | 999 | 483 | 452 | 446 | 388 | 424 | 433 | 407 | 1064 | 386 |
| 45 | 500 | 405 | 238 | 172 | 210 | 150 | 196 | 178 | 183 | 415 | 165 |
Smoothing estimators of the regression model parameters after misspecifying the distribution of the censoring variable as Normal, Weibull, or Lognormal, and after estimating G using KM estimation. The covariate X in the low-dimensional model is generated from Case 1, where the values in columns 2–13 are DMSE ×
| τ | Normal 15% | Normal 30% | Normal 45% | Weibull 15% | Weibull 30% | Weibull 45% | Lognormal 15% | Lognormal 30% | Lognormal 45% | KM 15% | KM 30% | KM 45% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *Error distribution 1* | | | | | | | | | | | | |
| 0.1 | 588 | 766 | 991 | 567 | 722 | 938 | 564 | 709 | 916 | 1366 | 1404 | 1448 |
| 0.3 | 554 | 682 | 861 | 552 | 677 | 858 | 552 | 677 | 858 | 1158 | 1192 | 1203 |
| 0.5 | 537 | 644 | 808 | 543 | 655 | 823 | 545 | 660 | 832 | 1123 | 1128 | 1157 |
| 0.7 | 524 | 618 | 774 | 536 | 639 | 800 | 539 | 648 | 815 | 1037 | 1051 | 1100 |
| 0.9 | 511 | 594 | 754 | 529 | 620 | 776 | 534 | 631 | 793 | 976 | 1000 | 1091 |
| *Error distribution 2* | | | | | | | | | | | | |
| 0.1 | 808 | 1006 | 1270 | 793 | 974 | 1227 | 791 | 966 | 1212 | 1786 | 1847 | 1982 |
| 0.3 | 803 | 995 | 1250 | 790 | 969 | 1215 | 789 | 962 | 1202 | 1788 | 1810 | 1902 |
| 0.5 | 792 | 969 | 1212 | 785 | 955 | 1191 | 785 | 953 | 1185 | 1716 | 1778 | 1856 |
| 0.7 | 772 | 922 | 1150 | 776 | 929 | 1151 | 778 | 933 | 1155 | 1521 | 1684 | 1820 |
| 0.9 | 740 | 858 | 1076 | 759 | 886 | 1088 | 765 | 898 | 1102 | 1305 | 1409 | 1472 |
| *Error distribution 3* | | | | | | | | | | | | |
| 0.1 | 1213 | 1671 | 2138 | 1160 | 1560 | 2047 | 1151 | 1525 | 1999 | 1778 | 2023 | 2348 |
| 0.3 | 1141 | 1472 | 1835 | 1126 | 1451 | 1851 | 1125 | 1444 | 1851 | 1370 | 1595 | 1897 |
| 0.5 | 1102 | 1372 | 1695 | 1106 | 1392 | 1754 | 1110 | 1400 | 1777 | 1208 | 1434 | 1731 |
| 0.7 | 1071 | 1295 | 1593 | 1092 | 1344 | 1678 | 1099 | 1363 | 1717 | 1116 | 1326 | 1609 |
| 0.9 | 1034 | 1215 | 1513 | 1072 | 1285 | 1604 | 1083 | 1314 | 1655 | 1078 | 1304 | 1598 |
Smoothing estimates of the regression model parameters after misspecifying the distribution of the censoring variable as Normal, Weibull, or Lognormal, and after estimating G using KM estimation. The covariate X in the high-dimensional model is generated from Case 1, where the values in columns 2–13 are DMSE ×
| τ | Normal 15% | Normal 30% | Normal 45% | Weibull 15% | Weibull 30% | Weibull 45% | Lognormal 15% | Lognormal 30% | Lognormal 45% | KM 15% | KM 30% | KM 45% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *Error distribution 1* | | | | | | | | | | | | |
| 0.1 | 1962 | 2272 | 2891 | 1952 | 2270 | 2945 | 2020 | 2274 | 2965 | 2634 | 3593 | 4324 |
| 0.3 | 1951 | 2260 | 2937 | 1940 | 2258 | 2925 | 2022 | 2263 | 2943 | 2570 | 3532 | 4343 |
| 0.5 | 1945 | 2253 | 2891 | 1935 | 2251 | 2913 | 2026 | 2255 | 2936 | 2489 | 3459 | 4373 |
| 0.7 | 1943 | 2249 | 2902 | 1933 | 2247 | 2900 | 2019 | 2253 | 2931 | 2394 | 3381 | 4273 |
| 0.9 | 1943 | 2239 | 2892 | 1932 | 2236 | 2876 | 2013 | 2239 | 2912 | 2369 | 3306 | 4172 |
| *Error distribution 2* | | | | | | | | | | | | |
| 0.1 | 2411 | 2427 | 2855 | 2391 | 2415 | 2844 | 2421 | 2445 | 2868 | 3605 | 3613 | 4105 |
| 0.3 | 2402 | 2439 | 2864 | 2392 | 2437 | 2844 | 2416 | 2448 | 2885 | 3390 | 3438 | 4035 |
| 0.5 | 2421 | 2435 | 2851 | 2391 | 2426 | 2842 | 2432 | 2447 | 2865 | 3308 | 3350 | 4062 |
| 0.7 | 2412 | 2446 | 2854 | 2392 | 2434 | 2843 | 2425 | 2465 | 2864 | 3307 | 3353 | 3955 |
| 0.9 | 2409 | 2442 | 2832 | 2390 | 2431 | 2825 | 2413 | 2444 | 2850 | 3262 | 3271 | 4009 |
| *Error distribution 3* | | | | | | | | | | | | |
| 0.1 | 3857 | 4742 | 6153 | 3846 | 4739 | 6143 | 3861 | 4757 | 6165 | 3640 | 4442 | 5388 |
| 0.3 | 3840 | 4724 | 6096 | 3834 | 4711 | 6085 | 3853 | 4742 | 6101 | 3602 | 4349 | 5403 |
| 0.5 | 3838 | 4710 | 6051 | 3827 | 4692 | 6045 | 3841 | 4740 | 6067 | 3728 | 4252 | 5332 |
| 0.7 | 3835 | 4682 | 6018 | 3819 | 4675 | 6006 | 3851 | 4695 | 6024 | 3789 | 4169 | 5236 |
| 0.9 | 3834 | 4662 | 5963 | 3816 | 4654 | 5951 | 3849 | 4686 | 5975 | 3610 | 4097 | 5195 |
Appendix A
Suppose
In order to prove Theorem 1, two lemmas are given first.
Assuming conditions
Moreover, assuming that
To prove (
The last step of (
By using Taylor’s formula, we obtain
Notice that
Combined with the above equation, we can obtain
For the left-hand side of Equation (
Combining (
Suppose
Repeating the analysis of (
This leads to a contradiction. Therefore, h must be satisfied to fulfill (
In order to demonstrate (A3), the bias of the estimator is examined below. We define the variables
Furthermore, expanding
Combining (
Lemma A1 discusses the connection between
For any
For each sample
For any given
Through Rademacher’s symmetric control moment function
Define
Taking
It follows that for every
Substituting the above inequality into (
The required constraint can be obtained by taking
According to the definition of
Take
To control the last term on the right-hand side of (
Combining the inequalities (
Using the condition (
Secondly, when
The above inequality in
The conclusion of Theorem 1 can be proved by eliminating
The procedure for the proof of Theorem 2 will retain the notation used in the proof of Theorem 1, for any
It is now necessary to give upper bounds for M and N, respectively. Consider first the upper bound for M; by the mean value theorem,
Utilizing the Lipschitz continuity of
The last inequality can be obtained from the Cauchy–Schwarz inequality. Therefore,
Let
The proof procedure of Theorem 3 can be found in [
Appendix B
Figure A1. Estimation results for the high-dimensional model when covariate X is generated from Case 2. The horizontal coordinates of the plots indicate the sample size (in thousands), and the vertical coordinates indicate the ratio of DMSE for regression coefficients’ estimators between CQ and SCQ.
Figure A2. Estimation results for the high-dimensional model when covariate X is generated from Case 3. The horizontal coordinates of the plots indicate the sample size (in thousands), and the vertical coordinates indicate the ratio of DMSE for regression coefficients’ estimators between CQ and SCQ.
Figure A3. Simulation results under the high-dimensional model when the covariates are generated from Case 2, where the horizontal coordinate denotes the sample size (in thousands), and the vertical coordinate denotes the ratio of estimated running time between CQ and SCQ.
Figure A4. Simulation results under the high-dimensional model when the covariates are generated from Case 3, where the horizontal coordinate denotes the sample size (in thousands), and the vertical coordinate denotes the ratio of estimated running time between CQ and SCQ.
References
1. Koenker, R.; Bassett, G., Jr. Regression quantiles. Econometrica; 1978; 46, pp. 33-50. [DOI: https://dx.doi.org/10.2307/1913643]
2. Koenker, R. Quantile Regression; Cambridge University Press: Cambridge, UK, 2005.
3. Koenker, R.; Chernozhukov, V.; He, X.M.; Peng, L. Handbook of Quantile Regression; Chapman & Hall/CRC: Boca Raton, FL, USA, 2017.
4. Horowitz, J.L. Bootstrap methods for median regression models. Econometrica; 1998; 66, pp. 1327-1351. [DOI: https://dx.doi.org/10.2307/2999619]
5. de Castro, L.; Galvao, A.F.; Kaplan, D.M.; Liu, X. Smoothed GMM for quantile models. J. Econom.; 2019; 213, pp. 121-144. [DOI: https://dx.doi.org/10.1016/j.jeconom.2019.04.008]
6. Chen, X.; Liu, W.; Zhang, Y. Quantile regression under memory constraint. Ann. Stat.; 2019; 47, pp. 3244-3273. [DOI: https://dx.doi.org/10.1214/18-AOS1777]
7. Galvao, A.F.; Kato, K. Smoothed quantile regression for panel data. J. Econom.; 2016; 193, pp. 92-112. [DOI: https://dx.doi.org/10.1016/j.jeconom.2016.01.008]
8. Whang, Y.J. Smoothed empirical likelihood methods for quantile regression models. Econ. Theory; 2006; 22, pp. 173-205. [DOI: https://dx.doi.org/10.1017/S0266466606060087]
9. Fernandes, M.; Guerre, E.; Horta, E. Smoothing quantile regressions. J. Bus. Econom Stat.; 2021; 39, pp. 338-357. [DOI: https://dx.doi.org/10.1080/07350015.2019.1660177]
10. He, X.; Pan, X.; Tan, K.M.; Zhou, W.X. Smoothed quantile regression with large-scale inference. J. Econom.; 2023; 232, pp. 367-388. [DOI: https://dx.doi.org/10.1016/j.jeconom.2021.07.010] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/36776480]
11. Bayarassou, N.; Hamrani, F.; Ould, S.E. Nonparametric relative error estimation of the regression function for left truncated and right censored time series data. J. Nonparametr. Stat.; 2023; 36, pp. 706-729. [DOI: https://dx.doi.org/10.1080/10485252.2023.2241572]
12. Ying, Z.; Jung, S.H.; Wei, L.J. Survival analysis with median regression models. J. Am. Stat. Assoc.; 1995; 90, pp. 178-184. [DOI: https://dx.doi.org/10.1080/01621459.1995.10476500]
13. Honoré, B.; Khan, S.; Powell, J.L. Quantile regression under random censoring. J. Econom.; 2002; 109, pp. 67-105. [DOI: https://dx.doi.org/10.1016/S0304-4076(01)00142-7]
14. Portnoy, S. Censored regression quantiles. J. Am. Stat. Assoc.; 2003; 98, pp. 1001-1012. [DOI: https://dx.doi.org/10.1198/016214503000000954]
15. Peng, L. Self-consistent estimation of censored quantile regression. J. Multivar. Anal.; 2012; 105, pp. 368-379. [DOI: https://dx.doi.org/10.1016/j.jmva.2011.10.005]
16. Yuan, X.; Zhang, X.; Guo, W.; Hu, Q. An adapted loss function for composite quantile regression with censored data. Comput. Stat.; 2024; 39, pp. 1371-1401. [DOI: https://dx.doi.org/10.1007/s00180-023-01352-6]
17. Gao, Q.; Zhou, X.; Feng, Y.; Du, X.; Liu, X. An empirical likelihood method for quantile regression models with censored data. Metrika; 2021; 84, pp. 75-96. [DOI: https://dx.doi.org/10.1007/s00184-020-00775-1]
18. Hao, R.; Weng, C.; Liu, X.; Yang, X. Data augmentation based estimation for the censored quantile regression neural network model. Expert Syst. Appl.; 2023; 214, 119097. [DOI: https://dx.doi.org/10.1016/j.eswa.2022.119097]
19. Yang, X.; Narisetty, N.N.; He, X. A new approach to censored quantile regression estimation. J. Comput. Graph. Stat.; 2018; 27, pp. 417-425. [DOI: https://dx.doi.org/10.1080/10618600.2017.1385469]
20. Peng, L.; Huang, Y. Survival analysis with quantile regression models. J. Am. Stat. Assoc.; 2008; 103, pp. 637-649. [DOI: https://dx.doi.org/10.1198/016214508000000355]
21. Xu, G.; Sit, T.; Wang, L.; Huang, C.Y. Estimation and inference of quantile regression for survival data under biased sampling. J. Am. Stat. Assoc.; 2017; 112, pp. 1571-1586. [DOI: https://dx.doi.org/10.1080/01621459.2016.1222286] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/30078919]
22. Cai, Z.; Sit, T. On interquantile smoothness of censored quantile regression with induced smoothing. Biometrics; 2023; 79, pp. 3549-3563. [DOI: https://dx.doi.org/10.1111/biom.13892] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37382567]
23. Kim, K.H.; Caplan, D.J.; Kang, S. Smoothed quantile regression for censored residual life. Comput Stat.; 2023; 38, pp. 1001-1022. [DOI: https://dx.doi.org/10.1007/s00180-022-01262-z]
24. Wu, Y.; Ma, Y.; Yin, G. Smoothed and corrected score approach to censored quantile regression with measurement errors. J. Am. Stat. Assoc.; 2015; 110, pp. 1670-1683. [DOI: https://dx.doi.org/10.1080/01621459.2014.989323]
25. He, X.; Pan, X.; Tan, K.M.; Zhou, W.X. Scalable estimation and inference for censored quantile regression process. Ann. Stat.; 2022; 50, pp. 2899-2924. [DOI: https://dx.doi.org/10.1214/22-AOS2214]
26. Fei, Z.; Zheng, Q.; Hong, H.G.; Li, Y. Inference for high-dimensional censored quantile regression. J. Am. Stat. Assoc.; 2023; 118, pp. 898-912. [DOI: https://dx.doi.org/10.1080/01621459.2021.1957900] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37309513]
27. De Backer, M.; Ghouch, A.E.; Van, K.I. An adapted loss function for censored quantile regression. J. Am. Stat. Assoc.; 2019; 114, pp. 1126-1137. [DOI: https://dx.doi.org/10.1080/01621459.2018.1469996]
28. Leng, C.; Tong, X. Censored quantile regression via Box-Cox transformation under conditional independence. Stat. Sin.; 2014; 24, pp. 221-249. [DOI: https://dx.doi.org/10.5705/ss.2012.089]
29. Spokoiny, V. Bernstein–von Mises theorem for growing parameter dimension. arXiv; 2013; [DOI: https://dx.doi.org/10.48550/arXiv.1302.3430] arXiv: 1302.3430
Abstract
In this paper, we propose a smoothing estimation method for censored quantile regression models. The method associates convolutional smoothing with the loss function, which becomes twice differentiable and globally convex when a non-negative kernel function is used. Thus, the parameters of the regression model can be computed by a gradient-based iterative algorithm. We establish the convergence rate and the asymptotic properties of the smoothing estimation for large samples in high dimensions. Numerical simulations show that the smoothing estimation method for censored quantile regression models improves the estimation accuracy, computational speed, and robustness over the classical parameter estimation method. The simulation results also show that parametric methods perform better than the KM method in estimating the distribution function of the censoring variable. Even if the distribution is misspecified, the smoothing estimation does not fluctuate much.
1 School of Mathematical Sciences, Nanjing Normal University, Nanjing 210023, China;
2 College of International Languages and Cultures, Hohai University, Nanjing 211100, China;