1. Introduction
In many practical situations, linear models may not be complex enough to capture the underlying relation between the response variable and some associated covariates, especially when the response variable Y is not linearly related to all the covariates. For example, suppose one is interested in estimating the relationship between an outcome variable Y and vectors of variables X and Z. The researcher is comfortable modeling the effect of X as linear but hesitates to extend the linearity to Z. One example given by Engle et al. [1] is the effect of temperature on electricity consumption for four cities. They modeled the average monthly electricity consumption as the sum of a smooth function of the monthly temperatures and a linear function of the monthly price of electricity, income and 11 other monthly dummy variables. It is natural to impose linearity on the part of the regression function involving household characteristics and a nonlinear function on the part involving temperature, since electricity consumption tends to be higher at extreme temperatures but lower at moderate temperatures. A partially linear model provides a good fit for these types of data because it allows for a regression function that maintains linearity in some variables while allowing the effect of other variables to be nonlinear.
A partially linear regression model is defined as
$$Y_i = X_i^{T}\beta + g(Z_i) + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$
where the $Y_i$'s are scalar response variables, the $X_i$ are known p-variate covariates, $Z_i$ is a scalar explanatory variable, and $g(\cdot)$ is the smooth part of the model, which is assumed to represent a smooth unparameterized functional relationship. $\beta$ is a vector of unknown parameters and the $\varepsilon_i$ are independent random errors with mean zero and finite variance given the covariates X and Z.
The partially linear model in Equation (1) is a semiparametric model since it contains both parametric and nonparametric components. It keeps the effect of each linear covariate easy to interpret and allows one to focus on particular variables that can have nonlinear effects. It may be preferable to a completely nonparametric model because of the well-known “curse of dimensionality”. Computationally, partially linear models are considerably easier to fit than additive models, in which iterative approaches such as a backfitting algorithm [2] or marginal integration [3] are necessary.
Partially linear models are widely used in biometrics, econometrics, social sciences and other fields (see [1,4]), and have been studied extensively for estimating $\beta$ and $g$. For example, Wahba [5], Engle et al. [1], and Green et al. [6] described penalized spline estimates of $\beta$ and $g$. Heckman [7] and Rice [8] proposed the polynomial method. Speckman [9] described the kernel method. Chen and Shiau [10] used a smoothing spline method. Chen [11] proposed the projection method. For more discussions about partially linear models, we refer to Härdle et al. [12] for a summary.
In most cases, investigators are more interested in the parameter $\beta$ and treat $g$ as a nuisance parameter [13]. Estimating the confidence interval for the parametric components in partially linear models using a backfitting algorithm or marginal integration can be computationally heavy. Severini and Staniswalis [14] derived the asymptotic properties for their proposed estimators of $\beta$ and $g$ under mild regularity conditions. These asymptotic properties serve as a foundation for constructing confidence intervals that are asymptotically accurate for the parameters. However, in practice, the finite-sample performance of these confidence intervals may be less satisfactory because of the complex structure of the covariance matrix, which requires estimates to be plugged in for multiple parameters. The linear components in the partially linear models can also be estimated using generalized additive models [15], but the results depend on the distribution family used in the gam function in R. When a wrong distribution family is chosen, the results could be very biased. Confidence intervals for the parametric components in the partially linear models can also be constructed based on the asymptotic normal distribution; however, this may not hold when the normality assumption fails or when the sample size is small.
Empirical likelihood provides a good alternative among the nonparametric methods that can be used to make statistical inference when the normality assumption fails or when the distribution is unspecified. The advantages of empirical likelihood compared to the bootstrap method and the jackknife method arise from it being a nonparametric method of inference based on a data-driven likelihood ratio function. As a combination of a nonparametric method and the likelihood method, on one hand, it does not require any specification of a family of distributions for the data; on the other hand, like parametric likelihood methods, it makes an automatic determination of the shape of confidence regions [16]. This property makes it a serious competitor with other nonparametric methods such as the bootstrap method and the jackknife method. Although empirical likelihood can be a very useful tool for deriving statistical inference, the use of a conventional empirical likelihood method or the profile empirical likelihood has limitations when constructing confidence intervals for each element of a large parameter vector.
Motivated by the above-mentioned concerns, this paper develops an empirical-likelihood-based procedure which can be used to make inferences for a large parameter vector in the partially linear model in Equation (1) by incorporating the projection method. The proposed method has two main advantages. First, it does not require distributional assumptions. Second, we provide theoretical justification that the proposed method can be applied to partially linear models, and the computation requirements are relatively straightforward because it does not require an asymptotic variance estimation. After the Bartlett correction, the coverage probability of the confidence interval is improved and is better than that of normal-approximation-based methods in most cases.
The structure of the paper is as follows. Section 2 gives the model formulation of the empirical likelihood for the parameter of interest and the Bartlett correction procedure for the proposed method. Section 3 studies the performance of the proposed methods through simulation studies and illustrates the method with a real study example. Section 4 gives the conclusion. All the proofs are given in Appendix A.
2. Materials and Methods
2.1. Model Formulation
Since the interest in this paper is in obtaining inference for $\beta$ only in the partially linear model, the nuisance function $g$ needs to be removed first. This is implemented by using the projection principle [17,18]. $Y$ and $X$ are first regressed on $Z$ using a nonparametric regression method, where $X$ is an $n \times p$ covariate matrix. Denote the nonparametric regressions of $Y$ on $Z$ and of $X$ on $Z$ by $E(Y \mid Z)$ and $E(X \mid Z)$, respectively. Here, without loss of generality, let X be a one-dimensional vector (for a multidimensional $X$, $E(X \mid Z)$ can be obtained for each column of $X$ separately); the effect of Z on Y and X can then be removed by using the regression residuals of Y and X given Z. For simplicity of notation, the matrix form of the partially linear model is used here:
$$Y = X\beta + g(Z) + \varepsilon. \tag{2}$$
The first step is to regress Y and X onto Z and obtain the following equation:
$$E(Y \mid Z) = E(X \mid Z)\beta + g(Z). \tag{3}$$
Then, Equation (3) is subtracted from the original model (2), and the residual model is obtained as follows:
$$Y - E(Y \mid Z) = \{X - E(X \mid Z)\}\beta + \varepsilon. \tag{4}$$
Denote $\tilde{Y} = Y - E(Y \mid Z)$ and $\tilde{X} = X - E(X \mid Z)$. Assuming $\tilde{X}^{T}\tilde{X}$ has full rank, based on Speckman [9], the estimator of $\beta$ can then be given by Equation (5) by the least squares method if $E(Y \mid Z)$ and $E(X \mid Z)$ are known:
$$\hat{\beta} = (\tilde{X}^{T}\tilde{X})^{-1}\tilde{X}^{T}\tilde{Y}. \tag{5}$$
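The formula above cannot be applied directly since $E(Y \mid Z)$ and $E(X \mid Z)$ need to be estimated appropriately. There are many methods for estimating $E(Y \mid Z)$ and $E(X \mid Z)$, including local constant smoothers [9], higher-order local polynomial estimators [19], kernel methods with varying bandwidths, smoothing and regression splines, etc. Fan and Gijbels [19] showed that within the class of linear estimators, which includes kernel and spline estimates, the local linear estimates achieve the best possible rates of convergence. Due to these desirable properties, local linear smoothers with fixed bandwidths were used for estimating the nonparametric regressions of Y and X on Z. Let $\hat{E}(Y \mid Z = z)$ and $\hat{E}(X \mid Z = z)$ be the local linear nonparametric regression estimators for $E(Y \mid Z = z)$ and $E(X \mid Z = z)$, let $K(\cdot)$ be a symmetric density function and h be a suitable bandwidth, and define $K_h(\cdot) = K(\cdot/h)/h$; then, the estimators take the form given by Fan and Gijbels [19]:
$$\hat{E}(Y \mid Z = z) = \frac{\sum_{i=1}^{n} w_i(z)\,Y_i}{\sum_{i=1}^{n} w_i(z)}, \qquad \hat{E}(X \mid Z = z) = \frac{\sum_{i=1}^{n} w_i(z)\,X_i}{\sum_{i=1}^{n} w_i(z)}, \qquad w_i(z) = K_h(Z_i - z)\{S_{n,2}(z) - (Z_i - z)S_{n,1}(z)\}, \tag{6}$$
where $S_{n,l}(z) = \sum_{i=1}^{n} K_h(Z_i - z)(Z_i - z)^{l}$ for $l = 1, 2$. $E(Y \mid Z)$ and $E(X \mid Z)$ are then replaced with their corresponding estimates in the estimating procedure (a Gaussian kernel is an example of a kernel function used in the estimating procedure). Writing $\hat{\tilde{X}}_i = X_i - \hat{E}(X \mid Z_i)$ and $\hat{\tilde{Y}}_i = Y_i - \hat{E}(Y \mid Z_i)$ for the estimated residuals, the empirical likelihood estimator of $\beta$ needs to satisfy the following estimating equation:
$$\sum_{i=1}^{n} \hat{\tilde{X}}_i\,(\hat{\tilde{Y}}_i - \hat{\tilde{X}}_i^{T}\beta) = 0. \tag{7}$$
This implies that the estimator for $\beta$ can be obtained by
$$\hat{\beta} = \Big(\sum_{i=1}^{n} \hat{\tilde{X}}_i \hat{\tilde{X}}_i^{T}\Big)^{-1} \sum_{i=1}^{n} \hat{\tilde{X}}_i \hat{\tilde{Y}}_i. \tag{8}$$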
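The projection and least squares steps behind Equations (6)–(8) can be sketched in Python for a scalar covariate. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the fixed bandwidth `h = 0.1`, and the toy data-generating model are all hypothetical choices made for the example.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_linear_fit(z, v, z0, h):
    """Local linear estimate of E(V | Z = z0) with bandwidth h."""
    k = [gaussian_kernel((zi - z0) / h) / h for zi in z]
    s1 = sum(ki * (zi - z0) for ki, zi in zip(k, z))
    s2 = sum(ki * (zi - z0) ** 2 for ki, zi in zip(k, z))
    w = [ki * (s2 - (zi - z0) * s1) for ki, zi in zip(k, z)]
    return sum(wi * vi for wi, vi in zip(w, v)) / sum(w)


def partial_residuals(z, v, h):
    """Residuals V_i minus the smoothed value of V at Z_i."""
    return [vi - local_linear_fit(z, v, zi, h) for zi, vi in zip(z, v)]


def plm_beta_hat(x, y, z, h):
    """Least squares estimate of a scalar beta from the residual model."""
    xt = partial_residuals(z, x, h)
    yt = partial_residuals(z, y, h)
    return sum(a * b for a, b in zip(xt, yt)) / sum(a * a for a in xt)


# Toy data (illustrative values only): Y = 2 X + sin(2 pi Z) + noise.
rng = random.Random(0)
n = 200
z = [rng.random() for _ in range(n)]
x = [zi ** 2 + rng.gauss(0.0, 1.0) for zi in z]
y = [2.0 * xi + math.sin(2.0 * math.pi * zi) + rng.gauss(0.0, 0.5)
     for xi, zi in zip(x, z)]
beta_hat = plm_beta_hat(x, y, z, h=0.1)
```

The weights in `local_linear_fit` reproduce any linear function of Z exactly, which is a convenient sanity check for the smoother. In practice the bandwidth would be chosen by a data-driven selector rather than fixed by hand.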
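Next, the empirical likelihood principle is applied to construct statistical inference for $\beta$. Let $p_i$ be the probability assigned to the estimated residual pair $(\hat{\tilde{X}}_i, \hat{\tilde{Y}}_i)$ of the $i$th observation. The empirical likelihood ratio function for $\beta$ can be expressed as:
$$R(\beta) = \max\Big\{\prod_{i=1}^{n} n p_i : p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i\,\hat{\tilde{X}}_i\,(\hat{\tilde{Y}}_i - \hat{\tilde{X}}_i^{T}\beta) = 0\Big\}. \tag{9}$$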
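To make the empirical likelihood ratio concrete, the following Python sketch evaluates the statistic $-2\log R(\beta)$ for a scalar parameter. It relies on the standard Lagrange multiplier form of empirical likelihood, in which the optimal weights are $p_i = 1/\{n(1 + \lambda g_i)\}$ and $\lambda$ solves $\sum_i g_i/(1 + \lambda g_i) = 0$. The function names, the bisection solver, and the small residual vectors are assumptions made for illustration only.

```python
import math


def el_log_ratio(g, tol=1e-14):
    """Return -2 log R for scalar estimating-function values g_i.

    Returns math.inf when zero lies outside the convex hull of the g_i,
    in which case no valid set of weights exists.
    """
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf
    # lambda must keep every 1 + lambda * g_i strictly positive.
    lo, hi = -1.0 / g_max, -1.0 / g_min
    width = hi - lo
    lo, hi = lo + 1e-10 * width, hi - 1e-10 * width

    def score(lam):
        # Decreasing in lam on (lo, hi), so bisection finds the unique root.
        return sum(gi / (1.0 + lam * gi) for gi in g)

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * width:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def el_stat(x_res, y_res, beta):
    """-2 log R(beta) for the residual model y_res = x_res * beta + error."""
    return el_log_ratio([a * (b - a * beta) for a, b in zip(x_res, y_res)])


# Small illustrative residual vectors (hypothetical numbers).
x_res = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
y_res = [-2.1, -0.9, 0.1, 1.2, 1.9, 3.1]
beta_ls = sum(a * b for a, b in zip(x_res, y_res)) / sum(a * a for a in x_res)
```

At the least squares estimate the estimating function sums to zero, so the multiplier is zero and the statistic vanishes; it grows as `beta` moves away from that value.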
We establish the asymptotic distribution of the empirical log-likelihood ratio statistic $-2\log R(\beta)$ under the following assumptions:
, , and is nonsingular; X and Z are correlated.
The bandwidths used in estimating and are of order .
The function is a bounded symmetric density function with compact support and satisfies and .
The functions and have bounded and continuous second derivatives.
The density function of $Z$ is bounded away from zero and has bounded continuous second derivatives.
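Theorem 1. Under Assumptions 1–5, the empirical log-likelihood ratio statistic $-2\log R(\beta_0)$, evaluated at the true parameter value $\beta_0$, converges in distribution to a chi-squared distribution with p degrees of freedom.
The proof of Theorem 1 is given in Appendix A. A confidence region for $\beta$ can be constructed based on Theorem 1 and further adjusted by using the Bartlett correction [20].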
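When $\beta$ is a vector (or when $X$ is an $n \times p$ matrix), and we are interested in a subset of the parameter vector $\beta$, say the first element $\beta_1$, we can apply the projection method again, i.e., we regress $\tilde{X}_1$, the first column of the residual matrix $\tilde{X} = X - E(X \mid Z)$, onto the space of $\tilde{X}_{-1}$, which is the remaining columns of $\tilde{X}$. Similarly, we apply the same projection principle from $\tilde{Y} = Y - E(Y \mid Z)$ to $\tilde{X}_{-1}$. Writing $X_i^{*}$ and $Y_i^{*}$ for the resulting residuals of the $i$th observation, we obtain a new residual model, i.e., $\beta_1$ should satisfy the estimating equation as follows:
$$\sum_{i=1}^{n} X_i^{*}\,(Y_i^{*} - X_i^{*}\beta_1) = 0. \tag{10}$$
Let $p_i$ be the probability assigned to $(X_i^{*}, Y_i^{*})$, where $p_i$ could be different from the $p_i$ in Theorem 1. The empirical likelihood ratio function for $\beta_1$ can be written as:
$$R(\beta_1) = \max\Big\{\prod_{i=1}^{n} n p_i : p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i\,X_i^{*}\,(Y_i^{*} - X_i^{*}\beta_1) = 0\Big\}. \tag{11}$$
Theorem 2. Under Assumptions 1–5, $-2\log R(\beta_1)$, evaluated at the true value of $\beta_1$, converges in distribution to a chi-squared distribution with one degree of freedom.
The proof of Theorem 2 is given in Appendix A. Based on Theorem 2, the empirical likelihood confidence interval for $\beta_1$ at level $1 - \alpha$ can be obtained by:
$$\{\beta_1 : -2\log R(\beta_1) \le \chi^2_{1,1-\alpha}\}, \tag{12}$$
where $\chi^2_{1,1-\alpha}$ is the $(1-\alpha)$ quantile of the chi-squared distribution with one degree of freedom.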
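The second projection and the resulting interval can be sketched as follows. This is an illustrative Python example, not the authors' code: it assumes a single nuisance column (so the projection is a one-variable least squares fit), inputs that already have the effect of Z removed, hypothetical function names, and toy data with a slope of 2 chosen only for the demonstration.

```python
import math
import random

CHI2_1_95 = 3.841459  # 0.95 quantile of the chi-squared(1) distribution


def residualize(v, u):
    """Residual of v after least squares projection onto one column u."""
    coef = sum(a * b for a, b in zip(u, v)) / sum(b * b for b in u)
    return [a - coef * b for a, b in zip(v, u)]


def el_stat(x, y, beta):
    """-2 log empirical likelihood ratio for scalar beta in y = x beta + e."""
    g = [xi * (yi - xi * beta) for xi, yi in zip(x, y)]
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf
    lo, hi = -1.0 / g_max, -1.0 / g_min
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(200):  # bisection on the monotone score in lambda
        lam = 0.5 * (lo + hi)
        if sum(gi / (1.0 + lam * gi) for gi in g) > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def el_endpoint(x, y, beta_hat, step, cutoff=CHI2_1_95):
    """Walk from beta_hat in the direction of step, then bisect on cutoff."""
    inside, outside = beta_hat, beta_hat + step
    while el_stat(x, y, outside) <= cutoff:
        inside, outside = outside, outside + step
    for _ in range(60):
        mid = 0.5 * (inside + outside)
        if el_stat(x, y, mid) <= cutoff:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


def el_interval_beta1(x1, x2, y):
    """EL interval for beta_1 after projecting out the column x2."""
    xs, ys = residualize(x1, x2), residualize(y, x2)
    sxx = sum(a * a for a in xs)
    beta_hat = sum(a * b for a, b in zip(xs, ys)) / sxx
    g = [a * (b - a * beta_hat) for a, b in zip(xs, ys)]
    step = math.sqrt(sum(gi * gi for gi in g)) / sxx  # rough scale for the search
    return (el_endpoint(xs, ys, beta_hat, -step),
            el_endpoint(xs, ys, beta_hat, step))


# Toy inputs standing in for columns already adjusted for Z.
rng = random.Random(1)
x2 = [rng.gauss(0.0, 1.0) for _ in range(100)]
x1 = [0.5 * b + rng.gauss(0.0, 1.0) for b in x2]
y = [2.0 * a - 1.0 * b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]
lower, upper = el_interval_beta1(x1, x2, y)
```

Because the interval is defined by a cutoff on the likelihood ratio rather than by an estimate plus or minus a standard error, the two endpoints need not be equidistant from the point estimate.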
The confidence interval for other components of $\beta$ can be constructed similarly.
2.2. Bartlett Correction
To further improve the accuracy of the inference, the empirical likelihood ratio may be Bartlett corrected, which reduces the coverage error to order $n^{-2}$ from the usual order $n^{-1}$ [20]. The Bartlett correction can effectively control the coverage error of the confidence interval, bringing the actual coverage closer to the nominal level. The basic idea is to multiply the threshold by a constant $1 + b/n$ instead of 1, where $b$ is the Bartlett correction constant. Because it is very difficult to obtain an exact expression for $b$, we give an estimator of $b$ by using the bootstrap procedure, which has successfully been applied in a more complex setting by Chen and Cui [21].
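For orientation, the standard form of the Bartlett correction for empirical likelihood [20] can be summarized as below. The notation is assumed here for illustration: $\ell(\beta_0)$ is the empirical log-likelihood ratio statistic at the true parameter value, $p$ is the dimension of the parameter, $c_\alpha$ is the $(1-\alpha)$ quantile of the chi-squared distribution with $p$ degrees of freedom, and $b$ is the Bartlett constant.

```latex
\begin{align*}
E\{\ell(\beta_0)\} &= p\left(1 + \frac{b}{n}\right) + O(n^{-2}),\\
P\{\ell(\beta_0) \le c_\alpha\} &= 1 - \alpha + O(n^{-1}),\\
P\left\{\ell(\beta_0) \le c_\alpha\left(1 + \frac{b}{n}\right)\right\} &= 1 - \alpha + O(n^{-2}).
\end{align*}
```

The first line is what makes a bootstrap estimate possible: an estimate of the mean of the statistic gives an estimate of $1 + b/n$ after dividing by $p$.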
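The Bartlett correction of the empirical likelihood confidence interval for a parameter of interest $\beta_1$ in the partially linear model in Equation (1) is constructed by the following procedure, in which $\ell(\beta_1) = -2\log R(\beta_1)$ denotes the empirical log-likelihood ratio statistic and $b$ the Bartlett correction constant. The procedure for another component of $\beta$, say $\beta_2$, would be similar.
- First, the nonparametric regression method is used to regress Y and X on the nonparametric component Z. The reduced partial residuals follow a linear model of the form $\tilde{Y} = \tilde{X}\beta + \varepsilon$, where $\tilde{Y} = Y - E(Y \mid Z)$ and $\tilde{X} = X - E(X \mid Z)$. We use $\hat{E}(Y \mid Z)$ and $\hat{E}(X \mid Z)$ to replace $E(Y \mid Z)$ and $E(X \mid Z)$ in the estimating procedure.
- Then, the first column of $\tilde{X}$ (denoted by $\tilde{X}_1$) is regressed on the rest of the columns (denoted by $\tilde{X}_{-1}$). The residual $X^{*}$ serves as the new fixed covariate of $\beta_1$, and the residual $Y^{*}$ of regressing $\tilde{Y}$ on $\tilde{X}_{-1}$ serves as the new response variable. The residual model is obtained and given by $Y^{*} = X^{*}\beta_1 + \varepsilon^{*}$.
- We treat the residual model as the new linear model. The bootstrap procedure for estimating the Bartlett correction factor in the new linear model is as follows:
  (a) Generate bootstrap resamples of size n by sampling with replacement from the samples $X^{*}$ and $Y^{*}$, respectively, after the projection; then, calculate $\ell^{*}(\hat{\beta}_1)$ based on the resamples, where $\hat{\beta}_1$ is the global maximum empirical likelihood estimator of $\beta_1$ based on the original samples $X^{*}$ and $Y^{*}$.
  (b) Repeat (a) B times to obtain $\ell^{*1}(\hat{\beta}_1), \ldots, \ell^{*B}(\hat{\beta}_1)$ and $\bar{\ell}^{*} = B^{-1}\sum_{k=1}^{B}\ell^{*k}(\hat{\beta}_1)$, which is the bootstrap estimator of $E\{\ell(\beta_1)\}$.
- The bootstrap estimator of $b$ is $\hat{b} = n(\bar{\ell}^{*} - 1)$. In consequence, the Bartlett corrected confidence region is constructed by $\{\beta_1 : \ell(\beta_1) \le (1 + \hat{b}/n)\,\chi^2_{1,1-\alpha}\}$, where $\chi^2_{1,1-\alpha}$ is the $(1-\alpha)$ quantile of the chi-squared distribution with one degree of freedom.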
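The bootstrap steps above can be sketched in Python for a scalar parameter. This is a hedged illustration, not the authors' implementation: it assumes that the rows of the projected sample are resampled as pairs, it skips resamples for which the statistic is infinite (a design choice made here, not taken from the paper), and the function names and toy data are hypothetical.

```python
import math
import random

CHI2_1_95 = 3.841459  # 0.95 quantile of the chi-squared(1) distribution


def el_stat(x, y, beta):
    """-2 log empirical likelihood ratio for scalar beta in y = x beta + e."""
    g = [xi * (yi - xi * beta) for xi, yi in zip(x, y)]
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf
    lo, hi = -1.0 / g_max, -1.0 / g_min
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(200):  # bisection on the monotone score in lambda
        lam = 0.5 * (lo + hi)
        if sum(gi / (1.0 + lam * gi) for gi in g) > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def bartlett_factor(x, y, n_boot=200, seed=0):
    """Bootstrap mean of the statistic at beta_hat, estimating 1 + b/n."""
    rng = random.Random(seed)
    n = len(x)
    beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows as pairs
        value = el_stat([x[i] for i in idx], [y[i] for i in idx], beta_hat)
        if math.isfinite(value):  # assumption: drop degenerate resamples
            stats.append(value)
    return sum(stats) / len(stats)


# Toy residual model (illustrative values only).
rng = random.Random(2)
x_star = [rng.gauss(0.0, 1.0) for _ in range(60)]
y_star = [2.0 * a + rng.gauss(0.0, 1.0) for a in x_star]
factor = bartlett_factor(x_star, y_star, n_boot=100)
b_hat = len(x_star) * (factor - 1.0)
corrected_cutoff = factor * CHI2_1_95  # replaces the usual chi-squared cutoff
```

The corrected cutoff would then be used in place of the plain chi-squared quantile when searching for the interval endpoints.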
The Bartlett corrected confidence interval for $\beta_1$ is thus constructed.
3. Results
3.1. Simulation Studies
In the simulation studies, we studied the performance of the proposed method in getting the inference of the parameter of interest in the partially linear model (1). We first simulated Z from a distribution with sample size n. The true value was set to be , and we aimed to estimate the first component of . X was set to be the sum of two matrices and , where was the matrix composed of vectors , and . was the matrix of error terms composed of n samples from the scaled multivariate normal distribution with zero mean and a compound symmetry covariance matrix with diagonal 1 and off-diagonal ; the scale parameter was . The columns of the X matrix were functions of Z and were thus correlated. The nonparametric component took the function . Two cases for the distribution of the error term were considered,
- Case 1: $\varepsilon$ follows a normal distribution with mean 0 and variance .
- Case 2: $\varepsilon$ follows the scaled log-normal distribution such that $\varepsilon$ has mean 0 and variance .
In the simulations, the sample sizes considered were 50, 100, and 200. In each simulation, we generated 1000 independent data sets and constructed the confidence interval for each data set. In estimating the nonparametric regressions $E(Y \mid Z)$ and $E(X \mid Z)$, the direct plug-in method was used to select the bandwidth of a local linear Gaussian kernel regression estimate, as described by Ruppert, Sheather, and Wand [22]. The proposed method was compared with the normal-based method and the generalized additive model method (gam) [15].
Table 1 gives the average results from the 1000 simulations (the endpoints of the confidence intervals were obtained by the medians of the 1000 simulation results, and the confidence interval lengths were computed using the difference of the two endpoints). In Table 1, Est refers to the estimated value; Norm, Gam, EL, and ELb refer to the normal-based method, the gam function in R, the empirical-likelihood-based method without Bartlett correction, and the empirical-likelihood-based method with Bartlett correction, respectively. Length and coverage probability refer to the respective length and coverage probability of the confidence intervals constructed using the four different methods. It is worth mentioning that each confidence interval based on the normal approximation is symmetric while the confidence interval based on empirical likelihood is not symmetric. In the simulation, the Gaussian distribution was used as the distribution family within the gam function under both error cases.
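The way the table entries are summarized can be illustrated with a short Python sketch: coverage is the fraction of simulated intervals containing the true value, the reported endpoints are medians over the simulations, and the length is their difference. The function name, the four demo intervals, and the true value 2.0 are hypothetical and serve only as an example.

```python
from statistics import median


def summarize_intervals(intervals, true_value):
    """Coverage rate and median-endpoint length for (lower, upper) pairs."""
    covered = sum(1 for lo, hi in intervals if lo <= true_value <= hi)
    lower = median(lo for lo, _ in intervals)
    upper = median(hi for _, hi in intervals)
    return {
        "coverage": covered / len(intervals),
        "lower": lower,
        "upper": upper,
        "length": upper - lower,
    }


# Four made-up intervals standing in for the simulation output.
demo = [(1.2, 2.9), (1.5, 2.6), (2.1, 3.0), (1.0, 2.4)]
summary = summarize_intervals(demo, true_value=2.0)
```

Note that a length computed from median endpoints is generally not the same as the median of the individual interval lengths.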
The simulation results in Table 1 indicate that the Bartlett correction indeed improved the statistical inference. The coverage probability was improved after the Bartlett correction, especially when the sample size was small, where the normal approximation method may not be appropriate. When the sample size is small (for example, $n = 50$), our proposed method tends to enlarge the confidence interval to have a better coverage probability for the true parameter. In that case, the length of the confidence interval for the Bartlett correction is larger than that of the normal approximation, the gam method, and the empirical likelihood without Bartlett correction, but the coverage probability is the closest to the nominal level of 0.95. When the sample size becomes larger, the length of the confidence interval using the proposed method tends to be close to or shorter than that of the normal approximation method, and yet the proposed method still has slightly better or equally good coverage probability compared to the normal approximation method and the gam method.
3.2. A Real Study Example
The proposed method is illustrated by an application to the Boston housing data set, which was obtained from the StatLib archive and has been extensively used in regression analysis. The data set consists of the median value of owner-occupied homes in 506 US census tracts in the Boston area in 1970, as well as several variables which might explain the variation in housing values. Based on the correlations and multicollinearity analysis, we fit a partially linear model with the variable of interest MEDV (median value of owner-occupied homes in USD 1000) linearly related to the predictors PTRATIO (pupil–teacher ratio by town) and RM (number of rooms per dwelling), and nonlinearly related to the variable LSTAT (% lower status of the population). The partially linear model has the following form:
$$\mathrm{MEDV} = \beta_1\,\mathrm{PTRATIO} + \beta_2\,\mathrm{RM} + g(\mathrm{LSTAT}) + \varepsilon.$$
The proposed method was used to construct the 95% confidence interval for $\beta_1$, the coefficient of PTRATIO. The proposed empirical-likelihood-based Bartlett corrected 95% confidence interval for $\beta_1$ was (2.375, 4.656), and the normal-based 95% confidence interval was (2.406, 4.502). Both methods indicated a positive linear relationship between PTRATIO and MEDV, with the proposed method’s confidence interval slightly wider than the normal-based confidence interval. Based on our simulation results for the coverage probability under a large sample size, the confidence interval obtained from the proposed method was comparable with the normal-based confidence interval and was trustworthy.
4. Discussion
In this paper, an empirical-likelihood-based method to construct the confidence interval for the linear components in partially linear models was proposed. Simulation studies showed that the length of the confidence interval for the proposed empirical likelihood with Bartlett correction method was larger than that of the normal approximation when the sample size was small, but the coverage probability was the closest to the nominal level. When the sample size was larger, the confidence interval for the proposed empirical likelihood with Bartlett correction method had a slightly shorter length and a coverage probability similar to that of the normal-based method and the gam method, which indicated that the confidence interval constructed by the proposed method was more desirable in estimating the parameter of interest. The above findings mostly hold under both normally distributed and non-normally distributed error terms. This supports the numerical robustness of our proposed method, which also makes it a practically useful tool in real studies where we usually do not know the distribution of the data. The trade-off of the proposed method is that it requires more computation than the normal-approximation method.
In summary, this proposed method gives better inference in terms of the length and coverage probabilities of the confidence intervals compared to the normal-approximation-based method. It does not impose any restrictions on the data distribution, and the computations are relatively straightforward for partially linear models. This proposed method is recommended for estimating and constructing confidence intervals for the linear components in partially linear models, particularly when the sample size is small.
Methodology, H.S.; software, H.S. and L.C.; formal analysis, H.S. and L.C.; writing—original draft, revision, H.S.; writing—review and editing, L.C. All authors have read and agreed to the published version of the manuscript.
The Boston housing data used in the paper were obtained from the StatLib archive (
The authors would like to thank the editor and three referees for their insightful comments that significantly improved an earlier version of this paper.
The authors declare no conflict of interest.
Table 1. Confidence interval and coverage probability for partially linear models.
Error | n | Est | Length (Norm) | Length (Gam) | Length (EL) | Length (ELb) | Coverage (Norm) | Coverage (Gam) | Coverage (EL) | Coverage (ELb)
---|---|---|---|---|---|---|---|---|---|---
Norm | 50 | 1.957 | 1.496 | 1.498 | 1.437 | 1.520 | 0.935 | 0.936 | 0.924 | 0.945
Norm | 100 | 2.047 | 1.052 | 1.085 | 1.016 | 1.049 | 0.963 | 0.963 | 0.949 | 0.953
Norm | 200 | 2.026 | 0.706 | 0.704 | 0.689 | 0.701 | 0.946 | 0.950 | 0.944 | 0.950
Non-norm | 50 | 1.975 | 1.368 | 1.289 | 1.278 | 1.386 | 0.946 | 0.934 | 0.944 | 0.950
Non-norm | 100 | 2.05 | 1.012 | 1.051 | 0.980 | 1.008 | 0.973 | 0.962 | 0.954 | 0.956
Non-norm | 200 | 2.027 | 0.681 | 0.689 | 0.668 | 0.677 | 0.944 | 0.940 | 0.930 | 0.946
Appendix A
First, we give the following fact, which is used later in the proof of the theorem. Its proof can be shown by Assumptions 2–5:
From
Let
Using arguments similar to those in the proof of Theorem 3.2 of Owen [
To show
These arguments imply that
We continue to use the notations
Since in the linear model case, we have proved that
To show Equation (A4), we first need to show
Equation (A5) holds because
References
1. Engle, R.; Granger, C.; Rice, J.; Weiss, A. Nonparametric estimates of the relation between weather and electricity sales. J. Am. Stat. Assoc.; 1986; 81, pp. 310-320. [DOI: https://dx.doi.org/10.1080/01621459.1986.10478274]
2. Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman and Hall: London, UK, 1990.
3. Linton, O.; Nielsen, J. A Kernel Method of Estimating Structured Nonparametric Regression Based on Marginal Integration. Biometrika; 1995; 82, pp. 93-100. [DOI: https://dx.doi.org/10.1093/biomet/82.1.93]
4. Gray, R. Spline-based tests in survival analysis. Biometrics; 1994; 50, pp. 640-652. [DOI: https://dx.doi.org/10.2307/2532779]
5. Wahba, G. Cross validated spline methods for the estimation of multivariate functions from data on functionals. Proceedings of the Iowa State University Statistical Laboratory 50th Anniversary Conference, Ames, IA, USA, 13–15 June 1984; David, H.A.; David, H.T. The Iowa State University Press: Ames, IA, USA, 1984; pp. 205-235.
6. Green, P.; Jennison, C.; Seheult, A. Analysis of field experiments by least squares smoothing. J. R. Stat. Soc. Ser. B; 1985; 47, pp. 299-315. [DOI: https://dx.doi.org/10.1111/j.2517-6161.1985.tb01358.x]
7. Heckman, N. Spline smoothing in partly linear models. J. R. Stat. Soc. Ser. B; 1986; 48, pp. 244-248. [DOI: https://dx.doi.org/10.1111/j.2517-6161.1986.tb01407.x]
8. Rice, J. Convergence rates for partially splined models. Stat. Probab. Lett.; 1986; 4, pp. 203-208. [DOI: https://dx.doi.org/10.1016/0167-7152(86)90067-2]
9. Speckman, P. Kernel Smoothing in Partial Linear Models. J. R. Stat. Soc. Ser. B; 1988; 50, pp. 413-436. [DOI: https://dx.doi.org/10.1111/j.2517-6161.1988.tb01738.x]
10. Chen, H.; Shiau, J.J.H. Data-Driven Efficient Estimators for a Partially Linear Model. Ann. Stat.; 1994; 22, pp. 211-237. [DOI: https://dx.doi.org/10.1214/aos/1176325366]
11. Chen, H. Convergence rates for parametric components in a partly linear model. Ann. Stat.; 1988; 16, pp. 136-146. [DOI: https://dx.doi.org/10.1214/aos/1176350695]
12. Härdle, W.; Liang, H.; Gao, J. Partially Linear Models; Springer Physica: Heidelberg, Germany, 2000.
13. Liang, H. Estimation in Partially Linear Models and Numerical Comparisons. Comput. Stat. Data Anal.; 2006; 50, pp. 675-687. [DOI: https://dx.doi.org/10.1016/j.csda.2004.10.007] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/20174596]
14. Severini, T.; Staniswalis, J. Quasilikelihood estimation in semiparametric models. J. Am. Stat. Assoc.; 1994; 89, pp. 501-511. [DOI: https://dx.doi.org/10.1080/01621459.1994.10476774]
15. Hastie, T.J. Generalized Additive Models; Wadsworth and Brooks/Cole: Pacific Grove, CA, USA, 1992.
16. Owen, A. Empirical Likelihood; Chapman and Hall/CRC: London, UK, 2001.
17. Robinson, P.M. Root-n-consistent semiparametric regression. Econometrica; 1988; 56, pp. 931-954. [DOI: https://dx.doi.org/10.2307/1912705]
18. Su, H.; Liang, H. An empirical likelihood-based method for comparison of treatment effects-test of equality of coefficients in linear models. Comput. Stat. Data Anal.; 2010; 54, pp. 1079-1088. [DOI: https://dx.doi.org/10.1016/j.csda.2009.10.018] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/20161586]
19. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Chapman and Hall/CRC: London, UK, 1996.
20. DiCiccio, T.; Hall, P.; Romano, J. Empirical likelihood is Bartlett-correctable. Ann. Stat.; 1991; 19, pp. 1053-1061. [DOI: https://dx.doi.org/10.1214/aos/1176348137]
21. Chen, S.; Cui, H. On Bartlett correction of empirical likelihood in the presence of nuisance parameters. Biometrika; 2006; 93, pp. 215-220. [DOI: https://dx.doi.org/10.1093/biomet/93.1.215]
22. Ruppert, D.; Sheather, S.J.; Wand, M.P. An effective bandwidth selector for local least squares regression. J. Am. Stat. Assoc.; 1995; 90, pp. 1257-1270. [DOI: https://dx.doi.org/10.1080/01621459.1995.10476630]
23. Liang, H.; Wang, S.; Carroll, R. Partially linear models with missing response variables and error-prone covariates. Biometrika; 2007; 94, pp. 185-198.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Partially linear models find extensive application in biometrics, econometrics, social sciences, and various other fields due to their versatility in accommodating both parametric and nonparametric elements. This study aims to establish statistical inference for the parametric component effects within these models, employing a nonparametric empirical likelihood approach. The proposed method involves a projection step to eliminate the nuisance nonparametric component and utilizes an empirical-likelihood-based technique, along with the Bartlett correction, to enhance the coverage probability of the confidence interval for the parameter of interest. This method demonstrates robustness in handling normally and non-normally distributed errors. The proposed empirical likelihood ratio statistic converges to a limiting chi-square distribution under certain regularity conditions. Simulation studies demonstrate that this method provides better inference in terms of coverage probabilities compared to the conventional normal-approximation-based method. The proposed method is illustrated by analyzing the Boston housing data from a real study.
Details
1 School of Computing, Montclair State University, Montclair, NJ 07043, USA
2 Department of Mathematics and Statistics, Rochester Institute of Technology, Rochester, NY 14632, USA;