1. Introduction
Regression analysis is a statistical method used to determine the relationship between a response variable and one or more predictor variables [1]. Regression is divided into three types: parametric, nonparametric, and semiparametric. Parametric regression is used when the shape of the regression curve is known, whether linear, quadratic, cubic, or otherwise, whereas nonparametric regression is used when the shape of the regression curve is unknown. Semiparametric regression combines parametric and nonparametric regression.
Nonparametric regression is a method used to model regression curves of unknown shape [2]. It is a more flexible approach because the data themselves determine the form of the estimated regression curve, without being influenced by the researcher's subjectivity [3]. Estimators used in nonparametric regression include splines, Fourier series, kernels, and local polynomials, each with its own characteristics for approximating an unknown regression function. Nonparametric regression has been widely studied with single estimators [4,5] and mixed estimators [6,7]. However, its application is still limited to nonspatial data, although many problems involve spatial data. Spatial data are data that contain both measurement and location information [8]. Methods used in spatial data analysis include spatial autocorrelation, the spatial error model, geographically weighted regression, and others.
According to [8], Geographically Weighted Regression (GWR) is a statistical method for analyzing spatial heterogeneity. Spatial heterogeneity means that the same predictor variable exerts unequal influence at different locations within a study area. The GWR model produces parameter estimates that are local to each point or location where the data are observed. Research on GWR [9,10] shows that the GWR model outperforms the global model because it can accommodate spatial heterogeneity. GWR was further developed into a nonparametric GWR with a single estimator [11,12,13]. In addition, the relationship between the response variable and each predictor variable can differ [14]. Therefore, this study develops a nonparametric GWR model with a mixed truncated spline and Fourier series estimator. The truncated spline estimator is expected to handle curve patterns that change at certain subintervals [11], whereas the Fourier series estimator is expected to model repeating data patterns [6]. This study aims to construct a Geographically Weighted Nonparametric Regression (GWNR) model with a mixed truncated spline and Fourier series estimator, to estimate the parameters of the GWNR model with the Weighted Maximum Likelihood Estimator (WMLE) method, and to evaluate the properties of the mixed GWNR estimator.
The remainder of this paper is organized as follows. Section 2 discusses the GWNR model and the WMLE estimation method. Section 3 presents the parameter estimation of the GWNR model, the unbiasedness and linearity of the estimator, and a data application. Section 4 presents the conclusions.
2. Materials and Methods
2.1. Geographically Weighted Nonparametric Regression (GWNR)
The GWNR model extends GWR to nonparametric regression. Given paired data, the relationship between the predictor variables and the response variable is assumed to follow a multivariable regression model [11]:
(1)
where the response variable is modeled by a regression curve of unknown shape that is assumed to be additive, P is the number of predictor variables approximated with truncated spline functions, Q is the number of predictor variables approximated with Fourier series functions, and n is the number of observations. If the regression curve is approximated with truncated spline functions and Fourier series, Equation (1) can be written as the sum of a truncated spline component with P predictor variables and a Fourier series component with Q other predictor variables. In matrix notation, it can be written as:
(2)
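As an illustration of how the two components might enter the design matrix of Equation (2), the following Python sketch builds a truncated spline basis and a Fourier series basis for one predictor each. The function names, the single shared intercept, and the specific Fourier form (a linear trend plus cosine terms) are assumptions made for illustration and may differ from the exact definitions used in this paper.

```python
import numpy as np

def truncated_spline_basis(x, knots, degree=1):
    """Columns x, ..., x^degree, then (x - t)_+^degree for each knot t (degree >= 1)."""
    x = np.asarray(x, dtype=float)
    powers = [x ** m for m in range(1, degree + 1)]
    truncated = [np.maximum(x - t, 0.0) ** degree for t in knots]
    return np.column_stack(powers + truncated)

def fourier_basis(z, n_osc=1):
    """Assumed form: a linear trend z, then cos(h * z) for h = 1, ..., n_osc."""
    z = np.asarray(z, dtype=float)
    cosines = [np.cos(h * z) for h in range(1, n_osc + 1)]
    return np.column_stack([z] + cosines)

def mixed_design_matrix(x, z, knots, degree=1, n_osc=1):
    """Intercept, spline columns for x, Fourier columns for z (illustrative layout)."""
    intercept = np.ones((len(x), 1))
    return np.hstack([
        intercept,
        truncated_spline_basis(x, knots, degree),
        fourier_basis(z, n_osc),
    ])

# Hypothetical toy data: four observations, one knot at 1.5, one oscillation term.
X = mixed_design_matrix([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0], knots=[1.5])
print(X.shape)
```

With a linear spline, one knot, and one oscillation term, each observation contributes five columns: the intercept, x, the truncated term, z, and cos(z).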
2.2. Weighted Maximum Likelihood Estimator (WMLE)
Maximum likelihood estimation of the parameters from a sample of size n with a given distribution can be written as:
so that the likelihood function is
(3)
Next, multiplying by the weighting matrix gives the log-likelihood function in weighted form [15]:
(4)
Weights play a central role in the GWNR model because they represent the position of each observation relative to the others. One weighting scheme that can be used is the Gaussian kernel function [8]. The GWNR parameters are estimated with the WMLE method based on Equation (4).
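A minimal sketch of Gaussian kernel weighting, assuming Euclidean distance between coordinate pairs and the common form w_ij = exp(-(d_ij / b)^2 / 2) with a fixed bandwidth b. The function name, the toy coordinates, and the bandwidth value are hypothetical.

```python
import numpy as np

def gaussian_kernel_weights(coords, bandwidth):
    """Return an n x n matrix whose row i holds the weights used at location i."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all locations (assumed distance metric).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

# Hypothetical coordinates (u, v) for three locations.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
W = gaussian_kernel_weights(coords, bandwidth=1.0)
print(W.round(3))
```

Each location receives weight 1 for itself, and the weight decays smoothly with distance, so nearby observations dominate the local fit.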
The parameters of the mixed GWNR model are estimated with the WMLE method in the following steps:
- Defining the mixed GWNR model
- Assuming a distribution for the error term
- Determining the distribution of y
- Forming the likelihood function
- Forming the weighted likelihood function
- Taking the first partial derivatives of the weighted likelihood function with respect to the mixed GWNR model parameters
- Obtaining the estimates of the mixed GWNR model parameters.
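The steps above can be sketched numerically for a single location. The sketch assumes normal errors and a weighted log-likelihood equal to the weight-scaled sum of normal log-densities; setting its first derivatives to zero then gives weighted normal equations. The function name `local_wmle` and the variance formula that follows from this assumed form are illustrative, not the paper's stated result.

```python
import numpy as np

def local_wmle(X, y, w):
    """Maximize sum_j w_j * log N(y_j | x_j' beta, sigma2) for one location."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    XtW = X.T * w  # same as X' W with W = diag(w)
    # Zero gradient in beta gives (X' W X) beta = X' W y.
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    resid = y - X @ beta
    # Zero gradient in sigma2 (under the assumed likelihood form).
    sigma2 = (w * resid ** 2).sum() / w.sum()
    return beta, sigma2

# Hypothetical toy data lying exactly on y = 1 + 2x, with arbitrary weights.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
beta, sigma2 = local_wmle(X, 1.0 + 2.0 * x, w=[1.0, 0.8, 0.5, 0.2])
print(beta, sigma2)
```

Repeating the call with the weight row of each location yields one local parameter vector per location, which is what distinguishes the local model from a single global fit.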
3. Results
3.1. Parameter Estimation
The parameters of the GWNR model with mixed estimators are estimated using the Weighted Maximum Likelihood Estimator (WMLE). The WMLE method requires the distribution of the response variable to be known in advance. A weighting matrix is then determined for each location; the weighting used is a fixed Gaussian kernel function. The distribution of the GWNR model is given in Lemma 1.
Lemma 1. Given the GWNR model in Equation (2) with an error term that is normally distributed with mean zero, the response y is normally distributed with mean
and variance equal to the error variance, where:
P: number of spline components
M: polynomial degree of spline
R: number of knot points
Q: number of Fourier components
H: number of oscillation parameters.
Lemma 1 is proven in Appendix A.
Theorem 1. Given the model in Equation (2) with a normally distributed error term with mean zero, and the weighted likelihood function in Equation (4), the MLE method yields the following estimators of the truncated spline and Fourier series parameters:
where:
t = knot point of the spline component
h = oscillation parameter of the Fourier series component.
Proof. The proof is given in Appendix B. □
3.2. Unbiased and Linear Estimator Properties
Lemma 2. If the truncated spline component parameter estimator of the GWNR model with a mixed estimator approach follows Equation (4), then it is an unbiased estimator and belongs to the class of linear estimators in the observations y.
The proof of Lemma 2 is given in Appendix C.
Lemma 3. If the Fourier series component parameter estimator of the GWNR model with a mixed estimator approach follows Equation (4), then it is an unbiased estimator and belongs to the class of linear estimators in the observations y.
Lemma 3 is the last lemma, and its proof is given in Appendix D.
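Both properties can be checked numerically for a generic weighted estimator of the form C y with C = (X'WX)^(-1) X'W, which is assumed here as a stand-in for the estimators of Theorem 1. Linearity holds because C does not depend on y, and unbiasedness follows when C X equals the identity matrix, since then E[C y] = C X beta = beta. The design matrix and weights below are hypothetical random values.

```python
import numpy as np

def estimator_matrix(X, w):
    """Return C = (X' W X)^{-1} X' W, so that beta_hat = C @ y."""
    X = np.asarray(X, dtype=float)
    XtW = X.T * np.asarray(w, dtype=float)
    return np.linalg.solve(XtW @ X, XtW)

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(8), rng.normal(size=8), rng.normal(size=8)])
w = rng.uniform(0.2, 1.0, size=8)  # hypothetical positive kernel weights
C = estimator_matrix(X, w)

# Linearity: the estimate is a fixed matrix applied to y.
y1, y2 = rng.normal(size=8), rng.normal(size=8)
linear_ok = np.allclose(C @ (2.0 * y1 + y2), 2.0 * (C @ y1) + C @ y2)

# Unbiasedness: E[C y] = C X beta = beta whenever C X = I.
unbiased_ok = np.allclose(C @ X, np.eye(X.shape[1]))
print(linear_ok, unbiased_ok)
```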
If the parameter estimators are given by Theorem 1, then the corresponding function estimators are given by:
so that the following is obtained:
Next, the function estimators are derived as follows. Substituting the estimators from Theorem 1 gives:
and
As a result, the estimator is
where the hat matrix depends on the knot points t and the oscillation parameter h of the mixed GWNR model with the truncated spline and Fourier series estimator. □
3.3. Data Application
The data used are secondary data from [16,17,18,19,20,21]. The research variables are the percentage of the poor population, the Human Development Index (HDI), the open unemployment rate (TPT), and the longitude and latitude coordinates for 81 districts/cities on Sulawesi Island.
The estimated parameters of the GWNR model are applied to the 2020 poverty data on Sulawesi Island in the following steps:
- Making scatter plots of each predictor variable against y
- Defining the initial model
- Selecting the optimum knots and oscillation parameters
- Estimating the parameters of the global model with the OLS method based on the initial model
- Testing the assumption of spatial heterogeneity on the residuals of the global model
- Determining the weighting matrix
- Estimating the parameters of the GWNR model with the WMLE method
- Choosing the best model based on MSE and R2
- Drawing conclusions
Following the steps above, one predictor variable is assigned to the Fourier series component and the other to the spline component, based on the scatter plots. The test of spatial assumptions shows that the assumption of spatial heterogeneity is met, so the global model is less suitable because its residuals are not homogeneous. One alternative is the GWNR model, which is expected to overcome heteroskedasticity by generating a local model for each location. Some of the resulting local models are:
(5)
(6)
(7)
(8)
(9)
(10)
where:
yken = estimated poverty percentage for Kendari City
ymks = estimated poverty percentage for Makassar City
yman = estimated poverty percentage for Manado City
ypal = estimated poverty percentage for Palu City
ygor = estimated poverty percentage for Gorontalo City
ymaj = estimated poverty percentage for Mamuju City.
The mixed GWNR model with oscillation parameter k = 1 and a linear spline with t = 1 gave an MSE of 3.65 and an R2 of 74.65 percent. The local models above show that poverty on Sulawesi Island is influenced by HDI and TPT: an increase in HDI lowers the percentage of poverty, whereas an increase in TPT raises it.
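The two goodness-of-fit criteria used for model selection can be computed as in the sketch below. It assumes that MSE is the residual sum of squares divided by the number of observations and that R2 is one minus the ratio of residual to total sum of squares; the paper may use a different divisor. The function name and the toy values are hypothetical and unrelated to the Sulawesi data.

```python
import numpy as np

def mse_and_r2(y, y_hat):
    """Return (MSE, R2) for observed y and fitted y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sse = ((y - y_hat) ** 2).sum()      # residual sum of squares
    sst = ((y - y.mean()) ** 2).sum()   # total sum of squares
    return sse / y.size, 1.0 - sse / sst

# Hypothetical observed and fitted values.
mse, r2 = mse_and_r2([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(mse, r2)
```

Among candidate combinations of knots and oscillation parameters, the preferred model under these criteria is the one with the smallest MSE and the largest R2.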
4. Conclusions
Estimation of GWNR using the truncated spline and Fourier series was successfully formulated. It was found that:
The GWNR model using a mixed estimator of truncated spline and Fourier series is
where the model consists of a truncated spline component, a Fourier series component, and a residual component.
The estimators of the GWNR model are unbiased and linear in the observations of the response variable y.
Conceptualization: L.L., I.N.B. and V.R.; methodology, L.L. and I.N.B.; writing-original draft preparation, L.L.; writing—review and editing, L.L., I.N.B. and V.R. All authors have read and agreed to the published version of the manuscript.
Not applicable.
The authors declare no conflict of interest.
References
1. Draper, N.R.; Smith, H. Applied Regression Analysis; 3rd ed. John Wiley & Sons Inc.: Hoboken, NJ, USA, 2014; pp. 1-716. [DOI: https://dx.doi.org/10.1002/9781118625590]
2. Budiantara, I.N. The combination of spline and kernel estimator for nonparametric regression and its properties. Appl. Math. Sci.; 2015; 9, pp. 6083-6094. [DOI: https://dx.doi.org/10.12988/ams.2015.58517]
3. Cheng, M.; Paige, R.L.; Sun, S.; Yan, K. Variance reduction for kernel estimators in clustered/longitudinal data analysis. J. Stat. Plan. Inference; 2010; 140, pp. 1389-1397. [DOI: https://dx.doi.org/10.1016/j.jspi.2009.09.026]
4. Octavanny, M.A.D.; Budiantara, I.N.; Ratnasari, V. Pemodelan faktor-faktor yang memengaruhi provinsi jawa timur menggunakan pendekatan regresi semiparametrik spline. J. Sains Seni ITS; 2017; 6, pp. 1-7.
5. Hidayat, M.F.; Achmad, R.F.A.; Solimun. Estimation of truncated spline function in non-parametric path analysis based on weighted least square (WLS). IOP Conf. Ser. Mater. Sci. Eng.; 2019; 546, pp. 5-10. [DOI: https://dx.doi.org/10.1088/1757-899X/546/5/052027]
6. Sudiarsa, W.; Budiantara, I.N.; Suhartono, S.; Purnami, S.W. Combined estimator fourier series and spline truncated in multivariable nonparametric regression. Appl. Math. Sci.; 2015; 9, pp. 4997-5010. [DOI: https://dx.doi.org/10.12988/ams.2015.55394]
7. Mariati, N.P.A.M.; Budiantara, I.N.; Ratnasari, V. Combination estimation of smoothing spline and fourier series in nonparametric regression. J. Math.; 2020; 2020, 4712531. [DOI: https://dx.doi.org/10.1155/2020/4712531]
8. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression, the Analysis of Spatially Varying Relationships; John Wiley & Sons Ltd.: Chichester, UK, 2002.
9. Tang, J.; Gao, F.; Liu, F.; Zhang, W.; Qi, Y. Understanding spatio-temporal characteristics of urban travel demand based on the combination of GWR and GLM. Sustainability; 2019; 11, 5525. [DOI: https://dx.doi.org/10.3390/su11195525]
10. Dziauddin, M.F. Estimating land value uplift around light rail transit stations in Greater Kuala Lumpur: An empirical study based on geographically weighted regression (GWR). Res. Transp. Econ.; 2019; 74, pp. 10-20. [DOI: https://dx.doi.org/10.1016/j.retrec.2019.01.003]
11. Sifriyani; Kartiko, S.H.; Budiantara, I.N.; Gunardi. Development of nonparametric geographically weighted regression using truncated spline approach. Songklanakarin J. Sci. Technol.; 2018; 40, pp. 909-920.
12. Fitri, N.; Sifriyani, S.; Yuniarti, D. Nonparametric geographically weighted regression dengan pendekatan spline truncated. Pros. Semin. Nas. Mat. Dan Stat.; 2019; pp. 98-105.
13. Sifriyani, S. Simultaneous hypothesis testing of multivariable nonparametric spline regression in the GWR model. Int. J. Stat. Probab.; 2019; 8, 32. [DOI: https://dx.doi.org/10.5539/ijsp.v8n4p32]
14. Nurcahayani, H.; Budiantara, I.N.; Zain, I. The curve estimation of combined truncated spline and fourier series estimator for multiresponse nonparametric regression. Mathematics; 2021; 9, 1141. [DOI: https://dx.doi.org/10.3390/math9101141]
15. Akbarov, A.; Wu, S. Warranty claim forecasting based on weighted maximum likelihood estimator. Qual. Reliab. Eng. Int.; 2012; 28, pp. 663-669. [DOI: https://dx.doi.org/10.1002/qre.1399]
16. BPS. Provinsi Sulawesi Tenggara dalam Angka 2021; UD. Resky Bersama: Kendari, Indonesia, 2021.
17. BPS. Provinsi Sulawesi Selatan dalam Angka 2021; BPS Press: Makassar, Indonesia, 2021.
18. BPS. Provinsi Sulawesi Utara dalam Angka 2021; Perum Percetakan NRI: Manado, Indonesia, 2021.
19. BPS. Provinsi Sulawesi Tengah dalam Angka 2021; UD. Rio: Palu, Indonesia, 2021.
20. BPS. Provinsi Sulawesi Barat dalam Angka 2021; Erlangga: Mamuju, Indonesia, 2021.
21. BPS. Provinsi Gorontalo dalam Angka 2021; CV. Rifaldi: Gorontalo, Indonesia, 2021.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Geographically Weighted Regression (GWR) is a development of multiple linear regression for spatial data. Under spatial heterogeneity, each location has different characteristics, and the relationship between the response variable and each predictor variable may be of unknown form, so nonparametric regression is one alternative. In addition, the regression function is not always the same across predictor variables. This study aims to use the Geographically Weighted Nonparametric Regression (GWNR) model with a mixed estimator of truncated spline and Fourier series. Both estimators are expected to handle unknown data patterns in spatial data. The mixed GWNR model estimator is determined using the Weighted Maximum Likelihood Estimator (WMLE) technique, and its properties are then examined. The results show that the estimator of the mixed GWNR model is unbiased and linear in the response variable y.
1 Department of Statistics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia; Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Halu Oleo, Kendari 93132, Indonesia
2 Department of Statistics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia