1. Introduction
The urban center is the core area of an urban structure. It is usually the area where a city’s political, economic and cultural functions are most concentrated. The high density of urban centers and the central position of their urban functions distinguish them from the other areas of a city in many respects, e.g., high population density, traffic congestion and high land development intensity. As a result, the central area of a city often has no new residential land on which first-hand real estate could be built for the housing market. Therefore, the real estate market in central areas mainly consists of second-hand housing transactions. For a city’s real estate market, the government can determine and revise policies regarding planning, land, finance, tax, price and other aspects. Tax policy is a very important part of these policies. China is one of a small number of countries that do not levy an annual real estate tax on the ownership of residential properties. In the Government Work Report of March 5, 2019, the idea to "steadily promote the legislation of real estate tax" was clearly put forward. Referring to the experience of developed countries, real estate tax is often based on the value of the houses [1]. Assessing properties quickly and accurately on a large scale is therefore the basic technical requirement for implementing a real estate tax.
With this purpose in mind, many scholars have entered the research field of mass appraisal modeling of land and real estate from the perspectives of economics, statistics, computer science, operational research and geographical science [2,3,4,5,6,7]. Based on Lancaster’s consumer behavior theory, the classic hedonic pricing model, which rests on the hypothesis that goods are valued for their attributes or characteristics, was introduced into the housing market and performs well in mass appraisal [8]. After years of application, the multiple regression analysis (MRA) model, especially in its linear form, is considered the most widely used model owing to its clear formula, wide recognition by tax departments and long-term stable application [9,10].
At the same time, more and more scholars recognize the unique spatial characteristics of properties and take spatial factors into account when establishing a mass appraisal model. Geographically Weighted Regression (GWR) is a widely used tool for exploring potential spatial heterogeneity in processes over geographic space [11,12,13,14,15]. The traditional MRA model is usually a global model, which assumes that the processes generating the observed data are the same everywhere, so that a single parameter is estimated for each covariate in the model. In contrast, the GWR model relaxes this assumption by calibrating a model at each location to obtain location-specific parameter estimates for each process. Furthermore, many researchers have also noticed the time dimension of real estate transaction databases and combined it with the GWR model. After adding temporal non-stationarity, the conventional GWR model integrates both temporal and spatial information into mass appraisal modeling and takes a new form, namely the Geographically and Temporally Weighted Regression (GTWR) model. Huang et al. (2010) first proposed the GTWR model [16], and Fotheringham et al. (2015) later developed a GTWR model based on their classic GWR model [17]. The GTWR model has good analytical capabilities for data sets with time series and spatial distribution characteristics [18]. Therefore, many scholars in different fields have adopted the GTWR model for analysis, for example in satellite-based mapping of air pollution [19], the spatial-temporal heterogeneity of industrial pollution [20], and the space-time characteristics of driving trajectories [21]. Other scholars have focused on improving the model prototype, for example through compatibility with multiscale effects [22], combination with the gravity model [23], and extension with space-time kriging [24].
Both institutions and scholars have devoted considerable effort to research on mass appraisal models. Bourassa et al. (2007) compare and summarize the mass appraisal models related to spatial dependence and housing submarkets [25]. McCluskey et al. (2013) concentrate on the accuracy of mass appraisal models and highlight the application potential of the spatially weighted approach [26]. Wang and Li (2019) provide a systematic review of mass appraisal models over nearly two decades and identify a 3I-trend, namely AI-based, GIS-based and MIX-based models [27]. At the same time, researchers in many fields have applied other techniques to mass appraisal, such as multi-criteria decision analysis [28,29], expert systems [30], modified evolutionary polynomial regression [31], the Markov chain hybrid Monte Carlo method [32], artificial neural networks [33], hierarchical models [34], cluster analysis [35,36], rough set theory [37], reasoning-based models [38], support vector machines [39], geographically weighted principal component analysis [40], the spatial error model [41] and the geostatistical model [42].
This paper contributes to the study of mass appraisal modeling by utilizing the spatial-temporal characteristics of properties. Therefore, the aim of this article is to build, implement, and test the GTWR model with the community annual average price data in Beijing’s core area to compare the performance with MRA and GWR model. In addition, we also focus on the spatial characteristics of price-related parameters in high-density residential areas, providing an efficient evaluation approach for related fields. The remainder of this article is structured as follows. Section 2 introduces the study area and describes the data structure. It also provides detailed information about the three mass appraisal models which will be applied in this paper. Section 3 compares the results of MRA with the ordinary least squares (OLS) model, GWR model and GTWR model and analyzes the different situations for mass appraisal modeling of the Beijing core area. The final part presents conclusions and recommendations for future research.
2. Data and Methods
2.1. Study Area
Beijing is the capital of China. It is the political, cultural, international communication and technological innovation center of the nation. According to the Beijing Municipal Bureau of statistics (Release date: May 31, 2019), there are 21.54 million permanent residents in Beijing in 2018, and it is planned to stabilize at 23 million after 2020. The study area is the core area of Beijing, also named the Capital Functional Core Area. It is composed of two administrative regions, Xicheng District and Dongcheng District. The total area is 92.54 square kilometers, including 50.7 square kilometers in Xicheng District and 41.84 square kilometers in Dongcheng District. In 2018, the permanent residents of Xicheng District and Dongcheng District are 1.18 million and 0.82 million, and the corresponding population density (unit: person/km2) are 23333 and 19637 respectively. Figure 1 shows the map of the study area.
2.2. Data Description
The second-hand housing transaction database is from Lianjia, the largest second-hand house trading agency in China, with a local market share of nearly 60% in Beijing. The data come from records of transactions in a real commercial environment. The database contains the average price of each community’s annual transactions and the annual average value of all housing attributes in the corresponding community. The transactions cover three years: 2014, 2016 and 2018. After removing the samples with missing attributes or obviously deviated coordinates, the number of annual effective transaction samples of communities in Dongcheng District and Xicheng District is shown in Table 1. The total number of samples is 3064.
Figure 2 shows the spatial distribution and kernel density distribution of community annual average price (Unit: Renminbi (RMB) Yuan/m2). At first, it shows the geographical distribution of communities in core area. Some communities have all the three years’ transactions; some others only have one or two years’. Then, the interpolation of community annual average price is utilized to create a price surface by using the inverse distance weighted (IDW) method [43], for 2014, 2016 and 2018, respectively. IDW is a convenient spatial interpolation method, which can intuitively display the spatial distribution of the communities’ annual average price. It takes the distance between the interpolation point and the sample point as the weight for weighted average. The closer the interpolation point is, the greater the weight is given by the sample point. In this paper, the IDW method is supported by ArcGIS Desktop Software (Version: 10.5; Type: Advanced). And the mathematical power parameter of distance is set to the default value of 2. The distance parameter (search radius type) is defined as an adaptive radius with the default value of 12, which specifies 12 nearest input sample points to be utilized to perform interpolation. Finally, the kernel density of community samples in each year is estimated. The kernel density estimation is a natural extension of the histogram which shows the overall trend and density distribution regularity of the variables [44]. Based on MATLAB Software (Version: R2019b), a normal kernel density function is utilized with log(Price) (see Table 2 for definition) on the x-axis, probability density estimate values on the y-axis and default optimal bandwidth.
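The IDW settings described above (distance power of 2 and the 12 nearest sample points) can be illustrated with a minimal NumPy sketch. This is not the ArcGIS implementation; the function name `idw_interpolate` and its arguments are hypothetical, and coordinates are assumed to be in a projected (planar) system.

```python
import numpy as np

def idw_interpolate(sample_xy, sample_values, query_xy, power=2.0, k=12):
    """Inverse distance weighted estimate at each query point.

    Uses the k nearest samples with weights 1 / distance**power,
    mirroring the power = 2 and 12-neighbour settings quoted in the text.
    """
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    k = min(k, len(sample_values))

    estimates = np.empty(len(query_xy))
    for q, point in enumerate(query_xy):
        dist = np.hypot(*(sample_xy - point).T)   # Euclidean distances
        nearest = np.argsort(dist)[:k]            # indices of the k closest samples
        d, v = dist[nearest], sample_values[nearest]
        if d[0] == 0.0:
            # Query coincides with a sample: return that sample's value.
            estimates[q] = v[0]
        else:
            w = 1.0 / d ** power
            estimates[q] = np.sum(w * v) / np.sum(w)
    return estimates
```

For example, with two samples priced 60000 and 80000 RMB Yuan/m2 located 2 units apart, a query point a quarter of the way along receives weights in the ratio 9:1 and an estimate of 62000.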
Residential community refers to a residential area surrounded by urban roads or natural boundaries, with a certain scale of living population, and built with public service facilities to meet the needs of residents. The transaction database of annual average price comes from residential community distributed throughout Beijing’s core area and hence is representative of the core area’s housing market. For mass appraisal of real estate in a city, the community scale is a proper choice. The evaluation value of the community will be the standard baseline of the individual properties within it. Based on the sufficient quantity and coverage of the 3064 community samples, the regression models can be simulated and applied well.
According to the attributes of each community sample in the database, there are 25 variables in total. The community average price is the only dependent variable. Based on the research purpose, the independent variables are divided into four categories: property structure in community, basic condition of community, traffic condition around community and living condition around community. Property structure in community contains community id, buying year, average area, average bedrooms, average decoration condition and average orientation condition. Basic condition of community includes the average house age, average ladder-to-household ratio, average years of property right, average ratio of elevator, average property management fee, number of buildings, number of households, floor area ratio (FAR) and green ratio (GA). Traffic condition around community consists of the shortest distance to a bus station and the shortest distance to a subway station. Living condition around community contains the shortest distance to the kindergarten, the park, the hospital, the shopping mall, the food market, the supermarket, the movie theater and the restaurant. All the shortest distances are calculated within a 2 kilometer Euclidean-distance buffer zone. A summary of the detailed information and statistical analysis of the variables is listed in Table 2. It gives an overall picture of community conditions in Beijing’s core area. For property structure in community, the average housing price of the community is 79122 RMB Yuan/m2 with an average living area of 71.5 m2. Furthermore, the average of two bedrooms per property is suitable for a family of three.
The average property decoration condition in the community is 0.33 (from “best = 1” to “worst = 0”), which is a normal condition for second-hand properties.
The orientation condition of the core area’s communities has an average level of 0.70. For each property, the orientation weight ranges from 0.00 to 1.00. In Beijing, houses facing south get the most light and good ventilation. Therefore, a weight of 1.00 represents the South, the best orientation. The following orientations are East, West and North with weights of 0.66, 0.33 and 0, respectively. The orientation data in this paper are the average value of all properties in the relevant community, which indicates the overall orientation condition. The closer the average value is to 1, the better the overall orientation of the community is. For the basic condition of community, the average age of the buildings is 25 years. This shows that real estate in the core area was developed relatively early and most communities were built in the 1990s. A community also has an average of eight buildings and 597 households. Meanwhile, 45% of the buildings in a community have elevators. For the traffic condition around the community, the bus condition, with an average shortest distance of 0.76 kilometers, is better than the subway condition, with an average shortest distance of 0.90 kilometers. For the living condition, the average shortest distance to a kindergarten is 0.56 kilometers, and the communities in the study area have great accessibility to leisure and living places. The average shortest distances to parks, hospitals, shopping malls, food markets, supermarkets and restaurants are 0.76, 0.60, 1.03, 0.6, 0.55 and 0.68 kilometers, respectively. People could walk to these places within 15 minutes (walk speed: 1.0–1.2 m/s) or ride a bike within 5 minutes (bike speed: 3.0–5.0 m/s).
2.3. Methods
2.3.1. Multiple Regression Analysis
Multiple regression analysis (MRA) explains the regression of a dependent variable over more than one independent variable. This makes it suitable for property price analysis because property values are determined by more than one property attribute. Equation (1) shows the formal model of an MRA.
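$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon, \tag{1}$$
where $Y$ is the community price, $X_1, \dots, X_k$ are the community attributes, $\beta_0$ is the constant, $\beta_1, \dots, \beta_k$ are the coefficients, and $\varepsilon$ is the error term.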
In order to facilitate the calculation and reduce the scale of housing prices, the logarithm calculation is carried out for the annual average house price of the community. In this paper, we adopt the natural logarithm calculation for community price. Then the equation is converted to:
$$\log(Y) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon. \tag{2}$$
The MRA model is usually estimated with ordinary least squares (OLS). OLS is a data-driven method that selects the regression coefficients minimizing the residual sum of squares over all observations [45].
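As an illustration of the log-linear specification in Equation (2), the following NumPy sketch fits the coefficients by least squares and reports R2. It is not the software used in this paper; the function name `fit_log_linear_ols` is hypothetical and the data in the usage note are synthetic.

```python
import numpy as np

def fit_log_linear_ols(X, price):
    """Fit log(price) = b0 + X @ b by ordinary least squares.

    Returns the coefficient vector (intercept first) and R^2.
    """
    X = np.asarray(X, dtype=float)
    y = np.log(np.asarray(price, dtype=float))        # natural logarithm of price
    design = np.column_stack([np.ones(len(y)), X])    # prepend the constant term
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    rss = float(residuals @ residuals)                # residual sum of squares
    tss = float(np.sum((y - y.mean()) ** 2))          # total sum of squares
    return beta, 1.0 - rss / tss
```

On synthetic prices generated as `np.exp(11 + 0.2 * x1 - 0.1 * x2)`, the function recovers the coefficients 11, 0.2 and -0.1, since the model is exactly linear in log(price).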
2.3.2. GWR Model and GTWR model
The GWR model is also a linear regression model, but it focuses on local regression based on spatial relationships. The model can be expressed as Equation (3):
$$Y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i) X_{ik} + \varepsilon_i, \quad i = 1, \dots, n, \tag{3}$$
where $(u_i, v_i)$ represents the $(x, y)$ coordinates of community $i$, $\beta_0(u_i, v_i)$ is the constant (intercept) value, $\beta_k(u_i, v_i)$ are the coefficients of variable $X_{ik}$ in community $i$, and $\varepsilon_i$ is the error term.
Then the time variable is introduced into the model. Following Huang et al. (2010) [16], the GTWR model can be expressed as Equation (4):
$$Y_i = \beta_0(u_i, v_i, t_i) + \sum_k \beta_k(u_i, v_i, t_i) X_{ik} + \varepsilon_i, \quad i = 1, \dots, n, \tag{4}$$
where $(u_i, v_i, t_i)$ represents the $(x, y, t)$ spatial-temporal coordinates of community $i$. The other terms are the same as in Equation (3).
The linear regression is then solved by estimating $\beta_k(u_i, v_i, t_i)$ and $\beta_0(u_i, v_i, t_i)$ as in Equation (5):
$$\hat{\beta}(u_i, v_i, t_i) = \left[X^{T} W(u_i, v_i, t_i) X\right]^{-1} X^{T} W(u_i, v_i, t_i) Y, \tag{5}$$
where $W(u_i, v_i, t_i)$ is the spatial-temporal weight matrix for community $i$. By defining the spatial distance $d^{S}$ and the temporal distance $d^{T}$, the spatial-temporal distance $d^{ST}$ can be combined as in Equation (6):
$$d^{ST} = d^{S} \otimes d^{T}, \tag{6}$$
where $\otimes$ can be any operator suited to the situation. Here the $+$ operator is adopted, and the scale factors $\lambda$ and $\mu$ are selected for $d^{S}$ and $d^{T}$. The spatial-temporal distance between community $i$ and community $j$, with transaction years $t_i$ and $t_j$, can then be represented as in Equation (7):
$$\left(d_{ij}^{ST}\right)^2 = \lambda\left[(u_i - u_j)^2 + (v_i - v_j)^2\right] + \mu (t_i - t_j)^2. \tag{7}$$
Based on the First Law of Geography [46], the closer an observation is to community $i$, the greater its weight. The same is assumed for the transaction year: transactions in different years influence each other, and the closer the transaction years, the greater the weight. This kind of weight is commonly built with Gaussian distance-decay functions, as shown in Equation (8) [47]:
$$W_{ij} = \exp\left\{-\frac{\lambda\left[(u_i - u_j)^2 + (v_i - v_j)^2\right] + \mu (t_i - t_j)^2}{h_{ST}^2}\right\}, \tag{8}$$
where $h_{ST}$ is the spatial-temporal bandwidth parameter and $\lambda/\mu$ is the spatial-temporal distance ratio. The value of $\lambda/\mu$ can be optimized by using cross-validation (CV) or the corrected Akaike information criterion (AICc) [48].
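Equations (5), (7) and (8) can be combined into a short NumPy sketch of the local estimate at a single community. This is an illustrative sketch only, not the ArcGIS plug-in used later in the paper; the function names `gtwr_weights` and `gtwr_local_coefficients` are hypothetical, and bandwidth or ratio selection by CV or AICc is not shown.

```python
import numpy as np

def gtwr_weights(coords, times, i, lam, mu, h_st):
    """Gaussian weights of all observations relative to observation i,
    following Equations (7) and (8)."""
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    d2_space = np.sum((coords - coords[i]) ** 2, axis=1)  # (u_i-u_j)^2 + (v_i-v_j)^2
    d2_time = (times - times[i]) ** 2                     # (t_i - t_j)^2
    d2_st = lam * d2_space + mu * d2_time                 # squared space-time distance
    return np.exp(-d2_st / h_st ** 2)

def gtwr_local_coefficients(X, y, coords, times, i, lam, mu, h_st):
    """Weighted least squares estimate at observation i, following Equation (5)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    w = gtwr_weights(coords, times, i, lam, mu, h_st)
    xtw = design.T * w                                    # equals X^T W for diagonal W
    return np.linalg.solve(xtw @ design, xtw @ y)
```

Calling `gtwr_local_coefficients` once per observation yields the location- and year-specific coefficients. Setting `mu = 0` removes the temporal term, so the weights reduce to a purely spatial Gaussian kernel of the kind used in GWR.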
3. Results and Discussion
3.1. Multiple Regression Analysis with Ordinary Least Squares
The multiple regression analysis with ordinary least squares is carried out with all the variables in the Beijing core area. Table 3 shows the parameter estimates, their standard error, and inference results. There are six independent variables (p-value > 0.05) which are not statistically significant with the community price, including the property management fee, the green ratio, the shortest distance to the kindergarten, the shortest distance to the park, the shortest distance to the food market and the shortest distance to the supermarket. The coefficients of different variables reflect the degree and direction of the influence on the dependent variable under different measurement unit. The transaction year is the most important one with the coefficient value of 0.162. Stderror is the standard deviation of regression coefficient; the smaller it is, the more accurate the model is. T-statistic and p-value are both used to test the significance of the model variables. The larger the t-statistic is, the more significant the corresponding covariate is. The variance inflation factors (VIF) of all independent variables are also tested and all VIF values are smaller than 7.5 (most of them are smaller than 2), indicating that there is no global significant multicollinearity (also called redundant variable) among the explanatory variables. In terms of the performance of the overall model, the R2 is 0.5680 and the adjusted R2 is 0.5647, indicating that the OLS model can explain 56 percent of the variation in community price in core area.
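The reported adjusted R2 can be checked against the reported R2 with the usual adjustment formula. The sketch below assumes that formula applies here, together with n = 3064 samples and 23 explanatory variables (the counts given in the text); the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Usual adjusted R^2: penalizes R^2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# R^2 = 0.5680 with 3064 samples and 23 explanatory variables
print(round(adjusted_r2(0.5680, 3064, 23), 4))  # 0.5647
```

Under these assumptions the result agrees with the adjusted R2 of 0.5647 reported above.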
3.2. Geographically Weighted Regression Model
The OLS results in Table 3 show that 17 significant variables are selected from the total of 23 variables. Before running the GWR model, global and local multicollinearity should also be removed; otherwise, the result will not be feasible. Global multicollinearity can be checked with the VIF values: variables with large VIF values (above 7.5) are redundant. Local multicollinearity is more difficult to detect. One effective way is to create a thematic map for each independent variable and look for areas with little or no variation in values. Combining the OLS results with the thematic map of each variable, we find that the variables with local multicollinearity are transaction year, ladder-to-household ratio and years of property right. Finally, 14 variables are involved in building the GWR model. The result is shown in Table 4.
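The global multicollinearity check can be sketched as follows: each explanatory variable is regressed on all the others, and VIF = 1 / (1 − R2) of that auxiliary regression. This is a minimal NumPy illustration rather than the ArcGIS diagnostic itself; the function name `variance_inflation_factors` is hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (columns are explanatory variables)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Auxiliary regression of variable j on a constant and all other variables.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs
```

Columns whose VIF exceeds the 7.5 threshold quoted in the text would be candidates for removal before fitting a local model.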
The model is implemented by the ArcGIS Desktop Software (Version: 10.5; Type: Advanced). The Gaussian kernel is used for GWR model and the kernel type is fixed. The overall R2 is 0.2215. The adjusted R2 is 0.2007, and the bandwidth is 4098.1515 meters. The bandwidth is an important factor for the GWR model. It determines the smoothness of the model. The optimal result of bandwidth is estimated by the AICc methods. The residual square is 305.0345. The smaller the residual square is, the more the GWR model fits the observed data. The sigma value is the square root of the normalized residual sum of squares, which is used for AICc calculation. Detailed information is listed in Table 5.
GWR is a local linear regression model and the result reflects that the GWR model can only explain around 20 percent of the variation in the center area of Beijing for the whole dataset of 2014, 2016 and 2018. It reflects that the GWR model is not effective for the multi-year dataset of core area in Beijing. Figure 3 shows the distribution of R2 and standard residual value of the GWR model.
The problem with the GWR model arises because, within the same community, the attribute values (independent variables) of 2014, 2016 and 2018 are treated as three different samples at the same location. As a result, different samples at the same location are pooled and averaged during local regression, which disturbs the spatial characteristics of the local regression. Therefore, the resulting R2 is very low. For further verification, following the methodology in this paper, the database is split by year, and the data of 2014, 2016 and 2018 are extracted separately. First, the OLS test is carried out; then global and local multicollinearity tests are performed on the significant variables. Afterward, all final variables are used to build the GWR model, and the results are shown in Table 6. The results of the GWR model for each separate year are clearly much better than those of the GWR model with the pooled database of all years. The adjusted R2 of the GWR model for 2014, 2016 and 2018 is approximately 0.5374, 0.4618 and 0.6321, respectively.
3.3. Geographically and Temporally Weighted Regression Model
The independent variables involved in the GTWR model are the same as in the GWR model. The GTWR model is also run in ArcGIS Desktop with a plug-in program (Release Version: https://www.researchgate.net/publication/339567248_GTWRv1_1_20_May2020zip. Algorithm Source: reference [16]. Huang, B.; Wu, B.; Barry, M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. International Journal of Geographical Information Science 2010, 24, 383–401, doi:10.1080/13658810802672469). The Gaussian kernel is used for the GTWR model and the kernel type is fixed. The transaction year variable is set as the timestamp according to the program’s instructions. After the calculation, the R2 is 0.8200 and the adjusted R2 is 0.8192. The bandwidth of the GTWR is 0.1122 and the spatial-temporal distance ratio is 0.3731. The detailed information of the model diagnosis is shown in Table 7. Figure 4 shows the spatial distribution of the standardized residuals for the GTWR model in 2014, 2016, and 2018. Standardized residuals with an absolute value above 2.5 need to be examined. According to the output information and distribution maps, the residuals range from −0.8759 to 1.0417 in 2014, −1.2920 to 0.6153 in 2016 and −0.6944 to 0.3574 in 2018. There is no statistically significant clustering of high and/or low residuals. This indicates that the GTWR model is reliable.
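The 2.5 threshold on standardized residuals can be applied with a few lines of NumPy. The sketch assumes that standardization means centering the residuals and dividing by their sample standard deviation, which may differ in detail from the plug-in’s output; the function name `flag_large_residuals` is hypothetical.

```python
import numpy as np

def flag_large_residuals(residuals, threshold=2.5):
    """Standardize residuals and flag those beyond the threshold in absolute value."""
    residuals = np.asarray(residuals, dtype=float)
    standardized = (residuals - residuals.mean()) / residuals.std(ddof=1)
    return standardized, np.abs(standardized) > threshold
```

The boolean mask returned as the second value can be used to select the communities whose residuals deserve a closer look on the map.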
For further analysis, independent variables of Area, Decoration, Shortest Distance of Bus station, Shortest Distance of Hospital and Shortest Distance of Restaurant are selected to conduct the coefficient distribution analysis for each year.
According to Figure 5, cold colors (e.g., black, dark blue) indicate that the area variable has a negative effect on the annual average house price of community. The larger the area is, the lower the average house price is. Warm colors (e.g., red, orange) indicate positive effect. The larger the area is, the higher the average house price is. From 2014 to 2018, the residential area in Xicheng District on the left side of the research area gradually changed from positive to negative. By 2018, only a proportion of the southern residential area has a positive correlation with the housing price. One reason for this change is that within five years, the purchasing ability of a family will not change too much. With the rise of housing prices, if most families still want to make a deal, they can only choose a smaller area of housing. Yet small area housing often has a higher unit price. This leads to the distribution that the smaller the area, the higher the unit price by 2018.
Figure 6 is the coefficient analysis of decoration. Considering the condition of 2014, 2016 and 2018, most communities have the pattern that the better decoration, the higher the housing price. This feature is prevalent in second-hand housing transactions. However, there are still obvious differences in the degree of influence. For instance, the coefficient value of the warm colors (e.g., red, orange) can contribute 20% to 30% of the price. In particular, the southern part of Dongcheng District has the lowest sensitivity to decoration in 2014, but the highest sensitivity in 2018. Generally, keeping other factors the same, the house with exquisite decoration will get a higher valuation than the house with ordinary decoration. However, when the transaction price of the house greatly exceeds the cost of decoration, the buyer will become insensitive to the decoration situation.
Figure 7 shows the distribution of coefficients for the shortest distance from the community to the bus stop. Because the variable here is the minimum distance, the situation is the opposite of the variable for area and decoration. Overall, the central area keeps cold colors (e.g., black, dark blue), which means that the closer to the bus station, the higher the house price. On the contrary, the surrounding areas have the opposite trend. Considering the high density of traffic facilities in the study area, it may be a phenomenon of supersaturation, as the convenience of public transport also means traffic congestion and traffic noise. Meanwhile, the warm colors (e.g., red, orange) of the communities are basically around the second and third ring, which is the urban expressway of Beijing.
Figure 8 is the coefficient distribution of the shortest distance to the hospital. Cold colors (e.g., black, dark blue) indicate that the smaller the shortest distance to the hospital, the higher the house price is. This also represents that these areas are still at a stage of positive demand for medical resources. The overall distribution trend for the three years of 2014, 2016 and 2018 has not changed. On the one hand, hospital construction needs a large investment in public infrastructure construction, which takes many years from input to output. On the other hand, it also reflects that the level of medical resources in the core area remains relatively stable.
Figure 9 shows the coefficient distribution of the shortest distance to the restaurant. Most communities of Xicheng District are in warm colors (e.g., red, orange), indicating that the closer the distance is, the smaller the impact on house prices. It presents a state of oversaturation. Most of the communities in Dongcheng District are in cold colors (e.g., black, dark blue), indicating that the more convenient the distance from the restaurant, the higher the house price.
4. Conclusions
Mass appraisal is considered when many properties need to be assessed under a common evaluation standard on a given date. Compared with an appraiser’s house-by-house evaluation, the software programs of mass appraisal models can provide a more effective, fair and accurate result, together with easier operation and lower cost in practical application. In this research, the community level is used for mass appraisal modeling with the annual average price and other meaningful attributes. The database contains 3064 community samples with price data for 2014, 2016 and 2018. Three mass appraisal models, namely MRA with OLS, the GWR model and the GTWR model, are built with the urban center of Beijing (the core area) as the study area. The overall performance of the models is shown in Table 8. MRA with OLS, as a global linear regression model, has a moderate effect and can explain about 56% of the variation in community price. In contrast, as a local linear model, GWR reaches an adjusted R2 of only 0.2007 and is therefore not effective for the multi-year dataset of this study area. However, when the time factor is introduced to form the GTWR model, it is able to take advantage of the local model and obtains an adjusted R2 of 0.8192. Housing price data are sensitive to the spatial factor. At the same time, the influence of the temporal factor on housing prices is also obvious from the results of Table 3 in Section 3.1. Therefore, modeling the sample data with spatial-temporal heterogeneity can more accurately simulate the characteristics of housing prices in the research area. Finally, the GTWR model can make good use of multi-year community-level data to conduct mass appraisal modeling.
There are also some limitations in this study that need to be further discussed. First is about the evaluation scale. This study has proved that community scale is feasible and effective. However, the community data is a mathematical processing of the original individual transaction data, which may lose some important information. At the same time, it should be noted that if the transaction dataset for each property is used to execute the mass appraisal for the whole city, the amount of data will be greatly increased. For local linear regression model, the multiple increases of the amount of calculation are a challenge to the stability and efficiency of the model. Finally, compared with the housing price data, the rental data have a higher transaction frequency and is relatively stable in a housing submarket. It can better describe the housing value from the perspective of residence and usage rather than investment. A useful future study will be to introduce the rental data into the construction of the mass appraisal model and make a comparative analysis with the housing price data.
Figure 2. Spatial distribution and kernel density distribution of community annual average price.
Figure 9. Coefficient of SD_Restaurant for the GTWR model in 2014, 2016, and 2018.
Year | Xicheng District | Dongcheng District | Total of Communities |
---|---|---|---|
2014 | 543 | 350 | 893 |
2016 | 663 | 496 | 1159 |
2018 | 597 | 415 | 1012 |
Total | | | 3064 |
Category | Variable | Definition | Obs. | Minimum | Mean | Median | Maximum | Std. |
---|---|---|---|---|---|---|---|---|
Dependent Variables | Price | Annual average transaction price of each community (Renminbi (RMB) Yuan per square meter) | 3064 | 19667.70 | 79122.67 | 78380.10 | 148977.00 | 26610.81 |
Dependent Variables | log(Price) | Natural logarithm calculation of Price | 3064 | 9.89 | 11.22 | 11.27 | 11.91 | 0.36 |
Property Structure in Community | Year | Transaction year of 2014, 2016 and 2018 | 3064 | 2014 | 2016.08 | 2016 | 2018 | 1.58 |
Property Structure in Community | Area | Living area (square meter) | 3064 | 7.92 | 71.52 | 61.33 | 303.44 | 32.37 |
Property Structure in Community | Bedroom | Number of bedrooms for unit household | 3064 | 1.00 | 2.01 | 2.00 | 5.00 | 0.54 |
Property Structure in Community | Decoration | Decoration condition (from “best = 1” to “worst = 0”) | 3064 | 0.00 | 0.33 | 0.30 | 1.00 | 0.31 |
Property Structure in Community | Orientation | Orientation condition (from “best = 1” to “worst = 0”) | 3064 | 0.00 | 0.70 | 0.80 | 1.00 | 0.34 |
Basic Condition of Community | Age | The age of the buildings till 2019 | 3064 | 1.00 | 25.24 | 25.17 | 102.00 | 10.83 |
Basic Condition of Community | Ladder-to-Household Ratio | Ladder and Household number ratio for each floor | 3064 | 0.73 | 3.64 | 3.00 | 45.00 | 2.32 |
Basic Condition of Community | Years of Property Right | The property right for 70, 50 and 40 years | 3064 | 40.00 | 69.59 | 70.00 | 70.00 | 3.23 |
Basic Condition of Community | Ratio of Elevator | Buildings with elevators divided by total buildings | 3064 | 0.00 | 0.45 | 0.45 | 1.00 | 0.43 |
Basic Condition of Community | Property Management Fee | Property management fee (RMB Yuan per square meter per month) | 3064 | 0.50 | 1.86 | 1.86 | 14.00 | 1.25 |
Basic Condition of Community | Num. Buildings | The number of buildings in community | 3064 | 1.00 | 7.90 | 5.00 | 126.00 | 10.09 |
Basic Condition of Community | Num. Households | The number of households in community | 3064 | 1.00 | 597.18 | 376.00 | 5877.00 | 651.84 |
Basic Condition of Community | Floor Area Ratio | The ratio of building’s available area to the total area | 3064 | 0.09 | 2.74 | 2.74 | 13.89 | 1.27 |
Basic Condition of Community | Green Ratio | The ratio of total green space to the total area of residential land | 3064 | 0.10 | 0.30 | 0.30 | 0.60 | 0.06 |
Traffic Condition around Community | SD_Bus | Shortest distance to bus station within 2 km (kilometer) | 3064 | 0.03 | 0.76 | 0.70 | 2.00 | 0.41 |
Traffic Condition around Community | SD_Subway | Shortest distance to subway station within 2 km (kilometer) | 3064 | 0.12 | 0.90 | 0.79 | 2.00 | 0.49 |
Living Condition around Community | SD_Kindergarten | Shortest distance to kindergarten within 2 km (kilometer) | 3064 | 0.01 | 0.56 | 0.52 | 2.00 | 0.30 |
Living Condition around Community | SD_Park | Shortest distance to park within 2 km (kilometer) | 3064 | 0.01 | 0.76 | 0.71 | 1.87 | 0.35 |
Living Condition around Community | SD_Hospital | Shortest distance to hospital within 2 km (kilometer) | 3064 | 0.03 | 0.60 | 0.57 | 1.63 | 0.31 |
Living Condition around Community | SD_Shopping mall | Shortest distance to shopping mall within 2 km (kilometer) | 3064 | 0.02 | 1.03 | 0.99 | 2.00 | 0.49 |
Living Condition around Community | SD_Food market | Shortest distance to food market within 2 km (kilometer) | 3064 | 0.03 | 0.60 | 0.58 | 2.00 | 0.31 |
Living Condition around Community | SD_Supermarket | Shortest distance to supermarket within 2 km (kilometer) | 3064 | 0.02 | 0.55 | 0.54 | 2.00 | 0.27 |
Living Condition around Community | SD_Restaurant | Shortest distance to restaurant within 2 km (kilometer) | 3064 | 0.02 | 0.68 | 0.65 | 2.00 | 0.35 |
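The log(Price) row of the descriptive table can be cross-checked against the Price row. Because the natural logarithm is monotonic, the minimum, median and maximum of log(Price) should agree (to the table's two decimals) with the logarithms of the corresponding Price statistics, whereas the mean of the logarithms is generally not the logarithm of the mean. The following is a minimal Python sketch of that check. It uses only values printed in the table, and the variable names are illustrative rather than taken from the study.

```python
import math

# Price statistics copied from the descriptive table (RMB Yuan per square meter)
price_stats = {"min": 19667.70, "mean": 79122.67, "median": 78380.10, "max": 148977.00}

# log(Price) statistics as reported in the same table
log_price_reported = {"min": 9.89, "mean": 11.22, "median": 11.27, "max": 11.91}

# Natural logarithm of each Price statistic, rounded to two decimals like the table
log_of_stats = {name: round(math.log(value), 2) for name, value in price_stats.items()}

for name in price_stats:
    print(f"{name}: ln(Price statistic) = {log_of_stats[name]:.2f}, "
          f"reported log(Price) = {log_price_reported[name]:.2f}")
```

The order statistics line up, while the logarithm of the mean price comes out above the reported mean of log(Price). This is the expected direction for a concave transformation and does not indicate an inconsistency in the table.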
Variable | Coefficient | Std. Error | t-Statistic | p-Value | VIF ¹ |
---|---|---|---|---|---|
Intercept | −315.851187 | 5.620593 | −56.195350 | 0.000000* | —— |
Year | 0.162138 | 0.002786 | 58.197090 | 0.000000* | 1.059613 |
Area | −0.002677 | 0.000249 | −10.751610 | 0.000000* | 3.572943 |
Bedroom | 0.032368 | 0.012161 | 2.661753 | 0.007810* | 2.368106 |
Decoration | 0.140665 | 0.014599 | 9.635222 | 0.000000* | 1.125050 |
Orientation | 0.057093 | 0.013688 | 4.170999 | 0.000037* | 1.174621 |
Age | −0.001904 | 0.000535 | −3.557123 | 0.000396* | 1.846311 |
Ladder Ratio of Household | −0.006975 | 0.001933 | −3.608488 | 0.000327* | 1.108054 |
Years of Property Right | 0.004240 | 0.001334 | 3.179919 | 0.001503* | 1.022561 |
Ratio of Elevator | 0.029615 | 0.013138 | 2.254163 | 0.024240* | 1.742408 |
Property management fee | 0.003770 | 0.004239 | 0.889347 | 0.373872 | 1.554304 |
Num. Buildings | 0.004415 | 0.000482 | 9.152315 | 0.000000* | 1.303021 |
Num. Households | −0.000036 | 0.000008 | −4.756089 | 0.000003* | 1.344524 |
Floor Area Ratio | −0.011992 | 0.003643 | −3.291964 | 0.001023* | 1.170229 |
Green Ratio | 0.032692 | 0.075335 | 0.433952 | 0.664370 | 1.038599 |
SD_Bus | −0.037806 | 0.010777 | −3.508051 | 0.000473* | 1.060005 |
SD_Subway | −0.025825 | 0.009012 | −2.865614 | 0.004196* | 1.054826 |
SD_Kindergarten | −0.016201 | 0.014863 | −1.090043 | 0.275774 | 1.058128 |
SD_Park | 0.007161 | 0.012712 | 0.563320 | 0.573266 | 1.085705 |
SD_Hospital | 0.046371 | 0.014087 | 3.291667 | 0.001024* | 1.037497 |
SD_Shopping mall | 0.046011 | 0.009152 | 5.027681 | 0.000001* | 1.116088 |
SD_Food market | 0.006466 | 0.014427 | 0.448219 | 0.654042 | 1.095029 |
SD_Supermarket | 0.026949 | 0.015890 | 1.696033 | 0.089992 | 1.049578 |
SD_Restaurant | −0.064511 | 0.012821 | −5.031700 | 0.000001* | 1.108541 |
OLS Diagnostics | |||||
Number of Observations | 3064 | ||||
R² | 0.567956 | ||||
Adjusted R² | 0.564687 | ||||
AICc ² | −127.522124 |
¹ VIF: variance inflation factor; ² AICc: corrected Akaike information criterion; * p-value below 0.05, i.e., statistically significant at the 5% level.
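Two quantities help when reading the OLS table above. The t-statistic is the coefficient divided by its standard error, and, because the dependent variable is log(Price), a coefficient β corresponds to a change of roughly 100·(exp(β) − 1) percent in price per unit change of the variable, holding the others fixed. The following minimal Python sketch recomputes both for three rows copied from the table. The function names are illustrative, and small differences from the printed t-statistics are expected because the table values are rounded.

```python
import math

# (coefficient, standard error) pairs copied from the OLS table
ols_rows = {
    "Year": (0.162138, 0.002786),
    "Decoration": (0.140665, 0.014599),
    "SD_Restaurant": (-0.064511, 0.012821),
}

def t_statistic(coef, std_err):
    """t-statistic of a regression coefficient: estimate over standard error."""
    return coef / std_err

def percent_effect(coef):
    """Approximate percent change in price per unit change, for a log-price model."""
    return 100.0 * (math.exp(coef) - 1.0)

for name, (coef, std_err) in ols_rows.items():
    print(f"{name}: t = {t_statistic(coef, std_err):.2f}, "
          f"price effect = {percent_effect(coef):+.1f}% per unit")
```

Under this global model, the Year coefficient translates into a price increase of about 17.6% per year. This reading applies to the OLS coefficients only; the local models estimate coefficients that vary by location and time.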
Variables | Significant ³ | without Global ⁴ | without Local ⁵ |
---|---|---|---|
Transaction Year | √ | √ | |
Area | √ | √ | √ |
Bedroom | √ | √ | √ |
Decoration | √ | √ | √ |
Orientation | √ | √ | √ |
Age | √ | √ | √ |
Ladder Ratio of Household | √ | √ | √ |
Years of Property Right | √ | √ | |
Ratio of Elevator | √ | √ | |
Property Management Fee | |||
Num. Buildings | √ | √ | √ |
Num. Households | √ | √ | √ |
Floor Area Ratio | √ | √ | √ |
Green Ratio | |||
SD_Bus | √ | √ | √ |
SD_Subway | √ | √ | √ |
SD_Kindergarten | |||
SD_Park | |||
SD_Hospital | √ | √ | √ |
SD_Shopping mall | √ | √ | √ |
SD_Food market | |||
SD_Supermarket | |||
SD_Restaurant | √ | √ | √ |
³ Significant: variables with significance in OLS; ⁴ without Global: variables without global multicollinearity; ⁵ without Local: variables without local multicollinearity.
Diagnostics Content | Value |
---|---|
Number of Observations | 3064 |
Bandwidth | 4098.151518 |
Residual Squares | 305.034493 |
Sigma | 0.319754 |
AICc | 1760.556297 |
R² | 0.221452 |
Adjusted R² | 0.200689 |
GWR Diagnostics | All Years | 2014 | 2016 | 2018 |
---|---|---|---|---|
Number of Observations | 3064 | 893 | 1159 | 1012 |
Bandwidth | 4098.151518 | 1290.904633 | 2582.548632 | 1850.362328 |
Residual Squares | 305.034493 | 20.984072 | 41.746788 | 14.625219 |
Sigma | 0.319754 | 0.177307 | 0.198781 | 0.129565 |
AICc | 1760.556297 | −398.932944 | −401.695394 | −1184.098235 |
R² | 0.221452 | 0.653810 | 0.508987 | 0.682964 |
Adjusted R² | 0.200689 | 0.537363 | 0.461818 | 0.632098 |
Diagnostics Content | Value |
---|---|
Number of Observations | 3064 |
Bandwidth | 0.112248 |
Residual Squares | 70.5342 |
Sigma | 0.151724 |
AICc | −1896.93 |
R² | 0.820032 |
Adjusted R² | 0.819206 |
Spatial-temporal Distance Ratio | 0.373068 |
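To illustrate how a spatial-temporal distance ratio enters a GTWR fit, the sketch below follows the formulation commonly attributed to Huang et al. [16]: the squared spatio-temporal distance is λ[(u_i − u_j)² + (v_i − v_j)²] + μ(t_i − t_j)², and a Gaussian kernel exp(−d²/h²) turns it into a regression weight. This is a sketch under stated assumptions, not the implementation used in this study. It assumes that the reported ratio plays the role of τ = μ/λ with λ fixed at 1, and the coordinates and the bandwidth in the example are hypothetical (the fitted bandwidth of 0.112248 depends on how the coordinates were scaled, which this section does not state).

```python
import math

TAU = 0.373068  # reported spatial-temporal distance ratio, ASSUMED here to be mu / lambda

def st_distance_sq(p, q, tau=TAU):
    """Squared spatio-temporal distance between p and q, each given as (u, v, t).

    Lambda is fixed at 1, so the temporal term is scaled by tau alone.
    """
    du, dv, dt = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    return du * du + dv * dv + tau * dt * dt

def gaussian_weight(dist_sq, bandwidth):
    """Gaussian kernel weight exp(-d^2 / h^2) for a squared distance d^2."""
    return math.exp(-dist_sq / bandwidth ** 2)

# Hypothetical communities: (x, y, transaction year) in arbitrary coordinate units
target = (0.0, 0.0, 2016)
neighbours = {
    "same place, same year": (0.0, 0.0, 2016),
    "one unit away, same year": (1.0, 0.0, 2016),
    "same place, two years earlier": (0.0, 0.0, 2014),
    "one unit away, two years earlier": (1.0, 0.0, 2014),
}
H = 2.0  # hypothetical bandwidth chosen for readability, NOT the fitted value

weights = {
    label: gaussian_weight(st_distance_sq(target, point), H)
    for label, point in neighbours.items()
}
for label, weight in weights.items():
    print(f"{label}: weight = {weight:.3f}")
```

Observations that are close in both space and time receive the largest weights in the local regression at the target community, which is the mechanism that lets GTWR coefficients vary across years as well as across locations.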
Diagnostics | MRA with OLS | GWR | GTWR |
---|---|---|---|
Number of Observations | 3064 | 3064 | 3064 |
Bandwidth | Global | 4098.151518 | 0.112248 |
AICc | −127.522124 | 1760.556297 | −1896.93 |
R² | 0.567956 | 0.221452 | 0.820032 |
Adjusted R² | 0.564687 | 0.200689 | 0.819206 |
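The comparison above can be read with two simple rules: a lower AICc indicates a better trade-off between fit and complexity, and adjusted R² penalizes R² for the number of parameters. The minimal Python sketch below ranks the three models by AICc and recomputes the OLS adjusted R² from the textbook formula 1 − (1 − R²)(n − 1)/(n − k − 1), with n = 3064 observations and k = 23 explanatory variables counted from the OLS table. The dictionary layout and function name are illustrative. The formula is applied to the OLS model only, since GWR and GTWR diagnostics are normally based on an effective number of parameters that this section does not report.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Textbook adjusted R-squared for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

N_OBS = 3064
N_PREDICTORS = 23  # explanatory variables in the OLS table, intercept excluded

# Diagnostics copied from the model comparison table
models = {
    "MRA with OLS": {"aicc": -127.522124, "r2": 0.567956, "adj_r2": 0.564687},
    "GWR": {"aicc": 1760.556297, "r2": 0.221452, "adj_r2": 0.200689},
    "GTWR": {"aicc": -1896.93, "r2": 0.820032, "adj_r2": 0.819206},
}

# Lower AICc is better, so sort ascending
ranking = sorted(models, key=lambda name: models[name]["aicc"])
print("Ranking by AICc (best first):", ranking)

# Recompute the OLS adjusted R-squared and compare it with the reported value
ols_check = adjusted_r2(models["MRA with OLS"]["r2"], N_OBS, N_PREDICTORS)
print(f"OLS adjusted R2: recomputed {ols_check:.6f}, "
      f"reported {models['MRA with OLS']['adj_r2']:.6f}")
```

The AICc ranking places GTWR first and the pooled GWR last, consistent with the ordering by adjusted R² in the table.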
Author Contributions
Conceptualization, D.W.; methodology, D.W.; software, D.W.; formal analysis, D.W.; resources, H.Y.; data curation, H.Y.; writing-original draft preparation, D.W.; writing-review and editing, D.W. and V.J.L.; visualization, D.W.; supervision, V.J.L.; funding acquisition, H.Y. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 71874195.
Conflicts of Interest
The authors declare no conflict of interest.
1. IAAO. Standard on Mass Appraisal of Real Property; IAAO: Kansas City, MO, USA, 2017.
2. Tajani, F.; Morano, P.; Ntalianis, K. Automated valuation models for real estate portfolios: A method for the value updates of the property assets. J. Prop. Invest. Financ. 2018, 36, 324-347.
3. Ciuna, M.; Milazzo, L.; Salvo, F. A Mass Appraisal Model Based on Market Segment Parameters. Buildings 2017, 7, 34.
4. Zhou, G.; Ji, Y.; Chen, X.; Zhang, F. Artificial Neural Networks and the Mass Appraisal of Real Estate. Int. J. Online Eng. 2018, 14, 180-187.
5. Bencardino, M.; Nesticò, A. Demographic changes and real estate values. A quantitative model for analyzing the urban-rural linkages. Sustainability 2017, 9, 536.
6. Battisti, F.; Campo, O.; Forte, F. A Methodological Approach for the Assessment of Potentially Buildable Land for Tax Purposes: The Italian Case Study. Land 2020, 9, 8.
7. Manganelli, B.; Murgante, B. The dynamics of urban land rent in Italian regional capital cities. Land 2017, 6, 54.
8. Lancaster, K.J. New approach to consumer theory. J. Political Econ. 1966, 74, 132-157.
9. Del Giudice, V.; Manganelli, B.; De Paola, P. Hedonic Analysis of Housing Sales Prices with Semiparametric Methods. Int. J. Agric. Environ. Inf. Syst. 2017, 8, 65-77.
10. Lin, C.C.; Mohan, S.B. Effectiveness comparison of the residential property mass appraisal methodologies in the USA. Int. J. Hous. Mark. Anal. 2011, 4, 224-243.
11. Brunsdon, C.; Fotheringham, A.S.; Charlton, M.E. Geographically weighted regression: A method for exploring spatial nonstationarity. Geogr. Anal. 1996, 28, 281-298.
12. Bitter, C.; Mulligan, G.F.; Dall'erba, S. Incorporating spatial variation in housing attribute prices: A comparison of geographically weighted regression and the spatial expansion method. J. Geogr. Syst. 2007, 9, 7-27.
13. Harris, R.; Dong, G.P.; Zhang, W.Z. Using Contextualized Geographically Weighted Regression to Model the Spatial Heterogeneity of Land Prices in Beijing, China. Trans. GIS 2013, 17, 901-919.
14. Cao, K.; Diao, M.; Wu, B. A Big Data-Based Geographically Weighted Regression Model for Public Housing Prices: A Case Study in Singapore. Ann. Am. Assoc. Geogr. 2019, 109, 173-186.
15. Li, Z.Q.; Fotheringham, A.S.; Li, W.W.; Oshan, T. Fast Geographically Weighted Regression (FastGWR): A scalable algorithm to investigate spatial process heterogeneity in millions of observations. Int. J. Geogr. Inf. Sci. 2019, 33, 155-175.
16. Huang, B.; Wu, B.; Barry, M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int. J. Geogr. Inf. Sci. 2010, 24, 383-401.
17. Fotheringham, A.S.; Crespo, R.; Yao, J. Geographical and Temporal Weighted Regression (GTWR). Geogr. Anal. 2015, 47, 431-452.
18. Wang, H.X.; Wang, J.D.; Huang, B. Prediction for spatio-temporal models with autoregression in errors. J. Nonparametric Stat. 2012, 24, 217-244.
19. He, Q.; Huang, B. Satellite-based mapping of daily high-resolution ground PM2.5 in China via space-time regression modeling. Remote Sens. Environ. 2018, 206, 72-83.
20. Cheng, J.; Dai, S.; Ye, X. Spatiotemporal heterogeneity of industrial pollution in China. China Econ. Rev. 2016, 40, 179-191.
21. Zhang, X.X.; Huang, B.; Zhu, S.Z. Spatiotemporal Influence of Urban Environment on Taxi Ridership Using Geographically and Temporally Weighted Regression. ISPRS Int. J. Geo-Inf. 2019, 8, 23.
22. Wu, C.; Ren, F.; Hu, W.; Du, Q. Multiscale geographically and temporally weighted regression: Exploring the spatiotemporal determinants of housing prices. Int. J. Geogr. Inf. Sci. 2019, 33, 489-511.
23. Wang, H.; Zhang, B.; Liu, Y.; Liu, Y.; Xu, S.; Zhao, Y.; Chen, Y.; Hong, S. Urban expansion patterns and their driving forces based on the center of gravity-GTWR model: A case study of the Beijing-Tianjin-Hebei urban agglomeration. J. Geogr. Sci. 2020, 30, 297-318.
24. Du, Z.; Wu, S.; Kwan, M.-P.; Zhang, C.; Zhang, F.; Liu, R. A spatiotemporal regression-kriging model for space-time interpolation: A case study of chlorophyll-a prediction in the coastal areas of Zhejiang, China. Int. J. Geogr. Inf. Sci. 2018, 32, 1927-1947.
25. Bourassa, S.C.; Cantoni, E.; Hoesli, M. Spatial dependence, housing submarkets, and house price prediction. J. Real Estate Financ. Econ. 2007, 35, 143-160.
26. McCluskey, W.J.; McCord, M.; Davis, P.T.; Haran, M.; McIlhatton, D. Prediction accuracy in mass appraisal: A comparison of modern approaches. J. Prop. Res. 2013, 30, 239-265.
27. Wang, D.; Li, V.J. Mass Appraisal Models of Real Estate in the 21st Century: A Systematic Literature Review. Sustainability 2019, 11, 7006.
28. Guarini, M.R.; Battisti, F.; Chiovitti, A. A methodology for the selection of multi-criteria decision analysis methods in real estate and land management processes. Sustainability 2018, 10, 507.
29. Manganelli, B.; Paola, P.D.; Giudice, V.D. A multi-objective analysis model in mass real estate appraisal. Int. J. Bus. Intell. Data Min. 2018, 13, 441-455.
30. Kilpatrick, J. Expert systems and mass appraisal. J. Prop. Invest. Financ. 2011, 29, 529-550.
31. Morano, P.; Rosato, P.; Tajani, F.; Manganelli, B.; Di Liddo, F. Contextualized Property Market Models vs. Generalized Mass Appraisals: An Innovative Approach. Sustainability 2019, 11, 4896.
32. Del Giudice, V.; De Paola, P.; Forte, F.; Manganelli, B. Real estate appraisals with Bayesian approach and Markov chain hybrid Monte Carlo method: An application to a central urban area of Naples. Sustainability 2017, 9, 2138.
33. Yacim, J.A.; Boshoff, D.G.B. Impact of Artificial Neural Networks Training Algorithms on Accurate Prediction of Property Values. J. Real Estate Res. 2018, 40, 375-418.
34. Hui, S.K.; Cheung, A.; Pang, J. A Hierarchical Bayesian Approach for Residential Property Valuation: Application to Hong Kong Housing Market. Int. Real Estate Rev. 2010, 13, 1-29.
35. Napoli, G.; Giuffrida, S.; Valenti, A. Forms and Functions of the Real Estate Market of Palermo (Italy). Science and Knowledge in the Cluster Analysis Approach. In Appraisal: From Theory to Practice; Stanghellini, S., Morano, P., Bottero, M., Oppio, A., Eds.; Springer: Berlin, Germany, 2017; pp. 191-202.
36. Calka, B. Estimating Residential Property Values on the Basis of Clustering and Geostatistics. Geosciences 2019, 9, 143.
37. Del Giudice, V.; De Paola, P.; Cantisani, G.B. Rough Set Theory for Real Estate Appraisals: An Application to Directional District of Naples. Buildings 2017, 7, 12.
38. Yeh, I.C.; Hsu, T.-K. Building real estate valuation models with comparative approach through case-based reasoning. Appl. Soft Comput. 2018, 65, 260-271.
39. Chen, J.-H.; Ong, C.F.; Zheng, L.; Hsu, S.-C. Forecasting spatial dynamics of the housing market using support vector machine. Int. J. Strateg. Prop. Manag. 2017, 21, 273-283.
40. Wu, C.; Ye, X.; Ren, F.; Du, Q. Modified Data-Driven Framework for Housing Market Segmentation. J. Urban Plan. Dev. 2018, 144.
41. Zhang, R.; Du, Q.; Geng, J.; Liu, B.; Huang, Y. An improved spatial error model for the mass appraisal of commercial real estate based on spatial analysis: Shenzhen as a case study. Habitat Int. 2015, 46, 196-205.
42. Palma, M.; Cappello, C.; De Iaco, S.; Pellegrino, D. The residential real estate market in Italy: A spatio-temporal analysis. Qual. Quant. 2019, 53, 2451-2472.
43. Watson, D.F.; Philip, G. A refinement of inverse distance weighted interpolation. Geo-Processing 1985, 2, 315-327.
44. Silverman, B.W. Density Estimation for Statistics and Data Analysis; CRC Press: Boca Raton, FL, USA, 1986; Volume 26.
45. Anselin, L. GIS research infrastructure for spatial analysis of real estate markets. J. Hous. Res. 1998, 9, 113-133.
46. Tobler, W.R. A computer movie simulating urban growth in the Detroit region. Econ. Geogr. 1970, 46, 234-240.
47. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; John Wiley & Sons: New York, NY, USA, 2003.
48. Hurvich, C.M.; Simonoff, J.S.; Tsai, C.L. Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. R. Stat. Soc. Ser. B Stat. Methodol. 1998, 60, 271-293.
Daikun Wang ¹, Victor Jing Li ¹,²,* and Huayi Yu ³
¹ Department of Geography and Resource Management, The Chinese University of Hong Kong, New Territories 999077, Hong Kong, China
² Institute of Future Cities, The Chinese University of Hong Kong, New Territories 999077, Hong Kong, China
³ Department of Land & Real Estate Management, Renmin University of China, Beijing 100872, China
*Author to whom correspondence should be addressed.
© 2020. This work is licensed under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The traditional linear regression model is increasingly unable to meet the standards of mass appraisal, which must handle large data volumes, complex housing characteristics and high accuracy requirements. It is therefore essential to exploit the inherent spatial-temporal characteristics of properties to build a more effective and accurate model. In this research, we take Beijing’s core area, a typical urban center, as the study area for modeling for the first time. Thousands of real transaction records from 2014, 2016 and 2018 are aggregated to the community level (community annual average price). Three models, namely multiple regression analysis (MRA) with ordinary least squares (OLS), geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), are adopted for comparative analysis. The results indicate that the GTWR model, with an adjusted R² of 0.8192, performs best among the three in the mass appraisal modeling of real estate. The comparison of models provides a useful benchmark for policy makers regarding the mass appraisal process in urban centers. The findings also highlight the spatial characteristics of price-related parameters in high-density residential areas, providing an efficient evaluation approach for planning, land management, taxation, insurance, finance and other related fields.