© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Accurate information on the spatial distribution of population, especially for metropolises, is fundamental to many application areas such as public health, urban development planning and disaster assessment management. Random forest is the most widely used model in population spatialization studies. However, a reliable model for accurately mapping the spatial distribution of metropolitan populations is still lacking, owing to the inherent limitations of the random forest model and the complexity of the population spatialization problem. In this study, we integrate gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM) and support vector regression (SVR) through the stacking ensemble learning algorithm to construct a novel population spatialization model that we name GXLS-Stacking. To train the model, we combine census data with natural environmental data and with socioeconomic data that enhance the characterization of the population’s spatial distribution (e.g., point-of-interest data, building outline data with height and artificial impervious surface data), and we generate a high-precision gridded population density map of Beijing in 2020 at a 100 m spatial resolution. Finally, the generated gridded population density map is validated at the pixel level using the highest-resolution validation data (i.e., community household registration data) in the current study. The results show that the GXLS-Stacking model predicts the population’s spatial distribution with high precision (R² = 0.8004, MAE = 34.67 persons/hectare, RMSE = 54.92 persons/hectare), and its overall performance is better than that of each of the four individual models and of the random forest model. Compared with natural environmental features, a city’s socioeconomic features are better able to characterize the spatial distribution of the population and the intensity of human activities.
In addition, the gridded population density map obtained by the GXLS-Stacking model can provide highly accurate information on the population’s spatial distribution and can be used to analyze the spatial patterns of metropolitan population density. Moreover, the GXLS-Stacking model can be generalized to metropolises with comprehensive and high-quality data, whether in China or in other countries. Furthermore, for small and medium-sized cities, our modeling process can still serve as an effective reference for population spatialization methods.
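
The abstract reports three accuracy measures (R², MAE and RMSE) computed at the pixel level, but this record does not give their formulas. The following is a minimal sketch assuming the standard definitions, with R² taken as the coefficient of determination. The function name `regression_metrics` and the four sample densities are hypothetical and are not taken from the article.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired observed and predicted values.

    Assumes the standard definitions; the article may define R2 differently
    (for example, as a squared correlation coefficient).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical densities in persons/hectare for four validation pixels.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2 = {r2:.4f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

MAE and RMSE carry the unit of the target variable (persons/hectare here), whereas R² is dimensionless. Under these definitions RMSE is never smaller than MAE, which is consistent with the reported values (54.92 versus 34.67).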

Details

Title
High-Precision Population Spatialization in Metropolises Based on Ensemble Learning: A Case Study of Beijing, China
Author
Bao, Wenxuan 1; Gong, Adu 1; Zhao, Yiran 2; Chen, Shuaiqiang 1; Ba, Wanru 1; He, Yuan 3

1 State Key Laboratory of Remote Sensing Science, Beijing Normal University, Beijing 100875, China; [email protected] (W.B.); [email protected] (S.C.); [email protected] (W.B.); Beijing Key Laboratory of Environmental Remote Sensing and Digital City, Beijing Normal University, Beijing 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China; [email protected]
2 School of Statistics, Beijing Normal University, Beijing 100875, China; [email protected]
3 Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China; [email protected]; State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing 100875, China
First page
3654
Publication year
2022
Publication date
2022
Publisher
MDPI AG
e-ISSN
20724292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2700765031