Content area
To address the challenges in current research on spatial clustering algorithms for buildings in topographic maps—namely, their limited ability to effectively accommodate diverse application scenarios, including dense and regular urban environments, sparsely and irregularly distributed rural areas, and urban villages with complex structures—this paper introduces an innovative progressive clustering algorithm framework. The proposed framework operates in a hierarchical manner, progressing from macro to micro levels, thereby enhancing its adaptability and practical versatility. Specifically, it employs the minimum spanning tree (MST) technique for macro-level clustering analysis. Subsequently, a self-organizing map (SOM) neural network is utilized to perform micro-level clustering, enabling a more refined and detailed classification. Within this framework, the minimum spanning tree effectively captures the macroscopic distribution patterns of the building population. The macroscopic clustering results are then utilized as the initial weight configurations for the SOM neural network. This approach ensures that the overall spatial structural integrity is preserved during the subsequent micro-level clustering process. Moreover, the SOM neural network achieves refined optimization of micro-clustering details by incorporating building feature factors. To validate the effectiveness of the proposed algorithm, this study conducts an empirical analysis and comparative testing using building data from Futian District, Shenzhen City. The results indicate that the proposed algorithm exhibits superior recognition capabilities when applied to complex and variable spatial distribution patterns of buildings. Furthermore, the clustering outcomes align closely with the principles of Gestalt visual perception and outperform the comparison algorithms in overall performance.
Full text
1. Introduction
Map synthesis is a core issue in cartography, involving the extraction of characteristic patterns and primary features of geographic elements from geospatial information through processes such as displacement, simplification, selection, and generalization. These processes enable the representation of maps at various scales, making map synthesis a highly challenging and complex research direction in the field [1]. However, a critical gap remains in adapting these processes to real-world geographic scenarios with heterogeneous spatial distributions. For instance, in urban planning, disaster response, and resource allocation, there is an urgent need to identify cohesive building clusters that reflect both functional coherence (e.g., residential zones, industrial parks) and spatial proximity relationships (e.g., adjacency, connectivity) [2]. Current methods often fail to balance these dual requirements, particularly in complex environments such as urban villages or rural areas with sparse and irregular building layouts.
Spatial clustering is a crucial method for grouping spatial entities based on their similarity characteristics. Building clustering, specifically, must adhere to Tobler’s First Law of Geography, which posits that “everything is related to everything else, but near things are more related than distant things” [3]. This principle underscores the necessity of designing clustering algorithms that prioritize spatial proximity while accounting for semantic attributes (e.g., building height, function). Existing approaches are widely applied in pattern recognition and geographic distribution analysis but face significant limitations when handling buildings with complex spatial characteristics—such as variations in density, irregular distribution patterns, and non-linear proximity relationships [4,5,6].
Building clustering studies typically treat faceted building data as two-dimensional entities and rely on Gestalt visual cognitive principles (e.g., similarity, continuity, proximity) to ensure that the clustering results align with human perceptual conventions [7,8,9]. Clustering algorithms can generally be classified into four categories: density-based clustering, division-based clustering, hierarchical clustering, and graph-based clustering [10].
Among these, density-based clustering algorithms are commonly used for raster data when detecting high-density building areas. However, they require computation for each building and its surrounding entities, making the process computationally intensive and inefficient [11]. Parastoo proposed an enhancement to the traditional DBSCAN algorithm to address the challenges posed by discrete points and density inhomogeneity in the distribution of neighboring buildings. The key improvement in this algorithm is the use of locally weighted radii instead of the conventional global radii, which significantly increases the algorithm’s adaptability to complex distribution patterns [12]. Sanjay Kumar Anand et al. compared the efficiency of 11 clustering methods across five different datasets using three effectiveness metrics—internal, external, and stability—and identified the best-performing method to determine the most feasible solution for each algorithm [13]. In segmentation-based clustering, the number of clusters is predefined, and the clustering process is achieved by iteratively assigning data points to the nearest cluster center. K-means, a representative of divisive clustering algorithms, partitions N data points into K clusters. Through iterative optimization, it minimizes the distance between points within each cluster while maximizing the distance between points from different clusters. Zhou et al. conducted experiments using the K-means algorithm on small-scale datasets, demonstrating that the algorithm is highly sensitive to the selection of initial clustering centroids and typically identifies only spherical clusters [14]. In contrast to density-based and division-based clustering, hierarchical clustering is a bottom-up approach that constructs a tree-like structure by partitioning the dataset at different levels. The dataset can be partitioned using either a bottom-up agglomerative strategy or a top-down divisive strategy. Hu et al. proposed a new clustering validity index function that characterizes intraclass cohesion and interclass separation based solely on the similarity relationship between samples. This method is easier to understand and compute than the traditional F-statistic, but it remains computationally inefficient and requires significant storage compared to other algorithms [15]. Graph clustering, which utilizes the graph structure of buildings to represent their proximity, has long been recognized as an effective method for building clustering. It incorporates specific constraints to achieve cluster partitioning. Delaunay triangulations, minimum spanning trees (MST), and Voronoi diagrams are representative graph structures. For example, Ai et al. developed a constrained Delaunay triangulation structure in two dimensions, integrating the descriptions of rings, boundaries, islands, and vertices of polygons. This structure employs formalized conditional retrieval to extract regions of interest in two-dimensional spaces formed by dissecting triangles, followed by operations such as recognizing curved portions of polygons, detecting conflict relations, and enabling polygon cooperation [16]. Li et al. applied minimum spanning trees to the association analysis of low-density regions, leveraging their adaptability to various datasets to connect regions with potential relationships [17]. Yan et al. proposed a parallel decentralized iterative community clustering method based on network-weighted Voronoi diagrams (NWVD-DICCM) and its parallel version (NWVD-PDICCM) to identify effective community structures in large networks [18].
Despite these advancements, the clustering algorithms discussed above are largely limited to regular building group layouts and struggle to handle complex and sparse building distributions. They fail to fully account for the diversity of buildings and complex spatial relationships, which undermines the accuracy and adaptability of the clustering results. As a result, there has been a significant shift in research focus in recent years, moving away from the regular layouts of urban areas to explore more complex environments. Researchers are increasingly concentrating on the sparse distribution of buildings in rural areas, as well as the intricate structures and irregular layouts of urban villages and other similar environments. This shift aims to uncover novel patterns and clustering characteristics of building groups in these more challenging contexts. Yu et al. proposed a heuristic method for grouping buildings based on Delaunay triangulation, building area, and the length of face-to-face boundaries between neighboring buildings. They applied this method to cluster buildings in an urban village in Guangzhou, taking into account the unique characteristics of the buildings, both in terms of their individual attributes and group-level attributes [19]. Lv et al. extended the Voronoi clustering method by incorporating the spatial distribution characteristics of rural residential land. They then extracted proximity relationships for spatial clustering using the DC-Voronoi diagram [20]. Although existing research has optimized building clustering algorithms in several aspects, current methods still struggle to simultaneously adapt to a variety of scenarios, including dense, regular urban areas, sparsely distributed, irregular rural areas, and complex urban villages. Therefore, there is an urgent need for a clustering algorithm that can comprehensively address a wide range of building groups and exhibit broad applicability [21]. Additionally, researchers often rely on personal experience and prior knowledge when selecting and setting the parameters of clustering algorithms, rather than adopting a systematic approach for their selection and optimization.
This paper proposes a progressive spatial clustering method that synergizes macro- and micro-clustering.
-
Macro-clustering aims to capture global structural features (e.g., regional density patterns, large-scale connectivity) and establish overarching relationships between spatially dispersed entities.
-
Micro-clustering focuses on refining local groupings by incorporating fine-grained attributes to align with human perceptual conventions (e.g., Gestalt principles of similarity and continuity).
First, an MST algorithm is utilized to capture the macro-level distribution of building populations. This method effectively extracts global structural features and establishes overarching connections, particularly in high-density environments, thereby addressing overfitting issues caused by local density variations. The MST algorithm demonstrates robustness in sparse distributions, enabling the identification of connections between distant and isolated buildings [22]. Building on this macroscopic clustering structure, the clustering centers derived from MST are utilized as the initial weights of the SOM to guide the self-organizing process of the network. The SOM further refines the classifications by capturing fine-grained local features, demonstrating strong discriminative capabilities in complex scenarios (e.g., urban villages). Its nonlinear mapping capability complements the MST’s limitations in adapting to intricate local structures. Additionally, correlation analysis and feature factor screening ensure that only the most representative and relevant features are incorporated into the clustering process, enhancing the practical relevance of the clustering results. By employing multiple feature factors as constraints, the method fully captures the distinctions between different building types (e.g., high- vs. low-rise buildings) and functional zones (e.g., industrial vs. residential areas). This phased approach, integrating macro-clustering by MST and fine-grained clustering by SOM, prevents the potential confusion associated with one-step clustering of complex datasets and significantly improves adaptability to diverse spatial distributions. The resultant clustering process is fully automated, eliminating the need for human intervention while achieving end-to-end spatial clustering from macro to micro scales. These advancements provide a robust auxiliary tool for applications such as building map synthesis, urban planning, market analysis, and risk management.
2. A Progressive Clustering Approach for Buildings Incorporating MST and SOM
To respond more flexibly to the complex and dynamic spatial structures found in real environments and to extend applicability to a broader range of scenarios, this paper proposes a progressive clustering framework that operates from macroscopic to microscopic levels. The algorithm first conducts a preliminary clustering analysis of buildings using the MST method, which spans multiple scale boundaries to effectively capture and represent the proximity relationships and initial clustering patterns between buildings [23]. In this step, the MST is constructed by establishing a spatial relationship network based on the Delaunay triangulation model, with the minimum distance as the core metric. This approach efficiently identifies and integrates spatially adjacent building clusters, thus providing a solid foundation for subsequent analysis.
At the microscopic level, given the high dimensionality and complexity of building data, along with the need to visualize the detailed spatial distribution of the building population, this paper introduces the SOM neural network as a tool for micro-clustering. The SOM neural network is renowned for its strong nonlinear mapping ability, which enables it to effectively process high-dimensional data and extract underlying patterns [24]. At this stage, the results of the MST macro clustering are used as the initial weight configuration for the SOM network. This approach not only leverages the macro clustering outcomes to capture the overall spatial structure, ensuring that the micro-clustering process respects and builds upon the macroscopic pattern, but also effectively mitigates challenges such as the SOM’s sensitivity to initial weights, ambiguity in defining cluster boundaries, and high computational costs. Furthermore, the SOM network is further employed, utilizing feature factors (area, concavity, radius, direction, height, and spatial location) selected through correlation analysis to refine the constrained clustering of the MST macro clustering results, enabling a seamless transition from macro to micro-level clustering. The overall research strategy adopted in this paper is shown in Figure 1.
2.1. Macro Clustering of Buildings Based on Spatial Proximity Using Delaunay Triangular Network and MST
To accurately model the spatial proximity between building groups, a constraint-based Delaunay triangulation method is employed. Specifically, the building contours are first simplified, and the conversion of these contours into point sets is achieved by uniformly distributing encrypted points with unique identifiers along the contours at 10-m intervals. These encrypted points are then used as nodes to construct a Delaunay triangulation mesh via interpolation, effectively representing the spatial structure of the building group, as illustrated in Figure 2a [25].
Next, based on the spatial proximity relationship constructed by the Delaunay triangulation mesh, the buildings are abstracted as nodes, and the adjacency relationships between buildings are captured by the edges of the mesh, resulting in the formation of a spatial proximity graph of buildings, as shown in Figure 2b. However, due to the irregular distribution of buildings, this graph may include unnecessary long edges that distort the accuracy of the neighbor relationships and thus need to be removed.
To optimize the spatial proximity graph, we introduce the concept of a minimum spanning tree (MST), which connects N vertices using N-1 edges while minimizing the sum of the edge weights [26]. In this context, the distance between buildings is defined as the weight of the edges, with the shortest Euclidean distance between boundary points of two buildings serving as the specific metric. By constructing the MST (shown in Figure 2c), redundant long edges can be efficiently eliminated, while edges that reflect the true neighborhood relationships are preserved. The maximum gap method is then employed to identify points in the MST where edge weights exhibit significant changes, which is used to determine the pruning threshold. To ensure the reasonableness of this threshold, the Silhouette Coefficient (SC) is finally used for verification and adjustment.
When selecting an algorithm for constructing the minimum spanning tree, both Kruskal’s and Prim’s algorithms, which are based on a greedy strategy, yield the same result [27,28]. However, since Prim’s algorithm focuses on selecting edges with the smallest weight connected to the currently selected vertex set, this local optimization strategy can offer higher efficiency in certain scenarios (e.g., when buildings are more densely distributed) [29]. Therefore, in this study, the Prim algorithm is chosen to construct the minimum spanning tree of buildings, ensuring efficient processing.
2.2. Macro Clustering Centers as Initial Weights for Micro Clustering
After completing the macroscopic clustering, we determine the clustering centroids by calculating the geometric center of all buildings within each cluster. This geometric center approach ensures that each centroid accurately represents the spatial distribution of buildings in its cluster, with the centroid position reflecting the overall arrangement of buildings in that group. These centroids are then used as the initial weights for the SOM neural network, as shown in Figure 3. The cluster centroids derived from the MST method are directly mapped to the initial weights of the SOM network. However, in conventional SOM network design, neurons are typically arranged in a two-dimensional grid of size or [30,31]. As a result, the total number of neurons often does not match the number of macroscopic clustering centers.
To address this mismatch, more neurons than clustering centers are typically employed in the SOM neural network, ensuring that each neuron is assigned a corresponding weight vector. For initializing the weights of the remaining neurons, two main strategies are commonly used: random initialization and heuristic initialization. While random initialization is efficient and straightforward, it can be ineffective at the start of training due to its lack of targeting, as illustrated in Figure 4a. In contrast, heuristic initialization aims to optimize the initial weights by leveraging prior knowledge about the data. This can involve strategies such as considering the spatial distribution of building data or refining existing clustering centers to generate additional weight vectors. These strategies ensure that the remaining neurons start with weights that are more aligned with the data, improving the efficiency of training [32], as shown in Figure 4b. The heuristic method does not require manual intervention once the initial clustering centers are identified, making it a more automated and data-driven approach compared to random initialization. Given its ability to utilize prior knowledge effectively, the heuristic initialization method is adopted in this paper to initialize the weights of the remaining neurons.
2.3. Selection of Spatial Similarity Factors for Buildings and SOM-Based Micro-Clustering
In spatial clustering analysis of buildings, it is crucial to fully account for the spatial similarity between buildings to ensure that the clustering results accurately reflect their actual distribution characteristics [33]. Therefore, when constructing the system for characterizing spatial similarity, multiple building features must be considered comprehensively and integratively. However, in practical clustering operations, simply increasing the number of feature factors does not necessarily improve clustering performance. On the contrary, an excessive number of feature factors can lead to data redundancy, ultimately diminishing clustering efficiency. To address this, this paper conducts a comprehensive correlation analysis of the feature factors commonly used in building studies. The goal is to scientifically determine which factors should be included in the clustering constraints. This approach not only eliminates redundant data but also reduces the number of feature factors, thus enhancing computational efficiency.
2.3.1. Building Feature Factor Analysis
To comprehensively and accurately represent the spatial relationships of buildings, the analysis can be conducted at two levels: micro and macro [34,35]. Micro-level descriptions focus on individual building characteristics, such as location, area, and height, which provide a detailed representation of each building. At the macro level, the analysis shifts to attributes between building clusters, such as density and adjacency. These macro-level characteristics offer insight into the spatial organization and distribution patterns of the buildings [36].
When constructing a quantitative model, it is essential to select eigenfactors at both the micro and macro levels. Gestalt principles have long been applied in building clustering and pattern recognition. The spatial distribution patterns of building clusters can be effectively understood and recognized by applying principles of Gestalt grouping, such as proximity, similarity, closure, continuity, and co-location, as proposed in the work by Tang et al. (2023), which is based on the geographic knowledge graph framework [37]. Visual variables—factors of graphical and color variation that induce visual differences—play a crucial role in quantifying Gestalt cognitive theory. The Graphics Laboratory at the University of Paris, led by French scholar Jacques Bertin, was the first to propose a set of theories on the patterns of change in graphic symbols, based on years of intensive research. These visual variables encompass various aspects, including color, shape, size, texture, value, orientation, brightness, position, and density [38]. When characterizing building patterns, the focus is often on fundamental visual variables such as size, orientation, and shape. Each visual variable can be quantified by multiple eigenfactors. Common eigenfactors include area, perimeter, mean radius, height, compactness, density, fractal dimension, perimeter index, perpendicularity, concavity, SBR, and others [39].
2.3.2. Building Feature Factor Correlation Analysis
When the correlation coefficient between two building characterization factors falls within the range of 0.8 to 1, it indicates a strong correlation between these factors. Figure 5 illustrates the results of the correlation analysis for the aforementioned characterization factors. In this context, we can strategically choose to retain one of the factors, which not only streamlines the data and enhances analytical efficiency but also effectively mitigates the impact of noise that may arise from feature redundancy. Observation of Figure 5 shows that the correlation coefficient between area and perimeter is nearly 1, indicating an extremely strong correlation. Furthermore, the five characterization factors—compactness, fractal dimension, perimeter index, perpendicularity, and concavity—display a strong inter-correlation among them. Based on these findings, we will incorporate only the six key characterization factors—area, mean radius, height, concavity, orientation (SBR), and spatial location—in the subsequent constrained clustering analyses.
2.3.3. SOM Neural Network Micro-Clustering
The core advantage of the SOM neural network, as a competitive model for unsupervised learning, lies in its ability to automatically map high-dimensional input data into a low-dimensional, discrete representation space. This process does not require prior labeling of the data, making it particularly suitable for handling building vector data with complex multidimensional features [40]. As illustrated in Figure 6, the unique topology of the SOM neural network serves as the foundation for efficient data clustering.
In spatial clustering of building vector data using the SOM neural network, we first use the pre-selected clustering centers as the starting points and fine-tune the initial weights of the remaining neurons using a heuristic strategy. This approach aims to optimize the initial clustering performance of the network and ensure that more accurate clustering trends are reflected during the early stages of training. Subsequently, the original building data are input into the network for iterative training, during which the neuron weights are dynamically updated based on the features of the input data, gradually converging to a more optimized and stable clustering state.
To further enhance the rationality and accuracy of clustering, the six key feature factors selected above are used as the guiding basis for weight updates. By adjusting the weight update rules in a targeted manner, these feature factors not only guide the neurons’ sensitivity and differentiation towards the input data’s features but also steer the clustering results in a direction that better aligns with practical application needs. Ultimately, this process ensures the efficient and reasonable spatial clustering of the building vector data.
3. Algorithmic Process
3.1. Constructing Distance Matrices and MST
Given a dataset , where each building is characterized by attributes such as id, area, concavity, radius, orientation, height and spatial location . The distance matrix between buildings is computed using Euclidean distance, as shown in Equation:
(1)
The minimum spanning tree (MST) is constructed using Prim’s algorithm based on the distance matrix . The resulting MST is denoted as , where represents the set of nodes (buildings) and represents the set of edges (connections between buildings). The weight of each edge is the distance between two nodes. Pseudocode for MST Construction (Prim’s algorithm):
| function Prim(D): |
3.2. MST Macro Clustering Calculation Process
After constructing the MST, pruning is performed on to obtain macro clusters. Edges with larger weights are removed, resulting in multiple subtrees, each representing a distinct cluster. The pruning threshold is denoted by , and edges with weight are removed.
To determine the pruning threshold , the maximum gap method is utilized, which identifies points in the MST where edge weights change significantly. Pseudocode for MST Pruning:
| function prune_MST(MST, T): |
The silhouette coefficient is used to evaluate the quality of the clustering. The silhouette coefficient is defined as:
(2)
where is the average distance from data point iii to other points in the same cluster (intra-cluster distance), is the average distance from data point iii to the nearest other cluster (inter-cluster distance).The silhouette coefficient for the whole clustering is the average of the silhouette coefficients for all points. Pseudocode for Silhouette calculation:
| function silhouette (MST, clusters): |
The pruning threshold is adjusted iteratively based on the silhouette coefficient to maximize clustering quality.
Calculate the clustering centers of each subtree, and the set of building nodes in each subtree is . The formula is:
(3)
where denotes the calculated clustering centers. The set of clustering centers is obtained.3.3. SOM Micro-Clustering Computational Procedure
The SOM neural network is a two-dimensional structure with a set of node weights . Each clustering center obtained from the above is directly assigned to some of the neurons by the formula:
(4)
where . The remaining set of node weights e is handled using heuristic initialization. Pseudocode for SOM Initialization and training:| function SOM_training(clustering_centers, SOM_size): |
Building features such as area, concavity, radius, orientation, height, and spatial location are introduced as constraints for training. The building features are calculated as shown in Table 1.
Introduce the characterization factor such that building is characterized by . Define the weighted distance as:
(5)
where represents the input sample, denotes the weight of the feature, and and correspond to the dimensions of the sample and weight, respectively.The SOM training combines the weighted distances and adjusts the weight updating strategy to ensure that the feature factors are considered in the clustering process, with the formula:
(6)
where represents the number of iterations, denotes the learning rate, is the neighborhood function that defines the relationship between neuron and the best matching unit , represents the weight of the constraints, and indicates the additional constraint information. Pseudocode for SOM weight Update:| function SOM_weight_update(input_data, learning_rate, neighborhood_function, constraints): |
After certain iterations, the final SOM weight matrix is obtained, with each neuron representing the center of a cluster. The final clustering result is obtained by assigning the input data to the nearest neuron cluster.
4. Experimental Analysis
4.1. Experimental Data
The spatial distribution of buildings in Shenzhen is significantly influenced by factors such as economic conditions, social structures, and urban planning. It exhibits highly diversified characteristics, including the dense concentration of high-rise and ultra-high-rise buildings, as well as the coexistence of various neighborhood types, urban villages, and parks with differing densities and layout patterns [41]. To analyze these patterns and validate the effectiveness and applicability of the clustering method proposed in this study, a subset of buildings from Futian District, Shenzhen, was selected as the test area (as shown in Figure 7). The building profile data were downloaded and extracted from the OpenStreetMap (OSM) platform, encompassing a total of 8838 buildings.
The algorithmic framework proposed in this study begins by constructing an inter-building proximity map using Delaunay triangulation, as illustrated in Figure 8. This graph effectively captures the spatial relationships between buildings. To streamline the complexity of the graph and emphasize key spatial connections, the MST algorithm is applied to prune unnecessary long edges, resulting in an optimized structural graph, as shown in Figure 9.
Based on this, macro-level clustering analysis was performed utilizing the properties of the MST to group spatially related buildings into several macro-cluster regions. This approach effectively captures the primary patterns and trends in building distribution.
Furthermore, to refine the clustering analysis, the centers of the macro-clusters are used as the initial weight settings for the SOM neural network. Feature vectors, incorporating multiple building characterization factors, are then introduced to perform micro-clustering under constraints using the SOM neural network.
4.2. Experimental Platform and Programming Environment
All experiments were conducted on a Windows 10 operating system using a machine equipped with an Intel® Core™ i7-7700HQ (Intel Corporation is based in Santa Clara, CA, USA) processor and an NVIDIA® GeForce® 1050Ti graphics card (NVIDIA Corporation is based in Santa Clara, CA, USA). To fully leverage GPU acceleration, the deep learning experiments were executed in a CUDA 10.2 environment using the PyTorch 1.11.0 framework. The algorithms were implemented in Python and primarily relied on the following libraries and tools:
NumPy: Utilized for efficient numerical computations and matrix operations.
matplotlib.pyplot: Employed for data visualization and graphical output.
GeoPandas: Facilitates the handling and analysis of spatial data, including reading, transforming, and storing geospatial datasets.
Shapely (including modules shapely.geometry and shapely.geometry.geo): Provides functionality for the creation, manipulation, and analysis of geometric objects, supporting complex spatial computations.
math: Offers essential mathematical functions to aid in numerical calculations.
Features: Used for feature extraction and processing, ensuring efficient computation of building-related attributes required for the study.
This configuration not only guarantees high efficiency and accuracy in processing experimental data but also ensures flexibility and scalability in algorithm implementation, thereby providing a robust computational platform for the methods proposed in this study.
4.3. Experimental Results and Their Comparative Analysis
To comprehensively evaluate the rationality and effectiveness of the clustering algorithm proposed in this study, an ablation experiment strategy is first designed and implemented. This strategy aims to assess the individual contributions of the macro-clustering and micro-clustering modules, providing an in-depth analysis of the algorithm’s architecture. Subsequently, two widely used clustering algorithms are selected as benchmarks for comparative analysis: the K-Means++ algorithm [42] an enhanced version of the traditional K-means algorithm that addresses the issue of unstable clustering results caused by random initialization, and the Ordered Points-based Density Clustering Algorithm (OPTICS) [43], which is effective for datasets with complex density distributions and can accurately identify clusters of arbitrary shapes.
The core parameters of the algorithm proposed in this paper are designed to address both macro and micro levels. At the macro level, the focus is on determining the pruning threshold of the minimum spanning tree (MST). Initially, the pruning threshold = 350.97 is selected using the maximum gap method, as shown in Figure 10a, providing a reasonable starting point. To further optimize the pruning effect, we introduce the silhouette coefficient (SC) [44] validation step. This step assesses clustering quality by calculating the closeness and separation of clustering results under different pruning thresholds. Specifically, the pruning threshold = 262.70, which maximizes the silhouette coefficient, is selected as the final parameter, as shown in Figure 10b, ensuring that the pruned MST generates high-quality clustering results. At the micro level, three core parameters of the SOM neural network are considered: grid size , neighborhood function width (sigma), and learning rate (lr). To accurately determine the optimal combination of these parameters, we employ the control variable method in conjunction with the Adjusted Rand Index (ARI) [45], an evaluation metric that compares the similarity between clustering results and the real cluster labels (e.g., the actual cluster labels of the study area shown in Figure 11). Through repeated testing and comparisons, we determined that the parameter combination yielding the maximum Rand coefficient is , sigma = 0.1, and lr = 0.1.
4.3.1. Ablation Experiment
To systematically assess the individual and combined effects of macro-clustering and micro-clustering, as well as their synergistic influence in delineating building communities, three ablation experiments were designed and conducted in this study. The first experiment, referred to as the complete model experiment, aims to evaluate the performance of the entire clustering framework and serves as a baseline for comparison in the subsequent ablation experiments. The experimental results (shown in Figure 12) demonstrate that the clustering output aligns closely with human visual perception, strongly validating the effectiveness of the progressive clustering strategy proposed in this paper for addressing the complexity and diversity of the urban environment. This significantly enhances the accuracy of building community delineation.
The second experiment focused on removing the macro-level MST clustering module to separately assess its specific contribution to overall clustering performance. In this experiment, only the SOM neural network was retained, and building feature factors were incorporated for clustering analysis (as shown in Figure 13). The results indicate that while this method has unique advantages in deeply analyzing building features, the ambiguity in clustering boundaries and the significant decline in clustering accuracy underscore the critical role of the MST macro-clustering module in enhancing both the clarity and accuracy of the clustering results.
The third experiment reverses the approach by removing the SOM micro-clustering module and relying solely on MST for clustering, to assess its unique contribution to refining the clustering process. The experimental results (shown in Figure 14) demonstrate that the MST clustering method excels at capturing the intrinsic hierarchical structure of the data and is particularly effective at recognizing macroscopic patterns in the building distribution. However, its limitations become apparent at the level of refining specific building features, leading to clustering results that lack sufficient detail and significantly reduce the overall clustering accuracy. This series of ablation experiments not only highlights the individual contributions of the macro- and micro-clustering modules but also underscores the critical importance of their synergistic effects in achieving efficient and accurate building community delineation.
4.3.2. Comparison Experiment
To thoroughly analyze the differences between the two clustering methods in identifying and delineating building groups in the study area, this study applies both methods in parallel to conduct a global clustering analysis and selects eight representative clustering results for detailed demonstration, as shown in Figure 15. By comparing the global clustering effectiveness with the local subtle variations, it is observed that the K-Means++ algorithm, compared with the traditional K-means algorithm, exhibits certain advantages in mitigating the randomness associated with the selection of initial clustering centers. Furthermore, the integration of morphological factors significantly enhances its accuracy in identifying adjacent similar buildings. However, despite these improvements, K-Means++ reveals limitations in handling high-dimensional spatial data, as illustrated in regions a and c. Its insufficient capacity for deeply mining high-density information and its high sensitivity to the distribution of density centers result in erroneous segmentation of some dense regions. Additionally, certain regions that should be separated but are not distinguished due to significant distances (e.g., regions d and h) further highlight its shortcomings in fine-grained clustering. Nevertheless, from a macro perspective, the clustering performance of K-Means++ remains robust.
On the other hand, the OPTICS algorithm is renowned for its proficiency in handling multi-density datasets; however, its clustering performance exhibits notable shortcomings when applied to complex and dynamic urban spatial environments. Despite these limitations, OPTICS demonstrates exceptional capability in both global and local clustering tasks, particularly in recognizing clusters of buildings with intricate morphologies or varying density characteristics. This is especially evident in regions a and c, highlighting its potential and advantages in addressing complex spatial structures.
To validate the comparative experimental results of the three algorithms discussed above, this study employs the Silhouette Coefficient and Adjusted Rand Index (ARI) as evaluation metrics for quantitative analysis of clustering performance. The ARI measures the consistency between clustering results and expert labeling while correcting for random clustering expectations [40]. However, we acknowledge that these conventional metrics have inherent limitations when evaluating perceptually driven clustering—our core evaluation still relies on the Gestalt visual perception principle and uses expert-validated labels as the basis for judgment, as shown in Figure 11.
Figure 16a compares the Silhouette coefficients across algorithms. While our method shows comparable initial performance to K-Means++, the subsequent refinement through micro-clustering and building characteristic integration intentionally prioritizes human-interpretable cluster separation over metric optimization. This design choice explains the eventual decline in Silhouette scores, as the metric favors compact spherical clusters rather than perceptually meaningful groupings.
Figure 16b presents the Adjusted Rand Index (ARI) comparison. The observed performance variation across cluster counts stems from our method’s adaptive architecture: The initial micro-clustering phase (lower cluster counts) focuses on coarse pattern discovery, while later stages (higher cluster counts) enable fine-grained differentiation through building feature integration. This explains the improved ARI performance beyond iteration 8 as the algorithm stabilizes. We recognize the early-stage ARI limitation compared to OPTICS’ density-aware approach but emphasize that our method achieves better balance between metric performance and visual coherence as shown in Figure 15.
5. Conclusions
In this paper, we adopt an incremental clustering strategy of macro and then micro, firstly constructing a neighborhood graph using the Delaunay triangle principle, and then pruning through the minimum spanning tree algorithm, which is used as the basis to form a macro minimum spanning tree clustering framework. This macroscopic clustering result is used as the initial weight setting of the SOM neural network clustering, so as to realize the initial delineation of the building groups in the global perspective. Further, in order to deepen the accuracy of clustering, we introduce several key indicators of morphological similarity, including area, concavity, radius, orientation, height, and spatial location, as constraints to drive the SOM neural network to perform micro-clustering. This process realizes a seamless transition from macroscopic overall grasping to microscopic details, which effectively improves the comprehensive performance of clustering analysis.
The validation of the experimental data demonstrates that the clustering algorithm proposed in this paper effectively addresses the complexity and diversity inherent in the spatial distribution of urban buildings. It not only achieves a high level of clustering accuracy but also aligns closely with intuitive human perceptions of urban building clustering. Nonetheless, it is important to acknowledge that while the model exhibits notable overall performance, certain aspects still offer room for improvement. In particular, the initial weights of the SOM depend on the macroscopic clustering centers derived from the MST algorithm. This reliance implies that any bias or coarseness in the initial MST clustering may negatively influence the subsequent refinement process. Additionally, we observed that specific feature factors had an outsized impact on certain clustering details, occasionally causing minor fluctuations in clustering accuracy. To address these challenges, future research should focus on enhancing the accuracy of MST-based macro-clustering and refining the feature factor weighting strategy. These improvements aim to further enhance the robustness and precision of the clustering algorithm, ensuring that the results align more closely with practical application requirements.
Tianliang Zhang conceived the idea and writing—original draft; Jianhua Feng assisted with the experiments; Xiaoji Lan writing—review and editing. Each of the authors has reviewed this manuscript. All authors have read and agreed to the published version of the manuscript.
The geospatial data used in this study was obtained from OpenStreetMap (OSM). OSM is a collaborative project that provides open geographic data contributed by volunteers worldwide. All data is made available under the Open Database License (ODbL). The data used in this research was downloaded on 17 May 2024 from
Authors would like to thank potential supervisor, editor, and reviewer for their advice and comment.
The authors declare no conflicts of interest.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Figure 2. Spatial proximity relationships of buildings based on Delaunay; Triangulation.spatial structure of the building group (a), spatial proximity graph of buildings (b), distance between boundary points of two buildings (c).
Figure 4. Residual neuron weights solution. Random initialization (a) and heuristic initialization (b).
Figure 10. Determining the optimal pruning threshold. Maximum gap method (a) and introduced silhouette coefficient (b).
Figure 15. Clustering results of each algorithm and its details are shown. The algorithms used in the manuscript (a1–h1), K-means (a2–h2) and OPTICS (a3–h3).
Figure 16. Comparison of evaluation metrics for different algorithms. Compared Silhouette coefficients across algorithms (a) and Adjusted Rand Index (ARI) comparison (b).
Building characteristic parameters.
| Parameters | Descriptive Index | Formulas | Define |
|---|---|---|---|
| Geometric | Area | | Area |
| Concavity | | Reflects the degree of concavity and convexity of a building’s boundary, which can be measured by the difference between the polygon’s convex envelope and the original boundary. | |
| Mean radius | | Average distance from the midpoint of the bounding polygon to the geometric center | |
| Structure | Direction | | The direction of the longest side of the smallest external rectangle |
| Height | | Height | |
| Location | Spatial location | | Geometric center coordinates |
References
1. Rhind, D. A GIS research agenda. Int. J. Geogr. Inf. Syst.; 1988; 2, pp. 23-28. [DOI: https://dx.doi.org/10.1080/02693798808927876]
2. Caruso, G.; Hilal, M.; Thomas, I. Measuring urban forms from inter-building distances: Combining MST graphs with a Local Index of Spatial Association. Landsc. Urban Plan.; 2017; 163, pp. 80-89. [DOI: https://dx.doi.org/10.1016/j.landurbplan.2017.03.003]
3. Tobler, W.R. A computer movie simulating urban growth in the Detroit region. Econ. Geogr.; 1970; 46, (Suppl. S1), pp. 234-240. [DOI: https://dx.doi.org/10.2307/143141]
4. Zhao, Z. Research on Urban Spatial Pattern Supported by Building Footprint Data. Ph.D. Thesis; Wuhan University: Wuhan, China, 2021.
5. Wang, J.; Qian, H. Cartographic-generalization-knowledge and its application. Geomat. Inf. Sci. Wuhan Univ.; 2006; 31, pp. 382-386.
6. Wang, X.; Burghardt, D. A mesh-based typification method for building groups with grid patterns. ISPRS Int. J. Geo-Inf.; 2019; 8, 168. [DOI: https://dx.doi.org/10.3390/ijgi8040168]
7. Ai, T.; Zhang, X. The aggregation of urban building clusters based on the skeleton partitioning of gap space. The European Information Society: Leading the Way With Geo-Information; Springer: Berlin/Heidelberg, Germany, 2007; pp. 153-170.
8. Ai, T.; Guo, R. Polygon Cluster Pattern Mining Based on Gestalt Principles. Acta Geod. Cartogr. Sin.; 2007; 36, pp. 302-308.
9. Koffka, K. Perception: An introduction to the Gestalt-Theorie. Psychol. Bull.; 1922; 19, 531. [DOI: https://dx.doi.org/10.1037/h0072422]
10. Cetinkaya, S.; Basaraner, M.; Burghardt, D. Proximity-based grouping of buildings in urban blocks: A comparison of four algorithms. Geocarto Int.; 2015; 30, pp. 618-632. [DOI: https://dx.doi.org/10.1080/10106049.2014.925002]
11. Pei, T.; Wan, Y.; Jiang, Y.; Qu, C.; Zhou, C.; Qiao, Y. Detecting arbitrarily shaped clusters using ant colony optimization. Int. J. Geogr. Inf. Sci.; 2011; 25, pp. 1575-1595. [DOI: https://dx.doi.org/10.1080/13658816.2010.533674]
12. Pilehforooshha, P.; Karimi, M. A local adaptive density-based algorithm for clustering polygonal buildings in urban block polygons. Geocarto Int.; 2020; 35, pp. 141-167. [DOI: https://dx.doi.org/10.1080/10106049.2018.1508313]
13. Anand, S.K.; Kumar, S. Experimental comparisons of clustering approaches for data representation. ACM Comput. Surv. (CSUR); 2022; 55, pp. 1-33. [DOI: https://dx.doi.org/10.1145/3490384]
14. Zhou, A.; Yu, Y. The Research about Clustering Algorithm of K-Means. Comput. Technol. Dev.; 2011; 21, pp. 62-65.
15. Hu, X.; Ma, R.; Zhong, B. Study on validity of hierarchical clustering. J. Shandong Univ.; 2010; 40,
16. Ai, T.; Guo, R. Support for Area-Based Target Constraints in Delaunay Triangulation for Map Generalization. Geomat. Inf. Sci. Wuhan Univ.; 2000; 1, pp. 35-41. [DOI: https://dx.doi.org/10.13203/j.whugis2000.01.007]
17. Li, J.; Liang, Y. Segmentation Region Density Clustering Algorithm Based on Minimum Spanning Tree. J. Comput.-Aided Des. Comput. Graph.; 2019; 31, pp. 1628-1635.
18. Yan, Y.; Zhang, X.; Wang, L. Decentralized Iterative Community Clustering Parallelization Based on Network Weighted Voronoi Diagram on Spark Platform. Comput. Appl. Softw.; 2021; 38,
19. Yu, W.; Zhou, Q.; Zhao, R. A heuristic approach to the generalization of complex building groups in urban villages. Geocarto Int.; 2021; 36, pp. 155-179. [DOI: https://dx.doi.org/10.1080/10106049.2019.1590463]
20. Lv, Z.; Sun, Q.; Zhao, G.; Lu, C.; Hu, J. A Clustering Method of Rural Settlement Considering Direction Relation. Geomat. Inf. Sci. Wuhan Univ.; 2023; 48, pp. 631-638.
21. Gao, C. Research on Clustering Algorithm for Area Vector Building. Master’s Thesis; Chang’an University: Xi’an, China, 2024.
22. Regnauld, N. Contextual building typification in automated map generalization. Algorithmica; 2001; 30, pp. 312-333. [DOI: https://dx.doi.org/10.1007/s00453-001-0008-8]
23. Khan, A.A.; Mohanty, S.K. A fast spectral clustering technique using MST based proximity graph for diversified datasets. Inf. Sci.; 2022; 609, pp. 1113-1131. [DOI: https://dx.doi.org/10.1016/j.ins.2022.07.101]
24. Cui, J.; Li, C.; Cao, H.; Chen, X.; Wang, J.; Dai, T. SOM-Kmeans Based Machine Tools Energy Efficiency Grade Evaluation. J. Mech. Eng.; 2023; 59, pp. 295-306.
25. Ai, T.; Yin, H.; Shen, Y.; Yang, M.; Wang, L. A formal model of neighborhood representation and applications in urban building aggregation supported by Delaunay triangulation. PLoS ONE; 2019; 14, e0218877. [DOI: https://dx.doi.org/10.1371/journal.pone.0218877]
26. Lin, W. Research on hierarchical segmentation method of high-resolution remote sensing image based on minimum spanning tree model. J. Geod. Geoinf. Sci.; 2022; 51, 316.
27. Kruskal, J.B. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Am. Math. Soc.; 1956; 7, pp. 48-50. [DOI: https://dx.doi.org/10.1090/S0002-9939-1956-0078686-7]
28. Prim, R.C. Shortest connection networks and some generalizations. Bell Syst. Tech. J.; 1957; 36, pp. 1389-1401. [DOI: https://dx.doi.org/10.1002/j.1538-7305.1957.tb01515.x]
29. Cai, J.; Meng, N.; Cai, Z.; Wang, A. Building Clustering Based on Minimum Spanning Tree Algorithm. Surv. Mapp.; 2017; 40, pp. 247-250.
30. Cheng, B.; Liu, Q.; Li, X. Intelligent Building Grouping Using a Self-organizing Map. J. Geod. Geoinf. Sci.; 2013; 42, pp. 290-294.
31. Wang, Z. The Cluster of Buildings Based on the SOM Neural Network. Master’s Thesis; University of Electronic Science and Technology: Chengdu, China, 2016.
32. Wu, C.; Chen, Z.; Jiang, M. The Research on Initialization of Ants System and Configuration of Parameters for Different TSP Problems in Ant Algorithm. Acta Electron. Sin.; 2006; 34, pp. 1530-1533.
33. Yang, C.; He, L.; Xie, P.; Zhou, X. Clustering Analysis of Geographical Area Entities Considering Distance and Shape Similarity. Geomat. Inf. Sci. Wuhan Univ.; 2009; 34, pp. 335-338.
34. Ruas, A. The roles of meso objects for generalisation. Proceedings of the 9th Symposium on Spatial Data Handling; Beijing, China, 10–12 August 2000.
35. Bard, S. Quality assessment of cartographic generalisation. Trans. GIS; 2004; 8, pp. 63-81. [DOI: https://dx.doi.org/10.1111/j.1467-9671.2004.00168.x]
36. Basaraner, M.; Cetinkaya, S. Performance of shape indices and classification schemes for characterising perceptual shape complexity of building footprints in GIS. Int. J. Geogr. Inf. Sci.; 2017; 31, pp. 1952-1977. [DOI: https://dx.doi.org/10.1080/13658816.2017.1346257]
37. Tang, Z.; Ai, T.; Xu, H. Reasoning of spatial distribution pattern of building cluster based on geographic knowledge graph. J. Geo-Inf. Sci.; 2023; 25, pp. 1202-1214.
38. Bertin, J. Semiology of Graphics: Diagrams, Networks, Maps; Berg, W.J. The University of Wisconstin Press: Madison, WI, USA, 1983.
39. Meng, N.; Feng, J.; Jia, Y. Comparative Analysis of Clustering Methods of Planar Settlements. J. Geomat.; 2023; 48, pp. 116-120. [DOI: https://dx.doi.org/10.14188/j.2095-6045.2021233]
40. Nan, F.; Li, Y.; Jia, X.; Dong, L.; Chen, Y. Application of improved som network in gene data cluster analysis. Measurement; 2019; 145, pp. 370-378. [DOI: https://dx.doi.org/10.1016/j.measurement.2019.01.013]
41. Liu, Q.; Zhang, H.; Li, B.; Liu, H. A Study of Principal-Agent Model Based Spatial Form Differentiation in Urban Villages in Shenzhen, China. Mod. Urban Res.; 2023; 9, pp. 37-42.
42. Huang, R.; Cheng, Q.; Chen, Z. An Algorithm of Data Classification Based on PCA and K-Means++. Proceedings of the 2021 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS); Shenyang, China, 10–11 December 2021; IEEE: Piscataway, NJ, USA, 2021.
43. Meng, N.; Gao, C.; Wang, Z.; Li, J. Comparative study on clustering adaptability of DBSCAN extended algorithm in planar buildings. Sci. Surv. Mapp.; 2022; 47, pp. 204-214.
44. Dinh, D.-T.; Fujinami, T.; Huynh, V.-N. Estimating the optimal number of clusters in categorical data clustering by silhouette coefficient. Proceedings of the Knowledge and Systems Sciences: 20th International Symposium, KSS 2019; Da Nang, Vietnam, 29 November–1 December 2019; Proceedings 20 Springer: Singapore, 2019.
45. Liu, X.; Huang, Q.; Gao, S. Exploring the uncertainty of activity zone detection using digital footprints with multi-scaled DBSCAN. Uncertainty and Context in GIScience and Geography; Routledge: Milton Park, UK, 2021; pp. 67-94.
© 2025 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.