A New Thunderstorm Identification Algorithm Based

Introduction

Thunderstorms have a severe impact on many sectors, such as agriculture, aviation, infrastructure and power systems (Houston et al., 2015). Lightning caused by thunderstorms (Hayward et al., 2020) is responsible for damage to and interruptions of power systems (Li et al., 2013; Okabe & Takami, 2011; Perez et al., 2020; Takami et al., 2014; Visacro et al., 2012). Although there are many kinds of lightning protection equipment in power systems (Hu et al., 2021), severe damage caused by lightning is one of the biggest threats in operation (Guo et al., 2019), especially to devices in bulk power transmission schemes (Goertz et al., 2018; Zhuang et al., 2016). In order to prevent and reduce lightning accidents, apart from the lightning protection system, monitoring the trajectory of the electrical activity within thunderstorms can provide warnings that allow power systems to protect important facilities (Meng et al., 2019).

Many tracking algorithms based on radar data have been proposed in the past decades, because radar data are capable of distinguishing the structure and development of thunderstorms (Kohn et al., 2011). These algorithms are mainly divided into two categories: cross-correlation tracking methods (Rinehart & Garvey, 1978) and centroid tracking methods (Crane, 1979). The Tracking of Radar Echoes with Correlation, for example, calculates speed and direction from reflectivity echoes. Although this method cannot detect individual storms, it has been applied in the Integrated Terminal Weather System by the Massachusetts Institute of Technology–Lincoln Laboratory in the airport terminal area (Evans & Ducot, 1994) and in the Nowcasting and Initialization of Modeling Using Regional Observation Data System (Nimrod) by the UK Met Office over the UK and surrounding waters (Golding, 1998).

Storm identification is the first procedure of the centroid-type methods. It uses segmentation in image processing (Wang et al., 2019) and can be separated into three types: single threshold, multi-threshold and adaptive threshold schemes. Dixon and Wiener (1993) proposed one of the most famous methods, the Thunderstorm Identification, Tracking, Analysis and Nowcasting (TITAN) algorithm, which defined a contiguous region above 35 dBZ as a storm. Radar products in TITAN are converted from polar to Cartesian coordinates and standardized in orientation, resolution and time. A contiguous region is a set of grid cells which are adjacent to each other (touching sides) (Dixon & Wiener, 1993). Since a grid cell has four adjacent grid cells (touching sides), this method will be referred to as the 4-connectivity method throughout the paper (He et al., 2017; Zan et al., 2019). This algorithm has been updated several times and is applied to software supporting multiple datasets, such as the National Centre for Atmospheric Research (Mueller et al., 2003). Later, multiple threshold schemes were used in identification algorithms. Johnson et al. (1998) introduced another well-known algorithm, the Storm Cell Identification and Tracking Algorithm, applied in the National Weather Service of Spain (del Moral et al., 2018), which uses seven thresholds instead of the single threshold in TITAN. Based on TITAN, an Enhanced TITAN (Han et al., 2009) is able to separate individual storms from a cluster of storms through erosion and dilation. The adaptive threshold scheme automatically chooses three different reflectivity thresholds to isolate a storm from nearby storms; examples include TRACE3D (Handwerker, 2002), Thunderstorms Radar Tracking (Hering et al., 2004) and Thunderstorm Observation by Radar (Houston et al., 2015).

Lightning detection systems are employed worldwide, such as the European Cooperation for Lightning Detection network (Schulz et al., 2016), the National Lightning Detection Network (Zhu et al., 2020), the South African Weather Service Lightning Detection Network (Zinner et al., 2013) and GLD360 (Global Lightning Data set 360) (Said et al., 2013). Lightning Location Systems can provide valuable information about lightning activity in real time, including time, coordinates and polarity.

Total lightning data include intra-cloud (IC) pulses and cloud-to-ground (CG) strokes. Typically, the density of CG flashes is used on a yearly basis to evaluate lightning risk in an area (Matsui et al., 2020). Since lightning activity is a characteristic of a thunderstorm (Finke, 1999), algorithms based on lightning data have been derived from methods based on radar data to identify electrically active regions. Meyer et al. (2013a) applied a radar identification method to lightning data and analyzed electrically active regions in thunderclouds. The method of the contiguous region was used to identify both lightning cells and radar cells in northern Italy (Bonelli & Marcacci, 2008). To identify and outline electrically active regions, pixels are extended into circular (Meyer et al., 2013b) or rectangular regions (Wang et al., 2017), and pixels with overlapping regions form a lightning cell. On the basis of statistical analysis of lightning data, monitoring the track of lightning activity can provide warnings that allow power systems to protect important facilities. To better evaluate the lightning performance of transmission lines, Wang et al. presented the Lightning Identification, Tracking, and Analysis algorithm based on the Density-Based Spatial Clustering of Applications with Noise to identify and track lightning clusters, and to classify different risk levels (Bao et al., 2021).

The temporal and spatial parameters of lightning clusters in the identification process mainly come from the parameters of radar data. The volume scan of radar generally takes 5 min (Dixon & Wiener, 1993; Wang et al., 2017), 6 min (Rigo et al., 2010; Wang et al., 2017) or 10 min (Kyznarová & Novák, 2009), so lightning density maps often use the same temporal resolution as radar data (Bao et al., 2021; Wang et al., 2017). Meyer et al. (2013a) chose 2.5 min as the time interval for lightning data, which is half of the volume scanning time. Radar reflectivity is mapped into grid cells with a horizontal resolution (longitude × latitude) of 0.01° × 0.01° (Wang et al., 2017), 0.02° × 0.02° (Rigo et al., 2010) or 0.05° × 0.05° (Dixon & Wiener, 1993; Kyznarová & Novák, 2009). These grid cell sizes are also applied to lightning data (Rigo et al., 2010; Wang et al., 2017).

This paper introduces a new identification algorithm to outline lightning clusters based on the discreteness of total lightning data. Then, we propose a method to quantitatively evaluate the performance of lightning cluster identification algorithms using different temporal and spatial parameters. The area covered by power system equipment is much smaller than the scales used in weather forecasting, and overhead line spans are less than 500 m, except for large crossings in China. According to the evaluation results, appropriate spatio-temporal parameters are selected for power system warning. Section 2 introduces the lightning data, two identification algorithms and an evaluation method. Section 3 presents the evaluation results of the two algorithms and the statistical variations of the space-time parameters of the new method. Section 4 describes a case in which the two methods have been applied to identify lightning clusters, and Section 5 presents conclusions and ideas for future work.

Materials and Methods

Study Region

Foshan is a region in Guangdong province, located in the south of China, covering an area of nearly 3,797.72 km² (Luo et al., 2014) (as shown in Figure 1). It is one of the areas with the most frequent lightning activity in China, with an average of 73.4 thunderstorm days per year (Liu et al., 2014). Lightning activity is mostly concentrated from May to September (Luo et al., 2014).

Figure 1. Topographic map of the Foshan Total Lightning Location System in Guangdong province. The red line indicates the region of Foshan and the black dots represent nine stations.

Lightning Data

Total lightning data, including intra-cloud (IC) pulses and cloud-to-ground (CG) strokes, are derived from the Foshan Total Lightning Location System (FTLLS). IC pulses and CG strokes are called events in the following. FTLLS comprises a network of nine very low frequency/low frequency (VLF/LF, 200 Hz–500 kHz) sensors (Cai et al., 2019), located at Datang station (DTZ), Leping station (LPZ), Lishui station (LSZ), Chancheng station (CCJ), Chencun station (CCZ), Longjiang station (LJZ), Junan station (JAZ), Baini station (BNZ) and Mingcheng station (MCZ), which are installed on top of affiliated buildings of the Foshan Power Company of China Southern Power Grid. The geographical location of the nine stations is shown in Figure 1. The study area includes the entire Foshan region, from latitude 22° to 24°N and longitude 112° to 114°E.

The FTLLS locating method relies on the time-of-arrival method and provides information about each event, including time, latitude, longitude, altitude, type (IC or CG) and polarity. The horizontal and vertical location errors are less than 100 and 200 m, respectively (Cai et al., 2019), and the detection efficiency of flashes is 87.5% (Li et al., 2021).

Identification Methodology

The lightning cluster identification algorithm plays an essential role in lightning monitoring. First, Section 2.3.1 introduces the identification algorithm of the eight connected-component labeling (He et al., 2017) and the extended connected-component labeling. Then, a new identification algorithm based on the random distribution of lightning events is proposed.

Connected-Component Labeling

Connected-component labeling is the basis of extracting image components. Images at set time intervals are converted to binary images using a pre-set threshold. In order to process lightning events from FTLLS, spherical coordinates were mapped and transformed into Cartesian coordinates. The surface of the earth is gridded into latitude and longitude cells, which is similar to radar pixel gridding. The time interval is an integer multiple of 1 min. Intervals that divide 60 min (1 hr) evenly are selected, namely 2, 3, 4, 5, 6, 10 and 12 min. Total lightning events are assigned to grid cells based on their geographical coordinates to produce a map of total lightning density. In a binary image, the basic unit is the pixel, and each pixel has eight adjacent pixels; this is named the eight connected-component labeling or 8-connectivity (Figure 2a). In a total lightning density map, a grid cell with lightning events is regarded as the central grid cell, and its eight adjacent (neighboring and diagonal) grid cells are searched to find out whether they contain lightning events. If the adjacent grid cells contain lightning events, they are defined as new central grid cells and their adjacent grid cells are searched in turn. This process continues until no further grid cells with lightning events can be found. A lightning cluster is then a set of connected grid cells with total lightning events.
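The gridding and neighbor search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(lon, lat)` event format and the grid origin arguments are assumptions made for the example, and the breadth-first search is one common way to realize 8-connected labeling.

```python
import math
from collections import deque

def density_map(events, lon0, lat0, cell_deg=0.01):
    """Count events per grid cell. `events` holds (lon, lat) pairs in degrees;
    (lon0, lat0) is the assumed south-west corner of the grid."""
    counts = {}
    for lon, lat in events:
        cell = (math.floor((lon - lon0) / cell_deg),
                math.floor((lat - lat0) / cell_deg))
        counts[cell] = counts.get(cell, 0) + 1
    return counts

def label_8_connectivity(cells):
    """Group occupied cells into clusters. Cells that touch by a side
    or a corner end up in the same cluster."""
    unvisited = set(cells)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = {seed}
        queue = deque([seed])
        while queue:
            col, row = queue.popleft()
            # Search the 3 x 3 window around the current central cell.
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    neighbor = (col + dc, row + dr)
                    if neighbor in unvisited:
                        unvisited.remove(neighbor)
                        cluster.add(neighbor)
                        queue.append(neighbor)
        clusters.append(cluster)
    return clusters

# Three illustrative events: two in diagonally adjacent cells, one far away.
events = [(112.005, 22.005), (112.015, 22.015), (112.045, 22.005)]
counts = density_map(events, lon0=112.0, lat0=22.0)
clusters = label_8_connectivity(counts)
```

In this toy input the two diagonally adjacent cells form one cluster and the isolated cell forms another.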

Figure 2. Sketch of identification algorithm of lightning cluster. (a) 8-connectivity. (b) 8-connectivity method is extended to 24, 48, 80 and 120-connectivity, corresponding to 5 × 5, 7 × 7, 9 × 9, 11 × 11 arrays respectively. (c) Lightning clusters identified by 8-connectivity, which has no grid gap, are fitted with red line polygon. (d) 24-connectivity with one grid gap. (e) 48-connectivity with two grid gaps. (f) 80-connectivity with three grid gaps. (g) 120-connectivity with four grid gaps. (h–l) Lightning clusters identified by combinatorial 8-connectivity allow 0–4 grid cell gaps between grid cells, respectively. The red lines represent the primary lightning clusters and the azure dotted lines indicate the final lightning clusters.

The 8-connectivity method uses a 3 × 3 array, and grid cells adjacent to each other by sides or corners are grouped into a single cluster. Lightning clusters recognized by 8-connectivity include a high number of small-area clusters. When using 0.01° grid cells (about 1.1 km), the connected-component labeling allows 1–4 grid cell gaps between grid cells based on the results. Other grid cell sizes use the same spatial parameters as the 0.01° grid cell. This means that the 8-connectivity method is extended to 24-, 48-, 80- and 120-connected-component labeling (Figure 2b), corresponding to 5 × 5, 7 × 7, 9 × 9 and 11 × 11 arrays, respectively. The grid cells satisfying 24-, 48-, 80- and 120-connected-component labeling form lightning clusters, as shown in Figures 2c–2g. The process is similar to that of 8-connectivity. The 8-, 24-, 48-, 80- and 120-connected-component labeling schemes are referred to as the connectivity method throughout the paper.
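The relation between the allowed gap and the connectivity number can be checked with a short sketch. A gap of g cells corresponds to a (2g + 3) × (2g + 3) window, which has (2g + 3)² − 1 neighbors. The names `neighborhood_offsets` and `label_with_gap` are hypothetical and the code only illustrates the idea.

```python
def neighborhood_offsets(gap):
    """Offsets of the (2*gap + 3) x (2*gap + 3) window around a cell,
    excluding the central cell itself."""
    reach = gap + 1
    return [(dc, dr)
            for dc in range(-reach, reach + 1)
            for dr in range(-reach, reach + 1)
            if (dc, dr) != (0, 0)]

def label_with_gap(cells, gap=0):
    """Connected-component labeling where up to `gap` empty cells may lie
    between two occupied cells of the same cluster (gap=0 is 8-connectivity)."""
    offsets = neighborhood_offsets(gap)
    unvisited = set(cells)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            col, row = stack.pop()
            for dc, dr in offsets:
                neighbor = (col + dc, row + dr)
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    cluster.add(neighbor)
                    stack.append(neighbor)
        clusters.append(cluster)
    return clusters

# Neighbor counts for gaps 0..4: 8, 24, 48, 80, 120.
connectivity = [len(neighborhood_offsets(g)) for g in range(5)]

# Occupied cells with one and two empty cells between them.
cells = {(0, 0), (2, 0), (5, 0)}
n_clusters = [len(label_with_gap(cells, g)) for g in range(3)]
```

For the three cells in the example, 8-connectivity yields three clusters, one allowed gap yields two, and two allowed gaps yield one.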

Combinatorial 8-Connectivity Method

Based on the results of the 8-connectivity algorithm, the combinatorial 8-connectivity method merges lightning clusters according to the grid gap. First, the 8-connectivity method is used to identify lightning clusters in a total lightning map; these are called the primary lightning clusters (Figures 2h–2l). Then, the combinatorial 8-connectivity method uses multi-connectivity to merge primary lightning clusters. When a grid cell in the left lightning cluster lies within 1–4 grid cell gaps of a grid cell in the right lightning cluster (Figures 2i–2l, respectively), the two primary clusters are merged into a final lightning cluster. For example, there are 1–4 grid cell gaps between the grid cell marked “3” in the left clusters and the grid cells marked “5” in the right clusters, respectively. If no such neighboring cluster exists, the primary cluster is itself a final lightning cluster (Figure 2h). The grid resolution was varied from 0.01° to 0.05°. In order to analyze the size and shape of the lightning cluster region projected on a horizontal plane, this paper uses convex polygons to outline lightning clusters.
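A possible reading of this two-step procedure is sketched below: primary clusters come from 8-connectivity, and any two primary clusters with a pair of cells separated by at most `gap` empty cells are merged using a union-find structure. The function names and the `min_cells` filter are assumptions for illustration. The paper mentions elsewhere that clusters smaller than three grid cells are deleted, but where exactly that filter sits in the pipeline is not stated, so here it is applied to the primary clusters and is off by default.

```python
from itertools import combinations

def label_8(cells):
    """Primary clusters: cells touching by side or corner share a cluster."""
    unvisited, clusters = set(cells), []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            col, row = stack.pop()
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    nb = (col + dc, row + dr)
                    if nb in unvisited:
                        unvisited.remove(nb)
                        cluster.add(nb)
                        stack.append(nb)
        clusters.append(cluster)
    return clusters

def combinatorial_8(cells, gap=2, min_cells=1):
    """Merge primary clusters separated by at most `gap` empty cells.
    `min_cells` (assumed placement) drops small primary clusters first."""
    primary = [c for c in label_8(cells) if len(c) >= min_cells]
    parent = list(range(len(primary)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    reach = gap + 1  # Chebyshev distance allowed between merged cells
    for i, j in combinations(range(len(primary)), 2):
        if any(max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= reach
               for a in primary[i] for b in primary[j]):
            parent[find(i)] = find(j)

    merged = {}
    for i, cluster in enumerate(primary):
        merged.setdefault(find(i), set()).update(cluster)
    return primary, list(merged.values())

# Two three-cell clusters with two empty columns between them, plus one stray cell.
cells = {(0, 0), (1, 0), (1, 1), (4, 0), (4, 1), (5, 1), (9, 9)}
primary, final = combinatorial_8(cells, gap=2, min_cells=3)
```

With `gap=2` the two three-cell clusters merge into one final cluster; with `gap=1` they stay separate. The pairwise cell comparison is quadratic and only meant for small illustrative inputs.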

Identification Evaluation Method

Four criteria are proposed to quantitatively evaluate the two identification algorithms above: the fitting area of lightning clusters, the area proportion of lightning clusters, the number of lightning clusters, and the utilization rate of lightning events. The fitting area of a lightning cluster is the area of the convex polygon fitted to it. The area proportion is the ratio of the total area of the lightning-active grid cells in a cluster to the full area of the fitted polygon. It represents how densely or sparsely the grid cells fill the polygon. For lightning clusters, the higher the value of the area proportion, the better. The area proportion of each lightning cluster is:

$$\text{area proportion} = \frac{N_g S_g}{S_c} \times 100\%$$

where N_g is the number of lightning-active grid cells in a single cluster, S_g is the area of each grid cell, and S_c is the area of the fitted convex polygon. The number of lightning clusters is the number of clusters recognized by the two algorithms at time Δt. The utilization rate of lightning events at Δt is calculated as:

$$\text{utilization rate} = \frac{\sum N_{le}}{T_{le}} \times 100\%$$

where N_le is the number of lightning events in a single cluster, the sum runs over the clusters identified at Δt, and T_le is the total number of lightning events at Δt. A good identification method should have a small fitting area, a high area proportion, a high utilization rate of lightning events, and an appropriate number of lightning clusters.
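The fitting area, area proportion and utilization rate can be computed as in the sketch below. It rests on several assumptions that the text does not spell out: the convex polygon is fitted around the corners of the occupied grid cells (so that a single-cell cluster has an area proportion of 100%), all grid cells share one planar area `cell_area_km2`, and the utilization rate sums the events over all clusters in the interval. All function names are illustrative.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula, in squared grid units."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def fitting_area(cluster, cell_area_km2):
    """Area S_c of the convex polygon around the corners of all cluster cells."""
    corners = [(c + dc, r + dr) for c, r in cluster
               for dc in (0, 1) for dr in (0, 1)]
    return polygon_area(convex_hull(corners)) * cell_area_km2

def area_proportion(cluster, cell_area_km2):
    """N_g * S_g / S_c, in percent."""
    return 100.0 * len(cluster) * cell_area_km2 / fitting_area(cluster, cell_area_km2)

def utilization_rate(events_per_cluster, total_events):
    """Share of all events in the interval that fall inside clusters, in percent."""
    return 100.0 * sum(events_per_cluster) / total_events

# Two diagonally adjacent cells: the hull is a hexagon covering three cell areas.
diagonal = {(0, 0), (1, 1)}
s_c = fitting_area(diagonal, cell_area_km2=1.0)
prop = area_proportion(diagonal, cell_area_km2=1.0)
```

Here the two occupied cells fill two thirds of the fitted hexagon, which shows how diagonal or gapped clusters lower the area proportion.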

Discussion and Results

According to the statistics, there were 77 days on which 50,000 or more total lightning events were detected from May to September 2014. Days with 50,000–100,000, 100,000–200,000 and more than 200,000 events accounted for 51.9%, 28.6% and 19.5% of these days, respectively. To keep the processing load reasonable, three cases were chosen to represent the three types of lightning activity. The test datasets are described in Table 1. The cases of May 15, 16 and 24, 2014 were used as examples to highlight the differences in the identification process.

Table 1 Total Lightning Events Used in This Analysis

Date	Events	Start time (UTC)	End time (UTC)
15 May 2014	149,041	04:00	12:00
16 May 2014	242,673	00:30	13:30
24 May 2014	76,958	04:00	10:30

Variations of Combinatorial 8-Connectivity Method

For lightning clusters, variation in space-time parameters will affect the outcome of the algorithms. In this paper, we vary the temporal and spatial parameters of the algorithms to see how these changes affect the identification results.

Temporal Variations

Although lightning events are recorded in real time, the choice of the time interval can be arbitrary. If the duration of the time interval is short, the area of lightning clusters is small and the lightning cluster is scattered resulting in low tracking efficiency. On the other hand, if the duration of the time interval is too long, the area of the lightning clusters will be large and the tracking cannot provide useful warnings for power systems. To test the sensitivity of the combinatorial 8-connectivity method in the temporal regime, the effect using different time intervals is tested.

The results are displayed in Figure 3. The top plots show the fitting areas of 1-gap, 2-gaps, 3-gaps and 4-gaps as multiples of the 8-connectivity (0-gaps) fitting area under the same grid cell size at a time interval of 2 min, while the middle and bottom plots show how the area proportion and the number of lightning clusters vary, respectively. The lines are the best (least squares) fit to the statistical results. Figures 3a–3e display an upward trend between the fitting area multiple and the time interval, and the fitting area multiple increases with the number of grid cell gaps. The median fitting area is constant at 7.26 km² for 8-connectivity using the 0.01° grid cell at all intervals in Figure 3a. The growth rate of the fitting area decreases by less than 0.61 km²/min when the time interval exceeds 6 min, and the fitting area changes by less than 2.82%. Using grid cells of 0.02°, the growth rate of the fitting area of 48-, 80- and 120-connectivity decreases beyond 6 min, while that of 24-connectivity increases. The time interval of 6 min is a transition point for the fitting area in Figures 3c–3e, because the slope of the majority of the curves increases at a time interval of 6 min. Figures 3f–3j reveal a downward relation between the area proportion and the time interval. The variation in area proportion for grid cells from 0.01° to 0.03° decreases by less than 0.43%/min, 0.81%/min and 0.46%/min, respectively, beyond 6 min. The area proportion of the 0.04° and 0.05° grid cells changes linearly. Figures 3k–3l show increasing relationships between the number of clusters and the time interval. The number of lightning clusters increases by less than 1 when the time interval exceeds 6 min.

Figure 3. Evaluation results of combinatorial 8-connectivity due to the duration of the time interval. The five columns are the evaluation results of 0.01°–0.05° grid cell, respectively. The three rows are the multiple of fitting area, the proportion of area and the number of lightning clusters, respectively.

Figure 4 indicates an increasing relation between the utilization rate and the time interval; the lines are the best (least squares) fit to the statistical results. The utilization rate of lightning events is not strongly affected by the time interval beyond 6 min, changing by less than 2.8%. When the time interval is short, some lightning clusters are deleted because their area is less than three grid cells, which results in a low utilization rate. When the time interval exceeds 6 min, lengthening it mainly increases the density of lightning events in a grid cell. The effect of different time intervals on cell clustering, using grid cells of 0.01° at three different stages of thunderstorm development, is shown in Figure 5; results for other grid cell sizes are not presented here. The area of lightning clusters increases with increasing time interval in the growth stage (Figure 5a) and the decay stage (Figure 5c). On the other hand, in the mature stage the number of lightning clusters is reduced and the area of lightning clusters rises as the time interval increases (Figure 5b).

Figure 4. Effects on utilization rate of lightning events due to the duration of the time interval.

Figure 5. Effects on lightning clusters using 0.01° grid cell due to the duration of the time interval. (a) Growth stage, (b) mature stage and (c) decay stage.

According to the evaluation results above, the choice of a spatial parameter is tied to the scale of the transmission line. Since an overhead line span is about 500 m, a circle whose radius is five spans (2.5 km) has an area of 19.63 km². The median lightning cluster area is close to this value. Considering the stability of the temporal parameters and the scale of power systems, the time interval was set to 6 min in the following analysis.
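The reference area follows from simple geometry, taking the span length as 0.5 km as stated in the text (the symbol names r and A are introduced here only for the calculation):

```latex
r = 5 \times 0.5\ \text{km} = 2.5\ \text{km}, \qquad
A = \pi r^{2} = \pi \times (2.5\ \text{km})^{2} \approx 19.63\ \text{km}^{2}
```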

Spatial Variations

Since lightning data are discrete points, it is vital to find the optimal grid cell size and number of spatial gaps allowed for the algorithms. To test the sensitivity to spatial gaps and grid cell sizes, the combinatorial 8-connectivity method was run with 0–4 grid cell gaps and grid resolutions of 0.01°–0.05°.

The lightning clusters were divided into four groups with different lightning event counts. The counts at 6 min intervals are classified as “low” (1–5 events), “medium” (6–15 events), “high” (16–63 events), and “very high” (64+ events). The results of combinatorial 8-connectivity spatial changes test are described in Figure 6 and the lines are the best (least squares) fit to the statistical results. Figures 6a–6d display that the fitting areas using 0.02°–0.05° grid cell in four groups are multiples of those using grid cells of 0.01° in size under the same grid gaps at the time interval of 6 min, respectively, while Figures 6f–6i show how the proportion rate of fitting area changes. The last column describes the variation of the two evaluation indicators in the three cases in Figures 6e and 6j.
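The four event-count groups can be written as a small helper. The thresholds come from the text; the function name is illustrative.

```python
def event_count_class(n_events):
    """Classify a cluster by its number of events in one 6 min interval."""
    if n_events < 1:
        raise ValueError("a lightning cluster contains at least one event")
    if n_events <= 5:
        return "low"
    if n_events <= 15:
        return "medium"
    if n_events <= 63:
        return "high"
    return "very high"

labels = [event_count_class(n) for n in (1, 5, 6, 15, 16, 63, 64)]
```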

Figure 6. Effects of allowing spatial grid cell gaps between grid cells in order to cluster lightning events into cells. The five columns are “low,” “medium,” “high,” “very high” and “all data,” respectively. The first row is that the fitting areas using 0.02°–0.05° grid cell in four groups are multiples of those using grid cells of 0.01° in size, and the second row shows the proportion of area.

It is found that the fitting areas of the 0.02°–0.05° grid cells remain constant at around 4, 9, 16 and 25 times those of the 0.01° grid cell, respectively, in the “low” lightning event case (Figure 6a), because these clusters form the final lightning clusters directly without a second merger. The area of lightning clusters mainly depends on the grid cell size. For example, the lightning area of the 0.05° grid cell is 25 times that of the 0.01° grid cell. Compared with the “low” case, the grid cell size also influences the area of the other three groups of lightning clusters, but the fitting area multiple decreases under the same grid cell size. The fitting area multiple of the 0.05° grid cell is reduced from 19.44 times in the medium case to 10.26 times in the very high case (Figures 6b–6d). The above classification is based on the number of lightning events rather than the number of clusters. When using small grid cells and short grid gaps, there are a lot of lightning clusters in the “low” case and very few clusters in the very high case. For example, for the 0.01° grid cell and 0 grid gap, the median of the four groups is higher than that of each of the lightning clusters divided into four equal parts, respectively. On the other hand, the median of the four groups is lower than that of each of the lightning clusters for the 0.05° grid cell and 4 grid gaps. That is the reason why the multiple of fitting area shows mainly an upward trend in Figures 6a–6d. As to “All data,” it illustrates a downward trend between the fitting area multiple and the number of grid cell gaps, and the fitting area multiple is stable around the square of the grid cell size relative to 0.01°, corresponding to about 4, 9, 16 and 25 times, respectively (Figure 6e). In order to outline the area of lightning clusters accurately, the grid cell size was set to 0.01°. The effect of grid cell size on lightning clusters at three stages, using 2 gaps at a 6 min interval, is shown in Figure 7. The larger the grid cell, the larger the area of the lightning cluster (Figure 7a). There are 11 lightning clusters in the mature stage and 3 lightning clusters in the decay stage using the 0.01° grid cell, while there is only 1 lightning cluster using the 0.05° grid cell (Figures 7b and 7c).

Figure 7. Effects on lightning clusters due to grid cell size using 2 gaps at 6 min interval. (a) Growth stage, (b) mature stage and (c) decay stage.

The bottom graphs of Figure 6 show downward trends between the area proportion and the grid cell gaps. The clusters of the “low” lightning event case are merged only once, so the area proportion is 100% in Figure 6f. The area proportion changes by 3.04% and 4.35% from 2 gaps to 4 gaps using the 0.01° grid cell in “high” and “very high” (Figures 6h and 6i), respectively. The area proportion almost remains the same in “medium” using the 0.01° grid cell and 2 gaps (Figure 6g). Based on the best fit to the statistical results, the spatial gap has a limited impact on the number of clusters beyond 2 gaps, where the number of clusters decreases by less than two (Figure 8). Because the variation in the number and area proportion of lightning clusters is limited and the median area is appropriate, the spatial parameter is set to 2 gaps. The effect of the choice of spatial gaps on lightning clusters during three stages, using the 0.01° grid cell at a 6 min interval, is presented in Figure 9. The size of lightning clusters is mostly the same except in the case of 0-gaps in Figures 9a and 9c. There are 19, 13, 10, 7 and 5 lightning clusters identified with 0–4 gaps of the combinatorial 8-connectivity method in Figure 9b, respectively. The spatial parameter mainly reduces the number of clusters and increases the area of clusters in the mature stage.

Figure 8. Effects allowing spatial gaps between grid cells in the number of lightning clusters.

Figure 9. Effects on lightning clusters due to spatial gap using 0.01° grid cell at 6 min interval. (a) Growth stage, (b) mature stage and (c) decay stage.

Comparison of Connectivity Method and Combinatorial 8-Connectivity Method

The four criteria described in Section 2 were also used to evaluate the X-connectivity method (X = 8, 24, 48, 80 and 120). Results based on the best (least squares) fit to the data are shown in Figure 10. According to the analysis above, the spatial resolution was set to 0.01° × 0.01°, and Figure 10 illustrates the results for time intervals ranging from 2 min to 12 min. For 8-connectivity, the trend of the fitting area with the time interval becomes flat, while the area of lightning clusters goes up with the time interval in the other four connectivity methods (Figure 10a). The area proportion decreases only very slightly for a fixed connectivity number and varying time interval, while it decreases dramatically for a fixed time interval and varying connectivity number (Figure 10b). The number of lightning clusters rises with the time interval, but falls as the connectivity number changes from 8 to 120 (Figure 10c). The utilization rate of lightning events grows with the time interval for 8-connectivity, while it increases only slightly with the time interval in the other four connectivity methods (Figure 10d). The fitting area and utilization rate of clusters rise as the connectivity number varies from 8 to 120 (Figures 10a and 10d).

Figure 10. Evaluation results of connectivity algorithm. (a) The area, (b) the area proportion, (c) the number of lightning clusters and (d) the utilization rate of lightning event.

The evaluation results of the combinatorial 8-connectivity method with 1-gap, 2-gaps, 3-gaps and 4-gaps are compared to those of the 24-, 48-, 80- and 120-connectivity method, respectively. The 8-connectivity is the combinatorial 8-connectivity with 0-gaps, and the median area of lightning clusters is 7.26 km² in Figure 10a. According to the multiple of the 8-connectivity (0-gaps) fitting area in Figure 3a, the fitting area of 1-gap, 2-gaps, 3-gaps and 4-gaps for any grid cell size and time interval can be calculated. Compared with the connectivity method of the same spatial gap, the combinatorial 8-connectivity reduces the fitting area, and the area difference ranges from 4.84 km² between 24-connectivity and 1-gap at the time interval of 2 min to 33.88 km² between 120-connectivity and 4-gaps at the time interval of 10 min. The area proportion of the combinatorial 8-connectivity method is also improved, with the improvement ranging from 21.43% between 1-gap and 24-connectivity at the time interval of 10 min to 38.10% between 4-gaps and 120-connectivity at the time interval of 2 min (Figure 10b and Figure 3b). The number of clusters of the combinatorial 8-connectivity method increases by 1–4 (Figure 10c and Figure 3c). The lightning event utilization rate of the combinatorial 8-connectivity is consistent with that of the 8-connectivity, but is lower than that of the 24-, 48-, 80- and 120-connectivity by between 4.47% and 15.64%.

Relative to the 48-connectivity method, the combinatorial 8-connectivity method using 2 gaps at 6 min reduces the area by 10.89 km² and raises the area proportion from 39.35% to 67.80%, but lowers the utilization rate by 7.39% (Figure 10 and Figure 3). The number of lightning clusters differs by only one. Compared with the 8-connectivity method at an interval of 6 min, the combinatorial 8-connectivity method using 2 gaps increases the area by 9.68 km², and decreases the number of clusters by 10 and the area proportion by 8.67%. The combinatorial 8-connectivity method thus optimizes the fitting area, area proportion and number of lightning clusters at the expense of part of the utilization rate of lightning events.

Analysis of Case Studies

The total lightning data for the case study were collected by the FTLLS; the lightning event occurred between 0400 and 1200 UTC on 15 May 2014. The horizontal spatial resolution of the geographic grid cell is 0.01° × 0.01° and the time interval is set to 6 min. The lightning activity began in the southwest of Foshan and gradually moved to the northeast.

Each row of Figure 11 shows the lightning clusters identified by 8-connectivity, 48-connectivity and combinatorial 8-connectivity (2-gaps), respectively. The three moments (0454, 0700 and 1124 UTC on 15 May 2014) represent the growth, mature and decay stages of the lightning activity. The lightning cluster labeled “1” by the other two methods is identified by 8-connectivity as two separate clusters, “1” and “4,” in Figure 11a. This indicates that combinatorial 8-connectivity (2-gaps) merges clusters reasonably. Clusters “2” and “3” identified by combinatorial 8-connectivity (2-gaps) in Figure 11g, which are the same as those of 8-connectivity in Figure 11a, are smaller in area than those of 48-connectivity in the growth stage in Figure 11d. Combinatorial 8-connectivity (2-gaps) therefore helps to identify small, early lightning clusters and improves the area proportion. The same behavior can be seen in the decay stage in the third column. The cluster marked “7” in Figure 11i is divided into two clusters, “12” and “13,” in Figure 11c, while clusters “8” and “9” in Figure 11i are the same as clusters “14” and “15” in Figure 11c, respectively. Compared with 8-connectivity in Figure 11b, some small lightning clusters are merged by combinatorial 8-connectivity (2-gaps), which reduces the number of clusters from 19 to 10 in Figure 11h. For example, clusters marked “5,” “6” and “7” in Figure 11b are merged into the cluster named “4” in Figure 11h. It also separates cluster “4” of 48-connectivity in Figure 11e into three clusters marked “4,” “5” and “6” in Figure 11h. Although the number of clusters increases, combinatorial 8-connectivity (2-gaps) improves the area proportion and the area. Table 2 lists the median values of the identification evaluation for connectivity and combinatorial 8-connectivity (2-gaps).

Figure 11. Identification results of different methods on 15 May 2014. (a–c) 8-connectivity, (d–f) 48-connectivity and (g–i) combinatorial 8-connectivity (2-gaps). The red curves represent the boundary of the identified lightning clusters. The three columns are growth stage, mature stage and decay stage of lightning activity, respectively.

Table 2 Identification Evaluation of Connectivity and Combinatorial 8-Connectivity (2-Gaps)

Method	Grid cell size	Time interval (min)	Number of adjacencies	Time (UTC)	Area (km²)	Area proportion (%)	Number of clusters	Utilization rate (%)
Connectivity	0.01°	6	8	0454	6.05	81.82	11	72.46
				0700	10.89	79.10	25	86.84
				1124	36.91	67.72	8	94.21
				total	8.47	75.63	20.5	92.71
	0.01°	6	48	0454	22.99	42.11	9	84.06
				0700	58.08	42.70	14	98.94
				1124	71.99	51.64	6	96.85
				total	38.72	42.86	13	96.87
Combinatorial 8-connectivity	0.01°	6	2-gaps	0454	12.71	75.74	8	72.46
				0700	41.14	65.85	15	96.84
				1124	38.72	66.67	7	94.21
				total	21.78	67.80	13	92.71
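To make the comparison in Table 2 easier to read, the short sketch below computes how the combinatorial 8-connectivity (2-gaps) medians differ from those of the two connectivity methods. It uses only the "total" rows of the table, assuming these are the median values over the whole event; the dictionary keys and the function name are hypothetical.

```python
# "total" rows of Table 2 (assumed to be the medians over the whole event).
TABLE_2_TOTALS = {
    "8-connectivity": {
        "area_km2": 8.47, "area_proportion_pct": 75.63,
        "clusters": 20.5, "utilization_pct": 92.71,
    },
    "48-connectivity": {
        "area_km2": 38.72, "area_proportion_pct": 42.86,
        "clusters": 13, "utilization_pct": 96.87,
    },
    "combinatorial (2-gaps)": {
        "area_km2": 21.78, "area_proportion_pct": 67.80,
        "clusters": 13, "utilization_pct": 92.71,
    },
}


def difference(method, baseline, table=TABLE_2_TOTALS):
    """Return method minus baseline for each criterion, rounded to 2 decimals."""
    return {
        key: round(table[method][key] - table[baseline][key], 2)
        for key in table[method]
    }


for baseline in ("8-connectivity", "48-connectivity"):
    print(baseline, difference("combinatorial (2-gaps)", baseline))
```

For this case, the combinatorial method has a median area 16.94 km² smaller and an area proportion 24.94 percentage points higher than 48-connectivity, while its utilization rate matches that of 8-connectivity.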

Conclusions

This paper proposes a new lightning identification algorithm based on total lightning data at set time intervals. Considering the discreteness of lightning data, the combinatorial 8-connectivity method allows gaps between grid cells when dividing lightning density maps into individual clusters.

An identification evaluation method is introduced to quantitatively evaluate the performance of identification algorithms. The four criteria are the fitting area of lightning clusters, the area proportion of lightning clusters, the number of lightning clusters, and the utilization rate of lightning events. Compared with the connectivity method, the combinatorial 8-connectivity method effectively reduces the number of lightning clusters and their corresponding area, and increases the area proportion, through two-level merging. The lightning event utilization rate of the combinatorial 8-connectivity method is consistent with that of the 8-connectivity method. The lightning activity on 15 May 2014 illustrates these advantages of the combinatorial 8-connectivity method.

This study presents the temporal and spatial variations of the combinatorial 8-connectivity method. The multiple of the fitting area remains close to the square of the multiple of the grid cell size, and a 0.01° grid cell is applied to outline clusters. A 6 min interval is used with the 0.01° grid cell because, beyond 6 min, the slopes of the four criteria decrease and the criteria change by less than 0.61 km²/min in area, less than 0.43%/min in area proportion, less than 1/min in number and less than 2.8% in utilization rate. The results remain stable from 2 gaps to 4 gaps using 0.01° × 0.01° grid cells. The lightning event utilization rate of the combinatorial 8-connectivity method is consistent with that of the 8-connectivity method, while it has the advantage of reducing the number of clusters by 10 and raising the area from 7.26 to 16.94 km², although the area proportion is reduced by 8.67%.

A transition point appears consistently around the time interval of 6 min in all statistical plots, where the growth rate changes abruptly or the effect of time interval extension on the results is small. The same trend can be seen in spatial variation and a turning point is found in the combinatorial 8-connectivity method (2-gaps). The multiple of fitting area is the square of the multiple of grid cell size.
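The square relationship between grid cell size and fitting area can be checked with a few lines of arithmetic. This is only a sketch of the stated scaling rule, with 0.01° as the reference cell size; the function name is hypothetical.

```python
def area_multiple(cell_size_deg, base_size_deg=0.01):
    """Expected fitting-area multiple relative to the base grid cell size.

    Implements the stated rule: area multiple = (cell size multiple) ** 2.
    The result is rounded to suppress floating-point noise.
    """
    ratio = cell_size_deg / base_size_deg
    return round(ratio ** 2, 6)


for size in (0.02, 0.03, 0.04, 0.05):
    print(f"{size:.2f} deg -> {area_multiple(size)} x the 0.01 deg fitting area")
```

The four grid cell sizes give multiples of 4, 9, 16 and 25, the values around which the paper reports the fitting area to remain.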

It will be vital to optimize the spatiotemporal parameters on a wider database. The fixed thresholding technique used by the combinatorial 8-connectivity method may reduce reliability in the identification stage. The algorithm can likely be improved in the future by using a multiple or adaptive threshold scheme. Furthermore, we hope to combine radar data with total lightning data in future work (Bonelli & Marcacci, 2008; Meyer et al., 2013b; Rigo et al., 2010), because radar products such as radar reflectivity can further improve knowledge of thunderstorm stage and development, including the identification of thunderstorm initiation, vertical development and dissipation.

Acknowledgments

The work was supported by the Fundamental Research Funds for the Central Universities (Grant No. 2042022kf0016), the National Natural Science Foundation of China (Grant No. 52177154 and Grant No. 51807144) and China Scholarship Council. Sincere thanks are given to Dr. Dieter R. Poelman and the Royal Meteorological Institute of Belgium. The authors have benefited greatly from his support and many discussions with him.

Conflict of Interest

The authors declare no conflicts of interest relevant to this study.

Data Availability Statement

All data reported in this manuscript are available from the repositories at https://doi.org/10.4121/17049623 and http://doi.org/10.5281/zenodo.5728427.


© 2022. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

This paper introduces a new thunderstorm identification algorithm, the combinatorial 8-connectivity method, based on total lightning activity observed by the Foshan total lightning location system. The influence of spatio-temporal parameters on the outcome is presented. The evaluation method contains four criteria: the fitting area of lightning clusters, the area proportion of lightning clusters, the number of lightning clusters and the utilization rate of lightning events. It is found that the fitting areas obtained with grid cells of 0.02°, 0.03°, 0.04° and 0.05° remain close to 4, 9, 16 and 25 times those obtained with 0.01° grid cells, respectively. With 0.01° grid cells, the evaluation criteria change by less than 0.61 km²/min in area, less than 0.43%/min in area proportion, less than 1/min in number and less than 2.8% in utilization rate when the time interval exceeds 6 min and the spatial gap between grid cells with lightning events is more than 2 gaps. The evaluation results of the combinatorial 8-connectivity method are then compared with those of the connected-component labeling method (8-, 24-, 48-, 80- and 120-connectivity), and the combinatorial 8-connectivity method is found to outline lightning clusters better. Both methods are applied to a case study on 15 May 2014.

Details

Title

A New Thunderstorm Identification Algorithm Based on Total Lightning Activity

Author

Huang, Yijun¹; Fan, Yadong¹; Cai, Li¹; Cheng, Si¹; Wang, Jianguo¹

¹ School of Electrical Engineering and Automation, Wuhan University, Wuhan, China

Section

Research Article

Publication year

2022

Publication date

Apr 2022

Publisher

John Wiley & Sons, Inc.

e-ISSN

2333-5084

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1029/2021EA002079

ProQuest document ID

2655591024
