Based on PCA and SSA-LightGBM oil-immersed transformer fault diagnosis method

1 Introduction

The oil-immersed transformer is one of the most important pieces of equipment in a power system, and its operating status determines whether the power system can operate safely and reliably. However, under the combined influence of electrical, thermal, and mechanical stress, the transformer oil and organic insulating materials gradually age and decompose, producing carbon oxides and small amounts of alkane gases, and their physical properties gradually decline. It is therefore critical to diagnose transformer fault types accurately by collecting monitoring data on the gases dissolved in transformer oil and combining them with an appropriate fault diagnosis method. Because there is no clear mapping between transformer faults and the gas content of the oil, extracting transformer fault features remains the main problem of Dissolved Gas Analysis (DGA) technology. Although the IEC three-ratio method (a modification of the Rogers four-ratio method by the International Electrotechnical Commission (IEC)), the Rogers method, and the improved three-ratio method are established, their fault boundaries are too absolute and their coding is incomplete, so the accuracy of the resulting transformer fault diagnoses is not high [1, 2].

With the rapid development of artificial intelligence research, intelligent algorithms have been introduced into transformer fault diagnosis. Scholars at home and abroad have combined DGA with support vector machines (SVM), convolutional neural networks (CNN) [3], Self-Organizing Map (SOM) neural networks [4], fuzzy theory, and other methods to improve the accuracy of fault diagnosis. References [5–7] use a variety of intelligent algorithms to optimize the penalty factor C and the kernel function parameter g of the SVM, which improves its classification accuracy. However, the SVM was proposed for binary classification and has shortcomings in multi-class problems. References [8–10] apply convolutional neural networks to transformer fault diagnosis and effectively diagnose internal faults of power transformers, but the networks easily fall into local minima and converge slowly. References [11–14] introduce fuzzy theory into transformer fault diagnosis and state assessment to handle the ambiguity and uncertainty among fault causes, operating states, and fault mechanisms, which improves diagnostic accuracy. However, expert experience is processed by means of membership functions, which introduces the problem of "cognitive uncertainty". In recent years, with the continuous development of ensemble learning, many scholars have also applied it to transformer fault diagnosis. Reference [15] screened fault-related features from the original DGA gas content features and ratio features and optimized the LightGBM model parameters, raising the fault diagnosis accuracy from 88.02% to 90.14%. Reference [16] constructed 16 gas volume ratios as the input fault features of a LightGBM model and used grid search to optimize the learning rate, tree depth, number of leaf nodes, and number of iterations, reaching a final classification accuracy of 93.39%. Reference [17] used LightGBM to build a 10 kV feeder fault prediction model for a distribution network; after parameter optimization, the fault prediction accuracy for meteorological, equipment, and operating factors all exceeded 90%.

The review above leads to two conclusions: traditional methods for analyzing dissolved gases in oil have limitations, and the accuracy of fault diagnosis models can be improved further. To address the first issue, this article adopts the uncoded ratio method when constructing features. Like the three-ratio method and the Rogers ratio method, the uncoded ratio method uses concentration ratios or component proportions of the gases dissolved in oil as feature parameters. However, it constructs features of higher dimensionality that cover more effective information than traditional ratio methods, so it has stronger potential for fault feature mining and can further improve the diagnostic performance of the model. To address the second issue, this article adopts a light gradient boosting machine (LightGBM) model optimized by the sparrow search algorithm (SSA) as the transformer fault diagnosis model, which achieves a fault diagnosis accuracy of 93.6%. Compared with similar models, the diagnostic accuracy of this method is higher by 8.1 and 5.7 percentage points, respectively. In summary, this article proposes a fault diagnosis method for oil-immersed transformers based on principal component analysis (PCA) and LightGBM optimized by the sparrow search algorithm (SSA-LightGBM). First, the relationships between the characteristic gases generated during a fault are explored, and a 17-dimensional joint feature is constructed using the uncoded ratio method. Second, the Z-score method is used for standardization, and PCA is used to construct fused features and eliminate redundancy among the variables. Finally, a fault diagnosis model for oil-immersed transformers based on SSA-LightGBM is constructed using ten-fold cross-validation.

2 Data preprocessing

2.1 Preprocessing

The gases dissolved in the oil are mainly H₂, C₂H₆, CH₄, C₂H₂, and C₂H₄. When a fault occurs, the transformer fault type is determined from the ratios between these characteristic gases. Widely used methods include the Three-Ratio Method [18], the CUSUM Method [19], and the Rogers Ratio Method [20]. Although these methods are simple to apply, they suffer from incomplete coding and overly absolute fault boundaries [21]. The uncoded ratio method is similar to the Three-Ratio Method and the Rogers Ratio Method in that it uses concentration ratios or component proportions of the dissolved gases as feature parameters, but it constructs features in more dimensions. Compared with traditional ratio methods, it covers more effective information, has stronger potential for fault feature mining, and can further improve the fault diagnosis performance of the model. A large number of simulation experiments show that:

1. Partial discharge produces no C₂H₂ and a relatively large amount of CH₄;

2. Arc discharge in oil mainly produces H₂ and C₂H₂, with small amounts of CH₄ and C₂H₄;

3. Under overheating, H₂ and the total hydrocarbons of the transformer insulating oil (C₂H₄, C₂H₆, and CH₄) increase significantly.

Based on the above rules, this paper adopts the uncoded ratio method to construct 17 kinds of characteristic parameters [22], whose characteristic quantities are described in Table 1.

[Figure omitted. See PDF.]

Following DL/T 722–2014, "Guidelines for Analysis and Judgment of Dissolved Gas in Transformer Oil", the data labels are divided into seven classes: normal state, medium and low temperature thermal fault, medium temperature thermal fault, high temperature thermal fault, high energy discharge, low energy discharge, and partial discharge. In this paper, 513 cases of fault gas data are selected to construct a joint feature matrix in which each row represents a fault sample, as shown in Formula (1):

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,17} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,17} \\ \vdots & \vdots & \ddots & \vdots \\ x_{513,1} & x_{513,2} & \cdots & x_{513,17} \end{bmatrix} \tag{1}$$
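
As a rough illustration of how ratio and proportion features can be built from raw gas concentrations, the sketch below computes the three classic IEC ratios plus the share of each gas in the five-gas total. The 17 features of Table 1 are not reproduced in this text, so this feature set, the function name `ratio_features`, and the sample values are assumptions made only for illustration.

```python
GASES = ("H2", "CH4", "C2H6", "C2H4", "C2H2")

def ratio_features(sample, eps=1e-9):
    """Build illustrative ratio features from one DGA sample (gas -> concentration).

    The feature set is hypothetical; it is not the paper's 17-feature list.
    """
    total = sum(sample[g] for g in GASES) + eps
    feats = {
        # The three classic IEC gas ratios.
        "C2H2/C2H4": sample["C2H2"] / (sample["C2H4"] + eps),
        "CH4/H2": sample["CH4"] / (sample["H2"] + eps),
        "C2H4/C2H6": sample["C2H4"] / (sample["C2H6"] + eps),
    }
    # Component proportions: share of each gas in the five-gas total.
    for g in GASES:
        feats[g + "/total"] = sample[g] / total
    return feats

# Made-up concentrations, used only to exercise the function.
sample = {"H2": 100.0, "CH4": 50.0, "C2H6": 20.0, "C2H4": 25.0, "C2H2": 5.0}
features = ratio_features(sample)
```

The small `eps` keeps a ratio finite when a gas is absent, which matters because some fault types produce no C₂H₂ at all.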

2.2 Z-score normalization

To eliminate the significant differences in dissolved gas values among different types of oil, the data must be normalized or standardized. Standardization speeds up gradient descent and makes the data distribution more uniform, avoiding the bias and instability caused by large numerical differences between features. After standardization, all features have similar value ranges, which helps the model converge faster and helps to avoid overfitting. Therefore, this article adopts the Z-score standardization method [23, 24].

The data are standardized as shown in Formula (2):

$$x^{*}_{ij} = \frac{x_{ij} - \mu}{\sigma} \tag{2}$$

where $x^{*}_{ij}$ is the joint feature after standardization; $\mu$ is the feature mean; $\sigma$ is the feature standard deviation; and $x_{ij}$ is the original feature. After the transformation, each feature has a mean of 0 and a standard deviation of 1, so the central tendency and dispersion of the features are consistent and the differences in scale between features are eliminated. The standardized joint features of the transformer fault samples are shown in Formula (3):

$$X^{*} = \begin{bmatrix} x^{*}_{1,1} & x^{*}_{1,2} & \cdots & x^{*}_{1,17} \\ x^{*}_{2,1} & x^{*}_{2,2} & \cdots & x^{*}_{2,17} \\ \vdots & \vdots & \ddots & \vdots \\ x^{*}_{513,1} & x^{*}_{513,2} & \cdots & x^{*}_{513,17} \end{bmatrix} \tag{3}$$
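
The Z-score step can be sketched in a few lines of NumPy. The example assumes the mean and standard deviation are computed per feature (per column), and it uses a synthetic 513 × 17 matrix as a stand-in for the real DGA data; the function names are illustrative.

```python
import numpy as np

def zscore_fit(X):
    """Return the per-feature mean and standard deviation of a training matrix."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard against constant features
    return mu, sigma

def zscore_apply(X, mu, sigma):
    """Standardize X with previously fitted statistics."""
    return (X - mu) / sigma

# Synthetic stand-in for the 513 x 17 joint feature matrix (not real DGA data).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=2.0, sigma=1.0, size=(513, 17))
mu, sigma = zscore_fit(X)
X_std = zscore_apply(X, mu, sigma)
```

Keeping `mu` and `sigma` separate from the transform makes it possible to standardize a newly collected sample with the statistics of the training data rather than its own.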

3 PCA-based fusion feature matrix construction

Because the uncoded ratio method constructs high-dimensional feature data from dissolved gas concentration ratios or component proportions, some ratio features may be linearly correlated with the original features, resulting in information redundancy. In addition, the high dimensionality of the data increases the computational load of diagnosis considerably, which lengthens model running time and can seriously affect the fault diagnosis results. Feature fusion methods are therefore needed to reduce the dimensionality of the data. Commonly used methods include Principal Component Analysis (PCA), scaled dot-product attention [25], and the self-attention mechanism [26].

Principal component analysis is a statistical data analysis method. Its idea is to map high-dimensional data onto a new feature subspace whose axes are linearly uncorrelated, eliminating the correlation between variables, reducing the data dimension, and simplifying complex data [27].

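Step 1: Find the covariance matrix R of the standardized features.

$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix} \tag{4}$$

where $r_{ij}$ is the correlation coefficient between features i and j, and n is the number of features.

Step 2: Calculate the eigenvalues and eigenvectors of the covariance matrix R.

Eigenvalues $y_j$:

$$|R - yI| = 0, \quad y_1 \ge y_2 \ge \cdots \ge y_n \ge 0 \tag{5}$$

Eigenvectors:

$$R\,e_j = y_j e_j, \quad e_j = [e_{j1}, e_{j2}, \ldots, e_{jn}]^T \tag{6}$$

The largest eigenvalue and its eigenvector give the variance and direction of the first principal component, and so on down to the smallest eigenvalue and its eigenvector, which give the variance and direction of the last principal component.

Step 3: Calculate the cumulative contribution.

$$\eta_m = \frac{\sum_{j=1}^{m} y_j}{\sum_{j=1}^{n} y_j} \tag{7}$$

The number of principal components is selected from the cumulative variance contribution $\eta_m$: when it reaches a preset threshold, the first m principal components are retained.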

In this paper, PCA is used to reduce the dimensionality of the 17-dimensional joint features constructed by the uncoded ratio method. Following the cumulative contribution rate criterion, the first 7 principal components account for 95% of the variance, as shown in Fig 1, so they are selected as the fused feature matrix for fault diagnosis, denoted as:

$$X_{PCA} = X^{*}\,[e_1, e_2, \ldots, e_7] \tag{8}$$

where $X^{*}$ is the standardized joint feature matrix and $e_j$ is the eigenvector belonging to the j-th largest eigenvalue.

[Figure omitted. See PDF.]

For a newly collected sample, a 17-dimensional joint feature vector $x_{new} = [x_{new1}, x_{new2}, \ldots, x_{new17}]$ is constructed in the same way as the training features, standardized as in Formula (2), and then projected onto the principal axes to obtain the fused feature $X_{new}$:

$$X_{new} = x_{new}\,[e_1, e_2, \ldots, e_7] \tag{9}$$

where $e_j$ is the eigenvector belonging to the j-th largest eigenvalue. A two-dimensional scatter plot of different principal components of the fused features is presented for verification. Fig 2 shows that the fused features cluster well.

[Figure omitted. See PDF.]
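
The PCA steps of this section (covariance matrix, eigen-decomposition, cumulative contribution, projection of a new sample) can be sketched with NumPy as follows. The data are synthetic and correlated by construction, so the number of retained components will generally differ from the 7 reported for the real DGA data; the 0.95 threshold follows the text, and the function and variable names are illustrative.

```python
import numpy as np

def pca_fit(X_std, threshold=0.95):
    """Eigen-decompose the covariance of standardized data and keep the first
    m components whose cumulative contribution reaches the threshold."""
    R = np.cov(X_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum = np.cumsum(eigvals) / eigvals.sum()   # cumulative contribution
    m = int(np.searchsorted(cum, threshold) + 1)
    return eigvecs[:, :m], cum, m

# Synthetic correlated data: 17 features driven by 4 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(513, 4))
mixing = rng.normal(size=(4, 17))
X = latent @ mixing + 0.1 * rng.normal(size=(513, 17))

mu, sigma = X.mean(axis=0), X.std(axis=0)
X_std = (X - mu) / sigma
P, cum, m = pca_fit(X_std)
X_fused = X_std @ P                            # fused feature matrix

# A new sample is standardized with the training statistics, then projected.
x_new = X[0]
x_new_fused = ((x_new - mu) / sigma) @ P
```

Projecting a new sample reuses `mu`, `sigma`, and `P` from the training data, so training and new samples live in the same fused feature space.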

4 Construction of a fault diagnosis model based on SSA-LightGBM

4.1 LightGBM algorithm

Light Gradient Boosting Machine (LightGBM) is a decision-tree-based boosting algorithm proposed by Microsoft. It is used for regression, fault monitoring, classification, and feature filtering, and offers fast training, low memory consumption, and high accuracy.

The main idea of LightGBM is to improve the accuracy of the classifier through multiple iterations of decision trees. Assuming that the learner obtained in the previous round is $F_{t-1}(x)$ and the loss function is $L(y, F_{t-1}(x))$, the objective of the current round is to find a weak classifier $h_t(x)$ that minimizes the loss of the current round:

$$h_t(x) = \arg\min_{h \in H} \sum_{i} L\big(y_i, F_{t-1}(x_i) + h(x_i)\big) \tag{10}$$

where i indexes the training samples and H is the set of candidate decision trees. The loss of the current round is then approximated through its negative gradient:

$$r_{ti} = -\left[\frac{\partial L\big(y_i, F(x_i)\big)}{\partial F(x_i)}\right]_{F(x) = F_{t-1}(x)} \tag{11}$$

$h_t(x)$ is fitted to the negative gradient by least squares:

$$h_t(x) = \arg\min_{h \in H} \sum_{i} \big(r_{ti} - h(x_i)\big)^2 \tag{12}$$

The strong learner obtained in this round is:

$$F_t(x) = F_{t-1}(x) + h_t(x) \tag{13}$$
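
The boosting loop above can be made concrete with a toy regression example. This is a generic gradient boosting sketch with one-split stumps and squared loss, not LightGBM's own implementation (which adds histogram binning, leaf-wise tree growth, and other optimizations); the shrinkage factor `lr` and the sine-curve data are assumptions added for illustration.

```python
import numpy as np

def fit_stump(x, r):
    """Fit a one-split regression stump to the residuals r by least squares."""
    best = None
    for thr in np.unique(x)[:-1]:              # keep both sides non-empty
        left, right = r[x <= thr], r[x > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    _, thr, left_value, right_value = best
    return lambda z: np.where(z <= thr, left_value, right_value)

def boost(x, y, rounds=20, lr=0.5):
    """Gradient boosting with squared loss L = 0.5 * (y - F)^2."""
    F = np.full_like(y, y.mean())              # F_0: constant model
    losses = []
    for _ in range(rounds):
        r = y - F                              # negative gradient of the loss
        h = fit_stump(x, r)                    # least-squares fit to r
        F = F + lr * h(x)                      # F_t = F_{t-1} + lr * h_t
        losses.append(0.5 * np.mean((y - F) ** 2))
    return F, losses

x = np.linspace(0.0, 6.0, 200)
y = np.sin(x)
F, losses = boost(x, y)
```

For squared loss the negative gradient is simply the residual `y - F`, which is why each round fits the weak learner to what the current model still gets wrong.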

4.2 SSA algorithm

The sparrow search algorithm (SSA) is a swarm intelligence algorithm proposed in 2020 that simulates the foraging and anti-predation behavior of sparrows by repeatedly updating the positions of individuals.

Initialize the sparrow population and fitness values. Suppose there are n sparrows in the population and the search space has d dimensions. The population can be expressed as $X = [x_1, x_2, \ldots, x_n]^T$ and the fitness values of the sparrows as $F = [f(x_1), f(x_2), \ldots, f(x_n)]^T$; written out in full:

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix} \tag{14}$$

$$F = \begin{bmatrix} f([x_{1,1}, x_{1,2}, \ldots, x_{1,d}]) \\ f([x_{2,1}, x_{2,2}, \ldots, x_{2,d}]) \\ \vdots \\ f([x_{n,1}, x_{n,2}, \ldots, x_{n,d}]) \end{bmatrix} \tag{15}$$

Discoverers account for 10%–20% of the entire sparrow population, and discoverers with better fitness values obtain resources first during the search. Their location update is shown in Formula (16):

$$x_{i,j}^{t+1} = \begin{cases} x_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST \\ x_{i,j}^{t} + Q \cdot L, & R_2 \ge ST \end{cases} \tag{16}$$

In the formula, t is the current iteration; $x_{i,j}^{t+1}$ is the position of the i-th sparrow in the j-th dimension at iteration t+1; α is a uniform random number in (0,1]; $iter_{max}$ is the maximum number of iterations; $R_2 \in [0,1]$ is the alarm value; $ST \in [0.5,1]$ is the safety threshold; Q is a random number drawn from a normal distribution; and L is a 1×d matrix whose elements are all 1. When $R_2 < ST$, no predator has appeared around the foraging area; when $R_2 \ge ST$, a predator has appeared and an alarm is raised, and all discoverers fly quickly to a safe area.

Followers conduct a local search under the leadership of the discoverers, and followers with better fitness values obtain resources first during the search. Their location update is shown in Formula (17):

$$x_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{x_{worst}^{t} - x_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ x_{p}^{t+1} + \left|x_{i,j}^{t} - x_{p}^{t+1}\right| \cdot A^{+} \cdot L, & i \le n/2 \end{cases} \tag{17}$$

In the formula, $x_p^{t+1}$ is the current best position occupied by a discoverer; $x_{worst}^{t}$ is the global worst position; A is a 1×d matrix whose elements are each randomly set to 1 or -1, and $A^{+} = A^T(AA^T)^{-1}$. When i > n/2, the i-th follower has failed to obtain food and must fly to other areas to forage; when i ≤ n/2, the follower forages near the optimal individual $x_p$.

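Update the location of the watchmen:

$$x_{i,j}^{t+1} = \begin{cases} x_{best}^{t} + \beta \cdot \left|x_{i,j}^{t} - x_{best}^{t}\right|, & f_i > f_g \\ x_{i,j}^{t} + k \cdot \left(\dfrac{\left|x_{i,j}^{t} - x_{worst}^{t}\right|}{(f_i - f_w) + \varepsilon}\right), & f_i = f_g \end{cases} \tag{18}$$

In the formula, $x_{best}^{t}$ is the global optimal position; β is a step-size control parameter drawn from a normal distribution with a mean of 0 and a variance of 1; k is the moving direction of the sparrow, with $k \in [-1,1]$; ε is a small constant that avoids a zero denominator; $f_i$ is the fitness value of the current individual; and $f_g$ and $f_w$ are the current global best and worst fitness values, respectively.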
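
The three update rules can be put together in a compact sketch. This is a minimal, illustrative implementation that minimizes a test function; the population size, the discoverer and watchman fractions (`pd`, `sd`), the bound clipping, and the sphere objective are assumptions, not the settings used in the paper.

```python
import numpy as np

def ssa_minimize(f, dim, lb, ub, n=20, iters=50, pd=0.2, sd=0.1, st=0.8, seed=0):
    """Minimal sparrow search sketch; returns (best_x, best_f, history)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n, dim))
    fit = np.array([f(x) for x in X])
    best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
    history = [best_f]
    n_disc, n_watch = max(1, int(pd * n)), max(1, int(sd * n))
    ones = np.ones(dim)                        # the 1 x d all-ones matrix L
    for _ in range(iters):
        order = np.argsort(fit)                # rank sparrows, best first
        X, fit = X[order], fit[order]
        x_best, x_worst = X[0].copy(), X[-1].copy()
        f_g, f_w = fit[0], fit[-1]
        r2 = rng.random()                      # alarm value R2
        for i in range(n_disc):                # discoverers, Formula (16)
            if r2 < st:
                alpha = 1.0 - rng.random()     # uniform in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iters))
            else:
                X[i] = X[i] + rng.normal() * ones
        X[:n_disc] = np.clip(X[:n_disc], lb, ub)
        disc_fit = np.array([f(x) for x in X[:n_disc]])
        x_p = X[disc_fit.argmin()].copy()      # best discoverer position
        for i in range(n_disc, n):             # followers, Formula (17)
            if i + 1 > n / 2:
                X[i] = rng.normal() * np.exp((x_worst - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], size=dim)
                # A+ = A^T (A A^T)^-1 reduces to A^T / d for entries of +/-1.
                X[i] = x_p + (np.abs(X[i] - x_p) @ A) / dim * ones
        for i in rng.choice(n, size=n_watch, replace=False):  # watchmen, Formula (18)
            if fit[i] > f_g:
                X[i] = x_best + rng.normal() * np.abs(X[i] - x_best)
            else:
                k = rng.uniform(-1.0, 1.0)
                X[i] = X[i] + k * np.abs(X[i] - x_worst) / ((fit[i] - f_w) + 1e-12)
        X = np.clip(X, lb, ub)                 # keep positions inside the bounds
        fit = np.array([f(x) for x in X])
        if fit.min() < best_f:
            best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
        history.append(best_f)
    return best_x, best_f, history

sphere = lambda x: float(np.sum(x ** 2))       # toy objective, minimum at the origin
best_x, best_f, history = ssa_minimize(sphere, dim=4, lb=-5.0, ub=5.0)
```

To tune a classifier instead of a toy function, `f` would map a position vector to a validation error, so that lower fitness means a better hyperparameter set.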

4.3 Model parameter optimization and training test

Although the LightGBM model can diagnose faults with its default parameters, its diagnostic accuracy is then low; searching for suitable hyperparameters can further improve the diagnostic accuracy. The parameter max_depth is the depth of the tree and should be set so as to avoid generating trees that are too deep [28]; feature_leaves represents the subsampling of features; learning rate is the learning rate: if it is set too small, the gradient descends very slowly, and if it is too large, the model overshoots the optimum and oscillates; subsample is the fraction of samples used to train each weak learner, and too small a value easily leads to underfitting. Table 2 lists the main parameters, their optimization ranges, and the optimal values.

[Figure omitted. See PDF.]
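
One way to connect a swarm optimizer to these hyperparameters is to let each sparrow position live in the unit hypercube and decode it into a parameter dictionary. The sketch below assumes hypothetical search ranges, since the values of Table 2 are not reproduced in this text. It also uses `feature_fraction`, LightGBM's parameter for feature subsampling, on the assumption that it plays the role the text assigns to the feature subsampling parameter.

```python
# Hypothetical search ranges; the ranges of Table 2 are not reproduced here.
SEARCH_SPACE = {
    "max_depth": (3, 10, int),
    "feature_fraction": (0.5, 1.0, float),   # assumed name for feature subsampling
    "learning_rate": (0.01, 0.3, float),
    "subsample": (0.5, 1.0, float),
}

def decode(position):
    """Map a position in [0, 1]^4 to a LightGBM-style parameter dict."""
    params = {}
    for u, (name, (low, high, kind)) in zip(position, SEARCH_SPACE.items()):
        u = min(max(u, 0.0), 1.0)             # clamp to the unit interval
        value = low + u * (high - low)
        params[name] = int(round(value)) if kind is int else value
    return params

params = decode([0.4, 0.0, 1.0, 0.25])
```

Searching in the unit hypercube keeps all dimensions on the same scale, so the position updates do not have to account for the very different ranges of, say, `max_depth` and `learning_rate`.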

The SSA algorithm is used to optimize the parameters of LightGBM and thereby build the SSA-LightGBM diagnosis model. The diagnosis flow chart is shown in Fig 3.

[Figure omitted. See PDF.]

Step 1: Construct the fault features from the collected data samples using the uncoded ratio method, and then standardize them to eliminate the influence of scale differences between ratios;

Step 2: Divide the sample data into training and test sets using ten-fold cross-validation;

Step 3: Set the initial parameters and the hyperparameter optimization ranges of the LightGBM model, and adjust the model parameters using the SSA algorithm;

Step 4: Calculate the fitness value at each sparrow's new position, and update the best and worst fitness values and their positions over the entire sparrow population;

Step 5: Check whether the maximum number of iterations has been reached. If so, terminate the iteration and store the current training parameter values as the optimal parameters; otherwise, return to Step 3;

Step 6: Load the stored optimal parameters into the LightGBM model, evaluate the model on the test set, and output the fault classification results.
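
The ten-fold split in Step 2 and the fitness evaluation in Step 4 can be sketched without any machine learning library. The callback `train_and_score` is hypothetical: it stands for "train a model on the training indices and return its accuracy on the test indices", and using one minus the mean accuracy as the fitness is an assumption about how the optimizer is driven.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]      # k nearly equal folds
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]

def cv_error(train_and_score, n_samples, k=10):
    """Fitness for the optimizer: 1 - mean accuracy over the k folds."""
    scores = [train_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return 1.0 - sum(scores) / len(scores)

splits = list(k_fold_indices(513))
# Dummy scorer that always reports 90% accuracy, used only to show the wiring.
error = cv_error(lambda train, test: 0.9, 513)
```

With 513 samples, three folds hold 52 samples and seven hold 51, and every sample is used for testing exactly once.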

4.4 Comparative analysis of diagnostic results

4.4.1 Analysis of fault type diagnosis results.

In this paper, 513 cases of characteristic gas data from transformer faults were collected. The SSA-LightGBM diagnostic model was trained and tested with the preset parameters under ten-fold cross-validation; its average fault diagnosis accuracy was 93.7% and its overall accuracy was 94.8%, as shown in Table 3.

[Figure omitted. See PDF.]
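
The text does not define how the average and overall accuracies are computed. The sketch below assumes that "average" means the mean of the per-class accuracies and "overall" means the fraction of all samples classified correctly; the labels and predictions are made up for illustration.

```python
def accuracies(y_true, y_pred):
    """Return per-class accuracy, their unweighted mean, and overall accuracy."""
    classes = sorted(set(y_true))
    per_class = {}
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        per_class[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    average = sum(per_class.values()) / len(classes)
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return per_class, average, overall

# Made-up example: 6 normal samples, 2 partial-discharge samples, 1 mistake.
y_true = ["normal"] * 6 + ["partial discharge"] * 2
y_pred = ["normal"] * 6 + ["partial discharge", "normal"]
per_class, average, overall = accuracies(y_true, y_pred)
```

Under these definitions the two figures differ whenever the classes are unbalanced: here the average is 0.75 and the overall accuracy is 0.875, because the single error falls in the smaller class.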

4.4.2 Analysis of multi-model diagnosis results.

To verify the superiority of the LightGBM diagnostic model, the same fused features are used as input, and the diagnostic results of the LightGBM model are compared with those of KNN [29], SVM [21], and BP [30]. The results are shown in Table 4. The average accuracy of the LightGBM model reaches 90.3%, which is higher than that of the other models. After SSA optimization, the accuracy increases by 3.4 percentage points, which effectively improves diagnostic accuracy.

[Figure omitted. See PDF.]

In this paper, SSA is used to search for the hyperparameters of the LightGBM diagnosis model, and the optimization results are compared with those of the genetic algorithm (GA) and the grey wolf optimizer (GWO); the diagnosis results are shown in Table 5. The diagnostic accuracies of GA-LightGBM, GWO-LightGBM, and SSA-LightGBM are 85.6%, 88%, and 93.7%, respectively. The SSA-LightGBM model has the highest diagnostic accuracy and correctly identifies all medium-temperature thermal faults, low-energy discharges, and high-energy discharges, indicating that the SSA-LightGBM fault diagnosis model constructed in this paper can effectively improve the fault diagnosis accuracy of transformers.

[Figure omitted. See PDF.]

4.4.3 Analysis of diagnostic results for different feature quantities.

To verify the effectiveness of the 7-dimensional fused features, fault diagnosis accuracy was also obtained on the experimentally collected data using the IEC three-ratio method, the Rogers method, and the uncoded ratio method. The results, shown in Table 6, indicate that the fused fault features capture the connection between fault type and DGA data more deeply and further improve the accuracy of the fault diagnosis model.

[Figure omitted. See PDF.]

5 Conclusion

In summary, the method proposed in this article effectively addresses the low diagnostic accuracy caused by the complexity of oil-immersed transformer faults, improves transformer fault diagnosis performance, and outperforms the other methods compared. The specific contributions are as follows:

(1) A 17-dimensional fault feature set is constructed with the uncoded ratio method to explore the intrinsic relationship between dissolved gases in oil and fault types, which benefits the extraction of transformer fault features.

(2) A PCA-based method is proposed for reducing the dimensionality of the fault features; it effectively removes invalid and redundant features, reduces computational complexity, and improves diagnostic accuracy.

(3) A LightGBM model optimized by the sparrow search algorithm is proposed, and simulation results under the same feature inputs verify that it is more accurate than the other models.

Although the proposed method achieves higher accuracy in transformer fault diagnosis and has a degree of generalization and adaptability, it is difficult to apply to online fault diagnosis because collecting dissolved gases in oil is complex. In future work, online diagnostic methods based on multi-source information fusion should be developed by combining transformer vibration signals and sound signals.

Supporting information

S1 Table. Initial sample.

https://doi.org/10.1371/journal.pone.0314481.s001

References

1. Le X, Yi J Z, Ke Y Y, et al. Interpretation of DGA for Transformer Fault Diagnosis with Step-by-step feature selection and SCA-RVM[C]//2021 IEEE 16th Conference on Industrial Electronics and Applications (ICIEA). IEEE, 2021: 1372–1377.

2. Li H, Zhang Y, Zhang Y. Study of transformer fault diagnosis based on improved sparrow search algorithm optimized support vector machine[J]. Journal of electronic measurement and instrumentation, 2021, 35(2021): 123–129.

3. Thomas J B, Chaudhari S G, Shihabudheen K V, et al. CNN-based transformer model for fault detection in power system networks[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 1–10.

4. Raj R A, Sarathkumar D, Andrews L J B, et al. Key gases in transformer oil–an analysis using self organizing map (SOM) neural networks[C]//2023 IEEE 12th International Conference on Communication Systems and Network Technologies (CSNT). IEEE, 2023: 642–647.

5. Zhu L, Wang X H, Li H, et al. Transformer fault diagnosis method based on variation sparrow search algorithm and improved SMOTE under unbalanced samples[J]. High Voltage Engineering, 2023, 49(12): 4993–5001.

6. Ouyang X, Li Z B. Transformer fault diagnosis technology based on sample expansion and feature selection and SVM optimized by IGWO[J]. Power System Protection and Control, 2023, 51(18): 11–20.

7. Zhou X H, Feng Y C, Hu X C, et al. Transformer fault diagnosis based on SVM optimized by bald eagle search algorithm[J]. Southern Power System Technology, 2023, 17(06): 99–106+116.

8. Liao W, Yang D, Wang Y, et al. Fault diagnosis of power transformers using graph convolutional network[J]. CSEE Journal of Power and Energy Systems, 2020, 7(2): 241–249.

9. Yang D, Liao W, Ren X, et al. Fault diagnosis of transformer based on capsule network[J]. Gaodianya Jishu, 2021, 47(2): 415–424.

10. Wang D, Ma A J, Gui Y, et al. Diagnosis of partial discharge insulation fault fusion based on P-CNN[J]. High Voltage Engineering, 2020, 46(8): 2897–2905.

11. Zhou M, Wang T. Fault diagnosis of power transformer based on association rules gained by rough set[C]//2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE). IEEE, 2010, 3: 123–126.

12. Song X, Yang Z, Jin D C, et al. Assessment of transformer oil-paper insulation status with fuzzy rough set[J]. Chinese Journal of Scientific Instrument, 2017, 38(1): 190–197.

13. Kai L, Wei J P, Xue J Y. Method of fault diagnosis for power transformer based on optimizing characteristics and the fuzzy theory[J]. Power System Protection and Control, 2016, 44(15): 54–60.

14. Wang F Z, Shao S M. Fuzzy strategy on running state evaluation of oil-immersed power transformer[J]. Comput. Simul, 2015, 32: 141–145.

15. Wang K. Research on Transformer Fault prediction Method Based on Ensemble Learning[D]. Shandong University, 2021.

16. Yuan M. A Dissertation Submitted to Guangdong University of Technology for the Degree of Master[D]. Guangdong University of Technology, 2021.

17. Liu Y. Research on 10KV Feeder Fault prediction Method Based on Deep Learning[D]. Chongqing University, 2019.

18. IEC. Mineral oil-impregnated electrical equipment in service-guide to the interpretation of dissolved and free gases analysis: IEC 60599–2007[S]. Geneva, Switzerland: IEC, 2007.

19. Dai J J, Song H, Sheng G H, et al. Dissolved gas analysis of insulating oil for power transformer fault diagnosis with deep belief network[J]. IEEE Transactions on Dielectrics and Electrical Insulation, 2017, 24(5): 2828–2835.

20. Rogers R R. IEEE and IEC codes to interpret incipient faults in transformers, using gas in oil analysis[J]. IEEE Transactions on Electrical Insulation, 1978, EI-13(5): 349–354.

21. Li J, Zhang Q, Wang K, et al. Optimal dissolved gas ratios selected by genetic algorithm for power transformer fault diagnosis based on support vector machine[J]. IEEE Transactions on Dielectrics and Electrical Insulation, 2016, 23(2): 1198–1206.

22. Zho H, Zhao Z, Yu H, et al. Transformer Fault Diagnosis Based on BO-CatBoost[J]. Control and Instruments in Chemical Industry, 2021, 48(06): 601–607+633.

23. Li H, Zhao H, Cui C Z, et al. A Stellar Spectrum Classification Algorithm Based on CNN and LSTM Composite Deep Learning Model[J]. Spectroscopy and Spectral Analysis, 2024, 44(06): 1668–1675.

24. Li R, Cao G L, Pu Y, et al. TDS-Net: A Two-Dimensional Stellar Spectra Classification Model Based on Attention Mechanism and Feature Fusion[J]. Spectroscopy and Spectral Analysis, 2024, 44(07): 1968–1973.

25. Darwish A. Enhancing Prognostics of PEM Fuel Cells with a Dual-Attention LSTM Network for Remaining Useful Life Estimation: A Deep Learning Model[J]. Sustainable Machine Intelligence Journal, 2024, 7: (5): 1–20.

26. Darwish A. A Data-driven Deep Learning Approach for Remaining Useful Life in the ion mill etching Process[J]. Sustainable Machine Intelligence Journal, 2024, 8: (2): 14–34.

27. Yu Q, Huihong H, Jian F, et al. Fault diagnosis of transformer based on FCM and improved PCA[J]. High Voltage Apparatus, 2018, 54(12): 262–267.

28. Assegie T A, Elaraby A. Optimal tree depth in decision tree classifiers for predicting heart failure mortality[J]. Healthcraft Front, 2023, 1(1): 58–66.

29. Zhu Q, Zhu W, Wang H, et al. Research on online semi-supervised fault diagnosis method of transformer based on dissolved gas in oil[J/OL]. Power System Technology: 1–9[2022-05-03].

30. Wang Y, Zhang T. Fault Diagnosis of Transformers Based on Optimal Probabilistic Neural Network Based on Digital Twin[J]. Modular Machine Tool and Automatic Manufacturing Technique, 2020, 11: 20–23.

Citation: Wang J, Chi J, Ding Y, Yao H, Guo Q (2025) Based on PCA and SSA-LightGBM oil-immersed transformer fault diagnosis method. PLoS ONE 20(2): e0314481. https://doi.org/10.1371/journal.pone.0314481

About the Authors:

Jizhong Wang

Roles: Writing – original draft

E-mail: [email protected]

Affiliation: State Grid Zhejiang Electric Power Co., Ltd., Hangzhou, China

ORCID: https://orcid.org/0009-0000-7099-6948

Jianfei Chi

Roles: Conceptualization, Data curation, Formal analysis, Writing – review & editing

Affiliation: State Grid Zhejiang Electric Power Co., Ltd., Hangzhou, China

Yeqiang Ding

Roles: Data curation, Investigation, Methodology, Software, Writing – review & editing

Affiliation: State Grid Zhejiang Electric Power Co., Ltd., Hangzhou, China

Haiyan Yao

Roles: Data curation, Formal analysis, Investigation, Software, Supervision, Validation

Affiliation: Hangzhou Electric Power Equipment Manufacturing Co., Ltd., Hangzhou, China

Qiang Guo

Roles: Data curation, Formal analysis, Project administration, Software, Writing – original draft

Affiliation: Hangzhou Electric Power Equipment Manufacturing Co., Ltd., Hangzhou, China

[/RAW_REF_TEXT]

References

1. Le X, Yi J Z, Ke Y Y, et al. Interpretation of DGA for Transformer Fault Diagnosis with Step-by-step feature selection and SCA-RVM[C]//2021 IEEE 16th Conference on Industrial Electronics and Applications (ICIEA). IEEE, 2021: 1372–1377.

2. Li H, Zhang Y, Zhang Y. Study of transformer fault diagnosis based on improved sparrow search algorithm optimized support vector machine[J]. Journal of electronic measurement and instrumentation, 2021, 35(2021): 123–129.

3. Thomas J B, Chaudhari S G, Shihabudheen K V, et al. CNN-based transformer model for fault detection in power system networks[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 1–10.

4. Raj R A, Sarathkumar D, Andrews L J B, et al. Key gases in transformer oil–an analysis using self organizing map (SOM) neural networks[C]//2023 IEEE 12th International Conference on Communication Systems and Network Technologies (CSNT). IEEE, 2023: 642–647.

5. Zhu L, Wang X H, Li H, et al. Transformer fault diagnosis method based on variation sparrow search algorithm and improved SMOTE under unbalanced samples [J]. High Voltage Engineering, 2023, 49(12):4993–5001.

6. Ouyang X, Li Z B. Transformer fault diagnosis technology based on sample expansion and feature selection and SVM optimized by IGWO [J]. Power System Protection and Control,2023,51(18):11–20.

7. Zhou X H, Feng Y C, Hu X C, et al. Transformer fault diagnosis based on SVM optimized by bald eagle search algorithm [J]. Southern Power System Technology, 2023, 17(06): 99–106+116.

8. Liao W, Yang D, Wang Y, et al. Fault diagnosis of power transformers using graph convolutional network[J]. CSEE Journal of Power and Energy Systems, 2020, 7(2): 241–249.

9. Yang D, Liao W, Ren X, et al. Fault diagnosis of transformer based on capsule network[J]. Gaodianya Jishu, 2021, 47(2): 415–424.

10. Wang D, Ma A J, Gui Y, et al. Diagnosis of partial discharge insulation fault fusion based on P-CNN[J]. High Voltage Engineering, 2020, 46(8): 2897–2905.

11. Zhou M, Wang T. Fault diagnosis of power transformer based on association rules gained by rough set[C]//2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE). IEEE, 2010, 3: 123–126.

12. Song X, Yang Z, Jin D C, et al. Assessment of transformer oil-paper insulation status with fuzzy rough set[J]. Chinese Journal of Scientific Instrument, 2017, 38(1): 190–197.

13. Kai L, Wei J P, Xue J Y. Method of fault diagnosis for power transformer based on optimizing characteristics and the fuzzy theory[J]. Power System Protection and Control, 2016, 44(15): 54–60.

14. Wang F Z, Shao S M. Fuzzy strategy on running state evaluation of oil-immersed power transformer[J]. Comput. Simul, 2015, 32: 141–145.

15. Wang K. Research on Transformer Fault prediction Method Based on Ensemble Learning[D].Shandong University,2021.

16. Yuan M. A Dissertation Submitted to Guangdong University of Technology for the Degree of Master[D]. Guangdong University of Technology, 2021.

17. Liu Y. Research on 10KV Feeder Fault Prediction Method Based on Deep Learning[D]. Chongqing University, 2019.

18. IEC. Mineral oil-impregnated electrical equipment in service-guide to the interpretation of dissolved and free gases analysis: IEC 60599–2007[S]. Geneva, Switzerland: IEC, 2007.

19. Dai J J, Song H, Sheng G H, et al. Dissolved gas analysis of insulating oil for power transformer fault diagnosis with deep belief network[J]. IEEE Transactions on Dielectrics and Electrical Insulation, 2017, 24(5): 2828–2835.

20. Rogers R R. IEEE and IEC codes to interpret incipient faults in transformers, using gas in oil analysis[J]. IEEE Transactions on Electrical Insulation, 1978, EI-13(5): 349–354.

21. Li J, Zhang Q, Wang K, et al. Optimal dissolved gas ratios selected by genetic algorithm for power transformer fault diagnosis based on support vector machine[J]. IEEE Transactions on Dielectrics and Electrical Insulation, 2016, 23(2): 1198–1206.

22. Zho H, Zhao Z, Yu H, et al. Transformer Fault Diagnosis Based on BO-CatBoost[J]. Control and Instruments in Chemical Industry, 2021, 48(06): 601–607+633.

23. Li H, Zhao H, Cui C Z, et al. A Stellar Spectrum Classification Algorithm Based on CNN and LSTM Composite Deep Learning Model[J]. Spectroscopy and Spectral Analysis, 2024, 44(06): 1668–1675.

24. Li R, Cao G L, Pu Y, et al. TDS-Net: A Two-Dimensional Stellar Spectra Classification Model Based on Attention Mechanism and Feature Fusion[J]. Spectroscopy and Spectral Analysis, 2024, 44(07): 1968–1973.

25. Darwish A. Enhancing Prognostics of PEM Fuel Cells with a Dual-Attention LSTM Network for Remaining Useful Life Estimation: A Deep Learning Model[J]. Sustainable Machine Intelligence Journal, 2024, 7: (5): 1–20.

26. Darwish A. A Data-driven Deep Learning Approach for Remaining Useful Life in the ion mill etching Process[J]. Sustainable Machine Intelligence Journal, 2024, 8: (2): 14–34.

27. Yu Q, Huihong H, Jian F, et al. Fault diagnosis of transformer based on FCM and improved PCA[J]. High Voltage Apparatus, 2018, 54(12): 262–267.

28. Assegie T A, Elaraby A. Optimal tree depth in decision tree classifiers for predicting heart failure mortality[J]. Healthcraft Front, 2023, 1(1): 58–66.

29. Zhu Q, Zhu W, Wang H, et al. Research on online semi-supervised fault diagnosis method of transformer based on dissolved gas in oil[J/OL]. Power System Technology: 1–9[2022-05-03].

30. Wang Y, Zhang T. Fault Diagnosis of Transformers Based on Optimal Probabilistic Neural Network Based on Digital Twin[J]. Modular Machine Tool and Automatic Manufacturing Technique, 2020, 11: 20–23.


© 2025 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract


To address the low diagnostic accuracy caused by the complexity of oil-immersed transformer faults, a fault diagnosis method based on principal component analysis (PCA) and SSA-LightGBM is proposed. First, data on dissolved gases in oil are collected, a 17-dimensional fault feature matrix is constructed using the uncoded ratio method, and the matrix is standardized to obtain joint features. Second, PCA is used for feature fusion, eliminating information redundancy between variables and producing fused features. Finally, a transformer diagnostic model based on SSA-LightGBM is constructed, and ten-fold cross-validation is used to verify its classification ability. The experimental results show that the proposed SSA-LightGBM model achieves an average fault diagnosis accuracy of 93.6% after SSA optimization, 3.6% higher than before optimization. Compared with the GA-LightGBM and GWO-LightGBM fault diagnosis models, SSA-LightGBM improves diagnostic accuracy by 8.1% and 5.7%, respectively, indicating that the method effectively improves fault diagnosis performance for oil-immersed transformers and outperforms these comparable methods.
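Two of the steps named in the abstract, standardizing the feature matrix and splitting the samples for ten-fold cross-validation, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `zscore_columns` and `k_fold_indices`, the toy feature values, the use of the population standard deviation, and the shuffled round-robin fold assignment are all assumptions made for illustration. The PCA fusion and SSA-LightGBM training steps are deliberately left out.

```python
import random
import statistics

def zscore_columns(matrix):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*matrix))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in matrix
    ]

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint, nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Toy data: 4 samples x 3 hypothetical ratio features (not the paper's data).
toy = [[0.1, 2.0, 5.0], [0.3, 4.0, 5.0], [0.5, 6.0, 5.0], [0.7, 8.0, 5.0]]
scaled = zscore_columns(toy)

# Each fold is held out once; the remaining nine folds form the training set.
folds = k_fold_indices(n_samples=25, k=10)
for held_out in folds:
    train = [i for fold in folds if fold is not held_out for i in fold]
    # A classifier would be fitted on `train` and scored on `held_out` here.
```

In a full pipeline the standardization statistics would normally be computed on the training folds only and then applied to the held-out fold, so that no information from the test samples leaks into training.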

Details

Title

Based on PCA and SSA-LightGBM oil-immersed transformer fault diagnosis method

Author

Wang, Jizhong; Chi, Jianfei; Ding, Yeqiang; Yao, Haiyan; Guo, Qiang

First page

e0314481

Section

Research Article

Publication year

2025

Publication date

Feb 2025

Publisher

Public Library of Science

e-ISSN

19326203

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1371/journal.pone.0314481

ProQuest document ID

3168697505
