Tool wear monitoring is crucial in machining, playing a vital role in ensuring quality and controlling costs. Inadequate control over tool wear and life can result in increased expenses or significant damage to both tools and workpieces, making accurate wear prediction essential to avoid failures. While traditional long short-term memory (LSTM) models perform well on time series data, they capture only unidirectional historical information and fail to utilize future data, limiting their effectiveness in complex wear prediction tasks. Moreover, the hyperparameter tuning process for LSTM models is complex and computationally expensive. Manual tuning methods often struggle to find a global optimum in high-dimensional spaces, leading to local optima and restricting the model's generalization capabilities. To address these limitations, this paper introduces a genetic algorithm-optimized bidirectional long short-term memory (GA-BiLSTM) model. Unlike traditional LSTM, BiLSTM captures both forward and backward time series data, enabling comprehensive utilization of sequence features. The genetic algorithm (GA) performs a global search of the hyperparameter space, automatically optimizing key parameters, thus avoiding the inefficiencies of manual tuning and significantly improving the model’s robustness and performance. Experimental results show that GA-BiLSTM reduces mean absolute error (MAE) by up to 72.0% and root mean square error (RMSE) by 64.3% on the PHM2010 dataset, demonstrating its superior predictive accuracy and practical applicability.
Article Highlights
The Bi-LSTM model is synergistically combined with a genetic optimization algorithm for the first time to predict the wear of Ball Nose Tungsten Carbide Cutters. Experimental results demonstrate superior fitting capabilities compared to alternative models.
Features are extracted from the sensor signals in the time domain, frequency domain, and time–frequency domain. These features are filtered using the Pearson correlation coefficient, and a correlation analysis is conducted on the remaining features.
A global optimization strategy utilizing genetic algorithms is employed to fine-tune the learning rate, number of hidden layers, and training batch size of the Bi-LSTM layer.
Introduction
In the machining processes of intelligent manufacturing, contact forces and friction induce tool wear and degrade surface roughness, which in turn intensifies the interaction between product quality and tool deterioration [1]. When tool wear exceeds a certain threshold, the workpiece no longer meets manufacturing standards, ultimately resulting in damage to both the tool and the workpiece. Once tool wear reaches its maximum permissible value, the machining process must be interrupted for tool replacement. Research indicates that 20% of total machining downtime is attributed to this factor [2]. By monitoring tool wear and utilizing wear warning systems, productivity can be enhanced by over 65% [3]. Therefore, monitoring tool wear is critical in the manufacturing process [4].
There are primarily two methods for measuring tool wear. The first is the direct method [5], in which tool dimension changes are precisely captured through the use of radioactive and visual sensors. Radioactive sensors have been employed to measure tool wear directly. This approach involves embedding a trace amount of radioactive material onto the flank face of the cutting tool. During machining, as the tool undergoes wear, the radioactive material is gradually transferred to the chips. By analyzing the quantity of radioactive material present in the chips, the extent of tool wear can be determined. However, the requirement for real-time chip collection and the inherent risks associated with handling radioactive substances restrict the application of this method to controlled laboratory settings. The use of vision sensors for the direct measurement of tool wear involves analyzing the cutting tool itself. Typically, these sensors exploit the greater reflectivity of the worn area compared to the unworn surface to extract various morphological parameters that describe the extent and characteristics of tool wear. Most research has focused on measuring flank wear, with relatively few studies addressing the simultaneous assessment of both flank and crater wear. While flank wear regions can be captured using a CCD camera, the evaluation of crater wear necessitates the projection of a structured light pattern onto the tool surface to obtain depth information from the crater. In structured light sensing, distortions in parallel laser light lines provide an estimate of crater depth. However, due to the challenging conditions of the cutting environment—such as the presence of lubricants, built-up edges, or metal deposits on the tool—current vision sensors are limited to operation between cutting cycles. 
Generally, given its sensitivity to environmental interference such as cutting fluid, chips, and vibrations, the direct method is often conducted offline and lacks real-time monitoring capabilities [6]. Conversely, the indirect method [7] involves the use of advanced sensing technologies, including Hall sensors, cutting force sensors, acoustic emission sensors, and torque and current sensors. Although this method is less accurate than the direct one, it is favored for its ability to enable continuous real-time monitoring.
The monitoring of tool wear conditions typically encompasses several critical steps: sensor-based data acquisition, signal preprocessing, feature extraction and selection, and predictive modeling [8]. In the feature extraction phase, dimensionality reduction techniques are applied to ensure the relevance of extracted features to the tool's condition. However, these features may still harbor redundancy or exhibit excessively high dimensionality, thus impairing model performance. To mitigate this issue, feature selection and reduction methods such as Pearson correlation coefficients and Principal Component Analysis (PCA) are utilized to retain the most informative features and further reduce dimensionality [9, 10]. Subsequently, to predict tool wear or remaining useful life (RUL), machine learning algorithms, including Gaussian Process Regression (GPR) [11], Support Vector Machines (SVM) [12], Artificial Neural Networks (ANN) [13], Neuro-Fuzzy Inference Systems [14], and Hidden Markov Models (HMM) [15, 16], are employed, forming a comprehensive framework for monitoring and prediction. However, shallow machine learning models often suffer from limited interpretability and poor generalization, rendering them inadequate for addressing highly complex nonlinear problems. As deep learning methodologies have advanced, architectures like Long Short-Term Memory (LSTM) networks, with their distinctive gating mechanisms and memory cells, have proven adept at capturing long-term dependencies in time series data while mitigating gradient vanishing issues [17]. Although traditional LSTM models perform well on time series data, they capture only unidirectional historical information. This inability to leverage future data undermines their efficacy in intricate wear prediction tasks.
Bidirectional LSTM (BiLSTM), a sophisticated extension of LSTM, amplifies this capability by leveraging both forward and backward information flow, making it particularly suited for handling intricate nonlinear classification and regression tasks [18], especially in the real-time monitoring of tool wear using sensor-based data. Nonetheless, the major limitation of these approaches lies in their substantial computational overhead and time consumption. To overcome these constraints, optimization techniques such as Genetic Algorithms (GA) [19], Particle Swarm Optimization (PSO) [20], and Simulated Annealing (SA) [21] have been integrated, significantly enhancing model training efficiency and performance. GA, in particular, has gained prominence due to its capacity for rapid convergence toward a global optimum while maintaining computational simplicity.
To facilitate global feature extraction and tool wear prediction from long sequence real-time monitoring data, this paper introduces an intelligent tool wear prediction model predicated on a genetic algorithm-optimized bidirectional long short-term memory network (GA-Bi-LSTM). This model adeptly extracts long-term feature sequences from multiple channels utilizing sensors and enhances information utilization by integrating forward and backward LSTM layers. Additionally, the hyperparameters of the model are refined through a genetic algorithm.
The remainder of this paper is organized as follows: The "Previous Work" section critically reviews pertinent research on real-time monitoring of tool status. The "Model Theory" section elucidates the theoretical framework of the genetic algorithm-optimized Bi-LSTM model. The "Experiments" section provides a comprehensive overview of the datasets utilized, data processing methodologies, and experimental configurations. The "Results and Discussion" section scrutinizes the experimental findings, culminating in conclusions derived from comparative analyses. Finally, the "Conclusion" section synthesizes the research outcomes and articulates prospective directions for future investigations.
Previous work
Tool wear can be obtained intuitively and accurately using a direct measurement-based tool condition monitoring (TCM) method. Sortino [22] proposed a novel approach to direct tool wear measurement, utilizing statistical filtering algorithms and machine vision technology. Through experimental validation, the feasibility and accuracy of this method in real-world production environments were confirmed. However, in practice, direct measurement significantly reduces processing efficiency. To preserve production efficiency, monitoring tool wear using indirect sensor-based methods has become a key research focus.
Palanisamy et al. [23] explored the application of artificial neural networks (ANN) in predicting cutting tool wear. A predictive model was developed through the design of experiments (DOE) and regression analysis, and the accuracy of the model was validated using ANN. Liang et al. [24] proposed a wear prediction model based on an improved hybrid differential grey wolf optimization (IHDGWO) algorithm to optimize the parameters of the support vector machine (SVM). The model was experimentally validated, demonstrating its effectiveness in improving prediction accuracy and convergence speed. Chen et al. [25] explored the application of deep belief networks (DBN) in predicting cutting tool flank wear and compared it with artificial neural networks (ANN) and support vector regression (SVR). The results demonstrated the advantages of DBN in terms of prediction accuracy and stability. Lorentzon and Järvstråt [26] studied the use of the finite element method (FEM) to simulate tool wear during the machining of nickel-based alloy Inconel 718, analyzing the impact of different friction models on the simulation results. Huang et al. [27] proposed a hybrid tool wear prediction model that combines a physical model with a multilayer perceptron (MLP) neural network. They utilized a particle filtering algorithm to integrate the predictions from the physical model and the data-driven model. Gao et al. [28] discussed a method for predicting tool wear using gated recurrent unit (GRU) neural networks and multi-sensor multi-domain feature fusion technology. Building on feature extraction and fusion, Chacón et al. [29] proposed a novel tool wear prediction method based on acoustic emission (AE) signals and machine learning algorithms, combining continuous wavelet transform (CWT) and random forest regression (RF) to enhance prediction accuracy. Similarly, Domínguez-Monferrer et al. [30] investigated tool wear prediction during the drilling process using the random forest (RF) algorithm and spindle power consumption signals, demonstrating the superiority of the RF model through comparisons with linear regression (LR) and k-nearest neighbors (kNN) algorithms. Zhao et al. [31] enhanced their tool wear prediction model by introducing a convolutional layer before the LSTM. This addition aimed to capture local features from the sensory data being processed. Zhao et al. [32] developed a method to predict tool wear during dry milling operations by employing local feature-based gated recurrent unit networks, which are considered a simplified version of long short-term memory (LSTM) networks.
Although progress has been made in tool wear prediction, certain limitations persist. Traditional machine learning models (e.g., SVM, RF) struggle to handle high-dimensional nonlinear data effectively. While deep learning models (e.g., ANN, DBN, GRU) improve prediction accuracy, they often rely on unidirectional time-series modeling, limiting their ability to capture bidirectional dependencies. Moreover, their hyperparameter optimization is typically dependent on manual tuning, which is time-consuming and often fails to achieve global optima.
To address these issues, this study proposes a tool wear prediction model based on a Bidirectional Long Short-Term Memory (BiLSTM) network, integrated with a Genetic Algorithm (GA) for automatic hyperparameter optimization. BiLSTM leverages both forward and backward time-series features, enhancing its ability to comprehensively capture wear trends. GA performs global optimization of key hyperparameters, improving the model’s generalization and predictive performance. Experimental results demonstrate that the proposed method accurately captures both global and local tool wear characteristics while significantly reducing Root Mean Square Error (RMSE) and Mean Absolute Error (MAE), effectively addressing existing gaps in information utilization and model optimization.
Model theory
This section provides a comprehensive overview of the fundamental LSTM model, along with the Bi-LSTM method developed from it. Furthermore, the optimization mechanism of the genetic algorithm (GA) applied to the Bi-LSTM model will be elaborated upon.
Long short-term memory
Hochreiter and Schmidhuber [33–36] designed the Long Short-Term Memory (LSTM) network, an advanced architecture of Recurrent Neural Networks (RNNs) that overcomes the vanishing gradient problem, making it far more adept than traditional RNNs at modeling time series data and capturing long-term dependencies. While LSTM, like conventional RNNs, retains the capacity to process sequential data, it significantly enhances its ability to manage extended dependencies by introducing memory cells and three gating mechanisms: the input gate, forget gate, and output gate, as illustrated in Fig. 1. The input gate governs the amount of new information to be stored, the forget gate dictates how much previously stored information should be discarded, and the output gate controls the information to be output. Memory cells serve as the repository for information, deciding what to retain or discard, and ensuring critical contextual data is preserved across long sequences. These mechanisms allow LSTM to handle long sequences and to capture and utilize distant contextual relationships more effectively. The mathematical formulation of the gating mechanisms is as follows:
[See PDF for image]
Fig. 1
Graphic illustration of the LSTM cell (adapted from the descriptions by Hochreiter and Schmidhuber [33] and a review article [37])
The forget gate adjusts the strength of the self-loop within the cell state to update the memory, relying on the current input $x_t$ and the previous time step's output $h_{t-1}$:
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{1}$$
where $W_f$ and $U_f$ denote the input and hidden weights of the forget gate, respectively, while $b_f$ represents the bias term. The function $\sigma$ refers to the sigmoid activation function.
The input gate regulates the extent to which new information is written into the memory cell:
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{2}$$
where $W_i$ and $U_i$ represent the input and hidden weights of the input gate, respectively, while $b_i$ denotes the bias term of the input gate.
The output gate regulates the extent to which information from the memory cell is passed to the next time step:
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{3}$$
where $W_o$ and $U_o$ represent the input and hidden weights of the output gate, respectively, while $b_o$ denotes the bias term of the output gate.
The new memory state update is as follows:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \tag{4}$$
where $c_t$ is the memory cell state and $W_c$, $U_c$, and $b_c$ are the input weights, hidden weights, and bias of the candidate memory.
The output of the hidden layer at the current time step is obtained via the output gate:
$$h_t = o_t \odot \tanh\left(c_t\right) \tag{5}$$
where $\odot$ denotes element-wise multiplication, and the formulas for the sigmoid and tanh functions are as follows:
$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{6}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{7}$$
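As a rough illustration of how the forget, input, and output gates interact within one time step, the following sketch implements a single LSTM step with plain Python lists. The parameter layout (a dictionary keyed by `"f"`, `"i"`, `"o"`, `"c"`), the helper names, and the random initialisation are assumptions made for this example and are not taken from the paper's implementation.

```python
import math
import random


def sigmoid(x):
    """Logistic function used by all three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W[j], x))
        + sum(u * hi for u, hi in zip(U[j], h))
        + b[j]
        for j in range(len(b))
    ]


def lstm_step(params, x, h_prev, c_prev):
    """One LSTM time step: returns the new hidden state and cell state."""
    def pre_activation(name):
        W, U, b = params[name]
        return affine(W, U, b, x, h_prev)

    f = [sigmoid(v) for v in pre_activation("f")]    # forget gate
    i = [sigmoid(v) for v in pre_activation("i")]    # input gate
    o = [sigmoid(v) for v in pre_activation("o")]    # output gate
    g = [math.tanh(v) for v in pre_activation("c")]  # candidate memory
    # Keep part of the old memory, write part of the candidate memory.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # Expose a gated, squashed view of the memory as the hidden state.
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases; illustrative initialisation only."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        name: (matrix(n_hidden, n_in), matrix(n_hidden, n_hidden), [0.0] * n_hidden)
        for name in ("f", "i", "o", "c")
    }
```

Calling `lstm_step(init_params(3, 2), [0.5, -0.2, 0.1], [0.0, 0.0], [0.0, 0.0])` returns a hidden state and a cell state of length 2. A practical model would rely on a deep learning framework rather than list arithmetic; the list version is used here only to keep each gate visible.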
Bi-directional long short-term memory
The Bidirectional Long Short-Term Memory network (BiLSTM) is an enhanced version of the LSTM architecture that comprises two independent LSTM layers, each responsible for processing the forward and backward information of the input sequence. Unlike the traditional LSTM network, which relies solely on past information, BiLSTM can simultaneously capture both past and future data when handling sequential information, thereby achieving a more comprehensive understanding of global information [38, 39]. As illustrated in Figs. 2 and 3, each time step of the BiLSTM includes a forward LSTM layer and a backward LSTM layer: the forward LSTM layer processes data progressively from the initial time step t = 1 to the final time step t = T, while the backward LSTM layer operates in reverse, starting from time step t = T and moving back to t = 1. The hidden states generated by the forward and backward LSTM layers retain historical and future information, respectively, and these two hidden states are concatenated to form a unified representation for the current time step t. The forward hidden state captures the past information of the input sequence, while the backward hidden state encapsulates future information, enabling the BiLSTM to fully leverage the temporal relationships within the data. The specifics of the forward and backward hidden states are as follows:
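$$\overrightarrow{h}_t = \overrightarrow{\mathrm{LSTM}}\left(x_t, \overrightarrow{h}_{t-1}\right), \quad t \in [1, T] \tag{8}$$
$$\overleftarrow{h}_t = \overleftarrow{\mathrm{LSTM}}\left(x_t, \overleftarrow{h}_{t+1}\right), \quad t \in [T, 1] \tag{9}$$
$$H_t = \left[\overrightarrow{h}_t, \overleftarrow{h}_t\right] \tag{10}$$
where $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ are the forward and backward hidden states, $H_t$ is their concatenation at time step $t$, and the total length of the entire time series for the process is $T$.
[See PDF for image]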
Fig. 2
Illustration of an LSTM data flow structure
(Adapted from the model structure described in the article [40])
[See PDF for image]
Fig. 3
Illustration of a BiLSTM data flow structure
(Adapted from the model structure described in the article [40])
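The forward and backward passes and the per-step concatenation can be sketched independently of the cell itself. In the example below, `toy_cell` is a hypothetical stand-in for an LSTM cell (a running sum), chosen so that the information available to each direction is easy to read off; the function names are assumptions for illustration.

```python
def run_direction(step, xs, h0):
    """Run a recurrent cell over xs in the given order; return the state at each step."""
    h, states = h0, []
    for x in xs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(step_fwd, step_bwd, xs, h0):
    """Concatenate the forward and backward hidden states at every time step."""
    fwd = run_direction(step_fwd, xs, h0)              # processes t = 1 .. T
    bwd = run_direction(step_bwd, xs[::-1], h0)[::-1]  # processes t = T .. 1, then re-aligned
    return [f + b for f, b in zip(fwd, bwd)]           # list concatenation per time step


def toy_cell(x, h):
    """Hypothetical stand-in for an LSTM cell: accumulates the inputs seen so far."""
    return [h[0] + x]


states = bidirectional(toy_cell, toy_cell, [1, 2, 3], [0])
# states == [[1, 6], [3, 5], [6, 3]]
```

At the first time step the forward state has seen only the first input (1), while the backward state has already seen the whole sequence (6). This is the sense in which the concatenated representation combines past and future context.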
Genetic algorithm
Genetic Algorithm (GA), conceptualized by J. Holland [41, 42], represents a sophisticated metaheuristic optimization framework grounded in the principles of biological evolution, primarily devised for addressing intricate optimization challenges. This algorithm undergoes continual evolution through iterative mechanisms of crossover and mutation. The crossover operation emulates genetic recombination by exchanging segments of information among individuals, thereby generating novel solutions and exploring a broader solution landscape. Conversely, the mutation operation introduces stochastic alterations to the genetic material, effectively mitigating the risk of the algorithm becoming ensnared in local optima while simultaneously augmenting the population's diversity. Each generation of individuals is subjected to a rigorous fitness evaluation via an objective function, facilitating the selection of superior-performing solutions and ensuring the progressive evolution of the population toward the discovery of an optimal solution.
In this research endeavor, GA is harnessed to optimize the hyperparameter configurations of the Bidirectional Long Short-Term Memory (BiLSTM) model, encompassing parameters such as learning rate, the number of neurons within the hidden layers, and batch size. Throughout the algorithm's operation, the processes of crossover and mutation yield novel hyperparameter combinations, each of which is meticulously assessed for performance through a fitness function. This evaluation entails training the BiLSTM model and subsequently validating it on a designated dataset, wherein the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are employed as loss functions to quantify its efficacy. These two metrics are among the most commonly used error measures in regression-based prediction tasks, owing to their simplicity and interpretability. Specifically, the MAE measures the average magnitude of errors between predicted and actual values, providing an intuitive assessment of prediction accuracy. Meanwhile, the RMSE emphasizes larger errors by calculating the square root of the mean of squared differences, making it particularly sensitive to outliers. The procedural flow of this methodology is delineated in Fig. 4. Through this iterative refinement and selection process, GA persistently enhances the configurations of hyperparameters in the foundational model, thereby identifying the optimal arrangement for tool wear prediction tasks and significantly elevating the model's predictive precision. Upon completion of the targeted iterative process, the genetic algorithm identifies the most effective individual from the array of hyperparameter combinations for the final training of the BiLSTM model, further amplifying its performance and ensuring that the model's predictive outputs are both resilient and exceptionally accurate.
[See PDF for image]
Fig. 4
Flowchart of the proposed technique GA-BiLSTM
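The selection, crossover, and mutation loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the configuration used in the paper: the search bounds, truncation selection, uniform crossover, elitism, population size, and number of generations are all hypothetical. The fitness is a placeholder function standing in for "train the BiLSTM and return its validation error".

```python
import random

# Hypothetical search ranges; the paper does not list the bounds it used.
BOUNDS = {
    "learning_rate": (1e-4, 1e-1),
    "hidden_units": (16, 256),
    "batch_size": (8, 128),
}
INTEGER_GENES = {"hidden_units", "batch_size"}


def random_individual(rng):
    """Draw one hyperparameter combination uniformly within the bounds."""
    individual = {}
    for name, (lo, hi) in BOUNDS.items():
        individual[name] = rng.randint(lo, hi) if name in INTEGER_GENES else rng.uniform(lo, hi)
    return individual


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {name: (a if rng.random() < 0.5 else b)[name] for name in BOUNDS}


def mutate(individual, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    fresh = random_individual(rng)
    return {name: fresh[name] if rng.random() < rate else individual[name] for name in BOUNDS}


def genetic_search(fitness, pop_size=20, generations=15, elite=2, seed=0):
    """Minimise `fitness` (for example, validation MAE) over the hyperparameter space."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        parents = ranked[: pop_size // 2]        # truncation selection
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = ranked[:elite] + children   # elitism keeps the best individuals
    return min(population, key=fitness)


def placeholder_fitness(individual):
    """Toy stand-in for training and validating a model; lower is better."""
    return (
        (individual["learning_rate"] - 0.01) ** 2
        + ((individual["hidden_units"] - 128) / 256) ** 2
        + ((individual["batch_size"] - 32) / 128) ** 2
    )


best = genetic_search(placeholder_fitness)
```

Because every fitness evaluation would involve training a network in the real setting, caching the fitness of already evaluated individuals is a natural refinement of this sketch.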
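The two error metrics are defined as:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{11}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{12}$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.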
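A small worked example may help show why RMSE reacts more strongly than MAE to a single large error. The wear values below are invented for illustration and are not taken from the PHM2010 data.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


actual = [100.0, 120.0, 140.0, 160.0]     # illustrative wear values
predicted = [102.0, 118.0, 140.0, 170.0]  # errors: 2, 2, 0, 10

print(mae(actual, predicted))   # 3.5
print(rmse(actual, predicted))  # sqrt(27), about 5.196
```

The single error of 10 contributes 100 of the 108 units of squared error, so it dominates the RMSE, whereas it contributes only 10 of the 14 units of absolute error in the MAE.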
Overall illustration of the proposed methodology
The complete flowchart of the real-time monitoring model for tool wear status optimized by the GA algorithm is presented in Fig. 5. The detailed procedures for the data collection and preprocessing phase will be elaborated in the following "Experiments" section.
[See PDF for image]
Fig. 5
Overall workflow diagram of the proposed technique GA-BiLSTM
Experiments
In this section, we will introduce the experimental background and basic setup of the PHM2010 dataset used for model comparison and evaluation.
Description of PHM2010 data challenge data set
To enhance the credibility and public recognition of research in the field of tool wear monitoring, the 2010 PHM Data Challenge data set (PHM Society Conference Data Challenge, provided at https://www.phmsociety.org/competition/phm/10) was selected for the experiment. The milling experiment was conducted on the Röders Tech RFM760 high-speed CNC milling machine (Röders, Soltau, Germany), using a 6 mm three-flute ball nose tungsten cutter for dry milling, and the workpiece material was stainless steel (HRC52, USA) [41]. The operational parameters of the machining tool were set as follows: the spindle speed was established at 10,400 r/min, the feed rate was configured to 1555 mm/min, the radial cutting depth along the Y-axis was set to 0.125 mm, and the axial cutting depth along the Z-axis was adjusted to 0.2 mm. The tool cut from the upper edge to the lower edge of the workpiece surface in a serrated manner. To acquire tool wear data, the experimental platform was outfitted with three types of sensors to capture cutting force, vibration, and acoustic emission signals during the machining process. A Kistler three-component dynamometer was positioned between the workpiece and the milling machine to record cutting force signals along the X, Y, and Z axes. Additionally, three Kistler acceleration sensors were mounted on the workpiece to detect vibrations in the three axes, while a Kistler acoustic emission sensor monitored the ultra-high-frequency stress wave pulses generated during material deformation. The sensor data were processed using a Kistler 5019 A multichannel charge amplifier. All collected signals were acquired via a data acquisition card (NI DAQ PCI 1200, Texas, USA) at a sampling rate of 50 kHz. A LEICA MZ 12 microscope was used to measure the flank wear of each individual flute after finishing each surface. 
Finally, seven channels of signals were captured: force_x, force_y, force_z, vibration_x, vibration_y, vibration_z, and AE-rms (acoustic emission root mean square), with the flank wear set as the target value. The wear of the three flutes is stored under flute1-3, and their average was recorded as the actual wear of the milling tool. The overall experimental setup is illustrated in Fig. 6.
[See PDF for image]
Fig. 6
Illustration of the experimental setup for 2010 PHM Data Challenge data set
A total of six datasets, corresponding to the wear data of six identical tools, were obtained under the same milling conditions. These datasets, labeled C1, C2, C3, C4, C5, and C6, each contain 315 operational records (cut numbers), and the time series of each operation consists of over 200,000 time steps. Only C1, C4, and C6 include actual tool wear measurements taken after each cut; therefore, they were used for the validation experiments. Figs. 7 and 8 show samples of the raw cutting force signal (force_x) and vibration signal (vibration_x) in the x-direction during the 15th, 165th, and 270th cutting operations of the C4 dataset. To avoid the effects of tool entry and exit, a random starting point is selected between the 50,001st and the 100,000th time step (with the same starting point maintained across different cutting operations), and 1,000 consecutive time steps are extracted for observation. Throughout the experiment corresponding to the C4 dataset, both the cutting force and the vibration in the x-direction increase with the number of cutting operations: at higher cut numbers, force_x and vibration_x are significantly larger than at lower cut numbers.
[See PDF for image]
Fig. 7
Force_x during different cutting operations
[See PDF for image]
Fig. 8
Vibration_x during different cutting operations
Data preprocessing and experimental setup
Although the PHM2010 dataset contains data from a total of seven channels, this experiment specifically utilized time series data from six sensor channels: force_x, force_y, force_z, vibration_x, vibration_y, and vibration_z. The large volume of data arising from the over 200,000 time steps per cut poses a significant challenge to the processing capabilities of the LSTM model. To avoid the signal distortion that downsampling would cause, the complete time series of each cut is instead condensed by extracting various features. The extracted features encompass three components: time domain, frequency domain, and time–frequency domain. Time domain features are directly derived from the raw signal's time series data, while frequency domain features are obtained through the fast Fourier transform (FFT). Time–frequency domain features are extracted using the Wavelet Packet Transform (WPT) with the 'db3' wavelet function and three levels of decomposition, which yields features from eight nodes [43]. Each node represents the variation of the signal within a specific frequency range, with node names consisting of low-frequency components ('a', approximation) and high-frequency components ('d', detail). After multiple decompositions, the signal is divided into several frequency subbands, reflecting the energy and information of the signal in different frequency bands. Specifically, nodes such as 'aaa', 'aad', 'ada', and 'add' represent the low-frequency and high-frequency components of the signal, corresponding to the steady part and the detail part of the signal, respectively. The L2 norm (also known as the Euclidean norm) of each node reflects the signal energy or variation intensity within the corresponding frequency band. The calculation formula for the L2 norm is as follows:
$$\lVert x \rVert_2 = \sqrt{\sum_{i=1}^{n} x_i^2} \tag{13}$$
where $x_i$ represents the signal data in the frequency band, and $n$ is the number of data points.
The specific features of each component are delineated in Table 1.
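To make the node naming and the per-node L2 norm concrete, the sketch below performs a three-level wavelet packet decomposition and returns the norm of each of the eight final nodes. It deliberately uses the Haar filter instead of 'db3' so that it stays dependency-free. The tree structure and node names are the same, but the coefficient values differ from those a 'db3' decomposition would produce. The function names are assumptions for illustration.

```python
import math


def haar_split(x):
    """One Haar analysis step: return (approximation, detail), each of half length."""
    s = math.sqrt(2.0)
    approx = [(x[k] + x[k + 1]) / s for k in range(0, len(x) - 1, 2)]
    detail = [(x[k] - x[k + 1]) / s for k in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_norms(x, levels=3):
    """L2 norm of every node at the final level, keyed by its 'a'/'d' path."""
    nodes = {"": list(x)}
    for _ in range(levels):
        next_nodes = {}
        for path, coeffs in nodes.items():
            approx, detail = haar_split(coeffs)
            next_nodes[path + "a"] = approx  # low-frequency half of this band
            next_nodes[path + "d"] = detail  # high-frequency half of this band
        nodes = next_nodes
    return {path: math.sqrt(sum(c * c for c in coeffs)) for path, coeffs in nodes.items()}


signal = [math.sin(0.3 * k) for k in range(64)]  # illustrative signal
norms = wavelet_packet_norms(signal)
# keys: 'aaa', 'aad', 'ada', 'add', 'daa', 'dad', 'dda', 'ddd'
```

Because the Haar transform is orthonormal, the squared norms of the eight nodes sum to the energy of the original signal when its length is a multiple of eight, which is why the norms can be read as a split of signal energy across frequency bands.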
Table 1. The initially extracted time-domain, frequency-domain, and time–frequency domain features
| Time-Domain | Frequency-Domain | Time–Frequency Domain (norms of nodes) |
|---|---|---|
| Absolute mean value | Frequency center | aaa |
| Max value | Mean square frequency | aad |
| Root mean square | Root mean square frequency | ada |
| Root amplitude | Variance of frequency | add |
| Skewness | | daa |
| Kurtosis | | dad |
| Shape factor | | dda |
| Pulse factor | | ddd |
| Skewness factor | | |
| Crest factor | | |
| Clearance factor | | |
| Kurtosis factor | | |
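The paper does not spell out the formulas behind the time-domain and frequency-domain entries of Table 1, so the sketch below uses definitions that are common in the condition-monitoring literature. Each definition, the use of the absolute peak for "max value", and the function names are assumptions. The skewness factor and kurtosis factor are omitted because their definitions vary between sources. A naive discrete Fourier transform replaces the FFT to keep the example dependency-free.

```python
import cmath
import math


def time_domain_features(x):
    """Common statistical descriptors of a signal segment (assumed definitions)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    root_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # non-zero for non-constant signals
    return {
        "absolute_mean": abs_mean,
        "max": peak,
        "rms": rms,
        "root_amplitude": root_amp,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "shape_factor": rms / abs_mean,
        "pulse_factor": peak / abs_mean,
        "crest_factor": peak / rms,
        "clearance_factor": peak / root_amp,
    }


def power_spectrum(x, fs):
    """Naive O(n^2) DFT power spectrum over the non-negative frequency bins."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def frequency_domain_features(x, fs):
    """Power-weighted spectral moments of the signal."""
    freqs, power = power_spectrum(x, fs)
    total = sum(power)
    fc = sum(f * p for f, p in zip(freqs, power)) / total
    msf = sum(f * f * p for f, p in zip(freqs, power)) / total
    return {
        "frequency_center": fc,
        "mean_square_frequency": msf,
        "rms_frequency": math.sqrt(msf),
        "frequency_variance": msf - fc ** 2,  # equals the weighted mean of (f - fc)^2
    }


fs = 64  # illustrative sampling rate in Hz
tone = [math.sin(2 * math.pi * 5 * t / fs) for t in range(64)]  # 5 Hz test tone
td = time_domain_features(tone)
fd = frequency_domain_features(tone, fs)
```

For the 5 Hz test tone, the RMS comes out at about 0.707 and the frequency center at about 5 Hz with near-zero frequency variance, which gives a quick sanity check of the definitions.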
In statistics, the Pearson correlation coefficient, also known as the Pearson product-moment correlation coefficient [44], is a measure used to quantify the correlation between two variables, X and Y. It is defined as the ratio of the covariance between the two variables to the product of their standard deviations:
$$r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \tag{14}$$
The correlation coefficient r takes values between −1 and 1. When r > 0, it signifies a positive relationship between two variables; conversely, r < 0 indicates a negative relationship. An r value of 0 suggests no linear association between the variables. The closer r is to ± 1, the stronger the correlation. In the data preprocessing phase of this experiment, the actual wear values from the three channels of flute1-3 were averaged to compute the final wear metric. Pearson correlation coefficients were then calculated between the wear metric and the 24 extracted features from the six channels. Features with a correlation coefficient greater than 0.7 were retained, reducing the feature space from 315 × 6 × 24 to 315 × 6 × 12, thus ensuring a robust relationship between input variables and the target output. The features corresponding to each channel after filtering are as follows: (AMV-Absolute mean value, MV-Max value, RMS-Root mean square, RA-Root amplitude, Sk-Skewness, Ku-Kurtosis).
Force_x Channel: [AMV, MV, RMS, Sk, aaa, aad, ada, add, daa, dad, dda, ddd].
Force_y Channel: [AMV, MV, RMS, RA, aaa, aad, ada, add, daa, dad, dda, ddd].
Force_z Channel: [AMV, MV, RMS, RA, aaa, aad, ada, add, daa, dad, dda, ddd].
Vibration_x Channel: [AMV, MV, RMS, RA, Sk, aaa, aad, ada, add, dad, dda, ddd].
Vibration_y Channel: [AMV, MV, RMS, RA, Sk, aaa, aad, ada, add, daa, dda, ddd].
Vibration_z Channel: [AMV, MV, RMS, RA, Sk, Ku, aaa, aad, ada, add, dda, ddd].
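The filtering step can be illustrated with a short sketch that computes the Pearson coefficient between each feature and the wear target and keeps those above the 0.7 threshold. The feature names and values are invented for the example. Following the wording of the text, the comparison uses the signed coefficient, so strongly negatively correlated features are not retained.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def select_features(features, target, threshold=0.7):
    """Return the names of features whose correlation with the target exceeds the threshold."""
    return [name for name, values in features.items() if pearson_r(values, target) > threshold]


wear = [10.0, 20.0, 30.0, 40.0, 50.0]  # illustrative target values
candidates = {
    "rms": [1.0, 2.1, 2.9, 4.2, 5.0],      # rises with wear
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],    # weakly related
    "inverse": [5.0, 4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated
}
kept = select_features(candidates, wear)
# kept == ['rms']
```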
To verify the correlation between the remaining selected features and the prediction target (actual value for C1, C4, and C6 in Fig. 12), the root mean square (RMS) from the time-domain features and the 'ada' node were chosen as representatives for analysis. Figure 9 illustrates that the RMS, commonly used as a critical metric for evaluating statistical amplitude, exhibits its largest increase and highest growth rate across all force channels in the three datasets between the 150th and 300th cutting operations. This indicates a significant rise in cutting forces during this phase, which aligns with the rapid tool wear observed in the same stage, as depicted in Fig. 12. Similarly, Fig. 10 shows that the L2 norm of the 'ada' node across all vibration channels in the three datasets follows an increasing trend similar to that of the RMS during this cutting phase. This growth in the L2 norm reflects an intensification of instability in the cutting process, which is consistent with the actual increase in tool wear and its degradation state. These two representative features support the correlation between the remaining extracted features and the predicted target values.
[See PDF for image]
Fig. 9
The variation trend of RMS in force channels
[See PDF for image]
Fig. 10
The variation trend of the ada node norm
Additionally, during data preprocessing, input data underwent standardization and normalization to enhance model training efficiency and ensure data quality and consistency. Standardization was performed using the formula:
$$z = \frac{x - \mu}{\sigma} \tag{15}$$
where $x$ is a raw data value, $\mu$ represents the mean of the data, and $\sigma$ represents the standard deviation. This transformation adjusts the data to have a mean of 0 and a standard deviation of 1, aligning the distribution to a standard normal form. It also eliminates biases caused by differing scales or magnitudes across features. This is particularly crucial in machine learning models that involve gradient-based optimization, as it prevents features with larger scales from dominating the weight updates during training. Normalization, on the other hand, was performed using the formula:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{16}$$
This operation scales the data into the range [0, 1], maintaining the relative proportions while limiting the range of values. Normalization is especially important for features with physical significance, such as wear measurements, as it ensures consistent relative weights and avoids distortion caused by extreme values. By combining standardization and normalization, this study achieved consistency in the magnitude of both features and target labels. This preprocessing strategy not only accelerated model convergence but also reduced instability caused by varying data scales, ensuring that differences in magnitude between features and labels did not introduce bias into model training.
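The two transformations in Eqs. (15) and (16) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the sample values, and the choice to apply z-scoring to a feature column and min-max scaling to wear labels are illustrative assumptions, not details taken from the original implementation.

```python
import statistics


def standardize(values):
    """Eq. (15): subtract the mean and divide by the standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Eq. (16): rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values, for illustration only.
rms_feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
wear_labels = [50.0, 75.0, 100.0, 150.0]

z = standardize(rms_feature)
scaled_wear = min_max_normalize(wear_labels)
```

After the call, `z` has mean 0 and standard deviation 1, and `scaled_wear` lies in [0, 1]. In practice the statistics would be computed on the training set and reused for the validation set.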
In previous studies, it was observed that the wear trends of the three tools differed significantly. Due to the individual variability among tools, using data from two tools to predict the wear of the third often yielded suboptimal results. While these predictions could capture the overall trend, they performed poorly in finer details [8]. In the fitting experiment, all samples from the different tool sets were fed into the same model for training. Upon completion, the models were tested on datasets C1, C4, and C6, achieving accurate fitting results with minimal fluctuations. Based on these findings, this experiment combined all the samples, randomly selecting 80% as the training set and 20% as the validation set to avoid bias.
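The pooled random split described above can be sketched as follows. This is a minimal illustration, assuming each sample is identified by a (tool, index) pair; the per-tool sample count of 100 and the fixed seed are placeholders rather than values from the study.

```python
import random


def pooled_split(samples_by_tool, train_fraction=0.8, seed=0):
    """Pool samples from all tools, shuffle once, and cut into train/validation."""
    pooled = [(tool, s) for tool, samples in samples_by_tool.items() for s in samples]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(pooled)
    cut = round(train_fraction * len(pooled))
    return pooled[:cut], pooled[cut:]


# Placeholder data: 100 sample indices per tool (not the real sample counts).
samples_by_tool = {"C1": range(100), "C4": range(100), "C6": range(100)}
train_set, val_set = pooled_split(samples_by_tool)
```

Because the shuffle happens after pooling, each of the three tools contributes to both subsets, which is the property the combined-sample design relies on.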
After data preprocessing, the finalized input data were first used to train the basic LSTM, LR, MLP, SVR, and CNN models, with the results shown in Table 2. These results showed that the LSTM was the strongest candidate for the base model to be optimized by the genetic algorithm (GA). With the LSTM confirmed as the optimization target, the finalized input data were then used to train the basic LSTM model, the BiLSTM model, GA-optimized LSTM models under GA settings of varying scale, and GA-optimized BiLSTM models. The predictive performance of these models for tool wear was evaluated using the aforementioned MAE and RMSE metrics, along with visual comparisons of the fitted curves.
Table 2. Performance comparison among algorithms using the 2010 PHM Data Challenge data set
Algorithms | MAE (C1) | MAE (C4) | MAE (C6) | RMSE (C1) | RMSE (C4) | RMSE (C6) |
|---|---|---|---|---|---|---|
Basic LSTM | 0.018 | 0.025 | 0.022 | 0.024 | 0.028 | 0.028 |
LR | 0.048 | 0.033 | 0.030 | 0.066 | 0.052 | 0.044 |
MLP | 0.037 | 0.034 | 0.024 | 0.048 | 0.055 | 0.044 |
SVR | 0.034 | 0.032 | 0.022 | 0.044 | 0.050 | 0.039 |
CNN | 0.037 | 0.033 | 0.024 | 0.047 | 0.056 | 0.043 |
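For reference, the MAE and RMSE values reported in Table 2 follow the standard definitions, which can be sketched in a few lines. The wear values below are hypothetical and serve only to exercise the functions.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more strongly than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical normalized wear values, for illustration only.
actual = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.40]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

RMSE is never smaller than MAE on the same predictions, which matches the pattern in Table 2, where every RMSE entry is at least as large as the corresponding MAE entry.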
Result and discussion
Table 2 compares the MAE and RMSE of the basic LSTM model with those of the traditional models after training on the C1, C4, and C6 datasets. The LSTM achieves the lowest error on every metric except the MAE on C6, where it ties with SVR at 0.022. This confirms its suitability as the base model for further optimization.
In all the cross-comparison experiments where the baseline LSTM model is the target for optimization, the number of layers is fixed at 2, and both the baseline and the GA-optimized models are trained for 500 epochs. Other hyperparameters, however, vary based on the optimization process. Figure 11 illustrates the key parameters in the LSTM model, such as Hidden size, Batch size, and Learning rate, which are also targeted by the genetic algorithm for optimization. Their respective search spaces within the genetic algorithm and other settings are outlined in Table 3. To ensure experimental consistency, the population size in the genetic algorithm remains constant across different target models (LSTM and BiLSTM), thereby guaranteeing the validity of the horizontal comparisons. In contrast, for comparisons within the same target model (LSTM or BiLSTM), variations in population size and number of generations are introduced to assess the genetic algorithm’s performance in optimizing the same baseline model under different population settings. The specific design and results of the various control experiments are as follows:
(1) Basic LSTM model: The hyperparameters were set as follows: Hidden size = 64, Batch size = 128, Learning rate = 0.002. After training for the specified number of epochs, the fitting results are shown in Fig. 12.
(2) Basic BiLSTM model: To ensure the validity of the comparative experiment, the hyperparameters were set to be the same as those of the basic LSTM model: Hidden size = 64, Batch size = 128, Learning rate = 0.002. After training for the specified number of epochs, the fitting results are shown in Fig. 13.
(3) GA-optimized basic LSTM model: Three comparative groups were configured with Population size = 6, Number of Generations = 10; Population size = 12, Number of Generations = 10; and Population size = 12, Number of Generations = 20. The configuration with the best fitting performance (Population size = 12, Number of Generations = 20) is illustrated in Fig. 14.
(4) GA-optimized BiLSTM model: Apart from replacing the basic LSTM with BiLSTM, all other settings are identical to experiment (3). The experiment with the best fitting performance, using Population size = 12 and Number of Generations = 20, is illustrated in Fig. 15.
[See PDF for image]
Fig. 11
Illustration of LSTM hyperparameters (adapted from the descriptions by Hochreiter and Schmidhuber [33] and a review article [37])
Table 3. Respective search spaces within the genetic algorithm and other settings
Hyperparameters | Values |
|---|---|
Learning rate | (0.0001,0.01) |
Hidden size | (32,128) |
Batch size | (32,256) |
Population size | [4, 8] |
Generations | [5, 10] |
Crossover probability | 0.8 |
Mutation probability | 0.3 |
Selection method | selTournament |
Crossover method | cxBlend(alpha = 0.3) |
Mutation method | mutPolynomialBounded(eta = 0.3,indpb = 0.3) |
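The search loop implied by Table 3 can be sketched in plain Python. The bounds, crossover probability, mutation probability, and blend factor are taken from Table 3; everything else is an assumption made for illustration. In particular, `placeholder_fitness` stands in for training a model and returning its validation error, its target values are hypothetical, the blend step only imitates a cxBlend-style crossover, and a clipped Gaussian mutation is used in place of polynomial bounded mutation to keep the sketch short.

```python
import random

# Search bounds from Table 3.
BOUNDS = {
    "learning_rate": (0.0001, 0.01),
    "hidden_size": (32, 128),
    "batch_size": (32, 256),
}
KEYS = list(BOUNDS)


def clip(ind):
    """Force every gene back inside its search interval."""
    return [min(max(g, BOUNDS[k][0]), BOUNDS[k][1]) for g, k in zip(ind, KEYS)]


def decode(ind):
    """Turn a real-valued individual into usable hyperparameters."""
    lr, hidden, batch = ind
    return {"learning_rate": lr, "hidden_size": round(hidden), "batch_size": round(batch)}


def placeholder_fitness(ind):
    """Assumption: stands in for 'train the model, return validation error'."""
    target = (0.002, 96, 64)  # hypothetical optimum, not a result from the paper
    return sum(((g - t) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2
               for g, t, k in zip(ind, target, KEYS))


def tournament(pop, scores, rng, k=3):
    """Pick k random individuals and return a copy of the one with the lowest error."""
    picks = rng.sample(range(len(pop)), k)
    return list(pop[min(picks, key=lambda i: scores[i])])


def blend(a, b, rng, alpha=0.3):
    """Blend crossover: children are random mixes of the two parents."""
    c1, c2 = [], []
    for x, y in zip(a, b):
        gamma = (1 + 2 * alpha) * rng.random() - alpha
        c1.append((1 - gamma) * x + gamma * y)
        c2.append(gamma * x + (1 - gamma) * y)
    return clip(c1), clip(c2)


def mutate(ind, rng, indpb=0.3):
    """Clipped Gaussian mutation (simplified substitute for polynomial mutation)."""
    out = list(ind)
    for i, k in enumerate(KEYS):
        if rng.random() < indpb:
            lo, hi = BOUNDS[k]
            out[i] += rng.gauss(0, 0.1 * (hi - lo))
    return clip(out)


def run_ga(fitness, pop_size=12, generations=20, cxpb=0.8, mutpb=0.3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(*BOUNDS[k]) for k in KEYS] for _ in range(pop_size)]
    best, best_score = None, float("inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        for ind, s in zip(pop, scores):
            if s < best_score:  # remember the best individual seen so far
                best, best_score = list(ind), s
        children = []
        while len(children) < pop_size:
            a, b = tournament(pop, scores, rng), tournament(pop, scores, rng)
            if rng.random() < cxpb:
                a, b = blend(a, b, rng)
            children.extend(mutate(c, rng) if rng.random() < mutpb else c for c in (a, b))
        pop = children[:pop_size]
    return decode(best), best_score


best_params, best_error = run_ga(placeholder_fitness)
```

Replacing `placeholder_fitness` with a function that trains an LSTM or BiLSTM on the decoded hyperparameters and returns its validation error would turn the sketch into an actual hyperparameter search.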
[See PDF for image]
Fig. 12
Prediction results of the basic LSTM model
[See PDF for image]
Fig. 13
Prediction results of the basic BiLSTM model
[See PDF for image]
Fig. 14
Prediction results of the GA enhanced LSTM model (P = 12, G = 20)
[See PDF for image]
Fig. 15
Prediction results of the GA enhanced BiLSTM model (P = 12, G = 20)
The MAE and RMSE values obtained from all experiments after the specified number of training epochs, as well as the reductions in MAE and RMSE relative to the basic LSTM, are presented in Table 4. The results are visualized in Fig. 16; the algorithm represented by the letter below each bar in the histogram is given in Table 5.
Table 4. Performance comparison among algorithms based on different improved LSTM models using the 2010 PHM Data Challenge data set
Algorithms | MAE (C1) | MAE (C4) | MAE (C6) | MAE decline (C1, %) | MAE decline (C4, %) | MAE decline (C6, %) | RMSE (C1) | RMSE (C4) | RMSE (C6) | RMSE decline (C1, %) | RMSE decline (C4, %) | RMSE decline (C6, %) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
Basic LSTM | 0.018 | 0.025 | 0.022 | - | - | - | 0.024 | 0.028 | 0.028 | - | - | - |
Bi-LSTM | 0.018 | 0.023 | 0.021 | 0.0 | 8.0 | 4.5 | 0.023 | 0.023 | 0.026 | 4.2 | 17.9 | 7.1 |
GA-LSTM (P = 6,G = 10) | 0.017 | 0.024 | 0.020 | 5.6 | 4.0 | 9.1 | 0.023 | 0.025 | 0.025 | 4.2 | 10.7 | 10.7 |
GA-LSTM (P = 12,G = 10) | 0.015 | 0.014 | 0.016 | 16.7 | 44.0 | 27.3 | 0.021 | 0.019 | 0.022 | 12.5 | 32.1 | 21.4 |
GA-LSTM (P = 12,G = 20) | 0.010 | 0.011 | 0.013 | 44.4 | 56.0 | 40.9 | 0.015 | 0.015 | 0.020 | 37.5 | 46.4 | 28.6 |
GA-BiLSTM (P = 6,G = 10) | 0.017 | 0.023 | 0.020 | 5.6 | 8.0 | 9.1 | 0.022 | 0.021 | 0.025 | 8.3 | 25.0 | 10.7 |
GA-BiLSTM (P = 12,G = 10) | 0.014 | 0.014 | 0.015 | 22.2 | 44.0 | 31.8 | 0.019 | 0.018 | 0.020 | 20.8 | 35.7 | 28.6 |
GA-BiLSTM (P = 12,G = 20) | 0.008 | 0.007 | 0.012 | 55.6 | 72.0 | 45.5 | 0.011 | 0.010 | 0.016 | 54.2 | 64.3 | 42.9 |
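The decline columns in Table 4 are relative reductions with respect to the basic LSTM, that is, 100 × (baseline − improved) / baseline. The short sketch below reproduces the MAE decline of the GA-BiLSTM (P = 12, G = 20) row from the tabulated error values; the function and variable names are illustrative.

```python
def percent_decline(baseline, improved):
    """Relative error reduction with respect to the baseline, in percent."""
    return round(100.0 * (baseline - improved) / baseline, 1)


# MAE values copied from Table 4.
baseline_mae = {"C1": 0.018, "C4": 0.025, "C6": 0.022}
ga_bilstm_mae = {"C1": 0.008, "C4": 0.007, "C6": 0.012}

mae_decline = {
    name: percent_decline(baseline_mae[name], ga_bilstm_mae[name])
    for name in baseline_mae
}
# mae_decline -> {"C1": 55.6, "C4": 72.0, "C6": 45.5}
```

The same formula applied to the RMSE columns (for example 0.028 to 0.010 on C4) gives the 64.3% figure quoted for the best configuration.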
[See PDF for image]
Fig. 16
The MAE and RMSE values obtained from experiments on datasets C1, C4, and C6 after the specified number of training epochs under different algorithms
Table 5. The algorithm represented by each letter
Letter | Algorithm |
|---|---|
A | Basic LSTM |
B | Bi-LSTM |
C | GA-LSTM (P = 6, G = 10) |
D | GA-LSTM (P = 12, G = 10) |
E | GA-LSTM (P = 12, G = 20) |
F | GA-BiLSTM (P = 6, G = 10) |
G | GA-BiLSTM (P = 12, G = 10) |
H | GA-BiLSTM (P = 12, G = 20) |
Comparing the fitting results in Figs. 12, 13, 14, 15 with the MAE and RMSE values in Table 4 and Fig. 16 reveals a consistent trend across the C1, C4, and C6 datasets. Replacing the basic LSTM with the BiLSTM improves fitting accuracy in almost every case. Relative to the basic LSTM, the BiLSTM reduces MAE by 0.0% (C1), 8.0% (C4), and 4.5% (C6), and RMSE by 4.2% (C1), 17.9% (C4), and 7.1% (C6); only the MAE on C1 is unchanged. The same holds under GA optimization: at each population and generation setting, the GA-BiLSTM errors are no higher than those of the corresponding GA-LSTM. These results indicate that the BiLSTM captures temporal dependencies within the input sequences more effectively, which improves overall fitting performance.
The genetic algorithm further optimized both the LSTM and BiLSTM models. When the population is small, each generation explores only a few candidate hyperparameter combinations, which limits the degree of optimization; enlarging the population substantially improves fitting accuracy. For instance, with a population size of 6 and 10 generations, the GA-LSTM reduces MAE on C1, C4, and C6 by 5.6%, 4.0%, and 9.1%, respectively, and RMSE by 4.2%, 10.7%, and 10.7%. Increasing the population size to 12 while keeping the generation count at 10 yields larger gains, with MAE reductions of 16.7%, 44.0%, and 27.3% and RMSE reductions of 12.5%, 32.1%, and 21.4%.
Similarly, increasing the number of generations at a fixed population size improves fitting accuracy. For example, the GA-BiLSTM model with a population size of 12 and 10 generations achieves MAE reductions of 22.2%, 44.0%, and 31.8% and RMSE reductions of 20.8%, 35.7%, and 28.6% relative to the basic LSTM model. Raising the generation count to 20 at the same population size yields MAE reductions of 55.6%, 72.0%, and 45.5% and RMSE reductions of 54.2%, 64.3%, and 42.9%. These comparisons show that combining the bidirectional temporal modeling of the BiLSTM with a larger genetic algorithm search (a larger population and more generations) significantly improves the predictive accuracy of tool wear.
Although genetic algorithms are effective for hyperparameter selection, they have two principal limitations: computational resources and time. Training and evaluating many complete LSTM or BiLSTM models requires substantial computation, although GPU parallelism alleviates this to a considerable extent. In addition, training every individual for 500 epochs during fitness evaluation remains time-intensive even with GPU acceleration. We observed empirically that model performance stabilizes at around epoch 50 and that the performance ranking of individuals at this point is consistent with the ranking at epoch 500. We therefore recommend evaluating individuals at epoch 50 during the genetic algorithm run and then continuing the training of the most promising individuals to 500 epochs. This strategy cuts the training and evaluation time for each individual in the population to one-tenth of the original, markedly reducing the time and computational cost of the genetic algorithm search.
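The saving from the shortened evaluation can be checked with a rough budget calculation. The sketch below counts training epochs only and assumes that every individual is trained from scratch in every generation (an upper bound on the number of evaluations) and that a single finalist is continued to the full epoch count; both are simplifying assumptions rather than details from the experiments.

```python
def ga_training_epochs(pop_size, generations, epochs_per_eval,
                       finalists=0, full_epochs=0):
    """Total training epochs for a GA run, plus finishing the finalists."""
    search = pop_size * generations * epochs_per_eval
    # Finalists already have epochs_per_eval epochs; train only the remainder.
    finish = finalists * max(full_epochs - epochs_per_eval, 0)
    return search + finish


full_budget = ga_training_epochs(12, 20, 500)                    # 120000 epochs
short_budget = ga_training_epochs(12, 20, 50, finalists=1,
                                  full_epochs=500)               # 12450 epochs
ratio = short_budget / full_budget
```

Under these assumptions the shortened scheme needs roughly 10% of the epochs of the full scheme for P = 12 and G = 20, with the small excess over one-tenth coming from finishing the finalist.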
Conclusion
This study proposes an online tool wear monitoring and prediction system based on a Genetic Algorithm-optimized Bidirectional Long Short-Term Memory Network (GA-Bi-LSTM) and validates its effectiveness using the publicly available PHM2010 dataset and experimental results. The main conclusions are as follows:
1. The model leverages the time-domain and time–frequency domain features extracted and filtered during data preprocessing. By employing the bidirectional LSTM architecture, the model accurately captures the bidirectional temporal dependencies in the flank wear process of Ball Nose Tungsten Carbide Cutters during the machining of Stainless Steel HRC52. Compared to traditional unidirectional LSTM models, the GA-Bi-LSTM demonstrates superior information utilization efficiency and predictive accuracy. Within the model, the genetic algorithm optimizes key hyperparameters globally through selection, crossover, and mutation operations, significantly enhancing the model's robustness and generalization ability. Additionally, the optimization of the genetic algorithm’s training strategy, such as reducing the training epochs for the initial population, effectively reduces computational costs while maintaining the performance of the final model. This optimization approach provides a feasible path for scaling the model in complex industrial scenarios, such as tool life management in large-scale production environments, real-time quality monitoring in precision manufacturing, and anomaly detection in equipment health monitoring.
2. Comparative analysis with existing techniques shows that GA-Bi-LSTM is better suited to complex nonlinear time-series data, particularly with the largest search budget tested (population size P = 12, number of generations G = 20). On the PHM2010 dataset, the GA-Bi-LSTM (P = 12, G = 20) achieves reductions in MAE of 55.6%, 72.0%, and 45.5% on the C1, C4, and C6 datasets, respectively, and corresponding reductions in RMSE of 54.2%, 64.3%, and 42.9% compared to the baseline LSTM model. Furthermore, the error values of the SVR, MLP, and unoptimized BiLSTM models (Tables 2 and 4) are significantly higher than those of GA-Bi-LSTM, further validating the superiority of the proposed model in tool wear prediction tasks.
3. Despite the exceptional performance of the GA-Bi-LSTM model in tool wear prediction, the results of this study are derived under specific conditions, including the machining process, material (stainless steel HRC52), and cutting tool (ball nose tungsten carbide cutter). Therefore, the applicability of these findings is primarily limited to similar working conditions and should not be generalized to other machining types without further validation. Future research may incorporate additional optimization algorithms, such as particle swarm optimization or Bayesian optimization, to further enhance model performance and explore its applications in other time-series prediction tasks, such as equipment health monitoring, industrial process control, energy demand forecasting, logistics and transportation scheduling optimization, and financial time-series analysis.
Author contributions
Mr. Kailai Tan wrote the main manuscript text and prepared Figs. 1 to 16. Mr. Zhiqiang Liu contributed part of the methodology. Dr. Ruisong Jiang: conceptualization. All authors reviewed the manuscript.
Funding
No funding was received.
Data availability
The data that support the findings of this study are available in the PHM Society repository at https://www.phmsociety.org/competition/phm/10. These data were derived from the following resources available in the public domain: PHM 2010 Data Challenge.
Declarations
Competing interests
The authors declare no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Jain, AK; Lad, BK. A novel integrated tool condition monitoring system. J Intell Manuf; 2019; 30,
2. Kurada, S; Bradley, C. A review of machine vision sensors for tool condition monitoring. Comput Ind; 1997; 34,
3. Duan, J; Zhang, X; Shi, TL. A hybrid attention-based paralleled deep learning model for tool wear prediction. Expert Syst Appl; 2023; 211, 10. [DOI: https://dx.doi.org/10.1016/j.eswa.2022.118548]
4. Zhou, Y; Liu, CF; Yu, XL; Liu, B; Quan, Y. Tool wear mechanism, monitoring and remaining useful life (RUL) technology based on big data: a review. Sn Appl Sci; 2022; [DOI: https://dx.doi.org/10.1007/s42452-022-05114-9]
5. Lins, RG; de Araujo, PRM; Corazzim, M. In-process machine vision monitoring of tool wear for cyber-physical production systems. Robot Comput-Integr Manuf; 2020; 61, 17. [DOI: https://dx.doi.org/10.1016/j.rcim.2019.101859]
6. Zhang, CJ; Yao, XF; Zhang, JM; Jin, H. Tool condition monitoring and remaining useful life prognostic based on a wireless sensor in dry milling operations. Sensors; 2016; 16,
7. Wang, JJ; Xie, JY; Zhao, R; Zhang, LB; Duan, LX. Multisensory fusion based virtual tool wear sensing for ubiquitous manufacturing. Robot Comput-Integr Manuf; 2017; 45, pp. 47-58. [DOI: https://dx.doi.org/10.1016/j.rcim.2016.05.010]
8. Cai, WL; Zhang, WJ; Hu, XF; Liu, YC. A hybrid information model based on long short-term memory network for tool condition monitoring. J Intell Manuf; 2020; 31,
9. Shi, C; Panoutsos, G; Luo, B; Liu, H; Li, B; Lin, X. Using multiple-feature-spaces-based deep learning for tool condition monitoring in ultraprecision manufacturing. IEEE Trans Ind Electron; 2019; 66,
10. Wu, J; Su, Y; Cheng, Y; Shao, X; Deng, C; Liu, C. Multi-sensor information fusion for remaining useful life prediction of machining tools by adaptive network based fuzzy inference system. Appl Soft Comput; 2018; 68, pp. 13-23. [DOI: https://dx.doi.org/10.1016/j.asoc.2018.03.043]
11. Zhang, CD; Wang, W; Li, H. Tool wear prediction method based on symmetrized dot pattern and multi-covariance Gaussian process regression. Measurement; 2022; 189, 15. [DOI: https://dx.doi.org/10.1016/j.measurement.2021.110466]
12. Shi, DF; Gindy, NN. Tool wear predictive model based on least squares support vector machines. Mech Syst Signal Proc; 2007; 21,
13. D'Addona, DM; Ullah, A; Matarazzo, D. Tool-wear prediction and pattern-recognition using artificial neural network and DNA-based computing. J Intell Manuf; 2017; 28,
14. Gill, SS; Singh, R; Singh, J; Singh, H. Adaptive neuro-fuzzy inference system modeling of cryogenically treated AISI M2 HSS turning tool for estimation of flank wear. Expert Syst Appl; 2012; 39,
15. Yu, JS; Liang, S; Tang, DY; Liu, H. A weighted hidden Markov model approach for continuous-state tool wear monitoring and tool life prediction. Int J Adv Manuf Technol; 2017; 91,
16. Ertunc, HM; Loparo, KA; Ocak, H. Tool wear condition monitoring in drilling operations using hidden Markov models (HMMs). Int J Mach Tools Manuf; 2001; 41,
17. Shi, XJ; Chen, ZR; Wang, H; Yeung, DY; Wong, WK; Woo, WC. Convolutional LSTM network: a machine learning approach for precipitation nowcasting. Advances in Neural Information Processing Systems 28 (NIPS 2015); 2015.
18. Peng, T; Zhang, C; Zhou, JZ; Nazir, MS. An integrated framework of Bi-directional long-short term memory (BiLSTM) based on sine cosine algorithm for hourly solar radiation forecasting. Energy; 2021; 221, 19. [DOI: https://dx.doi.org/10.1016/j.energy.2021.119887]
19. Bao, WY; Tansel, IN. Modeling micro-end-milling operations. Part III: influence of tool wear. Int J Mach Tools Manuf; 2000; 40,
20. Kennedy J, Eberhart R. Proceedings of ICNN’95-international conference on neural networks. 1995;4:1942–8.
21. Kirkpatrick, S; Gelatt, CD; Vecchi, MP. Optimization by simulated annealing. Science; 1983; 220,
22. Sortino, M. Application of statistical filtering for optical detection of tool wear. Int J Mach Tools Manuf; 2003; 43,
23. Palanisamy, P; Rajendran, I; Shanmugasundaram, S. Prediction of tool wear using regression and ANN models in end-milling operation. Int J Adv Manuf Technol; 2008; 37,
24. Liang, Y; Hu, SS; Guo, WS; Tang, HQ. Abrasive tool wear prediction based on an improved hybrid difference grey wolf algorithm for optimizing SVM. Measurement; 2022; 187, 13. [DOI: https://dx.doi.org/10.1016/j.measurement.2021.110247]
25. Chen, YX; Jin, Y; Jiri, G. Predicting tool wear with multi-sensor data using deep belief networks. Int J Adv Manuf Technol; 2018; 99,
26. Lorentzon, J; Järvstråt, N. Modelling tool wear in cemented-carbide machining alloy 718. Int J Mach Tools Manuf; 2008; 48,
27. Huang, WJ; Zhang, XY; Wu, CQ; Cao, SY; Zhou, Q. Tool wear prediction in ultrasonic vibration-assisted drilling of CFRP: a hybrid data-driven physics model-based framework. Tribol Int; 2022; 174, 12. [DOI: https://dx.doi.org/10.1016/j.triboint.2022.107755]
28. Gao, KP; Xu, XX; Jiao, SJ. Measurement and prediction of wear volume of the tool in nonlinear degradation process based on multi-sensor information fusion. Eng Fail Anal; 2022; 136, 17. [DOI: https://dx.doi.org/10.1016/j.engfailanal.2022.106164]
29. Chacón, JLF; de Barrena, TF; García, A; de Buruaga, MS; Badiola, X; Vicente, J. A novel machine learning-based methodology for tool wear prediction using acoustic emission signals. Sensors; 2021; 21,
30. Domínguez-Monferrer, C; Fernández-Pérez, J; De Santos, R; Miguélez, MH; Cantero, JL. Machine learning approach in non-intrusive monitoring of tool wear evolution in massive CFRP automatic drilling processes in the aircraft industry. J Manuf Syst; 2022; 65, pp. 622-639. [DOI: https://dx.doi.org/10.1016/j.jmsy.2022.10.018]
31. Zhao, R; Yan, RQ; Wang, JJ; Mao, KZ. Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors; 2017; 17,
32. Zhao, R; Wang, DZ; Yan, RQ; Mao, KZ; Shen, F; Wang, JJ. Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Trans Ind Electron; 2018; 65,
33. Hochreiter, S; Schmidhuber, J. Long short-term memory. Neural Comput; 1997; 9,
34. Zaremba, WJ. Recurrent neural network regularization. arXiv; 2014; [DOI: https://dx.doi.org/10.48550/arXiv.1409.2329]
35. Yu, RG; Gao, J; Yu, M; Lu, WH; Xu, TY; Zhao, MK et al. LSTM-EFG for wind power forecasting based on sequential correlation features. Futur Gener Comp Syst; 2019; 93, pp. 33-42. [DOI: https://dx.doi.org/10.1016/j.future.2018.09.054]
36. Sak H, Senior A, Beaufays F, editors. Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling. 15th Annual Conference of the International-Speech-Communication-Association (INTERSPEECH 2014); 2014 Sep 14–18; Singapore, SINGAPORE. BAIXAS: Isca-Int Speech Communication Assoc; 2014.
37. Van Houdt, G; Mosquera, C; Nápoles, G. A review on the long short-term memory model. Artif Intell Rev; 2020; 53,
38. Siami-Namini S, Tavakoli N, Namin AS, editors. The performance of LSTM and BiLSTM in forecasting time series. 2019 IEEE International conference on big data (Big Data); 2019: IEEE.
39. Chen, T; Xu, RF; He, YL; Wang, X. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl; 2017; 72, pp. 221-230. [DOI: https://dx.doi.org/10.1016/j.eswa.2016.10.065]
40. Liu, G; Guo, JB. Bidirectional LSTM with attention mechanism and convolutional layer for text classification. Neurocomputing; 2019; 337, pp. 325-338. [DOI: https://dx.doi.org/10.1016/j.neucom.2019.01.078]
41. Goldberg, DE; Holland, JH. Genetic algorithms and machine learning. Mach Learn; 1988; 3,
42. Ding, SF; Su, CY; Yu, JZ. An optimizing BP neural network algorithm based on genetic algorithm. Artif Intell Rev; 2011; 36,
43. Sun, Z; Chang, CC. Structural damage assessment based on wavelet packet transform. J Struct Eng-ASCE; 2002; 128,
44. Schober, P; Boer, C; Schwarte, LA. Correlation coefficients: appropriate use and interpretation. Anesth Analg; 2018; 126,
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”).