1. Introduction
The development of electronic trading systems and computing technology has made stock trading more efficient since the end of the 20th century. The stock market has experienced tremendous expansion and generated a large amount of stock trading data and information. Therefore, how to grasp the operating mechanism of stock price movements has become a hot topic for researchers. However, the price trend in the stock market is a complex nonlinear dynamic system [1]. The time series of stock prices is influenced by the internal micro environment of enterprises, such as company performance and growth potential. In addition, stock prices are influenced by external macroeconomic factors, such as changes in Gross Domestic Product (GDP), market interest rates, and media opinion [2]. The fluctuation of stock prices is often described as a stochastic process [3]. Econometric models have been used to describe stock behavior, including traditional time series prediction methods such as exponential smoothing and the autoregressive integrated moving average (ARIMA) model [4, 5].
In recent years, artificial intelligence technology has been widely used to solve complex nonlinear time series stock prediction problems [6]. Machine learning and deep learning are the main effective methods for predicting stock prices. By constructing neural networks, these models gain the ability to simulate the human brain in analyzing, learning from, and interpreting data such as images, sound, and text. A common approach is to treat stock price prediction as a time series regression problem. The common regression metrics used to measure prediction performance include mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the goodness of fit (R2) of the model [7–10].
Although the methods proposed in many studies improve the performance of LSTM models in many respects, there are still some potential limitations and sources of error or bias. (1) Computational complexity. Combining LSTM with optimization algorithms such as adaptive genetic algorithms may increase computational complexity, resulting in longer training times. (2) Model generalization ability. Although hyperparameters are optimized, the generalization ability of the model under different market conditions (such as bull and bear markets) may be limited, requiring more validation and testing. (3) The influence of randomness. Metaheuristic algorithms are stochastic and need to be run multiple times to ensure the stability of the results. (4) Data quality and preprocessing strategies. Data quality and preprocessing methods determine how noise and outliers in stock prices are handled, which has an important impact on prediction results. Particle Swarm Optimization (PSO) is a swarm intelligence algorithm with unique advantages due to its high efficiency.
In this study, a novel PSO-LSTM stock price prediction model is proposed that leverages PSO to optimize LSTM hyperparameters. The decision to employ PSO for hyperparameter optimization stems from its efficacy in addressing complex optimization problems, particularly in the high-dimensional spaces encountered in deep learning models like LSTMs. PSO excels at searching for global optima and converges relatively quickly compared to other optimization techniques, thus efficiently exploring hyperparameter combinations and saving computational resources. Moreover, the non-convex nature of hyperparameter optimization in LSTMs necessitates robust algorithms like PSO, capable of navigating complex, non-linear search spaces. Additionally, PSO's ease of implementation and tuning makes it accessible for researchers and practitioners seeking effective hyperparameter optimization strategies. By leveraging PSO, the predictive performance of the LSTM-based stock price prediction model is enhanced while the challenges posed by high-dimensional optimization tasks are addressed. In addition, stock prices are high-noise, high-dimensional time series, so data preprocessing methods and feature selection strategies are also important for improving the performance of stock prediction models. Therefore, the model proposed in this study consists of three parts: (1) Data preprocessing, which includes wavelet transform (WT) and correlation analysis; WT is applied to denoise the time series, and correlation analysis is used to select the input features. (2) PSO is used to optimize the number of training iterations and the number of hidden neurons in the LSTM neural network. (3) The hyperparameters of the optimal solution obtained from PSO are used as inputs of the LSTM model to predict stock prices. By comparing LSTM models with different numbers of hidden layers, the PSO-LSTM configuration with the best performance is selected as the output. To analyze and compare the predicted results, RMSE, MAE, MAPE, and R2 are applied to evaluate the regression models.
The efficient market hypothesis (EMH) assumes that the degree of market development is heterogeneous. Selecting indices from different levels of market development can help explain the robustness of algorithms as market conditions may potentially affect the effectiveness of stock forecasting. Therefore, six stock indices are used to test the performance of the model. These indices include the Dow Jones Industrial Average (DJIA) index of the New York Stock Exchange, Standard & Poor’s 500 (S&P 500) index, Nikkei 225 index of Tokyo, Hang Seng Index of Hong Kong market, CSI300 index of Chinese Mainland stock market, and Nifty50 index of India. Among these six stock indices, DJIA and S&P 500 represent the most developed and efficient market. Hang Seng Index and Nikkei 225 represent the middle stage between efficient and inefficient markets. CSI300 and Nifty50 represent developing markets.
The main contributions and novelty of this article are as follows. First, a comprehensive experimental analysis utilizing data from six global stock indices across markets of varying development levels facilitates a thorough comparative study of model performance. Second, a novel PSO-LSTM stock price prediction model is introduced that leverages particle swarm optimization to optimize LSTM hyperparameters. Finally, the influence of different retrospective periods (50 days, 20 days, and 7 days) on model performance is investigated, providing insights into the model's efficacy across different prediction cycles for practical applications.
The remainder of this article is organized as follows. Section 2 mentions the related work. Section 3 introduces the methodology applied in this study. Section 4 presents the details of the experimental design. Section 5 illustrates the experimental results. Section 6 summarizes the research.
2. Related work
Machine Learning (ML) has emerged as a transformative tool across various scientific and industrial domains. In the realm of environmental science, ML has been applied to landslide susceptibility mapping (LSM) in the Three Gorges Reservoir area. Utilizing Automated Machine Learning (AutoML), this approach simplifies the modeling process for non-experts by automating the selection and tuning of models. The implementation of AutoML has shown a significant performance improvement. ML’s utility is also evident in geotechnical engineering, particularly in predicting reservoir landslide displacements [11]. An Earthworm Optimization Algorithm-optimized Support Vector Regression (EOA-SVR) model has been developed, surpassing traditional metaheuristic models in stability and performance. This method provides a reliable tool for medium and long-term landslide early warning systems, which is crucial for disaster management and mitigation. As for post-disaster analysis, ML techniques, combined with multitemporal remote sensing, were used to monitor and analyze the evolution of landslide activity over a decade [12]. The use of algorithms like Random Forest facilitated the accurate mapping of landslides, revealing a gradual decrease in occurrences over time, despite intermittent spikes due to monsoonal rains. The study of long-term hydrological changes in the pan-Arctic region illustrates ML’s role in environmental monitoring [13]. By employing algorithms like the Extreme Gradient Boosting Tree (XGBoost), researchers have reconstructed historical water levels of pan-Arctic lakes, integrating these findings with climatic and hydrological data. This application not only enhances the understanding of hydrological dynamics influenced by climate change but also provides a methodological blueprint for similar environmental studies. These instances underline ML’s broad applicability and potential in providing innovative solutions to complex problems across various disciplines, reinforcing its role as a cornerstone technology in modern science and engineering [14].
Deep learning (DL) is a subfield of ML. The LSTM neural network proposed by Hochreiter and Schmidhuber has shown excellent performance in time series prediction [15]. Compared to the Convolutional Neural Network (CNN), the LSTM neural network introduces a gate structure to selectively learn useful hidden information from a large amount of complex historical data. Therefore, LSTM can better understand the patterns and trends of time series [16]. Reference [17] used LSTM to predict the returns on investment portfolios of the S&P 500 index; the experimental results show that LSTM performs better than Random Forest (RF), Deep Neural Network (DNN), and Logistic Regression (LR). Reference [18] used LSTM to predict the opening prices of Google and NKE stocks, and the results demonstrated that the LSTM model had good predictive ability. Reference [19] established an LSTM prediction model, using historical prices and technical analysis indicators as input variables to predict future trends in stock prices; LSTM was compared with other machine learning methods, and the experimental results showed that LSTM has excellent performance. Reference [20] compared LSTM with Support Vector Regression (SVR), using 9 common technical indicators to predict the prices of 5 US stocks of international companies. The results showed that LSTM had better average prediction accuracy than SVR.
The accuracy and convergence of LSTM depend strongly on the combination of hyperparameters, such as the number of hidden layers and the number of neurons in each layer. However, the network topology constructed from these hyperparameters is difficult to adjust manually one by one, which makes it hard to ensure a suitable, optimal network structure for practical applications. Metaheuristic algorithms are optimization techniques that can be used to optimize the hyperparameters of machine learning models [21, 22]. Compared with traditional optimization methods, which may fall into local minima and fail to find the optimal solution, metaheuristic algorithms can still find high-quality hyperparameters under complex data distributions and in huge search spaces [23–25].
PSO [26] is a swarm intelligence algorithm that has unique advantages among metaheuristic algorithms due to its high efficiency. The PSO algorithm simulates the process of birds searching for food, dynamically adjusting each individual's position based on the individual and group extrema. PSO searches for the best solution to a given problem in the search space using a group of candidate solutions called particles. The iterative process of the algorithm has only a small number of parameters that need to be adjusted, such as the particle velocity, the number of particles, and the particle positions. Based on the best position each particle has found in the search space and the best position of the population so far, the position and velocity of the particles are updated in each iteration. As particles move toward the best positions, unpromising regions of the search space are naturally avoided. This iterative optimization helps particles escape local minima and converge toward the global optimum. Moreover, the adaptability of the PSO algorithm enables the LSTM to determine suitable parameters quickly and accurately based on the data features, which also helps reduce the computational cost of tuning the model. Combining the adaptive PSO algorithm with self-learning iterative optimization of the key parameters of the LSTM model avoids manual parameter adjustment and improves the efficiency and reliability of the model [27, 28].
Many types of LSTM (and other RNNs) have been tuned by metaheuristic algorithms for various purposes. Parkinson's disease diagnosis is a challenging task due to the absence of reliable tests. Cuk et al. explored the potential of LSTM neural networks combined with attention mechanisms and proposed an optimized crayfish algorithm to detect Parkinson's disease accurately [29]. The method achieved promising results with an accuracy of 87.42% using dual-task walking test data. Quick process shift detection is vital for modern smart manufacturing. Yang et al. proposed single and stacked LSTM models optimized with metaheuristic optimizers to detect shifts in high-dimensional manufacturing processes [30]. The CSOS_S_LSTM model achieved the best results with a shorter out-of-control run length, improving response time by 38.77% on average. Bacanin et al. proposed a long short-term memory (LSTM) deep learning model for cloud load time-series forecasting [31]. It utilized LSTM with attention layers and a modified particle swarm optimization (PSO) algorithm, and variational mode decomposition (VMD) was used for data preprocessing. The proposed methodology outperformed other techniques in terms of performance metrics, and SHAP analysis was used to assess feature importance. The methodology has potential for assisting cloud providers in resource allocation and provisioning decision-making. Pedroza-Castro et al. employed the Automated Machine Learning (AutoML) process for feature selection, model creation, and hyperparameter optimization to develop a machine learning model for Google stock price forecasting [32]. The AutoML process selected features from 11 technical indicators, and hyperparameters were optimized using PSO, achieving accurate results with errors ranging from 1E-2 to 9E-4. The CNN-LSTM network outperformed the standalone LSTM model. Predić et al. addressed the need for efficient computation resource management in distributed cloud-based services by proposing a methodology for forecasting cloud resource load using RNNs with attention layers [33]. The models were optimized through hyperparameter tuning using a modified PSO metaheuristic and incorporated variational mode decomposition for handling non-stationary data sequences. The study demonstrated the potential of the proposed method in accurately forecasting cloud load; the results outperformed state-of-the-art algorithms and provided valuable insights for cloud providers. To enhance the security of intrusion detection systems, Donkol et al. introduced an enhanced LSTM technique integrated with RNN (ELSTM-RNN) [34]. The proposed system employed PSO to select effective features and an enhanced LSTM for classification. It addressed the challenges faced by existing methods and achieved better performance in detecting intrusions within network communications. The system's efficiency was validated through extensive testing on multiple datasets, demonstrating improved classification accuracy and faster training times compared to existing methods.
In the latest research, many studies have applied the PSO algorithm to the field of time series prediction. The combination of PSO and LSTM models is used to predict the trend of stock changes based on quantitative and textual information [35]. Empirical results showed that the model was superior to the BP neural network and LSTM network models. Reference [36] validated the effectiveness and applicability of the PSO-LSTM model based on stock prices. Reference [37] validated the effectiveness of the PSO-LSTM model based on the sales of five types of fishing gear in an online store and two publicly available datasets.
Therefore, the choice of the proposed PSO-LSTM hybrid model for stock price prediction is deliberate and multifaceted. PSO's compatibility with the LSTM architecture, coupled with its proven track record and efficiency in optimization tasks, makes it a suitable choice for stock price prediction. Furthermore, PSO aligns well with the requirements and characteristics of stock price prediction due to its ability to handle the complex and dynamic nature of stock market data effectively. The inherent swarm intelligence of PSO allows it to navigate the intricate parameter space of LSTM, facilitating the discovery of optimal solutions amidst the inherent noise and non-linearity of financial markets. Additionally, the adaptability of PSO enables it to continuously refine its search strategy, ensuring robust performance in the face of evolving market conditions. The choice is also consistent with the No Free Lunch theorem [38], which recognizes that while other optimization algorithms may excel in certain contexts, no single optimizer is best for all problems; for the stock price prediction framework considered here, PSO offers a well-suited optimization approach.
3. Methodology
3.1 Long short-term memory neural networks
LSTM has been widely applied to stock price prediction in recent years. Traditional RNNs are prone to the vanishing gradient problem; by introducing forget gates and memory units, LSTM solves this problem. Owing to the memory units, the flow of information is managed through a mechanism called the cell state. LSTM can selectively retain or forget information based on its importance, which achieves dynamic learning of data patterns and improves prediction accuracy. LSTM can also overcome the long-term dependency problem of recurrent networks, as gradients can flow over a longer period of time thanks to the introduction of self-recurrent paths.
In addition, improvements to the original LSTM model further enhance the reliability of the network [39]. This improvement stems from the structure of the LSTM unit. LSTM includes a unit specifically designed for long-term storage of information, and it consists of an input gate, an output gate, and a forget gate for precise control of the data flow. During training, the LSTM neural network adopts the backpropagation through time (BPTT) algorithm [40], which further improves the performance of the model. The internal structure of the LSTM computing unit is shown in Fig 1.
[Figure omitted. See PDF.]
The gate control operations consist of a sigmoid activation function and a dot product operation. The forget gate, input gate, and output gate are the three gates used by LSTM to protect and control the state of a single computing unit.
The forget gate determines the information that needs to be discarded, which can be expressed as Eq (1):

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{1}$$

where $W_f$ represents the connection weight of the previous output, $h_{t-1}$ is the previous output, $x_t$ is the current input, $b_f$ is the bias vector, and $\sigma$ is the sigmoid activation function.
The input gate determines the information that needs to be updated. It is obtained from two vectors created by the input gate layer and the tanh layer, as expressed in Eqs (2) and (3):

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{2}$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{3}$$
The cell state is updated from the information of the input gate and the forget gate by Eq (4), and the output gate is computed by Eq (5). First, the sigmoid activation function determines the output part. Then the cell state is normalized to [−1, 1] through the tanh layer. Finally, element-wise multiplication is performed.

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}$$
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{5}$$
Then the final output value of the cell is calculated by Eq (6):

$$h_t = o_t \odot \tanh(C_t) \tag{6}$$
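To make the gate equations concrete, the following minimal NumPy sketch (illustrative, not the paper's code; the weight shapes and names are assumptions) computes one LSTM time step following Eqs (1)–(6):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    """One LSTM time step per Eqs (1)-(6). Each W_* has shape
    (hidden, hidden + input); [h_prev, x_t] is their concatenation."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)             # Eq (1): forget gate
    i_t = sigmoid(W_i @ z + b_i)             # Eq (2): input gate
    c_tilde = np.tanh(W_c @ z + b_c)         # Eq (3): candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde       # Eq (4): cell state update
    o_t = sigmoid(W_o @ z + b_o)             # Eq (5): output gate
    h_t = o_t * np.tanh(c_t)                 # Eq (6): hidden output
    return h_t, c_t
```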
The LSTM neural network can be obtained by connecting each LSTM unit with a directed graph structure. Fig 2 shows a typical network structure.
[Figure omitted. See PDF.]
The training of an LSTM model usually includes two steps. First, the output values of the LSTM units are calculated through forward propagation. Then back propagation is used to calculate the error values and the weight gradients, and these gradients are used by a gradient descent algorithm to update the weights. Common neural network optimizers include stochastic gradient descent (SGD), the adaptive gradient algorithm (AdaGrad), and the adaptive moment estimation algorithm (Adam). The SGD algorithm maintains a single learning rate during execution. The Adam optimizer is used in this article; it simultaneously computes first-order and second-order moment estimates, which give each parameter an independent adaptive learning rate and thereby reduce the overfitting problem to a certain extent [41]. In the specific experimental process, hyperparameters such as the batch size, the number of hidden layer neurons, and the number of hidden layers of the LSTM need to be set by the experimental designer.
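As an illustration of how such a network can be assembled, the following hedged Keras sketch builds a stacked LSTM regressor compiled with Adam (the layer and unit counts are placeholders standing in for the hyperparameters tuned later, not the paper's exact configuration):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam

def build_lstm(lookback, n_features, hidden_units=200, n_layers=1):
    """Stacked LSTM regressor; hidden_units and the training epochs are
    the hyperparameters that PSO optimizes in Section 3.3."""
    model = Sequential()
    model.add(LSTM(hidden_units, return_sequences=(n_layers > 1),
                   input_shape=(lookback, n_features)))
    for layer in range(1, n_layers):
        # intermediate layers must return full sequences for the next LSTM
        model.add(LSTM(hidden_units, return_sequences=(layer < n_layers - 1)))
    model.add(Dense(1))                          # next-day closing price
    model.compile(optimizer=Adam(), loss="mse")  # per-parameter adaptive rates
    return model
```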
3.2 Particle swarm optimization (PSO) algorithm
The PSO algorithm is used to optimize the hyperparameters of the LSTM neural network in this study. The core of the PSO algorithm is cooperation and information sharing among particles in the population, and the optimal solution is obtained through iteration. Suppose there are N particles in a D-dimensional search space. The position and velocity of each particle are random at initialization. The current optimal extremum of each particle is the best solution found by that individual so far (particle best, pbest), expressed as $pbest_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$. The global extremum is expressed as $gbest = (g_1, g_2, \ldots, g_D)$. All particles in the swarm update their velocity and position according to Eqs (7) and (8) until the optimal solution is found [39].
$$v_{id}^{k+1} = w\, v_{id}^{k} + c_1 r_1 \left(p_{id} - x_{id}^{k}\right) + c_2 r_2 \left(g_d - x_{id}^{k}\right) \tag{7}$$
$$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1} \tag{8}$$
The symbol k represents the iteration number. The spatial position of the ith particle is $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$, and its velocity is $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. The symbol w is the inertia factor used to adjust the search range of the solution space; it represents the tendency of particles to maintain their historical velocity. The symbols c1 and c2 are acceleration constants used to adjust the maximum learning step size. The symbols r1 and r2 are uniform random numbers used to increase the randomness of the search, with values in [0, 1]. In addition to the inertia of particles, Eqs (7) and (8) also consider two further tendencies: the tendency of particles to approach their own historical best position, and the tendency of particles to approach the historical best position of the population or neighborhood.
Subsequently, the fitness value of each particle is evaluated, and pbest and gbest are updated according to Eqs (9) and (10). The current fitness value of each particle is compared with the fitness value of the global best position (gbest) [39]. A pbest with a better fitness value replaces the previous one, and the global best position gbest is updated accordingly. If the number of iterations reaches its maximum, the swarm extremum is taken as the optimal solution; otherwise, the swarm continues to iterate the above process until the optimal solution is found.

$$pbest_i^{k+1} = \begin{cases} x_i^{k+1}, & f\left(x_i^{k+1}\right) < f\left(pbest_i^{k}\right) \\ pbest_i^{k}, & \text{otherwise} \end{cases} \tag{9}$$
$$gbest^{k+1} = \arg\min_{pbest_i^{k+1}} f\left(pbest_i^{k+1}\right) \tag{10}$$
Algorithm 1 PSO Algorithm
procedure PSO
for each particle i
Initialize velocity Vi and position Xi for particle i
Evaluate particle i and set Pbesti = Xi
end for
Gbest = min {Pbesti}
while not stop
for i = 1 to N
Update the velocity and position of particle i
Evaluate particle i
if fit(Xi) < fit(Pbesti)
Pbesti = Xi
if fit(Pbesti) < fit(Gbest)
Gbest = Pbesti
end if
end if
end for
end while
print Gbest
end procedure
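The following plain-Python sketch mirrors Algorithm 1 and Eqs (7)–(10); the parameter values echo the settings reported in Section 5.1 (swarm size 20, 50 iterations, positions in [0, 300], velocities in [−2, 2], w = 0.8, c1 = c2 = 1.5), while the velocity clamping is a common implementation detail assumed here:

```python
import random

def pso(fitness, dim, n_particles=20, iters=50, bounds=(0, 300),
        w=0.8, c1=1.5, c2=1.5, v_range=(-2.0, 2.0)):
    """Minimize `fitness` over a dim-dimensional box, following Algorithm 1."""
    lo, hi = bounds
    X = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[random.uniform(*v_range) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in X]
    pbest_fit = [fitness(x) for x in X]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))         # Eq (7)
                V[i][d] = max(v_range[0], min(v_range[1], V[i][d]))  # clamp velocity
                X[i][d] = max(lo, min(hi, X[i][d] + V[i][d]))        # Eq (8)
            fit = fitness(X[i])
            if fit < pbest_fit[i]:                                   # Eq (9)
                pbest[i], pbest_fit[i] = X[i][:], fit
                if fit < gbest_fit:                                  # Eq (10)
                    gbest, gbest_fit = X[i][:], fit
    return gbest, gbest_fit
```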
3.3 PSO-LSTM stock price prediction model
The number of training iterations and the numbers of neurons in the hidden layers are encoded as the particle dimensions. The flowchart of the PSO-LSTM stock price prediction model proposed in this article is shown in Fig 3.
[Figure omitted. See PDF.]
Three LSTM neural networks with one, two, and three hidden layers are constructed. The number of iterations and the numbers of neurons in the different hidden layers are used as the optimization objectives of the model. The experiment preprocesses the six stock index datasets, and each dataset is divided into training and testing sets. The root mean square error (RMSE) of the prediction result is used as the fitness value. pbest and gbest are updated according to Algorithm 1, and the optimal solution is selected as the output; a sketch of the fitness function appears below. The hyperparameters of the optimal solution are input into the LSTM model to predict stock prices. The results of PSO-LSTM stock prediction models with different numbers of hidden layers are compared based on the evaluation indicators.
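A hedged sketch of this fitness function is shown below; it reuses the build_lstm and pso sketches above, treats a particle as (training epochs, hidden neurons), and returns the validation RMSE (the batch size is an assumption, as the paper does not tie one to the fitness evaluation):

```python
import numpy as np

def make_fitness(X_train, y_train, X_val, y_val, n_layers=1):
    """Fitness of a particle = validation RMSE of the LSTM it parameterizes."""
    def fitness(particle):
        epochs, units = (max(1, int(round(p))) for p in particle)  # integer hyperparameters
        model = build_lstm(X_train.shape[1], X_train.shape[2],
                           hidden_units=units, n_layers=n_layers)
        model.fit(X_train, y_train, epochs=epochs, batch_size=64, verbose=0)
        pred = model.predict(X_val, verbose=0).ravel()
        return float(np.sqrt(np.mean((y_val - pred) ** 2)))         # RMSE as fitness
    return fitness

# e.g. best, best_rmse = pso(make_fitness(X_tr, y_tr, X_va, y_va), dim=2)
```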
4. Research data and experiment
The six stock indices used to test the performance of the model include the Dow Jones Industrial Average (DJIA) index of the New York Stock Exchange, the Standard & Poor's 500 (S&P 500) index, the Nikkei 225 index of Tokyo, the Hang Seng Index of the Hong Kong market, the CSI300 index of the Chinese Mainland stock market, and the Nifty50 index of India. The data are from the WIND database (http://www.wind.com.cn) provided by Shanghai Wind Information Co., Ltd, the CSMAR database (http://www.gtarsc.com) provided by Shenzhen GTA Education Tech. Ltd., and the global financial portal Investing.com. The time span was from 2008/07/02 to 2016/09/30.
4.1 Input variables
Stock prices are influenced by both macro and micro environments. Therefore, three sets of variables are selected as input variables, as shown in Table 1. The first set of input variables is the historical trading data, which include the open, high, low, and close prices and the trading volume [42]. These raw prices represent fundamental trading information and are described as No. 1-5 in Table 1. The second set of input variables consists of 12 technical indicators that can capture the moving trends of stock prices [43], described as No. 6-15 in Table 1; an illustrative sketch of computing such indicators follows the table. The final set of inputs is the macroeconomic indicators, which include the exchange rate and interest rate [44].
[Figure omitted. See PDF.]
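Since Table 1 itself is omitted here, the exact indicator formulas are not reproduced; as an illustration only, the following pandas sketch computes two widely used technical indicators of the kind listed as No. 6-15 (the paper's actual set may differ):

```python
import pandas as pd

def add_example_indicators(df):
    """Add illustrative indicators to a DataFrame with a 'close' column."""
    df = df.copy()
    df["ma_10"] = df["close"].rolling(10).mean()        # 10-day moving average
    delta = df["close"].diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    df["rsi_14"] = 100 - 100 / (1 + gain / loss)        # 14-day relative strength index
    return df
```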
4.2 Correlation analysis
From the perspective of stock trading, the closing price is an important factor in formulating trading strategies. To avoid multicollinearity, the Pearson correlation coefficients between the closing price and the other features are calculated. Table 2 shows the resulting Pearson correlation coefficients for DJIA as an example. SPSS analysis shows that features with correlation coefficients above 0.95 have a significant impact on price fluctuations (** correlation is significant at the 0.01 level, 2-tailed). Therefore, features with such high correlation coefficients are removed; a sketch of this screening step follows Fig 4.
[Figure omitted. See PDF.]
To further demonstrate the relevance, the correlation heatmap of DJIA is illustrated in Fig 4.
[Figure omitted. See PDF.]
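A minimal pandas sketch of this screening step, assuming the features and the closing price sit in one DataFrame, is:

```python
import pandas as pd

def select_features(df, target="close", threshold=0.95):
    """Keep features whose |Pearson r| with the closing price is at most the
    threshold; highly collinear features (r above 0.95) are dropped."""
    corr = df.corr(method="pearson")[target].drop(target)
    keep = corr[corr.abs() <= threshold].index.tolist()
    return keep, corr.sort_values(ascending=False)
```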
4.3 Data pre-processing
4.3.1 Data denoising.
Due to the complexity and high noise of the stock market, data denoising is a necessary means of improving the performance of LSTM models. Wavelet transform (WT) can analyze financial time series in the time and frequency domains simultaneously, so it can process highly non-stationary financial time series data [45]. The Haar function is used as the wavelet function in this study; it not only decomposes a time series into the time and frequency domains but also has the advantage of short computation time [46].
Continuous Wavelet Transform (CWT) extracts features of a time series in the time and scale dimensions, but its coefficients contain a large amount of redundant information and require further dimensionality reduction. Therefore, Discrete Wavelet Transform (DWT) has gradually become the more common method, as it extracts features more effectively by decomposing the time series into orthogonal component sets. Fig 5 shows the closing price of the S&P500 index for each trading day from July 1, 2008 to October 1, 2016. Fig 5(A) shows the original historical closing prices without noise reduction, which are unstable and noisy. The Pywt library in Python was used for the DWT. Fig 5(B) shows the closing price of the S&P500 index after three-layer wavelet decomposition, with a more stable sequence and less noise.
[Figure omitted. See PDF.]
Comparison of closing price of the S&P500 index before denoising (a) and after denoising (b).
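A hedged PyWavelets sketch of the three-level Haar denoising is given below; the paper does not state its thresholding rule, so the universal soft threshold used here is an assumption:

```python
import numpy as np
import pywt

def wavelet_denoise(prices, wavelet="haar", level=3):
    """Three-level DWT denoising of a 1-D price series."""
    coeffs = pywt.wavedec(prices, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745         # noise scale from finest detail
    thresh = sigma * np.sqrt(2 * np.log(len(prices)))      # universal threshold (assumed)
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(prices)]     # trim possible padding
```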
4.3.2 Data normalization.
There are three sets of input variables with different dimensions used in the experiments. Therefore, it is necessary to standardize the data to be within the output range of the activation function. MinMaxScaler [47] in scikit-learn is used to scale the differently scaled features into the range [−1, 1] based on Eq (11). Because min-max normalization preserves all relationships in the data precisely, it avoids bias [48].

$$x_{norm} = 2 \times \frac{x - x_{min}}{x_{max} - x_{min}} - 1 \tag{11}$$

where $x_{norm}$ is the converted value, $x_{max}$ is the maximum value of the sample, and $x_{min}$ is the minimum value of the sample.
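A short sketch of this step with scikit-learn (fitting the scaler on the training split only, a standard precaution the paper implies but does not spell out) is:

```python
from sklearn.preprocessing import MinMaxScaler

def scale_features(train_features, test_features):
    """Map each feature into [-1, 1] per Eq (11); fitting on the training
    split keeps test-set extrema from leaking into training."""
    scaler = MinMaxScaler(feature_range=(-1, 1))
    return scaler.fit_transform(train_features), scaler.transform(test_features), scaler
```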
4.4 Experimental environment and evaluation indicators
The deep learning framework TensorFlow, accessed through the Keras framework [49], serves as the back-end support for constructing the prediction models. Tables 3–5 provide the details of the experimental environments.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
Six stock indices were used to train and test the predictive models in this experiment. The closing price is used as the predicted value, and the difference between the predicted and true values measures the predictive effect. The evaluation indicators used in this study, shown as Eqs (12)–(15), are the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and goodness of fit (R2) of the model [7–10]:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{12}$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{13}$$
$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\% \tag{14}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{15}$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the actual values.
RMSE, MAE, and MAPE measure the deviation between the actual and predicted values: the smaller the value, the closer the predicted value is to the actual value. R2 measures the degree of model fit: the closer it is to 1, the better the model fits the actual prices.
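The four indicators can be computed directly from Eqs (12)–(15), as in this minimal NumPy sketch:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return RMSE, MAE, MAPE (%), and R2 per Eqs (12)-(15)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true)) * 100
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}
```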
5. Experimental results of PSO-LSTM forecasting model
5.1 Experiments of PSO-LSTM model (Parameters manually set)
First, the results of the PSO-optimized LSTM model with one, two, and three hidden layers are evaluated in this experiment. The parameters for initializing PSO are based on Eqs (7) and (8). The swarm size is set to 20. The ranges of the number of hidden neurons and the number of training iterations are both set to [0, 300], so the particle positions are bounded to [0, 300]. The velocity range of the particles is set to [−2, 2]. The maximum number of PSO iterations is set to 50. The inertia weight w is a major parameter: the larger w is, the stronger the global search ability and the weaker the local search ability. w is set to 0.8 in the experiment. The acceleration constants c1 and c2 are set to 1.5, indicating that the individual and population terms are weighted equally. The first 80% of the data are used as the training set and the last 20% as the testing set. For each experiment, 10 runs are conducted and the average of the results is reported. Section 5.1.1 discusses the impact of manually setting the number of neurons on the experimental results; since there are too many combinations, one set of experiments is selected to present the results. In Sections 5.1 and 5.2, a 50-day retrospective period is used to validate the model.
5.1.1 Impact of increasing the number of hidden neurons on experimental results.
The prediction errors for different numbers of hidden neurons, with one hidden layer and 100 training rounds, are shown in Table 6. It can be seen that increasing the number of hidden neurons has no significant effect on the experimental results.
[Figure omitted. See PDF.]
5.1.2 The impact of different training rounds on experimental results.
Next, the impact of different numbers of training rounds on the LSTM model is considered. The prediction errors for different training rounds, when the number of hidden neurons and hidden layers are set to 200 and 1 respectively, are shown in Table 7. As the number of training rounds increases, the model can better extract features from the data set, which reduces the prediction error of the model. (In the tables, the first hidden-neuron count denotes the number of neurons in the first hidden layer of the LSTM model, the second denotes the number in the second hidden layer, and so on.) However, if the number of training rounds increases beyond an appropriate range, the trained model may overfit.
[Figure omitted. See PDF.]
(Hidden neurons = 200, hidden layer = 1).
5.1.3 The impact of different numbers of hidden layers on experimental results.
Finally, the effect of using different numbers of hidden layers is considered. It can be seen from Table 8 that increasing the number of LSTM layers affects the results. Adding LSTM layers helps improve the ability of neural networks to extract data features. However, for different datasets, the optimal model parameters can only be obtained by experimenting with an appropriate number of layers.
[Figure omitted. See PDF.]
5.1.4 PSO-LSTM model with optimal parameters.
The prediction errors of the LSTM model with the optimal parameters are summarized in Table 9 for the six stock indices representing markets at different development levels. The comparison between the true and predicted values on the experimental datasets using the LSTM and PSO-LSTM models is shown in Fig 6.
[Figure omitted. See PDF.]
The line charts of real and predicted close price in 6 indices: (a) DJIA, (b) S&P500, (c) HangSeng, (d) Nikkei225, (e) CSI300, and (f) Nifty 50.
[Figure omitted. See PDF.]
5.2 Experiments of PSO-LSTM model (PSO automatically optimizes parameters)
In Section 5.1, experiments with manually set parameters were conducted to determine an appropriate neural network model for prediction. In Section 5.2, the hyperparameters of the LSTM neural network are automatically optimized by PSO, again using a 50-day retrospective period, and the accuracy of the neural network before and after optimization is compared. The data from the six stock indices demonstrate the degree of fit between the predicted values of the PSO-LSTM model and the actual values. The prediction errors and optimal parameters of the LSTM and PSO-LSTM models are listed in Table 10. The changes in fitness (MSE) of the PSO-LSTM model during the evolution process are shown in Fig 7. As the number of PSO iterations increases, the loss gradually decreases. The PSO-LSTM model performs better than the LSTM model for almost all network depths and experimental datasets. The hyperparameter alternatives of the PSO-LSTM model are shown in Table 10; PSO searches this space to find the best hyperparameters. The settings of the algorithm parameter values are shown in Table 11.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
As mentioned above, PSO is used to optimize LSTM’s parameters. To validate the effectiveness, the comparison of LSTM parameters and errors before and after using PSO algorithm is expressed in Table 12. Furthermore, Fig 8 shows the box plots of the performance scores of PSO-LSTM model with 10 replications. Table 13 illustrates the performance scores of the PSO-LSTM models in the test data.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
5.3 Experiments of other well-known forecasting methods
5.3.1 Comparison of machine learning model.
There are many other classic algorithms for time series prediction. In this section, the PSO-LSTM model proposed in this article is compared with several popular methods adopted by researchers in recent years. In addition, the performance of the model under different prediction cycles can be more comprehensively evaluated by conducting experiments on retrospective periods of different time spans. In Section 5.3, a 50-day retrospective period is used to validate the model; in Section 5.4, 20-day and 7-day lookback periods are used.
XGBoost combines multiple weak classifiers (decision trees) in parallel into a strong classifier (boosted tree) through result weighting, making it one of the most efficient and high-performance algorithms in the engineering field [50]. XGBoost is often combined with deep learning models [48] or intelligent optimization algorithms [51, 52] for time series prediction. RF is an ensemble classifier based on Bagging proposed by Breiman [53]. RF uses the best classification result from all decision trees as the final result and has been widely used in the field of stock prediction [54]. The K-Nearest Neighbor (KNN) algorithm is an accurate classification method that classifies a sample according to its neighboring data points [55]. KNN has been successfully applied in financial time series forecasting [56]. Support Vector Machine (SVM) is a machine learning algorithm [57] that is effective for regression and classification problems with multivariate features and is also applicable to multivariate data for stock price prediction. The performance of support vector machines depends on the selection of the kernel function; therefore, in practical problems, it is very important to choose a suitable kernel function to build an SVM model on real data. Support vector machines have been widely used in time series prediction [58–60]. The multi-layer perceptron (MLP) is a feed-forward artificial neural network model. The MLP simplifies the structure of biological neurons to obtain the basic structure of a neural network and is widely used for time series prediction [61, 62]. The Bidirectional long short-term memory (Bi-LSTM) model is a method of continuous input information based on LSTM with two input sequences, forward and backward. It is a special variant of recurrent neural networks that retains the advantages of LSTM in processing long-term correlated sequences while compensating for LSTM's inability to use future contextual information for prediction. This method has achieved good results in interactive prediction [63–65].
For SVM, different kernel functions were tested, and the best results were obtained with the RBF kernel; accordingly, the Gamma parameter of the RBF kernel needs to be tuned, and the optimal SVM performance was obtained with gamma = 0.01. For MLP, there is no fixed rule for selecting the number of hidden layers and the optimal number of neurons per hidden layer. In order to compare the performance of the MLP and PSO-LSTM, the number of neurons, number of hidden layers, learning rate, number of training rounds, and other parameters of the MLP were configured to be similar to PSO-LSTM. For KNN, the best value of k was found to be 5 through testing, and the best values for the different datasets are listed in the table. In all experiments, the first 80% of the data were used as the training set and the last 20% as the testing set. This process was repeated 10 times, and the average over all runs is recorded as the final result. This technique is used to estimate the accuracy of prediction models in practical applications and to avoid overfitting. A hedged sketch of these baseline configurations is shown below.
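The sketch below instantiates several of these baselines with scikit-learn and XGBoost (the gamma and k values come from the text; the windowed inputs are assumed to be flattened to 2-D, and unspecified settings are library defaults):

```python
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from xgboost import XGBRegressor

def fit_baselines(X_train_flat, y_train, X_test_flat):
    """Fit baseline regressors on (samples, lookback * features) arrays."""
    baselines = {
        "SVM": SVR(kernel="rbf", gamma=0.01),       # RBF kernel, gamma = 0.01 as tuned
        "KNN": KNeighborsRegressor(n_neighbors=5),  # best k found by testing
        "RF": RandomForestRegressor(),
        "XGBoost": XGBRegressor(),
    }
    predictions = {}
    for name, model in baselines.items():
        model.fit(X_train_flat, y_train)
        predictions[name] = model.predict(X_test_flat)
    return predictions
```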
From the experimental results of the six datasets listed in Table 14, it can be seen that PSO-LSTM achieved the best performance in the HangSeng, Nikkei225, and Nifty50 datasets. PSO-LSTM ranks in the top three in predictive performance on DJIA, S&P500, and CSI300 datasets. Compared with seven other machine learning models, PSO-LSTM achieved excellent predictive performances.
[Figure omitted. See PDF.]
Compared with the other machine learning algorithms, the p-value of the sign test for the algorithm proposed in this article is 0.0078125, and the p-value of the Wilcoxon rank test is 0.005411. The difference between the proposed PSO-LSTM algorithm and the other traditional methods is significant at the 0.05 level (95%), demonstrating the significant advantage of the proposed algorithm.
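These paired tests can be reproduced with SciPy, as in the following sketch (the error vectors are hypothetical inputs; binomtest requires SciPy ≥ 1.7):

```python
from scipy.stats import wilcoxon, binomtest

def compare_errors(errors_pso_lstm, errors_baseline):
    """Paired sign and Wilcoxon signed-rank tests on per-dataset error scores."""
    wins = sum(b > a for a, b in zip(errors_pso_lstm, errors_baseline))
    sign_p = binomtest(wins, len(errors_pso_lstm), 0.5).pvalue  # sign test
    wilcoxon_p = wilcoxon(errors_pso_lstm, errors_baseline).pvalue
    return sign_p, wilcoxon_p
```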
5.3.2 Comparison of state-of-the-art models.
The comparative experimental results between PSO-LSTM and other state-of-the-art (SOTA) models are shown in Table 15. The comparative experiment includes two situations: using different datasets and using the same dataset with different time periods. The main contribution of this article is to propose a model for optimizing the structure of neural networks.
[Figure omitted. See PDF.]
5.4 Experiments on different lookback periods (20-day and 7-day lookback periods)
In Sections 5.1 and 5.2, the experiments are conducted with a 50-day retrospective period. In Section 5.4, the performance under different lookback periods is further investigated. The changes at the current time point are predicted by analyzing the lookback-period data of the S&P500 dataset. The results are listed in Tables 16 and 17. It can be seen that there is no significant difference between using a 50-day retrospective period (the S&P500 results of Table 14 in Section 5.3), a 7-day retrospective period (Table 16), and a 20-day retrospective period (Table 17) for prediction.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
As depicted in these tables, the proposed model consistently outperforms the alternative machine learning models across the various retrospective periods (7 days, 20 days, and 50 days), with at least a 25% improvement in prediction accuracy. These findings underscore the robustness and applicability of the PSO-LSTM model, particularly in high-frequency trading scenarios.
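For reference, the lookback windows behind these experiments can be built with a simple sliding-window routine such as the sketch below (the target column index is an assumption):

```python
import numpy as np

def make_windows(data, lookback, target_col=0):
    """Turn a (days, features) array into (samples, lookback, features) inputs
    and next-day closing-price targets for a 50-, 20-, or 7-day lookback."""
    X, y = [], []
    for t in range(lookback, len(data)):
        X.append(data[t - lookback:t])       # past `lookback` days of features
        y.append(data[t, target_col])        # next day's closing price
    return np.array(X), np.array(y)
```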
6. Conclusion
The proposed PSO-LSTM model offers a novel approach to address the intricate challenge of stock price prediction. Although LSTM neural networks have gained prominence for their adeptness in handling financial time series data, their efficacy in practical scenarios is hampered by the intricate process of parameter optimization. This study introduces a hybrid model that integrates LSTM with PSO to mitigate this limitation and enhance predictive accuracy.
The PSO-LSTM model’s adaptability is particularly noteworthy, as it efficiently determines optimal parameters by leveraging the inherent problem-solving capabilities of the PSO algorithm. By swiftly identifying parameter combinations aligned with data characteristics, the model streamlines computational overhead and bolsters predictive performance. Empirical evaluations underscore the efficacy of the proposed approach. Manual exploration of key parameters, including the number of hidden neurons and layers, indicates their nuanced impact on model performance. Furthermore, automated optimization via PSO demonstrates tangible improvements in predictive accuracy as well as lower prediction errors.
Comparative analysis with traditional LSTM models and a spectrum of machine learning algorithms reaffirm the superiority of the PSO-LSTM framework. Notably, it emerges as a top-performer across diverse datasets, underscoring its robust predictive capabilities. Moreover, the model’s resilience is evidenced through retrospective validation across varying timeframes, wherein consistent predictive accuracy is maintained over 50, 20, and 7-day intervals. In essence, the proposed PSO-LSTM model represents a significant advancement in stock market prediction methodologies, offering a potent amalgamation of LSTM’s temporal modeling prowess and PSO’s optimization finesse.
The reasons that the model produces better results are as follows. First, LSTM can capture the long- and short-term dependencies in financial time series, so it can effectively learn and predict complex patterns and trends in the data. Second, the PSO algorithm is used to optimize the parameters of the LSTM model. PSO maintains a good balance between global and local search to avoid falling into local optima, and it can find parameter combinations close to the global optimum in a short time. These advantages enable the PSO-LSTM model to be trained with the optimal parameter configuration, improving predictive performance. During training, regularization techniques such as dropout and L2 regularization are employed to prevent overfitting, the situation where a model performs well on training data but poorly on test data. Regularization ensures the generalization ability of the model, making its predictions more accurate on unseen data. In addition, data normalization, denoising, and feature selection are conducted before model training. These steps ensure the quality of the input data and enable the model to learn and capture the important information in the data, improving prediction performance. Finally, multi-level evaluations of the model's performance are conducted, and the performance of the model under different market conditions is also tested, verifying its adaptability and robustness in different scenarios.
However, the solution proposed in this article still has room for improvement in several aspects. The proposed algorithm treats stock prediction as a single-objective optimization problem with the aim of enhancing accuracy, whereas in practical applications it is often essential to retrain the model efficiently in response to market changes. In addition, the neural network operates as a "black box" model that sacrifices the interpretability and understandability of the original features. Automatic and effective feature extraction processes that reduce the dimensionality of the feature space and map transformed features into a new low-dimensional space are a promising direction.
In future work, data completeness will be enhanced by incorporating natural language analysis of financial news. The latest models, such as GANs and pre-trained models, are also worth exploring for stock price prediction. Additionally, the LSTM neural network will be refined by exploring additional parameters, and alternative evolutionary algorithms for hyperparameter optimization will be investigated. With future stock values predicted at low error, a trading system built on the proposed model could give successful buy/sell/hold suggestions.
Supporting information
S1 File.
https://doi.org/10.1371/journal.pone.0310296.s001
(ZIP)
References
1. El B.D., Yahyaouy A. and Boumhidi J. (2022), "Intelligent Energy Management for Micro-Grid based on Deep Learning LSTM prediction Model and Fuzzy Decision-Making.", Sustainable Computing: Informatics and Systems, Vol. 35, pp. 100709.
2. Zolfaghari M., Fadishei H., Tajgardan M. and Khoshkangini R. (2022), "Stock Market Prediction Using Multi-Objective Optimization", 12th International Conference on Computer and Knowledge Engineering (ICCKE), pp. 253–262.
3. Gao J., Jia Z., Wang X. and Xing H. (2022), "Prediction of degradation trend of proton exchange membrane fuel cells based on PSO-LSTM.", Journal of Jilin University (Engineering Edition), Vol. 52 No. 9, pp. 2192–2202.
4. Mehtab S. and Sen J. (2020), "Stock Price Prediction Using CNN and LSTM-Based Deep Learning Models.", 2020 International Conference on Decision Aid Sciences and Application (DASA), pp. 447–453.
5. Fuwei Y., Jingjing C. and Yicen L. (2021), "Improved and optimized recurrent neural network based on PSO and its application in stock price prediction.", Soft Computing, Vol. 27, pp. 3461–3476.
6. Yang Y., Yang Y. and Wang Z. (2021), "Research on a hybrid prediction model for stock price based on long short-term memory and variational mode decomposition.", Soft Computing, Vol. 25 No. 21, pp. 13513–13531.
7. Baek Y. and Kim H. Y. (2018), "ModAugNet: A new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module.", Expert Systems with Applications, Vol. 113, pp. 457–480.
8. Karathanasopoulos A. and Osman M. (2019), "Forecasting the Dubai financial market with a combination of momentum effect with a deep belief network.", Journal of Forecasting, Vol. 38, pp. 346–353.
9. Nguyen D. H. D., Tran L. P. and Nguyen V. (2019), "Predicting stock prices using dynamic LSTM models.", International Conference on Applied Informatics, Springer, pp. 199–212.
10. Jin Z., Yang Y. and Liu Y. (2019), "Stock closing price prediction based on sentiment analysis and LSTM.", Neural Computing and Applications, pp. 1–17.
11. Ma J., Lei D., Ren Z. et al. (2023), "Automated Machine Learning-Based Landslide Susceptibility Mapping for the Three Gorges Reservoir Area, China.", Mathematical Geosciences. https://doi.org/10.1007/s11004-023-10116-3
12. Liu Z., Ma J., Xia D. et al. (2024), "Toward the reliable prediction of reservoir landslide displacement using earthworm optimization algorithm-optimized support vector regression (EOA-SVR).", Natural Hazards, Vol. 120, pp. 3165–3188. https://doi.org/10.1007/s11069-023-06322-1
13. Long Y., Li W., Huang R. et al. (2023), "A Comparative Study of Supervised Classification Methods for Investigating Landslide Evolution in the Mianyuan River Basin, China.", Journal of Earth Science, Vol. 34, pp. 316–329. https://doi.org/10.1007/s12583-021-1525-9
14. Liang X., Song C., Liu K. et al. (2023), "Reconstructing Centennial-Scale Water Level of Large Pan-Arctic Lakes Using Machine Learning Methods.", Journal of Earth Science, Vol. 34, pp. 1218–1230. https://doi.org/10.1007/s12583-022-1739-5
15. Gers F. A., Schmidhuber J. and Cummins F. (2000), "Learning to forget: Continual prediction with LSTM.", Neural Computation, Vol. 12 No. 10, pp. 2451–2471. pmid:11032042
16. Jing N., Liu Q. and Wang H. (2021), "Stock price prediction based on stock price synchronicity and deep learning.", International Journal of Financial Engineering, Vol. 08 No. 2, pp. 2141010–2141020.
17. Nelson D. M. Q., Pereira A. C. M. and de Oliveira R. A. (2017), "Stock Market's Price Movement Prediction With LSTM Neural Networks.", 2017 International Joint Conference on Neural Networks (IJCNN).
18. Moghar A. and Hamiche M. (2020), "Stock Market Prediction Using LSTM Recurrent Neural Network.", Procedia Computer Science, Vol. 170, pp. 1168–1173.
19. Nelson D. M. Q., Pereira A. C. M. and de Oliveira R. A. (2017), "Stock market's price movement prediction with LSTM neural networks.", 2017 International Joint Conference on Neural Networks (IJCNN), pp. 1419–1426.
20. Bathla G. (2020), "Stock Price prediction using LSTM and SVR.", 2020 Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC), pp. 211–214.
21. Latrisha N., Jeta N., Callista A. et al. (2023), "Machine learning approaches in stock market prediction: A systematic literature review.", Procedia Computer Science, Vol. 216, pp. 96–102.
22. Thakkar A. and Chaudhari K. (2022), "Information fusion-based genetic algorithm with long short-term memory for stock price and trend prediction.", Applied Soft Computing, Vol. 128, pp. 109428–109448.
23. Gülmez B. (2023), "A novel deep learning model with the Grey Wolf Optimization algorithm for cotton disease detection.", Journal of Universal Computer Science, Vol. 6 No. 29, pp. 595–626.
24. Kumar K. and Haider M. T. U. (2021), "Enhanced prediction of intra-day stock market using metaheuristic optimization on RNN–LSTM network.", New Generation Computing, Vol. 39, pp. 231–272.
25. Oztürk M. M. (2022), "Initializing hyper-parameter tuning with a metaheuristic ensemble method: A case study using time-series weather data.", Evolutionary Intelligence, Vol. 16, pp. 1–13.
26. Kennedy J. and Eberhart R. (1995), "Particle swarm optimization.", Proceedings of ICNN'95—International Conference on Neural Networks, Vol. 4, pp. 1942–1948.
27. Thakkar A. and Chaudhari K. (2021), "A comprehensive survey on portfolio optimization, stock price and trend prediction using particle swarm optimization.", Archives of Computational Methods in Engineering, Vol. 28, pp. 2133–2164.
28. Ma C. (2022), "Stock linkage prediction based on optimized LSTM model.", Multimedia Tools and Applications, Vol. 81 No. 9, pp. 12599–12617.
29. Cuk A. et al. (2024), "Tuning attention based long-short term memory neural networks for Parkinson's disease detection using modified metaheuristics.", Scientific Reports, Vol. 14 No. 1, pp. 4309. pmid:38383690
30. Yang C.-L. et al. (2023), "LSTM-based framework with metaheuristic optimizer for manufacturing process monitoring.", Alexandria Engineering Journal, Vol. 83, pp. 43–52.
31. Bacanin N. et al. (2023), "Cloud computing load prediction by decomposition reinforced attention long short-term memory network optimized by modified particle swarm optimization algorithm.", Annals of Operations Research, pp. 1–34.
32. Pedroza-Castro F. J., Rojas-Domínguez A. and Carpio M. (2023), "Automated Machine Learning to Improve Stock-Market Forecasting Using PSO and LSTM Networks.", Hybrid Intelligent Systems Based on Extensions of Fuzzy Logic, Neural Networks and Metaheuristics, Springer Nature Switzerland, Cham, pp. 331–345.
33. Predić B. et al. (2023), "Cloud-load forecasting via decomposition-aided attention recurrent neural network tuned by modified particle swarm optimization.", Complex & Intelligent Systems, pp. 1–21.
34. Donkol A. A. E.-B. et al. (2023), "Optimization of intrusion detection using likely point PSO and enhanced LSTM-RNN hybrid technique in communication networks.", IEEE Access, Vol. 11, pp. 9469–9482.
35. Wang J. and Wang C. (2021), "Progress and future trends of deep learning technology.", China Security, Vol. 07, pp. 81–84.
36. Shen W., Zou B. and Chen X. (2021), "Forecasting of CSI 300 Index Based on PSO-LSTM-RT Composite Model.", 2021 International Symposium on Computer Science and Intelligent Controls (ISCSIC), pp. 329–333.
37. Xueyu Y., Chun H., Heng X. and Yuyang S. (2023), "Particle swarm optimization LSTM based stock prediction model.", 2023 3rd Asia-Pacific Conference on Communications Technology and Computer Science, pp. 513–516.
38. Wolpert D. H. and Macready W. G. (1997), "No free lunch theorems for optimization.", IEEE Transactions on Evolutionary Computation, Vol. 1 No. 1, pp. 67–82.
39. He Q.-Q., Wu C. and Si Y.-W. (2022), "LSTM with particle swarm optimization for sales forecasting.", Electronic Commerce Research and Applications, Vol. 51, pp. 101118.
40. Greff K., Srivastava R. K., Koutník J., Steunebrink B. R. and Schmidhuber J. (2017), "LSTM: A Search Space Odyssey.", IEEE Transactions on Neural Networks and Learning Systems, Vol. 28 No. 10, pp. 2222–2232. pmid:27411231
41. Kumaresan M., Basha M. J., Manikandan P., Annamalai S., Sekaran R. and Kumar A. S. (2023), "Stock Price Prediction Model Using LSTM: A Comparative Study.", 2023 3rd Asian Conference on Innovation in Technology (ASIANCON), pp. 1–5.
42. Liu H. and Long Z. (2020), "An improved deep learning model for predicting stock market price time series.", Digital Signal Processing, Vol. 102, pp. 102741–102757.
43. Fischer T. and Krauss C. (2018), "Deep learning with long short-term memory networks for financial market predictions.", European Journal of Operational Research, Vol. 270, pp. 654–669.
44. Wang Q., Xu W. and Zheng H. (2018), "Combining the wisdom of crowds and technical analysis for financial market prediction using deep random subspace ensembles.", Neurocomputing, Vol. 299, pp. 51–61.
45. Zhong X. and Enke D. (2017), "A comprehensive cluster and classification mining procedure for daily stock market return forecasting.", Neurocomputing, Vol. 267, pp. 152–168.
46. Lei L. (2018), "Wavelet neural network prediction method of stock price trend based on rough set attribute reduction.", Applied Soft Computing, Vol. 62, pp. 923–932.
47. Hsieh T.-J., Hsiao H.-F. and Yeh W.-C. (2011), "Forecasting stock markets using wavelet transforms and recurrent neural networks: An integrated system based on artificial bee colony algorithm.", Applied Soft Computing, Vol. 11 No. 2, pp. 2510–2525.
48. Kim T. Y., Oh K. J., Kim C. and Do J. D. (2004), "Artificial neural networks for non-stationary time series.", Neurocomputing, Vol. 61 No. 3, pp. 439–447.
49. Devan P. and Khare N. (2020), "An efficient XGBoost-DNN-based classification model for network intrusion detection system.", Neural Computing and Applications, Vol. 32, pp. 12499–12514.
50. Ketkar N. (2017), "Introduction to Keras.", Deep Learning with Python: A Hands-on Introduction, pp. 97–111.
51. Chollet F. (2021), "Deep Learning with Python.", Manning Publications, pp. 80–82.
52. Ye Z. and Schuller B. (2021), "Capturing dynamics of post-earnings-announcement drift using a genetic algorithm-optimized XGBoost.", Expert Systems with Applications, Vol. 177 No. 2, pp. 114892–114908.
53. Yun K., Yoon S. and Won D. (2021), "Prediction of stock price direction using a hybrid GA-XGBoost algorithm with a three-stage feature engineering process.", Expert Systems with Applications, Vol. 186, pp. 115716–115737.
54. Krauss C., Do A. and Huck N. (2017), "Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500.", European Journal of Operational Research, Vol. 259 No. 2, pp. 689–702.
55. Kumbure M., Lohrmann C., Luukka P. et al. (2022), "Machine learning techniques and data for stock market forecasting: A literature review.", Expert Systems with Applications, Vol. 197, pp. 116659–116700.
56. Tajmouati S. et al. (2021), "Applying k-nearest neighbors to time series forecasting: two new approaches.", arXiv preprint arXiv:2103.14200.
57. Lin G., Lin A. and Cao J. (2021), "Multidimensional KNN algorithm based on EEMD and complexity measures in financial time series forecasting.", Expert Systems with Applications, Vol. 168, pp. 114443.
58. Castillo P. A. et al. (2017), "Applying computational intelligence methods for predicting the sales of newly published books in a real editorial business management environment.", Knowledge-Based Systems, Vol. 115, pp. 133–151.
59. Li X. and Wu P. (2021), "Stock Price Prediction Incorporating Market Style Clustering.", Cognitive Computation, Vol. 14 No. 1, pp. 149–166.
60. Zhang J., Shao Y., Huang L. et al. (2020), "Can the Exchange Rate Be Used to Predict the Shanghai Composite Index?", IEEE Access, Vol. 8, pp. 2188–2199.
61. Sadeghi A., Daneshvar A. and Zaj M. (2021), "Combined ensemble multi-class SVM and fuzzy NSGA-II for trend forecasting and trading in forex markets.", Expert Systems with Applications, Vol. 10, pp. 115566–115582.
62. Pełka P. and Dudek G. (2019), "Pattern-based forecasting monthly electricity demand using multilayer perceptron.", International Conference on Artificial Intelligence and Soft Computing, pp. 663–672.
* View Article
* Google Scholar
63. 63. Khare K., Darekar O., Gupta P., Attar V. (2017), "Short term stock price prediction using deep Learning.", 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), pp. 482–486.
* View Article
* Google Scholar
64. 64. Elsheikh , Ahmed S. Yacout , and Ouali M. S. (2019), "Bidirectional handshaking LSTM for remaining useful life prediction.", Neurocomputing, Vol. 5, pp. 148–156.
* View Article
* Google Scholar
65. 65. Cui Zhiyong, et al. (2018), "Deep Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction.", arxiv preprint arxiv:1801.02143.
* View Article
* Google Scholar
66. 66. Jianhua Huang, Min Zhong, Qingchun Hu. LSTM stock prediction model based on improved particle swarm optimization algorithm [J]. Journal of East China University of Science and Technology (Natural Science Edition), 2022, 48 (05): 696–707.
* View Article
* Google Scholar
67. 67. Jingqi Cong, Pengfei Cheng, Zhenjun Zhao. Integrated prediction model for stock index based on CEEMD-CNN-LSTM [J]. Systems Engineering, 2023,41 (04): 104–116
* View Article
* Google Scholar
68. 68. Weijie Chen, Weihui Jiang, Xibei Jia. Stock Index Price Prediction Based on CNN-GRU Joint Model [J]. Information Technology and Informatization, 2021 (09): 87–91
* View Article
* Google Scholar
Citation: Zeng X, Liang C, Yang Q, Wang F, Cai J (2025) Enhancing stock index prediction: A hybrid LSTM-PSO model for improved forecasting accuracy. PLoS ONE 20(1): e0310296. https://doi.org/10.1371/journal.pone.0310296
About the Authors:
Xiaohua Zeng
Roles: Conceptualization, Methodology, Software, Writing – original draft
Affiliation: School of Economics and Trade, Guangzhou Xinhua University, Dongguan, China
ORCID: https://orcid.org/0000-0003-0934-7118
Changzhou Liang
Roles: Data curation, Methodology
Affiliation: School of Economics and Trade, Guangzhou Xinhua University, Dongguan, China
Qian Yang
Roles: Data curation, Methodology
Affiliation: School of Economics and Trade, Guangzhou Xinhua University, Dongguan, China
Fei Wang
Roles: Project administration, Supervision, Writing – review & editing
E-mail: [email protected] (FW); [email protected] (JC)
Affiliation: School of Economics and Trade, Guangzhou Xinhua University, Dongguan, China
Jieping Cai
Roles: Investigation, Project administration, Supervision, Writing – review & editing
E-mail: [email protected] (FW); [email protected] (JC)
Affiliation: School of Economics and Trade, Guangzhou Xinhua University, Dongguan, China
© 2025 Zeng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Stock price prediction is a challenging research domain. The long short-term memory (LSTM) neural network is widely employed in stock price prediction because of its ability to capture long-term dependencies and carry historical signals forward through time series data. However, manual tuning of LSTM hyperparameters significantly impacts model performance. This article proposes a PSO-LSTM model that leverages PSO’s efficient swarm intelligence and strong optimization capability. Experimental results on six global stock indices demonstrate that PSO-LSTM fits the real data closely and achieves high prediction accuracy. Moreover, the loss decreases gradually as the number of PSO iterations increases, indicating good convergence of PSO-LSTM. Comparative analysis with seven other machine learning algorithms confirms the superior performance of PSO-LSTM. Furthermore, the experiments examine the impact of different look-back periods on prediction accuracy and find consistent results across varying time spans.
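To make the hyperparameter search summarized above concrete, the sketch below tunes two commonly optimized LSTM settings (the number of hidden neurons and the number of training epochs) with a textbook PSO loop in Python using Keras. This is a minimal illustrative sketch, not the authors’ implementation: the synthetic series, the validation-RMSE fitness, the swarm size, the search bounds, and the PSO constants (w, c1, c2) are all assumptions chosen for brevity, and the paper’s wavelet denoising and correlation-based feature selection steps are omitted.

# Minimal PSO-tuned LSTM sketch (illustrative only; all constants are assumptions).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

rng = np.random.default_rng(0)

# Synthetic price-like series, standardized, cut into sliding windows.
series = np.cumsum(rng.normal(size=500)).astype("float32")
series = (series - series.mean()) / series.std()
LOOKBACK = 10
X = np.stack([series[i:i + LOOKBACK] for i in range(len(series) - LOOKBACK)])[..., None]
y = series[LOOKBACK:]
split = int(0.8 * len(X))
X_tr, X_va, y_tr, y_va = X[:split], X[split:], y[:split], y[split:]

def fitness(pos):
    """Validation RMSE of an LSTM built from one particle's (units, epochs)."""
    units, epochs = int(round(pos[0])), int(round(pos[1]))
    model = Sequential([LSTM(units, input_shape=(LOOKBACK, 1)), Dense(1)])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X_tr, y_tr, epochs=epochs, batch_size=32, verbose=0)
    pred = model.predict(X_va, verbose=0).ravel()
    return float(np.sqrt(np.mean((pred - y_va) ** 2)))

# Standard PSO: inertia plus cognitive/social pulls toward personal/global bests.
LOW, HIGH = np.array([8.0, 5.0]), np.array([64.0, 30.0])   # assumed search bounds
N_PARTICLES, N_ITER, W, C1, C2 = 5, 4, 0.7, 1.5, 1.5       # toy sizes, kept small

pos = rng.uniform(LOW, HIGH, size=(N_PARTICLES, 2))
vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
g = int(pbest_f.argmin())
gbest, gbest_f = pbest[g].copy(), pbest_f[g]

for _ in range(N_ITER):
    r1, r2 = rng.random((N_PARTICLES, 2)), rng.random((N_PARTICLES, 2))
    vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, LOW, HIGH)          # keep particles inside the bounds
    for i, p in enumerate(pos):
        f = fitness(p)
        if f < pbest_f[i]:                       # update personal best
            pbest[i], pbest_f[i] = p.copy(), f
            if f < gbest_f:                      # update global best
                gbest, gbest_f = p.copy(), f

print(f"best (units, epochs) = {np.round(gbest).astype(int)}, val RMSE = {gbest_f:.4f}")

Because every fitness call trains a network from scratch, the swarm and iteration counts above are deliberately tiny; a full experiment would use larger values, multiple random restarts (PSO is stochastic), and the real, denoised index data in place of the synthetic series.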