In the contemporary global economic landscape, financial fraud represents a significant challenge, causing substantial losses for market participants, including business enterprises and financial institutions, and undermining market stability and economic management. To address this issue, this paper proposes a novel financial fraud detection algorithm that integrates deep belief networks (DBNs) with quantum optimization algorithms, within a hybrid architecture that combines convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and graph neural networks (GNNs). Conventional detection methods depend on manual rules and statistical analyses, which are inadequate for large-scale, high-dimensional, and complex financial market data. Recent advances in deep learning show potential for addressing these challenges but are often hindered by limited computational efficiency and long training times. The integrated approach proposed in this paper combines deep learning with quantum computing to overcome these limitations: the hybrid model exploits the parallel processing power of quantum computing to improve the training efficiency of DBNs, while the CNN, LSTM, and GNN branches extract features from multiple dimensions of financial market data. Experimental results demonstrate the proposed model's advantages in accuracy, training speed, and robustness, providing a promising solution for financial fraud detection.
Article highlights
Quantum-Enhanced Fraud Detection: A novel quantum-optimized deep belief network achieves 88.7% precision and 86.5% recall, outperforming traditional methods in fraud detection efficiency and accuracy.
Hybrid Model for Robust Fraud Detection: Integration of CNN, LSTM, and GNN extracts spatial, temporal, and relational features to enhance detection robustness for complex fraud patterns.
Economic Benefits and Cost-Effective Deployment: The model reduces fraud-related economic losses and deployment costs, offering a cost-effective solution with high computational efficiency for financial institutions.
Introduction
With the rise of digitalization, financial fraud has become a significant challenge, causing substantial economic losses and undermining market stability and investor confidence. Traditional detection methods, which primarily rely on manual rules and statistical analysis, face significant challenges in effectively addressing the growing complexity and scale of financial fraud. Rule-based methods, while effective in their early stages for identifying fixed patterns of fraudulent behavior (such as inflated income or false expenses), are inherently rigid. They require continuous manual intervention to update rules in response to emerging fraud tactics, which is not only labor-intensive but also prone to oversight as fraudsters develop more sophisticated and varied methods. This rigidity makes them ill-suited to the dynamic nature of modern financial environments where fraud patterns evolve rapidly [1].
Statistical methods, although powerful in analyzing structured data and identifying patterns within historical contexts, have notable drawbacks when applied to contemporary fraud detection. They often assume a degree of linearity and normality in data distributions, which does not hold true for the complex, non-linear, and often skewed nature of financial transactions. Furthermore, statistical models can be computationally intensive, especially when dealing with the high dimensionality and volume of modern financial datasets. Their reliance on historical data also means they may lag in identifying novel fraud patterns, as these do not conform to previously observed statistical norms. Additionally, the interpretability of statistical models, while generally strong, can become a liability when overly complex models are employed to compensate for their inherent limitations in capturing non-linear relationships.
In contrast, deep learning models offer several distinct advantages that address these limitations. They excel at automatically learning and extracting complex, non-linear features from vast amounts of data without the need for extensive manual feature engineering [2]. This capability is particularly valuable in the financial sector, where the volume and velocity of transactions make manual analysis infeasible. Deep learning architectures, such as convolutional neural networks (CNNs), long short-term memory networks (LSTM), and graph neural networks (GNN), are specifically designed to handle different data types and structures, allowing them to capture spatial, temporal, and relational features that traditional methods might miss. Moreover, their adaptive nature enables them to evolve with new data, improving their detection accuracy over time and reducing the need for frequent manual intervention. This makes deep learning models not only more efficient but also more effective in the dynamic and high-stakes environment of financial fraud detection.
This study proposes an innovative financial fraud detection method that combines deep belief networks (DBN) with quantum optimization algorithms. The integration of quantum computing accelerates the training of deep learning models, improving efficiency and performance when processing large-scale, complex data. Additionally, a hybrid model combining convolutional neural networks (CNN), long short-term memory networks (LSTM), and graph neural networks (GNN) is used to extract features from multiple dimensions, enhancing the accuracy and robustness of fraud detection. This approach not only advances the theoretical application of quantum computing in deep learning but also provides a practical, efficient solution for financial fraud detection [3].
Key innovations include:
Combination of DBN and Quantum Algorithms: This integration uses quantum optimization to accelerate DBN training, improving efficiency and detection speed through parallel computing and quantum superposition.
Hybrid Model for Feature Extraction: The CNN, LSTM, and GNN hybrid model captures spatial features, long-term dependencies, and node relationships in data, enhancing fraud detection accuracy and robustness.
Collaborative Quantum and Deep Learning: By combining quantum algorithms with multi-level deep learning, the model accelerates training while improving generalization, effectively addressing complex fraud patterns.
In summary, this paper presents a novel hybrid approach combining quantum optimization and deep learning to offer an efficient and accurate solution for financial fraud detection.
Related work
Research progress on financial fraud detection
With the continuous development of the global economy, financial fraud has become increasingly complex, causing huge losses to businesses and investors [4]. Therefore, how to effectively detect and prevent financial fraud has become a research hotspot in both academia and practice. Financial fraud detection methods can generally be divided into three categories: rule-based methods, statistical methods, and machine learning methods [5, 6]. Each method has its own advantages and disadvantages, and as fraudulent methods continue to develop and change, the effectiveness of different methods is constantly being tested and improved.
Firstly, traditional rule-based detection methods mainly rely on expert experience, using a series of pre-set rules and thresholds to screen financial data. This approach achieved some success in the early stages, and is most effective for certain fixed patterns of fraudulent behavior (such as inflated income or false expenses) [7, 8]. However, with the diversification of corporate financial environments and the continuous innovation of fraudulent methods, traditional rule-based approaches have gradually exposed their limitations. Specifically, these methods struggle to address new forms of fraud because they cannot dynamically adjust rules to adapt to new fraud patterns [9, 10].
Secondly, statistical methods have also been widely used in financial fraud detection, especially regression analysis and cluster analysis; for example, analyzing the relationship between financial data and historical fraudulent behavior through regression models, or identifying abnormal patterns in financial data through cluster analysis [11]. These methods can effectively reveal potential fraud risks when processing data of moderate scale. However, statistical methods are weak at recognizing complex nonlinear patterns and processing large-scale data, and often perform poorly when facing multidimensional and unstructured data. In addition, statistical methods rely heavily on historical data and lack sufficient adaptability.
In recent years, with the improvement of computing power and the development of big data technology, machine learning methods, especially deep learning, have become the forefront of research in financial fraud detection. Machine learning methods can automatically learn patterns from large amounts of historical data and discover potentially complex patterns in the data. Neural network methods (such as multi-layer perceptrons and convolutional neural networks) and ensemble learning methods (such as random forests and gradient-boosted decision trees) are widely used in fraud detection [12, 13]. These methods can provide high detection accuracy in complex and dynamic environments by training models to identify potential features of fraudulent behavior. For example, deep learning models can automatically extract high-order features of data, avoiding the difficulty of manually engineering features, and can self-adjust to deal with new fraudulent behaviors. At the same time, ensemble learning methods improve overall detection performance by combining multiple weak classifiers, exhibiting particularly strong robustness on imbalanced datasets. This is similar to integrated scheduling strategies in energy systems, such as the African vulture optimization technique used for the multi-objective scheduling problem in combined cooling, heating, and power systems [14].
Although machine learning methods have made significant progress in financial fraud detection, they still face some challenges, such as model interpretability, data privacy issues, and model generalization ability [15]. In the future, with the continuous development of technology, hybrid models combining multiple methods, deep learning models with enhanced interpretability, and more advanced anomaly detection techniques may become research trends in the field of financial fraud detection.
Integration of deep belief networks and quantum algorithms in fraud detection
Deep belief networks (DBNs), as unsupervised deep learning models, learn high-order feature representations of data by stacking multiple restricted Boltzmann machines (RBMs) layer by layer, and can extract latent patterns from unlabeled data without requiring large amounts of labeled data. This property gives DBNs great potential in fraud detection: when facing complex financial data, they can automatically learn latent fraud features from the data, thereby improving the classification accuracy of the model [16, 17]. However, with the continuous increase in data volume, traditional deep learning methods face significant challenges in computational efficiency and training time. To overcome these issues, quantum computing, as an emerging computational paradigm, provides unprecedented potential for deep learning. Quantum computing exploits quantum superposition and quantum entanglement to significantly improve computational efficiency while processing a large number of calculations simultaneously. Through quantum algorithms, the speed and accuracy of data processing can be greatly improved, especially on large-scale datasets. Quantum computing can perform parallel processing on complex computing tasks in a short period of time, significantly reducing the time required for model training [18]. Quantum algorithms can not only optimize the parameter update process in deep belief networks, but also improve the regularization process through quantum randomization techniques and enhance the generalization ability of the model [19].
In recent years, the combination of quantum computing and deep learning has become an important research direction, especially in the field of fraud detection. Traditional DBNs face challenges in escaping local optima due to gradient-based training (e.g., contrastive divergence) [20]. Quantum optimization leverages quantum superposition and parallel state evaluation to explore multiple parameter configurations simultaneously. Specifically, quantum annealing maps the DBN’s energy landscape to a quantum Hamiltonian, enabling global optimization of weights and biases. This avoids shallow minima that trap classical DBNs, enhancing feature learning for rare fraud patterns.
Additionally, quantum noise injection during training acts as dynamic regularization. Unlike classical dropout, which randomly deactivates neurons, quantum noise perturbs the network's superposition states, forcing the DBN to learn robust representations resilient to adversarial perturbations in financial data (e.g., synthetic fraud transactions). In addition, quantum algorithms can also improve the regularization process in deep learning models through quantum randomization techniques, thereby enhancing the model's generalization ability and reducing overfitting. This enables deep belief networks to better adapt to high-dimensional, sparse, and nonlinear data features in fraud detection tasks [21–23]. For example, the improved water wave optimization algorithm showed an acceleration effect in the optimal economic dispatch of microgrids, which provides useful inspiration for the application of quantum computing in deep learning [24].
Although quantum algorithms have shown great potential in improving the performance of deep learning, their practical applications still face many challenges. For example, the limitations of quantum hardware and noise issues remain bottlenecks in the application of quantum computing to deep learning. In addition, the current maturity of quantum algorithms is not sufficient, especially in the integration with traditional deep learning models, and more exploration and optimization are needed. However, with the continuous development of quantum computing technology and the increasing maturity of quantum hardware, the integration of quantum algorithms and deep learning will play an increasingly important role in complex tasks such as fraud detection, driving the field towards more efficient and accurate development. In the future, a hybrid model combining quantum computing and deep learning may become a key technology for solving financial fraud problems, providing more accurate and efficient solutions for fraud detection [25].
Integration method based on convolutional neural networks, long short-term memory networks, and graph neural networks
Convolutional neural networks (CNN), long short-term memory networks (LSTM), and graph neural networks (GNN) are three important neural network architectures in deep learning, each with unique advantages in handling different types of data. Combining these three models forms a powerful integrated approach that can effectively address complex financial fraud detection tasks [26].
Firstly, Convolutional Neural Networks (CNNs) excel in image processing and 2D data analysis, with their core advantage being the ability to automatically extract local features through convolution operations. For some specific data structures in financial fraud detection, such as financial statements containing multiple dimensions, account transaction records, etc., CNN can effectively extract spatial features of the data. For example, the possible relationships between different dimensions in financial data (such as income and expenses, assets and liabilities) can be learned through CNN's convolutional layers, providing strong support for subsequent fraud judgments. Through multi-layer convolution operations, CNN can gradually abstract higher-order features from the raw data, helping the model better capture potential fraudulent behavior [27].
Long Short Term Memory (LSTM) is a classic model for processing time-series data, which can effectively capture long-term dependencies in time series. Financial fraud often presents a temporal continuity and changing trend, for example, certain fraudulent behaviors may gradually accumulate and manifest over a period of time. LSTM, through its gating mechanism, can maintain memory of important information over a long period of time and filter out irrelevant noise. This makes the application of LSTM in financial fraud detection particularly important, as it can identify abnormal transaction patterns or trend changes from time series data, providing temporal basis for judging fraudulent behavior [28].
Graph neural networks (GNNs) are specifically designed for processing graph-structured data and can model node relationships in the data. In financial fraud detection, the relationships between financial accounts, transaction records, and users can be represented through a graph structure. For example, transfer relationships between bank accounts and social relationships between users can be constructed as graph data. A GNN is able to learn the features of each node by aggregating information from neighboring nodes, which is crucial for identifying abnormal patterns in financial transaction networks. Through graph convolution operations, GNNs can propagate information between nodes, thereby capturing potential fraudulent behaviors hidden in complex relational networks [29]. Similarly, in smart grid optimization, graph models are used to process the relationships between nodes in the power network to improve the accuracy and efficiency of system scheduling [30].
Combining CNN, LSTM, and GNN to form an integrated model can fully utilize the complementary advantages of the three. CNN can extract spatial features of data, LSTM can capture long-term dependencies in time series, and GNN can model complex relationship information between nodes. By integrating these three networks, the model can detect fraudulent behavior from multiple dimensions, providing more accurate and comprehensive judgments. Specifically, CNN can first perform preliminary feature extraction on financial data, LSTM can then analyze the temporal evolution trend of the data, while GNN can deeply explore the correlations between accounts or transactions, and ultimately this information is integrated to identify fraudulent behavior [31–34].
The advantage of this integrated approach is that it not only relies on the powerful capabilities of each network individually, but also lets them complement each other through collaborative work, improving the effectiveness of financial fraud detection from multiple perspectives and at different levels [35–39]. With the increasing complexity of financial fraud patterns, this multi-model fusion strategy will undoubtedly become an important way to improve detection accuracy and generalization ability.
System framework and method design
Modeling of financial fraud detection issues
The problem of financial fraud detection can be modeled as a typical binary classification problem. Assuming the input data is a set of financial transaction records and related user behavior data, the goal is to determine whether each transaction constitutes fraudulent behavior. Each transaction can be represented as a high-dimensional feature vector $x = [x_1, x_2, \ldots, x_n]$, where $x_i$ represents the $i$-th feature, such as transaction amount, transaction time, or user history behavior. On this basis, we hope to construct a classification model $f(x)$ that outputs a binary label $y \in \{0, 1\}$, where $y = 0$ represents normal transactions and $y = 1$ represents fraudulent transactions.
In order to better capture the complex data characteristics in financial transactions, it is necessary not only to rely on the static features of transactions (such as transaction amount and transaction type), but also to consider the time series characteristics and the relationship structure between data [38, 39]. Specifically, time series characteristics can capture the time-varying patterns of trading behavior by modeling the sequence of a user's transactions. Let the transaction history of a user be $S = \{x_1, x_2, \ldots, x_T\}$, where $x_i$ is the feature vector of the $i$-th transaction and $T$ is the number of the user's historical transactions. We can use long short-term memory (LSTM) networks to capture changes in user transaction behavior over time.
The core idea of LSTM networks is to effectively address long-term dependency problems by introducing gating mechanisms. The update formulas for LSTM include:
$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$  (1)

$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$  (2)

$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$  (3)

$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c [h_{t-1}, x_t] + b_c)$  (4)

$h_t = o_t \odot \tanh(c_t)$  (5)

Among them, $i_t$, $f_t$, and $o_t$ are the activation values of the input gate, forget gate, and output gate, respectively; $c_t$ is the cell state, $h_t$ is the hidden state, $x_t$ is the input at the current time step, $\sigma$ is the sigmoid function, and $\odot$ represents element-wise multiplication.
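To make these update rules concrete, the following is a minimal NumPy sketch of a single LSTM cell step implementing Eqs. (1)–(5); the feature dimension, hidden size, and random initialization are illustrative assumptions rather than the paper's actual configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (1)-(5)."""
    concat = np.concatenate([h_prev, x_t])                  # [h_{t-1}, x_t]
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])   # input gate,  Eq. (1)
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])   # forget gate, Eq. (2)
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])   # output gate, Eq. (3)
    c_t = f_t * c_prev + i_t * np.tanh(params["W_c"] @ concat + params["b_c"])  # Eq. (4)
    h_t = o_t * np.tanh(c_t)                                # Eq. (5)
    return h_t, c_t

# Toy dimensions for illustration: 8 transaction features, hidden size 16
n_in, n_hid = 8, 16
rng = np.random.default_rng(0)
params = {f"W_{g}": rng.normal(0, 0.1, (n_hid, n_hid + n_in)) for g in "ifoc"}
params.update({f"b_{g}": np.zeros(n_hid) for g in "ifoc"})

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(5, n_in)):   # a user's sequence of 5 transactions
    h, c = lstm_step(x_t, h, c, params)  # h summarizes the behavior so far
```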
In addition, there are often implicit relationship structures within transaction data. For example, there may be social network relationships between transactions, or certain users may act as initiators or intermediaries in fraudulent behavior [40, 41]. To model these relationships, we can utilize graph neural networks (GNNs). In a GNN, each user or transaction is treated as a node in the graph, and the edges between nodes represent the relationships between transactions or users. For example, if there is a fund flow relationship between user $u$ and user $v$, an edge is added between these two nodes in the graph. Graph neural networks learn node representations by aggregating information from neighboring nodes in order to identify users or transactions with fraud risk. The message passing process in graph neural networks can be represented as:
$h_i^{(l+1)} = \sigma\left( \sum_{j \in \mathcal{N}(i)} a_{ij} W^{(l)} h_j^{(l)} \right)$  (6)

Among them, $h_i^{(l)}$ represents the hidden state of node $i$ in the $l$-th layer, $\mathcal{N}(i)$ is the set of neighboring nodes of node $i$, $a_{ij}$ is the corresponding element of the adjacency matrix, and $W^{(l)}$ is the weight matrix of the $l$-th layer.
In order to effectively combine these multiple features (time series features and relational structural features) together, this paper proposes a hybrid model that combines deep belief networks (DBN) with convolutional neural networks (CNN), LSTM, and GNN. DBN, as an unsupervised learning model, can extract high-dimensional features from transaction data and provide more discriminative feature representations for subsequent classification.
In the model, the output of DBN can be represented by the following formula:
$h = \sigma(Wx + b)$  (7)

Among them, $h$ is the output of the hidden layer, $x$ is the input feature vector, and $W$ and $b$ are the weight matrix and bias term, respectively.
The final output label is predicted by a fused deep learning model, with the following formula:
$\hat{y} = \sigma(W_o z + b_o)$  (8)

Among them, $z$ is the final representation after feature fusion from CNN, LSTM, and GNN, and $W_o$ and $b_o$ are the weights and biases of the final layer.
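As a minimal illustration of Eqs. (7)–(8), the sketch below applies a DBN-style sigmoid hidden layer to raw features and fuses hypothetical CNN/LSTM/GNN feature vectors by concatenation before the final sigmoid output; all shapes and weights are placeholders, not the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
# Hypothetical branch outputs (placeholders for trained CNN/LSTM/GNN features)
z_cnn, z_lstm, z_gnn = rng.normal(size=32), rng.normal(size=16), rng.normal(size=16)

# DBN-style hidden layer on the raw features, Eq. (7): h = sigmoid(Wx + b)
x = rng.normal(size=20)
W_dbn, b_dbn = rng.normal(0, 0.1, (32, 20)), np.zeros(32)
h = sigmoid(W_dbn @ x + b_dbn)

# Feature fusion by concatenation, then final prediction, Eq. (8)
z = np.concatenate([z_cnn, z_lstm, z_gnn, h])
W_out, b_out = rng.normal(0, 0.1, z.shape[0]), 0.0
y_hat = sigmoid(W_out @ z + b_out)   # probability that the transaction is fraudulent
print(f"P(fraud) = {y_hat:.3f}")
```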
In summary, the problem of financial fraud detection can be effectively modeled in a deep learning framework by constructing high-dimensional feature vectors and combining them with time series characteristics and data relationship structures. Adopting a multi-level deep learning architecture and quantum optimization algorithms to accelerate training can improve the accuracy, robustness, and computational efficiency of fraud detection models [42, 43].
Application of deep learning techniques in financial fraud detection
The application of deep learning models in financial fraud detection is mainly reflected in two aspects: feature learning and classification prediction. Traditional fraud detection methods typically rely on expert knowledge to manually extract key features from transaction data, such as transaction amount, transaction time, and account history, and then build classification or regression models based on these features [44–46]. However, as the volume of transaction data continues to increase and the relationships between data become increasingly complex, traditional methods often struggle to adapt. Deep learning techniques, especially deep neural networks (DNNs), can effectively discover latent complex patterns in raw data through automatic feature learning, reducing the workload of manual feature engineering and significantly improving the predictive performance of models [47]. The deep learning application model in financial fraud is shown in Fig. 1.
Fig. 1 [Images not available. See PDF.]
Deep Learning Application Model in Financial Fraud
In financial fraud detection, deep learning models first receive raw transaction data through the input layer, which includes various types of transaction information such as user behavior, account history, transaction amount, and timestamps. Through layer-by-layer transmission in the multi-layer network, the model gradually extracts important features from the high-dimensional data. At each layer, the input undergoes a linear transformation followed by a nonlinear transformation through an activation function $f(\cdot)$ to produce the output of the next layer.
Assuming the input of layer $l$ is $a^{(l-1)}$ and the output is $a^{(l)}$, the output of layer $l$ can be calculated using the following formulas:

$z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$  (9)

$a^{(l)} = f(z^{(l)})$  (10)

Among them, $W^{(l)}$ is the weight matrix of the $l$-th layer, $b^{(l)}$ is the bias term, and $f(\cdot)$ is the activation function (such as ReLU or sigmoid) used to introduce nonlinearity. Through layer-by-layer propagation, data is transformed and abstracted at each level, and the model gradually learns how to extract meaningful feature representations from the input. Ultimately, the output layer converts these abstract features into predicted results, typically fraudulent or non-fraudulent classification labels.
In fraud detection applications, the sigmoid activation function is typically used for the final classification prediction. Assuming the linear output of the last layer is $z^{(L)}$, the probability that the transaction is fraudulent is given by:

$\hat{y} = \sigma(z^{(L)}) = \frac{1}{1 + e^{-z^{(L)}}}$  (11)

Among them, $\sigma$ is the sigmoid function, and the output value $\hat{y} \in (0, 1)$ represents the probability of the transaction being fraudulent. If $\hat{y} > 0.5$, the transaction is predicted as fraudulent; otherwise it is predicted as normal.
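The forward pass of Eqs. (9)–(11) can be written in a few lines of NumPy; the layer sizes below are illustrative assumptions, while the 0.5 decision threshold follows the text.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Layer-by-layer propagation: z = W a + b (Eq. 9), a = f(z) (Eq. 10)."""
    a = x
    for W, b in layers[:-1]:
        a = relu(W @ a + b)
    W_L, b_L = layers[-1]
    return sigmoid(W_L @ a + b_L)      # Eq. (11): fraud probability

rng = np.random.default_rng(2)
sizes = [10, 32, 16, 1]                # features -> hidden -> hidden -> output
layers = [(rng.normal(0, 0.1, (m, n)), np.zeros(m)) for n, m in zip(sizes, sizes[1:])]

x = rng.normal(size=10)                # one transaction's feature vector
p_fraud = forward(x, layers)[0]
label = int(p_fraud > 0.5)             # 1 = fraudulent, 0 = normal
```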
The advantage of deep learning models lies in their ability to automatically learn features from data without relying on manually designed rules, which enables the model to capture complex and hidden patterns of fraudulent behavior. For example, the model can automatically identify certain specific transaction patterns or account behaviors through historical transaction data, and predict whether future transactions may be fraudulent. Meanwhile, as the amount of training data increases, the predictive ability of deep learning models continues to improve, enabling them to cope with constantly changing fraudulent methods [48, 49].
In addition, the generalization ability of deep learning technology also provides important guarantees for financial fraud detection. Due to the hierarchical structure of deep neural networks, they are able to capture the diversity and complexity of data at different levels. Therefore, this model can not only identify known fraud patterns, but also effectively predict new and unseen fraud behaviors, providing real-time and dynamic risk warnings. This demonstrates the enormous potential of deep learning in financial fraud detection, especially when faced with large-scale and multi-dimensional transaction data.
Introduction of quantum algorithms and optimization of CNN models
The introduction of quantum algorithms provides a new approach and new tools for optimizing the training of deep learning models. The traditional training process of deep learning models, especially on large-scale datasets, often faces severe bottlenecks in computing resources and time. As model complexity increases, the amount of computation required for training grows rapidly, which not only consumes a large amount of hardware resources but also makes training extremely lengthy. Quantum computing, through properties such as quantum superposition, quantum entanglement, and quantum parallelism, can to some extent break through these bottlenecks and greatly accelerate the training of deep learning models. The deep learning model with quantum state introduction is shown in Fig. 2.
Fig. 2 [Images not available. See PDF.]
Deep Learning Model with Quantum State Introduction
An important characteristic of quantum computing is the quantum superposition state. In traditional computing, each computing unit can only hold one state at a time, whereas quantum computing can achieve the superposition of multiple states through qubits, allowing one qubit to represent multiple pieces of information simultaneously. Specifically, if a quantum system is in a superposition state, it can simultaneously represent multiple possible solutions, allowing for parallel search during computation. Assuming we encode a weight vector $w$ into qubits, the weight vector can be represented as a superposition of multiple states rather than as a single vector, as in traditional computing. Through quantum parallelism, multiple weight vectors can be explored simultaneously, thereby accelerating the search process of optimization algorithms such as gradient descent.
One of the most common applications of quantum algorithms in training deep learning models is Quantum Gradient Descent (QGD). Quantum Gradient Descent is an optimization technique that leverages the principles of quantum computing, such as superposition and parallelism, to accelerate traditional gradient descent algorithms. Unlike classical gradient descent, which processes one weight vector at a time, QGD can evaluate multiple gradient values simultaneously, significantly reducing the number of iterations and speeding up model training.
In classical gradient descent, each iteration updates the model parameters by calculating the gradient of the loss function for a single weight vector. This sequential approach can be time-consuming, especially in high-dimensional spaces where the computational workload is enormous. Quantum Gradient Descent, however, utilizes quantum superposition to represent multiple weight vectors simultaneously. This allows the algorithm to explore the gradient landscape in parallel, making it highly efficient for large-scale and complex datasets.
The core idea of Quantum Gradient Descent is to minimize the loss function using quantum circuits. The update rule in classical gradient descent is:
$w_{t+1} = w_t - \eta \nabla L(w_t)$  (12)

Among them, $w_t$ represents the weights at the $t$-th iteration, $\eta$ is the learning rate, and $\nabla L(w_t)$ is the gradient of the loss function at the current weights. This process requires many iterations, each involving a gradient computation, which is expensive in high-dimensional spaces. Quantum algorithms, through the computational power of quantum superposition states, can evaluate multiple gradient values simultaneously, thereby updating multiple parameters in parallel in each iteration, reducing the number of iterations and accelerating training.
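Real quantum hardware is not reproduced here, but the following classical NumPy sketch shows the update rule of Eq. (12) and mimics the "evaluate several candidates at once" intuition with an explicit loop; a true QGD would evaluate such candidates in superposition rather than sequentially. The logistic-loss objective and the step-size multipliers are illustrative assumptions.

```python
import numpy as np

def loss(w, X, y):
    """Logistic loss for a linear fraud scorer (illustrative objective)."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def grad(w, X, y):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

rng = np.random.default_rng(3)
X = rng.normal(size=(256, 8))
y = (rng.random(256) < 0.1).astype(float)   # rare positive (fraud) labels

# Classical update rule, Eq. (12): w_{t+1} = w_t - eta * grad L(w_t)
w, eta = np.zeros(8), 0.5
for _ in range(100):
    # Classical stand-in for quantum parallelism: score several candidate
    # updates at once and keep the best one.
    candidates = [w - eta * s * grad(w, X, y) for s in (0.5, 1.0, 2.0)]
    w = min(candidates, key=lambda c: loss(c, X, y))
```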
Quantum algorithms can also search for better solutions in high-dimensional spaces. In traditional deep learning, model optimization is often plagued by local optima, especially in non-convex optimization problems. Quantum algorithms can effectively avoid getting stuck in local optima through features such as quantum tunneling and quantum entanglement. Quantum annealing is a commonly used quantum optimization technique that gradually evolves a quantum system toward its low-energy ground state, which encodes the global optimum, under slowly changing external conditions. Combined with the training of deep learning models, quantum annealing can help find globally optimal network weights, thereby improving the performance and generalization ability of the model.
In summary, quantum algorithms provide new ideas for the training process of deep learning models, especially when dealing with large-scale and high-dimensional data. Quantum computing can significantly accelerate the training process and improve optimization efficiency. With the continuous development of quantum hardware, the application of quantum algorithms in deep learning will become increasingly widespread, becoming an important tool for solving traditional computing problems.
The role of long short-term memory networks (LSTM) in modeling temporal features
In financial fraud detection, transaction data often exhibits complex temporal patterns that evolve over time. For instance, fraudulent activities may involve unusual spending patterns or sudden changes in transaction frequency. Traditional machine learning models, such as decision trees and support vector machines (SVM), struggle to effectively capture these temporal dependencies, especially when dealing with long sequences of data. Long Short Term Memory (LSTM) networks, a specialized form of Recurrent Neural Network (RNN), address this challenge by leveraging their gating mechanisms to capture long-term dependencies in time-series data. This capability is crucial for identifying anomalies that may indicate fraudulent behavior.
LSTMs excel in fraud detection by maintaining a memory of past transactions through their unique architecture, which includes forget, input, and output gates. These gates regulate the flow of information, enabling the network to retain important information over extended periods while discarding irrelevant noise. This mechanism allows the LSTM to analyze the temporal evolution of transaction behavior and flag deviations from established patterns, which are often hallmarks of fraud. The update process of LSTM network can be represented as follows:
Forget gate:

$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$  (13)

The function of the forget gate is to determine how much information in the current unit state needs to be discarded. Here, $\sigma$ is the sigmoid activation function, $W_f$ is the weight matrix, $b_f$ is the bias, $h_{t-1}$ is the hidden state at the previous time step, and $x_t$ is the input at the current time step.

Input gate:

$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$  (14)

The input gate determines which new information will be stored in the unit state.

Candidate state:

$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$  (15)

The candidate state provides the new candidate information that may be added to the current unit state.

Unit state update:

$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$  (16)

The current unit state $c_t$ is updated from the previous unit state $c_{t-1}$ and the new information. The forget gate $f_t$ determines how much old information is retained, while the input gate and candidate state determine how much new information is added.

Output gate:

$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$  (17)

The output gate determines the hidden state at the current time step.

Hidden state:

$h_t = o_t \odot \tanh(c_t)$  (18)

The final hidden state $h_t$ is obtained from the current unit state through a tanh activation, gated by $o_t$.
Through these mechanisms, LSTM is able to determine how to update its internal state at each time step, effectively capturing long-term dependencies in the time series. This makes LSTM more effective than traditional RNN models in processing time series data, especially financial transaction data, speech data, and other data with both long- and short-term dependencies.
However, despite its excellent performance in temporal modeling, LSTM still has some limitations. Although LSTM can handle long-term dependencies, its computational efficiency is still relatively low when dealing with very complex or multi-level temporal patterns. In addition, LSTM models typically require longer training time, especially when training on large-scale datasets. To overcome these issues, researchers have proposed a scheme that combines LSTM with other models. A common approach is to combine LSTM with Convolutional Neural Networks (CNN) to enhance the temporal data processing capability of the model.
CNN has achieved great success in image recognition tasks, with its advantage being the ability to automatically extract local features and perform effective pattern recognition. When combining CNN with LSTM, CNN can serve as a feature extraction layer to extract local temporal features from input data, while LSTM is used to capture global temporal dependencies in the data. Specifically, CNN can first perform convolution operations on the raw data to extract representative local features, which help LSTM better capture long-term and short-term dependency information, thereby improving the overall performance of the model. The input-output structure of the CNN model after LSTM fusion is shown in Fig. 3.
Fig. 3 [Images not available. See PDF.]
Input-output structure of the CNN model after LSTM fusion
In the scenario of financial fraud detection, the hybrid model combining CNN and LSTM can fully leverage the advantages of both. CNN can help models identify potential local patterns in transaction data, such as abnormal trading behavior during specific time periods, while LSTM can capture the long-term dependencies behind these local patterns, such as the overall trend of trading behavior. Therefore, by combining CNN and LSTM, the model can more accurately identify potential fraudulent behavior and make more accurate predictions.
In practical applications, the model architecture combining CNN and LSTM usually first inputs transaction data into the CNN layer for convolution operation, obtains local feature maps, and then passes them to the LSTM layer for further processing. This combination enables the model to not only efficiently extract temporal features from data, but also model long-term dependencies on a global scale, greatly improving its ability to process temporal data.
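A minimal PyTorch sketch of the CNN-then-LSTM pipeline just described: a Conv1d layer extracts local temporal features and an LSTM models the global dependencies. The channel counts, kernel size, and sequence length are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNLSTMDetector(nn.Module):
    """Conv1d extracts local temporal features; LSTM models global dependencies."""
    def __init__(self, n_features=8, conv_channels=16, hidden=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(conv_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        z = self.conv(x.transpose(1, 2))  # Conv1d expects (batch, channels, seq)
        z = z.transpose(1, 2)             # back to (batch, seq, channels)
        _, (h_n, _) = self.lstm(z)        # final hidden state summarizes sequence
        return torch.sigmoid(self.head(h_n[-1]))  # fraud probability

model = CNNLSTMDetector()
seqs = torch.randn(4, 20, 8)              # 4 users x 20 transactions x 8 features
print(model(seqs).shape)                   # torch.Size([4, 1])
```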
Overall, LSTM networks play an important role in modeling long short-term dependencies through their unique design, especially in areas such as financial fraud detection, where they can help identify complex time series patterns. The combination of CNN and LSTM models further enhances the feature extraction and temporal modeling capabilities of the model, making it a powerful tool for processing temporal data. With the continuous growth of data volume and the improvement of computing power, the performance of LSTM based hybrid models in the financial field and other application scenarios will become increasingly prominent.
Application of graph neural network (GNN) in relational data
Graph neural networks (GNNs) are a type of deep learning model for processing graph-structured data. With the rapid development of financial technology, financial data is filled with a large amount of complex relational information, especially in fraud detection, where information such as transfer relationships between users and transaction networks is crucial for accurately identifying fraudulent behavior. Traditional machine learning methods often struggle to handle highly dependent graph-structured data effectively, while GNNs can propagate information between nodes through graph convolution operations, thereby capturing latent patterns and anomalies in the graph structure. This makes them highly promising for applications such as financial fraud detection. The structure of the GNN model is shown in Fig. 4.
Fig. 4 [Images not available. See PDF.]
Structure of GNN Neural Network Model
Basic principles of graph neural networks
The core idea of graph neural networks is to use the structure of the graph to propagate and aggregate node information. In GNN, each node in the graph (such as a user or transaction) is connected to other nodes through edges (such as transfer relationships or transaction associations). GNN uses a multi-layer message passing mechanism, allowing each node to receive information from its neighboring nodes at each layer, gradually updating its feature representation. Specifically, the feature information of nodes is propagated through graph convolution in the graph structure, which is jointly determined by the adjacency matrix and node features.
Mathematical formulas and graph convolution
In graph neural networks, node information propagation is usually achieved through graph convolution. Assume we have a graph $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges, and the node feature matrix is $X \in \mathbb{R}^{N \times d}$, where $N$ is the number of nodes and $d$ is the dimension of the node features. The goal of graph convolution is to update the features of each node based on the information of neighboring nodes.
The mathematical formula for graph convolution is usually written as:
$H^{(k)} = \sigma\left( \hat{A} H^{(k-1)} W^{(k)} \right)$  (19)

Among them:

$H^{(k)}$ represents the node feature representation of the $k$-th layer;

$\hat{A} = \tilde{D}^{-1/2} (A + I) \tilde{D}^{-1/2}$ is the normalized adjacency matrix, where $A$ is the original adjacency matrix, $I$ is the identity matrix (added to include the influence of each node itself), and $\tilde{D}$ is the degree matrix of $A + I$;

$W^{(k)}$ is the weight matrix of the $k$-th layer;

$\sigma$ is the activation function, usually ReLU;

$H^{(k-1)}$ is the node feature representation of the previous layer.

In the above equation, the graph convolution aggregates the information of neighboring nodes through the normalized adjacency matrix $\hat{A}$ and uses the weight matrix $W^{(k)}$ to learn how to combine this information. Through multi-layer graph convolution, the model can gradually capture the complex dependency relationships between nodes in the graph.
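The propagation rule of Eq. (19) is compact in NumPy. The four-node graph below is a toy stand-in for a transaction network, and the two random weight matrices are placeholders.

```python
import numpy as np

def normalized_adjacency(A):
    """A_hat = D^{-1/2} (A + I) D^{-1/2}, including self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_layer(H, A_hat, W):
    """Eq. (19): H^{(k)} = ReLU(A_hat @ H^{(k-1)} @ W^{(k)})."""
    return np.maximum(0.0, A_hat @ H @ W)

# Toy transaction graph: 4 accounts, edges = fund-flow relationships
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
A_hat = normalized_adjacency(A)

rng = np.random.default_rng(4)
H = rng.normal(size=(4, 5))        # 5 features per account node
W1 = rng.normal(0, 0.5, (5, 8))    # layer-1 weights
W2 = rng.normal(0, 0.5, (8, 2))    # layer-2 weights
H = gcn_layer(gcn_layer(H, A_hat, W1), A_hat, W2)   # two rounds of propagation
```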
Financial data typically contains a great deal of complex relational information, such as transfer relationships between users, fund flow paths, and temporal correlations between transactions. This relational information can reflect users' behavior patterns, and graph neural networks, through their powerful information propagation mechanism, can effectively capture these patterns and identify potential fraudulent behaviors. Several important scenarios in which GNNs are applied to financial data are described below.
In the financial field, the transfer relationship between users can be naturally represented as a graph. Each user is a node, and the transfer relationship is represented by the edges in the graph. Through GNN, the model can associate each user’s transaction behavior with their neighbors (i.e. users who have transfer relationships with them). Every transfer or fund flow will propagate in the graph, allowing each user to receive information from other users. For example, if a user has frequent financial transactions with multiple known fraudulent accounts, the model can propagate this information through graph convolutional layers to help identify the fraudulent behavior that the user may be involved in.
Abnormal behavior in transaction data (such as frequent large transfers, short-term fund flows between multiple accounts, etc.) usually presents different patterns from normal behavior. Traditional machine learning methods may find it difficult to directly capture such complex temporal and spatial relationships, while GNN can automatically learn these anomalous patterns through the propagation of relationships between nodes. For example, GNN can identify which transactions may have fraud risks by considering the correlations between transactions and combining the attributes of each transaction (such as amount, time, etc.). Through graph convolution, the model can effectively mine potential features related to transactions, thereby improving the accuracy of detection.
Financial chain analysis refers to tracking the flow path of funds from one account to another. In complex fraud cases, criminals often conceal their behavior through multi-level, multi-account fund transfers. GNNs demonstrate significant advantages in this scenario: by modeling each account and its transfer relationships as a graph structure, a GNN can propagate information about fund flows through multi-layer convolutional networks, automatically identify abnormal patterns in fund flows, and help analysts discover hidden fund chains, thereby improving the accuracy of fraud detection.
In some financial fraud cases, the fraudulent behavior is often expanded through social relationships (such as mutually recommended users or groups). By modeling social relationships between users as graph structures, GNN can capture indirect relationships between users. In this relationship network, even if direct transfer behavior is not obvious, potential social connections between users may reveal fraudulent behavior. For example, if some users frequently make small transfers with a newly joined user, and these users themselves have a history of fraud, then this social relationship may become a signal of fraudulent behavior. Through graph convolution, GNN can effectively mine these complex social structures and play a role in the detection process.
The application of GNN in financial data has many advantages. Firstly, GNN can naturally process graph structured data and effectively model the complex relationships and dependencies between nodes, thereby improving the performance of the model. Secondly, GNN can gradually capture long-term dependencies and detect potential fraudulent behavior through multi-layer information dissemination.
However, the application of GNN in financial data also faces some challenges. Firstly, financial data is often very large, and the scale of the graph may be very large, which poses challenges for the computation of GNN. Secondly, the sparsity of the graph may result in lower efficiency of graph convolution operations, thus requiring the design of efficient graph convolution algorithms. In addition, there may be noise and incomplete information in the graph, which can also affect the training and prediction performance of the GNN model.
Model integration and optimization strategy
In modern machine learning tasks, a single model often struggles to perform best in all scenarios, especially when the task involves multiple feature types or complex structured data. To further improve overall performance, we adopt a model ensemble strategy that fuses the outputs of convolutional neural networks (CNN), long short-term memory networks (LSTM), and graph neural networks (GNN). Each model has its unique advantages: CNN excels at processing local features and image-like data, LSTM is strong at processing time series data, and GNN can effectively handle graph-structured data and capture complex relationships between nodes. Through model integration, these strengths complement one another, enhancing the overall robustness and predictive ability of the model.
In addition, the model ensemble itself also needs to be optimized to ensure optimal performance within a limited training time. During this process, we introduced quantum optimization algorithms to further improve the accuracy and generalization ability of the model by optimizing its parameters. Quantum optimization algorithms have shown significant advantages in dealing with high-dimensional complex optimization problems, as they can quickly find approximate optimal solutions in situations where traditional optimization methods cannot effectively converge. Combining ensemble learning and quantum optimization methods not only improves the performance of the model, but also effectively shortens the training and adjustment time.
The model structure proposed in this paper is a hybrid deep learning framework that aims to improve the accuracy and robustness of fraud detection through multi-dimensional feature extraction and fusion. The specific structure is as follows, with a minimal code sketch following the list:
Input layer: The input data is financial transaction records and related user behavior data, represented as a high-dimensional feature vector $x = [x_1, x_2, \ldots, x_n]$, including transaction amount, transaction time, user historical behavior, account balance, etc.
Feature extraction module:
Convolutional neural network (CNN): Used to extract spatial features of transaction data. CNN automatically learns local features through convolutional layers and pooling layers, and can capture the relationship between different dimensions in transaction data (such as the relationship between income and expenditure, assets and liabilities).
Long short-term memory network (LSTM): Used to capture the time series characteristics of transaction data. LSTM can effectively process long-term dependencies in time series data, and through its gating mechanism, it can identify the time change pattern of transaction behavior.
Graph neural network (GNN): Used to model the relationship structure in transaction data. GNN can capture the transfer relationship between user accounts and abnormal patterns in transaction networks through graph convolution operations.
Feature fusion module: Fusion of features extracted by CNN, LSTM and GNN to form a richer feature representation. The fusion method can be simple concatenation, weighted average, or further learning the relationship between features through a deep network.
Deep Belief Network (DBN): As an unsupervised learning model, DBN learns high-order feature representations of data by stacking multiple restricted Boltzmann machine (RBM) layers. DBN can automatically learn potential fraud features from a large amount of unlabeled data and provide more discriminative feature representations for subsequent classification.
Classifier: Use a deep neural network (such as a multi-layer perceptron) as a classifier to map the fused feature vector to a fraud label (0 or 1). The output layer of the classifier uses a sigmoid function to output the probability that the transaction is fraudulent.
Quantum Optimization Module: Introduce quantum optimization algorithms such as quantum gradient descent (QGD) or quantum annealing algorithms during model training. Quantum optimization algorithms use the superposition and entanglement characteristics of quantum bits to accelerate the optimization process of model parameters, improve training efficiency and model generalization ability.
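Below is a rough PyTorch skeleton of the structure listed above. It omits the DBN pre-training and the quantum optimization module (which require specialized tooling) and uses a single linear graph-convolution layer as the GNN branch; every layer size and the toy inputs are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HybridFraudModel(nn.Module):
    """CNN + LSTM + GNN-style branches, feature fusion, MLP classifier."""
    def __init__(self, n_feat=8, hidden=16):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv1d(n_feat, hidden, 3, padding=1),
                                 nn.ReLU(), nn.AdaptiveAvgPool1d(1))
        self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)
        self.gnn_w = nn.Linear(n_feat, hidden)      # one graph-conv layer
        self.classifier = nn.Sequential(nn.Linear(3 * hidden, hidden),
                                        nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, seq, A_hat, node_x, node_idx):
        # seq: (batch, seq_len, n_feat); A_hat: normalized adjacency matrix;
        # node_x: (n_nodes, n_feat); node_idx: graph node of each sample.
        z_cnn = self.cnn(seq.transpose(1, 2)).squeeze(-1)   # spatial features
        _, (h_n, _) = self.lstm(seq)
        z_lstm = h_n[-1]                                    # temporal features
        z_gnn = torch.relu(A_hat @ self.gnn_w(node_x))[node_idx]  # relational
        z = torch.cat([z_cnn, z_lstm, z_gnn], dim=1)        # feature fusion
        return torch.sigmoid(self.classifier(z))            # fraud probability

model = HybridFraudModel()
seq = torch.randn(4, 20, 8)                    # 4 samples, 20-step sequences
A_hat = torch.tensor([[0., 1.], [1., 0.]]) + torch.eye(2)  # crude placeholder
node_x = torch.randn(2, 8)
print(model(seq, A_hat, node_x, torch.tensor([0, 1, 0, 1])).shape)  # (4, 1)
```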
The basic idea of model integration is to combine the prediction results of multiple different models to obtain more accurate predictions than any single model. Common integration methods include weighted averaging, voting mechanisms, and stacking. Assume we have $M$ base learners $f_1, f_2, \ldots, f_M$, and each base learner provides a prediction result $f_m(x)$ on the input $x$, where $m = 1, 2, \ldots, M$. The output of the integrated model can be expressed as:

$\hat{y} = \sum_{m=1}^{M} w_m f_m(x)$  (20)

Among them, $w_m$ is the weight of base learner $f_m$, indicating the importance of that model in the ensemble. A common optimization strategy is to adjust the weights through cross-validation so that the ensemble model performs optimally on the validation set.

For classification problems, ensemble methods can use a voting mechanism, where each base learner gives a category label $\hat{y}_m$ and the final classification result is determined by the most frequent label:

$\hat{y} = \text{mode}(\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_M)$  (21)

Among them, $\text{mode}(\cdot)$ denotes taking the category label with the most occurrences.
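A short NumPy sketch of Eqs. (20)–(21): weighted averaging of probabilistic outputs and majority voting over hard labels. The three base-learner outputs and the weights are hypothetical placeholders.

```python
import numpy as np

# Hypothetical fraud probabilities from three base learners on 5 transactions
p_cnn  = np.array([0.9, 0.2, 0.7, 0.1, 0.6])
p_lstm = np.array([0.8, 0.3, 0.4, 0.2, 0.7])
p_gnn  = np.array([0.7, 0.1, 0.6, 0.3, 0.5])

# Eq. (20): weighted average, with weights tuned e.g. by cross-validation
w = np.array([0.4, 0.35, 0.25])
p_ens = w[0] * p_cnn + w[1] * p_lstm + w[2] * p_gnn
y_weighted = (p_ens > 0.5).astype(int)

# Eq. (21): majority vote over hard labels
labels = np.stack([(p > 0.5).astype(int) for p in (p_cnn, p_lstm, p_gnn)])
y_vote = (labels.sum(axis=0) >= 2).astype(int)   # majority of 3 learners
```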
Optimization methods for model fusion
In practical applications, the outputs of different models often have different scales and distributions. Therefore, a simple weighted average or voting mechanism may not fully exploit the advantages of each model. To achieve better model fusion, we can adopt more complex fusion strategies, such as stacking. The stacking method further optimizes the performance of the ensemble by introducing a meta-learner whose inputs are the outputs of the base learners. By learning the relationships between the outputs of the different base models, it ultimately gives a more accurate prediction, as shown in Fig. 5.
Fig. 5 [Images not available. See PDF.]
Optimized Deep Learning Model Framework
Specifically, the process of the stacking method can be described as:

Stage 1 (base learner training): train multiple base learners and make predictions on the training data to obtain the output of each base learner $f_m(x)$.

Stage 2 (meta-learner training): using the outputs of all base learners as input features, train a meta-learner $g$ to output the final prediction:

$\hat{y} = g(f_1(x), f_2(x), \ldots, f_M(x))$  (22)

The meta-learner can be any regression or classification algorithm; common choices include linear regression and support vector machines (SVM).
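The two-stage stacking procedure can be prototyped with scikit-learn's StackingClassifier, as sketched below; the random forest and SVM base learners are simple classical stand-ins for the paper's CNN/LSTM/GNN branches, and the synthetic data is a placeholder.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for transaction features
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Stage 1: base learners; Stage 2: logistic-regression meta-learner
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),
    cv=5,  # out-of-fold base predictions feed the meta-learner
)
stack.fit(X_tr, y_tr)
print("held-out accuracy:", stack.score(X_te, y_te))
```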
On the basis of model integration, we also adopted quantum optimization algorithms to optimize the parameters of the model to ensure optimal performance within a limited training time. Traditional optimization methods such as gradient descent, although they perform well in many tasks, are prone to getting stuck in local optima in high-dimensional, complex search spaces and have high computational costs. Quantum optimization algorithms, utilizing the superposition and entanglement properties of quantum mechanics, can explore multiple solutions in parallel in the search space, thereby improving optimization efficiency.
A common quantum optimization method is the Quantum Approximate Optimization Algorithm (QAOA). The goal of QAOA is to optimize an objective function $C(z)$, where $z$ is the optimization variable, by exploring the solution space through the evolution of quantum states. The mathematical expression for QAOA is:

$|\psi(\gamma, \beta)\rangle = U(B, \beta_p) U(C, \gamma_p) \cdots U(B, \beta_1) U(C, \gamma_1) |\psi_0\rangle$  (23)

Among them, $|\psi_0\rangle$ is the initial quantum state, $U(C, \gamma)$ and $U(B, \beta)$ are quantum operators built from two types of quantum gates (the cost unitary and the mixer unitary), and the parameters $\gamma$ and $\beta$ control the evolution of the quantum state. The optimization process continuously adjusts $\gamma$ and $\beta$ to optimize the expectation of the objective function in the final quantum state, $\langle \psi(\gamma, \beta) | C | \psi(\gamma, \beta) \rangle$.
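For intuition, the following NumPy statevector simulation runs a single-layer (p = 1) instance of Eq. (23) on a toy two-node MaxCut problem; here the outer loop maximizes the expectation of C (minimization is identical up to a sign), and a grid search stands in for the classical parameter optimizer. This is an illustration only, not the paper's actual quantum module.

```python
import numpy as np
from scipy.linalg import expm

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

C = 0.5 * (np.eye(4) - np.kron(Z, Z))      # cost Hamiltonian (MaxCut, 2 nodes)
B = np.kron(X, I2) + np.kron(I2, X)        # mixer Hamiltonian

def expectation(gamma, beta):
    psi = np.ones(4) / 2.0                  # |psi_0> = |+>|+>
    psi = expm(-1j * gamma * C) @ psi       # cost unitary U(C, gamma)
    psi = expm(-1j * beta * B) @ psi        # mixer unitary U(B, beta)
    return np.real(psi.conj() @ C @ psi)    # <psi|C|psi>

# Classical outer loop over (gamma, beta): a simple grid search
grid = np.linspace(0, np.pi, 60)
g_best, b_best = max(((g, b) for g in grid for b in grid),
                     key=lambda gb: expectation(*gb))
print(f"best <C> = {expectation(g_best, b_best):.3f} "
      f"at gamma={g_best:.2f}, beta={b_best:.2f}")
```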
By combining QAOA with model integration methods, we can efficiently adjust the hyperparameters of multiple base learners to ensure that the integrated model achieves optimal performance within a given time. The introduction of quantum optimization algorithms enables us to reduce computational complexity and accelerate the optimization process when facing large-scale, high-dimensional complex models, thereby improving the overall performance of the model.
Combining the above model integration and quantum optimization strategy, our final optimization process can be described through the following steps:
Model training and integration: Firstly, train multiple base learners (CNN, LSTM, GNN) and fuse their outputs through weighted averaging, voting, or stacking methods.
Quantum optimization: use quantum optimization algorithms to optimize the parameters of the ensemble model so as to minimize the objective function (such as the classification or regression error):

$\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} L(y_i, \hat{y}_i; \theta)$  (24)

Among them, $L(y_i, \hat{y}_i; \theta)$ represents the loss function of the $i$-th sample, $y_i$ is the true label, $\hat{y}_i$ is the predicted value, and $\theta$ is the parameter vector to be optimized.
Through this series of optimization methods, we can effectively improve the performance of the model and ensure that ideal results can be obtained in various scenarios.
Experiment and result analysis
Dataset characteristics
The dataset used in this study is provided by a financial platform and contains a large volume of transaction records and user behavior data. The dataset is designed to address the problem of transaction fraud detection, with the goal of identifying and preventing fraudulent behavior in financial transactions. Key characteristics of the dataset are as follows:
Data volume
The dataset consists of 500,000 samples, where each sample represents a financial transaction. The dataset is highly imbalanced, with approximately 98% of the samples being normal transactions and only 2% being fraudulent transactions. This imbalance poses a significant challenge for model training and evaluation, as traditional classification algorithms may fail to adequately detect the minority class (fraudulent transactions).
Data fields
The dataset includes multiple features such as user transaction records, account information, transaction time, transaction amount, and transaction type. These features are summarized in Table 1. Notably, features such as transaction amount and account balance are numerical, while features like transaction type and device information are categorical.
Table 1. Dataset field definitions

| Field name | Description |
|---|---|
| User ID | Unique identification number for each user |
| Transaction time | The specific time of the transaction, accurate to the minute |
| Transaction amount | The amount of each transaction, in US dollars |
| Transaction type | The transaction method, such as transfer, purchase, or withdrawal |
| Device information | The device used for the transaction, such as device ID and operating system version |
| IP address | The IP address that initiated the transaction |
| Account balance | The current balance in the user account |
| Transaction location | The transaction location inferred from the geographical location of the IP address |
| Is fraudulent | Label field indicating whether the transaction is fraudulent (1 = fraud, 0 = normal) |
Data distribution
The dataset spans a wide range of transaction types and user behaviors. For example, transaction amounts vary significantly, ranging from a few hundred to tens of thousands of dollars. Additionally, transaction times are recorded at the minute level, allowing for temporal analysis of transaction patterns.
Geographical and temporal coverage
The dataset covers transactions from multiple regions, with transaction locations calculated based on IP addresses. This geographical information helps in identifying location-based patterns of fraudulent behavior. Temporally, the dataset spans several months, providing insights into seasonal and time-based variations in transaction behavior.
Data preprocessing steps and experimental settings
As described above, the experiments use the financial platform dataset summarized in Table 1, whose 500,000 transaction records fall into two main categories: normal (non-fraudulent) transactions and fraudulent transactions conducted by criminals. Because fraudulent transactions account for only about 2% of the samples, the dataset is highly imbalanced, which can leave traditional classification algorithms with insufficient detection ability for the minority class. We therefore adopted specific strategies during the experiments to address this class imbalance.
In order to ensure the fairness and effectiveness of the experimental results, we conducted a comprehensive preprocessing of the data, including the following steps:
Data cleaning:
Missing-value handling: for certain fields, such as transaction location and device information, some records are partially missing. We use median imputation to fill missing values in continuous variables, or delete records containing missing values, to ensure data integrity.
Outlier handling: for numerical features such as transaction amounts, we treated extreme values with the IQR (interquartile range) method to remove obvious outliers. (Both cleaning steps, together with the normalization below, are sketched in the code example after this list.)
Feature selection:
Based on data analysis and domain knowledge, we selected features important for detecting transaction fraud, such as transaction amount, transaction type, account balance, and transaction time. Features unrelated to the problem, such as user ID and device ID, were removed to avoid information leakage and improve training efficiency.
Correlation analysis and stepwise regression were then used to further screen the key features affecting the prediction of fraudulent transactions, ensuring that the model focuses on important information.
Data normalization:
Due to the different dimensions and scales of the features in the dataset (transaction amounts range from a few hundred to several thousand dollars, while account balances may reach tens of thousands or more), we normalized the numerical features. We opted for Min–Max scaling over other techniques: financial data often has features with vastly different scales (e.g., transaction amounts vs. account balances), and Min–Max scaling maps them into a common [0, 1] range. Unlike Z-score normalization, it does not assume a Gaussian distribution, making it more suitable for non-normally distributed financial data. Because extreme outliers were already removed in the cleaning step, the scaling also preserves the shape of the original distributions, which enhances training robustness.
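The sketch below walks through the preprocessing steps just described (median imputation, IQR outlier removal, and Min–Max scaling) on a tiny hypothetical transaction frame; the column names are illustrative, not the platform's actual schema.

```python
# Sketch of the preprocessing steps above on a hypothetical transactions
# frame: median imputation, IQR outlier removal, and Min-Max scaling.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({
    "amount": [120.0, 95.5, None, 40000.0, 310.0, 88.0],
    "balance": [5000.0, 120.0, 760.0, 98000.0, None, 430.0],
})

# 1. Median imputation for continuous variables.
for col in ["amount", "balance"]:
    df[col] = df[col].fillna(df[col].median())

# 2. IQR rule: drop rows outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] on amount.
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()

# 3. Min-Max scaling of the remaining numerical features into [0, 1].
df[["amount", "balance"]] = MinMaxScaler().fit_transform(df[["amount", "balance"]])
print(df)
```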
In order to fairly compare different machine learning models, we set the following experimental parameters and applied standard training and testing segmentation strategies.
Data segmentation
The dataset is divided into a training set (80%) and a test set (20%), where the training set is used for model training and the test set for evaluating model performance. To keep the data distribution consistent, the data are split in chronological order, and the proportion of fraudulent transactions in the test set is kept the same as in the training set.
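A minimal sketch of such a chronological split is shown below, assuming a frame with illustrative `time` and `is_fraud` columns; each class is split along its own timeline so the test set approximately keeps the training set's fraud ratio.

```python
# Sketch of a chronological 80/20 split that preserves the fraud ratio by
# splitting each class along its own timeline. Toy frame; columns are
# illustrative, not the platform's actual schema.
import pandas as pd

df = pd.DataFrame({
    "time": pd.date_range("2024-01-01", periods=20, freq="D"),
    "is_fraud": [0] * 16 + [1] * 4,
}).sample(frac=1, random_state=0)      # shuffle to mimic raw data

def chrono_split(frame, test_frac=0.2):
    train_parts, test_parts = [], []
    for _, grp in frame.sort_values("time").groupby("is_fraud"):
        cut = int(round(len(grp) * (1 - test_frac)))
        train_parts.append(grp.iloc[:cut])   # earliest records train
        test_parts.append(grp.iloc[cut:])    # latest records test
    return pd.concat(train_parts), pd.concat(test_parts)

train, test = chrono_split(df)
print(len(train), len(test), train["is_fraud"].mean(), test["is_fraud"].mean())
```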
Model selection
We experimented with various mainstream classification models, including logistic regression, decision tree, support vector machine (SVM), random forest, gradient boosting machine (GBM), and deep learning models such as multi-layer perceptron (MLP). Among them, deep learning models require hyperparameter tuning to improve their performance.
Class imbalance handling
In the realm of financial fraud detection, datasets often present a significant challenge due to class imbalance, where fraudulent transactions make up merely a small fraction, approximately 2%, of the entire dataset. This imbalance creates a predicament where models are inclined to favor the majority class, namely normal transactions, leading to a subpar identification rate of the critical minority class, which are the fraudulent transactions. The inherent bias in traditional classification algorithms exacerbates this issue, as they can achieve a deceptively high accuracy by predominantly predicting the majority class, but this comes at the cost of a dismally low recall rate for the minority class. For instance, a model might boast an accuracy of 98%, yet it could be identifying a mere 50% of the actual fraudulent transactions. Moreover, the limited number of minority class samples poses a risk of overfitting, where the model performs exceptionally well on the training data but falters when faced with unseen test data.
To surmount these challenges, we implemented the Synthetic Minority Over-sampling Technique (SMOTE), which proves to be a highly effective solution for class imbalance by generating synthetic samples for the minority class. SMOTE operates by identifying the nearest neighbors of each minority class sample in the feature space and creating new samples through interpolation between the features of each minority sample and its selected neighbors. This process not only augments the number of minority samples but also enriches their representation in the feature space, thereby reducing the model's reliance on specific patterns and diminishing the risk of overfitting.
Our experimental results underscore the remarkable impact of SMOTE on enhancing model performance. The recall rate for identifying fraudulent transactions experienced a substantial increase from 60% to 86.5% post-SMOTE application, reflecting a significant improvement in the model’s ability to detect actual fraud cases. Additionally, SMOTE contributed to a more stabilized performance of the model on the test set, effectively reducing overfitting and ensuring consistent results across different datasets. As depicted in Table 2, SMOTE demonstrates its efficacy by balancing the dataset through the synthesis of new minority class samples, leading to a more uniform distribution of transaction classes. This balanced distribution is instrumental in training a model that not only performs well in terms of accuracy but also excels in identifying the crucial minority class of fraudulent transactions, thereby addressing the class imbalance issue in fraud detection datasets.
Table 2. SMOTE performance comparison on a fraud detection dataset
Metric | Without SMOTE | With SMOTE |
|---|---|---|
Precision (%) | 82.5 | 88.7 |
Recall (%) | 60.0 | 86.5 |
F1 Score (%) | 68.0 | 87.6 |
ROC-AUC | 0.78 | 0.88 |
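As a concrete sketch of the resampling step, the snippet below applies SMOTE from the imbalanced-learn library to synthetic data shaped like our problem (about 2% fraud); it shows only the resampling call, not the full experimental pipeline.

```python
# Hedged sketch of the SMOTE step using imbalanced-learn on synthetic data
# with roughly the 98:2 class ratio of our dataset.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=50_000, n_features=10, weights=[0.98, 0.02], random_state=42
)
print("before:", Counter(y))             # ~98% normal, ~2% fraud

X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after:", Counter(y_res))          # classes balanced 1:1 by default
```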
Hyperparameter tuning
For each model, we conducted grid search and cross validation to select the optimal combination of hyperparameters. Especially for deep learning models, we adopted the Adam optimizer and learning rate decay strategy to achieve the best training results. The specific parameter selections are shown in Table 3.
Table 3. Model related parameter settings
Parameter name | Set value |
|---|---|
Data split ratio | 80% training set, 20% test set |
Fraudulent transaction ratio | 2% |
Feature selection method | Correlation analysis and stepwise regression |
Normalization method | Min–Max scaling |
Undersampling method | Random undersampling |
Oversampling method | SMOTE |
Number of training samples | 400,000 |
Number of test samples | 100,000 |
Cross-validation | Fivefold cross-validation |
Models used | Logistic regression, decision tree, SVM, random forest, GBM, MLP |
Hyperparameter tuning method | Grid search with cross-validation |
Optimizer | Adam (for deep learning models) |
Learning rate | 0.001 |
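To make the tuning protocol concrete, the sketch below runs a grid search with fivefold cross-validation on a GBM over a hypothetical parameter grid; the deep models were instead trained with Adam and learning-rate decay, which this scikit-learn sketch does not capture.

```python
# Illustrative grid search with fivefold cross-validation, mirroring the
# tuning protocol above. Parameter grid values are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=2_000, weights=[0.98, 0.02], random_state=0)

grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "learning_rate": [0.01, 0.1]},
    scoring="f1",          # F1 suits the imbalanced fraud class
    cv=5,                  # fivefold cross-validation as in Table 3
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```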
Through the above experimental setup, we ensured the fairness of the experiment and the credibility of the results, providing a reliable foundation for subsequent model performance evaluation. In the process of model training and evaluation, we will focus on the ability of each model to handle class imbalance problems, especially its ability to identify minority classes (fraudulent transactions).
Experimental design and evaluation indicators
To evaluate the performance of our proposed hybrid model and quantum optimization approach, we conducted a comprehensive comparison against several traditional machine learning models, deep learning models, and hybrid models. The comparison aims to demonstrate the advantages of our proposed method in terms of accuracy, training efficiency, and robustness.
We started by experimenting with traditional machine learning models, which rely on handcrafted features and statistical methods. These included Logistic Regression, a linear model predicting fraud probabilities; Decision Tree, splitting data based on feature importance; Support Vector Machine (SVM), finding optimal hyperplanes to separate classes; Random Forest, an ensemble of decision trees; and Gradient Boosting Machine (GBM), sequentially building trees to correct errors. These models achieved moderate performance but struggled with the complexity and high dimensionality of financial transaction data. The best-performing traditional model was GBM, achieving a Precision of 84.1% and an F1-Score of 83.3%.
Moving on to deep learning models, which automatically learn complex patterns from raw data, we experimented with a Multi-Layer Perceptron (MLP), a feedforward neural network; Convolutional Neural Network (CNN), extracting spatial features; Long Short-Term Memory Network (LSTM), capturing temporal dependencies; and Graph Neural Network (GNN), modeling relationships between transactions and users. Deep learning models outperformed traditional models due to their ability to automatically learn complex patterns. The GNN model performed the best among standalone deep learning models, achieving a Precision of 87.0% and an F1-Score of 86.1%.
We also explored hybrid models that combine the strengths of different architectures to improve performance. These included CNN-LSTM, combining CNN for spatial feature extraction and LSTM for temporal dependency modeling; GNN-LSTM, combining GNN for relationship modeling and LSTM for temporal dependency modeling; and our proposed hybrid model integrating CNN, LSTM, GNN, and DBN with quantum optimization. Hybrid models showed significant improvements. The GNN-LSTM hybrid model achieved a Precision of 87.3% and an F1-Score of 86.5%.
Our proposed hybrid model, which integrates CNN, LSTM, GNN, and DBN with quantum optimization, demonstrated superior performance. It achieved the highest Precision of 88.7%, Recall of 86.5%, F1-Score of 87.6%, and ROC-AUC of 0.88. This highlights the effectiveness of combining multi-dimensional feature extraction with quantum optimization.
The performance comparison is summarized in Table 4, which includes Precision, Recall, F1-Score, and ROC-AUC:
Table 4. Performance Comparison of Different Models in Financial Fraud Detection
Model type | Precision (%) | Recall (%) | F1-score (%) | ROC-AUC |
|---|---|---|---|---|
Logistic Regression | 78.2 | 75.3 | 76.7 | 0.77 |
Decision Tree | 80.5 | 78.1 | 79.3 | 0.79 |
SVM | 81.2 | 79.5 | 80.3 | 0.80 |
Random Forest | 83.4 | 81.7 | 82.5 | 0.82 |
GBM | 84.1 | 82.6 | 83.3 | 0.83 |
MLP | 85.0 | 83.2 | 84.1 | 0.84 |
CNN | 85.2 | 83.5 | 84.3 | 0.84 |
LSTM | 86.1 | 84.7 | 85.4 | 0.85 |
GNN | 87.0 | 85.3 | 86.1 | 0.86 |
CNN-LSTM | 86.8 | 85.0 | 85.9 | 0.86 |
GNN-LSTM | 87.3 | 85.7 | 86.5 | 0.87 |
Proposed Hybrid Model | 88.7 | 86.5 | 87.6 | 0.88 |
The results clearly indicate that our proposed hybrid model significantly outperforms both traditional and standalone deep learning models. The integration of CNN, LSTM, and GNN allows the model to capture spatial, temporal, and relational features comprehensively. The addition of quantum optimization further enhances training efficiency and model generalization, making it highly suitable for real-world financial fraud detection applications.
Evaluation indicators
Relying on plain accuracy as an evaluation metric is not reasonable for severely imbalanced datasets, as accuracy often fails to reflect the model's predictive ability on minority classes such as fraudulent transactions. We therefore adopted the following comprehensive evaluation indicators:
Precision: measures the proportion of samples predicted as fraudulent by the model that are actually fraudulent. The mathematical formula is:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{25}$$

Here, $TP$ denotes true positives (the number of samples correctly predicted as fraudulent transactions), and $FP$ denotes false positives (the number of normal transactions incorrectly predicted as fraudulent).
Recall: measures the proportion of all true fraudulent transactions that the model correctly identifies. The mathematical formula is:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{26}$$

Here, $FN$ denotes false negatives (the number of samples that are truly fraudulent transactions but predicted as normal).
F1-score: the harmonic mean of precision and recall, used to comprehensively evaluate model performance, especially on imbalanced data. The mathematical formula is:

$$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{27}$$
F1-score strikes a balance between precision and recall, avoiding the bias that may arise from relying solely on one metric.
ROC-AUC (Receiver Operating Characteristic—Area Under the Curve): the area under the receiver operating characteristic curve, which measures the overall performance of the model across decision thresholds. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR); the larger the AUC, the better the model. The formula is:

$$\text{AUC} = \int_0^1 TPR \; d(FPR) \tag{28}$$

Here, $TPR = TP/(TP+FN)$ is the true positive rate and $FPR = FP/(FP+TN)$ is the false positive rate.
Through these comprehensive evaluation indicators, we can more accurately measure the performance of the model on imbalanced datasets, especially its predictive ability for minority classes such as fraudulent transactions.
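For reference, Eqs. (25)–(28) can be computed directly with scikit-learn, as in the short example below on hypothetical predictions.

```python
# Computing Eqs. (25)-(28) with scikit-learn on hypothetical predictions.
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]    # 1 = fraud
y_pred  = [0, 0, 1, 0, 0, 1, 1, 0, 1, 0]    # hard labels from the model
y_score = [0.1, 0.2, 0.9, 0.4, 0.3, 0.8, 0.7, 0.1, 0.95, 0.2]  # fraud scores

print("Precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))      # TP / (TP + FN)
print("F1       :", f1_score(y_true, y_pred))          # Eq. (27)
print("ROC-AUC  :", roc_auc_score(y_true, y_score))    # area under TPR-FPR curve
```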
Performance comparison of various models
We trained single models based on CNN, LSTM, GNN, and DBN, and compared them with traditional rule-based methods. The experimental results show that the deep learning models outperform traditional methods in financial fraud detection.
Analysis of the effect of fusion model
By integrating CNN, LSTM, and GNN, our hybrid model shows excellent performance across multiple evaluation metrics, effectively improving the accuracy and robustness of fraud detection. The specific metrics are shown in Table 5 below.
Table 5. Performance comparison of different models
Model Type | Precision (%) | Recall (%) | F1 Score (%) | ROC-AUC |
|---|---|---|---|---|
CNN | 85.2 | 83.5 | 84.3 | 0.84 |
LSTM | 86.1 | 84.7 | 85.4 | 0.85 |
GNN | 87.0 | 85.3 | 86.1 | 0.86 |
Hybrid Model | 88.7 | 86.5 | 87.6 | 0.88 |
The hybrid model, integrating CNN, LSTM, and GNN, demonstrates significant superiority in financial fraud detection compared to individual models. This performance enhancement arises because the hybrid model combines the strengths of each individual network: CNN effectively extracts spatial features from transaction data, capturing relationships between different dimensions; LSTM models temporal dependencies, identifying patterns in the sequence of transactions over time; and GNN analyzes the relational structure, uncovering anomalies in the network of transactions and users. By fusing the outputs of these networks, the hybrid model creates a richer feature representation that more accurately identifies fraudulent behavior. It achieves the highest precision at 88.7%, effectively minimizing false positives and ensuring that alerts are highly reliable. The model also attains the best recall rate of 86.5%, capturing a greater proportion of actual fraudulent activities. With an F1 score of 87.6% and an ROC-AUC of 0.88, the hybrid model shows an optimal balance between precision and recall, as well as superior overall classification performance. This multi-model fusion strategy not only improves detection accuracy but also enhances generalization, making the model more robust and reliable in identifying complex fraud patterns. This indicates that the hybrid model is highly effective in identifying fraud while maintaining robustness and accuracy, making it a highly promising solution for practical fraud detection applications.
Quantum optimization algorithm improves model performance
Quantum optimization algorithms show great potential for improving the performance of deep learning models, especially in financial fraud detection tasks. Combined with the hybrid optimization strategy spanning the deep belief network (DBN), convolutional neural network (CNN), long short-term memory network (LSTM), and graph neural network (GNN), the quantum optimization algorithm significantly improves training speed, model accuracy, and generalization ability. It not only enhances global optimization capability but also reduces overfitting through several distinctive mechanisms.

During training, quantum optimization introduces controlled noise that acts as a dynamic regularizer, preventing the model from relying too heavily on specific patterns in the training data and thus improving generalization to unseen data. Unlike classical optimization methods, which are prone to falling into local optima, quantum algorithms exploit quantum tunneling to escape local minima, enabling the model to explore a wider solution space and find a more robust global optimum, which reduces the risk of overfitting to local features. Quantum annealing maps the model's energy landscape to a quantum Hamiltonian, allowing global optimization while maintaining stability and helping the model avoid converging to suboptimal solutions that may overfit noise or outliers in the training data.

In addition, thanks to quantum superposition, quantum algorithms can explore multiple parameter configurations simultaneously; this parallel exploration reduces the likelihood that the model settles into a narrow local optimum that represents an overfitting state. Together, these quantum properties enhance the generalization power of the model, making it more effective in real-world financial fraud detection, where data are often noisy and unbalanced.
The following are the experimental results of multiple performance improvement indicators based on quantum optimization algorithms, further demonstrating the powerful ability of quantum algorithms in financial fraud detection. The results are shown in Table 6.
Table 6. Comparison of Training Time between Quantum Optimization Algorithm and Classical Optimization Algorithm in Financial Fraud Detection
Algorithm type | Training time (seconds) | Convergence time (seconds) | Performance improvement (%) | Precision (%) | Recall (%) | F1 Score (%) | ROC-AUC |
|---|---|---|---|---|---|---|---|
Classical Optimization | 3500 | 2800 | 0 | 82.5 | 80.3 | 81.0 | 0.81 |
Quantum Optimization | 1200 | 800 | 65 | 88.7 | 86.5 | 88.0 | 0.88 |
There is a significant difference in training time and convergence time between classical and quantum optimization algorithms in financial fraud detection tasks: the quantum optimization algorithm reduces training time substantially, improving efficiency by 65% compared with the classical algorithm. The comparison of training time and convergence performance of the different algorithm models is shown in Fig. 6.
Fig. 6 [Images not available. See PDF.]
Comparison of Training Time and Convergence Performance of Different Algorithm Models
This article also compared and analyzed the performance of different algorithms in detection accuracy, and the results are shown in Table 7.
Table 7. Comparison of Quantum Optimization Algorithm and Classical Optimization Algorithm in Detection Accuracy
Algorithm type | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|
Classic optimization | 82.5 | 80.3 | 84.0 | 81.0 |
Quantum optimization | 88.7 | 86.5 | 90.2 | 88.0 |
From the results, quantum optimization algorithms significantly outperform classical optimization algorithms in accuracy, precision, recall, and F1 score. The quantum optimization algorithm realizes parallel exploration of the solution space through quantum superposition and entanglement, effectively avoids local optima, and converges to the global optimum faster. Quantum annealing achieves dynamic regularization through quantum noise injection, reducing overfitting and improving the generalization ability of the model. In addition, the quantum optimization algorithm efficiently balances exploration and exploitation during training, enabling the model to learn effectively from both labeled and unlabeled data. These characteristics allow the quantum optimization algorithm to significantly improve the accuracy, stability, and efficiency of the model on high-dimensional, complex financial fraud detection tasks, and thus to outperform the classical optimization algorithm in accuracy, precision, and recall. The comparison of the different algorithm models in accuracy, recall, and precision is shown in Fig. 7.
Fig. 7 [Images not available. See PDF.]
Comparison of different algorithm models in accuracy, recall, and precision
In order to further observe the performance on different datasets, this paper compared the training performance of quantum optimization algorithm and classical optimization algorithm on large-scale datasets, and the results are shown in Table 8.
Table 8. Comparison of Training Performance between Quantum Optimization Algorithm and Classical Optimization Algorithm on Large Scale Dataset
Algorithm type | Training time (seconds) | Model size (MB) | Training data volume (10,000 pieces) | Accuracy (%) |
|---|---|---|---|---|
Classic optimization | 8000 | 450 | 500 | 79.5 |
Quantum optimization | 3000 | 450 | 500 | 85.3 |
The results in Table 8 demonstrate that quantum optimization algorithm can significantly reduce training time when processing large-scale datasets, and the training effect is much better than classical optimization methods under the same dataset and model size.
This article also conducted performance analysis and comparison between quantum optimization algorithm and classical optimization algorithm in overfitting control, and the results are shown in Table 9.
Table 9. Comparison of Quantum Optimization Algorithm and Classical Optimization Algorithm in Overfitting Control
Algorithm type | Training error | Test error | Overfitting control (train–test error gap, %) |
|---|---|---|---|
Classic optimization | 0.035 | 0.045 | 22.2 |
Quantum optimization | 0.028 | 0.035 | 20.0 |
The results in Table 9 show that the advantage of the quantum optimization algorithm in overfitting control is reflected in the small gap between its training-set and test-set errors. Compared with classical optimization algorithms, quantum optimization effectively reduces overfitting during training and improves generalization ability through global optimization. The comparison results of the error analysis of the different algorithm models are shown in Fig. 8.
Fig. 8 [Images not available. See PDF.]
Comparison Results of Error Analysis of Different Algorithm Models
Finally, this article also compared and analyzed the performance of the quantum optimization algorithm designed in this article in different scenarios of financial analysis. The results are shown in Table 10.
Table 10. Performance Comparison of Different Algorithm Models in Different Application Scenarios
Task type | Accuracy of classic optimization algorithms (%) | Accuracy of quantum optimization algorithm (%) | Performance improvement ratio (%) |
|---|---|---|---|
Financial fraud detection | 82.5 | 88.7 | 7.5 |
Credit default prediction | 75.0 | 80.3 | 7.1 |
Fraudulent transaction identification | 85.4 | 89.8 | 5.2 |
The results in Table 10 demonstrate the accuracy gains of the quantum optimization algorithm across multiple tasks. It outperforms the classical optimization algorithm in every task, with the largest performance improvement in financial fraud detection.
Overall, the quantum optimization algorithm designed in this paper significantly improves the training efficiency and performance of deep learning models in financial fraud detection through efficient global search and parallel processing. It better overcomes the local-optimum problem faced by classical optimization algorithms during model training and, especially on large-scale datasets, accelerates the training process and reduces training time. Specifically, through the superposition and entanglement of quantum bits, quantum optimization can compute in parallel across the high-dimensional space of the optimization problem, achieving a more efficient global search than classical methods. In this way, the model converges faster, avoids overfitting during training, and gains generalization ability.
Parameter sensitivity and model performance
To thoroughly examine the results of the sensitivity analysis and their effect on the model's predictive capability, Table 11 illustrates the interplay between parameter settings and performance benchmarks, including accuracy and F1 scores. These results not only substantiate the generalizability of the model but also furnish actionable insights for subsequent iterations and applications in financial fraud detection.
Table 11. Sensitivity Analysis of Learning Rate
Learning Rate | Accuracy (%) | Precision | Recall | F1-Score |
|---|---|---|---|---|
0.0001 | 85 | 0.83 | 0.88 | 0.85 |
0.001 | 88 | 0.87 | 0.90 | 0.88 |
0.01 | 84 | 0.81 | 0.86 | 0.83 |
The table shows how the model's performance varies with the learning rate. At the lowest rate of 0.0001, the model reaches an accuracy of 85%, which is respectable but suggests slow learning. Increasing the learning rate to 0.001 raises accuracy to 88%, indicating that this rate learns from the data more effectively without overfitting. Precision and recall, which reflect the model's ability to correctly identify positive instances, also peak at this rate, at 0.87 and 0.90 respectively. The F1-score, balancing precision and recall, attains its maximum of 0.88 at a learning rate of 0.001, signifying an optimal equilibrium between the two metrics. At the higher rate of 0.01, all metrics decline, with accuracy, precision, recall, and F1-score dropping to 84%, 0.81, 0.86, and 0.83, respectively, suggesting unstable or suboptimal convergence at this rate. In conclusion, a learning rate of 0.001 is optimal for the model under consideration, providing the best balance of accuracy, precision, recall, and F1-score; this balance is critical for effective hyperparameter selection and model generalization.
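A sketch of such a learning-rate sweep is given below, using scikit-learn's MLPClassifier (which trains with Adam) on synthetic data as a stand-in for the hybrid model; the figures in Table 11 come from the full model, not this toy.

```python
# Sketch of the learning-rate sweep behind Table 11, with MLPClassifier
# (Adam optimizer) as a stand-in for the hybrid model. Synthetic data.
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=5_000, weights=[0.9, 0.1], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)

for lr in [0.0001, 0.001, 0.01]:
    clf = MLPClassifier(learning_rate_init=lr, max_iter=300, random_state=1)
    clf.fit(X_tr, y_tr)
    print(f"lr={lr}: F1={f1_score(y_te, clf.predict(X_te)):.3f}")
```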
Results discussion and error analysis
In the financial fraud detection model based on the fusion of the deep belief network (DBN) and quantum optimization algorithms, the hybrid optimization strategy combining the convolutional neural network (CNN), long short-term memory network (LSTM), and graph neural network (GNN) significantly improves model performance, especially on large-scale datasets. However, although quantum optimization has improved performance at multiple levels, a more detailed error analysis and discussion are still needed, especially given the complex comparative results across algorithms. The experimental results show that quantum optimization has clear advantages in training time and convergence time: it performs an efficient global search during optimization, mainly thanks to the superposition and entanglement of quantum bits, which let the algorithm converge faster and avoid the local optima that commonly trap classical algorithms, as shown in Fig. 9.
Fig. 9 [Images not available. See PDF.]
Trends in Training Time, Accuracy, and Different Data Sizes
In terms of accuracy, precision, recall, and other indicators, the improvement from the quantum optimization algorithm is also significant, as shown in Fig. 9. Its accuracy and F1 score are markedly higher than those of the classical optimization algorithm, with performance gains close to 7–8%. This indicates that quantum optimization also has clear advantages in reducing overfitting and improving generalization: through global optimization, quantum algorithms not only find the optimal solution faster but also help the model avoid overfitting the training data, improving performance on unseen data.
Fig. 10 [Images not available. See PDF.]
Comparison of Training Time and Accuracy for Different Algorithm Models
As Fig. 10 shows, although quantum optimization algorithms outperform classical algorithms on multiple evaluation metrics, the model may still experience slight performance degradation in some extreme cases (such as high data noise or severely imbalanced datasets). On extremely skewed datasets, although quantum optimization exhibits excellent generalization ability, further regularization techniques are needed to ensure its robustness.
Furthermore, this article analyzes how the training time and accuracy of the quantum and classical optimization algorithms change with the training data size (from 10,000 to 100,000 samples). The comparison shows that as the data volume grows, the quantum optimization algorithm's advantage in training time over the classical algorithm widens, and its accuracy also continues to improve.
Economic impact and benefits of the proposed model
Financial fraud imposes a significant economic burden on market participants, including business enterprises and financial institutions. The repercussions of fraud extend beyond direct financial losses to include eroded investor confidence, disrupted market stability, and increased regulatory scrutiny. These factors collectively impact the broader economy by reducing investment levels, raising operational costs due to compliance measures, and potentially triggering systemic risks within the financial sector.
Traditional fraud detection methods, while providing a degree of protection, often fall short in effectively addressing these economic challenges. Their limitations in handling the scale and complexity of modern financial data lead to delayed detection and higher rates of undetected fraudulent activities. This not only results in immediate financial harm but also contributes to long-term economic instability and increased costs for businesses and consumers alike.
The proposed model, which integrates deep belief networks (DBN) with quantum optimization algorithms and employs a hybrid model optimization strategy combining convolutional neural networks (CNN), long short-term memory networks (LSTM), and graph neural networks (GNN), offers a more effective solution. By significantly enhancing detection accuracy and operational efficiency, the model reduces the incidence of fraudulent transactions and the associated economic penalties. The experimental results highlight the model's superior performance, achieving a precision of 88.7% and a recall of 86.5%, which translates to fewer undetected fraud cases and lower financial losses for institutions.
Moreover, the model's ability to process large volumes of data quickly and accurately minimizes the time and resources required for fraud detection. This efficiency gain allows financial institutions to allocate more resources to value-added activities, further contributing to economic growth. The reduction in fraud-related costs and penalties also enhances the competitive position of businesses, fostering a more stable and trustworthy financial market environment.
In summary, the proposed model not only advances the technical capabilities of fraud detection but also delivers substantial economic benefits by reducing the prevalence and impact of financial fraud. This makes it a valuable tool for financial institutions seeking to mitigate economic losses and support sustainable market growth.
Cost analysis
In order to comprehensively evaluate the deployment cost of deep learning models in financial fraud detection, this study calculated the hardware and software costs in detail. By rationally configuring hardware resources and making full use of open source software, this study aims to provide a cost-effective solution for practical applications.
Hardware cost
GPU cost
The training and inference of deep learning models require substantial computing power and usually rely on high-performance GPUs. We selected NVIDIA's RTX 3090, which costs about $1,500 per card. Assuming 8 RTX 3090 GPUs are used for training and inference, the GPU cost is:
8 × 1500 = $12,000.
Server and storage device cost
A medium-configuration server is selected, and the cost of each is about $20,000. Assuming that 2 such servers are needed, the total cost is:
2 × 20,000 = $40,000.
The cost of storage devices depends on the amount of data. Assuming that 5 TB of storage space is required, the cost is about $2,000.
Total hardware cost
12,000 + 40,000 + 2,000 = 54,000 USD.
Software cost
Deep learning framework
Use open source deep learning frameworks (such as TensorFlow, PyTorch), which are completely free, but may require some additional open source tools to support quantum optimization algorithms, which costs about $1,000.
Data processing tools
Use open source data processing tools (such as Pandas, NumPy, etc.), which are also free, but may require some additional open source tools to support data preprocessing and analysis, which costs about $500.
Total software cost:
1000 + 500 = 1500 USD.
Cost summary
Combining the above costs, the total cost of deploying the deep learning model proposed in this paper in financial fraud detection is:
Hardware cost + software cost = 54,000 + 1,500 = 55,500 USD.
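For completeness, the cost arithmetic above can be reproduced in a few lines:

```python
# Recomputing the deployment-cost summary from the figures above (USD).
gpu = 8 * 1_500            # eight RTX 3090 cards
servers = 2 * 20_000       # two medium-configuration servers
storage = 2_000            # 5 TB of storage
software = 1_000 + 500     # quantum-optimization tooling + data tools

print("hardware:", gpu + servers + storage)              # 54,000
print("total   :", gpu + servers + storage + software)   # 55,500
```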
By rationally selecting hardware configuration and making full use of open source software, this study significantly reduced the deployment cost while maintaining high model performance. This cost-effective solution provides a feasible reference for financial institutions in practical applications.
Future research directions and challenges
The proposed method, while demonstrating promising results in financial fraud detection, faces several challenges that warrant further exploration in future research. A primary limitation lies in the current state of quantum hardware, which remains constrained by noise and high error rates. These issues, particularly noise and decoherence, significantly impact the stability and accuracy of quantum optimization algorithms when deployed on real-world hardware. Noise introduces random errors in quantum operations, while decoherence causes quantum states to lose their quantum properties over time, both of which can lead to suboptimal solutions and reduced model performance. This necessitates reliance on simulated quantum computing environments and underscores the need for advancements in quantum error correction techniques to mitigate these hardware limitations.
Additionally, the computational demands of integrating quantum optimization with deep learning models present a significant hurdle. Future research should focus on developing more efficient quantum–classical hybrid algorithms that minimize computational overhead and improve scalability, particularly for handling larger and more complex datasets.
Another critical area for future work involves enhancing the generalizability of the model to other domains, such as healthcare fraud detection, cybersecurity, and IoT anomaly detection. While the proposed method has shown strong performance in financial fraud detection, its adaptability to domain-specific challenges in these areas remains unexplored. Research should investigate strategies to tailor the model to diverse application contexts while maintaining its core strengths in efficiency and accuracy.
Interpretability and explainability also emerge as key concerns, given the “black box” nature of deep learning models. In financial fraud detection, stakeholders and regulatory bodies require transparent and accountable decision-making. Future efforts should integrate explainable AI techniques or design inherently interpretable models to ensure clarity in predictions, thereby fostering trust and facilitating responsible deployment.
Data privacy and security represent additional priorities, especially when dealing with sensitive financial transaction data. Future work should explore privacy-preserving approaches such as federated learning, differential privacy, and cryptographic methods to safeguard user information while complying with regulatory standards and maintaining model performance.
Finally, ethical considerations must be addressed to ensure fairness and mitigate bias in AI-driven systems. This involves establishing robust governance frameworks for ethical AI practices and refining training data to reduce inherent biases. By addressing these challenges, future research can further refine the proposed approach, making it more adaptable, efficient, and responsible in meeting the evolving demands of financial fraud detection and beyond.
Conclusion
This paper proposes a financial fraud detection method based on the fusion of deep belief networks and quantum algorithms, and combines a hybrid model optimization strategy of convolutional neural networks, long short-term memory networks, and graph neural networks. The experimental results show that the proposed method has significant advantages in both accuracy and efficiency. The financial fraud detection model based on the fusion of deep belief networks and quantum optimization algorithms can improve the effectiveness of financial fraud detection at multiple levels by combining a hybrid model optimization strategy of convolutional neural networks (CNN), long short-term memory networks (LSTM), and graph neural networks (GNN). Quantum optimization algorithms have demonstrated excellent performance in optimizing multiple dimensions such as training speed, model convergence speed, accuracy, precision, recall, and generalization ability. Compared with classical optimization algorithms, quantum optimization algorithms have shown particularly outstanding performance in improving training efficiency, reducing overfitting, and enhancing model stability. The introduction of quantum computing enables deep learning models to process large-scale datasets more efficiently, avoiding the problem of local optima that may occur with traditional methods, and providing a more efficient and accurate solution for financial fraud detection.
Despite the promising results achieved by the proposed method, several challenges and limitations remain, offering valuable opportunities for future research. One major constraint is the limited availability of practical quantum hardware, which necessitates the reliance on simulated quantum computing environments. As quantum technology advances, deploying the proposed model on actual quantum processors will be crucial to assessing its real-world performance. However, current quantum systems suffer from noise and high error rates, which may impact model stability and accuracy, highlighting the need for further research into quantum error correction techniques.
Another significant challenge is scalability and computational cost. While quantum optimization enhances training efficiency, the overall computational demand of deep learning models, especially in combination with quantum methods, remains high. Future research should explore more efficient quantum–classical hybrid algorithms to minimize computational overhead and improve scalability for larger datasets, ensuring the model remains feasible for real-world deployment.
Furthermore, generalization to other domains remains an open question. Although the proposed method has demonstrated strong performance in financial fraud detection, its applicability to other areas such as healthcare fraud detection, cybersecurity, and IoT anomaly detection needs further investigation. Future studies should examine the adaptability of the model across different domains and develop strategies to address domain-specific challenges.
A critical issue in fraud detection is interpretability and explainability. Deep learning models, including the proposed hybrid approach, often function as black boxes, making it difficult to explain their decisions. In financial fraud detection, providing clear explanations for predictions is essential for gaining the trust of stakeholders and regulatory bodies. Future work should focus on integrating explainable AI techniques or designing inherently interpretable models to enhance transparency and ensure accountability in decision-making.
Data privacy and security pose significant concerns. Financial transaction data often contains sensitive information, making it imperative to develop privacy-preserving techniques. Future research should explore federated learning, differential privacy, or other cryptographic methods to protect user data while maintaining model performance and compliance with regulatory requirements.
Beyond technical considerations, ethical implications must also be addressed. AI-driven fraud detection systems have the potential to introduce biases, leading to unfair decision-making and potential harm to individuals wrongly flagged as fraudulent. Ensuring fairness, mitigating bias in training data, and establishing ethical AI governance frameworks will be essential for the responsible deployment of quantum-optimized fraud detection models.
In conclusion, while the proposed method represents a significant advancement in financial fraud detection, addressing these limitations will be crucial for its long-term success. Future research should focus on improving quantum hardware integration, enhancing computational efficiency, expanding domain applicability, increasing interpretability, strengthening privacy protections, optimizing real-time processing, ensuring seamless integration, and addressing ethical concerns. By tackling these challenges, the proposed approach can be further refined and adapted to meet the evolving demands of the financial industry and beyond.
Author contributions
Gui Yu: Writing—Original Draft, Methodology, Validation, Supervision, Visualization. Zhenlin Luo: Conceptualization, Methodology, Validation, Supervision, Funding Acquisition, Visualization, Review & Editing.
Funding
This work was supported by 2024 Anhui Province Young Backbone Teachers Domestic Visiting and Training Funding Project. (JNFX2024134).
Data availability
The data are available from the corresponding author on reasonable request.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Chao, W; Yifei, X; Shuai, Y. Aggravating effect: ESG performance and reputational penalty. Financ Res Lett; 2025; 72, [DOI: https://dx.doi.org/10.1016/j.frl.2024.106515] 106515.
2. Albuquerque, B; Martins, MA; Moutinho, N. Stock market effects of corporate malpractices and misconduct: evidence from the short-seller Hindenburg. Financ Res Lett; 2025; 72, [DOI: https://dx.doi.org/10.1016/j.frl.2024.106495] 106495.
3. Chen, Y; Du, M. Financial fraud transaction prediction approach based on global enhanced GCN and bidirectional LSTM. Comput Econ; 2024; [DOI: https://dx.doi.org/10.1007/s10614-024-10791-2]
4. Zhao, D; Wang, Z; Gamborino, SF et al. Polytope fraud theory. Int Rev Financ Anal; 2025; 97, [DOI: https://dx.doi.org/10.1016/j.irfa.2024.103734] 103734.
5. He, D. A multimodal deep neural network-based financial fraud detection model via collaborative awareness of semantic analysis and behavioral modeling. J Circuits Syst Comput; 2024; [DOI: https://dx.doi.org/10.1142/S0218126625500549]
6. Maher, AC; Corsello, MR; Engle, AT et al. Correlates of victim services for fraud and identity theft among victim service providers. J Crim Just; 2024; 95, [DOI: https://dx.doi.org/10.1016/j.jcrimjus.2024.102318] 102318.
7. Smith, TK; Smith, ML. Examining documentation tools for audit and forensic accounting investigations. J Risk Fin Manage; 2024; 17,
8. Jenifer, RITP; Nalayini, P; Sebastin, MG. Deep transfer learning with optimal deep belief network based medical image classification model. Traitement du Signal; 2024; 41,
9. Wenying, X; Juan, H; Fuyou, H et al. Supply chain financial fraud detection based on graph neural network and knowledge graph. Tehnički vjesnik; 2024; 31,
10. Wang, X; Guo, J; Luo, X et al. DyHDGE: dynamic heterogeneous transaction graph embedding for safety-centric fraud detection in financial scenarios. J Safety Sci Resil; 2024; 5,
11. Li, F; Du, H. Research on Fraud Detection Method of Financial Data of Listed Companies Based on HMCRAN. Int J Data Warehouse Min; 2024; 20,
12. Zhao, S; Li, S; Jiang, Y et al. Research on financial fraud identification integrating financial, management, and text indicators. J Wuhan Univ Technol; 2024; 46,
13. Zhao, Y; Liu, R; Xue, J et al. Environmental protection tax law and corporate financial fraud: Evidence from listed firms in China. Int Rev Financ Anal; 2024; 96, [DOI: https://dx.doi.org/10.1016/j.irfa.2024.103537] 103537.
14. Chen, L; Huang, H; Tang, P; Yao, D; Yang, H; Ghadimi, N. Optimal modeling of combined cooling, heating, and power systems using developed African vulture optimization: a case study in watersport complex. Energy Sources Part A Recov Utiliz Environ Effects; 2022; 44,
15. Zeyi, M. Financial fraud detection and prevention: automated approach based on deep learning. J Organiz End User Comput; 2024; 36,
16. Rao, S. Identification and prevention of financial fraud in listed companies. Account Audit Fin.; 2024; 5, 2.
17. Marguerite, D; Paul, W. Profiling consumers who reported mass marketing scams: demographic characteristics and emotional sentiments associated with victimization. Secur J; 2024; 37,
18. Xiao, Z; Xiong, Z; Wang, L et al. Overview of light field image reconstruction and enhancement based on deep learning. Adv Laser Optoelectron; 2024; 61,
19. Han, E; Ghadimi, N. Model identification of proton-exchange membrane fuel cells based on a hybrid convolutional neural network and extreme learning machine optimized by improved honey badger algorithm. Sustain Energy Technol Assess; 2022; 52, 102005.
20. Jiang, H; Peng, C; Ren, D. Supply-chain finance digitalization and corporate financial fraud: evidence from China. Econ Model; 2024; 139, [DOI: https://dx.doi.org/10.1016/j.econmod.2024.106837] 106837.
21. Chen, X; Cai, X. A deep learning based dynamic recognition algorithm for facial local occlusion expressions. J Jilin Univ; 2024; 42,
22. Li, J; Sun, H; Chang, Y et al. Financial fraud identification considering multiple semantic associations of audit elements. J Manage Sci; 2024; 27,
23. Rahmaniar, W; Ramzan, B; Maarif, A. Deep learning and quantum algorithms approach to investigating the feasibility of wormholes: A review. Astron Comput; 2024; 47, [DOI: https://dx.doi.org/10.1016/j.ascom.2024.100802] 100802.
24. Jiang, W; Wang, X; Huang, H; Zhang, D; Ghadimi, N. Optimal economic scheduling of microgrids considering renewable energy sources based on energy hub model using demand response and improved water wave optimization algorithm. J Energy Storage; 2022; 55, [DOI: https://dx.doi.org/10.1016/j.est.2022.105311] 105311.
25. Meng, S; Shi, Z; Li, G et al. A novel deep learning framework for landslide susceptibility assessment using improved deep belief networks with the intelligent optimization algorithm. Comput Geotech; 2024; 167, [DOI: https://dx.doi.org/10.1016/j.compgeo.2024.106106] 106106.
26. Cheng, S; Gu, X; Wang, X. Financial fraud identification based on unbalanced MD&A text data. Moderniz Manage; 2024; 44,
27. Soroor, M; Bijan, BR. Financial fraud detection using graph neural networks: A systematic review. Expert Syst Appl; 2024; 240, 119854.
28. Chen, Y; Li, M; An, X et al. Research on intelligent prediction method of dam deformation based on chaotic cloud quantum bat CNN-GRU. J Harbin Eng Univ; 2024; 45,
29. Cheah, YCP; Yang, Y; Lee, GB. Enhancing financial fraud detection through addressing class imbalance using hybrid SMOTE-GAN techniques. Int J Fin Stud; 2023; 11,
30. Ghiasi, M; Wang, Z; Mehrandezh, M; Jalilian, S; Ghadimi, N. Evolution of smart grids towards the internet of energy: concept and essential components for deep decarbonisation. IET Smart Grid; 2023; 6,
31. Deng, Y; Fu, Z; Roy, K et al. Optimal design of cold-formed steel face-to-face built-up columns through deep belief network and genetic algorithm. Structures; 2023; 56, 104906. [DOI: https://dx.doi.org/10.1016/j.istruc.2023.104906]
32. Bo, G; Cheng, P; Dezhi, K; Xiping, W; Chaodong, L; Mingming, G; Ghadimi, N. Optimum structure of a combined wind/photovoltaic/fuel cell-based on amended dragon fly optimization algorithm: a case study. Energy Sources Part A; 2022; 44,
33. Li, S; Fang, X; Liao, J; Ghadamyari, M; Khayatnezhad, M; Ghadimi, N. Evaluating the efficiency of CCHP systems in Xinjiang Uygur autonomous region: an optimal strategy based on improved mother optimization algorithm. Case Stud Thermal Eng; 2024; 54, [DOI: https://dx.doi.org/10.1016/j.csite.2024.104005] 104005.
34. Dong, L; Li, Y; Liu, D et al. Prediction of protein-ligand binding affinity by a hybrid quantum-classical deep learning algorithm. Adv Quantum Technol; 2023; [DOI: https://dx.doi.org/10.1002/qute.202300107]
35. Tian, P; Yiman, L; Zhizhen, S et al. Hybrid intelligent deep learning model for solar radiation forecasting using optimal variational mode decomposition and evolutionary deep belief network - Online sequential extreme learning machine. J Building Eng; 2023; 76, 107432.
36. Mathappan, N; Elavarasan, S; Sehar, S. Hybrid intelligent intrusion detection system for multiple Wi-Fi attacks in wireless networks using stacked restricted Boltzmann machine and deep belief networks. Concurr Comput Pract Exp; 2023; [DOI: https://dx.doi.org/10.1002/cpe.7769]
37. Alshahrani, H; Gaddah, A; Alnuzaili, E et al. Modified sine cosine optimization with adaptive deep belief network for movie review classification. Intell Autom Soft Comput; 2023; 37,
38. Punitha, A; Geetha, V. Automated climate prediction using pelican optimization based hybrid deep belief network for smart agriculture. Measurement Sensors; 2023; 27, 100624. [DOI: https://dx.doi.org/10.1016/j.measen.2023.100714]
39. Li, L; Xin, X; Tang, Y et al. A product inversion algorithm for vegetation photosynthetically active radiation absorption ratio of Gaofen-1 satellite based on radiation transfer model simulation and deep learning. J Remote Sensing; 2023; 27,
40. Anthony, MLK; Murugan, S. Design of cuckoo search optimization with deep belief network for human activity recognition and classification. Multimed Tools Appl; 2023; 82,
41. Duhayyim, M; Mohamed, H; Alrowais, F et al. Artificial algae optimization with deep belief network enabled ransomware detection in IoT environment. Comput Syst Sci Eng; 2023; 46,
42. Yonbawi, S; Alahmari, S; Raju, B et al. Modeling of sensor enabled irrigation management for intelligent agriculture using hybrid deep belief network. Comput Syst Sci Eng; 2023; 46,
43. Li, P; Burkay, A; Xu, Z et al. Diagnosis for the refrigerant undercharge fault of chiller using deep belief network enhanced extreme learning machine. Sustain Energy Technol Assess; 2023; 55, 102654.
44. Uma, KM; Valarmathi, A. A novel mechanism to recognize heart disease by optimised deep belief network with SVM classification. J Intell Fuzzy Syst; 2023; 44,
45. Motwakel, A; Onazi, AB; Alzahrani, J et al. Convolutional deep belief network based short text classification on Arabic Corpus. Comput Syst Sci Eng; 2022; 45,
46. Omar, A; Abd El-Hafeez, T. Quantum computing and machine learning for Arabic language sentiment classification in social media. Sci Rep; 2023; 13,
47. El Koshiry, A; Eliwa, E; Abd El-Hafeez, T; Shams, MY. Unlocking the power of blockchain in education: An overview of innovations and outcomes. Blockchain Res Appl; 2023; [DOI: https://dx.doi.org/10.1016/j.bcra.2023.100165]
48. Mamdouh Farghaly, H; Abd El-Hafeez, T. A high-quality feature selection method based on frequent and correlated items for text classification. Soft Comput; 2023; 27,
49. Badawy, A; Fisteus, JA; Mahmoud, TM; Abd El-Hafeez, T. Topic extraction and interactive knowledge graphs for learning resources. Sustainability; 2021; 14,