To evaluate the poverty alleviation effects of digital financial inclusion, this study proposes a comprehensive financial data analysis and prediction method that integrates K-means clustering, Long Short-Term Memory (LSTM) neural networks, and the Error Correction Model (ECM), collectively forming the K-LSTM-ECM model. The model first employs K-means clustering to group user data and uncover the behavioral patterns of different user groups. Subsequently, LSTM is used to model and predict time-series data. Finally, the ECM is introduced to correct systematic errors and enhance prediction accuracy. The model was validated using diverse datasets, including World Bank Open Data, IMF economic indicators, and UNDP Human Development Reports. The results show that the K-LSTM-ECM model has the lowest error ranges in mean square error, mean absolute error, and root mean square error (e.g., its lowest mean square error is 1.44%), and its prediction precision rate reaches 91.23% on average. In terms of recall rate and false positive rate, the K-LSTM-ECM model outperforms the other models, with the highest recall rate reaching 94.45% and the lowest false positive rate reaching 2.08%. In the case study, the predictions of the K-LSTM-ECM model for 2021 and 2022 are closer to the actual data, with poverty values of 0.212 and 0.181, respectively, and its predictions of key indicators such as the proportion of the subsistence population and rural disposable income are also better than those of the other models. These findings verify the efficiency and reliability of the K-LSTM-ECM model in predicting the poverty alleviation effects of digital financial inclusion, providing robust data support for policymakers and the financial industry.
Keywords: digital inclusive finance, poverty reduction effect, K-means clustering, LSTM, error correction model (ECM), data analysis, forecast
Received: November 19, 2024
Povzetek (Summary): A new K-LSTM-ECM model is proposed for evaluating the effects of digital financial inclusion on poverty reduction; it combines K-means clustering, LSTM, and ECM.
(ProQuest: ... denotes formulae omitted.)
1 Introduction
Poverty remains a persistent global challenge, affecting billions of individuals [1]. In addressing this critical issue, digital inclusive finance (DIF), as an innovative financial model, is widely recognized as a potent tool for poverty alleviation and eradication [2]. By merging digital technology with financial services, DIF provides convenient, affordable, and secure financial services to underserved and marginalized populations, fostering their inclusion in formal financial systems and enhancing economic inclusivity and sustainable development [3]. Despite its widespread adoption worldwide, uncertainties persist regarding the actual impact of DIF on poverty and the mechanisms through which it contributes to reducing poverty. To examine the poverty reduction effect of DIF more closely, this study proposes a novel comprehensive model, K-LSTM-ECM, which integrates K-means clustering, Long Short-Term Memory (LSTM) neural networks, and the Error Correction Model (ECM). By analyzing financial behavior data from diverse user groups and applying the K-LSTM-ECM model for modeling and prediction, the study aims to assess the potential impact of DIF services on reducing poverty.
The research is committed to using the K-LSTM-ECM model to accurately analyze and predict the poverty reduction effect of digital financial inclusion. The goal is to reveal the deep links between financial behavior and poverty reduction through a comprehensive evaluation of the model, provide reliable forecasting results to guide policy making and financial practice, and verify the application potential of the model in the field of financial data analysis and forecasting.
The contribution of the research is to propose an innovative K-LSTM-ECM model, which provides a new analysis and prediction method for the poverty reduction effects of digital financial inclusion by combining K-means clustering, LSTM neural networks, and ECM. This study not only provides an in-depth analytical framework in theory, but also demonstrates the high predictive accuracy and reliability of the model in empirical research, which provides strong data support for policy makers in the formulation of poverty reduction strategies and the optimization of financial practices, and has important value in promoting further discussion and research on the application of financial technology and machine learning in the academic community.
The novelty of the research is reflected in the unique construction of the K-LSTM-ECM model, which integrates different technical means to provide an unprecedented comprehensive perspective for the study of the poverty reduction effects of digital financial inclusion. The innovation of the model is not limited to the technical level; it also lies in its ability to process large-scale financial data, capture user behavior patterns, and predict long-term dependencies. Moreover, the model's high accuracy and low error rate in empirical studies, as well as its potential impact on policy making and financial product design, highlight its innovative status and application prospects in existing research.
The study is divided into five coherent parts. The first part introduces the global research status of digital financial inclusion and LSTM technology through a literature review. The second part explains in detail the construction principle and process of the K-LSTM-ECM model and how its technical components are integrated to enhance data analysis and prediction. The third part evaluates the performance of the model through experiments to confirm its efficiency and accuracy in predicting the poverty reduction effect of digital financial inclusion. The fourth part discusses the study as a whole and summarizes its main challenges, limitations, and advantages. The fifth part reviews the research results, discusses the practical application value of the model, and puts forward directions and suggestions for future research, in order to provide data support and decision-making reference for policy making and financial practice.
2 Related works
Digital technology and financial innovation are driving the rapid development of global inclusive finance, providing efficient and safe financial services for impoverished and marginalized populations, thereby promoting inclusive economic growth and poverty alleviation [4, 5]. Malladi et al. [3] addressed the developmental challenges of inclusive finance in India, proposing recommendations to overcome key challenges and promote the further development of digital technology to advance inclusive finance for broader social inclusivity. Rastogi et al. [6] focused on India's Unified Payments Interface platform, using structural equation modeling to analyze its impact on financial literacy, financial inclusivity, and economic development for the poor, bridging knowledge gaps between the Unified Payments Interface and these aspects. Li et al. [7] investigated the impact of DIF on urban innovation, introducing a difference-in-differences model to estimate its causal effect on urban innovation, highlighting its positive influence and enhancing public awareness of DIF. Hasan et al. [8] examined how digital financial services contribute to inclusive finance in the context of China, addressing the representation gaps in financial services for billions of people in developed and emerging markets and providing insights for future policies and research on inclusive finance.
With the rise of machine learning, deep learning, and related technologies, and with the improvement of computing power, the effective prediction of financial data has been studied by many scholars. Biju A K V N et al. [9] conducted a detailed analysis and evaluation of the increasing application of machine learning, artificial intelligence, and deep learning in the financial field, and found a lack of empirical academic research that critically evaluates these algorithm-based advanced automated financial technologies. In response to the strong interest of pattern recognition researchers in market financial prediction, Barra S et al. [10] proposed using an ensemble of convolutional neural networks to process time series data and predict the future trend of the American market, thus achieving effective financial analysis and prediction. Considering the importance of innovation in financial risk management and stock forecasting, Kurani A et al. [11] discussed two algorithms that play an important role in stock forecasting, namely the artificial neural network and the support vector machine. The results showed that both can solve financial problems to a certain extent and can be integrated with other novel technologies to form a hybrid approach that forecasts financial stock data more robustly. In view of the financial forecasting difficulties faced by enterprises, Al Dulaimi J A A B et al. [12] proposed using data mining tools to analyze and forecast financial data, and analyzed time series data through regression equations to help enterprises predict financial failures and obtain the information needed for key decisions. A summary of the work is shown in Table 1.
To sum up, existing research has shortcomings in four respects. First, it lacks integrated models; most studies use a single method (such as a structural equation model or a convolutional neural network), which cannot effectively capture the complex relationships and time dependence in financial data. Second, research on digital financial inclusion mostly stays at the level of qualitative analysis and lacks quantitative evaluation of its poverty reduction effect. Third, the lack of a systematic error correction mechanism limits the accuracy of long-term prediction. Fourth, multi-source data are not integrated and utilized, so the multidimensional impact of digital financial inclusion is not fully revealed. Therefore, this paper studies the poverty reduction effect of digital financial inclusion based on the K-LSTM-ECM model, which combines the K-means clustering algorithm with the LSTM model and the error correction model to analyze the causal relationship between digital financial inclusion and the poverty reduction effect.
3 Establishment of the K-LSTM-ECM model of DIF's poverty reduction effects
The K-LSTM-ECM model is established to study the poverty reduction effects of DIF. By combining K-means, LSTM, and ECM techniques, the model offers a more accurate and reliable analysis approach to address the uncertainty of DIF's impact on reducing poverty. This section provides detailed descriptions of the K-means clustering algorithm, the LSTM model, and the ECM, along with the framework and training process of the K-LSTM-ECM model.
3.1 K-means clustering algorithm application in K-LSTM-ECM model
Given the abundance of customer and transaction data, the K-means clustering algorithm is adept at grouping customers, identifying various customer segments, and revealing distinct characteristics and needs. Customer segmentation enables financial institutions to provide personalized financial services, catering to diverse customer needs and enhancing user experience. Moreover, K-means clustering facilitates market segmentation, unveiling market characteristics and potential opportunities to guide precise marketing strategies and financial product promotion. Additionally, K-means clustering aids risk assessment by clustering customer transaction behaviors, identifying high-risk and low-risk customers, enabling financial institutions to accurately assess risks and take appropriate measures. The K-means clustering process is illustrated in Fig. 1 [13].
K-means first fixes the number of clusters K and randomly selects K data points as the initial cluster centers. Next, the distance between each data point and every cluster center is computed, and each data point is assigned to the cluster with the nearest center, which forms K clusters. The center of each cluster is then recomputed as the average of all the data points belonging to that cluster. If the change in the cluster centers is less than the preset threshold, or the maximum number of iterations is reached, the iteration stops; otherwise the data points are reassigned to the nearest updated centers and the centers are updated again. This process repeats until the convergence condition is met, yielding the final cluster centers and the cluster assignment of each data point.
The first step in the calculation process of the K-means algorithm is to establish a set A containing n data samples, as shown in Formula (1).
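The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the study: the function name `kmeans`, the tolerance `tol`, the seeds, and the two-blob toy data are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means (illustrative sketch); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Randomly pick k distinct samples as the initial cluster centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Euclidean distance from every sample to every center, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its members; an empty
        # cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:  # centers barely moved: converged
            break
    # Final assignment against the final centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

# Hypothetical toy data: two well-separated groups of 20 points each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
               rng.normal(5.0, 0.1, size=(20, 2))])
centers, labels = kmeans(X, k=2)
```

On such clearly separated data the two groups end up in different clusters regardless of which samples are drawn as initial centers; on real user data the result can depend on the initialization, which is why library implementations usually run several restarts.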
... (1)
In Formula (1), ... is a vector with d dimensions. It represents d different attributes of the i-th data sample. The sample size is expressed by n. Then Formula (2) is given.
... (2)
In Formula (2), ... is the center point of the j-th cluster, each center point cj contains d different attributes, and k represents the number of clusters. The second step is to choose an appropriate K value. The K value in the K-means algorithm can be given directly or determined through model analysis to obtain the optimal clustering effect. The third step is to calculate the distance between a data point xi and a center cj. Euclidean distance and Manhattan distance are generally used for distance measurement. The Euclidean distance is shown in Formula (3).
... (3)
In Formula (3), ... represents the distance between xi and cj, and l represents the number of dimensions in the space. The Manhattan distance is shown in Formula (4).
... (4)
In Formula (4), the notation is the same as in the Euclidean distance formula, and l represents the number of dimensions in the space. The fourth step is to recalculate the center cj of each cluster after the sample data have been classified, as shown in Formula (5).
... (5)
In Formula (5), ... is the amount of data in the same cluster, (...). The fifth step is to decide when to stop the K-means algorithm, usually by using a criterion function to evaluate the clustering effect. A commonly used criterion function is the sum of squared errors (SSE), which sums the squared distances between each data point and the center of the cluster to which it belongs and indicates how compact the clusters are. The smaller the SSE, the better the clustering effect, as shown in Formula (6).
... (6)
In Formula (6), E represents the sum of the square errors, ... represents the distance between xi and cj, xi represents a given data object, and cj is the center point of the j-th cluster.
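To make the two distance measures and the SSE criterion concrete, the following sketch evaluates them on small hand-made vectors. The helper names and the sample points are assumptions for illustration; the definitions are the standard textbook ones and are not copied from the omitted formulae.

```python
import numpy as np

def euclidean(x, c):
    """Square root of the summed squared coordinate differences."""
    return float(np.sqrt(np.sum((x - c) ** 2)))

def manhattan(x, c):
    """Sum of the absolute coordinate differences."""
    return float(np.sum(np.abs(x - c)))

def sse(X, centers, labels):
    """Sum of squared Euclidean distances from each sample to its own center."""
    diffs = X - centers[labels]
    return float(np.sum(diffs ** 2))

# Hypothetical sample point and cluster center.
x = np.array([1.0, 2.0])
c = np.array([4.0, 6.0])
d_euclid = euclidean(x, c)
d_manhat = manhattan(x, c)

# Four points on a line, clustered with two centers and with one center.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])
sse_two = sse(X, np.array([[1.0, 0.0], [11.0, 0.0]]), np.array([0, 0, 1, 1]))
sse_one = sse(X, np.array([[6.0, 0.0]]), np.array([0, 0, 0, 0]))
```

Here the two-cluster solution has a much smaller SSE than the single-cluster one, which is the behavior the criterion is meant to capture. SSE always decreases as K grows, so in practice it is the point where the decrease levels off that is informative when choosing K.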
3.2 Establishment of LSTM in K-LSTM-ECM model
In the research on the poverty reduction effect of DIF, the main reason for using LSTM is its powerful ability to process time series data. Digital financial inclusion involves a large amount of time-series data, such as user transaction records and consumption behaviors, and LSTM can effectively capture the long-term dependencies in these data and help better understand user behavior patterns and trends. In the financial field in particular, user behavior and decision-making are affected by events far in the past. This long-term dependency is crucial for accurate prediction and modeling, and the structure of LSTM enables it to transmit and store information over long periods and thereby handle this dependency better. In addition, LSTM has flexible sequence modeling capabilities, can adapt to different types of time series data, and can model continuous event sequences, such as users' transaction sequences and login sequences. The basic structure of LSTM is given in Fig. 2 [14].
Fig. 2 shows the outputs at the current and previous moments, together with the states at these two moments. LSTM introduces three gate structures to control and protect information. The memory storage cell of LSTM consists of four parts, including self-loop connected nodes [15]. The role of the gates is to realize forgetting or remembering through the selective forgetting and recording of information. LSTM not only includes three gate-level control units that control the flow of information, namely the input gate i, the forget gate f, and the output gate o, but also has built-in hidden-layer memory cells Ct. The forget gate f controls the degree to which the information of the previous moment is retained in the memory cells at the current moment. The calculation of the forget gate f at time t is shown in Formula (7).
... (7)
In Formula (7), Wf is the forget gate weight matrix, ht-1 is the hidden state at the previous moment, xt is the current input, and bf is the bias parameter. σ represents the S-type (sigmoid) activation function, and its expression is shown in Formula (8).
... (8)
Formula (8) maps the calculation result to the range (0, 1), so the forget gate outputs a value between 0 and 1. The calculation of the input gate i is shown in Formula (9).
... (9)
In Formula (9), Wi is the input gate's weight matrix, and bi is the bias parameter of the input gate. The input gate controls the degree to which the input information of the current moment updates the memory cells. The calculation of the output gate o is shown in Formula (10).
... (10)
In Formula (10), Wo is the output gate's weight matrix, and bo is the bias parameter of the output gate. The output gate controls the degree to which the information in the memory cell is output at the current moment. The calculation of the candidate memory cell is shown in Formula (11).
... (11)
In Formula (11), Wc is the weight matrix applied to the current input and the hidden state at the previous moment, bc is the bias parameter of the candidate cell, and tanh is the hyperbolic tangent function, whose expression is shown in Formula (12).
... (12)
The tanh function in Formula (12) converges faster than the S-type activation function. A candidate memory cell is a temporary memory cell computed at each time step to store new input information. The calculation of the updated memory cell is shown in Formula (13).
... (13)
In Formula (13), Ct-1 is the memory cell at the previous moment. The memory cell is updated through a weighted sum in which the forget gate weights the previous memory cell and the input gate weights the candidate memory cell. The output of the memory cells is shown in Formula (14).
... (14)
In Formula (14), ht represents the hidden state of LSTM at moment t, which is also the output of the memory cells.
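The gate computations described for Formulas (7) to (14) can be traced in a single forward step. The sketch below follows the standard LSTM cell, which matches the description in the text but is not copied from the omitted formulae; the parameter layout (one weight matrix per gate acting on the concatenation of ht-1 and xt), the sizes, and the random inputs are assumptions.

```python
import numpy as np

def sigmoid(z):
    """S-type activation: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; each W has shape (hidden, hidden + input)."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(params["Wf"] @ z + params["bf"])    # forget gate
    i_t = sigmoid(params["Wi"] @ z + params["bi"])    # input gate
    o_t = sigmoid(params["Wo"] @ z + params["bo"])    # output gate
    c_hat = np.tanh(params["Wc"] @ z + params["bc"])  # candidate memory cell
    c_t = f_t * c_prev + i_t * c_hat                  # updated memory cell
    h_t = o_t * np.tanh(c_t)                          # hidden state / output
    return h_t, c_t

# Hypothetical sizes and random weights, for illustration only.
hidden, n_in = 4, 3
rng = np.random.default_rng(0)
params = {}
for name in ("f", "i", "o", "c"):
    params["W" + name] = rng.normal(0.0, 0.1, size=(hidden, hidden + n_in))
    params["b" + name] = np.zeros(hidden)

# Run a sequence of 10 time steps through the cell.
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(10, n_in)):
    h, c = lstm_step(x_t, h, c, params)

# With all-zero weights every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero = {k: np.zeros_like(v) for k, v in params.items()}
h0, c0 = lstm_step(np.ones(n_in), np.zeros(hidden), np.ones(hidden), zero)
```

The zero-weight case shows the role of the forget gate in isolation: with no new information admitted, the previous memory is retained only in proportion to the gate value.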
3.3 The error correction model is introduced in K-LSTM-ECM model
ECM is a statistical model used to correct the error between predicted values and actual observed values in time series [16]. Because of complex economic and social factors, the poverty reduction effect of DIF is often affected by multiple variables, so there may be some errors in the prediction results. Therefore, the research introduces ECM to help correct systematic errors or deviations in model predictions, thereby improving the accuracy and reliability of predictions. The general form of ECM is shown in Formula (15).
... (15)
In Formula (15), Δyt denotes the first-order difference of the dependent variable at time t, Δyt-1 denotes its lagged first-order difference, and ... denotes the lagged term at time t-1. ECM estimates the coefficients α, β, and γ by statistical techniques such as least squares regression. After the coefficients are estimated, ECM can correct the prediction error of the original model by adding the error term εt to the predicted value. After introducing ECM, the K-LSTM-ECM model framework is established, as shown in Fig. 3.
As can be seen from Fig. 3, the K-LSTM-ECM model first performs the data preprocessing step, in which the user group data is cleaned and normalized to ensure the accuracy of subsequent analysis. Then, the K-means clustering algorithm is applied to group the customer data in detail; it places customers with similar characteristics in the same category and effectively identifies and distinguishes the specific needs and behavior patterns of different customer groups. Based on the results of K-means clustering, the model builds a specialized LSTM model for each customer group. These models are specifically designed to analyze and predict the potential impact of digital financial inclusion on poverty reduction across groups. In this way, the models are able to provide personalized predictions and in-depth analysis for the specific situations of different customer groups. Further, the trained K-LSTM-ECM model is applied to the prediction task on future data to predict how the poverty reduction effects of digital financial inclusion evolve over time. In addition, ECM is used to comprehensively evaluate the accuracy and stability of the model prediction to ensure the reliability of the prediction results and the robustness of the model. This process not only improves forecasting accuracy, but also provides solid data support for financial decisions. The K-LSTM-ECM model training process is shown in Fig. 4.
As shown in Fig. 4, the user groups are first classified using the K-means clustering algorithm, and corresponding labels are generated. Next, the clustered data are divided into a training set and a test set. Then, the network weights and parameters of the LSTM neural network model are initialized. The squared-error loss is used as the loss function, and backpropagation is performed to calculate the error. The Adam optimizer is used to compute gradients and update parameters. The prediction results are input into the error correction model ECM, the error is calculated, and the error is fed back to the LSTM neural network. Finally, the prediction result is output. This training process combines K-means clustering, LSTM model training, and ECM to improve the accuracy and performance of research on the effects of digital financial inclusion on reducing poverty. The pseudo-code for model training is shown in Table 2.
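As an illustration of how such coefficients can be estimated by least squares, the sketch below fits one possible error-correction regression. Because Formula (15) is omitted in this copy, the exact specification is an assumption: here the change Δyt is regressed on a constant (α), the lagged change Δyt-1 (β), and the lagged forecast error et-1 = yt-1 minus the model forecast (γ). The function names and the synthetic series are hypothetical.

```python
import numpy as np

def fit_ecm(y, y_hat):
    """OLS fit of dy_t = alpha + beta * dy_{t-1} + gamma * e_{t-1} (assumed form)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    dy = np.diff(y)          # dy[k] is the change from y[k] to y[k+1]
    e = y - y_hat            # forecast error of the base model
    target = dy[1:]          # dy_t for t = 2 .. T-1
    design = np.column_stack([np.ones(len(target)), dy[:-1], e[1:-1]])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef              # (alpha, beta, gamma)

def ecm_one_step(y, y_hat, coef):
    """Corrected one-step-ahead value built from the last observations."""
    alpha, beta, gamma = coef
    return y[-1] + alpha + beta * (y[-1] - y[-2]) + gamma * (y[-1] - y_hat[-1])

# Synthetic series generated from known coefficients (0.1, 0.5, -0.3).
rng = np.random.default_rng(42)
T = 60
e = rng.normal(0.0, 1.0, size=T)
y = np.zeros(T)
y[1] = 0.2
for t in range(2, T):
    y[t] = y[t - 1] + 0.1 + 0.5 * (y[t - 1] - y[t - 2]) - 0.3 * e[t - 1]
y_hat = y - e                # stand-in for the base model's forecasts

coef = fit_ecm(y, y_hat)
next_value = ecm_one_step(y, y_hat, coef)
```

On this noise-free synthetic series the regression recovers the generating coefficients, which is only a sanity check of the estimator; with real forecasts the fit would be approximate and the specification would need to follow the study's actual Formula (15).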
4 Analysis of the research results on the poverty reduction effect of DIF under the K-LSTM-ECM model
The study carried out model comparison experiments and a case study of the poverty reduction effect of DIF. In the model comparison experiment, the study used 5 sets of existing real data as the benchmark and tested the prediction error, precision rate, recall rate, and F1 value (F-measure) of the three models. In the case analysis, the study uses the K-LSTM-ECM model to conduct a comprehensive and detailed analysis and compares its output with the real results to calculate its accuracy.
4.1 Data set selection and processing
In the experiments, the study used a comprehensive data collection strategy that included World Bank Open Data, IMF economic indicators, UNDP Human Development Reports, Global Findex databases, and country-specific central bank and financial regulator reports. A combination of systematic sampling and convenience sampling was used to ensure that the samples are broad and representative across many countries and regions around the world. The collected data is rigorously cleaned and standardized to eliminate inconsistencies and dimensional differences, enhancing the comparability of the data. Data preprocessing and model training adopt best practices, including data standardization and cross-validation, to ensure the accuracy of the research results and the generalization ability of the models.
4.2 K-LSTM-ECM model comparison experiment
For the prediction accuracy and superiority verifying with the K-LSTM-ECM, the study conducted a model comparison experiment. The test objects were the LSTM model, the K-L STM model without ECM, and the K-LSTM-ECM model. In the comparative experiment, a single LSTM model is first used to predict the effect of reducing poverty, and then the K-LSTM model without an error compensation model is used for prediction. Next, use the K-LSTM-ECM model to input the prediction results of the LSTM model into the error correction model to obtain the corrected prediction results. By comparing the three models' predicting output, the improvement effect of the error correction model on the prediction results can be analyzed. In the experiment, based on experimental tuning and literature support, the number of K-means clusters is set to 5, which can effectively balance the meticulousness of grouping and computational complexity. In terms of LSTM hyperparameter Settings, the number of hidden layer nodes is 64, the learning rate is 0.001, the time step is 10, and the batch size is 64. These parameters refer to the common configurations of previous studies, which can meet the computing resource conditions of this study and ensure the training efficiency and prediction performance of the model.
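A time step of 10 means that each training sample for the LSTM consists of 10 consecutive observations followed by the value to be predicted. The sketch below shows one common way to build such samples; the function name `make_windows` and the toy series are assumptions, and the study's own windowing code is not shown in the text.

```python
import numpy as np

def make_windows(series, time_steps=10):
    """Turn a 1-D series into (samples, time_steps) inputs and next-value targets."""
    series = np.asarray(series, dtype=float)
    n = len(series) - time_steps
    if n <= 0:
        raise ValueError("series must be longer than time_steps")
    X = np.stack([series[i:i + time_steps] for i in range(n)])
    y = series[time_steps:]
    return X, y

# Hypothetical series 0, 1, ..., 14: yields 5 windows of length 10.
X, y = make_windows(np.arange(15), time_steps=10)
```

Each row of `X` would then be fed to the LSTM as one sequence, with the matching entry of `y` as its target; a batch size of 64 simply groups 64 such rows per gradient update.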
The experiment was completed on a high performance computing server with hardware configuration including Intel Xeon Gold 6226R CPU, NVIDIA Tesla V100 GPU (32 GB video memory), 128 GB memory and 2 TB NVMe SSD. The operating system 1s Ubuntu 20.04 LTS. The software environment is Python 3.8, mainly using Py Torch 1.10.0, Scikit-learn 0.24.2 (K-means), NumPy 1.21.2, Pandas 1.3.3 and Matplotlib 3.4.3. In data preprocessing, the missing values are filled by multiple interpolation method, and the outliers are detected and eliminated based on 30 principle. In addition, z-score standardization is used to normalize the data to ensure the consistency of each feature dimension. The error correction model (ECM) is implemented by a custom Python module with a random seed set to 42. Data preprocessing ensures the stability of model inputs and the reproducibility of experimental results. The research first evaluates the prediction errors of the three, and the experimental results are shown in Fig. 5.
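The outlier rule and the standardization step mentioned above can be sketched as follows. This is an assumed, minimal reading of those two steps (population standard deviation, per-feature statistics, whole rows dropped); the helper names and the toy column are hypothetical, and the imputation of missing values is not covered here.

```python
import numpy as np

def remove_outliers_3sigma(X):
    """Drop rows in which any feature lies more than 3 standard deviations from its mean."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)   # guard against constant features
    keep = np.all(np.abs(X - mu) <= 3.0 * sd, axis=1)
    return X[keep]

def zscore(X):
    """Standardize each feature to zero mean and unit standard deviation."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)
    return (X - mu) / sd

# Hypothetical single-feature data: 30 ordinary values and one extreme value.
values = np.append(np.tile([-1.0, 1.0], 15), 50.0)
X = values.reshape(-1, 1)
cleaned = remove_outliers_3sigma(X)
standardized = zscore(cleaned)
```

Note that the mean and standard deviation used for outlier detection are themselves inflated by the outlier, so on very small samples the 3σ rule may fail to flag anything; the example uses enough points for the extreme value to stand out.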
It can be seen from Fig. 5 that three indicators are studied to measure the forecast error, which are mean square error, mean absolute error and root mean square error. In the mean square error results, the highest error of the LSTM model 15 8.34%, the lowest 15 6.88%, the highest error of the K-LSTM model is 7.12%, the lowest is 4.01%, the highest error of the K-LSTM-ECM model is 3.68%, and the lowest is 1.44%. Among the average absolute errors, the error range of the LSTM model is 7.11%-8.49%, the error range of the K-LSTM model is 4.28%-6.89%, and the error range of the K-LSTM-ECM model is 1.49%-3.27%. In the root mean square error, the overall error of the K-LSTM-ECM model is also the smallest. The K-LSTM-ECM model shows better performance in terms of prediction error, with lower error bounds compared to the LSTM model and the KLSTM model. The precision rate evaluation results of the three models are Fig. 6.
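For reference, the three error measures can be computed as below. The definitions are the standard ones; the function name and the four-point example are made up for illustration and are unrelated to the values in Fig. 5.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Return (MSE, MAE, RMSE) for two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))     # mean square error
    mae = float(np.mean(np.abs(err)))  # mean absolute error
    rmse = float(np.sqrt(mse))         # root mean square error
    return mse, mae, rmse

# Hypothetical values: errors are 1, 0, 0, 2.
mse, mae, rmse = error_metrics([1, 2, 3, 4], [2, 2, 3, 6])
```

Because MSE squares each error, the single error of 2 contributes four times as much as the error of 1, which is why MSE and RMSE react more strongly to occasional large misses than MAE does.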
As can be seen from Fig 6, the precision rate of KLSTM-ECM model is the highest on all data sets, among which the precision rate on data set No. 3 reaches 94.86%, and the average precision rate on all data sets reaches 91.23%. Compared with the average precision rate of LSTM model 74.56% and K-LSTM model 83.18%, the average precision rate increased by 16.67% and 8.05% respectively. It shows that the K-LSTM-ECM model is excellent in the precision rate of prediction results, and can more accurately judge the impact of DIF on reducing poverty. The evaluation results of the recall rate and false positive rate of the three models are shown in Fig. 7.
As can be seen from Fig 7, in terms of recall rate, K-LSTM-ECM model reaches the highest level in the tests under different data sets, among which the recall rate on data set 1 1s 94.45%, that on data set 2 1s 92.64%, that on data set 3 1s 88.92%, and that on data set 4 15 91.07%. The recall rate on dataset 5 1s 86.55%, indicating that the model can capture positive samples more accurately and has a high ability to identity poverty reduction effects. In terms of false positives rate, KLSTM-ECM model also performs well. In the test of data sets 1 to 5, the false positives rate is 2.08%, 5.48%, 3.23%, 3.81% and 4.16% respectively, which is significantly lower than the other two models, indicating that K-LSTM-ECM model has a good effect in reducing false positives. More precise control over error reporting. In addition, the K-LSTM-ECM model combines Kmeans clustering, LSTM neural network and error correction model, which has certain complexity in predicting the effect of reducing poverty. Therefore, the F1 value under the K-LSTM-ECM model. In the research on the reducing poverty effect of digital financial inclusion, it is helpful to comprehensively consider the predictive ability of the model. The F1 values of the three models are shown in Fig. 8.
As can be seen from example 8, the single LSTM model has a relatively low F1 value, for example, 73.85% on Data 1. The K-LSTM model performs well on most datasets, indicating that the introduction of the error correction model can improve the prediction accuracy in some cases. The F1 value of K-LSTM-ECM model is significantly higher than that of the other two models, and the F1 values in data sets 1-5 are 84.54%, 92.03%, 86.87%, 94.25% and 94.79%, respectively, with an average F1 value of 90.62%. It shows that K-LSTMECM model has better comprehensive forecasting ability in the research of poverty reduction effect of digital financial inclusion. In the research on the effect of inclusive finance on reducing poverty, it has better comprehensive prediction ability. Finally, the ROC curve and AUC area of the three models were compared, and the results were shown in Fig. 9.
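Precision, recall, false positive rate, and F1 all derive from the same four confusion-matrix counts. The sketch below uses the standard definitions on a hypothetical set of ten binary labels; it does not reproduce the study's data, and the function name is an assumption.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Precision, recall, false positive rate, and F1 for binary labels (1 = positive)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, fpr, f1

# Hypothetical labels: 4 positives and 6 negatives (TP=3, FN=1, FP=1, TN=5).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, fpr, f1 = classification_metrics(y_true, y_pred)
```

F1 is the harmonic mean of precision and recall, so it stays high only when both are high, which is why it is used as a single summary of the two.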
As can be seen from Fig. 9, the area under the ROC curve (AUC) of the K-LSTM-ECM model is the largest, reaching 0.89, which is 9.88% and 4.70% higher than that of the single LSTM model and the K-LSTM model, respectively. This shows that the K-LSTM-ECM model distinguishes effectively between positive and negative samples; in the context of digital financial inclusion, it can accurately identify cases in which a poverty reduction effect is present.
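The AUC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison, which is equivalent to the area under the ROC curve; the labels and scores are hypothetical and unrelated to the datasets in this study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = poverty reduction effect present) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.2]
auc = auc_from_scores(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of samples, so it is meant for illustration rather than for large datasets.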
4.3 Case study on poverty reduction effect of DIF
To explore the practical application value of the K-LSTM-ECM model, the study selects the annual macroscopic data of a certain place (denoted as place A) covering the decade from 2010 to 2020 as the training set and uses it to predict the poverty data for 2021 and 2022. Nine key indicators are used to analyze in depth the impact of DIF on poverty alleviation, and the K-LSTM-ECM model is used for prediction and analysis. The nine indicators are shown in Table 3.
It can be seen from Table 3 that the poverty situation is first measured through the poverty alleviation indicators Pov1 and Pov2, which capture the proportion of the subsistence allowance population and the degree of rural poverty. The digital financial inclusion index is then introduced to describe the level of financial development, and it is combined with economic growth indicators that reflect the economic situation. The income distribution and industrial structure indicators describe the income distribution gap and industrial optimization. The urbanization indicator reflects the process of urbanization, while the education indicator shows the improvement in education level. Finally, the government intervention indicator measures the degree of government participation in reducing poverty. The statistical results of the nine indicators are shown in Table 4.
Table 4 shows that the average value of the poverty indicator Pov1 is 0.031, but its standard deviation is relatively large, which shows that the distribution of poverty levels is uneven. The gap in rural per capita disposable income (Pov2) is obvious, with an average of 16,473 and a standard deviation of 2,741. The digital financial inclusion index IND shows large fluctuations, with an average of 158.45 and a standard deviation of 62.494. The income distribution indicator INC and the education level indicator EDU are relatively stable, with averages of 2.135 and 0.164, respectively. The economic development index GDPP varies widely, with an average of 5.304 and a standard deviation of 2.332, and the urbanization level index URB has an average of 64.465% and a standard deviation of 12.279. The industrial structure index IS changes relatively steadily, with an average of 0.756 and a standard deviation of 0.068. The study normalizes the data of the nine indicators and obtains a poverty value for each year in the range [0, 1]; the smaller the poverty value, the lower the poverty level. The K-LSTM-ECM model then processes the nine indicators for each year to generate the training set data. The poverty data of the training set obtained by the K-LSTM-ECM model are compared with the actual poverty data from 2010 to 2020, and the results are shown in Fig. 10.
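The text does not specify how the normalized indicators are combined into a single poverty value. As an illustration only, the sketch below applies min-max normalization to each indicator and then averages the results with equal weights, flipping indicators for which a higher value means less poverty. The equal weighting, the function names and the sample figures are all assumptions, not the procedure used in this study.

```python
def min_max_normalize(values):
    """Scale a series linearly to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def composite_poverty_value(indicators, higher_means_poorer):
    """Equal-weight average of normalized indicators (assumed aggregation)."""
    n_years = len(next(iter(indicators.values())))
    totals = [0.0] * n_years
    for name, series in indicators.items():
        scaled = min_max_normalize(series)
        if not higher_means_poorer[name]:
            # Flip so that a larger scaled value always means more poverty.
            scaled = [1.0 - s for s in scaled]
        totals = [t + s for t, s in zip(totals, scaled)]
    return [t / len(indicators) for t in totals]


# Hypothetical three-year series for two indicators.
indicators = {
    "subsistence_share": [6.0, 4.0, 2.0],
    "rural_income": [5000.0, 15000.0, 25000.0],
}
higher_means_poorer = {"subsistence_share": True, "rural_income": False}
scores = composite_poverty_value(indicators, higher_means_poorer)
print(scores)
```

With this convention every composite score lies in [0, 1], and a smaller score corresponds to a lower poverty level, which matches the interpretation given in the text.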
From 2010 to 2020, the poverty value of place A decreased year by year, from a maximum of 0.823 to 0.231. Pov1, the proportion of the subsistence allowance population, also decreased year by year, from 0.062 to 0.013. Pov2, rural disposable income, increased from 5,243 yuan in 2010 to 23,147 yuan in 2020. At the same time, the training set obtained by the K-LSTM-ECM model stays close to the real data, and the highest error is 3.03%, for Pov1 in 2014. The training results of the model are consistent with the actual trends, reproducing the decline in poverty, the decrease in the proportion of subsistence allowances, and the increase in rural disposable income, which further confirms the effectiveness of the model in research on the effect of DIF on reducing poverty.
The study then uses the training set data to forecast with the K-LSTM-ECM model and compares the results with the LSTM model, the K-LSTM model and the true values. The model of Barra S et al. [10], mentioned in the literature review, is selected as a representative of existing research results for a secondary comparison. The results are shown in Table 5. For 2021 and 2022, the poverty values predicted by the K-LSTM-ECM model are 0.212 and 0.181, respectively, indicating that the model achieves relatively accurate predictions. In comparison, the predicted poverty values of the LSTM model and the K-LSTM model are 0.219 and 0.216, and 0.189 and 0.176, respectively, which deviate slightly more from the actual values. In addition, for the Pov1 and Pov2 indicators, the values predicted by the K-LSTM-ECM model for 2021 and 2022 are close to the real values, and its performance is better than that of the model of Barra S et al. [10]. In general, the K-LSTM-ECM model shows high accuracy and reliability in prediction and is closer to the actual situation than the LSTM model, the K-LSTM model and the model of Barra S et al. [10].
5 Discussion
Compared with relevant studies, the K-LSTM-ECM model is unique in that it combines K-means clustering, LSTM and ECM to quantify the actual impact of digital financial inclusion on poverty reduction from the perspective of time series and error control. Malladi C M et al. [3] and Rastogi S et al. [6] mainly focused on policy research on the promotion of inclusive finance and on the effect of payment platforms, and Li J et al. [7] focused on the causal analysis of digital finance and urban innovation, whereas this study focuses on the deep connection between digital inclusive finance and poverty reduction. Compared with the CNN-based time series prediction of Barra S et al. [10] and the hybrid algorithm of Kurani A et al. [11], the K-LSTM-ECM model significantly improves prediction accuracy and model robustness through its integrated design. In addition, the model not only focuses on technical optimization but also targets the specific application scenarios of digital financial inclusion, providing a quantitative reference for policy makers and financial service designers and filling a gap in existing research on the quantitative analysis of poverty reduction effects.
The K-means algorithm improves the feature expression ability of the data by accurately grouping user behaviors. The LSTM model effectively captures the long-term dependence of time series data. The error correction model (ECM) further improves the prediction accuracy and the reliability of the results by dynamically adjusting for systematic errors. In addition, the integration of multi-source data makes the K-LSTM-ECM model more applicable to complex financial scenarios, whereas existing research methods are often limited to a single data source or a simple combination of models and find it difficult to achieve similar effects.
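To make the idea of adjusting for systematic errors concrete, the sketch below shows one simple residual-based correction: past forecast errors are modeled with a first-order autoregression, and the predicted next error is added to the raw forecast. This is an illustrative stand-in, not the ECM specification used in this study; the function names, the AR(1) form and the numbers are assumptions.

```python
def fit_residual_ar1(residuals):
    """Least-squares fit of r[t] = a + b * r[t-1] on past residuals.

    Residuals are defined as actual minus predicted.
    """
    x, y = residuals[:-1], residuals[1:]
    n = len(x)
    if n < 1:
        raise ValueError("need at least two residuals")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # Constant residuals: fall back to a pure bias correction.
        return mean_y, 0.0
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def correct_forecast(raw_forecast, last_residual, a, b):
    """Add the predicted next residual to the raw model forecast."""
    return raw_forecast + a + b * last_residual


# Hypothetical residuals of a base forecaster that shrink by half each year.
residuals = [0.04, 0.02, 0.01, 0.005]
a, b = fit_residual_ar1(residuals)
corrected = correct_forecast(raw_forecast=0.2, last_residual=residuals[-1], a=a, b=b)
print(f"a = {a:.4f}, b = {b:.4f}, corrected forecast = {corrected:.4f}")
```

The design choice here is to leave the base forecaster untouched and learn only the structure of its errors, so the correction step can be fitted on very short annual series.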
However, the model has some limitations. First, the computational demands of the model, particularly the combined use of LSTM and ECM, may restrict its applicability in resource-constrained environments. Second, the generalizability of the model may be affected by the quality and diversity of training data, as data biases specific to certain regions or groups could impact its universal applicability. Moreover, the rapidly changing financial market environment requires the model to adapt dynamically to policy adjustments, economic cycles, and new financial products, posing challenges to its extensibility. Lastly, privacy, security, and ethical concerns in using financial data are critical issues that need to be addressed during model development and application.
By overcoming these challenges, future research can further enhance the efficiency and adaptability of the K-LSTM-ECM model and leverage more diverse data resources to amplify its role in advancing digital financial inclusion research and practice globally.
6 Conclusion
This study proposed and validated the K-LSTM-ECM model, which integrates K-means clustering, Long Short-Term Memory (LSTM) networks, and an Error Correction Model (ECM) to achieve precise predictions of the poverty alleviation effects of digital financial inclusion. Experimental results show that the K-LSTM-ECM model significantly outperforms the single LSTM and K-LSTM models in prediction accuracy, error control, and result stability. The findings indicate that the K-LSTM-ECM model effectively reveals the deep connections between financial behavior and poverty alleviation, providing robust data support for policy-making and financial practices. In particular, the model's strong alignment with real-world data in the case study supports its applicability across different regional and group scenarios. Compared to existing methods, the K-LSTM-ECM model introduces an innovative technological integration, offering a novel analytical tool for studying digital financial inclusion and filling a significant research gap in this domain.
The K-LSTM-ECM model proposed in this study has important social impact and practical application value. By quantifying the poverty reduction effect of digital financial inclusion, the model provides an accurate basis for decisions by policy makers, such as optimizing resource allocation and formulating more targeted poverty alleviation policies, while helping financial institutions design products that meet the needs of low-income groups and thereby promoting the popularization of inclusive finance. The research strictly follows the principles of data privacy and fairness, ensures that the model is transparent and unbiased in data processing and prediction, and avoids blind reliance on AI decision-making by improving the interpretability of the model. In practical application, the model can be used in poverty alleviation policy optimization, financial risk assessment and credit approval, among others, to promote social equity and sustainable economic development.
References
[1] Bruckner B, Hubacek K, Shan Y, Zhong H, & Feng K. (2022). Impacts of poverty alleviation on national and global carbon emissions. Nat. Sustain, 5(4), 311-320. https://doi.org/10.1038/s41893-021-00842-z
[2] Khera P, Ng S, Ogawa S, & Sahay R. (2022). Measuring digital financial inclusion in emerging market and developing economies: A new index. Asian Economic Policy Review, 17(2), 213-230. https://doi.org/10.1111/aepr.12377
[3] Malladi C M, Soni R K, & Srinivasan S. (2021). Digital financial inclusion: Next frontiers - Challenges and opportunities. CSI Trans. ICT, 9, 127-134. https://doi.org/10.1007/s40012-021-00328-5
[4] Liu H. (2024). Enhanced CoCoSo method for intuitionistic fuzzy MAGDM and application to financial risk evaluation of high-tech enterprises. Informatica, 48(5). https://doi.org/10.31449/inf.v48i5.5169
[5] Bu Y. (2024). Fuzzy decision support system for financial planning and management. Informatica, 48(21). https://doi.org/10.31449/inf.v48i21.6718
[6] Rastogi S, Sharma A, Panse C, & Bhimavarapu V M. (2021). Unified Payment Interface (UPI): A digital innovation and its impact on financial inclusion and economic development. Universal Journal of Accounting and Finance, 9(3), 518-530. https://doi.org/10.13189/ujaf.2021.090326
[7] Li J, & Li B. (2022). Digital inclusive finance and urban innovation: Evidence from China. Rev. Dev. Econ, 26(2), 1010-1034. https://doi.org/10.1111/rode.12846
[8] Hasan M M, Yajuan L, & Khan S. (2022). Promoting China's inclusive finance through digital financial services. Glob. Bus. Rev, 23(4), 984-1006. https://doi.org/10.1177/0972150919895348
[9] Biju A K U N, Thomas A S, & Thasneem J. (2024). Examining the research taxonomy of artificial intelligence, deep learning & machine learning in the financial sphere - a bibliometric analysis. Quality & Quantity, 58(1), 849-878. https://doi.org/10.1007/s11135-023-01673-0
[10] Barra S, Carta S M, Corriga A, Podda A S, & Recupero D R. (2020). Deep learning and time series-to-image encoding for financial forecasting. IEEE/CAA Journal of Automatica Sinica, 7(3), 683-692. https://doi.org/10.1109/JAS.2020.1003132
[11] Kurani A, Doshi P, Vakharia A, & Shah M. (2023). A comprehensive comparative study of artificial neural network (ANN) and support vector machines (SVM) on stock forecasting. Annals of Data Science, 10(1), 183-208. https://doi.org/10.1007/s40745-021-00344-x
[12] AlDulaimi J A A B, Muter K J, & Younis N N. (2023). The use of data mining technology in financial forecasting of accounting profits: An applied study. International Journal of Construction Supply Chain Management, 13(1), 37-49. https://doi.org/10.14424/ijcscm2023130103
[13] Ghazal T M. (2021). Performances of K-means clustering algorithm with different distance metrics. Intell. Autom. Soft Comput, 30(2), 735-742. https://doi.org/10.32604/iasc.2021.019067
[14] Chen Z. (2022). Research on internet security situation awareness prediction technology based on improved RBF neural network algorithm. JCCE, 1, 103-108. https://doi.org/10.47852/bonviewJCCE149145205514
[15] Polyzos S, Samitas A, & Spyridou A E. (2021). Tourism demand and the COVID-19 pandemic: An LSTM approach. Tour. Recreat. Res, 46(2), 175-187. https://doi.org/10.1080/02508281.2020.1777053
[16] Kanchanawong P, & Calderwood D A. (2023). Organization, dynamics and mechanoregulation of integrin-mediated cell-ECM adhesions. Nat. Rev. Mol. Cell Biol, 24(2), 142-161. https://doi.org/10.1038/s41580-022-00531-5