This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
Computerized accounting and computerized business data are now widely used in many organizations. Sampling audits of the audited unit’s financial and business data form the basis of daily audit work. As data volumes grow, it becomes important to find potentially useful information in a large amount of disorderly data [1, 2]. Therefore, extracting what is genuinely valuable from these massive data, and providing inspectors with clues and basic methods for discovering problems, is a relatively urgent issue for the audit departments of many companies.
Compared with traditional audits, big data audits have their own characteristics and can greatly improve enterprise management. Big data auditing is therefore a new trend in the development of internal auditing in Chinese enterprises. In addition, the status quo, improvement plans, and business improvements of some companies that have adopted big data auditing were investigated. The investigation found that the big data audit system is being implemented in a regular manner, and it is recommended that subsequent implementation continue to be tracked. This article proposes a new method of audit data analysis based on big data mining to innovate the audit method. The application of this new method has important reference significance for the internal audit of other enterprises.
Data mining refers to the process of using algorithms to search for information hidden in a large amount of data. It is closely related to computer science and draws on methods such as statistics, online analytical processing, information retrieval, machine learning, expert systems (which rely on past rules of thumb), and pattern recognition. This article uses data mining technology to integrate enterprise internal audit with big data, which yields an efficient, convenient, and feasible audit method.
The rest of the article is organized as follows: Section 2 reviews related work, Section 3 presents the theoretical method, and Section 4 presents the experimental simulation and analysis. Section 5 discusses internal audit analysis, and Section 6 concludes the article. The innovation of this article is twofold: it puts forward a big data audit system, based on an understanding of the company’s internal audit and a comprehensive analysis of its improvement process, and it improves the clustering algorithm, a key data mining algorithm, so that it is better suited to the research topic of this article.
2. Related Work
With the rapid growth of data, more and more data are stored, and people need more convenient ways to obtain information from them. Buczak proposed a method for data analysis and mining and gave a detailed description of the specific process [3]. Xu pointed out that data security is a very important issue that requires continuous study [4]. Kavakiotis proposed applying data mining and machine learning methods to diabetes research [5]. Chaurasia studied the performance of different classification techniques, using classification accuracy to evaluate them on breast cancer data with 683 rows and 10 columns; the purpose was to use data mining technology to develop an accurate breast cancer prediction model [6]. Yan used data mining technology to evaluate data, which is not applicable, thinking that there is still a lot of data that need to be predicted by itself [7]. Emoto used data mining technology to observe and analyze the characteristics of the gut microbiota of patients with coronary artery disease [8]. Hong proposed using data mining techniques to map flood susceptibility in Poyang County, China [9]. Huang aimed to provide an effective method for calculating rough approximations of fuzzy concepts in a dynamic fuzzy decision system (FDS), where objects and features change at the same time [10]. Based on big data technology, Zhao extended and changed the traditional model according to the characteristics of data mining services and proposed a big data alliance data mining service process model. In addition, Zhao used intelligent decision-making theory and knowledge reasoning methods to build a fast-response, reusable, and intelligent service model that makes data mining services scalable [11]. Liu analyzed and explored emerging ideas and methods of data mining techniques, and conducted audits to evaluate the evolution of these techniques.
It can be seen that the combination of big data technology and industrial green manufacturing technology has been slow, and it is necessary to combine industrial green manufacturing enterprises with big data technology and artificial intelligence. To address the current severe environmental problems, Liu also discussed the development trend of green technology production based on big data technology and the integration and innovation of big data technology and green technology, which enrich the forms of environmental supervision and participation, advance technology, and improve business efficiency. Based on the progress of green technology driven by big data, environmental audit strategies and suggestions were put forward [12]. The abovementioned studies describe some key technical points in detail and demonstrate the related design process of data mining well. However, none of them examines the mining ability of data mining or includes an experimental design for the stability of data mining, so some deficiencies remain.
3. Auditing Methods of Big Data within the Enterprise
3.1. Big Data Audit System
Enhancing the implementation of quality control is an important measure to control audit risks. Intensified implementation of quality control can effectively supervise the full implementation of audit work and audit review.
3.1.1. Strengthening the Training of Practitioners
On the one hand, training should target senior management, strengthening their attention to quality control and their understanding of the quality control process, and thereby fundamentally improving the environment in which quality control is carried out. On the other hand, auditors should be trained to strengthen their quality control awareness so that they consciously abide by the quality control system. Only when the quality control awareness of both managers and auditors improves can the quality control system be strongly enforced, audit work be carried out in accordance with it throughout the audit process, audit quality be improved, and audit risk be reduced.
3.1.2. Strengthening the Operation Supervision of Quality Control
An independent department should be set up to supervise and inspect all accounting and auditing departments, and an effective supervision and inspection mechanism should be established. In the business undertaking stage, this department checks whether the auditors have gained a detailed understanding of the customer and the former CPA, whether the forms completed in this stage are truthful, and so on. It also formulates corresponding punishment measures, such as public criticism, fines, downgrading, and dismissal. When certified public accountants or auditors are found to have failed to implement the quality control system, they are punished according to the severity of the circumstances. Besides its disciplinary effect, this warns other auditors to implement the quality control system in accordance with the regulations [13]. Through such an independent department, problems in the work process can be discovered, and risks prevented and resolved, in time, and the staff come to abide by the quality control system consciously, which is conducive to the implementation of quality control.
3.1.3. Building a Big Data Audit System
When enterprises face audit projects involving massive data, traditional equipment and data collection methods suffer from high resource consumption and slow data processing and analysis [14, 15]. For this reason, this article proposes a big data audit system consisting of an infrastructure layer, a data layer, a data analysis layer, and an application layer, as shown in Figure 1:
[figure(s) omitted; refer to PDF]
3.1.4. Improving Audit Procedures
Improving audit procedures is the preliminary application of big data technology in auditing practice. The audit process is long, the scope of the audit is wide, and many big data technologies are available. Therefore, applying big data technology at any step counts as an improvement to the audit procedure, as long as it is more efficient than the previous audit method [16]. This article synthesizes existing research results and summarizes, in Figure 2, the audit procedures in which big data technology is most widely applied.
[figure(s) omitted; refer to PDF]
It can be seen from Figure 2 that the improvement in audit procedures mainly concerns data acquisition, data processing, and data analysis. RH accounting firm can adopt some or all of these applications according to the situation of the audited unit and its own audit needs [17]. The following is a simple illustration of data acquisition and data processing.
Data acquisition is shown in Figure 3. The acquired data include both internal data provided by the enterprise and external data. They comprise not only structured data from relational databases but also semi-structured data such as web pages and XML, as well as unstructured data such as office files, company reports, emails, pictures, audio, and video. All of this relies on nonrelational (NoSQL) database technology. Audit evidence in the form of documents, pictures, audio, video, and other files cannot easily be stored in a relational (SQL) database, which greatly inconveniences auditors, and NoSQL technology emerged to meet this need [18]. Google, as the world’s largest information retrieval company, has widely used NoSQL database systems.
[figure(s) omitted; refer to PDF]
In the context of big data, data processing can enhance the timeliness of audit work, and good use of data processing yields audit evidence efficiently and quickly. There are two big data processing modes: stream processing and batch processing. As cloud accounting develops, real-time auditing will also become feasible, and real-time auditing is the direction in which auditing must develop [19].
3.2. Mathematical Model of Cluster Analysis
Cluster analysis examines how the feature vectors corresponding to the samples are distributed over the whole set X and divides x1, x2, …, xn into several disjoint groups according to the degree of closeness between the samples.
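Let X = {x1, x2, …, xn} be the universe of data objects to be analyzed. Each data object xk (k = 1, 2, …, n) is described by several parameter values, each of which describes one attribute of xk. Cluster analysis divides X into c subsets X1, X2, …, Xc, and the following conditions must be met:
$$X_1 \cup X_2 \cup \cdots \cup X_c = X, \qquad X_i \cap X_j = \varnothing \ (1 \le i \ne j \le c), \qquad X_i \ne \varnothing \ (1 \le i \le c).$$
The membership degree of the sample xk (1 ≤ k ≤ n) to the subset Xi (1 ≤ i ≤ c) can be expressed by the membership function as:
$$\mu_{ik} = \mu_{X_i}(x_k) = \begin{cases} 1, & x_k \in X_i, \\ 0, & x_k \notin X_i. \end{cases}$$
Among them, the membership function must also meet the conditions:
$$\mu_{ik} \in \{0, 1\}, \qquad \sum_{i=1}^{c} \mu_{ik} = 1 \ \text{for every } k, \qquad \sum_{k=1}^{n} \mu_{ik} > 0 \ \text{for every } i.$$
That is, every sample belongs to exactly one cluster, and every subset must be non-empty. Such cluster analysis is usually called hard partition.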
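The two requirements of a hard partition (each sample in exactly one cluster, no empty cluster) can be checked mechanically. The following is a minimal sketch, assuming the partition is stored as a 0/1 membership matrix with one row per cluster and one column per sample; the function name `is_hard_partition` and this layout are illustrative choices, not part of the article.

```python
def is_hard_partition(membership):
    """Return True if `membership` (c rows by n columns of 0/1) is a hard partition."""
    if not membership or not membership[0]:
        return False
    n = len(membership[0])
    # Every row must cover the same n samples and contain only 0 or 1.
    for row in membership:
        if len(row) != n or any(value not in (0, 1) for value in row):
            return False
    # Each sample (column) must belong to exactly one cluster.
    for k in range(n):
        if sum(row[k] for row in membership) != 1:
            return False
    # Each cluster (row) must contain at least one sample.
    return all(sum(row) > 0 for row in membership)


# Three clusters over four samples: samples 0 and 1 share a cluster.
valid = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(is_hard_partition(valid))
```

A matrix in which a sample appears in two clusters, or in which a cluster has no members, is rejected by the same function.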
3.2.1. Binary Variables
Binary variables are attributes of the clustered data objects that have only two states, 0 and 1. For example, for a variable that describes the presence of something, 1 means present and 0 means absent. Exactly one of the two states applies, and no third state exists. Binary variables can be subdivided into two types: symmetric and asymmetric. For symmetric binary variables, the two states are equally important, whereas asymmetric binary variables assign different weights to the two states [20].
The simple matching coefficient can be expressed as:
Similarly, for asymmetric binary variables, the similarity of two data objects is measured by the Jaccard coefficient. Supposing the values of p, q, r, and s are the same as above, the Jaccard coefficient is as follows:
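As an illustration of the two coefficients, the sketch below uses their standard textbook definitions on two objects described by 0/1 attributes. The four joint counts are given explicit, hypothetical names (`n11`, `n10`, `n01`, `n00`) rather than being mapped onto p, q, r, and s, and returning 0.0 when neither object has any attribute present is a convention chosen here.

```python
def binary_counts(a, b):
    """Count attribute positions by joint state for two equal-length 0/1 vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    if any(v not in (0, 1) for v in list(a) + list(b)):
        raise ValueError("vectors must contain only 0 and 1")
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # present in both
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # present only in a
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # present only in b
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)  # absent in both
    return n11, n10, n01, n00


def simple_matching(a, b):
    """Share of attributes on which the two objects agree (symmetric case)."""
    n11, n10, n01, n00 = binary_counts(a, b)
    return (n11 + n00) / (n11 + n10 + n01 + n00)


def jaccard(a, b):
    """Like simple matching, but joint absences are ignored (asymmetric case)."""
    n11, n10, n01, _ = binary_counts(a, b)
    denominator = n11 + n10 + n01
    return n11 / denominator if denominator else 0.0


a = [1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 1, 0, 0]
print(simple_matching(a, b), jaccard(a, b))
```

The two objects agree on four of six attributes, but only two of the four attributes present in at least one object are shared, so the Jaccard coefficient is lower than the simple matching coefficient.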
3.2.2. Ordinal Variable
Ordinal variables can have multiple different state values, and the difference measurement method can also be calculated with a simple matching coefficient:
3.2.3. Interval Scale Variables
Interval scale variables are continuous measures on a linear scale, such as width and length, height and weight, and air pressure and temperature. Before dividing data objects into categories, it is necessary to define a measure of difference or similarity that captures both the difference between data objects in different categories and the similarity of data objects in the same category. The usual approach is to measure the distance between data objects. Two data objects with n-dimensional attributes can be expressed as:
$$x_i = (x_{i1}, x_{i2}, \ldots, x_{in}), \qquad x_j = (x_{j1}, x_{j2}, \ldots, x_{jn}).$$
For the distance d between two data objects, the main distance functions are:
(1) Minkowski distance:
$$d_q(x_i, x_j) = \left( \sum_{m=1}^{n} |x_{im} - x_{jm}|^{q} \right)^{1/q}, \qquad q \ge 1.$$
When q takes 1, 2, and ∞, the Minkowski distance can be expressed as:
(i) Absolute distance:
$$d_1(x_i, x_j) = \sum_{m=1}^{n} |x_{im} - x_{jm}|.$$
(ii) Euclidean distance:
$$d_2(x_i, x_j) = \sqrt{\sum_{m=1}^{n} (x_{im} - x_{jm})^{2}}.$$
(iii) Chebyshev distance:
$$d_\infty(x_i, x_j) = \max_{1 \le m \le n} |x_{im} - x_{jm}|.$$
(2) Mahalanobis distance
The Minkowski distance described above is only appropriate in an ordinary Euclidean space. The attribute values of a data object, however, are usually random variables, and their components may be correlated. Therefore, the Mahalanobis distance between the ith sample and the jth sample can be expressed as:
$$d_M(x_i, x_j) = \sqrt{(x_i - x_j)^{\mathsf{T}} \, \Sigma^{-1} \, (x_i - x_j)},$$
where Σ is the covariance matrix of the attributes, assumed to be invertible.
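The distance functions named above can be sketched in a few lines of Python using only the standard library. The function names are illustrative, and `mahalanobis_2d` is a hypothetical helper restricted to two attributes so that the inverse covariance matrix can be written out by hand; a general implementation would use a linear algebra library.

```python
import math


def minkowski(x, y, q):
    """Minkowski distance of order q >= 1 between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if q < 1:
        raise ValueError("q must be at least 1")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def chebyshev(x, y):
    """Limit of the Minkowski distance as q grows: the largest coordinate gap."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return max(abs(a - b) for a, b in zip(x, y))


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance for 2-D vectors given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix must be invertible")
    dx, dy = x[0] - y[0], x[1] - y[1]
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    quadratic_form = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quadratic_form)


x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, 1))   # absolute distance
print(minkowski(x, y, 2))   # Euclidean distance
print(chebyshev(x, y))      # Chebyshev distance
print(mahalanobis_2d(x, y, [[1.0, 0.0], [0.0, 1.0]]))  # identity covariance
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance; a larger variance along one attribute shrinks that attribute's contribution.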
In clustering algorithms, distance is usually used as a very intuitive measure of difference. Euclidean distance is the most common choice and is the one used in this article [21]. Here we briefly introduce two important similarity coefficients. A similarity coefficient lies between −1 and 1, and the closer it is to ±1, the more similar the two samples are.
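(1) Cosine of Included Angle. In the cluster analysis, let the sample i in the p-dimensional space be:
$$x_i = (x_{i1}, x_{i2}, \ldots, x_{ip}).$$
Sample j is:
$$x_j = (x_{j1}, x_{j2}, \ldots, x_{jp}).$$
The cosine of the angle between two samples is used to express their similarity coefficient, and the cosine of the angle between the samples is recorded as:
$$\cos\theta_{ij} = \frac{\sum_{m=1}^{p} x_{im} x_{jm}}{\sqrt{\sum_{m=1}^{p} x_{im}^{2}} \, \sqrt{\sum_{m=1}^{p} x_{jm}^{2}}}.$$
(2) Correlation Coefficient. The correlation coefficient of sample i and j can be recorded as:
$$r_{ij} = \frac{\sum_{m=1}^{p} (x_{im} - \bar{x}_i)(x_{jm} - \bar{x}_j)}{\sqrt{\sum_{m=1}^{p} (x_{im} - \bar{x}_i)^{2}} \, \sqrt{\sum_{m=1}^{p} (x_{jm} - \bar{x}_j)^{2}}}.$$
Among them
$$\bar{x}_i = \frac{1}{p} \sum_{m=1}^{p} x_{im}, \qquad \bar{x}_j = \frac{1}{p} \sum_{m=1}^{p} x_{jm}.$$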
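The two similarity coefficients are closely related: the correlation coefficient is the cosine of the angle between the two samples after each has been centered on its own mean. A minimal sketch using the standard definitions follows; the function names are illustrative.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def correlation(x, y):
    """Correlation coefficient: cosine similarity of the mean-centered vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction
print(correlation([1, 2, 3], [3, 2, 1]))        # opposite trend
```

Both functions reject constant or all-zero inputs, for which the coefficients are undefined.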
3.2.4. Hierarchical Approach
The hierarchical clustering method decomposes the data set into groups (classes) level by level, forming a clustering tree. According to the direction of decomposition, it can be divided into top-down divisive (split) hierarchical clustering and bottom-up agglomerative hierarchical clustering. Agglomerative hierarchical clustering initially treats each data object as its own class and then merges classes level by level until no further merging is possible or a termination condition is met. Divisive hierarchical clustering regards all data objects as one class and then splits it step by step according to given rules, producing subclasses, until the desired clustering is reached. Figure 4 illustrates the agglomerative and divisive procedures on the data set {a, b, c, d, e}:
[figure(s) omitted; refer to PDF]
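The bottom-up procedure can be sketched as follows. This is a minimal illustration, assuming single linkage (the distance between two classes is the distance between their closest members) and hypothetical one-dimensional coordinates for the objects a to e; neither assumption comes from Figure 4, and the merge order depends on both.

```python
def agglomerative(points, target_clusters=1):
    """Bottom-up single-linkage clustering of labelled 1-D points.

    `points` maps a label to a numeric value. Returns the merges in the
    order performed and the clusters that remain when merging stops.
    """
    clusters = [frozenset([label]) for label in sorted(points)]
    merges = []

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(abs(points[a] - points[b]) for a in c1 for b in c2)

    while len(clusters) > target_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = linkage(clusters[i], clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j])))
        merged = clusters[i] | clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges, [sorted(c) for c in clusters]


# Hypothetical coordinates chosen only for illustration.
points = {"a": 1.0, "b": 1.5, "c": 5.0, "d": 5.4, "e": 9.0}
merges, clusters = agglomerative(points, target_clusters=2)
for left, right in merges:
    print(left, "+", right)
print(clusters)
```

Running the loop down to a single cluster records the full clustering tree; reading the merge list in reverse gives the order in which a divisive method would split the same tree.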
3.3. The Characteristics of Internal Audit Informatization under Big Data
In the professional field of internal auditing, the Global Technology Audit Guide (GTAG) series issued by the Institute of Internal Auditors (IIA) summarizes the connotation of “internal audit informatization”: it mainly includes information technology vulnerability management, information technology audit, information technology control, and so on. To realize the internal audit function under the guidance of the enterprise development strategy, the internal audit department supervises, evaluates, and optimizes the enterprise’s risk management, control, and corporate governance [22]. Modern information technology is used to build a big data audit platform based on cloud computing that collects the financial and business data generated in the operation of the enterprise extensively and in real time.
Compared with internal audit in the traditional sense, the characteristics of information-based internal audit are as follows: diversification of audit content, digitization of audit objects, intelligence of audit management, and modernization of audit technology.
3.3.1. Risk-Oriented Concept
In 2013, the Institute of Internal Auditors (IIA) released the three lines of defense model for effective risk management [23]. The first line of defense is business management and internal control, which mainly addresses risks in business operations. The second line of defense includes financial control, quality control, and similar functions. Its role is mainly to monitor the cost-effectiveness of business operations, and it also supervises the first line of defense to ensure its effectiveness. The third line of defense is internal audit, which builds on the first and second lines, focuses on the loopholes and risk points in the company’s operations, conducts key audits, and issues audit results. The “three lines of defense model for effective risk management” is shown in Figure 5.
[figure(s) omitted; refer to PDF]
The internal audit work is risk oriented, and a risk early warning system is established on the basis of the audit data warehouse. The system can automatically operate according to the program settings and issue early warning information in time, thereby reducing the risks in the business process. The functions of the risk early warning system are mainly realized through the following aspects:
(1) Establishing a risk early warning model. The risk early warning model is based on risk early warning indicators and covers most of the risk points in the company’s operations that the audit focuses on. The risk early warning model can automatically calculate and compare the data in the audit data warehouse, and then the auditor will further analyze the abnormal and fluctuating indicators to form a risk early warning report.
(2) The push and follow-up feedback mechanism of risk early warning reports. The audit department will promptly push the risk warning report to the responsible department and help the responsible department rectify and eliminate risks in a timely manner. In order to ensure the implementation of audit rectification opinions, the system will continue to monitor and feed back the implementation of the responsible department.
(3) Periodic reporting system to the management. The audit department will regularly form a special report on the results of risk early warning, the implementation of rectification of the responsible department, and the audit recommendations for improving risk management, which will be reported to the management to effectively improve the performance of the internal audit department and the status of the department.
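The first step of the mechanism above, comparing indicator values against a normal range and passing the abnormal ones to the auditor, can be sketched as follows. This is a minimal illustration only: the `Indicator` structure, the indicator names, and the thresholds are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    name: str
    lower: float  # smallest value considered normal
    upper: float  # largest value considered normal


def early_warnings(indicators, observed):
    """Return (name, value) pairs whose value falls outside the normal range."""
    warnings = []
    for indicator in indicators:
        value = observed.get(indicator.name)
        if value is None:
            continue  # no data for this indicator in the current period
        if value < indicator.lower or value > indicator.upper:
            warnings.append((indicator.name, value))
    return warnings


# Hypothetical indicators and thresholds, for illustration only.
indicators = [
    Indicator("receivables_turnover_days", 0, 90),
    Indicator("expense_to_revenue_ratio", 0.0, 0.35),
]
observed = {"receivables_turnover_days": 120, "expense_to_revenue_ratio": 0.2}
print(early_warnings(indicators, observed))
```

In a real system the flagged items would feed the risk early warning report that is pushed to the responsible department, and the same check would be rerun in later periods to follow up on rectification.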
3.3.2. Elements of Big Data Audit
“Big Data Audit System” mainly covers audit data collection, audit data storage management, and audit business application modules. The audit business application modules are mainly composed of audit early warning modules, audit support modules, and information access modules. The general implementation mode of “Big Data Audit System” is shown in Figure 6.
[figure(s) omitted; refer to PDF]
The main mode of operation of the big data audit system is as follows: first, the audit data warehouse regularly imports audit data from the BSS business system, ERP business system, network transportation system, and other application systems. The source data are processed by methods such as absorption, cleaning, and elimination, so that the data in the audit data warehouse meet the needs of the audit. Second, based on the audit data warehouse, the audit department has developed the application of the audit early warning system and the online monitoring system. These applications make full use of high-tech data mining technology (e.g., statistical technology, artificial intelligence, and neural network) and can perform in-depth processing of the data in the audit data warehouse. Finally, the analysis results are presented to the auditors (through visualization technology, such as electronic forms and other information access tools), and the auditors draw audit conclusions based on this.
4. Experiments on the Status Quo of Internal Auditing
4.1. Basic Survey of Internal Audit
China has a large number of enterprises of many different types. Compared with other types of enterprises, however, listed companies are more profitable and larger in scale, and there are many of them. Therefore, the analysis of the status quo of internal audit in Chinese enterprises focuses on listed companies in China. The relevant data on how listed companies implement internal audits were obtained mainly through a combination of databases and questionnaires (the data come from Juchao Consulting Network, the Guotaian Database, published statistical yearbooks, and the internal audit related systems of listed companies). The specific data are shown in Table 1.
Table 1
Listed companies on the Shanghai and Shenzhen stock exchanges that have announced internal audit related systems.
Stock exchange | Number of companies | Total number of listed companies | Percentage |
Shanghai Stock Exchange | 20 | 1336 | 1.49 |
Shenzhen Stock Exchange | 90 | 2027 | 4.44 |
Total | 110 | 3363 | 3.27 |
4.1.1. Basic Overview of Internal Audit
There are many types of corporate internal audit services. Common audit services include financial audit, internal control audit, operation audit, special audit, and risk audit. Statistics on this aspect are shown in Table 2.
Table 2
Scope of internal audit business.
Audit type | Number of companies | Percentage |
Financial Audit | 56 | 100 |
Internal control audit | 52 | 92.86 |
Business audit | 44 | 78.57 |
Special audit | 24 | 42.86 |
Risk audit | 15 | 26.79 |
It can be seen from the above survey results that the internal audit departments of the surveyed companies are involved in financial audit, internal control audit, business audit, special audit, and risk audit. Financial, internal control, and business audits account for relatively large proportions.
4.1.2. Academic Qualifications of Internal Auditors
According to the survey feedback, the 56 listed companies have a total of 456 internal auditors, an average of about 8 per company. Among them, 69 hold postgraduate degrees, 138 hold undergraduate degrees, 113 hold junior college diplomas, and 136 have a technical secondary school education or below. The specific situation is shown in Table 3.
Table 3
Educational background of internal auditors.
Educational background | Number of samples | Percentage |
Postgraduate | 69 | 15.13 |
Undergraduate | 138 | 30.26 |
Junior college | 113 | 24.78 |
Technical secondary school and below | 136 | 29.82 |
Total | 456 | 100 |
Listed companies should pay attention to the training of internal auditors and raise the threshold for entering internal audit institutions. This allows truly capable audit talents to join in and improve the comprehensive capabilities of the company’s entire internal audit team.
4.2. Investigation on the Implementation of Internal Audit Informatization
According to the survey feedback from the 56 listed companies, 7 are not currently considering the use of data mining technology in internal audit, 16 have considered it but have not implemented it, and 33 have considered it and implemented it. The specific data are shown in Table 4.
Table 4
Implementation of internal audit informatization.
Progress status | Number of subsamples | Percentage |
Not considered | 7 | 12.5 |
Considered but not implemented | 16 | 28.57 |
Considered and implemented | 33 | 58.93 |
Total | 56 | 100 |
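The percentages in Table 4 follow directly from the counts, as the short check below shows. The helper name `percentage_shares` is illustrative; the counts are those reported in the table.

```python
def percentage_shares(counts):
    """Convert a mapping of category counts to percentages rounded to two places."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


table_4 = {
    "Not considered": 7,
    "Considered but not implemented": 16,
    "Considered and implemented": 33,
}
print(percentage_shares(table_4))
```

The three shares are 12.5, 28.57, and 58.93, matching the table.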
Among the 23 listed companies that have not implemented data mining technology in internal audit, 4 believe that they already have sufficient audit capabilities and do not need to rely on data mining technology, 6 believe that using data mining technology in internal auditing carries certain risks, 7 have no relevant experience, and 6 cite a lack of regulatory support. The relevant data are shown in Table 5.
Table 5
Reasons for not implementing internal audit informatization.
Reason | Number of companies | Percentage |
Possess auditing capabilities | 4 | 17.39 |
There is a risk | 6 | 26.09 |
No experience | 7 | 30.43 |
Lack of regulatory support | 6 | 26.09 |
Total | 23 | 100 |
The survey shows that more than 50% of the listed companies ultimately implemented internal audit informatization. However, some companies have considered data mining technology for internal audit without implementing it, and others do not consider using it at all. To a large extent, this shows that Chinese enterprises give insufficient consideration to the factors that affect the informatization of internal auditing, and that the implementation of informatization lacks specific implementation paths and management rules to guide and regulate it.
5. Internal Audit Analysis under Big Data
5.1. The Overall Level of Internal Audit Resource Integration
At present, resource integration in corporate internal audit practice is often carried out only in a fragmented way and is not treated as systematic work. Its main purpose is to make up for shortages of certain types of resources in the execution of internal audit projects. This approach can indeed achieve the necessary integration of certain audit resources that were originally independent. However, it lacks a holistic view: good experience is difficult to pass on, feedback on existing problems is difficult to give, and complete information about the process is difficult to preserve. As a whole, it is unlikely to have lasting significance for improving the level of internal audit management.
The internal audit work of the surveyed companies maintains a high degree of independence, and the acquisition of company-related information is generally smooth and comprehensive. However, many problems remain to be solved in practical work. The degree of integration of each element of internal audit resources is scored on three levels: low (1–10), medium (11–20), and high (21–30). Figure 7 compares the internal audit resource integration of the interviewed companies that did not use big data for auditing with that of the companies that did:
[figure(s) omitted; refer to PDF]
As can be seen from the figure, the application of data mining to big data has optimized all aspects of internal audit. However, because the structure of the audit team has only just been adjusted, unstable factors remain, such as the insufficient audit experience of some personnel. Newly transferred internal auditors may not be able to carry out large-scale audit projects quickly and independently, which affects the allocation of human resources across audit projects to a certain extent. In addition, in recent years, because some audit projects are large in scale, complex in nature, and wide-ranging in the expertise they require, the audit department has seconded professionals from the corporate finance department to complete audit tasks. Although the final audit results have been affirmed by the company, the lack of a mature secondment mechanism and the temporary establishment of audit project teams may disrupt the original work of other departments. These contradictions have yet to be resolved.
5.2. Accuracy Analysis of Clustering Algorithm
To verify the performance of the improved algorithm, this article conducts experiments on four data sets from the UCI Machine Learning Repository. The algorithm was run 10 times on each data set, the average value was recorded, and the results were compared with those of the traditional K-means clustering algorithm. The result is shown in Figure 8:
[figure(s) omitted; refer to PDF]
From the above four sets of experimental data, we can see that the method proposed in this article has a higher accuracy rate. Its accuracy is greatly improved compared with the traditional K-Means clustering algorithm. At the same time, in order to compare the ability of the two algorithms to obtain the best number of clustering categories, this study runs 50 times on each of the four data sets. The result is shown in Figure 9.
[figure(s) omitted; refer to PDF]
The data in the figure show very intuitively that the clustering algorithm proposed in this article is far better than the traditional clustering algorithm at obtaining the best number of clustering categories. This benefits from the strong global optimization capability of the algorithm.
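For reference, the traditional K-means baseline used in these comparisons can be sketched as below. This is the standard Lloyd iteration, not the improved algorithm of this article. The initial centers are passed in explicitly so that the example is reproducible, whereas the traditional algorithm usually picks them at random, which is one reason its results vary from run to run. The sample points are hypothetical.

```python
import math


def kmeans(points, initial_centers, max_iter=100):
    """Standard K-means on a list of equal-length numeric tuples.

    Returns (labels, centers), where labels[i] is the index of the center
    assigned to points[i].
    """
    centers = [tuple(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center (Euclidean).
        new_labels = [
            min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for i in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Two well-separated hypothetical groups; both initial centers start in the first.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, initial_centers=[(0, 0), (1, 0)])
print(labels)
print(centers)
```

Because the outcome depends on the starting centers and the number of clusters must be supplied in advance, repeated runs and averaging, as in the experiments above, are the usual way to compare K-means against an alternative.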
The inertia adjustment coefficient h also affects the accuracy of the algorithm. A proper inertia adjustment coefficient effectively corrects the inertia coefficient of the particles so that they show the behavior required at each stage of the search. To obtain the best inertia adjustment coefficient, this article selects a set of consecutive values of the coefficient, clusters the four data sets with each of them, and plots the results as shown in Figure 10.
[figure(s) omitted; refer to PDF]
The figure shows that accuracy varies with the inertia adjustment coefficient differently on different data sets, and the best inertia adjustment coefficient also differs from one data set to another. When the inertia adjustment coefficient is too large, the running state of the particles overcorrects their next movement. This makes the overall inertia factor larger and greatly reduces the particles’ local search ability, which lowers the accuracy of the algorithm.
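The terms used here (particles, inertia coefficient, global optimization) are those of particle swarm optimization (PSO). Assuming that PSO is the optimizer behind the improved clustering algorithm, which this section does not state explicitly, the standard PSO update for particle i can be written as follows. The symbols are the usual PSO notation and are not taken from the article; in particular, x here denotes a particle position, not a data sample:

$$v_i^{t+1} = w \, v_i^{t} + c_1 r_1 \left( p_i - x_i^{t} \right) + c_2 r_2 \left( g - x_i^{t} \right), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1}.$$

Here w is the inertia weight, c_1 and c_2 are acceleration constants, r_1 and r_2 are random numbers in [0, 1], p_i is the best position found so far by particle i, and g is the best position found by the whole swarm. A large w lets the previous velocity dominate the update, which favors global exploration at the expense of local search. How the inertia adjustment coefficient h modifies w is specific to the article's algorithm and is not shown here.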
Based on the above analysis, we can conclude that the accuracy of the improved clustering algorithm has increased by 31.4%. Its optimal clustering ability has increased by 20.7%, and the company’s internal audit resources have been improved by 17.4%. It can be seen that while the improved algorithm has greatly improved its performance, it also has a greater role in promoting the company’s internal audit.
6. Conclusions
This article mainly studies how to improve the company’s internal audit. It uses data mining technology to analyze big data in an integrated way so that the company’s internal audit work can be improved. First, it analyzes the clustering algorithm, a key algorithm of data mining technology, and optimizes and improves it, which makes the improved algorithm better suited to handling the company’s internal audit issues. Then, in the experiment and analysis part, it carries out a comparative analysis of audit resource integration and algorithm performance and concludes that the improvement to the algorithm has a very good effect.
[1] B. Huang, J. Wei, Y. Tang, L. Chang, "Enterprise risk assessment based on machine learning," Computational Intelligence and Neuroscience, vol. 2021, DOI: 10.1155/2021/6049195, 2021.
[2] H. Yang, R. Su, P. Huang, Y. Bai, K. Fan, K. Yang, H. Li, Y. Yang, "PMAB: a public mutual audit blockchain for outsourced data in cloud storage," Security and Communication Networks, vol. 2021, 2021.
[3] A. Buczak, E. Guven, "A survey of data mining and machine learning methods for cyber security intrusion detection," IEEE Communications Surveys & Tutorials, vol. 18 no. 2, pp. 1153-1176, 2017.
[4] L. Xu, C. Jiang, J. Wang, J. Yuan, Y. Ren, "Information security in huge amount of data: privacy and data mining," IEEE Access, vol. 2 no. 2, pp. 1149-1176, 2017.
[5] I. Kavakiotis, O. Tsave, A. Salifoglou, N. Maglaveras, I. Vlahavas, I. Chouvarda, "Machine learning and data mining methods in diabetes research," Computational and Structural Biotechnology Journal, vol. 15 no. C, pp. 104-116, DOI: 10.1016/j.csbj.2016.12.005, 2017.
[6] V. Chaurasia, S. Pal, "A novel approach for breast cancer detection using data mining techniques," Social Science Electronic Publishing, vol. 3297 no. 1, pp. 2320-9801, 2017.
[7] X. Yan, L. Zheng, "Fundamental analysis and the cross-section of stock returns: a data-mining approach," Review of Financial Studies, vol. 30 no. 4, pp. 1382-1423, DOI: 10.1093/rfs/hhx001, 2017.
[8] T. Emoto, T. Yamashita, T. Kobayashi, N. Sasaki, Y. Hirota, T. Hayashi, A. So, K. Kasahara, K. Yodoi, T. Matsumoto, T. Mizoguchi, W. Ogawa, K.-i. Hirata, "Characterization of gut microbiota profiles in coronary artery disease patients using data mining analysis of terminal restriction fragment length polymorphism: gut microbiota could be a diagnostic marker of coronary artery disease," Heart and Vessels, vol. 32 no. 1, pp. 39-46, DOI: 10.1007/s00380-016-0841-y, 2017.
[9] H. Hong, P. Tsangaratos, I. Ilia, J. Liu, A.-X. Zhu, W. Chen, "Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China," The Science of the Total Environment, vol. 625 no. 1, pp. 575-588, DOI: 10.1016/j.scitotenv.2017.12.256, 2018.
[10] Y. Huang, T. Li, C. Luo, H. Fujita, S.-j. Horng, "Matrix-based dynamic updating rough fuzzy approximations for data mining," Knowledge-Based Systems, vol. 119, pp. 273-283, DOI: 10.1016/j.knosys.2016.12.015, 2017.
[11] J. Zhao, Y. Wang, "A novel massive big data analysis of educational examination research using a linear mixed-effects model," Complexity, vol. 2021 no. 6, DOI: 10.1155/2021/3752598, 2021.
[12] Y. Liu, Z. Fang, F. Chen, "Impact of PM2.5 environmental regulation based on big data for green technology development," Arabian Journal of Geosciences, vol. 14 no. 7, DOI: 10.1007/s12517-021-06900-2, 2021.
[13] R. Kemp, A. Smith, R. Smith, "Audit, validation, verification and assessment for safety and security standards," Journal of Cybersecurity and Information Management, vol. 7 no. 1, pp. 22-50, DOI: 10.54216/jcim.070103, 2021.
[14] A. I. Mokhtar, S. Metawa, "Investor psychology perspective: a deep review on behavioral finance," American Journal of Business and Operations Research, vol. 0 no. 1, DOI: 10.54216/ajbor.000101, 2019.
[15] M. A. Fernández-Gámez, F. García-Lagos, J. R. Sánchez-Serrano, "Integrating corporate governance and financial variables for the identification of qualified audit opinions with neural networks," Neural Computing & Applications, vol. 27, pp. 1427-1444, 2016.
[16] W.-P. Tsai, S.-P. Huang, S.-T. Cheng, K.-T. Shao, F.-J. Chang, "A data-mining framework for exploring the multi-relation between fish species and water quality through self-organizing map," The Science of the Total Environment, vol. 579 no. 1, pp. 474-483, DOI: 10.1016/j.scitotenv.2016.11.071, 2017.
[17] F. Marozzo, D. Talia, P. Trunfio, "A workflow management system for scalable data mining on clouds," IEEE Transactions on Services Computing, vol. 11 no. 3, pp. 480-492, DOI: 10.1109/tsc.2016.2589243, 2018.
[18] G. Cheon, K.-A. N. Duerloo, A. D. Sendek, C. Porter, Y. Chen, E. J. Reed, "Data mining for new two- and one-dimensional weakly bonded solids and lattice-commensurate heterostructures," Nano Letters, vol. 17 no. 3, pp. 1915-1923, DOI: 10.1021/acs.nanolett.6b05229, 2017.
[19] W. Rupesh, J. Sagar, S. Rahul, "An internal intrusion detection and protection system by using data mining and forensic techniques," IEEE Systems Journal, vol. 11 no. 2, 2017.
[20] J. Bai, He Tian, "Research on audit data analysis and decision tree algorithm for benefit distribution of enterprise financing alliance," Scientific Programming, vol. 2021, DOI: 10.1155/2021/1910156, 2021.
[21] Z. Xu, G. Zhu, N. Metawa, Q. Zhou, "Machine learning based customer meta-combination brand equity analysis for marketing behavior evaluation," Information Processing & Management, vol. 59 no. 1, DOI: 10.1016/j.ipm.2021.102800, 2022.
[22] L. Zheng, W. Hu, Y. Min, "Raw wind data preprocessing: a data-mining approach," IEEE Transactions on Sustainable Energy, vol. 6 no. 1, pp. 11-19, 2017.
[23] C.-J. Tseng, C.-J. Lu, C.-C. Chang, G.-D. Chen, C. Cheewakriangkrai, "Integration of data mining classification techniques and ensemble learning to identify risk factors and diagnose ovarian cancer recurrence," Artificial Intelligence in Medicine, vol. 78, pp. 47-54, DOI: 10.1016/j.artmed.2017.06.003, 2017.
Copyright © 2022 Nan Nan. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/
Abstract
Auditing based on big data is the trend of future audit development. The technical environment provides a support platform for continuous auditing. As the development of information technology promotes the merger of financial services, the company’s business operations have been digitized, and traditional paper-based auditing is facing change. This article studies the integration and development of enterprise internal audit and big data based on data mining technology. To this end, it proposes a big data audit system, improves and optimizes the clustering algorithm (the key algorithm of data mining), and designs experiments and analysis to examine the effect and performance of the improvements, so that the algorithm better suits the research topic. The experimental results show that the improved big data audit system raises the resource perfection of internal audit by 17.4%. The accuracy of the improved algorithm increases by 31.4% and its optimal clustering ability by 20.7%, so it can be applied well to the company’s internal audit.