1. Introduction
As a research domain that has entered its third decade of life, process mining (PM) has flourished in the past few years in both research and applications. Following the Process Mining Manifesto [1], which is considered to be the state-of-the-art for PM, numerous primary studies have been developed for all three areas of the field (process discovery, process conformance, and process enhancement). Following the paradigm of other software engineering fields [2,3,4,5], it is observed that there is a considerable number of secondary studies for PM for a variety of domains [6,7,8,9,10]. Process mining has gained importance due to its capability to bridge the gap between data mining and business process management. This area of research provides valuable insights into real business processes by analyzing event logs.
Systematic literature reviews (SLRs), systematic mapping studies (SMSs), and surveys are all classified as secondary studies [11] as they evaluate and interpret research conducted in a specific domain regarding a specific set of research questions [12]. In the area of software engineering, secondary studies are an important means to aggregate and represent knowledge for a specific topic [11,12,13,14], as they can highlight gaps in the existing research and, at the same time, provide insights into methodologies and tools to be used by practitioners [15,16]. A secondary study must be unbiased, auditable, and repeatable [12].
As a research domain matures and the number of secondary studies increases, a research synthesis in the form of a tertiary study can further identify crucial areas of the specific field and generate future research questions that have not yet been addressed [17]. A tertiary study assesses and analyzes a set of secondary studies and through a research synthesis, following the methodology of an SLR, provides new insights into a specific area [15,18,19]. In other fields of software engineering, several tertiary studies have already been conducted [20,21,22,23]. However, there is still a need for more studies [24]. Reviewing the research literature so far, in 20 years of research on PM, to the best of our knowledge, there has not been a tertiary study for this field. With an increasing number of secondary studies in PM in the past few years, we believe that the field of PM has matured enough for a tertiary study to be conducted.
This tertiary study covers the SLRs and SMSs published on PM until March 2023. Insights into available secondary research in PM and important attributes such as the number of primary studies, quality score, and area of focus are presented in a tabular format. Tabulating data is a useful means of aggregation that is used in conjunction with textual justification to respond to the research questions posed [25]. We follow the same methodology as an SLR based on the guidelines proposed by Kitchenham and Charters [12] and Petersen et al. [26]. This study identifies current trends, reports on the quality and demographics of the studies included, and answers a broad set of research questions posed in the following section to provide a clear mapping of the domain.
The scope of this study includes all secondary studies in different areas of PM. The quality of each secondary study is assessed to provide an in-depth analysis of the overall quality of the studies published in this field, both in journals and as conference papers. The quality assessment was performed using the Database of Abstracts of Reviews of Effects (DARE) criteria of the University of York Centre for Reviews and Dissemination [27]. Previous reviews have utilized systematic literature reviews and meta-analyses. However, the tertiary study approach provides a unique perspective by synthesizing findings from secondary studies, offering a broader understanding of existing research gaps and trends. To ensure that the findings of this study have been appropriately reported, the checklist and guidelines proposed by Budgen et al. [28] and Ampatzoglou et al. [29] have been considered.
The rest of this study is structured as follows: The Research Method Section (Section 2) thoroughly explains the methodology followed to ensure that this work is auditable and repeatable in the future. The Results/Findings Section (Section 3) analyzes the research questions, and Section 4, Discussion, discusses the results. Finally, we provide our conclusions, limitations, and future work in Section 5.
2. Research Method
As mentioned, this paper follows the same methodology as an SLR based on the guidelines proposed by Kitchenham and Charters [12] and Petersen et al. [26]. The structure of this paper is similar to that of Khan et al. [24] since it complies with all the standards of a high-quality tertiary study and is extensive in covering a broad range of research questions that would help us to derive important and valuable conclusions applied in the field of process mining. Figure 1 shows the procedure followed for the formulation of this study. In Step 1, the research questions are presented based on the objectives mentioned above. Step 2 contains the activities of the search procedure, including the definition of search terms, preparing the search queries for each database, online database selection, and the actual search process (the search procedure is illustrated in Figure 2). In Step 3, the authors select the studies by applying the inclusion and exclusion criteria. Step 4 presents the quality assessment of the selected studies using the DARE guidelines, whereas during Step 5, information is extracted from each study. Finally, Step 6 synthesizes the data and presents the results.
2.1. Formulating Research Questions
The main objective of this study, from which the research questions were formulated, is to map the information provided by secondary studies in the field of PM across all research areas, as PM is still a relatively new research topic. A further goal is to examine the quality of these secondary studies and the areas they focus on. The research questions (RQs) addressed by this study are presented below.
RQ1: Which research areas are addressed by SMSs/SLRs in process mining?
To answer this research question, the existing SMSs and SLRs in the field of PM are classified according to the specific areas they focus on. The resulting thematic analysis is based on the title, abstract, and keywords of each study.
RQ2: What are the trends relating to the quality of published SMSs/SLRs?
Each one of the studies included in this research is evaluated using quality assessment guidelines and DARE criteria [27]. To define the quality of each study, a set of criteria is examined by reviewing the studies and their included primary studies. This research question allows us to draw valid conclusions regarding the quality of the studies over time and the quality of the studies based on their venue.
RQ3: What are the current trends in SMSs/SLRs in process mining relating to guidelines, data sources, and the number of included studies?
This research question identifies the most followed guidelines for SMSs/SLRs in PM. In addition, the trends based on the digital libraries used and the number of primary studies included are also revealed.
RQ4: What are the demographics of published SMSs/SLRs in process mining?
The final research question identifies interesting findings related to the most highly cited papers in PM, the publication year and venues, and the distribution of research worldwide.
2.2. Search Procedure
The search process began on 1 February 2023, with the aim of providing a comprehensive overview of the process mining literature over a specified timespan. The search string defined for this research was:
(“process mining” OR “workflow mining”) AND (“systematic literature review” OR SLR OR “systematic review” OR “literature review” OR “systematic mapping” OR “mapping study” OR “survey”).
Since several databases were used for the search, the search string was slightly modified for searches in each one of the databases to ensure that the most accurate result in each case would be retrieved. The search was executed in March 2023 (cutoff date 7 March 2023), and the steps for conducting the search and selecting the studies to participate in this research are shown in Figure 2.
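As an illustration only, the search string above can be assembled from its two term groups. The sketch below is not the authors' tooling: the function names are hypothetical, and the per-database variants are not reproduced because the text does not list them.

```python
# Term groups copied from the search string quoted above.
TOPIC_TERMS = ["process mining", "workflow mining"]
STUDY_TERMS = [
    "systematic literature review",
    "SLR",
    "systematic review",
    "literature review",
    "systematic mapping",
    "mapping study",
    "survey",
]


def quote(term: str) -> str:
    """Quote a term unless it is an all-caps acronym such as SLR.

    Assumption: this rule is chosen only because it reproduces the
    quoting pattern of the published string.
    """
    return term if term.isupper() else f'"{term}"'


def or_group(terms: list[str]) -> str:
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(topic_terms: list[str], study_terms: list[str]) -> str:
    """Combine the topic group and the study-type group with AND."""
    return f"{or_group(topic_terms)} AND {or_group(study_terms)}"


query = build_query(TOPIC_TERMS, STUDY_TERMS)
print(query)
```

Keeping the term groups as lists makes it straightforward to regenerate the string when a database requires a different operator or field syntax, which is the kind of slight modification described above.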
The search was conducted using six primary databases: ACM Digital Library, IEEE Xplore, ScienceDirect, SpringerLink, CiteSeerX Library, and Wiley. However, the included studies referenced a broader range of 13 databases, which were used in their respective methodologies. The database selection was based on best practices according to the recent literature and the ones most used by other authors to formulate tertiary studies, as well as SMSs and SLRs [30,31]. The exact number of papers extracted from each database is illustrated in Figure 2.
During the search process, we first searched each database to retrieve an initial set of studies. We then applied the inclusion and exclusion criteria to eliminate non-relevant studies. Finally, both authors manually searched Google Scholar using the same terms as those in the search query to identify any relevant research that was not returned by the digital libraries used. No additional studies were identified in this step.
2.3. Study Selection (Inclusion/Exclusion Criteria)
Initially, 1476 papers were retrieved by applying the search strings in the online databases. Both authors applied the inclusion and exclusion criteria shown in Table 1 to all returned studies. A spreadsheet was used to analyze each study based on the pre-established criteria (file available upon request to the authors). At the end of this process, the results of both authors were compared, and conflicting results were resolved through discussion. Eventually, out of the 1476 studies originally extracted, 25 were selected to be further examined and included in this research (Table 2).
Quality Assessment (DARE and Questions)
It is of paramount importance to review the quality of the included studies in any secondary and tertiary study [15,52]. The Database of Abstracts of Reviews of Effects (DARE) defines a set of quality criteria for evaluating scientific papers [27]. These criteria are well-accepted in the scientific community as a basis for the qualitative evaluation of studies, and thus, we evaluated our selected studies based on these (Table 3). Table 3 is designed based on the criteria outlined in the Centre for Reviews and Dissemination databases: value, content, and developments.
Each quality attribute is scored 0, 0.5, or 1, and the total quality score for each study is the sum of the scores of all five quality attributes. Each study is then classified as low (quality score between 0.5 and 2), medium (quality score between 2.5 and 3.5), or high (quality score between 4 and 5).
Both authors individually assessed the studies for each one of the quality criteria. When a consensus was not achieved, the two authors discussed the study and resolved any discrepancies. The quality score for each one of the studies included in this tertiary study can be found in the data extraction form. The detailed scores for each quality attribute are shown in Appendix A (Table A1). The quality of the studies is further analyzed in the relevant research questions below and in Section 4.
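The scoring rule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' instrument: the function names and the example marks are hypothetical, and because the text does not assign a band to a total of 0, the sketch labels that case "unspecified".

```python
ALLOWED_MARKS = {0.0, 0.5, 1.0}
NUM_ATTRIBUTES = 5  # five quality attributes, as stated in the text


def total_quality_score(marks: list[float]) -> float:
    """Sum the marks of the five quality attributes of one study."""
    if len(marks) != NUM_ATTRIBUTES:
        raise ValueError("expected exactly five attribute marks")
    if any(m not in ALLOWED_MARKS for m in marks):
        raise ValueError("each mark must be 0, 0.5, or 1")
    return float(sum(marks))


def quality_band(score: float) -> str:
    """Map a total score to the low/medium/high bands given in the text."""
    if 4.0 <= score <= 5.0:
        return "high"
    if 2.5 <= score <= 3.5:
        return "medium"
    if 0.5 <= score <= 2.0:
        return "low"
    # Assumption: a total of 0 is not covered by the bands in the text.
    return "unspecified"


# Hypothetical marks for one study (not taken from Table A1).
example_marks = [1, 1, 0.5, 1, 1]
score = total_quality_score(example_marks)
print(score, quality_band(score))
```

Because every mark is a multiple of 0.5, every total is also a multiple of 0.5, so the three bands leave no gaps between 0.5 and 5.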
2.4. Data Extraction
For each of the included studies, information was extracted to facilitate the analysis and synthesis of the results. The data extraction attributes are illustrated in Table 4.
The authors performed an in-depth examination of each of the studies included in this research and summarized the findings in separate documents. The extracted information was then used to answer the research questions and formulate the results. The detailed extraction forms for each study can be found online (extraction forms available upon request to the authors). Table 5 illustrates a summary of the key extracted information.
3. Results—Findings
In this section, we present the findings and the answers to the research questions based on the data extracted from each of the included studies. This corresponds to the final step of the procedure shown in Figure 1.
3.1. Data Synthesis
3.1.1. RQ1: Which Research Areas Are Addressed by SMSs/SLRs in Process Mining?
All studies were classified according to their main area of concern. Both authors identified this by reading their titles, keywords, and abstracts. No major discrepancies were identified in the perceptions of both authors. Despite the small number of SMSs and SLRs in process mining, numerous research themes are explored by researchers, as indicated in Table 5. Most of the focus is on process mining areas (technologies and applications) and healthcare, as 72% of the included studies fall under these domains. Research interest is shown in other related information technology areas, such as agile software development, cybersecurity, etc.
Having reviewed the literature for a suggested thematic analysis of PM [61], we concluded that there is no guidance for PM at the moment, other than the three areas of PM introduced in the PM Manifesto [1]. Therefore, we briefly present below some insights on the thematic areas addressed by the included studies.
Process mining areas (technologies and applications): S2 constitutes a state-of-the-art analysis related to mining software processes. The authors focus on the analysis of research topics, data sources, and mining techniques and tools. The included studies are mapped based on the associated mining tools, identified correlations in research topics, and data sources used. Both S3 and S18 address PM topics in the area of education. S3 applies PM techniques in educational data and the synthesis is based on PM types and perspectives, PM tools, algorithms and analysis techniques, and methodologies used. S18 performs an analysis of studies applying PM in learning process design. The included studies are analyzed based on their research objectives, the PM technique, and the tools used. S4 is a secondary study that focuses on the event logs used for PM. Among others, the main classification dimensions used are model type and language, type of implementation, and type of evaluation data. In S8, the active research topics of PM and their main publishers are mapped. The studies are classified by the domains they explore or the industry segment used for PM. The most active research topics are associated with process discovery algorithms, conformance checking and architecture tools and improvements. Goal-oriented PM is examined in S11. The authors conclude that the use of PM in association with goals does not yet have a coherent line of research. S12 is a broad study on the field of PM that systematically assesses PM scenarios. More than 70% of the studies evidenced discovery as the main type of PM addressed. The study also analyzes the included studies by the technique used and the presented graphical results. S16 is the only study investigating PM in recommender systems. 
The authors provide a taxonomy of published studies in this field, based on the type and perspective of recommendations, a list of datasets and evaluation metrics used, implementation environments, and the algorithms used. A taxonomy consisting of four different approaches for addressing process complexity and the impact of contributing factors on PM results is the focus of the SLR undertaken in S19. Finally, S25 presents an SLR on process discovery and conformance-checking metrics for duplicate tasks. It does not include a classification or taxonomy of the primary studies. The authors provide a comprehensive review of the current work to assist researchers in locating the right PM tools and metrics.
Healthcare: Using the included primary studies, S1 creates a concept map and classification according to certain characteristics (type of research, contribution, healthcare specialty, and mining activity). This study is one of the two studies with the highest quality score (4.5 out of 5) included in this research and it was published as an open-access paper. The characteristics that comprised its overall quality score can be found in the Appendix (Table A1). S5 includes primary studies that use PM for conformance verification of healthcare processes and is mostly focused on tools applying PM techniques in an intensive care unit. No thematic analysis or classification of the studies is included in S5. On the other hand, S6 reviews the PM literature, focusing on studies in healthcare, and discusses the main challenges and the trends in objective, type, perspective, algorithms and tools, medical facilities, medical fields, medical process types, medical data, and preprocessing. Although S7 is also presented in the healthcare area, it is mostly focused on oncology as it investigates the PM of the eHealth records of cancer patients. It performs a thematic review, identifying some key themes such as process and data types, research questions, techniques/perspectives and tools, methodologies and limitations, and future work. The thematic areas of S1, S6, and S7 vary, indicating that the field is very broad; thus, there is room for different research focusing on various aspects. S9 is a literature review of the usage of PM in healthcare. The main analysis aspects (process and data types, frequently posed questions, PM techniques, perspectives and tools, methodologies, implementation and analysis strategies, geographical analysis, and medical fields) cover most of the areas addressed by studies S1, S5, S6, and S7. In addition, S9 also lists the most frequently used categories and areas for future research. 
S21 evaluates the use of PM in healthcare while focusing on similar characteristics as the ones mentioned for S9. Specifically, the strategy and algorithm used, the location used, and the main contributions for the identified application are investigated by the research questions. Similarly, S22 identifies the PM algorithms, techniques, tools, methodologies, and approaches used in PM research in the healthcare domain. Finally, one of the latest studies in the area, S23, considers three novel review dimensions (the PM project stages, the involvement of domain expertise, and the KPIs considered during the PM analysis). The authors highlight the evolution of research in this field by considering time trends within the review dimensions.
Agile software development: S10 and S15 are focused on agile software development. They both investigate the usage areas of PM in agile software development based on algorithms, data sources, data collection mechanisms, and the analysis techniques and tools used. In addition, S15 classifies the proposed approaches in agile software development methodologies that use PM based on venues, tools, PM areas, and evaluation metrics.
Cybersecurity: Two studies (S17 and S24) address PM in the cybersecurity domain. S17 summarizes the related published studies and identifies the existing efforts on methods, datasets, tools, and frameworks. On the other hand, S24 focuses on the potential of PM to aid in cybersecurity and software reliability. The authors collect existing process mining applications and discuss current and future trends.
Industrial applications: S13 is the only study analyzing the application of PM techniques in relation to the industrial context. The authors extract data concerning the software tools, algorithms, and techniques used for PM and they acknowledge that healthcare is the most applicable industrial domain for PM.
Cloud-based applications: The authors in S14 examine the applicability of PM techniques to cloud-based applications. Their discussion reports on algorithms, tools, verification approaches taken, and cloud-specific challenges.
Software engineering: S20 is the only study in the software engineering domain and the second of the two studies with the highest quality score (4.5 out of 5) included in this research. It was published as a conference paper. The authors present state-of-the-art PM research in software engineering, focusing on characterizing the most complex software processes. The included studies are classified according to the publication type, conceptual definition of PM perspectives (control-flow, organizational, case, and time), implementation in case studies, and the impact towards the challenges and opportunities identified in the field.
3.1.2. RQ2: What Are the Trends Relating to the Quality of Published SMSs/SLRs?
The complete table with the quality scores for each study is shown in Appendix A. Based on this assessment, all quality criteria, except for Assessment of Quality, are almost always used by the studies included in this research. A partial assessment of the quality of the primary studies is only observed in studies S1 and S19. A complete assessment is only observed in studies S17 and S20. The fact that the quality assessment of primary papers is absent in most of the studies is reflected in their overall quality score, as most of them can be classified as being of low and medium quality. In contrast, all four studies performing at least a partial quality assessment of the primary studies are classified as being of high quality.
Quality of Published SMSs/SLRs over Time
The quality score and publication year of each study included in this research are shown in Table 5. The average quality score for SMSs/SLRs in the field of process mining is 3.5, indicating that the studies published are mostly in the upper bounds of medium quality. By analyzing the quality scores per year (Figure 3), we observe similar quality levels for studies published between 2016 and 2019, with an average score of 3.5. The quality seems slightly lower for studies published in 2020. The highest average quality score is observed for 2021. However, generic conclusions cannot be drawn, as there was only one study among the included studies published in 2021. Finally, the trend in the average quality score returns to 3.5 for 2022, confirming that for the years covered in this research, the studies included are of medium quality, a trend that remains unaltered throughout the years, with a few exceptions, based on the included studies in this research.
Comparison of Quality for SMSs/SLRs Published in Journals and SMSs/SLRs Published in Conference Papers
The studies included in this research were almost evenly distributed in terms of publication venue (journal versus conference papers). Thirteen studies (S1, S4, S8, S9, S11, S12, S13, S16, S19, S21, S22, S23, and S24) were published in journals. Twelve studies (S2, S3, S5, S6, S7, S10, S14, S15, S17, S18, S20, and S25) were published at conferences. The average quality for studies published in journals is 3.5, whereas for studies published at conferences, the quality is slightly lower at 3.45 (Table 6). As a result, we observe that the quality is essentially the same regardless of the venue of a paper (journal or conference paper). However, with a closer look at the quality scores per year, we can see that since 2020, conference papers appear to have increased their average quality (3.3 in 2020). The average quality score for journal papers was 3.7 up to 2020. We observe that for the studies included in this research, there appears to be a slight decrease in the average quality score for journal papers and a slight increase in the average quality score for conference papers in the past 2 years.
Taking a closer look at the quality of journal papers, we identified which papers were published in open-access journals and which were not. Five out of thirteen were published in open-access journals and the remaining eight were published in subscription journals. As indicated in Table 7, from the studies included in this tertiary study, the papers published in open-access journals are of higher average quality. More specifically, the average quality score for open-access studies included in this paper is 3.9, with a gap of 0.65 in the average quality score from the rest of the studies published in subscription journals. Although the small number of studies does not allow us to draw generic conclusions about the average quality score for open-access journals, this is an indication that papers published in open-access journals tend to be of higher quality. Future research can further investigate this and reach more generic conclusions.
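The averages reported above can be cross-checked with a weighted mean. The sketch below uses only figures stated in the text and assumes that the "gap of 0.65" means the subscription-journal average is 3.9 - 0.65 = 3.25, a value the text does not state explicitly.

```python
def weighted_mean(groups: list[tuple[int, float]]) -> float:
    """Mean of group means, weighted by the number of studies per group."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Journal papers: 5 open-access (mean 3.9) and 8 subscription.
open_access = (5, 3.9)
subscription = (8, 3.9 - 0.65)  # assumption: gap interpreted as a difference
journal_mean = weighted_mean([open_access, subscription])

# All studies: 13 journal papers (mean 3.5) and 12 conference papers (mean 3.45).
overall_mean = weighted_mean([(13, 3.5), (12, 3.45)])

print(round(journal_mean, 2))  # consistent with the reported journal average
print(round(overall_mean, 1))  # consistent with the reported overall average
```

Under that assumption, the open-access and subscription averages recombine to the journal average of 3.5, and the journal and conference averages recombine to roughly the overall average of 3.5 reported for all 25 studies.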
3.1.3. RQ3: What Are the Current Trends in SMSs/SLRs in Process Mining Relating to Guidelines, Data Sources, and the Number of Included Studies?
Guidelines Most Frequently Cited for Conducting SMSs/SLRs in the Area of Process Mining
The authors read the bodies of the included studies to extract the guidelines used. In the case that more than one of the guidelines was mentioned, all of them were accounted for in the data extraction process. Regarding the guidelines that were more often used in SMSs/SLRs in the process mining domain, the guidelines by Kitchenham [11,12,31] are the most widely used. In addition, some other guidelines are followed, as illustrated in Table 5. Three of the studies do not report any specific guidelines for their research processes and syntheses (reported as n/a in the table below). However, they have been included in this study as they include a structure and methodology similar to the guidelines reported in the literature (search strategy, inclusion/exclusion criteria, quality assessment, data extraction and synthesis, etc.).
Online Databases Most Frequently Used to Search for Primary Studies
The included studies used thirteen different online databases. A complete list of the databases used in each one of the studies can be found in Appendix A (Table A2 and Table A3). Figure 4 summarizes the most preferred online databases: IEEE Xplore, Scopus, ACM DL, Science Direct, SpringerLink, and Google Scholar. It should be noted that, in most cases, Google Scholar is used as a source to identify additional studies rather than as a primary source to locate studies for inclusion.
Number of Primary Studies Included in Published SMSs/SLRs over the Years
The SLRs/SMSs included in this study aggregate primary studies from 2002–2022. The smallest set of primary studies is six (in S15). The largest set of primary studies is 1278 (in S8). It is noted that the study with the smallest set of primary studies was a conference paper, whereas the study with the largest was a journal paper. As the research interest in PM grows, we cannot draw strict conclusions regarding the average number of primary studies in the studies included in this research (Table 8). For every study, the number of included studies depends on the research area and the research questions posed. An analysis of the number of primary studies for each of the studies included in this paper is shown in Table 6. It can be assumed that the number of primary studies included in each study correlates to the activity in the research area at a given time period. Figure 5 provides a graphical representation of the distribution of primary studies in each one of the 25 studies included.
In total, 60% of the studies included in this research have included 0–50 primary studies. In addition, 20% of the studies included 50–100 primary studies in their analysis. The remaining 20% of the studies have included more than 150 primary studies in their works, with two studies exceeding 500 primary studies.
3.1.4. RQ4: What Are the Demographics of Published SMSs/SLRs in Process Mining?
Number of SMSs/SLRs Published Annually
The largest number of studies in a single year (eight) was published in 2022. This denotes an increased interest in publishing secondary PM studies compared to previous years. In the earlier years covered by this research (Table 9), the number of SMSs/SLRs was smaller, as expected, since the domain was relatively new and research in this field was therefore limited. With the increase in the number of primary studies, there was an opportunity to summarize and map the research performed in PM. Around 2018, six SMSs/SLRs were published, indicating growing maturity in the PM domain. Throughout the years, the areas of healthcare and process mining have constantly trended as topics for the SMSs/SLRs published. As the field matures, it is expected that more research areas will gain attention in the years to come.
Publishing Venues and Geographical Distribution of SMSs/SLRs Studies
The fact that the 25 studies included in this research have been published in 23 different venues (13 studies were published in journals and 12 studies were published at conferences, as already mentioned) can be considered an indication that there is a broad set of related areas, apart from the field of information technology, that PM can be applied to. Researchers focused on this field have a wide range of venues to submit their research. Although there are no specific trends in publishing venues, there seems to be a trend in the geographical areas where the studies are being conducted. The primary authors were distributed within the five continents (Figure 6). Nine studies (36%) emerged from Asian countries (Turkey, China, Iran, Indonesia, Japan, and Malaysia), seven studies (28%) emerged from European countries (Estonia, Spain, the UK, Italy, Belgium, and Czechia), and the rest of the studies were developed in America (five studies in South America (Brazil and Chile) and three studies in North America (Canada and Mexico)), with one in Africa (Egypt). Turkey and Brazil are leading the research in SMS/SLR studies in process mining, as they have published three studies each that are included in this work. It is also worth mentioning that there are no SMSs/SLRs from the USA. This observation has also been noted in the tertiary study of Barros-Justo et al. [23]. The thematic findings underscore process mining’s extensive applications across domains such as healthcare, cybersecurity, and agile software development, showing its broad impact.
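As a quick consistency check, the regional percentages reported above follow directly from the study counts. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Number of included studies per region of the primary author, as reported above.
studies_by_region = {
    "Asia": 9,
    "Europe": 7,
    "South America": 5,
    "North America": 3,
    "Africa": 1,
}

total_studies = sum(studies_by_region.values())

# Share of each region as a whole-number percentage of all included studies.
shares = {
    region: round(100 * count / total_studies)
    for region, count in studies_by_region.items()
}

for region, pct in shares.items():
    print(f"{region}: {studies_by_region[region]} studies ({pct}%)")
```

The counts sum to the 25 included studies, and the shares for Asia and Europe match the 36% and 28% stated in the text.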
Which Are the Most Cited Papers in This Area?
The extracted data (Table 5) provide some interesting observations regarding the number of citations for the SMSs/SLRs included in this study. The citation numbers were extracted from Google Scholar on 7 March 2023, as other tertiary studies have used Google Scholar to provide an indication of how many times a paper has been cited. Secondary studies in the PM domain have received a total of 1658 citations. The average number of citations for secondary studies in PM is 66. For the studies included in our work, studies of higher quality are not always cited more often. In addition, recently published works, such as S21, can be frequently cited, although most of the recently published work has yet to receive a high number of citations. The papers published in journals have received 795 citations, whereas the papers published in conferences have received more citations, with a total of 863.
Future work could further compare the citation numbers (individual and average numbers) to those of SMSs/SLRs in other fields and/or in the overall PM domain. The higher citation total for conference papers contrasts with the general observation that studies published in peer-reviewed journals usually receive more citations than studies published at conferences. However, since several studies have been published in the past 2 years, it is still too early to assess the validity of this observation.
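The citation totals above are internally consistent, as the short check below shows. It uses only the figures reported in the text; the per-venue averages are derived here for illustration and are not reported by the authors.

```python
# Citation totals and study counts per venue type, as reported in the text.
journal_citations, journal_studies = 795, 13
conference_citations, conference_studies = 863, 12

total_citations = journal_citations + conference_citations
total_studies = journal_studies + conference_studies

# Average citations per secondary study, rounded to a whole number.
mean_citations = round(total_citations / total_studies)

# Derived for illustration only: average citations per study by venue type.
mean_journal = journal_citations / journal_studies
mean_conference = conference_citations / conference_studies

print(total_citations, mean_citations)
print(round(mean_journal, 1), round(mean_conference, 1))
```

The two venue totals sum to the reported 1658 citations, and dividing by the 25 included studies gives the reported average of 66.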
4. Discussion
The aim of this study was to answer a set of research questions and cover the SLRs and SMSs published in PM up to March 2023. This study identifies current trends and reports on the quality and demographics of the studies included. We followed the same methodology as an SLR based on the guidelines proposed by Kitchenham and Charters [12] and Petersen et al. [26]. The quality of each secondary study is assessed using the DARE criteria [27] to provide an analysis of the overall quality of the studies published in this field, both as journal and conference papers. To ensure that the findings of this study have been appropriately reported, the checklist and guidelines proposed by Budgen et al. [28] have been followed. In total, 25 secondary studies published in PM are reviewed in this work. Thirteen studies are journal papers, and twelve studies are conference papers. This section discusses the implications of this study.
The diverse thematic areas indicate the broad applicability of process mining across multiple research domains, emphasizing its potential for growth in underexplored areas. PM research interest is found in the technologies and applications of PM, healthcare, cybersecurity, and agile software development. Most of the studies included in this tertiary study fall under the PM areas (technologies and applications) and healthcare. These areas have matured more in terms of scientific research and applications in PM; thus, there are enough published works to create the need for secondary research in these fields. However, as PM is now a mature research domain, future secondary studies in this field could focus on other areas of research. The studies usually provide a thematic analysis based on various characteristics, most commonly PM algorithms, PM tools, and the research questions posed. Although other domains in software engineering have a classification in place that secondary studies can exploit, no such classification has been established in the literature so far for PM.
Regarding RQ2, the quality analysis of the included secondary studies was performed using the DARE criteria [27]. Overall, the quality assessment indicates that the majority of SMSs/SLRs in process mining maintain a medium-to-high quality level (a score of 3.5 or higher out of 5), suggesting a robust research framework in this field. Eight studies scored 4 or higher in the quality assessment (S1, S4, S7, S8, S17, S19, S20, and S24). Five of these were published in journals and the rest are conference papers. Only two of the studies (S1 and S19) performed a partial quality assessment of the included primary studies. In two more studies (S17 and S20), the quality criterion is fully met (scoring 1). In total, 21 of the included studies do not perform any quality assessment on the primary studies. As discussed in [24], this is an important factor to consider in the future that will also contribute to increasing the quality score of the secondary studies produced. Our analysis indicates that journal and conference papers show comparable quality, with slight variations observed over recent years. A closer look at the included studies shows that although the quality score for conference papers has increased in the past years (from 3.3 in 2020 to 3.45 in 2023), it has slightly decreased for journal papers (from 3.7 in 2020 to 3.5 in 2023). Given the small number of secondary studies published in the past 2 years, no general conclusions can be drawn regarding the progress of the quality score over the years. Finally, we compared the quality scores of papers published in subscription journals and in open-access journals: for the studies included in this tertiary study, the papers published in open-access journals are of higher average quality than those published in subscription journals. The open-access studies hold an average quality score of 3.9, a gap of 0.65 over the average quality score of the rest of the studies that were published in journals.
Although the small number of studies will not allow us to draw generic conclusions for the average quality score for open-access journals, this can be an indication that papers published in open-access journals are of high quality. This deserves further investigation in the future.
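The comparison between subscription and open-access journals can be reproduced with a few lines of Python. This is only a sketch: the variable names are hypothetical, and the scores are copied from the table that compares subscription journals and open-access journals in this article.

```python
from statistics import mean

# Quality scores per study, as listed in the subscription vs. OA journal table.
SUBSCRIPTION = {"S4": 4, "S8": 4, "S11": 3.5, "S12": 3,
                "S13": 2, "S16": 3, "S21": 3, "S22": 3.5}
OPEN_ACCESS = {"S1": 4.5, "S9": 3.5, "S19": 4, "S23": 3.5, "S24": 4}

sub_avg = mean(SUBSCRIPTION.values())
oa_avg = mean(OPEN_ACCESS.values())
gap = oa_avg - sub_avg

print(f"subscription={sub_avg:.2f}, open access={oa_avg:.2f}, gap={gap:.2f}")
```

With only eight and five studies in the two groups, a single re-scored study shifts either average noticeably, which is one reason the difference is treated as an indication rather than a conclusion.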
Regarding the guidelines used for conducting secondary studies and the search engines used for identifying primary studies, the findings of our work are in alignment with those of other tertiary studies in the field of software engineering [20,21,22,23,24]. The most frequently used guidelines cited for conducting SMSs/SLRs in PM are those of Kitchenham [11], Kitchenham and Charters [12], and Kitchenham et al. [31]. The most popular databases used for primary studies are IEEE Xplore, Scopus, ACM DL, Science Direct, and SpringerLink. Google Scholar is also popular but mostly used to identify additional studies rather than as a primary source to locate studies. No trends are identified for the number of primary studies included in the SMSs/SLRs. The studies included in our work cover primary studies for the past twenty years (2002–2022). The smallest set of primary studies included is six in study S15. The largest set of primary studies included is 1278 in study S8. We may assume that the number of included studies is dependent on the maturity of the area of the research and the research questions posed. In general, most of the secondary studies (19 of 25, or 76%, according to Table 5) include up to 100 primary studies.
PM secondary studies attract substantial academic attention, reflecting the field's growing academic influence. The average number of citations for secondary studies in PM is 66. Our analysis shows that studies of higher quality are not always cited more often and that recently published studies may also receive a high number of citations. According to the citation counts in Table 5, studies published in peer-reviewed journals have received considerably more citations in total than conference papers. Regarding the geographical areas investing in secondary research for PM, it seems that there is a globally increased interest in the domain, as PM research emerges in various continents around the globe and such works are published in multiple publishing venues.
This work can assist new researchers in understanding the PM domain as it lists all available secondary research in this field. It can also be used as a basis to identify gaps in PM and conduct more secondary studies in the related areas. The lessons learned here, the presentation of the findings (guidelines used, search engines, etc.), and any recommendations provided might also be useful to researchers in the PM field.
Limitations
As this study follows a similar structure to Khan et al. [24], potential limitations may exist related to internal validity, external validity, construct validity, and conclusion validity. The authors follow the suggestions of Khan et al. [24] to address all of these to the greatest possible extent. This research might include biases inherent in database selection and study inclusion criteria. Although all the major search engines have been used to identify the SMSs/SLRs published in the PM domain, no additional methods (such as snowballing) have been used during the search procedure. However, the comprehensive search strategy in multiple databases and the predefined inclusion and exclusion criteria minimize selection bias and ensure a transparent and consistent selection process. Future studies could enhance this work by incorporating snowballing techniques to capture additional interrelated studies. In addition, by performing the quality review of the included studies, there is the potential risk of deriving incorrect conclusions regarding the quality of a study. However, the quality score was not used as part of the exclusion criteria; thus, no study was excluded due to its overall quality score. In general, the data extraction process can be considered subjective. As a result, some of the fields of the extraction form (such as the thematic area and the implications for future work) might be subject to potential author bias. Finally, the small number of secondary studies included in this research might not fully represent the quality and demographics of the research in PM. The small number of included studies is expected for a relatively young discipline such as PM. As the domain matures, the research might be replicated to validate or alter the results by including more available secondary studies in the future.
5. Conclusions and Future Work
This tertiary study identified the SMSs/SLRs in the field of PM, published until March 2023, aiming to identify trends in the research conducted in this area. We followed the guidelines proposed by Kitchenham and Charters [12] and Petersen et al. [26]. The quality of each secondary study was assessed using the DARE criteria [27]. In total, 25 secondary studies published in PM are reviewed in this work. Thirteen are journal papers and twelve are conference papers.
A set of four research questions has been answered regarding the areas addressed by SMSs/SLRs in PM, the trends relating to the quality of published SMSs/SLRs, the guidelines followed, the data sources, and the number of included studies, as well as the demographics of the published work.
Response to RQ1: The thematic areas covered by the secondary studies in PM are technologies and applications of PM, healthcare, cybersecurity, and agile software development, with the first two currently being the most prominent.
Response to RQ2: The quality analysis of the 25 studies revealed that studies published in PM have a medium-to-high quality score. An average score of 3.5 has been observed across the included studies. No significant difference exists in the quality score of studies published in journals and conference proceedings. Most of the included studies do not perform any quality checks on their primary studies.
Response to RQ3: As expected, the guidelines proposed by Kitchenham [11], Kitchenham and Charters [12], and Kitchenham et al. [31] for conducting systematic reviews are also the most popular in the PM area. Numerous search engines are used while conducting secondary studies in PM. The most frequently searched databases for primary studies include IEEE Xplore, Scopus, ACM DL, Science Direct, and SpringerLink. Google Scholar is also popular, but it is mostly used to identify additional studies rather than as a primary source to locate studies. Trends have yet to be identified for the number of primary studies included in SMSs/SLRs in PM.
Response to RQ4: SMSs/SLRs in the area of PM are developed across the globe, and Turkey and Brazil are showing more interest in this type of research at the moment. The studies included in this research have been cited 1658 times so far, with the average number of citations being 66. Based on the citation counts in Table 5, studies published in journals have received more citations in total than those presented at conferences.
In the future, researchers could explore whether the quality assessment of primary studies is still an issue in conducting secondary research. Including more SMSs/SLRs in both journals (subscription and open-access) and conferences could strengthen the conclusions drawn for the quality scores in relation to the publication venues. In addition, as the research domain matures even more, additional secondary studies can be developed for a variety of thematic areas. An extended set of secondary studies in this area will allow the replication of this tertiary study in the near future, which will be enhanced with a broader set of questions to examine the maturity of the studies across the various fields of PM. Future work could further compare the citation numbers (individual and average numbers) to those of SMSs/SLRs in other fields and/or in the overall PM domain. Our work can play an important role for future researchers as it is, to the best of our knowledge, the first tertiary study that analyzes and indexes published secondary studies in the PM field. The results of this tertiary study can be used by future studies as a basis for conducting new research and investigating the research topics in PM. Specifically, regarding the quality scores of the included studies, future studies might use them to enhance the conclusions and/or to examine the progress of quality throughout the years.
Author Contributions
Conceptualization, E.K. and I.S.; methodology, E.K. and I.S.; software, E.K. and I.S.; validation, E.K. and I.S.; formal analysis, E.K. and I.S.; investigation, E.K.; resources, E.K.; data curation, E.K. and I.S.; writing—original draft preparation, E.K.; writing—review and editing, E.K. and I.S.; visualization, E.K.; supervision, I.S. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement
Data available upon request to the corresponding author.
Conflicts of Interest
The authors declare no conflict of interest.
Figure 4. The most preferred online search engines for identifying primary studies.
Figure 5. The distribution of the number of primary studies for the 25 studies included in this Tertiary Study.
Figure 6. A worldwide distribution of secondary studies in Process Mining (Figure created using MS Excel map representation of data).
The Inclusion/Exclusion criteria applied in all retrieved studies.
IC# | Inclusion Criteria | EC# | Exclusion Criteria |
---|---|---|---|
IC1 | Studies since 2003 (or early 2004 when PM research was initiated) until March 2023 | EC1 | The study is not labelled as a secondary study |
IC2 | Studies related to the search string defined | EC2 | The study is a secondary study, but the subject is not directly related to process mining |
IC3 | Studies reported in English | EC3 | Duplicated paper |
IC4 | Peer-reviewed studies | EC4 | Full text is not available |
IC5 | Secondary studies | EC5 | Study is not in English |
IC6 | Full text is available | EC6 | The study does not include a systematic review process and primary studies |
IC7 | Studies including a systematic review process and primary studies | | |
# means number.
The 25 studies included in this Tertiary Study.
Study# | Title |
---|---|
S1 | Systematic Mapping of Process Mining Studies in Healthcare [ |
S2 | A Mapping Study on Mining Software Process [ |
S3 | Educational Process Mining: A Systematic Literature Review [ |
S4 | Automated Discovery of Process Models from Event Logs: Review and Benchmark [ |
S5 | Process Mining for Healthcare Process Analytics [ |
S6 | Process Mining in Healthcare: A Systematic Review [ |
S7 | Process Mining in Oncology: A Literature Review [ |
S8 | Process Mining Techniques and Applications—A Systematic Mapping Study [ |
S9 | Process Mining in Healthcare: A Literature Review [ |
S10 | Systematic Mapping Study on Process Mining in Agile Software Development [ |
S11 | From Event Logs to Goals: a Systematic Literature Review of Goal-Oriented Process Mining [ |
S12 | A Systematic Mapping Study of Process Mining [ |
S13 | Process Mining and Industrial Applications: A Systematic Literature Review [ |
S14 | Process Mining for Cloud-Based Applications: A Systematic Literature Review [ |
S15 | Using Process Mining in Agile Software Development Methodologies: A Systematic Mapping Study [ |
S16 | A Survey on Recommendation in Process Mining [ |
S17 | A Survey on Process Mining for Security [ |
S18 | Systematic Literature Review on Process Mining in Learning Management System [ |
S19 | Complex Process Modeling in Process Mining: A Systematic Review [ |
S20 | Process Mining Perspectives in Software Engineering: A Systematic Literature Review [ |
S21 | Opportunities and Challenges for Applying Process Mining in Healthcare: a Systematic Mapping Study [ |
S22 | Process Mining Applications in the Healthcare Domain: A Comprehensive Review [ |
S23 | Process Mining in Healthcare—An Updated Perspective on the State of the Art [ |
S24 | Process Mining Usage in Cybersecurity and Software Reliability Analysis: A Systematic Literature Review [ |
S25 | Process Mining of Duplicate Tasks: A Systematic Literature Review [ |
# means number.
The DARE quality criteria used to evaluate the quality of the included studies [
QC# | Quality Attribute | Scoring Guidelines |
---|---|---|
QC1 | Inclusion and exclusion criteria | The inclusion and exclusion criteria are explicitly defined (score is 1) |
QC1 | Inclusion and exclusion criteria | The inclusion and exclusion criteria are partially defined (score is 0.5) |
QC1 | Inclusion and exclusion criteria | The inclusion and exclusion criteria are not defined (score is 0) |
QC2 | Search adequacy | Four or more reputed digital libraries are searched (score is 1) |
QC2 | Search adequacy | Three or four digital libraries are searched with no extra search strategies (score is 0.5) |
QC2 | Search adequacy | Two or fewer online databases are searched (score is 0) |
QC3 | Synthesis method | Paper presents an explicit synthesis method, and a reference to the method is given (score is 1) |
QC3 | Synthesis method | Paper presents a synthesis method, but no reference is given (score is 0.5) |
QC3 | Synthesis method | Paper presents no synthesis method (score is 0) |
QC4 | Quality assessment of primary studies | The quality assessment criterion is explicitly defined and assessed in the paper (score is 1) |
QC4 | Quality assessment of primary studies | Quality assessment is conducted but not reported (score is 0.5) |
QC4 | Quality assessment of primary studies | No effort is made to assess the quality of included papers (score is 0) |
QC5 | Information about included studies | Information is provided about each primary study (score is 1) |
QC5 | Information about included studies | Only summary information on primary studies is given (score is 0.5) |
QC5 | Information about included studies | Information about the primary studies is not provided (score is 0) |
# means number.
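The scoring scheme above amounts to rating each of the five criteria with 0, 0.5, or 1 and summing the ratings into a total between 0 and 5. A minimal Python sketch of that arithmetic follows; the `dare_score` function and the example ratings are hypothetical illustrations, not the instrument used by the authors.

```python
ALLOWED_RATINGS = {0, 0.5, 1}
CRITERIA = ("QC1", "QC2", "QC3", "QC4", "QC5")


def dare_score(ratings):
    """Sum the five per-criterion ratings (each 0, 0.5, or 1) into a total."""
    missing = [qc for qc in CRITERIA if qc not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for qc in CRITERIA:
        if ratings[qc] not in ALLOWED_RATINGS:
            raise ValueError(f"{qc} must be rated 0, 0.5, or 1")
    return sum(ratings[qc] for qc in CRITERIA)


# Hypothetical study: every criterion fully met except QC4 (partially met).
example = {"QC1": 1, "QC2": 1, "QC3": 1, "QC4": 0.5, "QC5": 1}
print(dare_score(example))  # 4.5
```

Validating the ratings before summing keeps a mistyped value (for instance 5 instead of 0.5) from silently inflating a study's score.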
The data extraction attributes.
Data Item Name | Description |
---|---|
PaperID | The reference ID assigned to the study by the authors of the current Tertiary Study |
Authors | Number of reviewers involved in each study and their names |
Title | The title of the study |
Keywords | The keywords listed in the SMSs/SLRs |
Review topic | Area of research |
Type of the publishing venue | Type of paper (Conference, Journal, Symposium, or Workshop) |
Name of the publishing venue | Where the paper is published |
First author’s affiliation | Affiliation used by the first Author of the study |
List of digital databases used | The electronic databases used for searches |
Number of primary papers | The number of studies included in the SMS/SLR |
Proposals for future research | The research gaps identified and/or future research proposals |
Publishing year | The year the study was published |
Years covered by the secondary study | The range in years covered by each study |
Quality score/DARE score | The total quality score based on DARE criteria (as this is described in the relevant section above) |
Number of citations | The total number of times the study is cited by other papers |
Guidelines used | The existing guidelines followed for conducting the review |
A summary of the extracted information for each study.
S# | Year Published | Years Covered | Quality Score | Publishing Venue | Thematic Area | Guidelines Used | # of DBs | Primary Studies Included | # of Citations | Continent |
---|---|---|---|---|---|---|---|---|---|---|
S1 | 2018 | 2005–2017 | 4.5 | OA Journal | Healthcare | Petersen et al. [ | 10 | 172 | 92 | Asia |
S2 | 2017 | 2004–2016 | 3.5 | Conference | Process Mining areas | Kitchenham and Charters [ | 4 | 40 | 9 | Asia |
S3 | 2017 | 2009–2016 | 3.5 | Conference | Process Mining areas | Kitchenham and Charters [ | 4 | 37 | 31 | Africa |
S4 | 2018 | 2011–2017 | 4 | Journal | Process Mining areas | Kitchenham [ | 7 | 86 | 336 | Europe |
S5 | 2016 | 2008–2015 | 3 | Conference | Healthcare | Petersen et al. [ | 10 | 11 | 28 | Asia |
S6 | 2018 | 2007–2018 | 3 | Conference | Healthcare | n/a | 6 | 55 | 42 | Europe |
S7 | 2016 | 2008–2016 | 4 | Conference | Healthcare | n/a | 5 | 33 | 74 | Europe |
S8 | 2019 | 2002–2018 | 4 | Journal | Process Mining areas | Kitchenham and Charters [ | 4 | 1278 | 210 | South America |
S9 | 2016 | 2005–2016 | 3.5 | OA Journal | Healthcare | Van der Aalst [ | 5 | 74 | 607 | South America |
S10 | 2018 | 2003–2017 | 3 | Conference | Agile software development | Kitchenham [ | 5 | 25 | 9 | Asia |
S11 | 2020 | 2012–2017 | 3.5 | Journal | Process Mining areas | Kitchenham et al. [ | 4 | 24 | 52 | North America |
S12 | 2018 | 2005–2014 | 3 | Journal | Process Mining areas | Kitchenham [ | 2 | 705 | 68 | South America |
S13 | 2020 | 2007–2019 | 2 | Journal | Industrial applications | Tranfield et al. [ | 1 | 18 | 27 | Europe |
S14 | 2019 | 2008–2018 | 3 | Conference | Cloud-based applications | Kitchenham et al. [ | 3 | 27 | 9 | North America |
S15 | 2018 | 2014–2017 | 3.5 | Conference | Agile software development | Kitchenham and Charters [ | 7 | 6 | 6 | South America |
S16 | 2022 | 2008–2020 | 3 | Journal | Process Mining areas | Kitchenham [ | 7 | 34 | 0 | Asia |
S17 | 2022 | 2017–2021 | 4 | Conference | Cybersecurity | Kitchenham [ | 5 | 22 | 0 | Asia |
S18 | 2022 | 2017–2021 | 3 | Conference | Process Mining areas | Kitchenham et al. [ | 5 | 20 | 1 | Asia |
S19 | 2022 | 2012–2022 | 4 | OA Journal | Process Mining areas | Kitchenham et al. [ | 6 | 58 | 0 | Asia |
S20 | 2021 | 2010–2020 | 4.5 | Conference | Software Engineering | Kitchenham [ | 5 | 12 | 6 | North America |
S21 | 2022 | 2002–2019 | 3 | Journal | Healthcare | Moher et al. [ | 5 | 270 | 21 | South America |
S22 | 2022 | 2010–2021 | 4 | Journal | Healthcare | Kitchenham [ | 9 | 172 | 12 | Europe |
S23 | 2022 | 2002–2021 | 3.5 | OA Journal | Healthcare | Aguirre et al. [ | 5 | 263 | 12 | Europe |
S24 | 2022 | 2014–2020 | 4 | OA Journal | Cybersecurity | Kitchenham and Charters [ | 6 | 35 | 2 | Europe |
S25 | 2020 | 2004–2020 | 3.5 | Conference | Process Mining areas | Kitchenham et al. [ | 7 | 45 | 4 | Asia |
# means number.
The average quality score for studies published in Journals and at Conferences.
Journal S# | Journal Quality Score | Conference S# | Conference Quality Score |
---|---|---|---|
S1 | 4.5 | S2 | 3.5 |
S4 | 4 | S3 | 3.5 |
S8 | 4 | S5 | 3 |
S9 | 3.5 | S6 | 3 |
S11 | 3.5 | S7 | 4 |
S12 | 3 | S10 | 3 |
S13 | 2 | S14 | 3 |
S16 | 3 | S15 | 3.5 |
S19 | 4 | S17 | 4 |
S21 | 3 | S18 | 3 |
S22 | 3.5 | S20 | 4.5 |
S23 | 3.5 | S25 | 3.5 |
S24 | 4 | ||
average quality score | 3.50 | average quality score | 3.45 |
# means number.
A comparison of the quality score for studies published in Subscription Journals and Open-Access Journals.
Subscription Journal S# | Subscription Journal Quality Score | OA Journal S# | OA Journal Quality Score |
---|---|---|---|
S4 | 4 | S1 | 4.5 |
S8 | 4 | S9 | 3.5 |
S11 | 3.5 | S19 | 4 |
S12 | 3 | S23 | 3.5 |
S13 | 2 | S24 | 4 |
S16 | 3 | ||
S21 | 3 | ||
S22 | 3.5 | ||
average quality score | 3.25 | average quality score | 3.90 |
# means number.
The average number of primary studies (per year) in the studies included in this research.
Year | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 |
---|---|---|---|---|---|---|---|
Average number of studies | 39 | 38 | 175 | 652 | 29 | 12 * | 109 |
* one study.
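The yearly averages can be recomputed by grouping the studies by publication year. The sketch below assumes the (year, number of primary studies) pairs as listed in Table 5, in the order S1 to S25; `PRIMARY_STUDIES` and `average_per_year` are illustrative names. The table above reports these means as whole numbers, whereas the sketch keeps the fractional values.

```python
from collections import defaultdict
from statistics import mean

# (publication year, number of primary studies) for S1..S25, from Table 5.
PRIMARY_STUDIES = [
    (2018, 172), (2017, 40), (2017, 37), (2018, 86), (2016, 11),
    (2018, 55), (2016, 33), (2019, 1278), (2016, 74), (2018, 25),
    (2020, 24), (2018, 705), (2020, 18), (2019, 27), (2018, 6),
    (2022, 34), (2022, 22), (2022, 20), (2022, 58), (2021, 12),
    (2022, 270), (2022, 172), (2022, 263), (2022, 35), (2020, 45),
]


def average_per_year(rows):
    """Group the counts by year and return the mean count for each year."""
    grouped = defaultdict(list)
    for year, count in rows:
        grouped[year].append(count)
    return {year: mean(counts) for year, counts in sorted(grouped.items())}


averages = average_per_year(PRIMARY_STUDIES)
for year, value in averages.items():
    print(year, f"{value:.1f}")
```

The 2019 mean is driven almost entirely by S8 (1278 primary studies), so the yearly averages say more about individual large mapping studies than about a trend.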
Number of studies and thematic areas covered per year.
Year | Number of Studies | Research Domain(s) |
---|---|---|
2016 | 3 | Healthcare |
2017 | 2 | Process Mining areas |
2018 | 6 | Healthcare; Process Mining areas; Agile software development |
2019 | 2 | Process Mining areas; Cloud-based applications |
2020 | 3 | Process Mining areas; Industrial applications |
2021 | 1 | Software Engineering |
2022 | 8 | Healthcare; Process Mining areas; Cybersecurity |
Appendix A
Detailed scores for the DARE quality criteria per study.
S# | Inclusion and Exclusion Criteria | Search Coverage | Assessment of Quality | Study Description | Synthesis of Studies | Study Score | Total 1s per Study | Total 0.5s per Study | Total 0s per Study |
---|---|---|---|---|---|---|---|---|---|
S1 | 1 | 1 | 0.5 | 1 | 1 | 4.5 | 4 | 1 | 0 |
S2 | 1 | 0.5 | 0 | 1 | 1 | 3.5 | 3 | 1 | 1 |
S3 | 1 | 0.5 | 0 | 1 | 1 | 3.5 | 3 | 1 | 1 |
S4 | 1 | 1 | 0 | 1 | 1 | 4 | 4 | 0 | 1 |
S5 | 0.5 | 1 | 0 | 0.5 | 1 | 3 | 2 | 2 | 1 |
S6 | 0.5 | 1 | 0 | 0.5 | 1 | 3 | 2 | 2 | 1 |
S7 | 1 | 1 | 0 | 1 | 1 | 4 | 4 | 0 | 1 |
S8 | 1 | 1 | 0 | 1 | 1 | 4 | 4 | 0 | 1 |
S9 | 1 | 1 | 0 | 0.5 | 1 | 3.5 | 3 | 1 | 1 |
S10 | 1 | 1 | 0 | 0.5 | 0.5 | 3 | 2 | 2 | 1 |
S11 | 1 | 0.5 | 0 | 1 | 1 | 3.5 | 3 | 1 | 1 |
S12 | 1 | 0.5 | 0 | 0.5 | 1 | 3 | 2 | 2 | 1 |
S13 | 0.5 | 0 | 0 | 1 | 0.5 | 2 | 1 | 2 | 2 |
S14 | 1 | 0.5 | 0 | 1 | 0.5 | 3 | 2 | 2 | 1 |
S15 | 1 | 1 | 0 | 0.5 | 1 | 3.5 | 3 | 1 | 1 |
S16 | 1 | 1 | 0 | 0.5 | 0.5 | 3 | 2 | 2 | 1 |
S17 | 1 | 1 | 1 | 1 | 0 | 4 | 4 | 0 | 1 |
S18 | 1 | 0.5 | 0 | 1 | 0.5 | 3 | 2 | 2 | 1 |
S19 | 1 | 1 | 0.5 | 0.5 | 1 | 4 | 3 | 2 | 0 |
S20 | 1 | 0.5 | 1 | 1 | 1 | 4.5 | 4 | 1 | 0 |
S21 | 1 | 0.5 | 0 | 0.5 | 1 | 3 | 2 | 2 | 1 |
S22 | 1 | 1 | 0 | 1 | 1 | 4 | 4 | 0 | 1 |
S23 | 1 | 0.5 | 0 | 1 | 1 | 3.5 | 3 | 1 | 1 |
S24 | 1 | 1 | 0 | 1 | 1 | 4 | 4 | 0 | 1 |
S25 | 1 | 1 | 0 | 0.5 | 1 | 3.5 | 3 | 1 | 1 |
Quality average score for all studies | | | | | | 3.5 | | | |
QC average score | 0.94 | 0.78 | 0.12 | 0.8 | 0.86 | | | | |
Total 1s per QC | 22 | 15 | 2 | 15 | 19 | | | | |
Total 0.5s per QC | 3 | 9 | 2 | 10 | 5 | | | | |
Total 0s per QC | 0 | 1 | 21 | 0 | 1 | | | | |
# means number.
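The per-criterion averages in the last rows of the table follow directly from the counts of 1s, 0.5s, and 0s. The following Python sketch, with hypothetical names (`COUNTS`, `criterion_average`), shows the arithmetic and confirms that the five averages add up to the overall average score.

```python
# (number of 1s, number of 0.5s, number of 0s) per criterion,
# copied from the summary rows of the table above.
COUNTS = {
    "Inclusion and exclusion criteria": (22, 3, 0),
    "Search coverage": (15, 9, 1),
    "Assessment of quality": (2, 2, 21),
    "Study description": (15, 10, 0),
    "Synthesis of studies": (19, 5, 1),
}


def criterion_average(ones, halves, zeros):
    """Average rating of one criterion across all rated studies."""
    n_studies = ones + halves + zeros
    return (ones * 1 + halves * 0.5) / n_studies


qc_averages = {name: criterion_average(*c) for name, c in COUNTS.items()}
overall = sum(qc_averages.values())

for name, value in qc_averages.items():
    print(f"{name}: {value:.2f}")
print(f"overall average score: {overall:.2f}")
```

The low value for the quality-assessment criterion stands out: it is the only criterion where most studies score 0, which matches the observation that 21 of the 25 studies do not assess the quality of their primary studies.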
A complete list of the databases used in each one of the studies.
S# | # of DBs | Online Databases Used |
---|---|---|
S1 | 10 | PubMed, SpringerLink, ACM DL, Science Direct, IEEE Xplore, Google Scholar, Scopus, Emerald, Wiley, Web of Science |
S2 | 4 | Manual search of MSRConf, Scopus, DBLP, Google Scholar |
S3 | 4 | IEEE Xplore, DBLP, Google Scholar, ERIC |
S4 | 7 | Scopus, Web of Science, IEEE Xplore, ACM DL, SpringerLink, Science Direct, Google Scholar |
S5 | 10 | ACM DL, Emerald, Google Scholar, IEEE Xplore, PubMed, Science Direct, Scopus, SpringerLink, Web of Science, Wiley |
S6 | 6 | Google Scholar, IEEE Xplore, ACM DL, Science Direct, PubMed, Research Gate |
S7 | 5 | PubMed, BMJ Open, Journals of Clinical Oncology, ACM DL, Google Scholar |
S8 | 4 | ACM DL, IEEE Xplore, Science Direct, SpringerLink (+snowballing) |
S9 | 5 | PubMed, DBLP, Google Scholar, Health Analytics case studies repository, web searches |
S10 | 5 | SpringerLink, IEEE Xplore, ACM DL, Google Scholar, Science Direct |
S11 | 4 | Scopus, Google Scholar, IEEE Xplore, PubMed |
S12 | 2 | Scopus, ISI Web of Science |
S13 | 1 | Scopus |
S14 | 3 | Scopus, IEEE Xplore, Web of Science |
S15 | 7 | IEEE Xplore, ACM DL, Science Direct, Scopus, Springer, Wiley, Web of Knowledge |
S16 | 7 | ACM DL, Emerald, IEEE Xplore, PubMed, Science Direct, Springer, Wiley |
S17 | 5 | ACM DL, DBLP, IEEE Xplore, Elsevier, Springer |
S18 | 5 | Science Direct, IEEE Xplore, Wiley, Emerald, ACM DL |
S19 | 6 | Web of Science, Scopus, Google Scholar, SpringerLink, IEEE Xplore, Science Direct |
S20 | 5 | ACM DL, IEEE Xplore, Science Direct, Scopus, Springer |
S21 | 5 | ACM DL, IEEE Xplore, PubMed, Science Direct, SpringerLink |
S22 | 9 | PubMed, DBLP, Google Scholar, Scopus, Web of Science, IEEE Xplore, ACM DL, SpringerLink, Science Direct |
S23 | 5 | PubMed, Web of Science, IEEE Xplore, Scopus, Google Scholar |
S24 | 6 | IEEE Xplore, Science Direct, SpringerLink, ACM DL, Web of Science, Scopus |
S25 | 7 | Google Scholar, ACM DL, Science Direct, Web of Science, Scopus, IEEE, SpringerLink |
# means number.
The databases used per study.
S# | Scopus | IEEE Xplore | ACM DL | Science Direct | Web of Science | SpringerLink | Google Scholar | PubMed | Wiley | Emerald | DBLP | Other |
---|---|---|---|---|---|---|---|---|---|---|---|---|
S1 | x | x | x | x | x | x | x | x | x | x | ||
S2 | x | x | x | x | ||||||||
S3 | x | x | x | x | ||||||||
S4 | x | x | x | x | x | x | x | |||||
S5 | x | x | x | x | x | x | x | x | x | x | ||
S6 | x | x | x | x | x | |||||||
S7 | x | x | x | |||||||||
S8 | x | x | x | x | ||||||||
S9 | x | x | x | x | ||||||||
S10 | x | x | x | x | x | |||||||
S11 | x | x | x | x | ||||||||
S12 | x | x | ||||||||||
S13 | x | |||||||||||
S14 | x | x | x | |||||||||
S15 | x | x | x | x | x | x | x | |||||
S16 | x | x | x | x | x | x | x | |||||
S17 | x | x | x | x | x | |||||||
S18 | x | x | x | x | x | |||||||
S19 | x | x | x | x | x | x | ||||||
S20 | x | x | x | x | x | |||||||
S21 | x | x | x | x | x | |||||||
S22 | x | x | x | x | x | x | x | x | x | |||
S23 | x | x | x | x | x | |||||||
S24 | x | x | x | x | x | x | ||||||
S25 | x | x | x | x | x | x | x | |||||
Total | 16 | 20 | 15 | 15 | 11 | 14 | 14 | 10 | 5 | 4 | 5 | 4 |
# means number.
References
1. Van Der Aalst, W.; Adriansyah, A.; De Medeiros, A.K.A.; Arcieri, F.; Baier, T.; Blickle, T.; Bose, J.C.; Van Den Brand, P.; Brandtjen, R.; Buijs, J. et al. Process mining manifesto. Bus. Process Manag. Workshops; 2012; 9, pp. 169-194.
2. Yang, Z.; Liu, X.; Li, T.; Wu, D.; Wang, J.; Zhao, Y.; Han, H. A systematic literature review of methods and datasets for anomaly-based network intrusion detection. Comput. Secur.; 2022; 116, 102675. [DOI: https://dx.doi.org/10.1016/j.cose.2022.102675]
3. Dissanayake, N.; Jayatilaka, A.; Zahedi, M.; Babar, M.A. Software security patch management—A systematic literature review of challenges, approaches, tools and practices. Inf. Softw. Technol.; 2022; 144, 106771. [DOI: https://dx.doi.org/10.1016/j.infsof.2021.106771]
4. e Silva, L.C.; Sobrinho, Á.A.D.C.C.; Cordeiro, T.D.; Melo, R.F.; Bittencourt, I.I.; Marques, L.B.; da Cunha Matos, D.D.M.; da Silva, A.P.; Isotani, S. Applications of convolutional neural networks in education: A systematic literature review. Expert Syst. Appl.; 2023; 231, 120621. [DOI: https://dx.doi.org/10.1016/j.eswa.2023.120621]
5. Nuaimi, M.; Fourati, L.C.; Hamed, B.B. Intelligent approaches toward intrusion detection systems for Industrial Internet of Things: A systematic comprehensive review. J. Netw. Comput. Appl.; 2023; 215, 103637. [DOI: https://dx.doi.org/10.1016/j.jnca.2023.103637]
6. Erdogan, T.G.; Tarhan, A. Systematic mapping of process mining studies in healthcare. IEEE Access; 2018; 6, pp. 24543-24567. [DOI: https://dx.doi.org/10.1109/ACCESS.2018.2831244]
7. Batista, E.; Solanas, A. Process mining in healthcare: A systematic review. Proceedings of the 2018 9th International Conference on Information, Intelligence, Systems and Applications (IISA); Zakynthos, Greece, 23–25 July 2018; IEEE: New York, NY, USA, 2018; pp. 1-6.
8. dos Santos Garcia, C.; Meincheim, A.; Junior, E.R.F.; Dallagassa, M.R.; Sato, D.M.V.; Carvalho, D.R.; Santos, E.A.P.; Scalabrin, E.E. Process mining techniques and applications—A systematic mapping study. Expert Syst. Appl.; 2019; 133, pp. 260-295. [DOI: https://dx.doi.org/10.1016/j.eswa.2019.05.003]
9. Erdem, S.; Demirörs, O.; Rabhi, F. Systematic mapping study on process mining in agile software development. Software Process Improvement and Capability Determination, Proceedings of the 18th International Conference, SPICE 2018, Thessaloniki, Greece, 9–10 October 2018; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 289-299.
10. Silalahi, S.; Yuhana, U.L.; Ahmad, T.; Studiawan, H. A Survey on Process Mining for Security. Proceedings of the 2022 International Seminar on Application for Technology of Information and Communication (iSemantic); Semarang, Indonesia, 17–18 September 2022; IEEE: New York, NY, USA, 2022; pp. 1-6.
11. Kitchenham, B. Procedures for Performing Systematic Reviews; NICTA Technical Report 0400011T.1 Keele University: Keele, UK, 2004; Volume 33, pp. 1-26.
12. Kitchenham, B.A.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering; EBSE-2007-01 Keele University: Keele, UK, 2007.
13. Fink, A. Conducting Research Literature Reviews: From the Internet to Paper; Sage Publications: Thousand Oaks, CA, USA, 2019.
14. Petticrew, M.; Roberts, H. Systematic Reviews in the Social Sciences: A Practical Guide; John Wiley & Sons: Hoboken, NJ, USA, 2008.
15. Kitchenham, B.A.; Dyba, T.; Jorgensen, M. Evidence-based software engineering. Proceedings of the 26th International Conference on Software Engineering; Edinburgh, UK, 28 May 2004; IEEE: New York, NY, USA, 2004; pp. 273-281.
16. Dyba, T.; Kitchenham, B.A.; Jorgensen, M. Evidence-based software engineering for practitioners. IEEE Softw.; 2005; 22, pp. 58-65. [DOI: https://dx.doi.org/10.1109/MS.2005.6]
17. Cruzes, D.S.; Dybå, T. Research synthesis in software engineering: A tertiary study. Inf. Softw. Technol.; 2011; 53, pp. 440-455. [DOI: https://dx.doi.org/10.1016/j.infsof.2011.01.004]
18. Cooper, H.; Hedges, L.V.; Valentine, J.C. The Handbook of Research Synthesis and Meta-Analysis; 2nd ed. Russell Sage Foundation: New York, NY, USA, 2009; pp. 1-615.
19. Noblit, G.W.; Hare, R.D. Meta-Ethnography: Synthesizing Qualitative Studies; Sage: San Jose, CA, USA, 1988; Volume 11.
20. Hoda, R.; Salleh, N.; Grundy, J.; Tee, H.M. Systematic literature reviews in agile software development: A tertiary study. Inf. Softw. Technol.; 2017; 85, pp. 60-70. [DOI: https://dx.doi.org/10.1016/j.infsof.2017.01.007]
21. Curcio, K.; Santana, R.; Reinehr, S.; Malucelli, A. Usability in agile software development: A tertiary study. Comput. Stand. Interfaces; 2019; 64, pp. 61-77. [DOI: https://dx.doi.org/10.1016/j.csi.2018.12.003]
22. Barros-Justo, J.L.; Benitti, F.B.; Matalonga, S. Trends in software reuse research: A tertiary study. Comput. Stand. Interfaces; 2019; 66, 103352. [DOI: https://dx.doi.org/10.1016/j.csi.2019.04.011]
23. Garousi, V.; Mäntylä, M.V. A systematic literature review of literature reviews in software testing. Inf. Softw. Technol.; 2016; 80, pp. 195-216. [DOI: https://dx.doi.org/10.1016/j.infsof.2016.09.002]
24. Khan, M.U.; Sherin, S.; Iqbal, M.Z.; Zahid, R. Landscaping systematic mapping studies in software engineering: A tertiary study. J. Syst. Softw.; 2019; 149, pp. 396-436. [DOI: https://dx.doi.org/10.1016/j.jss.2018.12.018]
25. Brereton, P.; Kitchenham, B.A.; Budgen, D.; Turner, M.; Khalil, M. Lessons from applying the systematic literature review process within the software engineering domain. J. Syst. Softw.; 2007; 80, pp. 571-583. [DOI: https://dx.doi.org/10.1016/j.jss.2006.07.009]
26. Petersen, K.; Feldt, R.; Mujtaba, S.; Mattsson, M. Systematic mapping studies in software engineering. Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering (EASE); Swindon, UK, 26–27 June 2008; pp. 1-10.
27. Booth, A.M.; Wright, K.E.; Outhwaite, H. Centre for Reviews and Dissemination databases: Value, content, and developments. Int. J. Technol. Assess. Health Care; 2010; 26, pp. 470-472. [DOI: https://dx.doi.org/10.1017/S0266462310000978]
28. Budgen, D.; Brereton, P.; Drummond, S.; Williams, N. Reporting systematic reviews: Some lessons from a tertiary study. Inf. Softw. Technol.; 2018; 95, pp. 62-74. [DOI: https://dx.doi.org/10.1016/j.infsof.2017.10.017]
29. Ampatzoglou, A.; Bibi, S.; Avgeriou, P.; Chatzigeorgiou, A. Guidelines for managing threats to validity of secondary studies in software engineering. Contemporary Empirical Methods in Software Engineering; Springer: Cham, Switzerland, 2020; pp. 415-441.
30. Verner, J.M.; Brereton, O.P.; Kitchenham, B.A.; Turner, M.; Niazi, M. Systematic literature reviews in global software development: A tertiary study. Proceedings of the 16th International Conference on Evaluation & Assessment in Software Engineering (EASE); Ciudad Real, Spain, 14–15 May 2012; IET: Stevenage, UK, 2012; pp. 2-11.
31. Kitchenham, B.; Pretorius, R.; Budgen, D.; Brereton, O.P.; Turner, M.; Niazi, M.; Linkman, S. Systematic literature reviews in software engineering—A tertiary study. Inf. Softw. Technol.; 2010; 52, pp. 792-805. [DOI: https://dx.doi.org/10.1016/j.infsof.2010.03.006]
32. Dong, L.; Liu, B.; Li, Z.; Wu, O.; Babar, M.A.; Xue, B. A mapping study on mining software process. Proceedings of the 2017 24th Asia-Pacific Software Engineering Conference (APSEC); Nanjing, China, 4–8 December 2017; IEEE: New York, NY, USA, 2017; pp. 51-60.
33. Ghazal, M.A.; Ibrahim, O.; Salama, M.A. Educational process mining: A systematic literature review. Proceedings of the 2017 European Conference on Electrical Engineering and Computer Science (EECS); Bern, Switzerland, 17–19 November 2017; IEEE: New York, NY, USA, 2017; pp. 198-203.
34. Augusto, A.; Conforti, R.; Dumas, M.; La Rosa, M.; Maggi, F.M.; Marrella, A.; Mecella, M.; Soo, A. Automated discovery of process models from event logs: Review and benchmark. IEEE Trans. Knowl. Data Eng.; 2018; 31, pp. 686-705. [DOI: https://dx.doi.org/10.1109/TKDE.2018.2841877]
35. Erdoğan, T.; Tarhan, A. Process mining for healthcare process analytics. Proceedings of the 2016 Joint Conference of the International Workshop on Software Measurement and the International Conference on Software Process and Product Measurement (IWSM-MENSURA); Berlin, Germany, 5–7 October 2016; IEEE: New York, NY, USA, 2016; pp. 125-130.
36. Kurniati, A.P.; Johnson, O.; Hogg, D.; Hall, G. Process mining in oncology: A literature review. Proceedings of the 2016 6th International Conference on Information Communication and Management (ICICM); Hatfield, UK, 29–31 October 2016; IEEE: New York, NY, USA, 2016; pp. 291-297.
37. Rojas, E.; Munoz-Gama, J.; Sepúlveda, M.; Capurro, D. Process mining in healthcare: A literature review. J. Biomed. Inform.; 2016; 61, pp. 224-236. [DOI: https://dx.doi.org/10.1016/j.jbi.2016.04.007]
38. Ghasemi, M.; Amyot, D. From event logs to goals: A systematic literature review of goal-oriented process mining. Requir. Eng.; 2020; 25, pp. 67-93. [DOI: https://dx.doi.org/10.1007/s00766-018-00308-3]
39. Maita, A.R.C.; Martins, L.C.; Paz, C.R.L.; Rafferty, L.; Hung, P.C.K.; Peres, S.M.; Fantinato, M. A systematic mapping study of process mining. Enterp. Inf. Syst.; 2018; 12, pp. 505-549. [DOI: https://dx.doi.org/10.1080/17517575.2017.1402371]
40. Corallo, A.; Lazoi, M.; Striani, F. Process mining and industrial applications: A systematic literature review. Knowl. Process Manag.; 2020; 27, pp. 225-233. [DOI: https://dx.doi.org/10.1002/kpm.1630]
41. El-Gharib, N.M.; Amyot, D. Process mining for cloud-based applications: A systematic literature review. Proceedings of the 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW); Jeju, Republic of Korea, 23–27 September 2019; IEEE: New York, NY, USA, 2019; pp. 34-43.
42. Arias, M.; Marques, M.R.; Rojas, E. Using process mining in agile software development methodologies: A systematic mapping study. Proceedings of the 2018 XLIV Latin American Computer Conference (CLEI); Sao Paulo, Brazil, 1–5 October 2018; IEEE: New York, NY, USA, 2018; pp. 552-561.
43. Yari Eili, M.; Rezaeenour, J. A survey on recommendation in process mining. Concurr. Comput. Pract. Exp.; 2022; 34, e7304. [DOI: https://dx.doi.org/10.1002/cpe.7304]
44. Wafda, F.; Usagawa, T.; Mahendrawathi, E. Systematic literature review on process mining in learning management systems. Proceedings of the 2022 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT); Bali, Indonesia, 28–30 July 2022; IEEE: New York, NY, USA, 2022; pp. 160-166.
45. Imran, M.; Ismail, M.A.; Hamid, S.; Nasir, M.H.N.M. Complex process modeling in process mining: A systematic review. IEEE Access; 2022; 10, pp. 23276-23291. [DOI: https://dx.doi.org/10.1109/ACCESS.2022.3208231]
46. Urrea-Contreras, S.J.; Flores-Rios, B.L.; Astorga-Vargas, M.A.; Ibarra-Esquer, J.E. Process mining perspectives in software engineering: A systematic literature review. Proceedings of the 2021 Mexican International Conference on Computer Science (ENC); Morelia, Mexico, 9–11 September 2021; IEEE: New York, NY, USA, 2021; pp. 1-8.
47. Dallagassa, M.R.; dos Santos Garcia, C.; Scalabrin, E.E.; Ioshii, S.O.; Carvalho, D.R. Opportunities and challenges for applying process mining in healthcare: A systematic mapping study. J. Ambient Intell. Humaniz. Comput.; 2022; 13, pp. 165-182. [DOI: https://dx.doi.org/10.1007/s12652-021-02894-7]
48. Guzzo, A.; Rullo, A.; Vocaturo, E. Process mining applications in the healthcare domain: A comprehensive review. Wiley Interdiscip. Rev. Data Min. Knowl. Discov.; 2022; 12, e1442. [DOI: https://dx.doi.org/10.1002/widm.1442]
49. De Roock, E.; Martin, N. Process mining in healthcare: An updated perspective on the state of the art. J. Biomed. Inform.; 2022; 131, 103995. [DOI: https://dx.doi.org/10.1016/j.jbi.2022.103995]
50. Macák, M.; Daubner, L.; Sani, M.F.; Buhnova, B. Process mining usage in cybersecurity and software reliability analysis: A systematic literature review. Array; 2022; 13, 100120. [DOI: https://dx.doi.org/10.1016/j.array.2021.100120]
51. Duan, C.; Wei, Q. Process mining of duplicate tasks: A systematic literature review. Proceedings of the 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA); Dalian, China, 27–29 June 2020; IEEE: New York, NY, USA, 2020; pp. 778-784.
52. Goulão, M.; Amaral, V.; Mernik, M. Quality in model-driven engineering: A tertiary study. Softw. Qual. J.; 2016; 24, pp. 601-633. [DOI: https://dx.doi.org/10.1007/s11219-016-9324-8]
53. Petersen, K.; Vakkalanka, S.; Kuzniarz, L. Guidelines for conducting systematic mapping studies in software engineering: An update. Inf. Softw. Technol.; 2015; 64, pp. 1-18. [DOI: https://dx.doi.org/10.1016/j.infsof.2015.03.007]
54. Okoli, C.; Schabram, K. A guide to conducting a systematic literature review of information systems research. SSRN Electron. J.; 2010; 10, pp. 1-49. [DOI: https://dx.doi.org/10.2139/ssrn.1954824]
55. Van Der Aalst, W. Process Mining: Discovery, Conformance and Enhancement of Business Processes; Springer: Berlin/Heidelberg, Germany, 2011; Volume 2.
56. Barn, B.; Barat, S.; Clark, T. Conducting systematic literature reviews and systematic mapping studies. Proceedings of the 10th Innovations in Software Engineering Conference; Jaipur, India, 5–7 July 2017; pp. 212-213.
57. Tranfield, D.; Denyer, D.; Smart, P. Towards a methodology for developing evidence-informed management knowledge by means of systematic review. Br. J. Manag.; 2003; 14, pp. 207-222. [DOI: https://dx.doi.org/10.1111/1467-8551.00375]
58. Perry, D.E.; Porter, A.A.; Votta, L.G. Empirical studies of software engineering: A roadmap. Proceedings of the Conference on the Future of Software Engineering; Limerick, Ireland, 4–11 May 2000; pp. 345-355.
59. Moher, D.; Schulz, K.F.; Simera, I.; Altman, D.G. Guidance for developers of health research reporting guidelines. PLoS Med.; 2010; 7, e1000217. [DOI: https://dx.doi.org/10.1371/journal.pmed.1000217]
60. Aguirre, S.; Parra, C.; Sepúlveda, M. Methodological proposal for process mining projects. Int. J. Bus. Process Integr. Manag.; 2017; 8, pp. 102-113. [DOI: https://dx.doi.org/10.1504/IJBPIM.2017.083793]
61. Bourque, P.; Fairley, R. Swebok; IEEE Computer Society: Washington, DC, USA, 2004.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Background: This tertiary study lists the secondary studies published in the process mining domain and analyzes them against a set of research questions. To the best of our knowledge, it is the first tertiary study in this area. The objective is to provide information about the available secondary studies in process mining; to answer research questions on the thematic areas the studies cover and on trends in their quality; and to report findings on publication venues, citations, guidelines used, and demographics. Method: A tertiary study was conducted on systematic secondary studies published up to March 2023. A total of 25 secondary studies related to process mining were identified after applying inclusion/exclusion criteria and quality assessment. Results: The most popular thematic areas addressed are technologies and applications for process mining and healthcare. The secondary studies in process mining have a medium quality score of 3.5. The guidelines introduced by Kitchenham over the years are preferred in secondary studies in this field. There is no trend related to the number of primary studies included in secondary studies in process mining. Conclusion: Although numerous secondary studies exist for process mining, there is still room for more research, specifically in the areas highlighted in this study. Future researchers can use this study as a reference and can use the listed research topics to investigate the identified issues in greater depth.