Introduction
Systematic reviews (SRs) are the current standard for collating and synthesizing empirical evidence and evaluating trends across a specific body of literature in response to research questions. SRs involve strict, structured, and formal methodological processes [1, 2]. Standardized protocols, such as the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), offer researchers a guideline for transparent and comprehensive reporting of SR processes and results [3]. In addition, formal guides established by Cochrane provide further evaluation criteria to give appropriate context for the interpretation of study data in various research settings [4]. Despite these protocols, SRs remain a substantial undertaking due to extensive resource requirements. Depending on the scope of the review and the precision of the search terms used, researchers may review tens of thousands of articles during the various stages of screening. Given the substantial resource and time commitment required to complete the screening phases of SRs, it is therefore crucial to investigate opportunities that may accelerate the screening process.
The screening phases of a SR include de-duplicating search outputs across multiple databases, screening titles and abstracts, and screening full texts [3]. During these steps, researchers examine each article against strict inclusion and exclusion criteria to determine its eligibility for inclusion in the SR. To ensure quality standards, more than one individual must screen the same article independently at each screening stage, with the reliability between screeners calculated and reported as part of the standard requirements for publishing SRs [5]. Altogether, the screening phases can take hundreds of hours for each individual reviewer involved.
Artificial intelligence-based (AI) software such as COVIDENCE [6], CUREDATIS [7], and Sciome SWIFT-ActiveScreener [8] has been developed to help expedite SR screening and reduce the number of person-hours required to complete SRs. For the purpose of this paper, AI programs refer to programs enabled to perform tasks that normally require human intelligence in the context of conducting a SR. While they do not eliminate human involvement in the screening process, each program may reduce the time and resources spent through various proprietary solutions. For example, COVIDENCE aids clinical research reviews with its ability to distinguish between randomized controlled trials (RCTs) and non-RCTs. Other SR tools include similar options to apply sorting tags and take notes. Critically, ActiveScreener, part of a growing set of novel tools harnessing active learning, can estimate the completeness of the screening process and notify reviewers when they may stop screening early. In this paper, we evaluated ActiveScreener in terms of its agreement with human screeners and its usability in a large SR of mental health outcomes following treatment for PTSD. ActiveScreener was selected for this project chiefly for its departure from programs that use AI merely to identify records; instead, it uses machine learning to build a predictive algorithm that reduces time spent in the screening phases of SRs.
ActiveScreener is a novel machine learning, web-based AI software for SRs. ActiveScreener uses an L2-regularized log-linear model and an active-learning approach to screening, meaning that the model is continually trained during the screening process [8]. In this case, the “training” occurs whenever a screener identifies included articles through real-time manual identification of uploaded documents. Each time an article is identified as “included”, the model is re-trained and re-orders the remaining references so that the most likely relevant ones are screened next. A statistical model based on a negative binomial distribution is then used by ActiveScreener to estimate the sensitivity of the screening process and to alert screeners when a specified threshold of likely relevant articles (in this case, 95%) has been included [8]. This AI prioritization of the articles believed to be most relevant during screening can trim screening time and human effort by nearly 70% [8]. Past research utilizing ActiveScreener has found the algorithm to work well in reviews of the physical health literature [9, 10]. Despite indications of past use in health reviews, there is little evidence for how ActiveScreener may perform in evaluations of mental health and treatment outcomes. Further, the precision of the estimation model remains unclear. In this paper, we set out to evaluate the precision and usability of ActiveScreener in conducting screening for a mental health treatment SR [11]. Specifically, in Part 1, we formally evaluated the agreement between its predictive model and the screening outcomes of individual human screeners, and in Part 2, we collected informal feedback regarding the usability of ActiveScreener amongst a cohort of screeners.
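The active-learning loop described above can be sketched in a few lines. This is an illustrative toy, not Sciome's implementation: ActiveScreener trains an L2-regularized log-linear model over document features, whereas here a crude word-overlap scorer stands in for the model, and the `screen` and `is_relevant` names are invented for the example.

```python
# Toy sketch of active-learning prioritization (NOT Sciome's code).
# ActiveScreener re-trains an L2-regularized log-linear model after each
# inclusion decision; a word-overlap scorer stands in for that model here.

def score(abstract, included_words):
    """Crude relevance proxy: word overlap with previously included abstracts."""
    return len(set(abstract.lower().split()) & included_words)

def screen(references, is_relevant):
    """references: dict id -> abstract; is_relevant: the human decision."""
    unscreened = dict(references)
    included_words, order = set(), []
    while unscreened:
        # Present the highest-scoring reference first (ties keep upload order).
        ref_id = max(unscreened, key=lambda r: score(unscreened[r], included_words))
        abstract = unscreened.pop(ref_id)
        order.append(ref_id)
        if is_relevant(ref_id):
            # "Re-train": the relevant vocabulary grows, so the remaining
            # references are implicitly re-ranked on the next iteration.
            included_words |= set(abstract.lower().split())
    return order

refs = {
    1: "ptsd treatment outcomes in veterans",
    2: "soil microbiome diversity survey",
    3: "cognitive therapy for ptsd symptoms",
    4: "bridge corrosion inspection methods",
}
order = screen(refs, is_relevant=lambda r: r in {1, 3})
# After ref 1 is included, the other PTSD abstract (ref 3) jumps the queue.
```

Once the first PTSD abstract is marked included, the shared vocabulary pushes the second PTSD abstract ahead of the irrelevant ones, which is the mechanism by which relevant articles cluster at the front of the screening queue.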
Methods
Participants
Eighteen screeners were trained to identify articles for inclusion and exclusion at the title and abstract assessment phase of the review, and on the use of ActiveScreener, for a meta-analysis and SR (for more details on this project, see Liu et al., 2021 [11]). All respondents were paid employees or unpaid volunteers of the MacDonald Franklin Operational Stress Injury Research and Innovation Centre (MFOSIRC).
Procedure
All respondents received an email with the link directing them to the online survey. Respondents were asked to complete both the demographic information and the ActiveScreener User Experiences Survey online via Google Forms. Data was collected in April 2022.
A total of 10,002 references required review at the title and abstract screening stage of this SR. The ActiveScreener inclusion threshold was set at a 95% predicted inclusion rate, resulting in 5,390 of these references being reviewed in duplicate by screeners. Screeners were able to access ActiveScreener at any time on their own schedules; when they logged on, they were presented with the most relevant article at that time as identified by ActiveScreener, rather than every screener reviewing articles in the same order. Once screening reached 95% of relevant articles included, according to ActiveScreener, all screeners stopped. At this stage, data consisting of the screening results for the 5,390 references reviewed by screeners and the remaining 4,612 references reviewed by the ActiveScreener AI were exported. The inclusion threshold was then reset to 100%, prompting the screeners to continue screening the remaining 4,612 references for relevant abstracts to be screened in full text. Data was once again exported. Screening results for the remaining 4,612 references from ActiveScreener and from the screeners were then compared to assess agreement between ActiveScreener's and the reviewers' decisions during title and abstract screening.
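As a quick sanity check on the workload figures above, the share of the title/abstract queue that would normally go unreviewed by humans after the 95% stopping alert follows from simple arithmetic (using only counts reported in this section; in this study the remaining references were deliberately screened anyway to enable the agreement analysis):

```python
# Workload arithmetic from the figures reported in this study.
total = 10_002          # references at the title/abstract screening stage
screened = 5_390        # reviewed in duplicate before the 95% stopping alert
remaining = total - screened

assert remaining == 4_612           # matches the count exported for comparison
saved_fraction = remaining / total  # share handled by the AI alone
print(f"{saved_fraction:.1%} of references avoided per screener")
```

At this threshold roughly 46% of the queue never needed duplicate human review, which is the practical time saving the stopping rule delivers in an ordinary deployment.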
No direct compensation was given for participating in this study; however, many of the respondents were paid employees of the MFOSIRC and completed the survey during working hours, thereby receiving nominal monetary compensation for the time spent participating. For unpaid volunteers, the time spent completing this survey was counted toward their volunteer hours, for which they are provided a letter of recognition.
Measures
Demographics.
Demographic information included: (1) the respondents’ role within MFOSIRC, (2) whether the respondents conducted or assisted on a SR or meta-analysis prior to their time at the MFOSIRC, (3) respondents’ level of experience with SR or meta-analyses (e.g., beginner, intermediate), and (4) types of software previously used by respondents for screening for SR or meta-analyses.
ActiveScreener user experience survey.
This survey was created by two of the authors (J.J.W.L. & A.N.) to capture respondents' experiences using ActiveScreener. The survey consisted of 12 items (statements or questions) related to the usability of ActiveScreener for screening (e.g., “SWIFT Active Screener is easy to use”, “SWIFT Active Screener software was easy to learn”). Nine items were quantitative, and three were qualitative, providing open text boxes for responses. Of the quantitative items, eight statements were rated on a 5-point Likert scale ranging from ‘strongly agree’ to ‘strongly disagree’, and one question was rated on a 5-point Likert scale ranging from ‘very confident’ to ‘not at all confident’. The three open-ended qualitative questions captured information on the features of ActiveScreener the respondents enjoyed, any challenges experienced while using ActiveScreener, and any suggestions the respondents had to improve ActiveScreener. This survey was assessed for face validity, but as it was an internal assessment of usability and acceptability, no other reliability or validity assessments were undertaken.
Data analysis
A confusion matrix and associated statistics were generated to evaluate the predictive agreement of ActiveScreener across three classes: Included (representing references identified as meeting inclusion criteria), Excluded (representing references identified as meeting exclusion criteria), and Conflicted (representing disagreement on whether the reference should be included or excluded). Analyses were performed in RStudio using the tidyverse [12], stringr [13], and caret [14] packages. Results are reported for the title and abstract screening stage only.
Both quantitative and qualitative data was used to provide descriptive information related to the respondents’ experiences using ActiveScreener. For qualitative data, common themes were extracted from responses provided regarding enjoyable features of the software, challenges with ActiveScreener, and suggested improvements.
Results
The multiclass confusion matrix for 4,612 references is presented in Table 1. As shown, both the screeners and the ActiveScreener AI identified 1,365 included references, 2,528 excluded references, and 622 conflicted references. For 97 references, the screeners identified these references as included, while the ActiveScreener AI identified these references as conflicted.
[Figure omitted. See PDF.]
Overall agreement was 97.9%, 95% Confidence Interval (CI) [0.97, 0.98], p < .001. Interrater reliability was high (Fleiss–Conger kappa = 0.96). Sensitivity for the three classes was: Included (0.93), Excluded (1.00), and Conflicted (1.00). Specificity for the three classes was: Included (1.00), Excluded (1.00), and Conflicted (0.98).
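These statistics can be reproduced from the cell counts reported above (1,365/2,528/622 agreements plus 97 Included-vs-Conflicted disagreements). The sketch below uses Python rather than the R/caret pipeline the study actually used, and assumes rows are screener decisions and columns are ActiveScreener predictions; the kappa computed is Cohen's, which at two decimals happens to coincide with the reported Fleiss–Conger value for this matrix.

```python
# Reconstructing the agreement statistics from the reported cell counts.
# Rows: human screeners' decisions; columns: ActiveScreener's predictions
# (orientation is an assumption; accuracy and kappa are unaffected by it).
classes = ["Included", "Excluded", "Conflicted"]
cm = [
    [1365, 0,    97],   # screeners: Included (97 predicted Conflicted by the AI)
    [0,    2528, 0],    # screeners: Excluded
    [0,    0,    622],  # screeners: Conflicted
]
n = sum(map(sum, cm))                         # 4,612 references in total
accuracy = sum(cm[i][i] for i in range(3)) / n

# Cohen's kappa: chance-corrected agreement computed from the marginals.
row = [sum(r) for r in cm]
col = [sum(r[j] for r in cm) for j in range(3)]
p_e = sum(row[i] * col[i] for i in range(3)) / n**2
kappa = (accuracy - p_e) / (1 - p_e)

# Per-class sensitivity (recall) and specificity, one-vs-rest.
sens = [cm[i][i] / row[i] for i in range(3)]
spec = [(n - row[i] - (col[i] - cm[i][i])) / (n - row[i]) for i in range(3)]
```

Running this yields accuracy ≈ 0.979, kappa ≈ 0.96, sensitivity of 0.93/1.00/1.00, and specificity of 1.00/1.00/0.98, matching the figures reported above.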
Quantitative data
All 18 respondents completed all nine quantitative items. All respondents (100%) either agreed or strongly agreed that: their training needs were met; ActiveScreener was easy to learn; they felt confident using ActiveScreener; and they would recommend ActiveScreener for use in other SRs. Nearly all respondents either agreed or strongly agreed that ActiveScreener was easy to use (94.4%) and that ActiveScreener had a user-friendly interface (94.5%). The majority of respondents (88.9%) also either agreed or strongly agreed that ActiveScreener had all of the features needed for adequate screening. Of the eight respondents who had prior experience with other screening programs or tools, seven (87.5%) either agreed or strongly agreed that they preferred ActiveScreener over other programs. With regard to technical or system-related glitches, respondents varied in their perspectives: 44.5% indicated that they experienced no such glitches (agreed or strongly agreed), while 22.2% indicated experiencing them (disagreed). Results for each survey item are reported in Table 2.
[Figure omitted. See PDF.]
Qualitative data
Features enjoyed.
For the question capturing the features of ActiveScreener enjoyed most by respondents, three primary themes emerged from the data (see Table 3 for quotes).
[Figure omitted. See PDF.]
AI Predictability. Respondents noted that ActiveScreener accelerates the screening process through predictive capabilities. Specifically, ActiveScreener reorders references based on individual patterns of inclusion and exclusion such that likely included articles are pushed to the top of the screening list.
Screening process. Respondents noted that ActiveScreener makes the screening process easier and faster. Specifically, all the information required for screening is available on one page including the article title, abstract, full text, and inclusion and exclusion criteria. This allows the screener to evaluate the article quickly.
User-friendly interface. Respondents noted that ActiveScreener has a user-friendly interface. For example, respondents noted ease of use and ability to access ActiveScreener from any device as a positive feature of this software.
Challenges.
For the question capturing any challenges experienced by respondents, two primary themes emerged from the data (see Table 3 for quotes).
Technical issues. Respondents noted that they encountered some technical difficulties and glitches while using ActiveScreener. For example, connection loss specific to the ActiveScreener website or processing or loading speeds were commonly described.
Article uploading. Respondents noted that uploading articles individually to each reference is time consuming and could result in errors such as a mismatch of articles to references.
Suggested improvements.
For the question capturing suggested improvements or additions to the program, three primary themes emerged from the data (see Table 3 for quotes).
Data extraction. Respondents noted that they would have liked the ability to either extract data directly within ActiveScreener or be able to export the included references with attached articles to other formats (e.g., SmartSheets).
Bulk upload. Respondents noted that they would like the ability to upload articles to references in bulk as opposed to one at a time.
Interface improvements. Respondents noted potential improvements to the user interface, for example, additional navigation options, a session counter of screened articles, and the ability to flag references with incorrect articles attached.
Discussion
In our study, we found that ActiveScreener performed above its expected 95% agreement in prediction at the title and abstract assessment phase of the SR and was found to be user-friendly by both novice and seasoned screeners. Consistent with past evidence that this program can reduce screening time and effort by nearly 50% [15], we observed similar results with a large-scale review of PTSD treatment outcomes.
Regarding its agreement, our confusion matrix results indicated that, when tested against a large-scale SR with over 10,000 articles screened at the title and abstract phase, ActiveScreener's predictive algorithm performed better than expected. While the software was expected to reach 95% sensitivity, the actual agreement between the machine learning model and our screeners in this review exceeded 95% (97.9%), which may have been aided by the high number of independent screeners on this project. Further, of the categories examined, discrepancies between the predictive algorithm and actual human screening outcomes were minimal. Specifically, there were no discrepancies between human screeners and the ActiveScreener AI with respect to articles that should be excluded from the SR. Only a small number of discrepancies were found in which human screeners indicated that articles should be included, while the ActiveScreener AI predicted that the articles would be conflicted (i.e., predicted that multiple human screeners would disagree on inclusion and exclusion) based on prior trends in human screening. This means that no studies that the ActiveScreener AI predicted to be included were excluded by screeners. Thus, these statistics, as yielded by the confusion matrix, indicate that ActiveScreener is a reliable and rigorous platform to accelerate screening at the title and abstract phase of SRs, especially when its predictive algorithm function is utilized. Future directions of this research should consider assessing ActiveScreener AI agreement with fewer screeners and with screeners of different experience levels. Previous research in other areas indicates that ActiveScreener maintains high levels of agreement with as few as two reviewers [9], and replication of this result would be beneficial for mental health-related SRs with more limited resources.
Further, to reduce human resources during screening, ActiveScreener should consider implementing new features such as bulk upload and templates for subsequent data extraction directly within the platform. Both would reduce the need to switch between programs when conducting reviews and would thereby reduce human resource requirements and the potential for error. Importantly, as decisions at the title and abstract phase were not compared directly against final inclusion decisions in this analysis, the magnitude of ActiveScreener's impact on the screening process in its entirety is not clear. However, given the high agreement in phase one, one can assume with relative confidence that high levels of agreement would have been maintained in the final phase of full-text screening.
In examining user feedback amongst a group of screeners, we found that ActiveScreener was endorsed as easy to learn and easy to use. However, user feedback also noted that there were software glitches, such as the platform being unavailable from time to time, as well as glitches when uploading articles and using other features. While these challenges do not undermine its use, they provide areas of opportunity for ActiveScreener programmers to consider for future research and development. Of note, while the administered survey was developed internally and assessed for face validity, reliability and other forms of validity were not examined. As such, this may have led to measurement error and quantitative results should be interpreted with caution.
Conclusion
In considering the merits of ActiveScreener, it should be noted that the software's machine learning algorithm relies on the rigour of training and the quality of the screening decisions on which it bases its feedback. As such, users must conduct training and screening with care. In particular, the clarity with which inclusion and exclusion criteria are applied during the initial screening stages is vitally important for building the accuracy and agreement of the predictive model, as well as for increasing agreement between human screeners and the model. Thus, researchers are encouraged to spend considerable time ensuring that the inclusion and exclusion criteria are clearly understood and reliably applied by all screeners during the project training stages. In addition, another time-saving feature of ActiveScreener, the deduplication function for uploaded references, could benefit from further development, as it currently limits deduplication to text only and does not extend to punctuation. Depending on the database, references may be exported with variable punctuation, which the feature does not handle, resulting in many duplicate references during screening. However, this can easily be solved with workarounds, such as manually combining search yields in R with custom code that deduplicates references prior to uploading to ActiveScreener. Finally, it is important to note that ActiveScreener's acceleration of screening is currently relevant only at the title and abstract stage and excludes further review of full texts. Thus, the current study findings and the potential time and resource savings are applicable only to the initial screening phase of SRs.
Additionally, this paper did not present screening decisions at the title and abstract assessment phase relative to the final sample of included articles, and therefore can only describe how well the ActiveScreener software performed compared to trained human screeners at this stage of the review process. Taken together, ActiveScreener appears to be a user-friendly and valuable platform for SRs and, when used appropriately, may be a useful tool during the initial screening process.
References
1. Gough D, Davies P, Jamtvedt G, Langlois E, Littell J, Lotfi T, et al. Evidence Synthesis International (ESI): Position Statement. Syst Rev. 2020;9(1):155. pmid:32650823
2. Jahan N, Naveed S, Zeshan M, Tahir MA. How to Conduct a Systematic Review: A Narrative Literature Review. Cureus. 2016;8(11):e864. pmid:27924252
3. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. pmid:33782057
4. Higgins J, Thomas J, Chandler J, Cumpston M, Li T, Page M, et al. Cochrane Handbook for Systematic Reviews of Interventions version 6.2. 2021; available from: www.training.cochrane.org/handbook
5. Belur J, Tompson L, Thornton A, Simon M. Interrater Reliability in Systematic Review Methodology: Exploring Variation in Coder Decision-Making. Sociological Methods & Research. 2021;50(2):837–865. https://doi.org/10.1177/0049124118799372
6. Covidence systematic review software, Veritas Health Innovation, Melbourne, Australia. Available from: www.covidence.org
7. Research Solutions (n.d.). Curedatis Systematic Review Engine. Available from: https://www.researchsolutions.com/curedatis
8. Howard BE, Phillips J, Tandon A, Maharana A, Elmore R, Mav D, et al. SWIFT ActiveScreener: Accelerated document screening through active learning and integrated recall estimation. Environ Int. 2020;138:105623. pmid:32203803
9. Elmore R, Schmidt L, Lam J, Howard BE, Tandon A, Norman C, et al. Risk and Protective Factors in the COVID-19 Pandemic: A Rapid Evidence Map. Frontiers in Public Health. 2020;8:582205. pmid:33330323
10. Lam J, Howard BE, Thayer K, Shah RR. Low-calorie sweeteners and health outcomes: A demonstration of rapid evidence mapping (rEM). Environment International. 2019;123:451–458. pmid:30622070
11. Liu JJW, Nazarov A, Easterbrook B, Plouffe RA, Le T, Forchuk C, et al. Four Decades of Military Posttraumatic Stress: Protocol for a Meta-analysis and Systematic Review of Treatment Approaches and Efficacy. JMIR Res Protoc. 2021;10:e33151. pmid:34694228
12. Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, et al. Welcome to the tidyverse. Journal of Open Source Software. 2019;4(43):1686.
13. Wickham H. stringr: Simple, Consistent Wrappers for Common String Operations. 2022; available from: https://stringr.tidyverse.org, https://github.com/tidyverse/stringr
14. Kuhn M. Building Predictive Models in R Using the caret Package. Journal of Statistical Software. 2008;28(5):1–26. https://doi.org/10.18637/jss.v028.i05
15. Howard BE, Phillips J, Miller K, Tandon A, Mav D, Shah MR, et al. SWIFT-Review: a text-mining workbench for systematic review. Syst Rev. 2016;5:87. pmid:27216467; PMCID: PMC4877757
Citation: Liu JJW, Ein N, Gervasio J, Easterbrook B, Nouri MS, Nazarov A, et al. (2024) Usability and agreement of the SWIFT-ActiveScreener systematic review support tool: Preliminary evaluation for use in clinical research. PLoS ONE 19(11): e0291163. https://doi.org/10.1371/journal.pone.0291163
About the Authors:
Jenny J. W. Liu
Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing
E-mail: [email protected]
Affiliations: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada, Department of Psychiatry, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada
ORCID: https://orcid.org/0000-0003-2213-1346
Natalie Ein
Roles: Formal analysis, Writing – original draft, Writing – review & editing
Affiliations: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada, Department of Psychiatry, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada
Julia Gervasio
Roles: Formal analysis, Writing – original draft, Writing – review & editing
Affiliations: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada, Department of Psychology, Toronto Metropolitan University, Toronto, Canada
Bethany Easterbrook
Roles: Conceptualization, Data curation, Project administration, Writing – review & editing
Affiliations: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada, Department of Psychology, Neuroscience & Behaviour, McMaster University, Hamilton, Canada
Maede S. Nouri
Roles: Formal analysis, Writing – review & editing
Affiliation: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada
Anthony Nazarov
Roles: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Writing – review & editing
Affiliations: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada, Department of Psychiatry, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada, Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
J. Don Richardson
Roles: Funding acquisition, Investigation, Project administration, Resources, Supervision, Writing – review & editing
Affiliations: The MacDonald Franklin Operational Stress Injury Research and Innovation Centre, Lawson Health Research Institute, London, ON, Canada, Department of Psychiatry, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada, St. Joseph Operational Stress Injury Clinic, St. Joseph’s Healthcare London, London, Ontario, Canada
© 2024 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
Systematic reviews (SRs) employ standardized methodological processes for synthesizing empirical evidence to answer specific research questions. These processes include rigorous screening phases to determine the eligibility of articles against strict inclusion and exclusion criteria. Despite these processes, SRs are a significant undertaking, and this type of research often necessitates extensive human resources, especially when the scope of the review is large. Given the substantial resources and time commitment required, we investigated a way in which the screening process might be accelerated while maintaining high fidelity and adherence to SR processes. More recently, researchers have turned to artificial intelligence-based (AI) software to expedite the screening process. This paper evaluated the agreement and usability of a novel machine learning program, Sciome SWIFT-ActiveScreener (ActiveScreener), in a large SR of mental health outcomes following treatment for PTSD. ActiveScreener exceeded the expected 95% agreement with screeners in predicting inclusion or exclusion of relevant articles at the title/abstract assessment phase of the review and was reported to be user friendly by both novice and seasoned screeners. ActiveScreener, when used appropriately, may be a useful tool when performing SRs in a clinical context.