Introduction
Trial sequential analysis (TSA) is an increasingly used tool for assessing the conclusiveness of evidence synthesized from systematic reviews and meta-analyses (SRMAs) [1–3]. TSA builds on the concept of cumulative meta-analysis, in which each study is added to the evidence synthesis sequentially according to its publication time. Because a hypothesis test is repeated each time a study is added, multiplicity issues arise; TSA applies statistically rigorous methods to adjust the overall type I and type II error rates, thus reducing the likelihood of false positive and false negative conclusions. Moreover, TSA can estimate the required information size (RIS), akin to a sample size calculation in a clinical trial, which helps to determine whether a meta-analysis has adequate statistical power [4]. If the RIS is not achieved, TSA provides boundaries that help assess the statistical significance (monitoring boundaries) or futility (futility boundaries) of an experimental intervention, in a manner similar to interim analyses of clinical trials. Hereafter, we refer collectively to monitoring and futility boundaries as decision boundaries.
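To make the analogy with a sample size calculation concrete, the following is a minimal sketch of a RIS calculation for a binary outcome. It assumes the conventional two-group formula based on the relative risk reduction and the control group event rate, inflated by 1/(1 − D²) when diversity D² is present. The function name `required_information_size` and the example inputs are hypothetical and are not taken from any particular software package or reviewed study.

```python
import math
from statistics import NormalDist


def required_information_size(control_event_rate, rrr, alpha=0.05, beta=0.20,
                              diversity=0.0):
    """Sketch of a diversity-adjusted RIS for a binary outcome.

    control_event_rate: assumed event proportion in the control group
    rrr: anticipated relative risk reduction (e.g. 0.20 for 20%)
    alpha: two-sided type I error rate
    beta: type II error rate (power = 1 - beta)
    diversity: D^2 expressed as a proportion in [0, 1)
    """
    if not 0.0 <= diversity < 1.0:
        raise ValueError("diversity must lie in [0, 1)")
    p_control = control_event_rate
    p_experimental = p_control * (1.0 - rrr)
    delta = p_control - p_experimental            # absolute risk difference
    p_bar = (p_control + p_experimental) / 2.0    # average event proportion
    variance = p_bar * (1.0 - p_bar)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    fixed_effect_ris = 4.0 * (z_alpha + z_beta) ** 2 * variance / delta ** 2
    # Assumed heterogeneity adjustment: inflate by 1 / (1 - D^2).
    return fixed_effect_ris / (1.0 - diversity)


# Hypothetical inputs: 20% control event rate, 20% RRR, D^2 = 25%.
ris = required_information_size(0.20, 0.20, diversity=0.25)
print(math.ceil(ris))
```

Every argument in this sketch corresponds to an assumption that a reader would need in order to recompute the RIS, which is why unreported values prevent reproduction.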
Transparency and reproducibility are essential in validating the conclusions derived from TSAs [5]. Recent years have seen significant improvements in the reporting quality of SRMAs, owing to checklists such as the PRISMA statement [6]. However, the reporting quality and reproducibility of TSAs remain unclear. Table 1 outlines three key components of a TSA: the RIS, decision boundaries, and the Z-curve (comprising Z-statistics from cumulative meta-analyses). It also specifies the reporting elements necessary to facilitate the reproduction of TSAs. The aim of this cross-sectional meta-epidemiological study is to assess the reproducibility of TSAs in recent SRMAs.
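As an illustration of the inputs behind decision boundaries, the sketch below computes information fractions as cumulative sample sizes divided by the RIS, and the cumulative type I error spent at each fraction. It assumes the O'Brien-Fleming-type alpha spending function, one of the forms proposed by Lan and DeMets; the function names and the example numbers are hypothetical. Turning spent alpha into the monitoring boundary values themselves requires further numerical integration, which is not shown here.

```python
import math
from statistics import NormalDist


def information_fractions(sample_sizes, ris):
    """Cumulative sample size after each study, divided by the RIS."""
    fractions = []
    cumulative = 0
    for n in sample_sizes:  # studies ordered by publication time
        cumulative += n
        fractions.append(cumulative / ris)
    return fractions


def obf_alpha_spent(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t.

    Assumed form: alpha(t) = 2 - 2 * Phi(z_{1 - alpha/2} / sqrt(t)).
    """
    if t <= 0.0:
        return 0.0
    t = min(t, 1.0)  # no additional alpha is spent beyond the RIS
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - NormalDist().cdf(z / math.sqrt(t)))


# Hypothetical example: three studies and a RIS of 2000 participants.
for t in information_fractions([100, 300, 600], ris=2000):
    print(f"fraction={t:.2f}  alpha spent={obf_alpha_spent(t):.6f}")
```

Under this spending function very little alpha is spent at small information fractions, so early cumulative meta-analyses face stricter significance thresholds than later ones.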
Table 1. Checklist for reporting methods used for performing TSAs
Element in TSA | Reporting item |
---|---|
RIS | • Type I error rate • Type II error rate (or statistical power) • Diversity (if heterogeneity is present) • Minimally relevant differences and variances for continuous outcomes • Relative risk reductions and assumed event rates in control groups for binary outcomes |
Decision boundaries | • Data used for deriving information fractions (typically the cumulative sample sizes of individual studies divided by the RIS) • Spending functions for deriving adjusted type I and type II error rates for decision boundaries (optional, as the functions suggested by Lan and DeMets [7] are typically used) |
Z-curve |