The Banksia plot: a method for visually comparing point estimates and confidence intervals across datasets

Abstract

Objectives

In research evaluating statistical analysis methods, a common aim is to compare point estimates and CIs calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a graphical method, the “Banksia plot”, to facilitate pairwise comparisons of different statistical analysis methods by plotting and comparing point estimates and CIs from each analysis method, both within and across datasets.

Study Design and Setting

The plot is constructed in three stages. Stage 1: To compare the results of two statistical analysis methods, for each dataset, the point estimate from the reference analysis method is centered on zero, and its confidence limits are scaled to range from −0.5 to 0.5. The same centering and scale adjustment values are then applied to the corresponding comparator analysis point estimate and confidence limits. Stage 2: A Banksia plot is constructed by plotting the centered and scaled point estimates from the comparator method for each dataset on a rectangle centered at zero, ranging from −0.5 to 0.5, which represents the reference method results. Stage 3: Optionally, a matrix of Banksia plots is graphed, showing all pairwise comparisons from multiple analysis methods. We illustrate the Banksia plot using two examples.

Results

Illustration of the Banksia plot demonstrates how the plot makes it immediately apparent whether there are differences in point estimates and CIs when using different analysis methods (example 1) or different data extractors (example 2). Furthermore, we demonstrate how different bases for ordering the CIs can be used to highlight particular differences (ie, in point estimates or CI widths).

Conclusion

The Banksia plot provides a visual summary of pairwise comparisons of different analysis methods, allowing patterns and trends in the point estimates and CIs to be easily identified.

What is new?

What was already known?

• In research evaluating statistical analysis methods, comparisons of point estimates and associated measures of precision are often undertaken to show the differences between analyses. These are often shown using tables and graphs that display point estimate summaries separately from measures of precision. It is difficult to compare multiple examples from different studies, as they may contain results from different outcomes whose data span different ranges of values.

What this adds to what was known?

• The Banksia plot centers and scales point estimates and confidence intervals (CIs) before graphing them together, allowing many comparisons to be easily visualized and assessed for patterns and trends, regardless of the ranges of values in the outcomes.

What is the implication and what should change now?

• The Banksia plot facilitates the comparison of point estimates and CIs from different statistical analyses in a pairwise manner in a single graph, allowing for assessments within and across different datasets. Multiple Banksia plots can be shown in a matrix of pairwise analysis comparisons, providing an efficient way to visualize many comparisons.

1 Introduction

In research evaluating statistical analysis methods, a common aim is to compare point estimates and measures of precision (eg, standard errors [SE] or CIs) calculated from different analyses. These comparisons are often undertaken in empirical studies where a set of statistical analysis methods is applied to real-world datasets to examine whether the choice of method matters in practice [1–3]. Plots are an important tool used in such studies to visually communicate similarities and differences in the results across the statistical methods [4, 5].

A particular complexity arising in empirical studies that aim to compare point estimates and their CIs is that the datasets are very likely to have different outcomes. While within a dataset this does not pose a problem – since the results yielded from the different statistical methods are on the same scale and therefore are directly comparable – across datasets with different outcomes (and scales), the results are not directly comparable. Take, eg, an empirical study in which the aim is to compare the results of a set of statistical methods applied to interrupted time series (ITS) studies. The included ITS studies are likely to examine different questions, and as such, measure different outcomes (which have different scales); eg, in one ITS study, the aim may be to examine the impact of bicycle helmet legislation on cycling fatalities, where the outcome is the rate of fatalities per 100,000 population [6], while another might be to examine the impact of the introduction of an antibiotic stewardship program in an emergency medical department, where the outcome is antibiotic drug use per 100 patient days [7].

We therefore aimed to develop a plot that facilitates pairwise comparisons of point estimates and CIs from different statistical analyses both within and across datasets: the Banksia plot. The plot is so named since it visually resembles the flowers of Banksia species that are endemic to Australia [8] (Fig 1). In this paper, we describe our methods for developing the plot, introduce the plot, and use two examples to illustrate its construction and utility. The examples stem from two empirical studies we have undertaken [3, 9], which had different aims and datasets but with the commonality that our interest was in comparing point estimates and CIs (estimated from different methods) across datasets with different outcomes. We provide code to implement the plots in Stata and R.

2 Method

In Section 2.1, we describe our methods for developing the plot. In Section 2.2, we describe the stages involved in constructing the Banksia plot to compare two statistical analyses and its extension to a matrix of Banksia plots for comparisons between more than two analyses. This is followed in Section 2.3 by a description of the empirical studies that are used to illustrate the application and interpretation of the plot. Code in Stata and R for creating the plots for the examples introduced in Section 2.3 is available in an online repository [10].

2.1 Methods for developing the plot

The impetus for developing the plot arose from an empirical study we undertook (described in Section 2.3.1), in which we aimed to jointly compare point estimates and CIs from different statistical methods. Prototypes of the plot to visually show the comparisons were developed by S.L.T. and discussed in meetings (with A.K., E.K., and J.E.M.), where decisions were made about the centering and scaling of the CIs and how to visually depict key components. S.L.T. subsequently revised the plot further, including amendments suggested during peer review.

2.2 Construction of the plot
2.2.1 Stage 1: centering and scaling CIs

The first stage in creating the Banksia plot involves centering and scaling the point estimate and CI from the analysis of interest (which we refer to as ‘the comparator’) to that estimated by ‘the reference’ analysis. This involves two steps, with the formulae shown in Table 1 and depicted via example in Figure 2.

In the first step, the point estimate and confidence limits from the reference analysis are centered on zero. To center the point estimate, subtract the value of the reference analysis point estimate from itself. To center the CI for the reference analysis around zero, subtract the point estimate for the reference analysis from the upper and lower limits of the CI for the reference analysis. This process maintains the CI width. This same adjustment is also applied to the comparator point estimate and its confidence limits. This maintains the same absolute difference between the point estimates (and their respective confidence limits) across the two analyses. Importantly, these adjustments assume that the confidence limits are symmetric around their point estimates, so for ratio measures (such as risk ratios and odds ratios), the point estimates and their confidence limits would need to be log transformed prior to making these adjustments.
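The centering step, including the log transformation needed for ratio measures, can be sketched as follows. This is a minimal illustration in Python, not the authors' Stata or R code; the function names and the risk ratio values are hypothetical and chosen only to show that the confidence limits become symmetric around the point estimate on the log scale.

```python
import math

def log_transform(estimate, lower, upper):
    """Return the natural log of a ratio estimate and its confidence limits."""
    if min(estimate, lower, upper) <= 0:
        raise ValueError("ratio measures must be strictly positive")
    return math.log(estimate), math.log(lower), math.log(upper)

def center(estimate, lower, upper, reference_estimate):
    """Step 1: subtract the reference point estimate from all three values."""
    return (estimate - reference_estimate,
            lower - reference_estimate,
            upper - reference_estimate)

# Hypothetical reference analysis: risk ratio 2.0 with CI 1.0 to 4.0.
# On the original scale the limits are asymmetric (2 - 1 = 1 vs 4 - 2 = 2);
# on the log scale they are symmetric around the point estimate.
ref = log_transform(2.0, 1.0, 4.0)
ref_centered = center(*ref, reference_estimate=ref[0])
```

After centering, the reference point estimate is zero and the CI width on the log scale is unchanged, which matches the description of step 1 above.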

In the second step, the CI limits of the reference analysis are scaled to range from −0.5 to 0.5 (a width of 1). This is achieved by dividing the adjusted reference analysis CI limits by the width of the CI. The adjusted point estimate and CI limits of the comparator analysis are also divided by the CI width of the reference analysis. This centering and scaling process preserves the relative magnitudes of the difference in point estimates and CI widths between the two analyses, and importantly, allows for comparisons across datasets where the outcomes and scale ranges differ.
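The two steps can be combined into a single calculation. The sketch below is written in Python for illustration only (the authors' own code is in Stata and R); the `Interval` type, the function name, and the numeric values are assumptions introduced here, not part of the original study.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    estimate: float
    lower: float
    upper: float

def center_and_scale(reference: Interval, comparator: Interval):
    """Apply step 1 (centering) and step 2 (scaling) to both analyses.

    Every value has the reference point estimate subtracted and is then
    divided by the width of the reference CI.
    """
    width = reference.upper - reference.lower
    if width <= 0:
        raise ValueError("reference CI must have positive width")

    def adjust(value: float) -> float:
        return (value - reference.estimate) / width

    return (Interval(*(adjust(v) for v in reference)),
            Interval(*(adjust(v) for v in comparator)))

# Hypothetical results from one dataset (symmetric reference CI).
ref_scaled, comp_scaled = center_and_scale(
    Interval(10.0, 6.0, 14.0),   # reference analysis
    Interval(12.0, 7.0, 17.0),   # comparator analysis
)
```

In this hypothetical case the reference becomes (0, −0.5, 0.5) and the comparator becomes (0.25, −0.375, 0.875). The comparator CI has a scaled width of 1.25, so it is 25% wider than the reference CI, and its point estimate sits a quarter of a reference CI width above the reference estimate.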

2.2.2 Stage 2: the Banksia plot – comparing multiple scaled CIs

In the second stage, the Banksia plot is constructed from the scaled point estimates and confidence limits calculated for each dataset in stage 1 (Fig 3). The reference analysis CIs, all of which are centered at zero and range from −0.5 to 0.5, are shown as a colored rectangle in the background. Next, the scaled comparator analysis point estimates (white squares) and CIs (vertical blue lines) are plotted. Differences in point estimates between the two analyses are indicated by displacement of the left-hand diamond, which shows the median of the scaled differences, from zero (the value indicating no difference in point estimates). Differences in the relative widths of the CI are indicated by where the comparator analysis CI falls relative to the reference rectangle. CIs falling within the reference rectangle indicate that the comparator analysis CI is narrower than the reference analysis CI, and CIs extending beyond the reference rectangle indicate the opposite.
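The quantities read off the plot can also be computed directly from the scaled values. The following Python sketch is illustrative only; the function name, the dictionary keys, and the example values are assumptions. It relies on the fact that the reference CI always has a scaled width of 1, so a comparator CI with scaled width below 1 is narrower than the reference CI.

```python
from statistics import median

def summarize(scaled):
    """Summarize scaled comparator results across datasets.

    `scaled` is a list of (estimate, lower, upper) tuples that have
    already been centered and scaled against the reference analysis.
    """
    differences = [estimate for estimate, _, _ in scaled]
    widths = [upper - lower for _, lower, upper in scaled]
    return {
        # Position of the diamond: median of the scaled differences.
        "median_scaled_difference": median(differences),
        # The reference CI has width 1 on the scaled axis.
        "n_narrower": sum(w < 1 for w in widths),
        "n_wider": sum(w > 1 for w in widths),
    }

# Hypothetical scaled comparator results for three datasets.
example = [(0.25, -0.375, 0.875), (-0.125, -0.375, 0.125), (0.0, -0.5, 0.5)]
summary = summarize(example)
```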

The order of the lines can be varied to investigate different patterns; for example, the lines in Figure 3 are ordered by CI width, while those in Figure 4 are ordered by point estimate value. The vertical range of the scale can be truncated (Fig 3) to enhance the visibility of most CIs. However, care should be taken to indicate any CIs that extend beyond the plot's range, for example, by using arrows at one or both ends. To avoid obscuring the widths of any extremely wide CIs, this plot may be supplemented with an additional plot where the vertical range is wide enough to display the full widths of all CIs. Other plot options are described in Box 1.
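Ordering and truncation are simple operations on the scaled values. The Python sketch below is one possible way to prepare the data before plotting; the function names, the returned keys, and the example values are hypothetical.

```python
def order_by_width(scaled):
    """Order CIs from the narrowest to the widest comparator CI."""
    return sorted(scaled, key=lambda ci: ci[2] - ci[1])

def order_by_estimate(scaled):
    """Order CIs from the smallest to the largest scaled point estimate."""
    return sorted(scaled, key=lambda ci: ci[0])

def truncate(ci, y_min, y_max):
    """Clip a CI to the plotted range and flag where arrows are needed."""
    _, lower, upper = ci
    return {
        "lower": max(lower, y_min),
        "upper": min(upper, y_max),
        "arrow_below": lower < y_min,  # CI extends below the plotted range
        "arrow_above": upper > y_max,  # CI extends above the plotted range
    }

# Hypothetical scaled comparator results as (estimate, lower, upper).
example = [(0.25, -0.375, 0.875), (-0.125, -0.375, 0.125), (0.0, -2.0, 3.0)]
by_width = order_by_width(example)
by_estimate = order_by_estimate(example)
clipped = truncate(example[2], y_min=-1.5, y_max=1.5)
```

Keeping the arrow flags alongside the clipped limits makes it harder to truncate a CI silently.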

2.2.3 Stage 3: matrix of Banksia plots – comparing multiple analyses

Empirical studies often evaluate multiple different analyses. However, a single Banksia plot compares only two analyses. The plot can be extended to display multiple comparisons by creating a matrix of Banksia plots (Fig 5). Pairwise comparisons are plotted with the columns indicating the reference analysis and the rows indicating the comparator analysis. Other plotting options are described in Box 1.
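The data behind a matrix of Banksia plots can be assembled by looping over every pair of analyses. The Python sketch below is illustrative only: the nested dictionary layout, the function names, and the choice to treat the first-listed method of each pair as the reference are assumptions made here, and the input values are hypothetical.

```python
from itertools import combinations

def scale(reference, comparator):
    """Center and scale a comparator (estimate, lower, upper) tuple."""
    ref_estimate, ref_lower, ref_upper = reference
    width = ref_upper - ref_lower
    return tuple((value - ref_estimate) / width for value in comparator)

def banksia_matrix(results):
    """Build scaled comparator results for every pair of methods.

    `results[method][dataset_id]` holds an (estimate, lower, upper) tuple.
    Panels are keyed by (comparator, reference), i.e. (row, column).
    """
    panels = {}
    for reference, comparator in combinations(results, 2):
        shared = sorted(results[reference].keys() & results[comparator].keys())
        panels[(comparator, reference)] = [
            scale(results[reference][d], results[comparator][d]) for d in shared
        ]
    return panels

# Hypothetical results: six methods applied to one dataset "A", all with
# point estimate 10 and CI half-widths of 1 through 6.
results = {
    f"method {i}": {"A": (10.0, 10.0 - i, 10.0 + i)} for i in range(1, 7)
}
panels = banksia_matrix(results)
```

With six methods this yields 15 panels. In the hypothetical data, the panel comparing method 2 with method 1 contains a comparator CI from −1 to 1, twice the width of the reference rectangle.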

2.3 Description and aims of the illustrative examples
2.3.1 Example 1: comparison of statistical methods for ITS

The first example that we use to demonstrate the utility of the Banksia plot stems from an empirical study we undertook to compare point estimates and CIs calculated from six statistical analysis methods applied to 190 real-world ITS datasets [3]. The inclusion criteria for the datasets were based only on the characteristics of the design, with no restrictions on the research questions addressed (except that they had to address an interruption that had public health implications). This led to the inclusion of time series datasets that had different outcomes and associated ranges, thus necessitating standardization of the point estimates and CIs for comparison across the datasets. Eg, in one study, the aim was to examine the impact of the First World War on suicide rates per 100,000 population, with values ranging from 1.2 to 3.76 [11]; in another study, the aim was to examine the impact of automatic reporting of screening results for chronic kidney disease in a single province, where the outcome was the number of new referrals, with values ranging from 77 to 173 [12]; while in another study, the aim was to examine the country-wide impact of introducing a generic version of a drug, where the outcome was the utilization of drugs in the defined daily doses, with values ranging from 2.48 million to 2.74 million [13]. For this example, we illustrate the Banksia plot for comparing point estimates and CIs for the immediate level change estimator. This estimator represents the change in the outcome immediately after the interruption's introduction. The matrix of Banksia plots displays plots for the comparison of each statistical method with all others.

2.3.2 Example 2: assessing the accuracy of effect estimates calculated from digitally extracted data

The second example that we use to demonstrate the use of the Banksia plot stems from a study we undertook to assess the accuracy of effect estimates calculated from 43 datasets that were obtained using digital data extraction by four researchers from published ITS graphs [9]. The datasets were a subset of those used in example 1, with the additional criterion that the published manuscript included a graph for which we were able to obtain the original study data (referred to as “provided” data from here on). As per the first example, the time series datasets had different outcomes and associated ranges. For this example, we illustrate the Banksia plot for comparing point estimates and CIs for the immediate level-change as well as the slope-change estimators. The slope-change estimator represents the change in the trend per unit of time from preinterruption to postinterruption times. The matrix of Banksia plots displays plots for the comparison of each extractor with all others and the provided data.

3 Results
3.1 Example 1: comparison of statistical methods for ITS

Here, we illustrate how the Banksia plot can facilitate the understanding of how point estimates and CIs differ across the statistical methods applied to the same datasets. The matrix of Banksia plots (Fig 5) displays 15 pairwise comparisons among the six statistical methods. For the purpose of our illustration, we refer to the statistical methods as ‘method 1’ through to ‘method 6’. We use color coding to denote a relationship between the methods. Specifically, method 3, where the CIs are light blue, is a modification of method 2, where the CIs are dark blue. Similarly, method 6, where the CIs are orange, is a modification of method 5, where the CIs are yellow.

The primary focus of these plots was to compare the widths of the CIs between the methods; hence, the CIs were ordered from the smallest comparator CI width to the largest. The different shapes created by the comparator CIs relative to the reference rectangle make it immediately apparent that the choice of statistical method for analyzing ITS datasets can have an important impact on CI widths. Furthermore, in all plots the leftmost diamond, which represents the median of the scaled differences, sits atop the solid black horizontal line at zero, indicating no difference (on average) between the statistical methods. It is visually clear from the white points lying on the y = 0 line that the related methods 2 and 3 yield exactly the same point estimates (row 2, column 2), and this is similarly the case for methods 5 and 6 (row 5, column 5). While there is no difference (on average) in the point estimates between the statistical methods, it is easy to identify from the plot the methods for which there is larger variation in the differences (for example, method 5 [or equivalently method 6] vs method 2 [or equivalently method 3]).

3.2 Example 2: assessing the accuracy of effect estimates from digitally extracted data

Here we illustrate how the Banksia plot can facilitate the understanding of how point estimates and CIs differ when the same statistical method is applied to datasets obtained from different data extractors. The matrix of Banksia plots (Fig 6) displays 20 pairwise comparisons between the four data extractors and the provided data. Ten of the comparisons show the results for the level-change estimator (top right triangle) and the other 10 show the results for the slope-change estimator (bottom left triangle). Here, each data extractor is assigned a different color, with the provided data colored black.

The primary focus of the plots was to compare the consistency of the point estimates and widths of the CIs between the extractors and provided data; hence, we present a matrix plot that only includes the comparisons between extractors and provided data (Fig 7). This is a subset of the matrix plot that includes all pairwise comparisons (Fig 6). Presenting a subset of all pairwise comparisons may be desirable when there is a clear reference method (eg, new method against several existing benchmark methods) since the graph will be less cluttered, and thus patterns can be more easily identified for the comparisons which are of key interest. Note that in both figures for this example, the CIs were ordered by a dataset identification number, ensuring that each line corresponds to a specific dataset across all subplots. For example, the 43rd line in all subplots represents the estimate of the 43rd dataset.

Our full matrix plot (Fig 6) makes it immediately apparent that the point estimates are very similar for all extractors and datasets, as indicated by the leftmost diamond, which represents the median of the scaled differences, sitting atop the solid black horizontal line at zero, indicating no difference (on average) between the extractors. It is also apparent that extractor 4's data extraction led to less accurate point estimates compared with the other extractors, as indicated by greater scatter of the white dots (scaled differences in point estimates). Furthermore, on closer inspection, the scaled differences (white dots) from a few studies (eg, studies 20, 33, 42 for level change) were slightly lower for multiple extractors compared with the provided data. Upon checking the original graphs from which these data were extracted, we found that these graphs had errors in them, making them different from the provided datasets [9]. This would have been impossible to discover if only summary statistics were reported or if the graphs used did not identify the study from which each data point was obtained.

Furthermore, our matrix plots clearly reveal that the CIs were largely similar, as evidenced by the comparator CIs forming rectangular shapes that sit atop the reference rectangles. Note that these patterns differ compared with most of those in Figure 5 from example 1. Similar to the patterns observed for the scaled differences, extractor 4's data extraction led to more variation in the CI widths compared with the other extractors and the provided data, although the differences were minimal.

4 Discussion

We have introduced a new type of data visualization – the Banksia plot – which facilitates pairwise comparisons of point estimates and CIs from different statistical analyses both within and across datasets. Through a process of centering and scaling the point estimates and CIs, the plot can be used even when the datasets have different outcomes and scales. We have shown through two examples how the Banksia plot makes immediately apparent differences in point estimates and CIs, when they exist. We have provided code to implement the plots in Stata and R.

While we have shown the Banksia plot in the context of statistical empirical evaluations, the plot could also be used in numerical simulation studies, where the purpose would be to compare results arising from the use of different statistical methods. Furthermore, the plot can be used to examine a range of statistical methods, including those that may impact SEs and CIs or those that only impact CIs.

An advantage of the plot is that it allows the joint comparison of point estimates and CIs, which may provide a greater insight than when graphs and tables are used to display point estimates and measures of precision separately. The graph can be ordered in such a way that it will better highlight the primary focus of the investigation (eg, ordering from the smallest to largest comparator CIs when the widths of CIs are the focus). Finally, by showing the comparison for each dataset, rather than reporting summary statistics, a more detailed examination of individual results can be undertaken. This can be useful for investigating patterns and deviations.

A disadvantage of the plot is that because the comparisons of the point estimates and CIs are relative, it is not immediately obvious whether the differences on an absolute scale are important and would lead to a different interpretation. However, additional color coding or line patterning could be used to indicate, eg, whether a P value threshold differed between the comparator and reference analysis or if one CI crossed a clinically important threshold and another did not.
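One way such a flag might be derived is sketched below in Python. This is a hypothetical helper, not a feature of the authors' code: the function names, the default threshold of zero, and the color names are all assumptions, and the CIs are taken on the original (unscaled) outcome scale.

```python
def crosses(lower, upper, threshold):
    """Return True when the threshold lies strictly inside the CI."""
    return lower < threshold < upper

def discordant(reference_ci, comparator_ci, threshold=0.0):
    """Return True when exactly one of the two CIs crosses the threshold.

    Each CI is a (lower, upper) pair on the original outcome scale.
    """
    return crosses(*reference_ci, threshold) != crosses(*comparator_ci, threshold)

# Hypothetical CIs: the reference CI includes zero, the comparator CI does not.
flag = discordant((-1.0, 3.0), (0.5, 2.5))
line_color = "red" if flag else "blue"  # hypothetical color scheme
```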

Future research needs to be undertaken to ascertain the users' understanding of the Banksia plot and whether modifications are necessary to improve understanding. As a first step, a ‘think-aloud’ study could be undertaken where users are asked to interpret the plot, and their feedback is used to identify features that may need modification [14].

5 Conclusion

The Banksia plot facilitates the comparison of point estimates and CIs from different statistical analyses in a pairwise manner, allowing for assessments within and across different datasets. By centering and scaling the point estimates and CIs, the plot can be used even when the datasets have different outcomes and scales. Multiple Banksia plots can be shown in a matrix of pairwise analysis comparisons, providing an efficient way to visualize many comparisons. Differences in point estimates and CIs can be easily identified via the different shapes created by the comparator CIs relative to the rectangle created by the reference CIs.

Ethics statement

Not applicable. No animal or human data or tissues were involved.

Consent for publication

Not applicable. No data from any individual person is used.

CRediT authorship contribution statement

Simon L. Turner: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Amalia Karahalios: Writing – review & editing, Supervision, Methodology, Investigation. Elizabeth Korevaar: Writing – review & editing, Validation, Software, Investigation. Joanne E. McKenzie: Writing – review & editing, Writing – original draft, Supervision, Methodology, Investigation, Funding acquisition.

Declaration of competing interest

The authors declare that they have no competing interests.

Acknowledgments

The authors wish to thank the other contributing authors of the example papers (Andrew B. Forbes, Jeremy M. Grimshaw, Miranda Cumpston, Monica Taljaard, and Raju Kanukula); the Monash Methods for Evidence Synthesis Unit for additional feedback and conversations on the naming of the graph; and Andrew B. Forbes for additional assistance in the construction of the plot through developmental feedback and supervision.

**Table 1**
Formulae for centering and scaling the confidence intervals (CIs)

| Process | Calculation | Reference: point estimate | Reference: CI lower | Reference: CI upper | Comparator: point estimate | Comparator: CI lower | Comparator: CI upper |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Original | | $\hat{\theta}_{R}$ | $\mathrm{CIL}_{R}$ | $\mathrm{CIU}_{R}$ | $\hat{\theta}_{C}$ | $\mathrm{CIL}_{C}$ | $\mathrm{CIU}_{C}$ |
| Step 1: centering | Subtract the reference analysis point estimate ($\hat{\theta}_{R}$) | $\hat{\theta}_{R} - \hat{\theta}_{R} = 0$ | $\mathrm{CIL}_{R} - \hat{\theta}_{R}$ | $\mathrm{CIU}_{R} - \hat{\theta}_{R}$ | $\hat{\theta}_{C} - \hat{\theta}_{R}$ | $\mathrm{CIL}_{C} - \hat{\theta}_{R}$ | $\mathrm{CIU}_{C} - \hat{\theta}_{R}$ |
| Step 2: scaling | Divide by the reference analysis CI width ($\mathrm{CI}_{R_{\mathrm{width}}} = \mathrm{CIU}_{R} - \mathrm{CIL}_{R}$) | $0$ | $\frac{\mathrm{CIL}_{R} - \hat{\theta}_{R}}{\mathrm{CI}_{R_{\mathrm{width}}}}$ | $\frac{\mathrm{CIU}_{R} - \hat{\theta}_{R}}{\mathrm{CI}_{R_{\mathrm{width}}}}$ | $\frac{\hat{\theta}_{C} - \hat{\theta}_{R}}{\mathrm{CI}_{R_{\mathrm{width}}}}$ | $\frac{\mathrm{CIL}_{C} - \hat{\theta}_{R}}{\mathrm{CI}_{R_{\mathrm{width}}}}$ | $\frac{\mathrm{CIU}_{C} - \hat{\theta}_{R}}{\mathrm{CI}_{R_{\mathrm{width}}}}$ |
