
Visualizing and diagnosing spillover within randomized concurrent controlled trials through the application of diagnostic test assessment methods | BMC Medical Research Methodology


The objectives here are to demonstrate how an arms-based analysis enables the visualization and possible diagnosis of spillover among RCCT’s, and why this is not possible with a traditional contrast-based analysis. Three aspects are discussed below: the nature of spillover; why spillover is not identifiable using conventional contrast-based methods; and the novel application of diagnostic test assessment (DTA) methods, as an arms-based method, together with measures of dispersion to enable its recognition. The tutorial uses data from 190 RCCT’s abstracted in 13 Cochrane reviews of various antimicrobial versus non-antimicrobial based interventions to prevent pneumonia in ICU patients as an example for diagnosing spillover. Spillover has long been postulated for antimicrobial interventions in this context but never formally evaluated.

Spillover and infection prevention interventions

Spillover is an indirect effect, mediated by contagion within a population, that originates from those who receive an intervention of interest and impacts those who do not [5,6,7,8]. Spillover is important to consider in estimating the population level effects of infection prevention interventions such as vaccination programs against contagious infections such as COVID, cholera, typhoid, and influenza [6, 7]. In these examples, spillover mediates herd protection through lowering the infection rate both in the recipients of the intervention and, indirectly, in the concurrent non-recipients within the same population. Moreover, in evaluating the population level effects of vaccination interventions, the causal inference (efficacy) for individuals is not of primary interest whereas the population effectiveness is [5, 6].

Ideally, RCCT’s enable the estimation of an intervention ES by comparing the event rates in concurrent control and intervention groups. In conducting an RCCT of an infection prevention intervention any spillover will influence the event rate within the concurrent control groups of the RCCT’s although the size of the spillover effect will vary between RCCT’s. Hence spillover will likely amplify any inherent dispersion of the event rate among concurrent control groups depending on the strength of this indirect effect. By contrast, the dispersion among the corresponding intervention groups will mostly reflect the heterogeneity in the ES of the various infection prevention interventions under study within different RCCT’s in addition to the inherent dispersion in the incidence rate (Fig. 1). Hence, the overall ES estimate from an RCCT incorporates both the direct effect of the intervention on the intervention group individuals plus any indirect spillover effect, whether positive or negative, on the control group individuals.
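The way spillover enters the apparent ES can be sketched with simple arithmetic. The following Python snippet is an illustration only: it assumes a purely additive model on the risk difference scale, and the function name, the 20% baseline, and the effect sizes are hypothetical values chosen for the example, not data from the reviews.

```python
def apparent_risk_difference(baseline, direct_effect, spillover):
    """Return the apparent risk difference (intervention minus control).

    All three arguments are hypothetical inputs on the proportion scale:
    baseline      -- incidence with no exposure to the intervention
    direct_effect -- shift in the intervention group (negative = prevention)
    spillover     -- shift in the concurrent control group
                     (negative = beneficial, positive = harmful)
    """
    intervention_rate = baseline + direct_effect
    control_rate = baseline + spillover
    return intervention_rate - control_rate


# Hypothetical numbers: 20% baseline incidence, direct effect of -8 points.
no_spillover = apparent_risk_difference(0.20, -0.08, 0.00)
harmful = apparent_risk_difference(0.20, -0.08, 0.05)
beneficial = apparent_risk_difference(0.20, -0.08, -0.05)
```

Under these assumed numbers, harmful spillover makes the intervention appear more protective (about -0.13 rather than -0.08), whereas beneficial spillover attenuates the apparent ES (about -0.03), even though the direct effect is unchanged.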

Fig. 1

Schema of six conditions of spillover (‘a’ – ‘f’) contributing to heterogeneity in pneumonia incidence proportions among component groups of ICU patients within RCCT’s contained within a systematic review in relation to a clinically relevant incidence range (dotted lines) within this population. Note that pneumonia in the ICU context arises from colonization which is contagious within the ICU context. Movement to the right and left represents increasing or decreasing pneumonia incidence above or below the upper or lower end of the clinically relevant range (dotted red or yellow lines, respectively). The dotted rectangles at left represent systematic reviews reporting data for control () and intervention () groups of RCCT’s. The conditions (‘a’ – ‘f’) provide exposure to interventions which might be effective (-) or ineffective (±) at preventing pneumonia for individuals within intervention groups. Any intervention or spillover effect will contribute to heterogeneity at both the level of the group and the study ES. At the ‘herd’ (group) level there is either no spillover (0) or spillover which is beneficial (-) or harmful (+) towards pneumonia incidence for individuals within control groups. The nett result is the apparent ES reported as the summary ES in each RCCT and systematic review. Note that ‘-’ equates to prevention (i.e. reduction) in pneumonia and ‘+’ is the converse. a. Unexposed (Pre-exposure) component groups to intervention (potential outcomes not yet observed). b. Ineffective intervention and no spillover. c. Effective intervention and no spillover. d. Effective intervention and spillover which is beneficial (reducing pneumonia). e. Effective intervention and spillover which is harmful (increasing pneumonia). f. Effective intervention and spillover which is harmful (increasing pneumonia) but is uneven, being present among some RCCT’s and not others.

Diagnosing spillover, being a population (i.e. herd) level effect manifest on individuals, will require the following conditions: a defined end point of interest, clusters of populations of interest, an intervention of interest with spillover potential, and the exposure, or not, of these multiple comparable defined populations (i.e. exchangeable herds) to the intervention with incomplete penetrance. An example of where these conditions have been met is the inference of spillover on typhoid incidence among individuals within eighty Kolkata neighbourhoods cluster randomized to receive exposure, or not, to a population typhoid vaccination program delivered with incomplete penetrance to individual residents within the neighbourhoods [8].

Identifying spillover will require methods to quantify the amount and direction of increased dispersion in the event rate among the non-recipients within these populations relative to one of three comparators. First, this could be relative to background dispersion such as that among the non-recipients within herds exposed to an intervention ineffective against the end point of interest. In the typhoid example, the ineffective intervention was neighbourhood exposure to a hepatitis vaccination program [8]. Second, this could be relative to the dispersion among the recipients of the effective intervention. Third, this could be relative to a clinically relevant incidence range for the end point of interest for the population of interest, where this is available.

DTA and the arms-based framework

Clinical studies of diagnostic tests differ fundamentally from RCCT’s in that the study sub-populations, those with versus without the disease of interest, have not been defined by random allocation [9]. Also, the diagnostic test threshold typically varies across studies to accommodate “rule in” versus “rule out” testing strategies [10,11,12]. The stable unit treatment value assumption (SUTVA) is generally neither valid nor a relevant consideration in relation to DTA. Hence DTA meta-analyses are undertaken within an arms-based framework with the test performance characteristics reported as summary sensitivities and specificities. Whilst a summary diagnostic odds ratio (DOR) might be available, this is not generally of interest except when comparing the results for different diagnostic tests or applications in different populations. Whereas the aggregation of results from high quality RCCT studies to achieve a more precise causal effect estimate is usually a realistic and desirable goal (in the absence of spillover), this is not the case for DTA.

For DTA, the primary interest is the projection to future applications of the test. To achieve this objective, current DTA methods provide three outputs not usually of interest within a contrast-based synthesis [13,14,15]. First, the study level and summary sensitivities and specificities are often provided together with associated 95% confidence intervals. Second, DTA methods provide the summary receiver operating characteristic (SROC) plot, which displays both the dispersion in the sensitivities and specificities and how they co-vary across the aggregated studies. The visual representation of the SROC plot summary has evolved over time from a summary point (Q*, where sensitivity equals specificity), to the SROC curve and, most recently, to a 95% confidence ellipse [9]. Third, the dispersion of sensitivity and specificity is visualized in the SROC plot as a 95% prediction ellipse. These outputs are of great interest towards projecting the future utility of the diagnostic test to applications in comparable populations.

Parallels between the SROC and L’Abbé plots

The SROC plot derived within a DTA resembles the L’Abbé plot as derived within a meta-analysis of RCCT’s. Each displays the dispersion in event rates in the two component groups, along the y-axis for one versus the x-axis for the other [16, 17]. For the L’Abbé plot, these are the event rates in the intervention versus control groups, respectively. For the SROC plot, these are the test positive rates among the diseased (sensitivity) versus the non-diseased (which equates to 1 minus specificity), respectively. In both cases, the diagonal (y = x line) represents the locus where the event rates in the two populations in the comparison are equal. The two plots differ in how the covariation away from this line is displayed and how event rate dispersion is inferred. For the L’Abbé plot, the visual representation of covariation depends on whether the ES is defined as a risk difference (RD), a risk ratio (RR) or an odds ratio (OR): it is, respectively, a line parallel to the y = x line, a line that passes through the origin, or a curve. For the L’Abbé plot, dispersion is assessed merely as a subjective visual impression, which is governed by whether the presumptive underlying relationship is an RD, RR or OR.
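The three shapes on the L’Abbé plot follow directly from the definitions of RD, RR and OR. As a minimal sketch (the function name and the example values are hypothetical, and the control rate is assumed to be strictly below 1 for the OR case), the intervention group rate implied by a fixed ES at a given control group rate can be computed as:

```python
def intervention_rate(control_rate, measure, value):
    """Intervention-group event rate implied by a fixed effect size.

    measure -- "RD" (risk difference), "RR" (risk ratio) or "OR" (odds ratio)
    value   -- the assumed, constant effect size on that scale
    """
    if measure == "RD":
        # Constant offset: a line parallel to y = x.
        return control_rate + value
    if measure == "RR":
        # Constant multiplier: a straight line through the origin.
        return control_rate * value
    if measure == "OR":
        # Constant multiplier on the odds scale: a curve on the plot.
        odds = value * control_rate / (1.0 - control_rate)
        return odds / (1.0 + odds)
    raise ValueError("measure must be 'RD', 'RR' or 'OR'")


# Hypothetical control-group rates across a set of trials.
control_rates = [0.1, 0.3, 0.5, 0.7]
rd_locus = [intervention_rate(p, "RD", -0.05) for p in control_rates]
rr_locus = [intervention_rate(p, "RR", 0.5) for p in control_rates]
or_locus = [intervention_rate(p, "OR", 0.5) for p in control_rates]
```

Plotting each locus against `control_rates` reproduces the parallel line, the line through the origin, and the curve, respectively.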

For the SROC plot, on the other hand, the underlying relationship is always expressed as an OR. The dispersion in event rates is quantified as a summary point together with an enveloping 95% prediction ellipse, which enables projections of the sensitivity and specificity to future applications of the diagnostic test.

The most recent DTA methods require logistic transformation of sensitivity and specificity, with the covariation defined within either bivariate or multi-level random effects models [18,19,20]. On logistic transformation, the SROC relationship is a linear (straight line) regression which, on back transformation to the original (probability) scale, becomes curved. The SROC plot displays the summary operating point, which maps the summary values of sensitivity and specificity along the SROC curve within the plot. Moreover, these models provide joint (bi-directional) 95% confidence regions, drawn as ellipses, rather than two separate unidirectional 95% confidence limits, and they also provide 95% prediction ellipses. On back transformation to the probability scale, these 95% regions lose their elliptical shape.
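The effect of back transformation can be illustrated with a deliberately simplified sketch. It assumes a single straight line on the logit scale, logit(y) = intercept + slope × logit(x), and is not the bivariate or multi-level model used by the DTA software; the function names and parameter values are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit: map log-odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def sroc_point(x_rate, intercept, slope):
    """y-axis rate implied by a straight line on the logit scale."""
    return expit(intercept + slope * logit(x_rate))


# Hypothetical line: slope 1 and intercept log(0.5), i.e. a constant OR of 0.5.
intercept, slope = math.log(0.5), 1.0
xs = [0.2, 0.5, 0.8]
ys = [sroc_point(x, intercept, slope) for x in xs]

# Equal steps in x give unequal steps in y on the probability scale,
# so the back-transformed relationship is a curve rather than a line.
first_gradient = (ys[1] - ys[0]) / (xs[1] - xs[0])
second_gradient = (ys[2] - ys[1]) / (xs[2] - xs[1])
```

The same non-linear mapping applied to every boundary point of an ellipse drawn on the logit scale is why the back-transformed confidence and prediction regions are no longer elliptical.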

Indicators of dispersion

Dispersion of ES estimates within the contrast-based framework is of interest towards understanding the stability of the ES estimate. Commonly calculated measures are τ², I², and H², although each is an imperfect measure that is widely misinterpreted [21]. For example, I² and H² are relative rather than absolute measures: I² is the proportion of the observed variance that might be due to variation in true effects rather than sampling error, and H² is the ratio of the observed variation to that expected from sampling error alone [22]. The 95% prediction limits, although less commonly reported, are considered a better representation of the potential dispersion of the ES estimate. That there are > 200 types of graphical display available for meta-analyses and systematic reviews in part reflects that dispersion is best appreciated when visualized [16, 17].
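For readers unfamiliar with how these metrics relate to one another, the following sketch computes them from study level ES estimates and their within-study variances. It assumes the DerSimonian-Laird moment estimator for τ², which is one common choice and not necessarily the estimator used in the analyses reported here; the function name and the input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, tau^2 (DerSimonian-Laird), I^2 and H^2.

    effects   -- study-level effect estimates (e.g. log odds ratios)
    variances -- their within-study sampling variances
    Requires at least two studies.
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = k - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)                 # absolute between-study variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0  # relative: share of variance
    h2 = q / df                                   # relative: observed vs expected
    return {"Q": q, "tau2": tau2, "I2": i2, "H2": h2}


# Hypothetical log odds ratios with equal sampling variances.
result = heterogeneity([-1.0, 0.0, 1.0], [0.1, 0.1, 0.1])
```

Note that I² and H² depend on the within-study variances as well as on the spread of true effects, which is one reason they are easily misread as absolute measures of dispersion.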

A key role for graphical displays of dispersion, within both the contrast-based and the arms-based frameworks, is to identify the balance between potential outlier and inlier study results in relation to the summary effect. The L’Abbé plot is not optimal in this role compared to other methods [23]. Another method for achieving this visually is the caterpillar plot, which is a forest plot with the studies ordered by increasing study specific incidence or ES [24]. However, caterpillar plots are infrequently used because their interpretation is limited if there are insufficient studies. Additionally, within the arms-based framework, there is the potential to reference either a clinically relevant range, where this is available either from expert opinion or independent sources, or a range that is considered meaningful [25].

The above commentary does not consider the application of contrast-based versus arms-based analysis within network meta-analysis. This is an active area of research beyond what is considered here in the diagnosis of spillover on concurrent control groups within infection prevention RCCT’s [26].

Illustrative example

Pneumonia prevention among ICU patients

Patients receiving mechanical ventilation are at high risk of acquiring pneumonia (ventilator-associated pneumonia; VAP) whilst in the intensive care unit (ICU) [27,28,29,30]. An extensive range of methods, being either non-antimicrobial [31,32,33,34,35,36,37,38,39] or antimicrobial [40,41,42,43] based, have been studied among patients receiving, or likely to receive, mechanical ventilation towards preventing VAP. Many of the interventions studied in these RCCT’s are included within national programs aiming for “pneumonia zero” [30]. Of note, the pneumonia incidence in the ICU population is considered by experts to lie between 5 and 40% [28] or, as a more conservative range, between 8 and 28% [29]. Length of ICU stay is a strong correlate of pneumonia incidence [27].

These RCCT’s have been summarized within Cochrane reviews [31,32,33,34,35,36,37,38,39,40,41,42,43]. The summary ES estimates derived within these Cochrane reviews indicate pneumonia incidence reductions of > 50% with antimicrobial based interventions [40,41,42,43], whereas non-antimicrobial based interventions [31,32,33,34,35,36,37,38,39] achieve more modest or no significant reductions.

Antimicrobial based interventions, using either topical antiseptics and oral care [40, 41] or antibiotics [42, 43], were presumed to alter the microbiome of the entire ICU. This spillover of intervention effect was anticipated from the first study [44] being postulated as “….having heavily contaminated patients next to decontaminated patients might adversely affect the potentially beneficial results [postulate one]. Secondly, a reduction of the number of contagious patients by applying [selective digestive decontamination] SDD in half of them, might reduce the acquisition, colonisation and infection incidence in the not-SDD-treated control group [postulate two].” [44].

Whilst antimicrobial interventions are believed to mediate prevention by altering the ICU microbiome [45,46,47], neither the size nor the direction of spillover has ever been estimated despite > 60 RCT’s and > 50 systematic reviews and meta-analyses of antimicrobial based interventions. The original presumption that the spillover from antimicrobial based interventions, as for the herd effects of vaccination interventions, would always be beneficial has never been proven [44]. By contrast, any spillover for non-antimicrobial interventions will likely be minimal, because they are relatively ineffective at preventing pneumonia and also because they have minimal impact on the ICU microbiome.

This tutorial uses the data from 190 RCCT’s abstracted in 13 Cochrane reviews of non-antimicrobial [31,32,33,34,35,36,37,38,39] and antimicrobial based [40,41,42,43] interventions to prevent pneumonia in ICU patients receiving, or likely to receive, mechanical ventilation. This collection of studies has been analysed elsewhere [48], where additional details together with both an arms-based and a traditional contrast-based analysis of the data are available.

Pneumonia prevention among ICU patients: the data and the interventions

The non-antimicrobial category includes upper gastro-intestinal tract (UGIT) [31], feeding [32,33,34], airway [35,36,37,38], and probiotic [39] based interventions. The antimicrobial category includes topical antiseptic or oral care [40, 41], and topical antibiotic [42, 43] based interventions.

For some antimicrobial RCCT’s the control group patients received a protocolized antimicrobial intervention in addition to standard care. These RCCT’s, here termed antimicrobial duplex studies, are separately classified in the Cochrane reviews [40,41,42,43] and here constitute a third category.

All data analyzed are provided in the supplemental material. The data is arrayed in a layout as for the analysis of a diagnostic test with the count of patients with pneumonia and the count without pneumonia for the intervention and control groups, respectively. The Stata commands are listed in the supplement.

Contrast-based analysis

For the contrast-based analysis, the meta-analysis models of prevention ES, with associated estimates of heterogeneity, were fitted using mixed-effects methods with the ‘meta’ and ‘meta meregress’ commands in Stata 18 (Stata Corp., College Station, TX, USA) [49].

Arms-based analysis

For the arms-based analysis, the pneumonia count data was analysed as if for a diagnostic test with the counts in the intervention and control groups representing the disease positive and negative groups, respectively. The analysis was conducted as if for a DTA using the ‘metandi’ user command to generate summary measures of ‘sensitivity’ and ‘1 minus specificity’ (pneumonia incidences in the intervention and control groups, respectively) [13]. The SROC plots were generated with the ‘metandiplot’ command [13]. SROC plots generated using the more recently developed ‘metadta’ command [14] are also displayed for comparison.
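The re-labelling of trial arms as a diagnostic 2 × 2 table can be sketched as follows. This Python snippet is an illustration of the mapping described above and is not the Stata code from the supplement; the function names, the dictionary keys (the conventional DTA cell labels tp, fn, fp, tn) and the example counts are assumptions made for the example.

```python
def arms_to_dta_layout(int_events, int_total, ctl_events, ctl_total):
    """Re-label one RCCT's arm-level counts as diagnostic 2x2 cells.

    Intervention group plays the role of 'disease positive':
        tp = with pneumonia, fn = without pneumonia
    Control group plays the role of 'disease negative':
        fp = with pneumonia, tn = without pneumonia
    """
    return {
        "tp": int_events,
        "fn": int_total - int_events,
        "fp": ctl_events,
        "tn": ctl_total - ctl_events,
    }


def pseudo_sensitivity(cells):
    """'Sensitivity' = pneumonia incidence in the intervention group."""
    return cells["tp"] / (cells["tp"] + cells["fn"])


def pseudo_one_minus_specificity(cells):
    """'1 minus specificity' = pneumonia incidence in the control group."""
    return cells["fp"] / (cells["fp"] + cells["tn"])


# Hypothetical trial: 12/100 pneumonia with intervention, 25/100 in controls.
cells = arms_to_dta_layout(12, 100, 25, 100)
y_axis = pseudo_sensitivity(cells)            # plotted on the SROC y-axis
x_axis = pseudo_one_minus_specificity(cells)  # plotted on the SROC x-axis
```

With this layout, a trial in which the intervention prevents pneumonia falls below the diagonal of the SROC plot, because the intervention group incidence (y) is lower than the control group incidence (x).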

Diagnostic approaches

The diagnosis of spillover requires the identification of increased dispersion in event rate, whether assessed visually, within SROC plots, or by using heterogeneity metrics, among control groups of RCCT’s within these three categories. There are three approaches to assessing this dispersion:

  • by comparison to the dispersion among the corresponding intervention groups receiving the antimicrobial intervention,

  • by comparison to the dispersion among the control groups within RCCT’s of an ineffective intervention, which here is the non-antimicrobial based RCCT’s,

  • by comparison to the clinically relevant pneumonia incidence benchmark range [28, 29].

All three approaches are used here.
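As a rough numerical counterpart to these visual comparisons, the sketch below counts how many group level incidences fall outside a benchmark range and compares the spread of two sets of groups on the logit scale. It is a simple descriptive illustration, not the model-based dispersion estimates produced by the DTA commands; the function names and incidence values are hypothetical, and only the benchmark ranges (8 to 28% and 5 to 40%) come from the text.

```python
import math
import statistics


def benchmark_counts(incidences, low, high):
    """Count group incidences below, within and above a benchmark range."""
    below = sum(1 for p in incidences if p < low)
    above = sum(1 for p in incidences if p > high)
    return {"below": below,
            "within": len(incidences) - below - above,
            "above": above}


def logit_sd(incidences):
    """Sample standard deviation of incidences on the logit scale."""
    logits = [math.log(p / (1.0 - p)) for p in incidences]
    return statistics.stdev(logits)


# Hypothetical group-level pneumonia incidences (proportions).
control = [0.05, 0.10, 0.30, 0.45]
intervention = [0.08, 0.10, 0.12, 0.15]

narrow = benchmark_counts(control, 0.08, 0.28)  # conservative range [29]
wide = benchmark_counts(control, 0.05, 0.40)    # broader expert range [28]
dispersion_ratio = logit_sd(control) / logit_sd(intervention)
```

A dispersion ratio well above 1, or many control groups above the benchmark range, would be in keeping with the pattern expected under harmful spillover, although a crude ratio such as this ignores sampling error within each group.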

The principal analysis examines the three broad categories of intervention. A secondary level of analysis, located in the supplement, explores the intervention subcategories corresponding to listings within individual Cochrane reviews [30,31,32,33,34,35,36,37,38,39,40,41,42,43].

Simulation studies

To explore the utility of DTA methods for visualizing spillover, I conducted simulation studies based on the non-antimicrobial studies. The RCCT’s of non-antimicrobial interventions can be expected to have spillover between control and intervention groups at a level that would be no greater than that occurring in the ICU context under standard operating conditions.

To simulate negative spillover, the control group pneumonia count was decreased by 2.5 or 5 per 100 control group patients. This equates to the conditions of Fig. 1d.

Positive spillover was simulated under conditions of uniform (Fig. 1e) or partial (Fig. 1f) spillover across RCCT’s. To simulate uniform positive spillover, the control group pneumonia count was increased by 2.5, 5, or 10 per 100 control group patients. Spillover that was positive and partial was simulated by increasing the control group pneumonia count by 10 or 20 per 100 control group patients in half or a quarter of randomly selected control groups.

The outcomes of the simulations were assessed using the SROC plots and the metrics of heterogeneity associated with the control groups.
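The perturbation applied in these simulations can be sketched as below. This is an illustrative reconstruction rather than the code used for the published simulations: the function name, the rounding of fractional counts, the clamping of counts to the range 0 to group size, and the seeded random selection are all assumptions, as are the example counts.

```python
import random


def shift_control_counts(trials, per_100, fraction=1.0, seed=0):
    """Shift control-group pneumonia counts to mimic spillover.

    trials   -- list of (events, total) tuples, one per control group
    per_100  -- change in events per 100 patients (negative = fewer events)
    fraction -- share of control groups, chosen at random, that are shifted
    seed     -- seed for the random selection (assumed, for reproducibility)
    """
    rng = random.Random(seed)
    n_shift = round(len(trials) * fraction)
    chosen = set(rng.sample(range(len(trials)), n_shift))
    shifted = []
    for index, (events, total) in enumerate(trials):
        if index in chosen:
            delta = round(per_100 * total / 100)          # assumed rounding
            events = min(total, max(0, events + delta))   # assumed clamping
        shifted.append((events, total))
    return shifted


# Hypothetical control groups as (pneumonia count, group size).
controls = [(20, 100), (30, 200), (10, 50), (5, 40)]

uniform = shift_control_counts(controls, per_100=10)                 # cf. Fig. 1e
partial = shift_control_counts(controls, per_100=10, fraction=0.5)   # cf. Fig. 1f
reduced = shift_control_counts(controls[:2], per_100=-5)             # cf. Fig. 1d
```

The shifted counts would then be passed, together with the unchanged intervention group counts, through the same arms-based analysis to see whether the simulated spillover is visible in the SROC plot and the heterogeneity metrics.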


