Centre Cochrane Français, Paris, France

Université Paris Descartes - Sorbonne Paris Cité, Paris, France

INSERM U738, Paris, France

Assistance Publique-Hôpitaux de Paris, Hôpital Hôtel-Dieu, Centre d'Epidémiologie Clinique, Paris, France

INSERM CIE 4, Paris, France

Assistance Publique-Hôpitaux de Paris, Hôpital Européen Georges Pompidou, Unité de Recherche Clinique, Paris, France

Abstract

Background

Network meta-analysis (NMA), a generalization of conventional meta-analysis (MA), allows for assessing the relative effectiveness of multiple interventions. Reporting bias is a major threat to the validity of MA and NMA. Numerous methods are available for assessing the robustness of MA results to reporting bias. We aimed to extend such methods to NMA.

Methods

We introduced 2 adjustment models for Bayesian NMA. First, we extended a meta-regression model that allows the effect size to depend on its standard error. Second, we used a selection model that estimates the propensity of trial results being published and in which trials with lower propensity are weighted up in the NMA model. Both models rely on the assumption that biases are exchangeable across the network. We applied the models to 2 networks of placebo-controlled trials of 12 antidepressants, with 74 trials in the US Food and Drug Administration (FDA) database but only 51 with published results. NMA and adjustment models were used to estimate the effects of the 12 drugs relative to placebo, the 66 effect sizes for all possible pair-wise comparisons between drugs, the probabilities of being the best drug and the ranking of drugs. We compared the results from the 2 adjustment models applied to published data with those from NMAs of published data and NMAs of FDA data, the latter considered as representing the totality of the data.

Results

Both adjustment models yielded reduced estimated effects for the 12 drugs relative to placebo as compared with the NMA of published data. Pair-wise effect sizes between drugs, probabilities of being the best drug and rankings of drugs were modified. Estimated drug effects relative to placebo from both adjustment models were corrected (i.e., similar to those from the NMA of FDA data) for some drugs but not others, which resulted in differences in pair-wise effect sizes between drugs and in rankings.

Conclusions

In this case study, adjustment models showed that the NMA of published data was not robust to reporting bias and provided estimates closer to those of the NMA of FDA data, although not optimal. The validity of such methods depends on the number of trials in the network and on the assumption that conventional MAs in the network share a common mean bias mechanism.

Background

Network meta-analyses (NMAs) are increasingly being used to evaluate the best intervention among different existing interventions for a specific condition. The essence of the approach is that intervention A is compared with a comparator C, then intervention B with C, and adjusted indirect comparison allows for comparing A and B, despite the lack of any head-to-head randomized trial comparing A and B. An NMA, or multiple-treatments meta-analysis, allows for synthesizing comparative evidence for multiple interventions by combining direct and indirect comparisons.
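As a minimal illustration of the adjusted indirect comparison underlying NMA, the following Python sketch derives the A-vs-B estimate from two direct comparisons against a common comparator C (Bucher method). All effect sizes and standard errors here are made up for illustration:

```python
import math

def indirect_comparison(d_ac, se_ac, d_bc, se_bc):
    """Adjusted indirect comparison of A vs B through a common comparator C:
    the indirect estimate is the difference of the two direct estimates,
    and its variance is the sum of their variances."""
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac**2 + se_bc**2)
    return d_ab, se_ab

# Hypothetical direct estimates: A vs C = 0.40 (SE 0.10), B vs C = 0.25 (SE 0.12)
d_ab, se_ab = indirect_comparison(0.40, 0.10, 0.25, 0.12)
print(round(d_ab, 3), round(se_ab, 3))  # 0.15 0.156
```

Note that the indirect estimate is always less precise than either direct estimate, because the variances add.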

Reporting bias is a major threat to the validity of results of conventional systematic reviews or MAs.

Numerous methods have been used as sensitivity analyses to assess the robustness of conventional MAs to publication bias and related small-study effects.

Methods

First, we extended a meta-regression model of the effect size on its standard error, recently described for MAs.

Datasets used

A previous review by Turner et al. assessed the selective publication of antidepressant trials.

**Appendix 1.** Summary effect sizes for the 12 comparisons of each antidepressant agent and placebo. **Appendix 2.** WinBUGS codes. **Appendix 3.** Estimated parameters in the adjustment models applied to published data.


Contour-enhanced funnel plots for the antidepressant trials with published results

**Contour-enhanced funnel plots for the antidepressant trials with published results.** Each funnel plot is a scatter plot of the treatment effect estimates from individual trials against the associated standard errors; the vertical solid line represents the pooled estimate. In the absence of reporting bias, we would expect a symmetrical funnel plot. An asymmetrical funnel plot (i.e., one that does not resemble an inverted funnel) may be due to reporting bias; however, there are other possible sources of asymmetry. The contour lines represent perceived milestones of statistical significance (long dash, p = 0.1; dash, p = 0.05; short dash, p = 0.01). If studies seem to be missing in areas of non-significance, then asymmetry may be due to reporting bias rather than other factors.

Network meta-analysis

The standard model for NMA was formalized by Lu and Ades. Let y_ijk denote the observed effect size in trial i for the comparison of treatments j and k, and s_ijk its standard error. We assume that y_ijk > 0 indicates superiority of treatment k over treatment j. We model y_ijk ~ N(θ_ijk, s²_ijk), where θ_ijk is the true effect underlying each randomized comparison between treatments j and k. The trial-specific effects are in turn drawn from a common distribution, θ_ijk ~ N(d_jk, τ²_jk), and we assume a common between-trial variance for all comparisons (τ²_jk = τ²). This assumption can be relaxed.
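A minimal simulation can make the two-level structure of this model concrete. The Python sketch below simulates trials from the hierarchical model and recovers the mean effect; all numerical values are hypothetical, and simple inverse-variance pooling stands in for the full Bayesian (MCMC) estimation used in the paper:

```python
import math
import random

random.seed(42)

d_jk = 0.30   # assumed true mean effect of treatment k vs treatment j
tau = 0.10    # assumed common between-trial standard deviation

# Simulate 200 trials under the two-level model:
# theta_i ~ N(d_jk, tau^2), y_i ~ N(theta_i, s_i^2)
trials = []
for _ in range(200):
    s = random.uniform(0.05, 0.30)      # within-trial standard error
    theta = random.gauss(d_jk, tau)     # trial-specific true effect
    y = random.gauss(theta, s)          # observed effect size
    trials.append((y, s))

# Inverse-variance pooling using the marginal variance s^2 + tau^2
# (a frequentist shortcut, only to check that d_jk is recoverable)
w = [1.0 / (s**2 + tau**2) for _, s in trials]
d_hat = sum(wi * y for wi, (y, _) in zip(w, trials)) / sum(w)
print(round(d_hat, 2))  # close to the true d_jk = 0.30
```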

Adjustment models

Meta-regression model

We used a network meta-regression model extending a regression-based approach for adjusting for small-study effects in conventional MAs.

Figure A in the Additional file represents the model graphically. Here, θ_ijk is the treatment effect adjusted for small-study effects underlying each randomized comparison between treatments j and k, and the observed effect is modeled as y_ijk ~ N(θ_ijk + β_jk · s_ijk, s²_ijk), where β_jk represents the potential small-study effect (i.e., the slope associated with funnel plot asymmetry for the randomized comparisons between treatments j and k). The comparison-specific slopes are drawn from a common distribution, β_jk ~ N(β, τ_β²). This is equivalent to the assumption that comparison-specific small-study biases are exchangeable within the network. Since we assumed that y_ijk > 0 indicates superiority of treatment k over treatment j, we expect β to be positive when trials with larger standard errors tend to show larger treatment effects.
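The logic of this adjustment can be sketched numerically: simulate trials whose expected effect grows with the standard error, then extrapolate a weighted regression of effect on standard error back to a standard error of zero. The Python sketch below uses made-up values and ordinary weighted least squares in place of the Bayesian fit:

```python
import random

random.seed(7)

d_true, beta_true = 0.20, 1.5   # assumed true effect and small-study slope

# Simulate trials with a small-study effect: y ~ N(d + beta * s, s^2)
data = []
for _ in range(300):
    s = random.uniform(0.05, 0.40)
    y = random.gauss(d_true + beta_true * s, s)
    data.append((y, s))

# Weighted least-squares fit of y on s with weights 1/s^2; the intercept
# is the effect adjusted for small-study effects (extrapolation to an
# ideal trial with zero standard error)
W = sum(1 / s**2 for _, s in data)
Sx = sum(s / s**2 for _, s in data)        # sum of weights * s
Sy = sum(y / s**2 for y, s in data)
Sxx = sum(s * s / s**2 for _, s in data)   # sum of weights * s^2
Sxy = sum(y * s / s**2 for y, s in data)
det = W * Sxx - Sx * Sx
intercept = (Sxx * Sy - Sx * Sxy) / det
slope = (W * Sxy - Sx * Sy) / det
print(round(intercept, 2), round(slope, 2))  # near d_true and beta_true
```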

**Figures.** Graphical representation of the adjustment models (A) regression model and (B) selection model. A solid arrow indicates a stochastic dependence and a hollow arrow indicates a logical function.


Selection model

We used a model that adjusts for publication bias by using a weight function to represent the process of selection. The model includes an effect size model (i.e., the standard NMA model, which specifies what the distributions of the effect size estimates would be with no selection) and a selection model that specifies how these effect size distributions are modified by the process of selection.

The propensity of publication is modeled on the logistic scale as a function of the standard error, with comparison-specific coefficients drawn from common distributions: γ_0jk ~ N(γ_0, ψ_0²) and γ_1jk ~ N(γ_1, ψ_1²).

Figure B in the Additional file represents the model graphically. Here, p_i represents the propensity of the results of trial i to be published: γ_0jk sets the overall probability of observing a randomized comparison between treatments j and k, and γ_1jk controls how fast this probability evolves as the standard error increases. We expect γ_1jk to be negative, so trial results yielding larger standard errors have lower propensity to be published. The model assumes exchangeability of the γ_0jk and γ_1jk coefficients within the network. By setting w_ijk = (1/s²_ijk)/p_i, we define a simple scheme that weights up trial results with lower propensity of being published so that they have a disproportionate influence in the NMA model. The resulting θ_ijk is the treatment contrast corrected for the selection process underlying each randomized comparison between treatments j and k.
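A small numerical sketch of the selection mechanism (with hypothetical coefficients and trial data, not the paper's estimates) shows how the logistic propensity decreases with the standard error and how dividing the inverse-variance weight by the propensity weights up trial results that were less likely to be published:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients: gamma0 sets the overall publication
# probability; gamma1 < 0 makes publication less likely as the
# standard error grows
gamma0, gamma1 = 3.0, -10.0

trials = [(0.35, 0.05), (0.20, 0.15), (0.10, 0.30)]  # (effect, SE)

for y, s in trials:
    p = logistic(gamma0 + gamma1 * s)   # propensity of being published
    w = (1.0 / s**2) / p                # inverse-variance weight, scaled up
    print(f"SE={s:.2f}  p={p:.2f}  weight={w:.1f}")
```

The smallest trial (SE 0.30) has propensity 0.5, so its weight is doubled relative to plain inverse-variance weighting, partly offsetting the selective suppression of small null results.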

Model estimation

We estimated 4 models: the standard NMA model applied to published data, the 2 adjustment models applied to published data and the standard NMA model applied to FDA data. In each case, model estimation involved Markov chain Monte Carlo methods with Gibbs sampling. Placebo was chosen as the overall baseline treatment against which all other treatments were compared. Consequently, the 12 effects of drugs relative to placebo are the basic parameters. For 2 treatments j and k, values of d_jk > 0 indicate that treatment k is superior to treatment j.

In the standard NMA model, we defined prior distributions for the basic parameters and the common variance τ². In the selection model, we additionally required priors for (γ_0, ψ_0²) and (γ_1, ψ_1²). We considered p_min and p_max, the probabilities of publication when the standard error takes its minimum and maximum values across the network of published data, and specified beta priors for these probabilities. For trials with standard error equal to the minimum observed value, our guess was that the chances of p_min being < 50% were 5% and the chances of p_min being < 80% were 50%. For trials with standard error equal to the maximum observed value, our guess was that the chances of p_max being < 40% were 50% and the chances of p_max being < 70% were 95%. We discuss these choices further in the Discussion. From this information, we determined Beta(7.52, 2.63) and Beta(3.56, 4.84) as prior distributions for p_min and p_max, respectively. Finally, we expressed γ_0 and γ_1 in terms of p_min and p_max and chose uniform distributions in the range (0, 2) for the standard deviations ψ_0 and ψ_1. For each analysis, we constructed posterior distributions from 2 chains of 500,000 simulations each, after convergence was achieved over an initial 500,000 simulations per chain (burn-in). Analyses involved WinBUGS v1.4.3 (Imperial College and MRC, London, UK) to estimate all Bayesian models and R v2.12.2 (R Development Core Team, Vienna, Austria) to summarize inferences and assess convergence. Codes are reported in the Additional file.

Model comparison

We compared the results of the 2 adjustment models applied to published data with the results of the standard NMA model applied to published data and to the FDA data, the latter considered the reference standard. First, we compared posterior means and 95% credibility intervals for the 12 basic parameters and the common variance, as well as for the 66 functional parameters (i.e., all 12 × 11/2 = 66 possible pair-wise comparisons of the 12 drugs). Second, we compared the rankings of the competing treatments. We assessed the probability that each treatment was best, then second best, third best, etc. We plotted the cumulative probabilities and computed the surface under the cumulative ranking (SUCRA) line for each treatment.
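Rank probabilities and SUCRA values are straightforward to compute from posterior samples. The following Python sketch uses made-up posterior draws for three hypothetical drugs (the paper's network has 12 drugs and placebo):

```python
import random

random.seed(1)
n = 5000
# Hypothetical posterior samples of drug effects vs placebo (higher = better)
samples = {
    "A": [random.gauss(0.40, 0.08) for _ in range(n)],
    "B": [random.gauss(0.30, 0.08) for _ in range(n)],
    "C": [random.gauss(0.20, 0.08) for _ in range(n)],
}
drugs = list(samples)
T = len(drugs)

# rank_prob[t][r] = probability that drug t is ranked r-th (r = 0 is best)
rank_prob = {t: [0.0] * T for t in drugs}
for i in range(n):
    order = sorted(drugs, key=lambda t: samples[t][i], reverse=True)
    for r, t in enumerate(order):
        rank_prob[t][r] += 1.0 / n

# SUCRA: mean of the cumulative rank probabilities over ranks 1..T-1
# (1 = the drug always ranks first, 0 = it always ranks last)
sucras = {}
for t in drugs:
    cum, total = 0.0, 0.0
    for r in range(T - 1):
        cum += rank_prob[t][r]
        total += cum
    sucras[t] = total / (T - 1)
    print(f"{t}: P(best)={rank_prob[t][0]:.2f}  SUCRA={sucras[t]:.2f}")
```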

Results

In the selection model applied to published data, the posterior mean slope γ_1 was −10.0 (95% CrI, −18.0 to −2.50), so trial results yielding larger standard errors tended overall to have lower propensity to be published. In both models, all estimates were subject to large uncertainty (Additional file).

Table

| Parameter | FDA data: standard NMA model, mean (SD) | Published data: regression model, mean (SD) | Published data: selection model, mean (SD) | Published data: standard NMA model, mean (SD) |
|---|---|---|---|---|
| Θ_BUP | 0.176 (0.081) | 0.043 (0.256) | 0.229 (0.121) | 0.271 (0.139) |
| Θ_CIT | 0.240 (0.074) | 0.081 (0.171) | 0.254 (0.073) | 0.306 (0.076) |
| Θ_DUL | 0.300 (0.054) | 0.166 (0.190) | 0.340 (0.066) | 0.402 (0.058) |
| Θ_ESC | 0.310 (0.067) | 0.165 (0.193) | 0.311 (0.070) | 0.357 (0.068) |
| Θ_FLU | 0.256 (0.081) | 0.004 (0.160) | 0.215 (0.068) | 0.271 (0.074) |
| Θ_MIR | 0.351 (0.070) | 0.206 (0.331) | 0.424 (0.110) | 0.567 (0.092) |
| Θ_NEF | 0.256 (0.076) | 0.112 (0.260) | 0.348 (0.094) | 0.437 (0.094) |
| Θ_PAR | 0.426 (0.063) | 0.267 (0.346) | 0.438 (0.105) | 0.593 (0.078) |
| Θ_PAR CR | 0.323 (0.101) | 0.174 (0.187) | 0.309 (0.083) | 0.354 (0.085) |
| Θ_SER | 0.252 (0.077) | 0.210 (0.231) | 0.359 (0.094) | 0.419 (0.094) |
| Θ_VEN | 0.395 (0.071) | 0.199 (0.224) | 0.403 (0.092) | 0.504 (0.075) |
| Θ_VEN XR | 0.398 (0.094) | 0.261 (0.273) | 0.423 (0.110) | 0.506 (0.107) |
| τ | 0.060 (0.037) | 0.031 (0.024) | 0.024 (0.019) | 0.032 (0.025) |

Data are posterior means and standard deviations of the basic parameters (Θ) and the between-trial heterogeneity (τ).

Difference plots of estimates of pair-wise comparisons of the 12 antidepressant agents and placebo: regression model of published data vs. standard network meta-analysis (NMA) model of published data (left panel); selection model of published data vs. standard NMA model of published data (right panel)

**Difference plots of estimates of pair-wise comparisons of the 12 antidepressant agents and placebo: regression model of published data vs. standard network meta-analysis (NMA) model of published data (left panel); selection model of published data vs. standard NMA model of published data (right panel).** The x-axes show the estimates from the standard NMA model applied to published data; the y-axes show the differences between the estimates from the adjustment (regression or selection) model of published data and the estimates from the standard NMA model of published data. Black dots are the 12 estimated drug effects relative to placebo; white dots are the 66 possible pair-wise comparisons between the 12 drugs.

Figure

Probabilities that each antidepressant drug is the best according to standard NMA of FDA data, regression model, selection model or standard NMA model of published data

**Probabilities that each antidepressant drug is the best according to standard NMA of FDA data, regression model, selection model or standard NMA model of published data.**

Figure

Cumulative ranking probability plots for the 12 antidepressant agents from the standard NMA model applied to FDA data (bold solid line) and published data (bold dotted line) and from the 2 adjustment models applied to published data (regression model in plain dashed line and selection model in plain double-dashed line)

**Cumulative ranking probability plots for the 12 antidepressant agents from the standard NMA model applied to FDA data (bold solid line) and published data (bold dotted line) and from the 2 adjustment models applied to published data (regression model in plain dashed line and selection model in plain double-dashed line).** On each plot, the x-axis shows the possible ranks.

In the adjustment models applied to published data, between-trial heterogeneity and fit were comparable to those obtained with the standard NMA of published data (Tables).

| | Regression model | Selection model | NMA model |
|---|---|---|---|
| Mean posterior residual deviance (D̄) | 31.4 | 31.5 | 34.4 |
| Effective number of parameters (pD) | 15.9 | 14.7 | 13.9 |
| Deviance Information Criterion (DIC) | 47.3 | 46.2 | 48.3 |

Lower values of the DIC indicate a better trade-off between model fit and complexity.

The estimated drug effects relative to placebo from the regression and selection models were similar to those from the NMA of FDA data for some drugs but not others (Table).

Difference plots of estimates of pair-wise comparisons of the 12 antidepressant agents and placebo: standard NMA model of published data vs. standard NMA model of FDA data (upper panel); regression model of published data vs. standard NMA model of FDA data (bottom left panel); selection model of published data vs. standard NMA model of FDA data (bottom right panel)

**Difference plots of estimates of pair-wise comparisons of the 12 antidepressant agents and placebo: standard NMA model of published data vs. standard NMA model of FDA data (upper panel); regression model of published data vs. standard NMA model of FDA data (bottom left panel); selection model of published data vs. standard NMA model of FDA data (bottom right panel).** The x-axes show the estimates from the standard NMA model applied to FDA data; the y-axes show the differences between the estimates from the adjustment (regression or selection) model of published data and the estimates from the standard NMA model of FDA data. Black dots are the 12 estimated drug effects relative to placebo; white dots are the 66 possible pair-wise comparisons between the 12 drugs.

Discussion

We extended two adjustment methods for reporting bias from MAs to NMAs. The first method combined NMA and meta-regression models, with effect sizes regressed against their precision. The second combined the NMA model with a logistic selection model estimating the probability that a trial was published or selected in the network. The former method basically adjusts for funnel plot asymmetry or small-study effects, which may arise from causes other than publication bias. The latter adjusts for publication bias (i.e., the suppression of an entire trial depending on its results). The two models borrow strength from other trials in the network under the assumption that biases operate in a similar way in trials across the domain.

In a specific network of placebo-controlled trials of antidepressants, based on data already described and published previously by Turner et al., comparing the results of adjustment models applied to published data and those of the standard NMA model applied to published data allowed for assessing the robustness of efficacy estimates and ranking to publication bias or related small-study effects. Both models showed a decrease in all basic parameters (ie, the 12 effect sizes of drugs relative to placebo). The 66 contrasts for all possible pair-wise comparisons between drugs, the probabilities of being the best drug and the ranking were modified as well. The NMA of published data was not robust to publication bias and related small-study effects.

This specific dataset offered the opportunity to perform NMAs on both published and FDA data. The latter may be considered "an unbiased (but not the complete) body of evidence" for placebo-controlled trials of antidepressants.

Similar approaches have been used by other authors. Network meta-regression models fitted within a Bayesian framework were previously developed to assess the impact of novelty bias and risk of bias within trials.

The 2 adjustment models rely on the assumption of exchangeability of selection processes across the network; that is, biases, if present, operate in a similar way in trials across the network. In this case study, all studies were, by construction, industry-sponsored, placebo-controlled trials registered with the FDA, and for all drugs, results of entire studies remained unreported depending on the results.

The models we described have limitations. First, they would result in poor estimation of bias and effect sizes when the conventional MAs within the network include small numbers of trials. Second, the selection model depends on prior beliefs about p_min and p_max, the probabilities of publication when the standard error takes its minimum and maximum values across the network.

Conclusions

In conclusion, addressing publication bias and related small-study effects in NMAs was feasible in this case study. Validity may depend on there being sufficient numbers of trials in the network and on the assumption that the conventional MAs constituting the network share a common mean bias. Simulation analyses are required to determine under which conditions such adjustment models are valid. Application of such adjustment models should be replicated on more complex networks, ideally representing the totality of the data as in Turner's dataset, but our results confirm that authors and readers should interpret NMAs with caution when reporting bias has not been addressed.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

LT provided substantial contributions to conception and design, analysis and interpretation of data, drafted the article and revised it critically for important intellectual content. GC and PR provided substantial contributions to design and interpretation of data, and revised the article critically for important intellectual content. All authors read and approved the final manuscript.

Financial disclosure

Grant support was from the French Ministry of Health Programme Hospitalier de Recherche Clinique National (PHRC 2011 MIN01-63) and the European Union Seventh Framework Programme (FP7 – HEALTH.2011.4.1-2) under grant agreement no. 285453.

Acknowledgments

The authors thank Laura Smales (BioMedEditing, Toronto, Canada) for editing the manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here: