Summary of evidence
In this case study, we explored vibration of effect in more than 16 000 pooled analyses of IPD from 12 randomised controlled trials comparing canagliflozin with placebo for type 2 diabetes mellitus. We observed no Janus effect for the mean difference in HbA1c, suggesting that vibration of effect did not affect the direction of the observed effect, although the point estimate varied considerably. Almost all the pooled analyses of this endpoint showed significant differences, indicating that the uncertainty related to vibration of effect concerned the magnitude of the change. Nevertheless, the validity of a difference in HbA1c as a surrogate for macrovascular and microvascular complications in type 2 diabetes mellitus is controversial, and such a difference is difficult to interpret in terms of clinical relevance.11 33
With respect to our analyses examining the vibration of effect on major adverse cardiovascular events, a clinically relevant outcome, we observed no Janus effect. However, the vibration of effect did affect the detection of canagliflozin efficacy on major adverse cardiovascular events. At the level of individual studies, efficacy on these events was identified in only one of the 12 included trials, perhaps because of a lack of power, or because the effect was only present in certain populations, such as patients at high cardiovascular risk. Notably, two studies that included patients at high cardiovascular risk (NCT01032629 and NCT01989754) and reported major adverse cardiovascular events as their primary outcome were pooled into one paper,15 and the pooled analysis gave a significant result, although neither individual trial did on its own. This pooled analysis was planned a priori,15 and the efficacy of canagliflozin on major adverse cardiovascular events has since been robustly established in an independent study of patients with type 2 diabetes mellitus and chronic kidney disease,34 for which data were not available at the time of our initial request through the YODA project.
We observed a Janus effect in our analyses examining the vibration of effect on serious adverse events. However, because these results could be confounded by the results observed for major adverse cardiovascular events, we conducted a post hoc sensitivity analysis excluding these events from our definition of serious adverse events. This approach, which excludes the beneficial effects of a treatment from the composite of serious adverse events, has been proposed in the field of psoriasis research.35 Results on serious adverse events were robust in this sensitivity analysis.
Vibration of effect has been suggested as a standardised method for systematically evaluating the breadth and divergence of study results across various methodological choices.7 9 36 In the context of meta-analyses, this approach is quite similar to the GOSH (graphical display of study heterogeneity) method, which was proposed for meta-analyses of aggregated data.37 We believe that a method of this type, exploring all possible subsets of studies, makes even more sense in the context of pooled analyses, because these studies do not, by nature, exhaustively cover all existing studies (a minimal sketch of the all-subsets idea follows this paragraph). In addition, the use of IPD enabled us to explore and extract outcomes (ie, major adverse cardiovascular events and serious adverse events) that would have been difficult to extract from aggregated data, because they would not have been measured or reported in the initial publications, thus increasing the relevance of our study beyond the classic GOSH approach.
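To make the all-subsets idea concrete, the minimal sketch below enumerates every subset of at least two trials and computes a fixed effect, inverse variance pooled estimate for each. The per-trial effects, standard errors, and the fixed effect pooling are illustrative assumptions only; they are not the IPD models used in our analyses.

```python
# Minimal sketch of an all-subsets exploration of pooled estimates (GOSH-like idea).
# The per-trial effects, standard errors, and fixed effect pooling are illustrative only.
from itertools import combinations
import math

effects = [-0.8, -0.6, -1.1, -0.7, -0.9, -0.5]   # hypothetical per-trial mean differences
ses = [0.20, 0.25, 0.30, 0.22, 0.28, 0.35]       # hypothetical standard errors

results = []
for k in range(2, len(effects) + 1):             # every subset with at least two trials
    for subset in combinations(range(len(effects)), k):
        weights = [1 / ses[i] ** 2 for i in subset]                  # inverse variance weights
        pooled = sum(w * effects[i] for w, i in zip(weights, subset)) / sum(weights)
        se_pooled = math.sqrt(1 / sum(weights))
        ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)  # 95% confidence interval
        results.append((subset, pooled, ci))

print(f"{len(results)} pooled analyses generated from {len(effects)} trials")
```

Plotting the resulting estimates and confidence intervals, rather than printing a count, would give the graphical display from which the GOSH method takes its name.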
Lastly, as in previous work,7 9 36 we used a definition of the Janus effect that is contingent only on point estimates, not on statistical significance (a point estimate based check is sketched after this paragraph). When statistical significance was considered instead, our case study very rarely identified contradictory results. Some change in the direction of effect estimates, and occasionally in significance, is to be expected from sampling variability alone. Heterogeneity, bias in some of the initial studies, and the magnitude of the effect might also affect the existence of vibration of effect and the presence of a Janus effect. However, we believe that the bigger concern for pooled analyses is the presence of selection or availability bias in the IPD used in the meta-analysis.38
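For illustration, the sketch below shows one way to operationalise a Janus effect check that depends only on point estimates: it asks whether extreme percentiles of the distribution of estimates fall on opposite sides of the null. The percentile thresholds, the null value, and the simulated estimates are assumptions for this example, not the exact rule applied in our study.

```python
# Sketch of a Janus effect check based only on point estimates.
# Percentile thresholds (1st/99th) and the simulated estimates are illustrative assumptions.
import numpy as np

def janus_effect(estimates, null_value=0.0, lower_q=1, upper_q=99):
    """True if extreme percentiles of the point estimates fall on opposite sides of the null."""
    lo, hi = np.percentile(estimates, [lower_q, upper_q])
    return (lo - null_value) * (hi - null_value) < 0

rng = np.random.default_rng(0)
hba1c_estimates = rng.normal(-0.7, 0.15, size=16332)   # hypothetical mean differences in HbA1c
print(janus_effect(hba1c_estimates))                   # False: extreme percentiles both below 0
```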
Strengths and limitations of the study
The findings from our case study might not be generalisable across all fields. Firstly, we selected an example involving a large number of randomised controlled trials (and therefore a large number of possible combinations). Identifying vibration of effect in fields with only a few trials could be more challenging. Secondly, we only considered two methodological choices, namely study inclusion and timing of endpoints. Vibration of effect can be influenced by many other characteristics, including subgroup analyses, different definitions of outcomes (eg, a different construction of major adverse cardiovascular events), different groupings of doses, and different analytical strategies (eg, choice of one stage v two stage IPD, model specification, or different handling of missing data). Therefore, our results could in fact underestimate the vibration of effect that might result from the many other researchers' degrees of freedom in the analysis. However, our evaluation concerned 16 332 analyses and provides an idea of the impact of different trial combinations and repeated endpoint measures (the sketch after this paragraph shows the scale of this analysis space). Our choice to focus on such combinations of studies was driven by the fact that many pooled analyses are run with the risk that results are manipulated by selecting favourable combinations of studies.5 Nevertheless, subgroup analyses might be frequently conducted in pooled analyses (eg, for duloxetine5), a consideration that deserves attention in future research.
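As a rough check on the scale of the analysis space, the short calculation below shows one way a total of 16 332 analyses can arise from combinations of 12 trials and repeated endpoint measures. The assumption of subsets of at least two trials crossed with four endpoint timepoints is ours, made only to illustrate the arithmetic.

```python
# One way the reported total of 16 332 analyses can arise from 12 trials:
# all subsets of at least two trials, crossed with four endpoint timepoints (assumed).
from math import comb

n_trials = 12
subsets = sum(comb(n_trials, k) for k in range(2, n_trials + 1))   # 2**12 - 1 - 12 = 4083
timepoints = 4                                                      # assumed for illustration
print(subsets, subsets * timepoints)                                # 4083 16332
```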
Thirdly, we relied only on studies available on the YODA platform at the time of our request. All these studies were sponsored by Janssen, and we considered that this specific subset adequately represented the sample that a given sponsor would use when conducting a series of secondary analyses. Relying on IPD from such a homogeneous subset of studies allowed the quality of studies to be better assessed and the analyses to be standardised. Therefore, studies carefully selected for an analysis in this way could show less variability and less potential for vibration of effect than in the present study. We did not conduct a systematic search for other studies, for example, those conducted in an academic context. Whether the authors of academic studies would have shared the trial IPD necessary to conduct our analyses of vibration of effect is uncertain, and including studies of this type could have added heterogeneity and vibration of effect.
Implications of the findings
Our findings have several implications. Pooled analyses focusing on a subset of all available studies cannot simply be assumed to be the preferred method, especially when performing post hoc evaluations of published trials. In particular, our findings suggest that results from pooled analyses should be critically appraised. Health authorities, for instance, should not rely exclusively on findings from pooled analyses when approving treatments. Evidence suggests that findings from pooled analyses have been used to guide approvals by the European Medicines Agency,39 including that for nalmefene for alcohol use disorders.40 To enhance the quality of pooled analyses and the evidence they generate, we suggest that pooled analyses should be planned a priori, with detailed, pre-registered study protocols, as with prospective meta-analyses.41 This step would minimise any methodological changes during the analyses that could introduce vibration of effect. If pre-registration is not possible (eg, when the researchers conducting pooled analyses were not involved in the design or conduct of the original randomised controlled trials), analytical plans should be registered before data analysis, to maintain full transparency regarding any decision made during the conduct of the study, such as the selection of studies to be pooled in the analysis. Pooled analyses should rely on IPD from studies that are representative of the target population of interest and of high quality, in order to best estimate the estimands of interest. These steps will continue to be important as data sharing increases in medicine and secondary uses of this type become more popular.10 42
We think that the vibration of effect approach shows promise for exploring issues related to reproducibility, especially because overlapping meta-analyses with divergent conclusions are not rare in the literature.2 However, recommending implementation of the method in all IPD meta-analyses and pooled analyses would be premature. Therefore, we recommend that future research systematically explores vibration of effect in a large set of meta-analyses to give a better indication of its relevance. Such a study would also help to investigate associations of the vibration of effect and the Janus effect with parameters such as heterogeneity, effect size, study quality, and random sampling.
Conclusion
In this case study, we explored the vibration of effect in more than 16 000 pooled analyses of IPD from 12 randomised controlled trials comparing canagliflozin with placebo for the treatment of type 2 diabetes mellitus. We found substantial variation in the magnitude and, for serious adverse events, the direction of the estimated effects. These findings suggest that when conducting pooled analyses of IPD from randomised controlled trials, the selection of trials, the analysis of subsets of all trials, and the selection or availability of IPD could have considerable consequences for the estimation of treatment effects.