Many tools exist to evaluate methodological quality and risks of bias in systematic reviews, but they have been developed with different purposes, and choosing among them is difficult. More than 40 critical appraisal tools exist to evaluate the content and measurement properties of systematic reviews.8 9 After these reviews were published, two new tools were developed (ie, ROBIS and AMSTAR-2),10 11 and one is under development (Risk of Bias in Network Meta-Analysis (RoB NMA)).12 13
The decision about how to evaluate overall risk of bias for ROBIS is made at the assessors' discretion, as opposed to the AMSTAR-2 overall judgement, which is prescribed by AMSTAR-2 guidance. Examples of how to interpret methodological quality and risk of bias assessments, and how to make an overall judgement are found in box 1.
Box 1 Decision rules: how to decide that the results of a review are of high quality or at low risk of bias overall
Decision rules are strategies, specified a priori, that define explicitly how each item is rated and how an overall judgement is made about a specific systematic review with the AMSTAR-2 and ROBIS tools. In the case of AMSTAR-2, the authors of the tool stipulate how to come to an overall high quality rating in the results of the review, but not how to rate each item. For example, item 15 of AMSTAR-2 asks assessors whether an adequate investigation of publication bias (small study bias) was conducted and whether its likely effect on the results was discussed. However, the AMSTAR-2 team did not specify what happens when 10 studies or fewer were included (ie, the analysis will be underpowered to detect publication bias), what methods to detect publication bias are recommended, and if publication bias is detected, how it should be discussed (ie, as a systematic review limitation).
The ROBIS tool likewise does not specify what decision rules should be used for assessment of risk of bias, nor how to come to an overall judgement. For example, item 4.6 of ROBIS ("Were biases in primary studies minimal or addressed in the synthesis?") is similar to item 12 of AMSTAR-2 ("If meta-analysis was performed, did the review authors assess the potential impact of risk of bias in individual studies on the results of the meta-analysis?"). Of note, risk of bias should be assessed in any systematic review regardless of whether a meta-analysis was performed. A possible decision rule for answering these two questions, when judging whether bias was addressed in the results and their interpretation, could be to respond "Yes" or "Probably/Partial Yes" if:
All studies received a low risk of bias rating; or
Studies were judged at high risk of bias and sensitivity analyses (grouping high v low risk studies in a meta-analysis) or adjustment approaches were used
For a "No" response:
Important biases suspected in the included studies were ignored by the review authors; or
Risk of bias was not assessed at all in the included studies; or
Bias was assessed but authors did not incorporate it into findings, discussion, and conclusions
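The example decision rule above can be written down as a short function, which may help two assessors apply it in the same way. This is a minimal sketch and not part of either tool: the class `ReviewFacts`, its field names, and the function `rate_bias_item` are hypothetical. It assumes that the two "Yes" conditions are alternatives, that any "No" condition takes precedence, and that cases the rule does not cover are returned for assessor judgement.

```python
from dataclasses import dataclass


@dataclass
class ReviewFacts:
    """Facts an assessor extracts from a review (hypothetical field names)."""
    rob_assessed: bool                    # risk of bias assessed in included studies?
    all_studies_low_risk: bool            # every included study rated low risk?
    sensitivity_or_adjustment: bool       # high v low risk sensitivity analysis or adjustment used?
    important_biases_ignored: bool        # important suspected biases ignored by review authors?
    incorporated_in_interpretation: bool  # bias reflected in findings, discussion, conclusions?


def rate_bias_item(facts: ReviewFacts) -> str:
    """Apply the example rule for ROBIS item 4.6 and AMSTAR-2 item 12."""
    # "No" conditions are checked first (precedence is an assumption of this sketch).
    if not facts.rob_assessed:
        return "No"
    if facts.important_biases_ignored:
        return "No"
    if not facts.incorporated_in_interpretation:
        return "No"
    # "Yes" conditions, treated as alternatives.
    if facts.all_studies_low_risk or facts.sensitivity_or_adjustment:
        return "Yes or Probably/Partial Yes"
    # The rule as written does not cover this case.
    return "Needs assessor judgement"


example = ReviewFacts(
    rob_assessed=True,
    all_studies_low_risk=False,
    sensitivity_or_adjustment=True,
    important_biases_ignored=False,
    incorporated_in_interpretation=True,
)
print(rate_bias_item(example))  # Yes or Probably/Partial Yes
```

Writing the rule this way makes the order of precedence explicit, which is the kind of detail that should be reported alongside the assessment.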
Based on the above decision rules, how would the following statement be rated? "We planned on conducting sensitivity analysis on the studies based on their level of risk of bias. Most of the included studies had a similar risk of bias across all the domains except for industry sponsorship bias and incomplete data for total testosterone. Due to the inadequate number of studies, we were not able to conduct a sensitivity analysis on the included studies based on industry sponsorship."
For overall judgements, a decision rule could be that if one or more ROBIS domains are at high risk of bias, then the review overall is deemed at high risk of bias. For AMSTAR-2, the authors of the tool have stipulated that a review is considered of low or critically low quality when one or more of the seven ‘critical’ items has a critical flaw. While the decisions about how to rate the items and make overall judgements can be debated, the grounds on which overview authors make these decisions should be noted explicitly in the manuscript or in an appendix, so that the assessment results are transparent and reproducible.
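The two overall judgement rules described above could be recorded as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the "low" and "unclear" outcomes for ROBIS are our assumption (the rule above only defines when a review is at high risk), and the list of critical item numbers and the split between one flaw (low) and more than one flaw (critically low) are taken from the published AMSTAR-2 guidance rather than from this box.

```python
# Critical AMSTAR-2 items per the tool's published guidance (not listed in this box).
AMSTAR2_CRITICAL_ITEMS = frozenset({2, 4, 7, 9, 11, 13, 15})


def robis_overall(domain_ratings: dict[str, str]) -> str:
    """Overall ROBIS judgement: any domain at high risk makes the review high risk."""
    ratings = {rating.lower() for rating in domain_ratings.values()}
    if "high" in ratings:
        return "high"
    if ratings == {"low"}:
        return "low"  # assumption: all domains low gives low overall
    return "unclear"  # assumption: anything else is left unclear


def amstar2_critical_flaw_rating(flawed_items: set[int]) -> str:
    """Rating implied by flaws in the critical items only."""
    n_critical = len(flawed_items & AMSTAR2_CRITICAL_ITEMS)
    if n_critical == 0:
        return "no critical flaw"  # high or moderate depends on non-critical items
    if n_critical == 1:
        return "low"
    return "critically low"


print(robis_overall({"domain 1": "low", "domain 2": "high"}))  # high
print(amstar2_critical_flaw_rating({13, 15}))                  # critically low
```

Note that the AMSTAR-2 function counts critical flaws only to apply the tool's own categories; it does not produce a summed or averaged score.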
Cautionary note: empirical evidence does not currently support the assignment of scores to items that are met in a risk of bias tool followed by the summation or averaging of these scores to produce a numerical measure of risk of bias. A thoughtful, nuanced, and customised overall judgement is required that considers all items with suspected bias on the basis of specific context.
The AMSTAR-2 and ROBIS tools were designed to assess systematic reviews with pairwise meta-analysis only. A more recent tool under development aims to assess the potential biases and limitations in network meta-analyses.12 13 Guidance documents (eg, Cochrane14 and JBI15) recommend that overview authors use ROBIS or AMSTAR-2, in preference to other available tools, when comparing and critically appraising systematic reviews. Figure 1 presents two example assessments conducted by our team: the ROBIS assessment of Normansell and colleagues16 is presented at the domain level, and the AMSTAR-2 assessment of Puig and colleagues17 is presented by item. Items are backed by quotes and rationales to support the answers chosen, for full transparency, and to help when comparing assessments between two independent assessors (figure 2).