Tool to assess risk of bias in studies estimating the prevalence of mental health disorders (RoB-PrevMH)

We developed RoB-PrevMH, a concise RoB tool for prevalence studies in mental health that was designed to be adaptable to different systematic reviews and consists of three items: representativeness of the sample, non-response bias and information bias. Our tool showed fair to substantial inter-rater reliability when applied to studies included in two systematic reviews of prevalence studies. All three items from RoB-PrevMH have been considered or included in existing tools.14 18 21 RoB-PrevMH does not contain any item on reporting and does not require an assessment of the overall RoB in a study. For each item, three assessments of RoB are possible (high, unclear and low).

Strengths and limitations

The strengths of RoB-PrevMH include the following. First, it was created after a comprehensive review of items identified in previous tools and through consensus between researchers. Second, the feedback we received from the MH-COVID Crowd who used the tool suggests that it is concise and easy to use. Third, it focuses on RoB only and avoids questions that assess reporting. Fourth, the tool was tested by three pairs of extractors in two sets of studies with different aims, and its inter-rater reliability ranged from fair to substantial. Finally, the tool has the potential to be tailored to other research questions.

Our tool also has limitations. First, the team of methodologists and investigators involved in its development and testing was small. The tool would have benefited from a wider consultation strategy involving more mental health experts and investigators who have designed and undertaken prevalence studies, as well as more methodologists. Second, the brevity of the tool could also be considered a limitation. For example, the MHCOVID project only includes studies that used validated tools for measuring mental health outcomes, so we did not include specific items for recall bias and observer bias, which might be important for other questions. Third, although we expect that RoB-PrevMH is quicker to complete than other tools, we did not formally assess the time required for its completion in comparison with other tools. Fourth, the need to tailor the tool for each project and to create training material for the people who will apply it might require more time at the start of a project than other tools. Finally, the inter-rater reliability varied between the three items, with kappa values ranging from 0.32 to 0.71.

An important part of the evaluation of any RoB tool is the assessment of its validity. This is often done indirectly, by contrasting findings from studies judged at low versus high RoB in each domain. For example, randomised trials at high RoB from poor allocation concealment show, on average, larger effects than trials at low RoB.26 Prevalence studies are characterised by large heterogeneity, and some of this heterogeneity is expected to be associated with differences in RoB.27 However, RoB-PrevMH judgements were not found to be associated with different study findings in a meta-analysis of the changes in symptoms of depression, anxiety and psychological distress during the pandemic, possibly because other design-related and population-related factors played a more important role in heterogeneity.4 A large-scale evaluation of the validity of RoB-PrevMH is needed to understand which design and analysis features have the greatest impact on the estimation of prevalence.
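One way such a validity assessment could be operationalised, sketched here purely for illustration and not as part of our analysis, is a random-effects meta-regression of logit-transformed prevalence estimates on indicator variables for the RoB-PrevMH judgements:

\[
\operatorname{logit}(\hat{p}_i) \;=\; \beta_0 + \beta_1\,\mathrm{High}_i + \beta_2\,\mathrm{Unclear}_i + u_i + \varepsilon_i,
\qquad u_i \sim N(0,\tau^2),\quad \varepsilon_i \sim N(0, v_i),
\]

where $\hat{p}_i$ is the prevalence estimate of study $i$, $v_i$ its within-study variance on the logit scale, $\mathrm{High}_i$ and $\mathrm{Unclear}_i$ indicate the judgement for a given item (low RoB as the reference) and $\tau^2$ is the residual between-study heterogeneity. Non-zero $\beta_1$ or $\beta_2$ would suggest that studies at high or unclear RoB yield systematically different prevalence estimates, and the reduction in $\tau^2$ relative to a model without the RoB covariates would indicate how much heterogeneity the item explains.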

When we compare our tool’s performance with that of the available instruments, only Hoy et al tested inter-rater agreement and calculated kappa values, in a considerable number of studies on the prevalence of low back and neck pain.14 Even though representativeness of the target population might be difficult to judge objectively, the inter-rater agreement for this item was substantial, although the agreement achieved in the 54 studies assessed by Hoy et al was higher.14 For the second item, on non-response, inter-rater agreement was substantial but lower than that for similar items in the Hoy tool.14 The third item, on misclassification, had the lowest kappa statistic but the highest agreement between raters. In classification tables with a large imbalance in the marginal probabilities and a high underlying correct classification rate, kappa can be paradoxically low, as was the case for information bias.28 29 We did not make an overall RoB assessment for each study, as the Hoy tool does,14 because of the problems with this approach.24
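This paradox can be seen in a simple hypothetical example (illustrative numbers only, not data from our study). Suppose two raters assess 100 studies for information bias: both rate 90 studies as low RoB, each rates a different 5 studies as high RoB, and no study is rated high by both. Raw agreement is then high, yet Cohen’s kappa is close to zero:

\[
p_o = \frac{90}{100} = 0.90,\qquad
p_e = 0.95\times 0.95 + 0.05\times 0.05 = 0.905,\qquad
\kappa = \frac{p_o - p_e}{1 - p_e} = \frac{0.90 - 0.905}{1 - 0.905} \approx -0.05,
\]

because almost all ratings fall into the low-RoB category, the chance-expected agreement $p_e$ is itself very high, leaving little room for kappa to credit the observed agreement.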
