Our review concludes that interventions involving the workplace could potentially increase the probability of returning to work. The studies of IPS with workplace involvement provided very low certainty of evidence, which makes it impossible to assess the impact of these interventions. Very low certainty of evidence does not, however, necessarily mean that there is no effect: it highlights the need for more well-designed studies on this topic. Studies of behavioural therapy and team-based support yielded low certainty of evidence, which implies that future research might change these results.
Our results are largely consistent with previous systematic reviews targeting people with CMDs (Finnes et al. 2019a; Joyce et al. 2016; Nigatu et al. 2016), mental health conditions (Fadyl et al. 2020), mental disorders (Dewa et al. 2015), depression (Nieuwenhuijsen et al. 2020) or adjustment disorders (Arends et al. 2012): there is no convincing evidence that interventions involving the workplace led to return to work. As in our review, previous systematic reviews have included a range of interventions (e.g., targeting the individual’s work ability, RTW behaviour, coping strategies, problem-solving skills, and interpersonal behaviours, or organizational change). Beyond this variety, the interventions are based on different mechanisms. As suggested by Nieuwenhuijsen and colleagues (Nieuwenhuijsen et al. 2020), the mechanisms can be broadly classified into (a) improving working conditions to support the employee in overcoming barriers to returning to work, e.g., by adjusting working hours or work tasks, or (b) improving depressive or other psychological symptoms using medication and/or therapy (e.g., CBT). In our systematic review, the included interventions were grouped into three categories. The interventions in the categories ‘IPS’ and ‘Work-focused team-based support’ were mainly based on the first mechanism, while the interventions in the category ‘Work-focused behavioural therapy’ utilized a combination of the two (Nieuwenhuijsen et al. 2020). However, irrespective of the clarity of the mechanisms, we cannot draw any firm conclusions regarding the interventions’ effectiveness. Another problem in the categorization of interventions is the potential overlap between them. Our categorization was based on the main content of each intervention. However, the study by Reme et al. (2015) evaluated an intervention combining CBT with individual support based on IPS principles.
All participants in the intervention group received up to 15 CBT sessions, and of those, 32% received individual support based on IPS principles. We based our categorization on the fact that CBT was delivered to all study participants in the intervention group, while a smaller proportion received individual support. Still, given the range of interventions and the need to explore why an intervention does or does not result in the desired effect, we suggest a thorough examination of adherence to an intervention’s components, e.g., by evaluating the reach, the dose delivered and received, and the underlying mechanisms. This could be done by conducting a process evaluation in parallel with an effectiveness trial. A process evaluation could add to the current knowledge base by complementing the results of a randomized controlled trial with information on how much of each component needs to be delivered, the uptake of the intervention and its components, and the users’ perceptions of barriers to, and facilitators of, the intervention. Since work-directed interventions are complex interventions, commonly delivered and evaluated at individual, organizational and societal levels (Skivington et al. 2021), a process evaluation could deepen the interpretation of the results of an effectiveness trial. For example, Arends and colleagues (Arends et al. 2012) reported that several of the intervention’s components (e.g., an inventory of problems and/or opportunities and of the support needed) were linked to the outcome of recurrent sickness absence. Hence, their process evaluation provided an example of how a thorough examination of an intervention’s components can explain the intervention’s effectiveness.
Even though our systematic review did not include studies with a qualitative design, adding study participants’ perspectives on the interventions could provide complementary knowledge to our findings. Previous studies reporting individuals’ experiences of participating in work-directed interventions have shown that individuals learned from receiving individual support in their preparation for RTW. The intervention by Wisenthal and colleagues included a mapping of work ability, needs and motivation for RTW, which contributed to the participants’ self-reflection, visualizing their resources and clarifying demands (Wisenthal et al. 2019). Further, the professionals providing work-directed interventions needed an inclusive attitude towards the individual’s situation and experiences, in combination with their medical expertise (Andersen et al. 2014; Strömbäck et al. 2020). However, besides the support needed when preparing for RTW, support is also needed during RTW to achieve a seamless transition from sickness absence to re-entering work (Wästberg et al. 2013). Among non-employed individuals with long-term conditions, support is needed throughout the process of gaining paid employment, e.g., through sufficient collaboration among the involved stakeholders (Fadyl et al. 2022). These findings add to the results of our systematic review: on the one hand, the interventions supported the development of self-efficacy and motivation; on the other hand, the participants asked for more hands-on support during and after their return to work. We conclude that the interventions included in our systematic review could benefit from being adjusted to individual needs for behavioural change and support.
In addition to previous systematic reviews, our review highlights several ethical aspects arising from work-directed interventions. The included interventions presuppose increased cooperation between stakeholders, e.g., the individual on sickness absence, his/her employer, and representatives of health care and the Social Insurance Agency. Our ethical analysis indicates that the explored interventions may affect the individual’s autonomy, personal integrity and control over the sharing of sensitive information. These results are in line with Holmlund et al. (2023). In addition, Holmlund and colleagues revealed that unclear roles among the professionals involved in delivering work-directed interventions resulted in unequal access to support (Holmlund et al. 2023). Another ethical analysis of a similar intervention showed ethical challenges due to conflicting goals at the organizational and individual levels: e.g., the intervention challenged organizational values of fairness and justice, and introduced a need for the individual to juggle the roles of employee and patient (Karlsson et al. 2024). The interventions investigated in our systematic review presume a common goal among the involved stakeholders of reintegrating the employee into work. However, our results show that work-directed interventions come with ethical ‘costs’, first and foremost for the individual, but, as shown by previous studies (Holmlund et al. 2023; Karlsson et al. 2024), also for the involved stakeholders and organizations. Given the inconclusive results of our and previous systematic reviews of work-directed interventions, the results of our ethical analysis should be taken into consideration when planning and conducting work-directed interventions. Further, these results might guide policy- and decision-makers in deciding whether to implement work-directed interventions.
Allowing for quasi-experimental designs in systematic reviews of effectiveness
Despite our search for quasi-experimental designs, we did not find any studies that met the inclusion criteria. Although quasi-experimental studies examining RTW outcomes for individuals on sick leave exist, they encompass broader diagnostic populations beyond Common Mental Disorders (CMD) and were thus excluded from our review. For instance, Hägglund et al. (2020) analysed the impact of CBT on individuals with mild or moderate mental illness, and Hägglund (2013) assessed the effects of stricter enforcement of eligibility criteria in the Swedish sickness insurance system. These studies belong primarily to the field of economics, highlighting a discrepancy in population focus across research disciplines.
However, we suggest that future systematic reviews allow the inclusion of quasi-experimental designs when evaluating an intervention’s effectiveness, as these can often be considered to have high external validity. Quasi-experimental designs offer a valuable alternative when ethical or logistical considerations prevent the implementation of true experiments. While randomised controlled trials aim to establish causal effects through random assignment, quasi-experimental designs achieve a similar goal without relying on true randomization. Instead, subjects are grouped based on predetermined criteria, mirroring random assignment to mitigate the individual selection biases common in non-randomized experiments. Key quasi-experimental methods include Regression Discontinuity, Difference-in-Differences, and the Instrumental Variable method, as detailed, for instance, by Angrist and Pischke (2009).
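To make the logic of one such method concrete, the following minimal sketch illustrates a difference-in-differences estimate. All figures are invented for illustration only: a hypothetical region that rolled out a work-directed intervention is compared with one that did not, before and after the rollout.

```python
# Difference-in-differences (DiD) on made-up return-to-work (RTW) shares.
# "Treated" = region that introduced the intervention; "control" = region that did not.
# All numbers are hypothetical and chosen only to illustrate the calculation.

treated_before, treated_after = 0.40, 0.55   # RTW share in treated region
control_before, control_after = 0.42, 0.47   # RTW share in control region

# DiD: change in the treated group minus change in the control group.
# Under the parallel-trends assumption, this nets out the common time trend.
did = (treated_after - treated_before) - (control_after - control_before)

print(f"DiD estimate of the intervention effect: {did:.2f}")  # 0.10
```

The design choice here is that the control region supplies the counterfactual trend; the estimate is only as credible as the parallel-trends assumption it rests on.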
Further, quasi-experiments often present several advantages: typically larger sample sizes, a reduced risk of biased population sampling, and the absence of issues arising from participants and/or caseworkers being aware that they are part of a study. Quasi-experiments have these advantages because they involve real-world interventions that were implemented without the explicit purpose of evaluation, so participants are not aware of being studied. Moreover, their use of retrospective register data helps alleviate problems associated with small sample sizes and attrition.
Quasi-experiments also have shortcomings, particularly if the fundamental assumption for identifying a causal treatment effect is unlikely to be met. Comparing the advantages and disadvantages of experiments versus quasi-experiments is not straightforward, as it hinges on the quality and context of the specific study. Our rationale for including quasi-experimental studies in the review lies in their potential to furnish evidence as compelling as that from RCTs, underscoring their significance in systematic reviews.
Exploiting the potential of quasi-experiments to study subpopulations of interest, such as people with CMD, within larger sample sizes could contribute to improving research quality. Furthermore, quasi-experimental methods can be utilized for evaluating existing interventions, which can be implemented gradually in different regions to leverage temporal variation for evaluation purposes. It is worth noting that there are also RCTs in economics that evaluate labor market interventions, but these likewise include broader populations than those with CMD, such as the studies by Fogelgren et al. (2023), Engström et al. (2017) and Laun and Skogman Thoursie (2014).
More studies of high scientific quality are needed
A recurring conclusion from previous reviews is that more studies of high scientific quality are needed. We agree with this conclusion. About half of the studies meeting our inclusion criteria were excluded from the review because they were assessed as having a high risk of bias. We have identified the following methodological aspects for consideration in future research.
Firstly, a recurrent problem is the small number of participants and underpowered trials. Experiments are resource-intensive, and the cost of large-scale experiments is significant, which means that RCTs often become small-scale. Most studies included in our review report recruitment difficulties, which implies a risk that the pre-specified sample size cannot be achieved. In addition, many studies are conducted at a few local offices or centres, where the participants are not necessarily representative of a broader population. These aspects reduce the external validity.
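To illustrate why recruitment shortfalls matter, the sketch below computes an approximate required sample size per arm for comparing two RTW proportions, using the standard normal-approximation formula. The effect size (an increase in the RTW share from 50% to 60%) is an assumption chosen purely for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-sided test comparing
    two proportions (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_b = NormalDist().inv_cdf(power)          # critical value for power
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Hypothetical example: detecting a rise in the RTW share from 50% to 60%
# with 80% power needs a few hundred participants per arm.
print(n_per_group(0.50, 0.60))
```

Even this modest effect size demands roughly 400 participants per arm, which helps explain why trials recruited at a few local centres end up underpowered.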
Secondly, it is difficult to withhold from participants the information that they are part of a study. Consent from participants is usually required, and neither the participants nor those providing the interventions are blinded. While it may initially be feasible to withhold information about the assigned intervention from individuals, the intervention unfolds over a specific duration, and participants can hardly be shielded from external information indefinitely. These aspects reduce the internal validity.
Thirdly, the studies lack detailed descriptions of the content of the interventions, comparison groups and so-called ‘co-interventions’. With reference to ‘care as usual’, only two studies reported the use of drug treatment (Dalgaard et al. 2017a, 2017b; Salomonsson et al. 2017, 2020). Such treatment is commonly used to reduce symptoms of, e.g., anxiety and/or depression and could possibly influence the outcome of sickness absence. The absence of such information makes it difficult to interpret the results of the studies and limits the ability to replicate them.
Fourthly, in line with recent research findings, we support the necessity of establishing standardized outcome measures, a ‘Core Outcome Set’ (see Hoving et al. 2018; Ravinskaya et al. 2022, 2023). The main argument is to be able to compare studies. For example, while some studies focus on the duration until return to work, others emphasize the share of individuals who have returned to work, or the duration of sick leave. Nonetheless, our study reveals additional crucial insights regarding a ‘Core Outcome Set’. First, it is important to have a common approach to how outcomes are calculated: even when the same outcome information is available, some authors favour odds ratios, while others prefer alternative measures such as the share returned to work. Second, we emphasize the merits of utilizing core outcomes derived from register data. We argued for the inclusion of quasi-experimental studies in our report, for which register data are essential. Once again, the establishment of a ‘Core Outcome Set’ is paramount. The question is whether this core set of outcomes can also include registry-based measures of health. This could involve, for example, the number of days in outpatient care, inpatient care, and the number of prescribed doses of medication.
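The point about harmonizing how outcomes are calculated can be shown with a toy 2x2 table (all counts invented for illustration): the same data yield quite different-looking summaries depending on whether authors report a risk difference in RTW shares or an odds ratio, and a review can only pool studies that report comparable measures.

```python
# Two common summaries of the same hypothetical RTW data.
# Counts below are invented for illustration only.
#                 returned   not returned
# intervention        60          40
# control             45          55

a, b = 60, 40   # intervention arm: returned / not returned
c, d = 45, 55   # control arm: returned / not returned

share_intervention = a / (a + b)                      # 0.60
share_control = c / (c + d)                           # 0.45
risk_difference = share_intervention - share_control  # 0.15

odds_ratio = (a / b) / (c / d)                        # (60/40)/(45/55) ~ 1.83

print(f"Risk difference: {risk_difference:.2f}")
print(f"Odds ratio:      {odds_ratio:.2f}")
```

A core outcome set that fixes both the outcome and the summary measure would let the 0.15 from one study be pooled directly with its counterpart in another, instead of requiring back-conversion from an odds ratio.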
Finally, our systematic review evaluated three distinct interventions: IPS, ‘Work-focused’ team-based support and ‘Work-focused’ CBT. As already argued, to explore why an intervention does or does not result in the desired effect, we advocate process evaluations in order to learn the underlying mechanisms behind the potential success of an intervention. This also opens up the question of whether an intervention could be more successful if it, for example, incorporated elements from both IPS and CBT. This suggests that studies should not only randomize individuals to singular treatment arms, such as IPS or CBT, but also to a combined treatment arm with IPS and CBT together.
Methodological considerations
One strength of our study is the comprehensive literature search in international databases, citation searches, and different publication types, including ‘grey literature’. The risk of overlooking any significant studies is small. Further, the certainty of the quantitative results was assessed by applying the international GRADE system, which means that a structured assessment was made across five domains.
With regard to limitations, our review included articles reporting RCTs conducted in Sweden, Denmark, Norway and the Netherlands. These countries have different social insurance systems, which could potentially affect the outcome, and this should therefore be considered when interpreting the results. Another limitation is our categorization of the included interventions. Even if an intervention had a specific content, e.g., CBT, it was not possible to determine whether the content was the same across studies, nor whether the competence and training of those implementing the intervention affected the outcome.