Clinical predictors of non‐response to lithium treatment in the Pharmacogenomics of Bipolar Disorder (PGBD) study

1 INTRODUCTION

Lithium is regarded as a first-line treatment for bipolar disorder (BD),1-7 but it does not work for all patients. John Cade introduced the modern use of lithium for the treatment of BD in 1949, and it has been widely studied since. Although findings from these studies have at times been controversial, the evidence for the efficacy of lithium in acute mania and maintenance treatment is now well established. In a meta-analysis of five randomized controlled trials comparing prophylactic lithium therapy with placebo in BD, Geddes and colleagues found that lithium was more effective than placebo in preventing recurrence of illness, with 60% in the lithium group remaining well over 1–2 years compared with 40% in the placebo group.8 In a subsequent meta-analysis of six studies of lithium in the treatment of acute mania, Yildiz and colleagues found that 48% of patients responded to lithium compared with 31% for placebo.9 While these seminal reviews clearly demonstrate the efficacy of lithium for both acute mania and maintenance treatment of BD, they also show that 40%–50% of patients do not respond adequately over a 2-year period and require either the addition of or a change to another psychotropic drug.9 These findings are consistent with observational data from longitudinal cohort studies.10-12

There is considerable continued interest in identifying predictors of response to lithium before starting treatment. Doing so could avoid the typical trial-and-error process of finding the right medication for a particular patient, during which time the patient may continue to experience devastating symptoms and be at risk for suicide. This is the goal of precision medicine (also referred to as individualized or personalized medicine). Although the promise of precision medicine has garnered a great deal of attention recently,13 the search for predictors of lithium response dates back to the very first studies of its prophylactic effect in mood disorders.14

Indeed, there is a long history of searching for clinical predictors of response to lithium treatment that can help guide treatment decisions. In 2005, Kleindienst and colleagues15, 16 carried out two comprehensive systematic reviews of predictors of lithium response in which they identified nearly 2,000 studies published between 1966 and 2003 on this topic. In one review, they focused on studies that examined psychosocial and demographic predictors and identified nine that emerged as consistently associated with lithium response. Four were associated with good response (high social status, social support, good compliance, and “dominance” personality trait), while five were associated with poorer response (stress, high expressed emotion, neurotic personality trait, unemployment, and high number of life events). In the other review, they focused on studies that examined clinical predictors of lithium response and identified five that were consistently associated with lithium response across studies. These included a pattern of mania-depression-interval in bi-phasic episodes (so-called MDI polarity sequence) and older age at onset associated with better response, and high number of hospitalizations, a pattern of depression-mania-interval (i.e., DMI polarity sequence), and continuous cycling associated with poorer response. Both reviews concluded that the effect sizes of these factors on treatment response were relatively small.

In 2019, Hui and colleagues17 carried out a subsequent meta-analysis of clinical predictors of lithium response that included more recent data from 71 studies with over 12,000 patients. They identified six predictors of good lithium response, some of which overlapped with those in the earlier reviews by Kleindienst and colleagues:15, 16 a mania-depression-interval pattern, absence of rapid cycling, absence of psychotic symptoms, family history of bipolar disorder, shorter pretreatment illness duration, and later age at onset. They noted, however, that the included studies tended to have small sample sizes and that there was considerable heterogeneity in results.

The Pharmacogenomics of Bipolar Disorder (PGBD) Study (www.clinicaltrials.gov, NCT01272531) was a large multi-center study designed to prospectively identify clinical and molecular predictors of lithium response. We report here the results of an analysis of clinical data from this study to examine clinical predictors of lithium response. The advantage of this study over previous ones is that patients were prospectively followed on lithium monotherapy for up to 2 years to better identify predictors of long-term treatment response specifically to lithium.

2 METHODS

2.1 Study overview

The PGBD was one of 14 research projects in the Pharmacogenetic Research Network funded by the National Institutes of Health to support multi-disciplinary, collaborative research on how genetic factors contribute to inter-individual differences in responses to medications. The PGBD set out to conduct a multi-site prospective study of lithium monotherapy in the treatment of BD.

The details of the trial have been described elsewhere.18 Briefly, the goal of the study was prevention of illness recurrence by lithium monotherapy. All patients completed a 4-week observation phase to confirm that they were in remission, defined as a Clinical Global Impression of Severity Scale (CGI-S) score of ≤3 (mildly ill) for at least 4 weeks. After the observation phase, patients entered a 2-year maintenance phase, during which they were assessed every 2 months to monitor their on-going clinical response. Patients who came into the trial clinically unstable and/or not on lithium monotherapy were first transitioned to lithium monotherapy in a stabilization phase that lasted a maximum of 16 weeks, with visits every other week for the first 8 weeks and one visit per month for the next 2 months. The treatment dosage of lithium was not fixed by study protocol but was instead titrated by the treating clinicians as clinically indicated. Throughout follow-up, patients were allowed to take a benzodiazepine for anxiety and/or zolpidem for sleep. A range of clinical measures (described below) was collected at the screening and subsequent visits to monitor clinical progress and enable investigation of clinical predictors of response.
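The visit schedule described above can be sketched as follows. This is a minimal illustration rather than the study's scheduling logic: the function names are hypothetical, a month is assumed to equal 4 weeks (which is consistent with the stated 16-week maximum), and visits are counted from the end of each interval rather than from week 0.

```python
def stabilization_visit_weeks():
    """Visit weeks in the stabilization phase (assumed 4-week months)."""
    biweekly = list(range(2, 9, 2))        # every other week for 8 weeks: 2, 4, 6, 8
    monthly = [8 + 4 * m for m in (1, 2)]  # one visit per month for 2 months: 12, 16
    return biweekly + monthly


def maintenance_visit_months(total_months=24, interval=2):
    """Visit months in the 2-year maintenance phase, one every 2 months."""
    return list(range(interval, total_months + 1, interval))


if __name__ == "__main__":
    print("Stabilization visits (weeks):", stabilization_visit_weeks())
    print("Maintenance visits (months):", maintenance_visit_months())
```

Under these assumptions the stabilization phase ends at week 16 and the maintenance phase comprises 12 scheduled assessments.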

2.2 Participants

Patients were enrolled into the study from outpatient psychiatry clinics in academic medical centers at nine sites within the United States and two international sites. The nine domestic sites included: University of California, San Diego; Indiana University; University of Chicago; University of Pennsylvania; University of Iowa; Johns Hopkins University; Case Western Reserve University; University of Michigan; and the Mayo Clinic. The two international sites were University of Bergen, Norway, and Dalhousie University in Halifax, Canada.

Patients were included in the study if they: (1) had bipolar I disorder in any phase of illness; (2) were naïve to or not presently on lithium and had at least one affective episode meeting DSM-IV criteria in the last 12 months, or were currently on lithium and did not have any history of mood episodes meeting DSM-IV criteria in the last 6 months; (3) were able to give informed consent; (4) were 18 years or older; and (5) were currently symptomatic, defined as a CGI-S score of at least 3 (mild severity), unless the patient entered the study already stable on lithium monotherapy. Women of child-bearing potential were included if they agreed to use adequate contraception and to inform their doctor at the earliest possible time of their plans to conceive.

Patients were excluded if they: (1) were unwilling or unable to comply with study requirements; (2) had renal impairment (serum creatinine >1.5 mg/dL); (3) had a thyroid stimulating hormone (TSH) level >20% above the upper normal limit or, if on thyroid medication, had not been euthyroid for at least 3 months before the first visit; (4) were currently in crisis such that inpatient hospitalization or other crisis management should take priority; (5) met criteria for physical dependence requiring acute detoxification from alcohol, opiates or barbiturates; (6) were pregnant or breastfeeding; (7) had participated in a clinical trial of an investigational drug within the past 1 month; or (8) had a history of lithium toxicity, not due to mismanagement or overdose, that required treatment.

All study procedures were approved by local Institutional Review Boards (IRBs), and all patients provided written informed consent. This analysis included data on the first 345 BD patients who enrolled into the study and had sufficient follow-up of at least 4 weeks as of the date of data freeze on June 26, 2017. There were four patients who were still active in the study but had not yet reached the maintenance phase by the time of this data freeze and were not included in these analyses.

2.3 Clinical outcomes

Patients were followed until they: (1) completed all study visits over 2 years of the maintenance phase (or had achieved the maintenance phase and were still active in the on-going study by the date of the data freeze), (2) were terminated from the study before completion of all visits because of failure to achieve (i.e., failure to remit) or maintain (i.e., relapse) stabilization on lithium, or (3) were terminated from the study for other reasons. Failure to remit was defined as the inability to achieve clinically sustained remission (where remission was documented as described above) by the end of the observation phase, or was based on clinical judgment that the patient was unable to adequately stabilize on lithium monotherapy. Relapse was evaluated using the Mood Episode Checklist, which summarizes DSM-IV criteria for mania and depression and was collected at each visit during the maintenance phase. Relapse was defined by any of the following: (1) meets criteria for mania and has a CGI-S of 5 (markedly ill) or greater; (2) meets criteria for a major depressive episode with 4-week duration; (3) meets criteria for a mixed episode with CGI-S of 5 or greater; (4) psychiatric hospitalization for a mood episode is required; or (5) in the physician's judgment the patient cannot be managed on monotherapy and a change in medication is required. Episodes of hypomania without impairment of function were not considered relapses. These criteria were designed to be stringent so as to detect clear failures of prophylaxis, rather than brief episodes that might not require a medication change in clinical practice. Serum lithium levels were routinely monitored as clinically recommended over the course of follow-up. On average, lithium levels were maintained at appropriate therapeutic levels19 and were, in fact, slightly higher for those who failed to remit or relapsed compared with others (0.68 vs 0.63 mEq/L, p = 0.05).
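The five relapse criteria above amount to a simple decision rule, which can be sketched as follows. This is an illustration only, not the study's case report form logic: the `MaintenanceVisit` record and its field names are hypothetical, and "4-week duration" is assumed here to mean a duration of at least 4 weeks.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceVisit:
    """Hypothetical per-visit record; all field names are illustrative."""
    meets_mania: bool = False
    meets_major_depression: bool = False
    depression_weeks: float = 0.0
    meets_mixed: bool = False
    cgi_s: int = 1
    hospitalized_for_mood: bool = False
    needs_medication_change: bool = False  # physician's judgment


def is_relapse(visit: MaintenanceVisit) -> bool:
    """Return True if any of the five relapse criteria is met."""
    markedly_ill = visit.cgi_s >= 5
    return (
        (visit.meets_mania and markedly_ill)                                  # criterion 1
        or (visit.meets_major_depression and visit.depression_weeks >= 4)    # criterion 2 (assumed >= 4 weeks)
        or (visit.meets_mixed and markedly_ill)                               # criterion 3
        or visit.hospitalized_for_mood                                        # criterion 4
        or visit.needs_medication_change                                      # criterion 5
    )
```

Hypomania without functional impairment matches none of the fields above, so such a visit is not classified as a relapse, in line with the stated criteria.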

2.4 Clinical predictors

Patients were evaluated with the Diagnostic Interview for Genetic Studies (DIGS) in order to establish a diagnosis of bipolar I disorder by DSM-IV criteria and to collect detailed historical clinical information about current and lifetime mental illnesses. Patients also completed a range of self- and clinician-rated scales at the screening and subsequent visits to document the clinical course of illness and factors that may relate to the course. Self-rated scales included the Childhood Life Events Scale; the Lifetime History of Aggression Scale; the Columbia Suicide Symptom Severity Scale; the Basic Language Morningness Scale (BALM); the Temperament Evaluation of Memphis, Pisa, Paris and San Diego – auto-questionnaire version (TEMPS-A); the 16-item Quick Inventory of Depression Symptomatology Self-Report (QIDS-SR-16); the Sheehan Disability Scale; the Quality of Life Enjoyment and Satisfaction Questionnaire (Q-LES-Q); and the Life Events Questionnaire (LEQ). Clinician-rated scales included the Clinical Global Impressions of Severity Scale (CGI-S); the Hamilton Rating Scale for Anxiety (HAM-A); the Montgomery Asberg Depression Rating Scale (MADRS); the Clinician-Administered Rating Scale for Mania (CARS-M); and the Modified Scale for Suicidal Ideation (MSSI). From the assessments collected at either the screening or baseline visits, we derived 43 clinical variables for analysis that were selected based on clinical experience and an expert review of the literature involving three of us (JK, MA, and JC). These included variables on socio-demographic factors, baseline symptoms, clinical history and course, co-morbid illnesses, family history of mental illness, childhood and current life events, and level of functioning. See Table 1 for a full list of variables that were examined.

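TABLE 1. Clinical predictors examined for association with treatment response

| Category | Predictor |
| --- | --- |
| Baseline symptoms | Anxiety symptoms^a |
| Baseline symptoms | Hypermotor activity^b |
| Baseline symptoms | Irritability and aggressiveness^b |
| Clinical history | Age of onset^c |
| Clinical history | Chronicity of affective disorder^c |
| Clinical history | Chronicity of substance abuse^c |
| Clinical history | History of delusions^e |
| Clinical history | History of auditory hallucinations^e |
| Clinical history | History of visual hallucinations^e |
| Clinical history | History of any hallucinations^e |
| Clinical history | History of headaches lasting 4 to 72 hours^d |
| Clinical history | History of migraines^d |
| Clinical history | History of suicidal thought/behavior^c |
| Clinical history | History of suicide attempt^e |
| Clinical history | Affective psychosis^c |
| Clinical history | Independence of psychosis episodes^c |
| Clinical history | Mania type: irritable vs. elated^e |
| Clinical history | Number hospitalizations: inpatient^e |
| Clinical history | Number hospitalizations: inpatient + day^e |
| Clinical history | Presence of mixed episodes^c |
| Clinical history | Presence of rapid cycling^c |
| Comorbidity | Comorbid alcohol abuse/dependence^c |
| Comorbidity | Comorbid substance abuse/dependence^c |
| Comorbidity | Comorbid anxiety disorder^c |
| Comorbidity | Comorbid personality disorder^c |
| Functioning | Disability at baseline: impairment^f |
| Functioning | Disability at baseline: family/home life^f |
| Functioning | Disability at baseline: social life^f |
| Functioning | Disability at baseline: total^f |
| Functioning | Disability at baseline: work/school^f |
| Functioning | Functioning during most severe depression^c |
| Functioning | Functioning during most severe mania^c |
| Functioning | Functioning overall^c |
| Functioning | Years of education^e |
| Functioning | Marital status^e |
| Life events | Childhood life events^g |
| Life events | Childhood physical abuse^g |
| Life events | Life events at last visit: total^h |
| Life events | Life events at last visit: negative^h |
| Life events | Life events at last visit: positive^h |
| Family history | First degree history completed suicide^i |
| Family history | First degree history bipolar disorder^i |
| Family history | First degree history depression^i |

2.5 Statistical analyses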

Differences in socio-demographic factors between patients who completed all study procedures, those who failed to remit or experienced a relapse, and those who were terminated from the study for other reasons were compared using chi-square tests for categorical variables and one-way ANOVA for continuous variables. We then used survival analysis with Cox proportional hazards models to examine the relationship between clinical predictors measured at baseline and the time from study entry to treatment failure, which was defined as the time of the last visit at which the patient was determined to have failed to remit or to have relapsed. All other patients were censored at the time of their last visit in the on-going study. We examined each clinical predictor individually in models that additionally controlled for potential confounders including age at study entry, sex, race, and lithium status upon entry into the study. These variables were selected from the available data because they are important socio-demographic factors that experience indicated may be relevant and/or they were found to differ with treatment outcome. Race was captured as a categorical variable (White, Black, Asian, or other). Lithium status upon entry into the study was captured as a categorical variable to distinguish those who entered the study stable on lithium monotherapy, on lithium plus other psychotropic medications, or not on lithium. We used two-tailed p < 0.05 to declare associations statistically significant. We did not correct for multiple testing because the clinical predictors were carefully selected based on prior hypotheses that they may be relevant to treatment response.
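For reference, the single-predictor models described above take the standard Cox form shown below. The notation is introduced here for illustration and does not appear in the original report: x_i is the clinical predictor for patient i, z_i is the vector of adjustment covariates (age at study entry, sex, race, and lithium status at entry), and h_0(t) is the unspecified baseline hazard.

```latex
h_i(t) = h_0(t)\,\exp\!\left(\beta\,x_i + \boldsymbol{\gamma}^{\top}\mathbf{z}_i\right),
\qquad
\mathrm{HR} = e^{\beta}
```

Under this form, a hazard ratio above 1 indicates a higher hazard of treatment failure per unit increase in the predictor (or relative to the reference category), holding the adjustment covariates fixed.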

To determine whether the associations between treatment response and the clinical predictors identified through the above procedures differed in the initial versus later phases of follow-up, we stratified the survival analyses and looked first at survival over the stabilization/observation phases among all patients who entered the study, and then separately over the maintenance phase among patients who entered the maintenance phase. To formally test for differences in association, we combined the stratified survival data and included in the Cox proportional hazards models an interaction term between the specific predictor and an indicator variable for the stabilization/observation versus maintenance phases.
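One way to write the interaction model described above is sketched below. This is an assumed parameterization for illustration; the exact specification used in the study (for example, whether baseline hazards were allowed to differ by phase) is not stated. Here m_i is a hypothetical indicator equal to 1 for follow-up in the maintenance phase and 0 for the stabilization/observation phases, and the remaining symbols are the predictor x_i and the adjustment covariates z_i.

```latex
h_i(t) = h_0(t)\,\exp\!\left(\beta_1 x_i + \beta_2 m_i + \beta_3\,x_i m_i
         + \boldsymbol{\gamma}^{\top}\mathbf{z}_i\right),
\qquad
\mathrm{HR}_{\text{stab/obs}} = e^{\beta_1},
\quad
\mathrm{HR}_{\text{maint}} = e^{\beta_1 + \beta_3}
```

In this formulation, the test for a difference in association between phases is the test of the null hypothesis that β_3 = 0.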

To assess the robustness of observed associations to the assumptions of the survival analysis, we carried out two additional analyses. We defined two alternative but related response variables for analysis: (1) an acute response variable based on whether patients proceeded to the maintenance phase or not; and (2) a prophylactic response variable which contrasted patients who completed all study visits or who had reached the maintenance phase and were still active on study as of the data freeze on June 26, 2017 versus those who failed to remit or who relapsed on lithium monotherapy before completing all study visits. We then used logistic regression to examine the association between the clinical predictors and the two different dichotomous response variables in models that controlled for the same potential confounders as in the survival analysis. The inferences drawn from these two alternative logistic regression analyses were nearly identical to those from the survival analysis, so we report here the results from the survival analysis because it uses more of the available information provided by the prospective data and it provides a unified framework for analyzing the data over the entire time course of the study.

Finally, to evaluate the predictive ability of a model that included all clinical predictors individually found to be significantly associated with treatment failure, we carried out a receiver operating characteristic (ROC) analysis designed specifically for survival data. We first carried out multiple imputation to fill in missing covariate data and maximize the data available for the ROC analysis. We used multiple imputation only for this analysis and not for the primary analyses described above, and only after confirming that analyses with the imputed dataset yielded results consistent with those reported from the primary analyses. Multiple imputation was performed on the predictor dataset with the mi command in Stata to generate 35 imputed datasets. A consensus imputed dataset was generated by taking the median (for continuous covariates) or modal (for categorical covariates) values across the 35 imputed datasets. We note that this procedure does not take into account the uncertainty in the consensus imputed estimates, but we reasoned it would be sufficient for obtaining reasonable estimates from the ROC analysis. We then compared the ROC curves of nested models: a base model that included the base variables controlled for in all analyses (age at study entry, sex, race, and lithium status upon entry into the study) and a full model that included the base variables plus all clinical predictors that were individually associated with treatment failure (see Table 3). The consensus imputed dataset was randomly split into ten non-overlapping subsets of approximately equal size, with approximately the same proportion of censored and event observations across all subsets. Cox models for the nested models were then fit using nine of the ten subsets, leaving the tenth as a hold-out set. Using the fitted models, linear predictor scores were obtained for observations in the hold-out set. Model fitting and prediction were repeated ten times, with a different subset held out each time. Predicted survival ROC curves over 2 years were estimated for the linear predictions using the CoxWeights function from the risksetROC R package.20, 21 The areas under the curve (AUC) for the ROC curves of the nested models were calculated, and the differences in AUC were recorded. This process was repeated across 10,000 permutations of survival status and time of censoring pairings. The p-value for the AUC difference between models was derived as the proportion of permuted AUC differences that were greater than the unpermuted AUC difference.
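Three of the bookkeeping steps in this procedure (the consensus imputation, the event-balanced ten-way split, and the permutation p-value) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's code: all function names are hypothetical, event status is assumed to be coded 0 (censored) or 1 (treatment failure), and the Cox model fitting and the risksetROC AUC estimation themselves are deliberately omitted.

```python
import random
from collections import Counter
from statistics import median


def consensus_value(imputed_values, categorical=False):
    """Collapse one cell's values across imputed datasets.

    Uses the mode for categorical covariates and the median for continuous
    ones. Ties in the mode resolve to the value seen first (an assumption).
    """
    if categorical:
        return Counter(imputed_values).most_common(1)[0][0]
    return median(imputed_values)


def stratified_folds(events, k=10, seed=0):
    """Assign each observation to one of k folds, dealing event and censored
    observations round-robin so each fold has a similar event proportion."""
    rng = random.Random(seed)
    folds = [0] * len(events)
    for status in (0, 1):
        indices = [i for i, e in enumerate(events) if e == status]
        rng.shuffle(indices)
        for position, i in enumerate(indices):
            folds[i] = position % k
    return folds


def permutation_p_value(observed_diff, permuted_diffs):
    """Proportion of permuted AUC differences greater than the observed one."""
    greater = sum(1 for d in permuted_diffs if d > observed_diff)
    return greater / len(permuted_diffs)
```

With 35 imputed datasets, `consensus_value` would be called once per missing cell, and `permutation_p_value` would receive the 10,000 permuted AUC differences alongside the unpermuted difference.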

3 RESULTS

Figure 1 shows a CONSORT-like flow diagram of the study. A total of 345 individuals were enrolled into the study and included in the analysis. Of these, a total of 194 patients successfully advanced to the maintenance phase, while 60 patients failed to remit on lithium monotherapy during stabilization and/or observation phases. Another 91 patients were terminated from the study for other reasons prior to the maintenance phase. Of the 194 patients who entered the maintenance phase, 41 experienced a relapse, 65 were terminated for other reasons, and 88 completed the study or were still in active treatment as of the date of data freeze.
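The patient counts above can be cross-checked against the group sizes in Table 2 with simple arithmetic. The sketch below uses only the numbers reported in the text; the variable and function names are illustrative.

```python
ENROLLED = 345
BEFORE_MAINTENANCE = {"advanced": 194, "failed_to_remit": 60, "terminated_other": 91}
DURING_MAINTENANCE = {"relapsed": 41, "terminated_other": 65, "completed_or_active": 88}


def final_outcomes():
    """Combine the two study stages into the three final outcome groups."""
    if sum(BEFORE_MAINTENANCE.values()) != ENROLLED:
        raise ValueError("pre-maintenance counts do not sum to enrollment")
    if sum(DURING_MAINTENANCE.values()) != BEFORE_MAINTENANCE["advanced"]:
        raise ValueError("maintenance counts do not sum to patients who advanced")
    return {
        "completed_or_active": DURING_MAINTENANCE["completed_or_active"],
        # Treatment failure = failure to remit + relapse
        "treatment_failure": BEFORE_MAINTENANCE["failed_to_remit"]
        + DURING_MAINTENANCE["relapsed"],
        # Terminations for other reasons, before or during maintenance
        "terminated_other": BEFORE_MAINTENANCE["terminated_other"]
        + DURING_MAINTENANCE["terminated_other"],
    }


if __name__ == "__main__":
    print(final_outcomes())
```

The resulting groups (88 completed or active, 101 treatment failures, 156 terminated for other reasons) sum to the 345 patients enrolled.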

FIGURE 1. CONSORT-like flow diagram of patients in the Pharmacogenomics of Bipolar Disorder prospective trial

Table 2 shows basic socio-demographic characteristics of the study sample broken down by the final outcome status of the patients, whether they completed the study (or were stabilized in maintenance and still active on the study), experienced a treatment failure, or were terminated for other reasons. There were no significant differences in age, sex or race between these three broad outcomes. Patients who entered the study stable on lithium monotherapy were significantly more likely to complete the study compared with those who either were on lithium and other psychotropic medications or were not on lithium on study entry. There were also significant differences between the sites in the outcomes achieved by the patients. These differences were largely explained by the proportion of patients at each site that entered the study stable on lithium, highlighting the importance of controlling for this potential confounder in subsequent analyses.

TABLE 2. Socio-demographic characteristics of the study sample by final outcome status

| Characteristic | Completed study^a (n = 88) | Treatment failure^b (n = 101) | Terminated other^c (n = 156) | p-value |
| --- | --- | --- | --- | --- |
| Age, mean years ±SD | 43.84 ± 15.48 | 42.20 ± 13.32 | 41.66 ± 14.60 | 0.526 |
| Sex, n (%) | | | | 0.492 |
| Male | 41 (46.59) | 51 (50.50) | 67 (42.95) | |
| Female | 47 (53.41) | 50 (49.50) | 89 (57.05) | |
| Race, n (%) | | | | 0.055 |
| Asian | 2 (2.27) | 1 (0.99) | 4 (2.56) | |
| Black | 7 (7.95) | 7 (6.93) | 28 (17.95) | |
| White | 77 (87.50) | 89 (88.12) | 115 (73.72) | |
| More than one race | 2 (2.27) | 4 (3.96) | 9 (5.77) | |
| Ethnicity, n (%) | | | | 0.936 |
| Hispanic | 3 (3.41) | 3 (3.00)^d | 6 (3.85) | |
| Non-Hispanic | 85 (96.59) | 97 (97.00) | 150 (96.15) | |
| Li status, n (%)^e | | | | <0.001 |
| Li monotherapy | 56 (63.64) | 16 (15.84) | 25 (16.03) | |
| Li plus other meds | 19 (21.59) | 47 (46.53) | 58 (37.18) | |
| Not on Li | 13 (14.77) | 38 (37.62) | 73 (46.79) | |
| Site, n (%) | | | | 0.001 |
| UCSD | 7 (7.95) | 11 (10.89) | 11 (7.05) | |
| Case Western | 10 (11.36) | 21 (20.79) | 40 (25.64) | |
| Indiana | 8 (9.09) | 12 (11.88) | 6 (3.85) | |
| Johns Hopkins | 5 (5.68) | 8 (7.92) | 28 (17.95) | |
| Bergen | 10 (11.36) | 8 (7.92) | 21 (13.46) | |
| Chicago | 2 (2.27) | 4 (3.96) | 11 (7.05) | |
| Iowa | 12 (13.64) | 9 (8.91) | 13 (8.33) | |
| Michigan | 17 (19.32) | 14 (13.86) | 10 (6.41) | |
| Penn | 4 (4.55) | 4 (3.96) | 8 (5.13) | |
| Dalhousie | 12 (13.64) | 9 (8.91) | 5 (3.21) | |
| Mayo Clinic | 1 (1.14) | 1 (0.99) | 3 (1.92) | |

We then examined the association between hypothesized clinical predictors of lithium response and treatment response. Table 1 shows the list of clinical predictors that were selected a priori for investigation and the self- and clinician-rated scales from which they were derived. We examined each predictor individually in survival models controlling for factors that we reasoned may confound the relationship with treatment response, either because they are important socio-demographic factors or because they were found to differ with outcome status: age at study entry, sex, race, and lithium status upon entry into the study. Table 3 shows the results for those clinical predictors that were associated with treatment response at a nominal significance level of p < 0.05.

TABLE 3. Hazard ratio (HR) associations between clinical predictors and treatment response

| Predictor | # treatment failures/total person-days^b | HR (95% CI); p-value |
| --- | --- | --- |
| Baseline anxiety symptoms^c | 98/105053 | 1.05 (1.03–1.08); p < 0.001 |
| Chronicity of affective disorder | | |
| Non-chronic | 32/65683 | 1.00 |
| Chronic | 58/28693 | 2.92 (1.76–4.83); p < 0.001 |
| History of migraine | | |
| No | 71/86811 | 1.00 |
| Yes | 28/17094 | 1.62 (1.03–2.55); p = 0.037 |
| History of suicidal behavior | | |
| None | 22/42134 | 1.00 |
| Suicidal ideation | 32/30103 | 1.65 (0.95–2.86); p = 0.077 |
| Suicide attempt | 36/22915 | 2.03 (1.16–3.53); p = 0.012 |
| History of mixed episodes | | |
| No | 46/65213 | 1.00 |
| Yes | 44/28388 | 1.60 (1.01–2.53); p = 0.046 |
| Overall functioning | | |
| Not disabled | 46/67890 | 1.00 |
