United States Value Set for the Functional Assessment of Cancer Therapy-General Eight Dimensions (FACT-8D), a Cancer-Specific Preference-Based Quality of Life Instrument

This research was conceived, designed and conducted by the Multi-Attribute Utility in Cancer (MAUCa) Consortium. The University of Sydney Human Research Ethics Committee approved MAUCa’s program of research (No. 13207). The US valuation study was deemed exempt from US Institutional Review Board (IRB) review by the Advarra IRB (CA209-466C, Pro00032061).

2.1 Survey Sampling and Implementation

A cross-sectional population-based survey conducted between 18 February 2019 and 9 April 2019 collected sociodemographic and health status data, with a discrete choice experiment (DCE) included as the valuation component. SurveyEngine, a company specializing in online choice experiments, managed sample recruitment (via a US online panel), survey administration and data collection. SurveyEngine and its panel provider complied with the International Code on Market, Opinion and Social Research and Data Analytics [16]. Members of the online panel were eligible if 18 years or older and able to read English. Online panellists received an email invitation, including a hyperlink to the survey. Any who attempted to enter the survey via mobile phones were excluded, as the DCE was too complex for a small screen. Consent was sought from those who successfully entered the survey, and those who consented were screened for quota sampling to ensure that the age, sex, race and ethnicity distributions of the sample approximated those of the US general population, per US 2010 Census data. Those who completed the survey were awarded panel points (approximate value 1 USD).

2.2 FACT-8D Dimensions and Levels

The FACT-8D has eight dimensions: pain, fatigue, nausea, sleep, work, support, sadness and worry, derived from nine FACT-G items. Table 1 shows how the FACT-8D dimensions and levels map to the corresponding FACT-G source items. All the FACT-G source items have five response options: not at all (0), a little bit (1), somewhat (2), quite a bit (3) and very much (4). The FACT-G items for pain, fatigue, nausea, sadness and worry are negatively framed, so a higher score indicates more symptoms, sadness or worry. The remaining FACT-G items, which determine the FACT-8D dimensions sleep, work and support, are positively framed, so reverse scoring is required to ensure that FACT-8D Level 0 represents the best score and Level 4 the worst score across all dimensions. The FACT-8D support dimension draws on two items; the FACT-8D level allocated is the best level of support, whether from family or friends. The FACT-8D describes over 390,000 possible health states (5^8 = 390,625).
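The scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only: the item keys (for example `support_family`) are hypothetical placeholders rather than the actual FACT-G item codes, and the function is not the official scoring algorithm.

```python
# Hypothetical item keys; the real FACT-G item codes are not reproduced here.
NEGATIVE_ITEMS = ("pain", "fatigue", "nausea", "sadness", "worry")
POSITIVE_ITEMS = ("sleep", "work")
SUPPORT_ITEMS = ("support_family", "support_friends")


def fact8d_levels(responses):
    """Map FACT-G responses (0-4) to FACT-8D levels (0 = best, 4 = worst)."""
    levels = {}
    for item in NEGATIVE_ITEMS:
        # Negatively framed: a higher response is already a worse level.
        levels[item] = responses[item]
    for item in POSITIVE_ITEMS:
        # Positively framed: reverse score so that 0 is the best level.
        levels[item] = 4 - responses[item]
    # Support uses the best (lowest) reversed level across the two items.
    levels["support"] = min(4 - responses[item] for item in SUPPORT_ITEMS)
    return levels


example = {
    "pain": 1, "fatigue": 3, "nausea": 0, "sadness": 2, "worry": 1,
    "sleep": 4, "work": 1, "support_family": 2, "support_friends": 4,
}
example_levels = fact8d_levels(example)

# Eight dimensions with five levels each.
n_states = 5 ** 8
```

In this example the respondent reports "very much" support from friends, so the support dimension is assigned the best level even though family support is only "somewhat".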

Table 1 Mapping between the FACT-8D descriptive system (dimensions and levels), FACT-G items, and attributes and levels in the discrete choice experiment

2.3 Discrete Choice Experiment

A DCE was used to generate the preference data from which the parameters of the US FACT-8D preference-weighting algorithm were estimated. Following methods previously developed for FACT-8D valuation [11], the DCE contained nine attributes: the eight FACT-8D dimensions and survival duration. Table 1 shows how the levels in the DCE mapped to the FACT-8D descriptive system and corresponding FACT-G source items. In the DCE, survival duration had four levels: 1, 2, 5 and 10 years. These levels were suggested by the oncologist members of the MAUCa Consortium, on the rationale that 1, 2, 5 and 10 years are common survival goalposts for patients and clinicians and are commonly used as time-points for survival endpoints in clinical trials, with the shorter durations (1 year, 2 years) applying to advanced cancers and the longer durations (5 years, 10 years) applying to early-stage cancers.

Designing the experiment for the DCE involved selecting pairs of health states (choice-sets) that optimized statistical efficiency in estimating the utility model parameters. We selected a C-efficient approach because our analysis focused on the ratios of coefficients in the conditional logit model (Sect. 2.5). Each choice-set comprised two FACT-8D health states, each described by nine attributes (eight HRQL dimensions and duration). We simplified the cognitive task by constraining the number of attributes that differed between health states in any choice-set to five, in line with the typical number of attributes in DCEs used to develop preference-based measure value sets [17]. We decided to vary four HRQL dimensions and duration, and used a method devised by Bliemer to determine which to vary in each choice-set [18]. We generated random choice-sets, keeping only those in which exactly five attributes differed, until we had 10,000 choice-sets that met this criterion. We then used Ngene version 1.3 [19, 20], software for designing experiments, to select the 100 choice-sets that optimized the experimental design’s C-efficiency, using a modified Fedorov algorithm with duration as the denominator [21]. Small non-zero priors were used to indicate that levels within each dimension were logically ordered. Table A (Supplement 1) contains the final experimental design.
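The candidate-generation step can be illustrated with a simple rejection sampler. The sketch below is a loose reading of the description, not the authors' procedure: it assumes that duration must be one of the five differing attributes, draws levels uniformly at random, and does not reproduce the method of reference [18] or the Ngene selection of the final 100 choice-sets. All function names are hypothetical.

```python
import random

HRQL_DIMENSIONS = ("pain", "fatigue", "nausea", "sleep",
                   "work", "support", "sadness", "worry")
HRQL_LEVELS = range(5)          # five levels per HRQL dimension
DURATIONS = (1, 2, 5, 10)       # survival duration in years


def random_state(rng):
    """Draw one health state with a survival duration, uniformly at random."""
    state = {dim: rng.choice(HRQL_LEVELS) for dim in HRQL_DIMENSIONS}
    state["duration"] = rng.choice(DURATIONS)
    return state


def differing_attributes(a, b):
    """List the attributes on which two states differ."""
    return [key for key in a if a[key] != b[key]]


def candidate_choice_sets(n, seed=0):
    """Keep random pairs in which exactly five attributes differ.

    Assumption: duration is always among the five (four HRQL + duration).
    """
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        a, b = random_state(rng), random_state(rng)
        diff = differing_attributes(a, b)
        if len(diff) == 5 and "duration" in diff:
            kept.append((a, b))
    return kept


# A small pool for illustration; the study generated 10,000 candidates.
candidates = candidate_choice_sets(200)
```

Only a few percent of uniformly drawn pairs satisfy the constraint, so a constructive generator would be faster; rejection sampling is used here because it mirrors the "generate, then keep" wording of the text.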

The valuation task required participants to consider 16 pairs of hypothetical health states (i.e. 16 choice-sets), described as ‘Option A’ and ‘Option B’ (Fig. 1), and for each choice-set, select the health state they would prefer to live in until death. Dimensions that differed between Options A and B were highlighted in yellow, a presentation format preferred by participants in our previous DCE valuation methods experiment [22].

Fig. 1

An example of a choice-set in the US FACT-8D discrete choice experiment. Each choice-set contained two hypothetical health states, Option A and Option B. Each health state was described in terms of levels of the eight FACT-8D quality of life domains (e.g. a little bit of pain, quite a bit of fatigue, etc.) and survival duration (e.g. you will live in this health state for 5 years, and then die). The study participant was asked to indicate which health state they would prefer

There were two levels of randomization in the DCE component of the survey: (1) each respondent was allocated 16 randomly selected choice-sets (without replacement) from the 100 in the DCE design, and (2) which option was seen as Option A or Option B was randomized within each choice-set. The nine DCE attributes were always presented in the same order, as previous work showed that order does not systematically bias preference weights [23].
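The two levels of randomization can be sketched as follows. The function name and the representation of a choice-set as an integer identifier are assumptions made for illustration; the survey platform's actual implementation is not described in the source.

```python
import random


def allocate_choice_sets(design_ids, n_per_respondent=16, seed=None):
    """Randomly allocate choice-sets and option order for one respondent."""
    rng = random.Random(seed)
    # Level 1: draw choice-sets from the design without replacement.
    chosen = rng.sample(design_ids, n_per_respondent)
    # Level 2: independently randomize which state appears as Option A.
    return [
        {"choice_set": cs, "swap_options": rng.random() < 0.5}
        for cs in chosen
    ]


design = list(range(1, 101))    # identifiers for the 100 choice-sets
allocation = allocate_choice_sets(design, seed=42)
```

Attribute order is deliberately not shuffled, matching the fixed presentation order described above.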

2.4 Other Survey Content

The survey included sociodemographic characteristics, the FACT-G [12] and self-reported general health (assessed by a single question commonly used in national surveys in the USA [24]). The order of survey components is shown in Fig. 2. After completing the DCE component, participants were asked four fixed-format questions about the difficulty and clarity of the valuation task and strategy used to choose between health states (Appendix A, Supplement 2).

Fig. 2

Valuation survey flow chart. This figure shows the number of survey participants who completed each section of the valuation survey, and the numbers who were excluded or dropped out. Abbreviation: DCE, discrete choice experiment

2.5 Statistical Analyses

Descriptive statistics summarized sample demographics, self-reported general health and participant feedback on the DCE valuation task. Sample representativeness was assessed against US population reference data for demographics and self-reported general health using chi-square tests.

DCE data quality was assessed by tallying how many respondents chose either all A’s or all B’s across the choice-sets and by considering the time respondents took to complete the survey. We divided respondents into deciles based on total survey time, ran a conditional logit model on the DCE data in each decile and then graphed the pseudo-R² and the number of statistically significant coefficients for each decile, interpreting low values on either indicator as suggesting relatively low-quality data.
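The two bookkeeping steps in this check, flagging respondents who always chose the same option and grouping respondents into deciles of survey time, might look like the sketch below. The function names are hypothetical, and the per-decile conditional logit models are not reproduced.

```python
def is_straight_liner(choices):
    """True if a respondent chose the same option in every choice-set."""
    return len(set(choices)) == 1


def time_deciles(times):
    """Assign each respondent a decile (1 = fastest, 10 = slowest) by rank."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    deciles = [0] * len(times)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(times) + 1
    return deciles


# Illustrative inputs, not study data.
all_a = ["A"] * 16
mixed = ["A", "B"] * 8
survey_minutes = [float(t) for t in range(20)]
decile_of = time_deciles(survey_minutes)
```

Rank-based assignment gives equal-sized groups when the sample size is a multiple of ten; ties in survey time are broken by input order in this sketch.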

The DCE data were analysed with STATA 13.0 [25], using a functional form used previously to estimate utilities from DCE data consistent with standard QALY model restrictions [23, 26–29]. The QALY model requires that all health states have zero utility at death, i.e., ‘the zero condition’ [30, 31]. A functional form that satisfied this requirement included the interaction between the FACT-8D levels and the TIME variable (Eqs. 1 and 2). Therefore, as TIME tended to zero, the systematic component of the utility function tended to zero. Another typical requirement of the QALY model is constant proportional time trade-off [30]; the relationship between utility and TIME (life years) was therefore assumed to be linear.

A useful feature of this functional form is that the impact of deviating from Level 1 (no problems) in each dimension was characterized through a two-factor interaction term with duration (the experimental design allowed for these interactions). This enabled a preference-weighting algorithm in which the effect of each level of each dimension could be included as a decrement from full health.
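A decrement-based scoring algorithm of this kind can be sketched as follows. The decrement values are invented purely for illustration and are NOT the US FACT-8D value set; only two dimensions are filled in, levels are coded with the best level as 0 (the reference category, which carries no decrement), and the function name is hypothetical.

```python
# Hypothetical decrements per (dimension, level); NOT the published value set.
# The best level of each dimension is the reference and has no entry.
DECREMENTS = {
    ("pain", 1): -0.01, ("pain", 2): -0.05,
    ("pain", 3): -0.08, ("pain", 4): -0.12,
    ("worry", 1): -0.01, ("worry", 2): -0.03,
    ("worry", 3): -0.06, ("worry", 4): -0.10,
}


def utility(state, decrements=DECREMENTS):
    """Utility of a health state: full health (1.0) plus the sum of decrements."""
    return 1.0 + sum(
        decrements.get((dimension, level), 0.0)
        for dimension, level in state.items()
    )


full_health = {"pain": 0, "worry": 0}
impaired = {"pain": 2, "worry": 4}
u_full = utility(full_health)
u_impaired = utility(impaired)
```

Because each decrement is additive, the utility of any health state is obtained by a table lookup per dimension followed by a sum.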

The DCE data were analysed in two ways, reflecting different approaches to modelling heterogeneity (Eqs. 1 and 2). The primary analysis (Eq. 1) used conditional logit models in which the utility of option j in choice-set s for survey respondent i was assumed to be:

$$\begin{gathered} U_{isj} = \alpha TIME_{isj} + \beta X^{\prime}_{isj} TIME_{isj} + \varepsilon_{isj} \\ i = 1, \ldots, I \text{ respondents}; \; j = 1, 2 \text{ options}; \; s = 1, \ldots, 100 \text{ choice-sets} \end{gathered}$$

(1)

where α is the utility associated with a life year, \(X_{isj}\) is a vector of dummy variables representing the levels of the FACT-8D health state presented in option j and β is the corresponding vector of preference weights associated with each level in each dimension within \(X_{isj}\), for each life year. The error term \(\varepsilon_{isj}\) was assumed to have a Gumbel distribution. To adjust the standard errors to allow for intra-individual correlation, a clustered sandwich estimator was used via STATA’s vce(cluster) option. To estimate preference weights for each deviation from Level 1 (no problems) in each FACT-8D dimension, we divided each of the β terms by α [26], and used the delta method to estimate standard errors for these ratios [32].
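For a single ratio β/α, the first-order delta method gives Var(β/α) ≈ Var(β)/α² + β²·Var(α)/α⁴ − 2β·Cov(α, β)/α³. The sketch below applies that formula; the function name and the numerical inputs are illustrative assumptions, not estimates from the study, and it is not a reproduction of the STATA routine the authors used.

```python
import math


def ratio_with_delta_se(beta, alpha, var_beta, var_alpha, cov_alpha_beta):
    """Return beta/alpha and its first-order delta-method standard error."""
    ratio = beta / alpha
    # Gradient of g(alpha, beta) = beta / alpha:
    #   dg/dbeta = 1 / alpha,  dg/dalpha = -beta / alpha**2
    variance = (
        var_beta / alpha ** 2
        + (beta ** 2 / alpha ** 4) * var_alpha
        - 2.0 * (beta / alpha ** 3) * cov_alpha_beta
    )
    return ratio, math.sqrt(variance)


# Illustrative inputs only.
weight, se = ratio_with_delta_se(
    beta=-1.0, alpha=2.0, var_beta=0.01, var_alpha=0.04, cov_alpha_beta=0.0
)
```

The variances and covariance would come from the cluster-robust covariance matrix of the fitted model, so the clustering adjustment carries through to the standard errors of the ratios.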

Two conditional logit models were estimated. Model 1 included every decrement from the best level (i.e., Level 1, no problems) in each dimension (Eq. 1). Thus, \(X_{isj}\) contained 32 terms (8 dimensions × [5 − 1] levels within each). Non-monotonicity in such models typically reflects noise, with the non-monotonic parameter estimates being not statistically different from each other [33]. Model 2 followed the same general form as Equation 1 but imposed a restriction of monotonicity across levels of each dimension by combining non-monotonic levels. Model 2 therefore included a reduced number of estimates in β (the vector of preference weights). The MAUCa consortium has used this approach previously for the European Organisation for Research and Treatment of Cancer (EORTC) Quality Of Life Utility-Core 10 Dimensions (QLU-C10D) [34–42] and the FACT-8D [11, 15]. The impact of constraining coefficients was assessed with change in the model pseudo-R²; ideally, the imposition of monotonicity would not reduce model fit markedly.
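Identifying which levels to combine amounts to scanning the Model 1 estimates for adjacent levels whose ordering is reversed. The sketch below only flags such pairs; how the flagged levels were pooled and the model re-estimated is not reproduced here. The function name and the example values are hypothetical.

```python
def non_monotonic_pairs(decrements):
    """Flag adjacent levels where a worse level has a smaller utility loss.

    `decrements` lists the estimated weights for successive levels of one
    dimension, ordered from mildest to most severe and excluding the
    reference level, whose weight is zero by construction.
    """
    values = [0.0] + list(decrements)
    return [
        (k - 1, k)
        for k in range(1, len(values))
        if values[k] > values[k - 1]    # worse level, smaller decrement
    ]


# Illustrative estimates only: the second level is out of order.
flagged = non_monotonic_pairs([-0.02, -0.01, -0.06, -0.09])
ordered = non_monotonic_pairs([-0.02, -0.04, -0.06, -0.09])
```

In the example, the pair `(1, 2)` is flagged because the second non-reference level carries a smaller decrement than the first.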

The secondary analysis (Eq. 2, Model 3) used mixed logit modelling [43] which assumed that coefficients were randomly drawn from a distribution, allowing for preference heterogeneity among individuals (i.e. random coefficients).

$$U_{isj} = \left( \alpha + \gamma_{i} \right) TIME_{isj} + \left( \beta + \eta_{i} \right) X^{\prime}_{isj} TIME_{isj} + \varepsilon_{isj} .$$

(2)

Thus, α and the vector of βs represent population mean preferences, while \(\gamma_i\) and \(\eta_i\) are individual deviations around those mean preferences. These deviations were assumed to be distributed multivariate normal (0, Σ). We used the mixlogit STATA command [44] to estimate α, the vector of βs and the standard deviations of γ and the vector of ηs, with the following adjustment. The standard procedures limit the number of parameters drawn from a distribution to 20; to allow all 33 coefficients (including duration) to be drawn from distributions, we used pseudo-random draws.
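The role of the random draws can be illustrated by simulating a mixed logit choice probability: individual coefficients are drawn around the population means, a logit probability is computed for each draw, and the results are averaged. This sketch assumes independent normal deviations (a diagonal Σ) and plain pseudo-random draws; it is not the mixlogit estimator, and all names and values are illustrative.

```python
import math
import random


def systematic_utility(alpha, betas, x, time):
    """alpha * TIME + (beta . X) * TIME for one option."""
    return alpha * time + sum(b * xi for b, xi in zip(betas, x)) * time


def logit_prob_a(v_a, v_b):
    """Probability of choosing Option A under Gumbel-distributed errors."""
    return 1.0 / (1.0 + math.exp(v_b - v_a))


def simulated_prob_a(alpha, betas, sd_alpha, sd_betas,
                     option_a, option_b, n_draws=500, seed=0):
    """Average the logit probability over pseudo-random coefficient draws.

    Assumption: deviations are independent normals (diagonal covariance).
    Each option is a tuple (x, time) of level dummies and duration.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        a_i = alpha + rng.gauss(0.0, sd_alpha)
        b_i = [b + rng.gauss(0.0, s) for b, s in zip(betas, sd_betas)]
        v_a = systematic_utility(a_i, b_i, *option_a)
        v_b = systematic_utility(a_i, b_i, *option_b)
        total += logit_prob_a(v_a, v_b)
    return total / n_draws


# Illustrative values only: two dummies, two durations.
option_a = ([1, 0], 5)
option_b = ([0, 1], 10)
p_fixed = simulated_prob_a(0.3, [-0.1, -0.2], 0.0, [0.0, 0.0],
                           option_a, option_b)
p_random = simulated_prob_a(0.3, [-0.1, -0.2], 0.1, [0.05, 0.05],
                            option_a, option_b)
```

With all standard deviations set to zero the simulation collapses to the conditional logit probability of Eq. 1, which is a convenient sanity check.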

For variables that deviated from the US general population by ≥ 2.0% in any category, iterative proportional fitting, or raking, weights were included in DCE models [45]. Raking was implemented using the ipfweight command in STATA 13.0, with observations with missing demographic data assigned a weight of 1. Variance inflation due to weighting was assessed by calculating the percentage increase in the standard errors of the unweighted versus weighted Model 1 coefficients.
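Raking adjusts weights one variable at a time so that the weighted sample margins match population targets, cycling until the adjustments become negligible. The sketch below is a minimal pure-Python version under stated assumptions: it is not the ipfweight command, it does not handle missing demographic data, and the records and target proportions are invented for illustration.

```python
def rake(records, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting of weights to marginal targets.

    records: list of dicts holding categorical variables.
    targets: {variable: {category: target proportion}}.
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {}
            for w, rec in zip(weights, records):
                current[rec[var]] = current.get(rec[var], 0.0) + w
            # Scale weights so this variable's margin matches its target.
            for idx, rec in enumerate(records):
                factor = target[rec[var]] * total / current[rec[var]]
                weights[idx] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


# Illustrative sample of 10 respondents and hypothetical population targets.
records = (
    [{"sex": "F", "age": "young"}] * 3 + [{"sex": "F", "age": "old"}] * 1
    + [{"sex": "M", "age": "young"}] * 2 + [{"sex": "M", "age": "old"}] * 4
)
targets = {
    "sex": {"F": 0.5, "M": 0.5},
    "age": {"young": 0.5, "old": 0.5},
}
weights = rake(records, targets)
```

The resulting weights sum to the sample size, so they can be compared directly with an unweighted analysis when assessing variance inflation.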
