Discretizing continuous variables in nutrition and obesity research: a practice that needs to be cut short

We compare analytic approaches from two studies where continuous independent variables were either discretized/dichotomized or analyzed as continuous variables. Since the correlations among independent variables may inflate Type I error rates and lead to the detection of spurious results [15], we report the correlations among the predictors. We also report the correlations between the predictors and criterion, as research shows dichotomization may both attenuate and inflate effect sizes [13, 14]. In addition, we test models for nonlinearity, as prior research has shown that un-modeled nonlinear relationships may create a spurious interaction in linear regression analysis [16, 17]. Both studies were approved by the University of Texas at El Paso Institutional Review Board, and participants consented prior to participation.

In Study 1, an un-modeled quadratic effect would have appeared as a two-way interaction in multiple linear regression. We present the model with the quadratic effect below and compare it to analyses associated with dichotomizing continuous independent variables. Since the predictors could take on the value of zero, and zero represented a meaningful score (not answering any item correctly on the measures), the predictors were not centered. In Study 2, there was no quadratic effect for either predictor. Therefore, we compare a linear regression model with a two-way interaction to a variety of ANOVA models. In the regression analyses in Study 2, we mean-centered the predictors, as neither predictor could take on the value of zero. Data from Study 1 are available from the first author, while data from Study 2 are available from the last author.

Study 1: Materials and methods

Our first example uses three variables to examine the relationship between nutrition knowledge and health literacy. We explored the relationships among accurately reading food labels, nutrition knowledge, and health literacy (n = 612) as part of a larger model. Prior to the start of the study, informed consent was obtained from all participants. Participants were predominantly female (71.4%) and Latinx (85.3% of the sample), with an average age of 20.26 years (SD = 3.89). Data were collected online in Qualtrics during the 2017–2018 academic year. The study had sufficient power: Assuming that health literacy, nutrition knowledge, and a quadratic effect of health literacy explained 12.5% of variability in food label accuracy scores and that the quadratic effect of health literacy uniquely explained 2.5% of the variability, a sample of 385 participants would be needed to detect the quadratic effect with a Type I error rate of 0.05 and power = 0.80 [18]. This study was not replicated.

To measure health literacy, we used a modified version of the Health Literacy Skills Instrument [19, 20] with items being scored as correct or incorrect. The composite score represents the total correct answers. Scores range from 0 to 9, with higher scores indicating greater health literacy. Participants’ average score was 5.61 (SD = 1.54) with reliability (indexed by KR-20) of 0.68, 95% CI (0.64, 0.72). In order to create two categories, anyone scoring 6 or lower was considered “low” on health literacy (68.8% of the sample) and anyone scoring 7, 8, or 9 was considered “high.” While it was not possible to create approximately equal-sized groups with discrete outcomes that assume the values between 0 and 9, the arbitrary choice of where to dichotomize can be seen as an additional impediment to valid inference when dichotomizing independent variables.

For our second variable, we measured nutrition knowledge using a modified version of a measure developed by Parmenter and Wardle [21]. The measure consisted of 20 items that pertain to the relationship between diet and health problems. The composite score represents the total number of correct answers. Participant scores ranged from 0 to 18 with an average score of 10.73 (SD = 3.26) and a reliability estimate (indexed by KR-20) of 0.65, 95% CI (0.61, 0.69). To create two artificial categories, scores were dichotomized at 12, where participants who scored 11 or lower were “low” on nutrition knowledge (56.4% of the sample).

To measure nutrition label accuracy, a modified version of the Nutrition Label Survey [22] based on our earlier work [19] was used. Participant scores ranged from 0 to 16 with an average score of 10.86 (SD = 3.29). The reliability of the scores for this measure (as indexed by KR-20) equaled 0.77, 95% CI (0.74, 0.79).

Study 1: Results

Univariate relationship between predictors and label accuracy

The relationship between health literacy and label accuracy as a continuous variable is r(610) = 0.30, P < 0.001. Health literacy explains 8.8% of the variability in label accuracy scores. The relationship between dichotomized health literacy and label accuracy is attenuated, as the correlation was r(610) = 0.14, P < 0.001, explaining 2.0% of the variance. This example demonstrates a dramatic 77.2% reduction in effect size when using dichotomized variables (8.8% vs. 2.0%).
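As a rough benchmark for the attenuation reported above, the sketch below computes the factor by which a correlation is expected to shrink when one variable is dichotomized, under the textbook assumption that the two variables are bivariate normal. That assumption does not hold for the discrete 0–9 health literacy scores used here, and the function name `attenuation_factor` is introduced only for illustration, so this is a point of reference rather than a reproduction of the reported values.

```python
from math import sqrt
from statistics import NormalDist


def attenuation_factor(p_low: float) -> float:
    """Expected ratio r_dichotomized / r_continuous when a normally
    distributed predictor is split so that a proportion p_low falls in
    the "low" group. Bivariate normality is an assumption of this sketch,
    not something taken from the study."""
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(p_low)  # z-score of the cut point
    return std_normal.pdf(cut) / sqrt(p_low * (1.0 - p_low))


median_split = attenuation_factor(0.50)   # equal-sized groups
study1_split = attenuation_factor(0.688)  # 68.8% scored "low" on health literacy

print(f"median split factor:     {median_split:.3f}")
print(f"68.8/31.2 split factor:  {study1_split:.3f}")
print(f"benchmark r for r = 0.30: {0.30 * study1_split:.2f}")
```

Under these assumptions, a 68.8/31.2 split would be expected to retain roughly three quarters of the continuous correlation, so the observed drop from 0.30 to 0.14 is larger than this idealized benchmark.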

In addition, the correlation between nutrition knowledge and label accuracy was r(610) = 0.29, P < 0.001 and explains 8.3% of the variance in label accuracy scores. The correlation between the dichotomized nutrition knowledge and label accuracy was r(610) = 0.20, P < 0.001, and the proportion of variance explained was 4.1%. Finally, the two predictors were moderately correlated, r(610) = 0.30, P < 0.001.

Results of the analysis of variance with dichotomized independent variables

In this demonstration, we investigated the effects of dichotomized continuous variables on label accuracy scores. The main effect for nutrition knowledge was statistically significant, F(1, 608) = 14.27, P < 0.001, squared partial correlation = 0.023, Cohen’s d = 0.43. The nutrition label accuracy score (mean ± SE) for the participants with “low” vs. “high” nutrition knowledge was 10.59 ± 0.20 vs. 11.66 ± 0.20.

The main effect for health literacy was statistically significant, F(1, 608) = 8.44, P = 0.004, squared partial correlation = 0.014, Cohen’s d = 0.31. The nutrition label accuracy score (mean ± SE) for the participants with “low” vs. “high” health literacy was 10.72 ± 0.16 vs. 11.53 ± 0.23. Finally, the interaction was not statistically significant, F(1, 608) = 3.512, P = 0.061, squared partial correlation = 0.006. The R² for this analysis equaled 0.06, indicating that the dichotomized independent variables and their interaction explain 6% of the label accuracy variability.
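For effects with a single numerator degree of freedom, the squared partial correlation can be recovered from the F statistic and its error degrees of freedom as F / (F + df_error). The short sketch below applies that identity to the three F values reported for this 2 × 2 ANOVA; the helper name is hypothetical, and the calculation uses only the rounded statistics printed above.

```python
def squared_partial_correlation(f_value: float, df_error: int) -> float:
    """Squared partial correlation for a 1-df effect: F / (F + df_error)."""
    return f_value / (f_value + df_error)


# F statistics reported above, each tested on (1, 608) degrees of freedom
reported_f = {
    "nutrition knowledge": 14.27,
    "health literacy": 8.44,
    "interaction": 3.512,
}

for effect, f_value in reported_f.items():
    pr2 = squared_partial_correlation(f_value, df_error=608)
    print(f"{effect}: squared partial correlation = {pr2:.3f}")
```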

Results of the multiple regression analysis

We also analyzed the data using multiple regression. Initially, we regressed label accuracy on nutrition knowledge, health literacy, and their interaction. As prior research [16, 17] shows that un-modeled nonlinear effects may result in spurious interactions, we also regressed label accuracy on nutrition knowledge, health literacy, and health literacy squared (to estimate a quadratic effect). This model was a better model in terms of the proportion of variance explained. Another model containing the quadratic effect for health literacy and a two-way interaction between health literacy and nutrition knowledge was also examined, but the two-way interaction was not significant. We now discuss the regression model with the quadratic effect.

In the analysis with the quadratic effect, the R² equaled 0.17, almost three times the proportion of variance explained in the prior ANOVA. Moreover, the partial regression coefficient for nutrition knowledge was significant: for every 1-unit increase in nutrition knowledge, the predicted label accuracy score increased 0.17 points (P < 0.001) holding health literacy constant. The squared partial correlation coefficient associated with this variable equaled 0.030, which indicates that nutrition knowledge uniquely accounts for 3.0% of the unexplained variability in nutrition label accuracy scores that is not accounted for by the other predictors in the model.

The simple effect for health literacy when health literacy equals zero is 2.52 (P < 0.001) showing that when health literacy is low, small differences in health literacy are associated with large differences in predicted label accuracy scores. The squared partial correlation coefficient associated with this simple effect equaled 0.07, indicating that this variable uniquely accounts for 7.0% of the unexplained variability in nutrition label accuracy scores that is not accounted for by the other predictors in the model.

One obvious shortcoming of dichotomizing health literacy for an ANOVA is the inability to examine nonlinear relationships. Put another way, the power to detect this quadratic effect in an ANOVA equals zero. In this model, the quadratic effect was statistically significant. At the point where Health Literacy = 0 (its minimum), and holding Nutrition Knowledge constant, a 1-unit increase in Health Literacy translates to a 2.52-point increase in predicted Label Accuracy, and −0.20 is half the amount by which this effect changes for every 1-unit increase in Health Literacy thereafter. So, the initially significant positive effect of Health Literacy weakens as Health Literacy increases (P < 0.001). The squared partial correlation coefficient associated with this quadratic effect equaled 0.049, indicating this variable uniquely accounts for 4.9% of the unexplained variability in nutrition label accuracy scores that is not accounted for by the other predictors in the model.

Probing the quadratic effect

For our model,

$$\widehat{\text{Label Accuracy}}=1.53+0.17\,\text{Nutrition Knowledge}+2.52\,\text{Health Literacy}-0.20\,\text{Health Literacy}^{2}$$

Re-expressing this model,

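$$\widehat{\text{Label Accuracy}}=1.53+0.17\,\text{Nutrition Knowledge}+2.52\,\text{Health Literacy}-0.20\,\left(\text{Health Literacy} * \text{Health Literacy}\right)$$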

Written this way, the quadratic effect is an interaction of health literacy with itself, i.e., the regression of accuracy in reading nutrition labels on health literacy depends on where you stand on health literacy. Rearranging terms, this model can be re-expressed as:

$$\widehat{\text{Label Accuracy}}=1.53+0.17\,\text{Nutrition Knowledge}+\left(2.52-0.20\,\text{Health Literacy}\right)\text{Health Literacy}$$

To aid in the interpretation of the parenthesized term, we used the Johnson–Neyman [23] regions of significance approach to determine what values of health literacy made the parenthesized term statistically significant. As Spiller and colleagues [24] highlight, the choice of what values equal zero among the predictor variables must be kept in mind when examining the parenthesized term above.

There are several ways to probe an interaction, including the use of 3D graphs [25], spotlighting or pick-a-point [16], floodlighting [26], and the Johnson–Neyman [23] regions of significance approach. The use of 3D graphs allows for a three-dimensional examination of the relationship between the predictor variables and the dependent variable. Spotlighting or pick-a-point assesses the statistical significance of the parenthesized term at a particular value of the moderator variable. Floodlighting examines all possible values that the moderator variable can take on in the parenthesized term for the simple slope, while the Johnson–Neyman regions of significance approach provides a range of the values for the moderator where the parenthesized term for the simple slope is statistically significant. We chose to use the Johnson–Neyman approach, as the resulting graph is easy to interpret and provides the regions of significance.

Miller et al. [27] provide online tools that use the Johnson–Neyman approach to examine the statistical significance of the parenthesized term when the linear model contains a quadratic effect. Using these tools, the parenthesized term above is not statistically significant when health literacy ranges between 5.8883 and 7.2203. In other words, when health literacy is equal to 6 or 7, the local linear effect of health literacy on accuracy in reading nutrition labels is not statistically significant. Between the values of 0 and 5, the local linear effect of health literacy on accuracy in reading food labels is positive. For individuals who score either 8 or 9, the local linear effect of health literacy on accuracy in reading food labels is negative. Figure 1 depicts this simple slope.

Fig. 1: Johnson–Neyman regions of significance plot for Study 1.

Probing the quadratic effect of health literacy.
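The sign pattern described for the local linear effect can be illustrated with point estimates alone. The sketch below assumes that the "local linear effect" is the first derivative of the fitted curve with respect to health literacy, which is consistent with the statement that −0.20 is half the amount by which the effect changes per unit. It uses only the rounded coefficients reported above and does not reproduce the Johnson–Neyman bounds, which require standard errors and covariances that are not reported here.

```python
B_LINEAR = 2.52      # reported coefficient on health literacy
B_QUADRATIC = -0.20  # reported coefficient on health literacy squared


def local_slope(health_literacy: float) -> float:
    """Derivative of 2.52*HL - 0.20*HL**2 with respect to HL."""
    return B_LINEAR + 2.0 * B_QUADRATIC * health_literacy


# Health literacy score at which the estimated local slope equals zero
turning_point = -B_LINEAR / (2.0 * B_QUADRATIC)

for score in range(10):  # observed scale runs from 0 to 9
    print(f"HL = {score}: estimated local slope = {local_slope(score):+.2f}")
print(f"estimated slope crosses zero at HL = {turning_point:.2f}")
```

Under that reading, the estimated slope changes sign inside the reported non-significant region (5.8883 to 7.2203), is positive for lower scores, and is negative for scores of 8 and 9.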

Study 2: Materials/subjects and methods

Our second example examines the relationships among cognitive restraint, BMI, and fruit & vegetable (F/V) intake (n = 586). Participants were female (66.2%), Latinx (69.3%), and had annual income less than 50 K (53.6%). The average age was 35.5 (SD = 14). Data were collected from health professionals, nutrition students, and community members from March 2018 to June 2019. Prior to the start of the study, informed consent was obtained from all participants. The study had sufficient power: Assuming that the interaction between cognitive restraint and BMI explained 1.5% of unique variability and that BMI, cognitive restraint, and their interaction explained 12.5% of the variability in F/V intake, a sample of 462 participants would be needed to detect this interaction with a Type I error rate of 0.05 and power = 0.80 [18]. This study was not replicated.
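The logic of this power statement can be sketched with a large-sample normal approximation. The unique effect is first converted to Cohen's f² (unique R² divided by one minus the full-model R²), and the required sample size is then approximated from standard normal quantiles. This is not the procedure of reference [18], whose details are not given here, and an exact noncentral F calculation would be expected to give a slightly larger value, so the result should land near the reported figure rather than exactly on it. The function name and defaults are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def approx_n_for_unique_effect(r2_full: float, r2_unique: float,
                               alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate N needed to detect a 1-df effect in multiple regression.

    Uses f2 = r2_unique / (1 - r2_full) and the normal approximation
    N ~ (z_{1 - alpha/2} + z_{power})**2 / f2.
    """
    f2 = r2_unique / (1.0 - r2_full)
    z = NormalDist()
    noncentrality = (z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)) ** 2
    return ceil(noncentrality / f2)


# Inputs stated above: full model explains 12.5%, interaction uniquely explains 1.5%
approx_n = approx_n_for_unique_effect(r2_full=0.125, r2_unique=0.015)
print(f"approximate required sample size: {approx_n}")
```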

To measure cognitive restraint, we used a modified version of this domain from the Three Factor Eating Questionnaire [28, 29] with items being scored on a 1–4 Likert scale. Mean composite scores range from 1 to 4, with higher scores indicating greater cognitive restraint. Participants’ average score was 2.61 (SD = 0.56) with reliability (indexed by coefficient alpha) of 0.67. For the purpose of this example, anyone scoring 2.60 or lower was considered “low” on cognitive restraint and anyone scoring above 2.60 was considered “high” on cognitive restraint. “Low” scoring individuals made up 45.9% of the sample.

For our second variable, we calculated the participants’ BMI. Participant height was measured with a Seca 213 stadiometer (Hamburg, Germany) and weight with InBody 270 and InBody 570 Body Composition Analyzers (Seoul, South Korea), and BMI was calculated. Heights were rounded to the nearest half-centimeter. Participant BMIs ranged from 17.0 to 60.7 with an average of 27.99 (SD = 6.05). Several artificial groupings were created: dichotomized at the median, discretized per CDC guidelines [1], and dichotomized as having obesity or not. To create two artificial categories, scores were dichotomized at 27.1, where participants who scored 27.1 or lower were “low” BMI and were assigned a score of 0. “Low” scoring participants made up 50.2% of the sample. Individuals whose BMI was greater than 27.1 were considered “high” and were assigned a score of 1.

In a separate analysis, BMI was discretized at the following points, according to CDC guidelines [1]: underweight, BMI < 18.5 (1.2% of the sample); healthy weight, BMI 18.5–24.9 (32.4% of the sample); overweight, BMI 25.0–29.9 (36% of the sample); Class 1 Obesity, BMI 30–34.9 (18.4% of the sample); Class 2 Obesity, BMI 35–39.9 (7% of the sample); and Class 3 Obesity, BMI ≥ 40 (4.9% of the sample). In a third ANOVA, BMI was discretized using modified CDC guidelines where Class 1, Class 2, and Class 3 obesity were merged into an “obesity” category, constituting 30.3% of the sample.
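The three groupings described above can be sketched as simple recoding functions. The function names are hypothetical, and treating each category's upper bound as "strictly below the next cut point" (for example, overweight as 25.0 ≤ BMI < 30.0) is an assumption about how values such as 24.95 were handled, which the text does not specify.

```python
def cdc_bmi_category(bmi: float) -> str:
    """Six-level grouping using the cut points listed above."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "healthy weight"
    if bmi < 30.0:
        return "overweight"
    if bmi < 35.0:
        return "class 1 obesity"
    if bmi < 40.0:
        return "class 2 obesity"
    return "class 3 obesity"


def merged_bmi_category(bmi: float) -> str:
    """Four-level grouping: the three obesity classes are merged."""
    category = cdc_bmi_category(bmi)
    return "obesity" if category.endswith("obesity") else category


def median_split(bmi: float, cut: float = 27.1) -> int:
    """0 = "low" (BMI at or below the cut), 1 = "high" (BMI above the cut)."""
    return int(bmi > cut)


# Nearly identical BMIs can land in different groups after discretizing
for bmi in (27.1, 27.2, 29.9, 30.0):
    print(bmi, median_split(bmi), cdc_bmi_category(bmi), merged_bmi_category(bmi))
```

The printed rows show the information loss at issue: participants who differ by 0.1 BMI units are treated as different, while participants who differ by many units within a category are treated as identical.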

For F/V intake, we measured skin carotenoid levels, a biomarker for total F/V intake, using reflectance spectroscopy (VEGGIE METER® by Longevity Link Incorporated, Salt Lake City, UT, USA) [30, 31]. Participant scores ranged from 29 to 709 with an average score of 275.67 (SD = 110.149).

Study 2: Results

Univariate relationship between predictors and F/V intake

When analyzed as a continuous variable, the relationship between BMI and F/V intake is r(584) = −0.18, P < 0.001. BMI explains 3.1% of the variability in F/V intake scores. The relationship between dichotomized BMI and F/V intake was stronger, with a correlation of r(584) = −0.21, P < 0.001, explaining 4.2% of the variance. As mentioned earlier, one of the conditions under which dichotomizing continuous variables may increase the correlation with the criterion occurs when the correlation is small, as is the case in this example. These correlations do not statistically differ from one another, Z = 1.07, P = 0.29 [32, 33].

The correlation of cognitive restraint as a continuous variable with F/V intake was r(584) = 0.13, P = 0.001 while the correlation of dichotomized cognitive restraint with F/V intake was r(584) = 0.14, P = 0.001. The correlations with these two approaches did not statistically differ from one another, Z = −0.17, P = 0.86 [32, 33]. Finally, the correlation between continuously measured cognitive restraint and continuously measured BMI was r(584) = 0.04, P = 0.28.

Results of the analysis of variance with discretized and dichotomized independent variables

To investigate the effects of dichotomized and discretized continuous variables on F/V intake scores, we conducted the following ANOVAs on F/V intake: a 2 (low vs. high cognitive restraint) × 2 (low vs. high BMI), a 2 (low vs. high cognitive restraint) × 4 (underweight–healthy weight–overweight–obesity), and a 2 (low vs. high cognitive restraint) × 6 (underweight–healthy weight–overweight–Class 1 Obesity–Class 2 Obesity–Class 3 Obesity).

For the 2 × 2 ANOVA on F/V intake scores, the main effect of BMI was statistically significant, F(1, 582) = 24.58, P < 0.001, squared partial correlation = 0.041. The F/V score (mean ± SE) for the participants with low vs. high BMI was 296.26 ± 6.24 vs. 252.42 ± 6.26. The main effect for cognitive restraint was statistically significant, F(1, 582) = 12.07, P < 0.001, squared partial correlation = 0.02. The F/V score (mean ± SE) for participants with low vs. high cognitive restraint was 258.98 ± 6.50 vs. 289.70 ± 5.99. Finally, the interaction was not statistically significant, F(1, 582) = 3.34, P = 0.062, squared partial correlation = 0.006. The R² for this analysis equaled 0.067, indicating that the dichotomized independent variables and their interaction explain 6.7% of the variability in F/V intake. From these analyses, one would conclude that individuals with low BMI (relative to high BMI) and individuals with high cognitive restraint (relative to low cognitive restraint) consume more F/Vs.

For the 2 × 4 ANOVA (respectively, low vs. high cognitive restraint; underweight–healthy weight–overweight–obesity) on F/V intake scores, the main effect for BMI was statistically significant, F(3, 578) = 6.94, P < 0.001, squared partial correlation = 0.012. The F/V scores (mean ± SE) for participants in the categories of underweight, healthy weight, overweight, and obesity were 246.50 ± 44.83, 302.78 ± 7.77, 267.4 ± 7.52, and 254.27 ± 8.05, respectively. Bonferroni contrasts revealed that individuals with healthy weight ate more F/Vs than participants with overweight and obesity. The main effect for cognitive restraint was not statistically significant, F(1, 578) = 3.23, P = 0.073, squared partial correlation = 0.006. Finally, the interaction was also not statistically significant, F(3, 578) = 2.07, P = 0.103, squared partial correlation = 0.004. The R² for this analysis equaled 0.065, indicating that the discretized predictors and their interaction explain 6.5% of F/V intake variability.

For the 2 (low vs. high cognitive restraint) × 6 (underweight–healthy weight–overweight–obesity 1–obesity 2–obesity 3) ANOVA on F/V intake scores, the main effect for BMI was statistically significant, F(5, 574) = 4.85, P < 0.001, squared partial correlation = 0.008. The F/V scores (mean ± SE) for participants in the categories of underweight, healthy weight, overweight, obesity 1, obesity 2, and obesity 3 were 246.50 ± 44.81, 302.78 ± 7.77, 267.43 ± 7.52, 263.94 ± 10.42, 239.50 ± 16.77, and 232.98 ± 19.90, respectively. Bonferroni contrasts revealed that healthy-weight individuals ate more F/Vs than those with overweight and all participants with obesity. The main effect for cognitive restraint was not statistically significant, F(1, 574) = 1.42, P = 0.23, squared partial correlation = 0.002. Finally, the interaction was also not statistically significant, F(5, 574) = 1.72, P = 0.128, squared partial correlation = 0.003. The R² for this analysis equaled 0.072, indicating that the discretized predictors and their interaction explain 7.2% of the F/V intake variability.

Results of the multiple regression analysis

We also analyzed the data using multiple regression, where BMI and cognitive restraint and their interaction were included in the model. In general, the regression model in our example can be expressed as:

$$\hat{Y}=\beta_0+\beta_1 X+\beta_2 Z+\beta_3 XZ$$

Rearranging terms and declaring X (cognitive restraint) as the focal predictor and Z (e.g., BMI) as the moderator variable yields:

$$\hat{Y}=\left(\beta_0+\beta_2 Z\right)+\left(\beta_1+\beta_3 Z\right)X$$

The first parenthesized term is known as the simple intercept; we see that the simple intercept is dependent on the value of Z, the moderator, and its associated conditional partial regression coefficient, \(\beta_2\). The second parenthesized term is known as the simple slope, and it is also dependent on the value of Z. Researchers often want to provide “meaning” to \(\beta_1\), so that one can say that for every 1 unit change in X, \(\beta_1\) will represent how much the predicted outcome variable will change. That claim can be made only when Z = 0. In the current example, BMI and cognitive restraint can never take on the value of zero. As a result, \(\beta_1\) represents an effect that has no meaning. The above equation can also be rearranged making Z the focal predictor, so that the meaningfulness of \(\beta_2\) will depend on whether X (e.g., cognitive restraint) can assume the value of 0.

McClelland et al. [34] provide an easy-to-understand synopsis of a variety of ways to provide meaning to these conditional partial regression coefficients. Some of these methods involve mean-centering the predictor variables [35] and performing orthogonal transformations of the predictor variables [36, 37]. As McClelland et al. [34] point out, these transformations of the predictors will not alter the estimate of the interaction coefficient, \(\beta_3\), or its associated standard error. In addition, the semi-partial correlation and partial correlation involving the interaction term and the outcome will not be changed due to either mean centering or the use of an orthogonal transformation. These transformations will also have no effect on model fit [38]. The primary benefit of transforming predictor variables is to provide “meaning” to the conditional coefficients \(\beta_1\) and \(\beta_2\).

It is also clear that interpreting the simple slope depends on the numeric values of \(\beta_1\) and/or \(\beta_2\) [24], which depend on how predictor variables are transformed. In this demonstration, we decided to mean-center BMI and cognitive restraint. In this analysis, the R² equaled 0.059, which is less than the estimates of R² from the ANOVA models. The conditional effect for mean-centered BMI was statistically significant, B = −3.75 (SE = 0.76), t = −4.95, P < 0.001. The squared partial correlation coefficient associated with this conditional effect equaled 0.040, indicating that this variable uniquely accounts for 4% of the unexplained variability in F/V intake scores that is not accounted for by the other predictors.

The conditional effect for mean-centered cognitive restraint was significant: B = 24.22 (SE = 8.01), t = 3.02, P = 0.003. The squared partial correlation coefficient associated with this variable equaled 0.015, which indicates that cognitive restraint uniquely accounts for 1.5% of the unexplained variability in F/V intake that is not accounted for by the other predictors. The conditional effect of cognitive restraint is qualified by an interaction with BMI: B = −2.84 (SE = 1.34), t = −2.12, P = 0.034. The squared partial correlation coefficient associated with this interaction equaled 0.008, indicating this variable uniquely accounts for 0.8% of the unexplained variability in F/V intake scores that is not accounted for by the other predictors. These findings contradict what was found with the various ANOVA models where cognitive restraint was dichotomized and BMI was either dichotomized or discretized.
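To illustrate why centering changes the conditional coefficients but not the interaction coefficient, the sketch below algebraically re-expresses the mean-centered model in raw-score form. It uses the rounded coefficients reported in this section, the intercept of 276.10 from the fitted equation given in the next subsection, and the sample means reported in the methods (2.61 for cognitive restraint, 27.99 for BMI). It is an algebraic rearrangement of published estimates, not a refit of the data, and the variable names are illustrative.

```python
MEAN_CR = 2.61    # sample mean of cognitive restraint
MEAN_BMI = 27.99  # sample mean of BMI

# Rounded coefficients of the mean-centered model
b0, b_cr, b_bmi, b_int = 276.10, 24.22, -3.75, -2.84


def predict_centered(cr: float, bmi: float) -> float:
    c, z = cr - MEAN_CR, bmi - MEAN_BMI
    return b0 + b_cr * c + b_bmi * z + b_int * c * z


# Expanding the centered products gives the equivalent raw-score coefficients
raw_int = b_int                     # interaction coefficient is unchanged
raw_cr = b_cr - b_int * MEAN_BMI    # slope of restraint when BMI = 0
raw_bmi = b_bmi - b_int * MEAN_CR   # slope of BMI when restraint = 0
raw_b0 = b0 - b_cr * MEAN_CR - b_bmi * MEAN_BMI + b_int * MEAN_CR * MEAN_BMI


def predict_raw(cr: float, bmi: float) -> float:
    return raw_b0 + raw_cr * cr + raw_bmi * bmi + raw_int * cr * bmi


print(f"raw-score slope for cognitive restraint (at BMI = 0): {raw_cr:.2f}")
print(f"raw-score slope for BMI (at restraint = 0): {raw_bmi:.2f}")
print(f"same prediction either way: "
      f"{predict_centered(3.0, 32.0):.2f} vs {predict_raw(3.0, 32.0):.2f}")
```

Both parameterizations give identical predictions, but the raw-score conditional slopes describe participants with a BMI or cognitive restraint score of zero, values that cannot occur in these data.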

Probing the interaction

As discussed earlier, once an interaction is found in regression, the interaction needs to be understood. In the equations below, \(\hat{Y}\) will denote the predicted F/V intake score. For our model,

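$$\hat{Y}=276.10+24.22\left(\text{Cognitive Restraint}-\overline{\text{Cognitive Restraint}}\right)-3.75\left(\text{BMI}-\overline{\text{BMI}}\right)-2.84\left(\left(\text{Cognitive Restraint}-\overline{\text{Cognitive Restraint}}\right)*\left(\text{BMI}-\overline{\text{BMI}}\right)\right)$$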

This model is a moderated multiple regression equation, where one independent variable is the focal predictor and the other independent variable is the moderator. For this example, cognitive restraint will be the focal predictor of F/V intake and BMI moderates the relationship between cognitive restraint and F/V intake. Rearranging terms, this model can be re-expressed as:

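$$\hat{Y}=\left(276.10-3.75\left(\text{BMI}-\overline{\text{BMI}}\right)\right)+\left(24.22-2.84\left(\text{BMI}-\overline{\text{BMI}}\right)\right)\left(\text{Cognitive Restraint}-\overline{\text{Cognitive Restraint}}\right)$$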

To aid in the interpretation of this model, many researchers would dichotomize the moderator and plot the regression of the outcome variable on the focal predictor separately for individuals who are “low” and “high” on the moderator. Other researchers might pick three arbitrary points of the moderator variable (e.g., the mean and 1 standard deviation above and below the mean of the moderator variable) and plot the regression of the dependent variable on the focal predictor at these three points. While such procedures are commonly used, they do not determine the numeric values of the moderator variable that make the simple slope statistically significant. Moreover, such an approach limits generalizability as the values of the mean and the standard deviation are sample-dependent [24].

The Johnson–Neyman technique [23] allows for such an assessment by creating 95% confidence intervals for a simple slope for all hypothetical values of the moderator. Confidence limits that exclude zero indicate the simple slope is statistically significant at that value of the moderator. In Fig. 2, the vertical axis shows values of the simple slope, while the horizontal axis shows the numeric values of the moderator variable, mean-centered BMI. Looking at Fig. 2, we see that values of BMI more than 2.24 units above the mean have confidence intervals that contain 0. Using PROCESS [39], the simple slope is not statistically significant at mean-centered BMI values greater than 2.24. Mean BMI in this sample equaled 27.99. In practical terms, the conditional effect of mean-centered cognitive restraint on F/V intake is detectable for participants classified as underweight, healthy weight, or overweight, and for some who would be classified as having obesity (BMI < 30.23, which is 27.99 + 2.24). For participants with BMIs > 30.23, there is no significant association between F/V intake and cognitive restraint.

Fig. 2: Johnson–Neyman regions of significance plot for Study 2.

Probing the two-way interaction with mean-centered BMI as the moderator variable.
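The point estimates behind this plot can be sketched from the rounded coefficients alone. The code below evaluates the simple slope of cognitive restraint at several raw BMI values and converts the reported boundary back to the raw BMI scale. It does not compute the confidence bands, which require the coefficient covariance matrix that is not reported here, so it shows only how the estimated slope shrinks as BMI increases. The function name and the chosen BMI values are illustrative.

```python
MEAN_BMI = 27.99       # sample mean of BMI
B_RESTRAINT = 24.22    # conditional effect of mean-centered cognitive restraint
B_INTERACTION = -2.84  # interaction coefficient
JN_BOUNDARY = 2.24     # reported boundary in mean-centered BMI units


def simple_slope_restraint(bmi: float) -> float:
    """Estimated slope of F/V intake on cognitive restraint at a raw BMI."""
    return B_RESTRAINT + B_INTERACTION * (bmi - MEAN_BMI)


boundary_raw_bmi = MEAN_BMI + JN_BOUNDARY

for bmi in (18.5, 25.0, MEAN_BMI, boundary_raw_bmi, 35.0, 40.0):
    print(f"BMI = {bmi:5.2f}: estimated simple slope = "
          f"{simple_slope_restraint(bmi):+.2f}")
```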

As interactions are symmetric, we can also treat cognitive restraint as the moderator variable. Rearranging the above expression,

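$$\hat{Y}=\left(276.10+24.22\left(\text{Cognitive Restraint}-\overline{\text{Cognitive Restraint}}\right)\right)+\left(-3.75-2.84\left(\text{Cognitive Restraint}-\overline{\text{Cognitive Restraint}}\right)\right)\left(\text{BMI}-\overline{\text{BMI}}\right)$$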

Using the Johnson–Neyman [23] regions of significance approach, the simple slope above is statistically significant when cognitive restraint scores are no more than 0.64 units below the sample mean, that is, when mean-centered cognitive restraint is greater than or equal to −0.64.
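As a worked illustration, and using only the rounded coefficients and the sample mean of cognitive restraint (2.61) reported earlier, the boundary can be expressed on the raw 1–4 scale, and the point estimate of the simple slope of BMI at that boundary follows from the second parenthesized term. This is arithmetic on published estimates, not a re-analysis:

$$2.61-0.64=1.97,\qquad -3.75-2.84\,(-0.64)=-3.75+1.82\approx -1.93$$

Under these rounded values, each additional BMI unit is associated with roughly 1.93 fewer points of predicted F/V score for someone with a cognitive restraint score of about 1.97, and the estimated slope becomes more negative as cognitive restraint increases.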
