Dietary patterns derived by reduced rank regression, macronutrients as response variables, and variation by economic status: NHANES 1999–2018

Study design and population

We analysed data from the National Health and Nutrition Examination Survey (NHANES), an ongoing national repeated cross-sectional survey designed to assess the health and nutritional status of the US population [18]. NHANES is conducted by the National Center for Health Statistics of the Centers for Disease Control and Prevention [19]. In this study, we used data from the 2-year NHANES survey cycles conducted during 1999–2018. The included participants were 20 years of age and older (n = 55,081). A total of 41,849 participants (49.8% males) were included in the reduced rank regression (RRR) model after exclusions for missing dietary and economic status data, as well as implausible energy intakes (for males, an energy intake < 800 or > 6000 kcal/day; for females, an energy intake < 600 or > 4000 kcal/day) [20]. In the descriptive and regression analyses, a total of 39,757 participants (49.0% males) remained after the exclusion of participants with missing covariate data. An additional 12,566 participants were excluded from the C-reactive protein (CRP) regression analysis because CRP was not measured during the 2011–2014 NHANES survey cycles (Fig. 2). All participants provided written informed consent, with approval from the National Center for Health Statistics Research Ethics Review Board. Additional low-risk ethical approval was obtained from the Flinders University Human Research Ethics Committee (6547).
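The energy intake exclusion described above can be expressed as a simple sex-specific range check. The following is a minimal sketch only; the function name and the "male"/"female" labels are illustrative assumptions, not NHANES variable names or codes.

```python
def has_plausible_energy(sex: str, kcal_per_day: float) -> bool:
    """Return True if the reported intake lies inside the sex-specific range.

    Intakes < 800 or > 6000 kcal/day (males) and < 600 or > 4000 kcal/day
    (females) are treated as implausible, so the boundary values are kept.
    The sex labels used here are hypothetical, not NHANES codes.
    """
    limits = {"male": (800.0, 6000.0), "female": (600.0, 4000.0)}
    low, high = limits[sex]
    return low <= kcal_per_day <= high
```

For example, a male participant reporting 750 kcal/day would be excluded, whereas a female participant reporting the same intake would be retained.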

Fig. 2

Sampling description of study participants, the National Health and Nutrition Examination Survey (NHANES)

Note: CRP, C-reactive protein

Dietary data

The 1999–2018 NHANES cycles collected dietary data through 24-hour dietary recall interviews using the Automated Multiple-Pass Method [21]. The Automated Multiple-Pass Method is a computer-assisted interview system developed by the United States Department of Agriculture (USDA) for the estimation of food intakes [22]. For the 2003–2018 waves, there were two 24-hour dietary recall interviews. The first dietary interview was administered in person by a trained interviewer in the Mobile Examination Center, and the second was conducted three to ten days later by telephone. We used the data from the first dietary interview in the primary analysis. Respondents were provided with measuring guides for assistance in estimating the portion sizes of consumed foods and beverages. An updated version of the USDA Food and Nutrient Database for Dietary Studies was used to determine the nutrient values of food items for each 2-year survey period [21]. The USDA Food Patterns Equivalents Database was employed to disaggregate the reported food and beverages into 37 USDA Food Patterns components [21]. Further detail on the dietary data collection method is reported elsewhere [21].

Dietary patterns

RRR was used to identify dietary patterns using 26 foods and food groups: citrus fruit; other fruits; dark green vegetables; tomato; potato; other starchy vegetable; other vegetables; whole grain; refined grain; meat, pork and beef; frank meat; organ meat; poultry; fish high in omega-3; fish low in omega-3; egg; soy; nuts; legumes; milk; yogurt; cheese; liquid fat; solid fat; added sugar; and alcohol. The food groupings were based on nutrient composition and cooking methods. Percentages of energy from four macronutrients (protein, carbohydrates, saturated fat, and unsaturated fat) were calculated and used as response variables. The RRR model used in this study is depicted in Fig. 3. The number of dietary patterns derived using RRR is dependent on the number of response variables. Hence, four dietary patterns were extracted in our analysis.
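To illustrate why the number of extracted patterns equals the number of response variables, the sketch below implements one common formulation of RRR with NumPy: the responses are regressed on the predictors, and the fitted values are then decomposed by singular value decomposition. It is not the Stata module used in this study, it ignores survey weights, and all names (`reduced_rank_regression`, `X`, `Y`) are illustrative assumptions.

```python
import numpy as np


def reduced_rank_regression(X, Y):
    """Sketch of RRR. X: (n, p) food-group intakes; Y: (n, q) responses.

    Returns factor scores (n, q), food-group weights (p, q), and the share
    of total response variation explained by each factor (q,).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Standardize predictors and responses (mean 0, SD 1).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    # Ordinary least squares fit of every response on all predictors.
    B, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
    Y_hat = Xs @ B
    # The fitted values have rank at most q, so at most q factors exist.
    _, s, Vt = np.linalg.svd(Y_hat, full_matrices=False)
    weights = B @ Vt.T        # linear combinations of food groups
    scores = Xs @ weights     # one (unnormalized) score per participant and factor
    explained = s**2 / np.sum(Ys**2)
    return scores, weights, explained
```

With four response variables, `scores` has four mutually uncorrelated columns, ordered by the amount of response variation they explain.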

Fig. 3

The reduced rank regression model with the response variables and predictors used in this study

Note: Only a few of the food groups used in this study are displayed in the figure. %E, percentage of total energy

Economic status

Family poverty to income ratio was the variable selected to operationalize economic status; it is the ratio of family income to the specific poverty threshold for that survey year. These data were obtained through a questionnaire conducted by trained interviewers in the home of NHANES participants using Computer-Assisted Personal Interview methodology [23]. Family poverty to income ratio was classified into three groups based on a published study and the Patient Protection and Affordable Care Act: low (≤ 1), middle (1–4), and high (≥ 4) [24]. These groups were utilized to represent low, medium, and high levels of economic status, respectively.
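The three-group classification can be sketched as follows. Because the stated ranges share their boundary values, this sketch assumes that a ratio of exactly 1 falls in the low group and a ratio of exactly 4 falls in the high group; that boundary handling, like the function name, is an assumption rather than something the text specifies.

```python
def classify_economic_status(poverty_income_ratio: float) -> str:
    """Map the family poverty to income ratio to low, middle, or high.

    Assumption: the middle group is the open interval (1, 4), so the
    boundary values 1 and 4 go to the low and high groups respectively.
    """
    if poverty_income_ratio <= 1:
        return "low"
    if poverty_income_ratio < 4:
        return "middle"
    return "high"
```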

Central obesity and systemic inflammation

Central obesity was assessed as waist circumference, which is a measure of abdominal adiposity independently associated with an increased risk of morbidity and mortality [25]. Waist circumference was measured with a tape measure at the uppermost lateral border of the hip crest (ilium) to the nearest 0.1 cm [26, 27]. These measures were undertaken by trained health technicians in the Mobile Examination Center during the examination segment of the NHANES survey cycles. Waist circumference was analysed as a continuous variable.

Blood samples were obtained from participants as a component of NHANES. CRP, a biomarker of systemic inflammation, was measured in the 1999–2010, 2015–2016 and 2017–2018 survey cycles, but not in the 2011–2014 survey cycles. The 1999–2010 survey cycles used latex-enhanced nephelometry to measure CRP in mg/dL. This method is based on the reaction between a soluble analyte and its corresponding antigen or antibody bound to polystyrene particles. Quantification of CRP occurs through anti-CRP antibodies covalently linking with the polystyrene core and hydrophilic shell of CRP particles [23]. Two alternative methodologies of higher sensitivity were used to measure CRP levels in the 2015–2016 and 2017–2018 survey cycles: a near infrared particle immunoassay rate method and a two-reagent immunoturbidimetric system, respectively. CRP was modelled as a continuous variable in mg/L, and levels above 10.0 mg/L were excluded because such levels may indicate acute infection [28].
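Because the 1999–2010 values were recorded in mg/dL while CRP was modelled in mg/L, a unit conversion (1 mg/dL = 10 mg/L) precedes the exclusion rule. A minimal sketch, assuming the input is a value reported in mg/dL; the function name is hypothetical.

```python
from typing import Optional


def crp_for_analysis(crp_mg_dl: float) -> Optional[float]:
    """Convert CRP from mg/dL to mg/L and apply the > 10.0 mg/L exclusion.

    Returns the value in mg/L, or None when the participant would be
    excluded because the level may reflect acute infection.
    """
    crp_mg_l = crp_mg_dl * 10.0  # 1 dL = 0.1 L
    return crp_mg_l if crp_mg_l <= 10.0 else None
```

Values already reported in mg/L would skip the multiplication and go straight to the threshold check.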

Confounders

Sociodemographic factors, behavioural factors and chronic conditions were included as confounding variables. Confounders were selected based on directed acyclic graphs (DAGs) of the relationships between economic status and dietary pattern score, dietary pattern score and abdominal obesity, and dietary pattern score and systemic inflammation (Supplementary Figs. 1, 2 and 3).

The sociodemographic characteristics were: age (years), sex (male or female), ethnicity (Mexican American, other Hispanic, non-Hispanic white, non-Hispanic black, or other races including multi-racial), marital status (married/living with partner, widowed, divorced/separated, or never married), and education level (less than high school, high school diploma or equivalent, or more than high school). The behavioural factors included smoking status, physical activity, and total energy intake. Smoking status was categorized as: never, former (does not currently smoke but has smoked > 100 cigarettes in lifetime), or current (currently smokes and has smoked > 100 cigarettes in lifetime). Physical activity was evaluated using metabolic equivalent of task (MET)-minutes, which was calculated through multiplying the weekly minutes for each moderate to vigorous activity by its appropriate MET score. Participants were categorized into three groups of physical activity level: low (< 600 MET-minutes per week), moderate (600 to < 1200 MET-minutes per week), or high (≥ 1200 MET-minutes per week). Total energy intake from foods and beverages was computed in kcal/day and then converted and reported in kJ/day using the following equation: 1 kcal = 4.184 kJ.
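The physical activity and energy calculations above amount to a weighted sum, a three-way cut, and a unit conversion. The sketch below assumes each activity is supplied as a (weekly minutes, MET score) pair; the MET scores in the usage example are placeholders, since the specific values assigned to each activity are not listed here.

```python
from typing import Iterable, Tuple


def weekly_met_minutes(activities: Iterable[Tuple[float, float]]) -> float:
    """Sum weekly minutes multiplied by the MET score over all activities."""
    return sum(minutes * met for minutes, met in activities)


def activity_level(met_minutes: float) -> str:
    """Low: < 600; moderate: 600 to < 1200; high: >= 1200 MET-minutes/week."""
    if met_minutes < 600:
        return "low"
    if met_minutes < 1200:
        return "moderate"
    return "high"


def kcal_to_kj(kcal_per_day: float) -> float:
    """Convert energy intake using 1 kcal = 4.184 kJ."""
    return kcal_per_day * 4.184
```

For instance, 150 minutes per week at a placeholder score of 4.0 METs plus 60 minutes at 8.0 METs gives 1080 MET-minutes per week, which falls in the moderate group.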

Diabetes (yes or no), cardiovascular disease (yes or no), and cancer (yes or no) were also included as confounding variables. Diabetes was defined as meeting at least one of the following criteria: a fasting plasma glucose ≥ 126 mg/dL; a random plasma glucose ≥ 200 mg/dL with symptoms and signs present (e.g., diabetic retinopathy); a 2-hour plasma glucose ≥ 200 mg/dL during a 75 g oral glucose tolerance test; and/or a haemoglobin A1c level ≥ 6.5%. Participants could also be classified as having diabetes if they gave a positive response to any of the following questions: “Did a doctor tell you, you have diabetes?”, “Are you taking insulin?”, and/or “Do you take pills to lower blood sugar?” [29]. Participants were also asked to self-report whether they had been diagnosed with cardiovascular disease or cancer by a doctor (yes or no). Models for economic status and dietary patterns were adjusted for sex, age, marital status, ethnicity, educational status, smoking status, physical activity level, total energy intake, diabetes, cardiovascular disease, and cancer. For the association of dietary patterns with waist circumference, models were additionally adjusted for economic status. For the association of dietary patterns with CRP, models were additionally adjusted for economic status and the method of CRP measurement (survey cycles).
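The diabetes definition is an "any of" rule over laboratory criteria and questionnaire responses, which can be sketched as below. The parameter names are hypothetical and do not correspond to NHANES variable names; missing measurements are represented by `None` and simply do not contribute.

```python
from typing import Optional


def has_diabetes(
    fasting_glucose_mg_dl: Optional[float] = None,
    random_glucose_mg_dl: Optional[float] = None,
    has_symptoms: bool = False,
    ogtt_2h_glucose_mg_dl: Optional[float] = None,
    hba1c_percent: Optional[float] = None,
    doctor_diagnosis: bool = False,
    takes_insulin: bool = False,
    takes_glucose_lowering_pills: bool = False,
) -> bool:
    """Return True if at least one laboratory or self-report criterion is met."""

    def at_least(value: Optional[float], threshold: float) -> bool:
        # A missing measurement never satisfies a criterion.
        return value is not None and value >= threshold

    laboratory = (
        at_least(fasting_glucose_mg_dl, 126)
        # The random glucose criterion also requires symptoms and signs.
        or (at_least(random_glucose_mg_dl, 200) and has_symptoms)
        or at_least(ogtt_2h_glucose_mg_dl, 200)
        or at_least(hba1c_percent, 6.5)
    )
    self_report = doctor_diagnosis or takes_insulin or takes_glucose_lowering_pills
    return laboratory or self_report
```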

Statistical analysis

All analyses accounted for the complex survey design using NHANES-assigned dietary data weights, primary sampling units, and strata. Data were downloaded from the Centers for Disease Control and Prevention website [30]. Descriptive analysis of sociodemographic and lifestyle characteristics was performed across quintiles of dietary pattern scores. Characteristics were summarized using mean (standard deviation [SD]) for symmetrically distributed continuous variables, median (interquartile range [IQR]) for skewed continuous variables, and proportion for categorical variables.

Four dietary patterns were identified using RRR. Factor loadings, which are the standardized correlations between food groups and the dietary patterns (factors), were calculated. The proportions of variance in the response variables and food groups explained by each factor and by all factors combined were determined. Participants received a factor score for each dietary pattern, which represents their adherence to that pattern. This was derived as a continuous variable that indicates how closely a participant’s diet approximates the corresponding dietary pattern [2]. Factor scores were divided into quintiles (Q1 [lowest intake], Q2, Q3, Q4 and Q5 [highest intake]) for further analyses. Correlations (response scores) between the response variables and dietary patterns were quantified.
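Dividing factor scores into quintiles can be sketched with NumPy as follows. This version uses unweighted cut points at the 20th, 40th, 60th and 80th percentiles; whether the study derived its cut points with or without survey weights is not stated, so the unweighted choice is an assumption, as is the function name.

```python
import numpy as np


def assign_quintiles(scores):
    """Return an integer array with values 1 (Q1, lowest) to 5 (Q5, highest)."""
    scores = np.asarray(scores, dtype=float)
    # Unweighted cut points; an assumption, since weighting is not specified.
    cuts = np.quantile(scores, [0.2, 0.4, 0.6, 0.8])
    # right=True places a score equal to a cut point in the lower quintile.
    return np.digitize(scores, cuts, right=True) + 1
```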

The cross-sectional associations between economic status and dietary pattern scores were determined using generalized linear models with Gaussian distribution and identity link. The models were adjusted for the potential confounders described above. The associations of dietary pattern scores with central adiposity (represented by waist circumference) and systemic inflammation (represented by CRP) were determined using multivariable generalized linear models with Gaussian distribution and identity link. A supplementary analysis with stratification by sex was performed on the association between dietary pattern scores and central adiposity. The trend of association across quintiles of each dietary pattern was assessed by entering the quintiles as a continuous parameter. All statistical analyses were performed using Stata statistical software version 17.0 (StataCorp, College Station, TX). A Stata module was installed to perform RRR [31].
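A generalized linear model with Gaussian distribution and identity link yields the same point estimates as weighted least squares, so the trend analysis can be illustrated by entering the quintile number (1 to 5) as a single continuous predictor. The sketch below reproduces point estimates only; it does not compute the design-based standard errors that the survey analysis in Stata provides, and the function name is hypothetical.

```python
import numpy as np


def weighted_linear_fit(y, X, weights):
    """Weighted least squares; returns the intercept followed by the slopes.

    y: (n,) outcome; X: (n,) or (n, k) predictors; weights: (n,) positive
    sampling weights. Standard errors are deliberately not computed here.
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(weight) turns the problem into ordinary least squares.
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return beta
```

When the quintile variable is the predictor of interest, its coefficient is the average change in the outcome per one-quintile increase in dietary pattern score, and a test of that coefficient serves as the test for trend.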
