Validation of a Latin-American Spanish version of the Body Esteem Scale for Adolescents and Adults (BESAA-LA) in Colombian and Nicaraguan adults

Methods

Participants

A total of 525 participants completed the questionnaire (Mage = 24.62, SD = 9.95), exceeding the recommended minimum item-to-participant ratio of 1:20 (see [29, 69]). Participants were asked about their gender identity; answer options were ‘female’, ‘male’, ‘other’ (with the option to specify) and ‘prefer not to say’. Overall, 65% of participants identified as female, 34% as male and 0.6% as other or preferred not to say.

Procedure

This project complied with ethical guidelines for research with human participants and received ethical approval from Durham University (PSYCH-2020–08-20T11:47:35-dfls13) and Universidad del Norte (Nº 210). All data were collected using an online questionnaire hosted on Qualtrics™.

Undergraduate psychology students from Universidad del Norte and Corporación Universitaria Reformada in Barranquilla were invited to fill out the questionnaire. Additionally, they received a maximum of 0.5 percentage points on their final exam if they recruited five people to participate. Out of 697 people who opened the questionnaire, 525 participants (75%) answered all questions. Two weeks later, all participants who had completed the questionnaire received an email inviting them to complete it again. Due to low participation in the retest at the two-week interval, participants received another invitation four weeks later. Only 16% of participants (N = 84) filled out the questionnaire a second time.
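The sample-size figures reported in this section can be cross-checked with simple arithmetic. The sketch below uses only counts stated in the text (697 questionnaires opened, 525 completed, 84 retests, 23 BESAA items); the variable names are illustrative and not taken from the authors' analytic code.

```python
# Counts taken from the text; variable names are illustrative only.
opened = 697      # people who opened the questionnaire
completed = 525   # participants who answered all questions
retested = 84     # participants who completed the retest
n_items = 23      # items in the original BESAA

completion_rate = completed / opened
retest_rate = retested / completed
participants_per_item = completed / n_items

print(f"Completion rate: {completion_rate:.1%}")
print(f"Retest rate: {retest_rate:.1%}")
print(f"Participants per item: {participants_per_item:.1f}")
```

With these counts, the number of participants per item exceeds the 20 implied by the recommended minimum ratio.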

Measures

Body Esteem Scale for Adolescents and Adults

The Body Esteem Scale for Adolescents and Adults [48] assesses body satisfaction/dissatisfaction and includes 23 items. Items are answered on a Likert scale from 1 = never to 5 = always, with higher average scores indicating higher body satisfaction.

Professor Beverley Mendelson granted permission to translate the Body Esteem Scale for Adolescents and Adults [48] in January 2022. The translation and cross-cultural adaptation followed the guidelines by Swami and Barron [69] and Kling et al. [35]. First, the items were translated into Spanish by two bilingual speakers (FA, TT). The scale was then back-translated into English by a Colombian bilingual speaker (AC). Any disagreements between translations were resolved through discussion.

In the next step, a committee consisting of two authors (JT, LF) and five students in Colombia confirmed that all wording was culturally appropriate (as suggested by [7]). Feedback on this final translation was then sought from a small group of Colombian students (N = 24). All items were judged to be comprehensible, and no further changes were made.

Body appreciation

The Latin Spanish version of the Body Appreciation Scale-2 (BAS-2; [26, 72]) was used to measure body appreciation. It includes 10 items on a Likert scale from 1 = never to 5 = always (e.g., “I respect my body”). Higher average scores indicate higher body appreciation. The scale showed good reliability in our sample (Cronbach’s α = 0.94). This scale has been used to test the nomological validity of the BESAA in other validation studies (see, e.g., [4, 28]) and we expected positive correlations between BAS-2 and BESAA-LA scores.

Media internalization and sociocultural pressures

The Spanish version of the Sociocultural Attitudes Towards Appearance Scale-4 (SATAQ-4; [40, 64]) was used to measure internalization and sociocultural pressures. The scale consists of five subscales, measuring internalization of the thin ideal, internalization of the athletic ideal, and sociocultural pressures from family, friends, and the media. Participants respond on a Likert scale, where 1 = strongly disagree and 5 = strongly agree, to statements such as “It is important for me to look athletic”. Higher average scores indicate higher internalization and sociocultural pressures. The scale showed good reliability in our sample (Cronbach’s α = 0.92). The SATAQ has been used in another validation study to test discriminant validity [4]. We expected negative correlations between the SATAQ-4 subscales and the BESAA-LA.

Eating restraint

The restrained eating subscale of the Spanish version of the Eating Disorder Examination Questionnaire (EDE-Q; [42, 54]) was used to assess eating disorder symptoms. This subscale contains five items and measures the frequency of eating restraint over the last four weeks on a response scale from 0 = no days to 6 = every day. Higher average scores indicate higher eating disorder symptomatology. Good reliability was confirmed in our sample (Cronbach’s α = 0.86). The EDE-Q has been used in prior studies to test discriminant validity (e.g., [4, 24]). We expected a negative correlation between eating restraint and the BESAA-LA.

Statistical analysis

Analyses were conducted in RStudio 4.0.3 (RStudio Team 2020). The packages lavaan [60] and sem [23] were used for confirmatory factor analysis (CFA) and exploratory factor analysis (EFA). Exploratory structural equation modelling (ESEM) was conducted in Mplus version 8.10 [52]. Analytic code and anonymized data are available on OSF (https://osf.io/q9mj2/).

The fit of previously validated factor structures (i.e., the original, Spanish short, Brazilian, and French male and female versions) was first assessed using confirmatory factor analysis (CFA). As these factor structures did not fit our data satisfactorily, exploratory factor analysis (EFA) was conducted with a random half of the data. Parallel analysis [30], the Guttman-Kaiser criterion [27, 33], and the scree plot were used to decide how many factors to retain during EFA. The following retention criteria were applied (as suggested by [69]): (a) each factor needed to contain at least 3 items, (b) the factor loading for each item needed to be at least 0.35, (c) each factor needed to explain at least 5% of the total variance, and (d) items with loadings of > 0.33 on more than one factor were excluded. The newly found structure was then confirmed using CFA on the second half of the data. We performed oblique rotation and used the robust weighted least squares estimator (WLSMV), as recommended for categorical variables [51, 56]. Additionally, exploratory structural equation models (ESEM) were used to test model fit. When cross-loadings are present, ESEM has advantages over CFA because it does not force factor loadings on secondary factors to 0 [56]. A target rotation was specified based on the factor structure derived from EFA; that is, we specified the factor on which each item was expected to load, without restricting cross-loadings to 0. We followed the code provided by Prokofieva et al. [56].
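Item-level criteria (b) and (d) above can be illustrated with a short sketch. This is not the authors' code (their analyses were run in R and Mplus); the function name, the category labels and the example loadings are hypothetical, and only the two cut-offs (0.35 and 0.33) come from the text.

```python
from typing import List

MIN_PRIMARY_LOADING = 0.35   # criterion (b) in the text
CROSS_LOADING_CUTOFF = 0.33  # criterion (d) in the text


def classify_item(loadings: List[float]) -> str:
    """Classify one item from its row of factor loadings.

    Returns 'weak' if no loading reaches the primary cut-off,
    'cross-loading' if more than one loading exceeds the cross-loading
    cut-off, and 'retain' otherwise.
    """
    magnitudes = [abs(value) for value in loadings]
    if max(magnitudes) < MIN_PRIMARY_LOADING:
        return "weak"
    salient = sum(1 for m in magnitudes if m > CROSS_LOADING_CUTOFF)
    if salient > 1:
        return "cross-loading"
    return "retain"


# Hypothetical loadings on three factors (not values from the study).
example_items = {
    "item_a": [0.72, 0.10, 0.05],
    "item_b": [0.20, 0.15, 0.25],
    "item_c": [0.45, 0.40, 0.02],
}

for name, row in example_items.items():
    print(name, classify_item(row))
```

Criteria (a) and (c) operate at the factor level and would need to be checked once the retained items and the variance explained by each factor are known.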

The following fit indices were used to evaluate model fit, as recommended by Hu and Bentler [31]: Root Mean Square Error of Approximation (RMSEA; values less than 0.06 are considered good, and values between 0.07 and 0.10 are acceptable); Standardized Root Mean Square Residual (SRMR; values below 0.08 are good, values between 0.09 and 0.10 are acceptable); Tucker-Lewis Index and Comparative Fit Index (TLI and CFI; values higher than 0.95 suggest close fit for both, values between 0.90 and 0.94 are acceptable).
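The cut-offs above can be written as a small lookup, sketched here under one labelled assumption: the text leaves small gaps between the "good" and "acceptable" bands (for example, RMSEA between 0.06 and 0.07), and this sketch treats values in those gaps as acceptable. The function name `rate_fit` is hypothetical.

```python
def rate_fit(index: str, value: float) -> str:
    """Rate a fit index as 'good', 'acceptable' or 'poor'.

    Cut-offs follow the text; values falling in the unstated gaps between
    the 'good' and 'acceptable' bands are treated as acceptable (assumption).
    """
    name = index.upper()
    if name == "RMSEA":
        good, acceptable = value < 0.06, value <= 0.10
    elif name == "SRMR":
        good, acceptable = value < 0.08, value <= 0.10
    elif name in ("CFI", "TLI"):
        good, acceptable = value > 0.95, value >= 0.90
    else:
        raise ValueError(f"Unknown fit index: {index}")
    if good:
        return "good"
    return "acceptable" if acceptable else "poor"


# Hypothetical values, not results from the study.
for index, value in [("RMSEA", 0.05), ("SRMR", 0.09), ("CFI", 0.88)]:
    print(index, value, rate_fit(index, value))
```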

Internal consistency was assessed using Cronbach’s alpha coefficient (α; [20]) and McDonald’s Omega (ω; [44]). Alpha and omega values above 0.8 indicate good reliability (see [38]). Convergent and discriminant validity were evaluated using Pearson correlations. Test–retest reliability was assessed using the intraclass correlation coefficient (ICC; [3]), for which values above 0.75 are considered acceptable [55].
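For readers unfamiliar with the coefficient, Cronbach's alpha can be computed from item-level responses as α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below is a minimal pure-Python illustration with hypothetical responses; it is not the authors' R code, and McDonald's omega is not covered because it requires a fitted factor model.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(responses[0])                  # number of items
    item_columns = list(zip(*responses))   # transpose to items-by-respondents
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: five respondents, three items on a 1-5 scale.
example: List[List[int]] = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]

print(f"alpha = {cronbach_alpha(example):.2f}")
```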

A series of multi-group CFA models was fitted to assess configural, metric and scalar invariance across men and women, that is, whether the general factor structure is the same, whether the item loadings are equal, and whether the item intercepts are equal, respectively. This process involves sequentially fitting three models, each adding constraints to the previous one. The fit of the models is compared, and if fit does not deteriorate in the more constrained model, invariance at the given level can be assumed. To confirm invariance, the chi-square difference test should be nonsignificant and differences in CFI, RMSEA and SRMR between the two models (configural vs. metric; metric vs. scalar) should remain small (ΔCFI < 0.01 and ΔRMSEA < 0.015 or ΔSRMR < 0.030 are considered sufficient [14]). However, the chi-square difference test is sensitive to sample size (see [69]), so invariance can still be assumed as long as the differences in CFI, RMSEA and SRMR remain small.
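The comparison of nested invariance models can be sketched as follows. The grouping of the criteria is an assumption: the sketch reads the rule as ΔCFI < 0.01 combined with either ΔRMSEA < 0.015 or ΔSRMR < 0.030. The class and function names, as well as the fit values, are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Fit indices of one multi-group CFA model."""
    cfi: float
    rmsea: float
    srmr: float


def invariance_supported(less_constrained: Fit, more_constrained: Fit) -> bool:
    """Apply the delta-fit rule to two nested models.

    Assumed reading of the rule: the change in CFI must be below 0.01 AND
    (the change in RMSEA below 0.015 OR the change in SRMR below 0.030).
    """
    d_cfi = abs(less_constrained.cfi - more_constrained.cfi)
    d_rmsea = abs(less_constrained.rmsea - more_constrained.rmsea)
    d_srmr = abs(less_constrained.srmr - more_constrained.srmr)
    return d_cfi < 0.01 and (d_rmsea < 0.015 or d_srmr < 0.030)


# Hypothetical fit values for three nested models.
configural = Fit(cfi=0.950, rmsea=0.060, srmr=0.050)
metric = Fit(cfi=0.948, rmsea=0.061, srmr=0.055)
scalar = Fit(cfi=0.930, rmsea=0.080, srmr=0.090)

print("Metric invariance:", invariance_supported(configural, metric))
print("Scalar invariance:", invariance_supported(metric, scalar))
```

In practice each comparison is made against the immediately preceding model (configural vs. metric, then metric vs. scalar), as described above.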

Results

Confirmatory factor analysis with previously validated factor structures

The data did not show multivariate normality (skewness p < 0.001, kurtosis p = 0.39), as assessed by the Mardia test [43]. A table with descriptive statistics is provided in Additional file 1: S1. None of the previously validated factor structures (original, French female and male versions, Brazilian version, and Spanish short version) showed adequate fit for our data, although the Spanish short version came closest (see Table 1).

Table 1 Goodness of fit indices from confirmatory factor analysis using factor structures of other validation studies

Exploratory factor analysis

Exploratory factor analysis was performed with a random half of the sample (N = 275). Both parallel analysis and the Guttman-Kaiser criterion (3 eigenvalues were greater than 1: 9.46, 2.26, 1.42; fourth eigenvalue = 0.70) suggested a three-factor structure. In addition, the fit of the unidimensional model was assessed (see Table 3). The three-factor solution showed the best fit for the data, according to relative and absolute fit indices. Item 5 (“I think my appearance would help me get a job”) showed poor factor loading and was subsequently removed. Four items (item 4 “I am preoccupied with trying to change my body weight”; item 13 “My looks upset me”; item 17 “I feel ashamed of how I look”; item 18 “Weighing myself depresses me”) showed loadings above 0.33 on more than one factor and were therefore excluded. The subscales of the resulting 18-item version were named ‘appearance-positive’, ‘appearance-negative’ and ‘weight’. All factor loadings derived from EFA are shown in Table 2. CFA with the second random half of the sample showed acceptable fit of the 3-factor version with 18 items (see Table 3 for all results).
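The Guttman-Kaiser count can be reproduced from the eigenvalues reported above. The sketch below also divides each retained eigenvalue by the number of items as a rough indication of the share of total variance; this assumes the eigenvalues come from the correlation matrix of all 23 items and is only an approximation of the variance explained by the rotated EFA factors. Variable names are illustrative.

```python
# First four eigenvalues as reported in the text.
reported_eigenvalues = [9.46, 2.26, 1.42, 0.70]
n_items = 23  # assumption: eigenvalues refer to the full 23-item correlation matrix

# Guttman-Kaiser criterion: retain factors with eigenvalues greater than 1.
n_factors = sum(1 for ev in reported_eigenvalues if ev > 1.0)

# Approximate share of total variance per retained factor (eigenvalue / items).
variance_share = [ev / n_items for ev in reported_eigenvalues[:n_factors]]

print("Factors retained:", n_factors)
for position, share in enumerate(variance_share, start=1):
    print(f"Factor {position}: {share:.1%} of total variance (approximate)")
```

Under this approximation, each of the three retained factors clears the 5% variance criterion described in the Statistical analysis section.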

Table 2 Factor loadings of the 18-item version of the BESAA-LA derived from EFA, with item labels in Spanish and English

Table 3 Fit indices from EFA, CFA and ESEM for the 23- and 18-item versions of the BESAA-LA

ESEM

To confirm factor structure, ESEM analysis was conducted using target rotation (based on EFA structure) in Mplus. Fit indices improved compared to CFA analysis and indicated good fit of the factor structure with the three subscales ‘appearance-positive’, ‘appearance-negative’ and ‘weight’. See Table 3 for fit indices and Additional file 2: S2 for all factor loadings derived from ESEM.

Reliability and validity

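The 18-item BESAA-LA total scale and the ‘appearance-positive’ and ‘weight’ subscales showed good internal reliability, whereas the ‘appearance-negative’ subscale fell slightly below the 0.8 threshold (total: α = 0.92, ω = 0.94; ‘appearance-positive’: α = 0.92, ω = 0.94; ‘appearance-negative’: α = 0.74, ω = 0.78 and ‘weight’: α = 0.89, ω = 0.90).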

Convergent and discriminant validity were assessed using zero-order correlations. BAS, EDE, SATAQ thin-ideal internalization, athletic internalization and pressures all showed significant correlations with the BESAA-LA total scale and subscales in the predicted directions. The BAS showed a strong significant positive correlation with the BESAA-LA and all its subscales, confirming convergent validity. Moderate significant negative correlations of the BESAA-LA with the thin internalization and pressure subscales of the SATAQ, as well as small significant negative correlations with the athletic internalization subscale (except for the positive and weight subscales), confirmed discriminant validity. Eating restraint showed a moderate significant negative relationship with the BESAA-LA. See Table 4 for all results.

Table 4 Zero-order correlation coefficients (Pearson’s r) between BESAA-LA, BAS, SATAQ and EDE

Temporal stability

Although only 84 participants (16% of our original Colombian sample) responded to the questionnaire a second time, intra-class correlation analysis suggested good test–retest reliability (ICC = 0.754, p < 0.001, 95% CI 0.645 to 0.833), providing some support for the temporal stability of the BESAA-LA. ANOVAs indicated that these participants did not differ significantly from non-completers in terms of age (F(1, 523) = 2.908, p = 0.089), gender (F(1, 523) = 0.75, p = 0.387), or baseline BESAA-LA scores (F(1, 523) = 0.177, p = 0.674).

Measurement invariance across gender

A series of multi-group CFAs was conducted to check for invariance across gender. Table 5 shows the fit of the model for men and women, followed by tests of configural, metric and scalar invariance. The configural invariance test confirmed that the factor structure does not differ between men and women. Metric invariance uses a more restricted model with equal factor loadings between groups. The chi-square difference test between the configural and metric models was nonsignificant (p = 0.195) and fit indices showed minimal changes, indicating metric invariance. To test scalar invariance, a model with equal intercepts imposed for men and women was created. Compared to the metric model, the chi-square difference test was significant (p < 0.001), but changes in CFI, RMSEA and SRMR remained small, confirming scalar invariance of the BESAA-LA in men and women in Colombia (see Table 5).

Table 5 Configural, metric and scalar measurement invariance of the BESAA-LA across men and women in the Colombian sample

Interim discussion

The aim of Study 1 was to translate and culturally adapt the Body Esteem Scale for Adolescents and Adults in a Colombian adult sample. The 18-item BESAA-LA showed good internal reliability, temporal stability and validity. As expected, the BESAA-LA showed a positive correlation with body appreciation, a construct that is related to body satisfaction, but that incorporates a broader range of factors, such as body functionality and self-love [73]. Significant moderate negative correlations were shown with the SATAQ subscales thin ideal internalization and sociocultural pressures, whereas the athletic internalization subscale showed significant, but small negative correlations with the BESAA-LA. This is in line with other studies that found lower correlations between the BESAA and athletic subscale, compared to other SATAQ subscales [4].

When assessing factorial validity, the original three-factor structure could not be reproduced. However, our three-factor solution with the subscales ‘appearance-positive’, ‘appearance-negative’ and ‘weight’ resembles the Brazilian structure [65]. This resemblance is not surprising, as these Latin American countries share geographical proximity, as well as a focus on appearance and beauty, which is reflected in the high incidence of plastic surgery per capita in both Colombia and Brazil (International Association for Aesthetic/Cosmetic surgery survey 2020). Conceptually, the two subscales ‘appearance-positive’ and ‘appearance-negative’ support the proposition that positive and negative body image might be two separate constructs, rather than the ends of a continuum [65, 77]. The third subscale, ‘weight’, has been found consistently across all validation studies conducted so far (original validation [48], Iceland [32], Italy [17], Turkey [2], India [24], Spain [4] and Brazil [65]).

One item (item 5; “I think my appearance would help me get a job”) showed poor factor loading and was therefore excluded. This is in line with the French versions for men and women, where this item was excluded due to poor factor loading [61, 74]. Four items were excluded due to high cross-loadings. Items 4 and 18 (“I am preoccupied with trying to change my body weight” and “Weighing myself depresses me”) loaded on both the ‘weight’ and ‘appearance-negative’ subscales, which makes conceptual sense, as both items are negative statements referring to weight. On the other hand, items 13 and 17 (“My looks upset me” and “I feel ashamed of how I look”) loaded on both the ‘appearance-positive’ and ‘appearance-negative’ subscales. This could be due to the nature of these two items: whereas most items in the ‘appearance-negative’ subscale focus on wanting to look better or like someone else, these two items assess specific negative emotional reactions to one’s own physical appearance. During data collection, several participants commented that these questions had felt unusually emotionally intrusive compared to the rest of the questionnaire, which may explain the unusual loading pattern.
