Testing the accuracy of the DRNNAGE software for age estimation in a modern Greek sample

Any forensic anthropological method must be tested for repeatability, reproducibility, and accuracy before its use is generalized and it becomes admissible in legal contexts [19, 20]. According to the DRNNAGE software developers [14], all skeletal traits employed in their method presented a very high (average value of 0.907) and statistically significant concordance coefficient for intra-observer error, the only exceptions being the “Radius head” (RD01) and “Femur head” (FM01). In contrast, the average intra-observer concordance coefficient in our study was 0.717 for the first observer and 0.748 for the second. The coefficient values varied greatly among traits; thus, certain traits should be favored, while others should be used cautiously or avoided altogether.

Inter-observer error, as expected, showed an even smaller concordance coefficient (average value 0.615). Once again, some traits showed much higher reproducibility than others and should thus be preferred, while others should be avoided. At this point, we must stress that DRNNAGE involves many binary-coded traits. Therefore, we would have expected better reproducibility and repeatability results, since methods using a narrower scale of categories produce greater agreement among researchers [20].

The use of different network algorithms to predict age-at-death had a minimal impact on the results, though in our sample, the ensemble autoencoder S network showed higher bias and inaccuracy values, so any of the remaining three options should be preferred.
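For readers less familiar with these two measures, the sketch below shows how bias and inaccuracy are conventionally computed in age-estimation validation studies: bias as the mean signed error (estimated minus documented age) and inaccuracy as the mean absolute error. This is a minimal illustration under those conventional definitions, not the code used in this study; the function name and the example ages are hypothetical.

```python
from statistics import mean


def bias_and_inaccuracy(estimated, actual):
    """Return (bias, inaccuracy) for paired age estimates, in years.

    Assumed conventional definitions:
      bias       = mean(estimated - actual), the direction of the error
      inaccuracy = mean(|estimated - actual|), the size of the error
    """
    if len(estimated) != len(actual) or len(estimated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [e - a for e, a in zip(estimated, actual)]
    bias = mean(errors)
    inaccuracy = mean([abs(x) for x in errors])
    return bias, inaccuracy


# Hypothetical estimated and documented ages for four individuals.
estimated_ages = [45.0, 62.0, 71.0, 38.0]
documented_ages = [40.0, 70.0, 75.0, 30.0]
bias, inaccuracy = bias_and_inaccuracy(estimated_ages, documented_ages)
print(f"bias = {bias:.2f} years, inaccuracy = {inaccuracy:.2f} years")
```

A bias near zero can coexist with a large inaccuracy when over- and underestimates cancel out, which is why the two values are reported together.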

The validity of DRNNAGE was overall average in the modern Greek assemblage, both for males and females. Interestingly, the software showed high validity for individuals older than 50 years, which is often a problematic category when using “traditional” skeletal age-at-death estimation methods, whereas the results were very poor for those younger than 50. Surprisingly, the cranial sutures exhibited the highest validity, even for younger adults. Among the remaining anatomical areas, those with high validity included the clavicle and first rib, the acetabulum, and the pubic symphysis, while the lowest validity was achieved by the vertebrae. Similarly, the greatest bias and inaccuracy were found in the vertebrae, and the lowest in the clavicle and first rib. The above observations are corroborated by the analyses performed on the sample segmented by decade. However, it is important to note that the sample size for the group of individuals older than 90 years is very small and the respective results should be interpreted with caution.

These results are partly in agreement with another validation study performed on the Athens Collection, employing traditional age-at-death estimation methods focused on the pubic symphysis, iliac auricular surface, and cranial sutures [13]. Specifically, Xanthopoulou and colleagues found that the iliac auricular surface, when recorded using the Lovejoy et al. [21] method, works satisfactorily for all age groups; cranial sutures and the pubic symphysis were found to perform satisfactorily for individuals younger than 50 years but poorly for older ones, while the iliac auricular surface recorded using the Buckberry and Chamberlain [22] method gave the most accurate results for individuals older than 50 years. The differences in the performance of the DRNNAGE software compared to these traditional methods, even when focusing on the same anatomical areas, must be attributed to the different ways in which skeletal changes are recorded in each method, but also to their different statistical treatment for age prediction.

The vertebrae, and more specifically the fusion of the superior and inferior epiphyses, have been established and popularized over the years as a viable method of skeletal age estimation in teenagers and young adults [23,24,25]. As a rather recent example, Albert et al. [26] achieved over 78% classification accuracy when studying 57 individuals aged 14–27 years. For older adults, several studies have shown that osteophyte formation could be useful for estimating the age-at-death [27,28,29]. Very recently, Sluis and colleagues [30] tested three methods based on osteophyte formation on 88 individuals from the Middenbeemster cemetery and achieved over 72.73% classification accuracy. The DRNNAGE scoring system for vertebrae covers the whole spectrum from the fusion of the epiphyseal ring to the formation of lipping. However, all these major changes from ring fusion to lipping are covered within merely three stages, which do not sufficiently capture the different degrees of lipping that are anticipated in middle-aged and older adults. Therefore, the low validity and high bias and inaccuracy observed in our study may be due to the age-at-death distribution of the Athens Collection being skewed towards older people.

The cranial suture closure pattern has been studied as a potential age-at-death predictor for nearly a century [31]. Several studies since then have proposed variants of different recording schemes for age prediction based on different sutures and suture combinations [32,33,34]. In parallel, numerous validation studies have stressed the poor performance of this method (e.g. [35]). The high validity in age estimation achieved in the Athens Collection for individuals younger than 69 years old via DRNNAGE is thus surprising; however, it also aligns with a recent review that stressed the potential of this anatomical area as a useful indicator for age estimation [36].

With regard to the other anatomical areas that showed high accuracies, Kunos et al. [37] were the first to use the first rib for age-at-death estimation because it is easily identifiable, is not influenced by mechanical stress in the same manner as the lower ribs, and exhibits a prolonged span of remodeling into the eighth decade. In a recent study on 260 skeletons from the Raymond A. Dart Collection of Human Skeletons, Jooste and Steyn [38] concluded that the first rib can be used to make age-at-death predictions but should ideally be used in combination with other skeletal traits. The DRNNAGE software combines the first rib with the clavicle, which has the potential to aid age estimates beyond the traditional “mature adult” age category (> 46 years) [39], while its usefulness in providing precise age estimations between the ages of 16 and 30 years has been identified by several studies [2, 40,41,42,43]. Therefore, our results from the Athens Collection, showing high age-group classification rates for the clavicle and first rib, are not surprising.

The acetabulum and pubic symphysis were the remaining two anatomical areas that gave overall high validity values in the Athens Collection. The adult human pelvis has been among the most useful areas for age-at-death estimation and contains different anatomical structures that have been used for this purpose: pubic symphysis, auricular surface, and acetabulum. Bony degenerative changes in these regions have been shown to correlate with age [44,45,46]. Although several studies have demonstrated that relevant methods can most commonly support age estimates between the late teens and 50–60 years, where the observations of the progressive degenerative changes reach their peak breakdown and plateau [21, 22, 47, 48], in the present study the DRNNAGE software performs better for individuals over 49 years old. Regarding epiphyseal union in the upper and lower limbs, this has been established as a viable method of skeletal age estimation in teenagers and young adults [42]. For more mature adults, the most commonly used age estimation methods based on the upper and lower limbs focus on degenerative changes on the articular surfaces [21, 47,48,49]. This latter approach is also the one followed in DRNNAGE. The upper and lower limbs showed a generally moderate validity in skeletal age estimation for individuals aged from 19 to 59 years in the Athens Collection, with the exception of the upper limb performance for the age group 19–29 years. This may be linked to the fact that articular changes (usually osteophytes and porosity) are strongly associated not only with age but also with mechanical stress linked to daily occupations, body weight, and other factors [50].

Finally, for a more direct comparison with the performance of the DRNNAGE software as reported by its developers [14], the variable combinations described in their work were also tested. The results of these analyses are provided in Supplementary Material (Table S12 and Figure S1). According to the comparison, the anatomical regions of the sutures, the 1st rib, and the pubic symphysis showed similar validity values in both studies, while major differences in validity were observed for the variable combinations regarding axial, appendicular, sacroiliac, and standard traits. Specifically, the DRNNAGE models severely underperformed in the Athens Collection. Similarly, the anatomical regions of the clavicle and the acetabulum underperformed in the Athens Collection sample; however, the observed differences in validity values were moderate.

Although it has often been argued in the literature that using a multivariate approach for skeletal age estimation is more appropriate than using any single method alone [51, 52], in the present study the lowest classification rate was obtained when combining all available anatomical regions. It is well known that the aging process is controlled by various internal and external factors [6, 7], which affect different anatomical areas differently, and this can become a source of bias in age estimation. Furthermore, the validity of skeletal age estimation can be affected by variation in the rate of senescence within and between individuals and populations [53]. The DRNNAGE software was trained utilizing skeletal collections hosted at the University of Coimbra (CISC, XXI-ISC), which are composed of individuals of Portuguese ancestry. Furthermore, the age-at-death distribution of the reference sample used by the DRNNAGE developers was homogeneous across the represented age-at-death span, whereas the proportion of older individuals is higher in the Athens Collection. Therefore, the validity loss observed could be attributed to either population specificity or the different age-at-death distributions of the utilized samples.

In conclusion, the DRNNAGE software produced partly accurate age-at-death predictions in a modern Greek assemblage. This method was particularly successful for males and females older than 50 years, but it performed poorly for those younger than this threshold, with the sole exception of cranial suture closure. Moreover, different anatomical areas showed very different repeatability, reproducibility, and validity. Further evaluation studies in different assemblages are necessary in order to test the performance of this software more broadly.
