The methodological possibilities for chronological age prediction of a deceased person depend on the availability of biological material. Skull bone is among the most commonly found bone types, and the age-dependent accumulation of molecular markers like Pen and D-Asp as well as changes in the DNAm level could therefore be useful for age prediction [37]. As interindividual variations are known for all chronological age markers [47, 48], a combined analysis could be beneficial. For this purpose, we investigated age-dependent changes in these three molecular clocks, developed age prediction RR models, investigated the improvement of age prediction by combination, and examined whether samples from individuals with signs of decomposition can also be analyzed. To achieve this, parietal bones from deceased individuals without and with signs of decomposition were collected, the DNAm of 6 CpG sites in ELOVL2, KLF14, PDE4C, RPA2, TRIM59, and ZYG11A was analyzed using MPS, and the amount of D-Asp and Pen was determined by HPLC. To the best of our knowledge, this is the first study to examine age prediction using DNAm, D-Asp and Pen in bone samples at different stages of decomposition.
Age-dependent changes in D-Asp, Pen and DNA methylation
The samples included in dataset 1 (n = 98) were used to characterize the age-dependent molecular clocks D-Asp, Pen and DNAm (i.e., included CpG sites) in bone (cf. Figure 1). One limitation was that the Pen accumulation during life was below the detectable threshold for some individuals under the age of 18 years.
Age dependence was verified in bone with ρ > 0.8 for all markers; however, the age-dependent correlation for D-Asp (ρ = 0.86) was lower than in previous studies using highly bradytrophic and homogeneous tissues, such as dentine (Pearson r = 0.96–0.99 [9, 12, 13, 49]). The age-dependent correlation for Pen was comparable to results for dentine (Pearson r = 0.94 [12, 15]).
Differences in the results for the protein parameters in total dentine and total bone can be explained by very different turnover rates in these tissues. Although dentine and bone share structural and functional similarities such as the collagen matrix and the mineral content, mature dentine is, after its initial formation during tooth development, a very bradytrophic tissue with (almost) no turnover. Its protein composition therefore stays largely unchanged [50], resulting in a close relation between D-Asp and Pen levels and age [12], even when total tissue is analyzed.
Bone tissue, on the other hand, undergoes constant remodeling through a balanced process of old bone resorption and new bone replacement. This process is described by the bone turnover rate (% rebuilt bone per year) and is influenced by, e.g., diseases, stress, overall fitness, and hormonal influences [51]. The turnover rate depends on bone type and is highest at sites where trabecular bone predominates and lowest at sites with a lot of cortical bone [4, 5]. In our study, we investigated skull bone samples having a higher density of cortical bone material and therefore a lower turnover rate [5]. However, with increasing age, the bone structure and metabolism change, resulting in loss of bone mass, decreasing thickness and osteoporosis [16, 18, 26]. It can therefore be assumed that the composition of the organic bone protein matrix varies, especially in old age, with each protein having its own kinetics for the accumulation of D-Asp and Pen, depending on its structure and metabolism. Consequently, variation in the protein composition through changes in bone metabolism as well as degradation of a total bone protein sample strongly impacts the D-Asp and Pen content of a total protein mixture. The only solution is the purification of individual long-living proteins. The analysis of D-Asp in purified osteocalcin from skull bones supports this theory with a very high correlation between D-Asp content in purified osteocalcin and age (Pearson r = 0.99 [52]). So far, only osteocalcin has been identified as a suitable bone protein; however, its purification is very challenging. The identification of further suitable proteins and the establishment of practicable methods for protein purification is an important research goal. We confirmed the results from our pilot study (D-Asp ρ = 0.9; Pen ρ = 0.9) investigating bone samples from 15 individuals [37].
It should be noted that comparisons between studies are limited, as sample ranges, the age composition of the dataset, and the correlation coefficient used (Spearman’s ρ or Pearson’s r) can differ between studies.
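To illustrate why the choice of correlation coefficient limits direct comparison, the following minimal sketch computes Spearman's ρ and Pearson's r for the same data. The marker values are invented for illustration and are not study data; the function names are hypothetical. For a marker that rises monotonically but not linearly with age, the two coefficients can differ noticeably.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Invented example (NOT study data): monotonic but non-linear increase with age.
ages = [20, 30, 40, 50, 60, 70, 80]
marker = [1.0, 1.2, 1.5, 2.1, 3.2, 5.0, 8.5]

rho = spearman(ages, marker)  # rank-based, insensitive to the curvature
r = pearson(ages, marker)     # penalizes the deviation from a straight line
print(f"Spearman rho = {rho:.2f}, Pearson r = {r:.2f}")
```

In this toy example ρ is higher than r because the relation is perfectly monotonic but curved, which is one reason why values reported with different coefficients should be compared with care.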
The observed Spearman’s correlation values for the DNAm markers (ρ = 0.87–0.93) were also within the ranges of the pilot study (ρ = 0.9–0.93 [37]), although the final ‘best’ CpG site was not the same in all cases. It would also be possible to use neighboring CpG sites as an alternative, as these often had similar correlation values. Slight fluctuations can be caused by the sample size and age composition under study. Other studies have also analyzed DNAm in bone samples and revealed age-dependent changes [22, 38, 39, 53, 54]. Furthermore, the six genomic regions show age-dependent correlation in a wide range of other tissues and are (with different intensity) implemented in multiple age prediction models [30]. Although DNAm was a more accurate marker for age prediction, inter-individual variation increased with age, and outliers occurred. The observed age dependence may (partly) also be explained by changes in metabolism and turnover with increasing age caused by possible shifts in cell-type composition and cell function in dependence on the above-mentioned factors.
Age prediction based on three biological age estimators
For the development of the RR models, only samples from individuals aged 18 years and above were used as the training dataset (n = 86). Furthermore, two independent test datasets (individuals without signs of decomposition: n = 44, individuals with signs of decomposition: n = 48) were used to test the RR models. For the training data, the RR models based on Pen and D-Asp in total protein samples yielded, using CV, a mean MAE/mean RMSE of 9.66 years/11.52 years (Pen) and 11.91 years/14.57 years (D-Asp). The results for D-Asp and Pen are less accurate than the data reported for dentine and purified osteocalcin from skull bone [12, 15, 52]. However, the known methodological approaches for purifying osteocalcin are very complex and can currently hardly be used in forensic practice. Given this context, total protein samples were analyzed here. The DNAm approach led to a lower mean MAE/mean RMSE of 4.95 years/6.89 years and was therefore more accurate than the protein-based parameters (∆MAE of 5 years and 7 years). These results were confirmed by the independent test set (cf. Table 1). In a previous study based on dentine, age prediction models were developed that led to MAEs of 2.93 years for D-Asp and 3.41 years for Pen [12]. First DNAm age prediction models for bone have also been developed; e.g., the study of Woźniak et al. (2021) reported MAEs of 3.3 years and 3.4 years in the training and test datasets, respectively, based on the analysis of occipital and femoral bone material [38]. Differences in the MAEs compared to previous studies may be partly due to the higher ages included in our study. The studies mentioned before included samples from individuals under 80 years (with the exception of one training sample in the study of Woźniak et al. (2021)) [22, 38, 39, 53, 54].
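As an illustration of the general approach, ridge regression evaluated by cross-validation with MAE and RMSE averaged over the folds can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: the penalty value `lam`, the number of folds `k`, the interleaved fold assignment, the two toy predictors and all function names are hypothetical, and the data are invented.

```python
from math import sqrt

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ridge(X, y, lam):
    """Ridge fit on centered data; returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    mx = [sum(row[j] for row in X) / n for j in range(p)]
    my = sum(y) / n
    Xc = [[row[j] - mx[j] for j in range(p)] for row in X]
    yc = [v - my for v in y]
    # Penalized normal equations: (Xc'Xc + lam * I) coef = Xc'yc
    A = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    coef = solve(A, rhs)
    intercept = my - sum(c * m for c, m in zip(coef, mx))
    return intercept, coef

def predict(model, X):
    intercept, coef = model
    return [intercept + sum(c * v for c, v in zip(coef, row)) for row in X]

def cross_validate(X, y, lam, k):
    """k-fold CV with interleaved folds; returns (mean MAE, mean RMSE)."""
    maes, rmses = [], []
    for fold in range(k):
        test_idx = [i for i in range(len(y)) if i % k == fold]
        train_idx = [i for i in range(len(y)) if i % k != fold]
        model = fit_ridge([X[i] for i in train_idx], [y[i] for i in train_idx], lam)
        pred = predict(model, [X[i] for i in test_idx])
        err = [p - y[i] for p, i in zip(pred, test_idx)]
        maes.append(sum(abs(e) for e in err) / len(err))
        rmses.append(sqrt(sum(e * e for e in err) / len(err)))
    return sum(maes) / k, sum(rmses) / k

# Invented toy data (NOT study data): two marker-like predictors for 14 ages.
ages = [float(a) for a in range(20, 90, 5)]
X = [[0.2 + 0.008 * a, 0.9 - 0.004 * a + 0.02 * ((i % 3) - 1)]
     for i, a in enumerate(ages)]

cv_mae, cv_rmse = cross_validate(X, ages, lam=0.01, k=7)
print(f"mean MAE = {cv_mae:.2f} years, mean RMSE = {cv_rmse:.2f} years")
```

In practice, predictors are usually standardized before a ridge penalty is applied and the penalty is tuned within the cross-validation; both steps are omitted here for brevity.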
In our study, all prediction models led to an increased MAE and RMSE in the older age groups, with a particularly strong decrease in accuracy in the 80+ years age category (cf. Suppl. Table S3). As with other age prediction models based on molecular markers, the higher uncertainty for older individuals should be considered. For better interpretation of the obtained results, reporting age group-dependent model evaluation parameters as presented in Suppl. Table S3 can therefore be helpful. In addition, information such as the percentage of correct predictions within a case-dependent useful interval (e.g., 61.4% within ± 5 years) could be added.
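Such age group-dependent reporting can be sketched as follows. This is a minimal illustration, assuming ages in whole years; the age group boundaries (18, 40, 60, 80), the function names, and the example ages are hypothetical and do not reproduce Suppl. Table S3.

```python
from math import sqrt

def summarize(true_ages, predicted_ages, tolerance=5.0):
    """MAE, RMSE and percentage of predictions within +/- tolerance years."""
    errors = [p - t for t, p in zip(true_ages, predicted_ages)]
    n = len(errors)
    return {
        "n": n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": sqrt(sum(e * e for e in errors) / n),
        "pct_within": 100.0 * sum(abs(e) <= tolerance for e in errors) / n,
    }

def group_label(age, bounds):
    """Label such as '18-39' or '80+'; None if age is below the first bound."""
    lowers = [b for b in bounds if b <= age]  # bounds must be sorted ascending
    if not lowers:
        return None
    idx = len(lowers) - 1
    if idx == len(bounds) - 1:
        return f"{bounds[idx]}+"
    return f"{bounds[idx]}-{bounds[idx + 1] - 1}"

def summarize_by_age_group(true_ages, predicted_ages,
                           bounds=(18, 40, 60, 80), tolerance=5.0):
    """Apply summarize() separately to each (hypothetical) age group."""
    grouped = {}
    for t, p in zip(true_ages, predicted_ages):
        label = group_label(t, bounds)
        if label is not None:
            grouped.setdefault(label, ([], []))
            grouped[label][0].append(t)
            grouped[label][1].append(p)
    return {label: summarize(ts, ps, tolerance)
            for label, (ts, ps) in grouped.items()}

# Invented example (NOT study data).
true_ages = [22, 35, 47, 58, 66, 74, 83, 91]
predicted_ages = [24, 31, 50, 52, 70, 66, 72, 78]

overall = summarize(true_ages, predicted_ages)
by_group = summarize_by_age_group(true_ages, predicted_ages)
for label, stats in by_group.items():
    print(label, stats)
```

Reporting the share of predictions within a fixed interval alongside MAE and RMSE makes the loss of accuracy in the oldest group directly visible, which a single overall value would hide.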
Advantage of using combined models
The combination of Pen and D-Asp for the development of an RR model increased the overall accuracy (training set CV: MAE 8.55 years, RMSE 10.18 years; test set: MAE 7.16 years, RMSE 9.16 years). The usefulness of this approach has already been demonstrated by combining the D-Asp and Pen content for age prediction in dentine, which decreased the MAEs from 2.93 years (D-Asp) and 3.41 years (Pen) to 2.68 years (combined) [12]; the same effect was observed for more complex tissues such as intervertebral discs and epiglottis [33]. Considering the single-molecular clock models, the accuracy (evaluated as MAE/RMSE) of the DNAm-based model was superior to that of the protein-based age prediction in total protein samples. Combining DNAm with either D-Asp, Pen, or D-Asp and Pen did not improve the overall accuracy for individuals without signs of decomposition. Nevertheless, this does not exclude an improvement in single cases, as the MAE and RMSE evaluate the model performance based on all test data results. Therefore, the conclusion that DNAm alone would always be sufficient in specific individual cases could be too short-sighted. The inclusion of the protein levels (as well as the inclusion of DNAm in protein models) might be useful in order to counterbalance influences like lifestyle, health status, and numerous diseases [14, 31, 32]. Further research is needed to investigate the not yet well understood impact of these different factors on chronological age prediction models and to define guidelines on the cases in which a combination might be (dis)advantageous.
Impact of post-mortem changes on age prediction
In a next step, we examined samples with early to severe signs of decomposition and the effect on the prediction accuracies. In our study, all three molecular clocks were successfully analyzed. However, with even longer postmortem intervals, reliable and accurate DNAm analysis may be difficult. Bone proteins may be quite well preserved for a long time [55,56,57]. Nevertheless, postmortem degradation of proteins that changes the overall composition of total bone samples may be a problem if total bone samples (and not defined, purified proteins) are analyzed. Pen could be stable over very long PMIs of up to thousands of years, at least in dentine, which would enable a wide application range of age estimation based on this parameter, also in the anthropological-archaeological context [58]. It remains to be clarified whether this also applies to bones.
For all parameters, a moderate correlation with age was observed (ρ(Pen) = 0.68, ρ(D-Asp) = 0.59, ρ(DNAm(6CpGs)) = 0.4–0.73), which was lower compared to the samples from individuals without signs of decomposition (cf. Figures 1 and 3). The biggest differences were observed for the markers with the highest correlation value (ρ) in individuals without signs of decomposition: Pen (∆ρ 0.22), ELOVL2 (∆ρ 0.21), PDE4C (∆ρ 0.28), TRIM59 (∆ρ 0.34), and KLF14 (∆ρ 0.39). The results can mainly be attributed to increased variation of single samples. This variation is also visible for the other markers, but has less impact on the correlation value (ρ) due to an already higher variation in individuals without signs of decomposition. More research is needed to explicitly determine the underlying biological and technical causes (of which some are discussed below). An overall lower accuracy with MAEs of 11.77 years (RMSE 15.07 years) for Pen and 11.68 years (RMSE 15.42 years) for D-Asp was obtained. With an MAE of 7.38 years (RMSE 10.39 years), the DNAm model still performed better than the protein-based parameters but was less accurate than for bones without signs of decomposition. A slight improvement was obtained for the RMSE (10.39 years (DNAm) vs. 9.08 years (combined)) by the combination of the three molecular clocks (cf. Table 1). The slightly greater drop of the RMSE compared to the MAE (7.38 years (DNAm) vs. 6.8 years (combined)) may indicate that especially outliers in the age prediction were reduced, which was the case in samples with very low DNA content (cf. Suppl. Figure 4C). An analysis of more samples is needed to support this indication.
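The reasoning that a larger drop in RMSE than in MAE points to reduced outliers can be illustrated with a short calculation. The first part uses the MAE and RMSE values reported above; the second part uses invented error values (not study data) in which only a single large error is reduced. The function names are hypothetical.

```python
from math import sqrt

def relative_drop(before, after):
    """Relative decrease in percent."""
    return 100.0 * (before - after) / before

def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    return sqrt(sum(e * e for e in errors) / len(errors))

# Values reported in the text for individuals with signs of decomposition
# (DNAm model vs. combined model).
mae_drop = relative_drop(7.38, 6.8)
rmse_drop = relative_drop(10.39, 9.08)
print(f"MAE drop: {mae_drop:.1f} %, RMSE drop: {rmse_drop:.1f} %")

# Invented prediction errors in years (NOT study data): only the single
# large outlier is reduced, all other errors stay the same.
errors_before = [3, -4, 5, -2, 30]
errors_after = [3, -4, 5, -2, 15]
toy_mae_drop = relative_drop(mae(errors_before), mae(errors_after))
toy_rmse_drop = relative_drop(rmse(errors_before), rmse(errors_after))
print(f"Toy MAE drop: {toy_mae_drop:.1f} %, toy RMSE drop: {toy_rmse_drop:.1f} %")
```

With the reported values, the RMSE decreases by roughly 13% and the MAE by roughly 8%. Because the RMSE squares each error, it reacts more strongly than the MAE when a few large deviations shrink, as the toy example shows.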
The overall reduced accuracy in the age prediction based on the molecular clock models in decomposed individuals could be caused by postmortem changes like deterioration of the mineral phase and microbiological invasion, which result in chemical and biological degradation of the organic bone matrix. In the absence of functional enzymatic repair mechanisms, cellular components and DNA degrade due to their limited chemical stability [59, 60]. This leads to a change in the cell type composition and in the amount and quality of DNA available for DNAm analysis.
Although bone proteins may be quite well preserved for a long time [55,56,57], postmortem degradation of proteins may significantly change the overall composition of total bone samples to a mixture of preserved proteins and fragments of broken proteins. This has direct implications for the overall contents of D-Asp and Pen, since they are analyzed as “summary values” in total protein samples. The even higher scattering of the data for older individuals could be related to a pre-existing intravital, age-related degradation of the organic bone matrix, which could result in a higher vulnerability to postmortem influences.
Additionally, the overall impaired tissue and cell structure in decomposed samples might have an impact. Especially for DNAm, a difference in the obtained DNAm values between individuals with and without signs of decomposition might occur due to the analysis process. As decalcification and multiple washing steps are part of the analysis, a destroyed or altered cell structure could lead to a specific ‘wash-away’ effect, changing the cell type composition analyzed in the final eluate. Extensive research is needed in the future to investigate the impact of the discussed degradation processes potentially interfering with accurate age prediction.
To gain first insight into whether it could be beneficial to include the very heterogeneous biological as well as technical variation caused by decomposition in the training data of a model, pilot RR models were built for age prediction of individuals with signs of decomposition. The d-score was not yet included in these models, as there were not enough samples to cover all d-scores sufficiently. The overall visible state of decomposition of the body (total body d-score) and head (head d-score (cf. Suppl. Table S1B)) do not necessarily align with the decomposition state of specific tissues such as the bone material itself. As observed before, no correlation was seen between the d-score and the age prediction deviation (Suppl. Fig. S2). Overall, the pilot LOOCV RR models improved the prediction accuracy and counterbalanced the previously observed downward trend (cf. Figure 4), but should be considered with caution, as more research and samples are needed for a better understanding of all influences and to build a reliable model.
Considerations and limitations of the developed models
The developed models are based on the results of the analyzed samples and might be influenced by them. In addition to the biological factors impacting the results, the technical procedures used can lead to variation, limitations, and study-specific results, which are presented below.
Sample collection and preparation
Although the samples in this study were taken from strictly standardized areas (Os Parietale) at the same anatomical location, there could be differences and some heterogeneity between the cancellous and cortical portions within a bone fragment analyzed [61]. An additional factor causing variations in the proportion of cortical and cancellous bone is aging itself as described above [16]. This raises the question of whether different bone pieces from the same general location show intra-individual differences. Furthermore, variation between measurements even from the same fragment can occur because of stochastic effects (molecules analyzed) and technical fluctuations, which should be part of future research. Furthermore, our results cannot be automatically transferred to other bone types analyzed (e.g. femur, an often-occurring sample type in forensic casework). A study by König et al. (2023) observed differences in the age-dependent accumulation of D-Asp and Pen between three bone types (skull, rib, clavicle), which could be due to differences in the structure and metabolism of the various bone types from different anatomical regions, leading to different protein compositions and thus to variations in D-Asp and Pen levels of the samples [62]. The resulting impact on age prediction models has to be further investigated and will also depend on the strategy of the model development (e.g. choice of mathematical model, inclusion of sampling location).
As the analysis of all molecular markers depends on sample preparation, the use of a different sample preparation protocol before analysis could lead to differences. As described above, longer incubation steps for bone decalcification or additional wash steps prior to DNAm analysis, for example, might lead to ‘wash-away’ effects and change the cell type composition. Furthermore, the presence of blood cells in small capillaries and of remaining bone marrow cannot be excluded, as trabecular bone was not removed.
Technical challenges
The data used for model development are based on D-Asp, Pen and DNAm analysis, and technical variation has to be considered. Within this study, standardized methods were used for all samples to harmonize the analysis over all three datasets. Technical variation was reduced to a minimum by using enough material, e.g., to allow a DNA input of at least 10 ng in the PCR, reducing stochastic variation. Nevertheless, as shown especially for the decomposed individuals, that was not always possible. Furthermore, the harsh process of bisulfite conversion increases DNA degradation, with a higher impact on DNAm analysis in the case of already pre-degraded samples. These effects remain a challenge for DNAm analysis of degraded DNA and low DNA amounts, as they increase stochastic variation during the analysis.
In the case of protein analysis, technical variation can also arise when the powder amount is too low to yield a sufficient signal for evaluation (the optimal amount in our study was identified as 20 mg). Furthermore, the technical threshold for detection of the protein accumulation resulted, e.g., in detection challenges for the Pen accumulation in minors. More sensitive methods might help to overcome that problem in the future.
Model development and evaluation
The presented models and results are based on the choice of six CpG sites, two protein markers and ridge regression as the underlying mathematical model. The included CpG sites showed age dependency, and the mathematical model proved suitable; however, the use or addition of other sites and optimization of the mathematical model might improve the predictions.
The models developed during this study are based on samples from deceased individuals and are therefore limited in the material available, leading to a model excluding individuals under 18 years of age. This decision was made to exclude a bias due to an imbalance in the number of individuals per age category and, in addition, because the interpretational threshold did not allow a reliable quantification of the Pen amount for all individuals under the age of 18 years. Within the study, a balance between females and males as well as an equal distribution over the whole age range could not be completely achieved. A potential impact of sex needs further consideration and deeper analysis with more samples. Furthermore, the composition of dataset 3 including individuals with signs of decomposition is biased toward a higher age, because younger individuals are less often found in a (highly) decomposed state. Further collection of samples of specific ages and decomposition states could help to improve model building in the future.