Nanomaterials, defined as materials that fall in the range of 1–100 nanometers in two dimensions, are commonly used in biomedicine, catalysis, and electricity because of their stable and unique performance, small size, and large surface area. Nanomaterials encompass a range of substances that can be categorized as carbon-based, metal oxides, semiconductors, polymers, clays, emulsions, or metals. Metal oxide nanoparticles (MONPs) are metallic oxides that exist within the nanoscale range and can be intentionally created or occur naturally. With the rapid development of nanotechnology, more and more MONPs, including those of zinc, iron, titanium, and copper, are being explored for therapeutic applications such as drug delivery, bioimaging, biosensing, bioelectronics, and tissue engineering. Many of these particles also exhibit strong antibacterial, antifungal, antidiabetic, antioxidant, anticancer, and photocatalytic activities. Beyond the medical field, they are commonly used in commercial products such as fuel cells and plastics and in environmental applications such as analysis, sensing, remediation, and amendments. However, the enormous production and inadvertent use of nanomaterials raise concerns about their impact on the environment.
Nanoparticles have been identified in wastewater streams, drinking water sources, and tap water in amounts ranging from nanograms to micrograms per liter. MONPs have also been reported in human tissues such as brain, heart, and liver, and occupational exposure to metal oxide nanomaterials has been reported to increase oxidative stress biomarkers, suggesting potential DNA oxidative damage and lipid peroxidation. Given the limited data available from human studies, researchers have turned to zebrafish and their embryos for toxicological investigations. Zebrafish embryos are commonly used to identify environmental heavy metal pollution. As a multicellular organism, zebrafish can offer more comprehensive insights into the kinetics, migration, and transformation of nanomaterials than in vitro cell culture assays. Moreover, zebrafish are considered an equivalent model for investigating developmental toxicity and genotoxicity because around 85% of their genes are comparable to those found in humans.
The potential harm to human health posed by newly created MONPs, particularly those used in biomedical applications, necessitates safety-by-design strategies for these materials. The prospect of shorter development timeframes, lower experimental costs, and less late-stage attrition, together with ethical, societal, and regulatory pressures to minimize animal testing, makes it worthwhile to create computational models that can accurately predict the toxic hazard of novel MONPs before experimental testing and, ideally, before synthesis, based on the intrinsic, synthesis-controlled properties of the MONPs. Over the years, QSAR/QSPR/QSTR techniques have been employed to establish correlations between various characteristics of nanomaterials and their toxicity. Nano-quantitative read-across structure–toxicity relationship (nano-qRASTR) models are an advanced approach that builds upon the principles of nano-quantitative structure–toxicity relationship (nano-QSTR) models. These models integrate read-across techniques with traditional quantitative structure–activity relationship (QSAR) methods to enhance predictive capability, particularly for datasets with few data points.
Using quantum chemical descriptors, researchers have created several models to evaluate the toxicity of MONPs to different species covering multiple endpoints, and their work has produced significant and trustworthy findings. However, quantum descriptors require significant computational resources and time. In addition, their reproducibility is an issue because different quantum methods and basis sets are used. In contrast, periodic table descriptors are derived or directly obtained from the periodic table. In many cases, they have produced models that are comparable to, or even better than, those based on quantum descriptors, while reducing computation time and requiring essentially no computational resources.
However, the first- and second-generation periodic table descriptors have limitations, such as being unable to deal with the influential observations that exist in the present dataset. In this study, we propose five third-generation periodic table descriptors and apply them to modeling the inhibition of the zebrafish hatching enzyme ZHE1 with the nano-qRASTR approach, with the aim of significantly improving model quality, predictability, and reliability.
Materials and Methods
Dataset
The percentage decrease in enzymatic activity of the zebrafish hatching enzyme (ZHE1), expressed as enzyme inhibition to zebrafish in % (%EIzebrafish), for 24 MONPs was used for the modeling study. The experimental data (%EIzebrafish) ranged from −1.04 (Co3O4) to 44.72 (Cr2O3).
Descriptor calculation
Models were developed based on the fundamental properties of these metal oxides that can be obtained from the periodic table. A total of 28 periodic table descriptors were utilized for nano-QSTR followed by nano-qRASTR modeling. The list of all derived descriptors along with their meaning and symbol is given in Table 1. Periodic table descriptors offer the advantage of rapid acquisition without the need for extensive calculations or software utilization, unlike quantum chemical descriptors. In our earlier work, we proposed seven and sixteen descriptors, which were classified as first- and second-generation periodic table descriptors, respectively. In this study, we propose five more periodic table descriptors, termed third-generation periodic table descriptors. These are atomic radius, crystal ionic radius, density of the metal, electron affinity, and first ionization energy. The atomic radius is a fundamental property that influences many physical and chemical characteristics of an element. In the context of nanoparticles, the size of the metal atoms directly affects the overall size and surface area of the nanoparticles, which are critical factors in their reactivity and interaction with other materials. The ionic radius is essential for understanding the metal’s behavior in different oxidation states. This is particularly relevant in nanoparticle chemistry, where redox reactions are common. The density of a metal is a macroscopic property that influences the mass and volume of nanoparticles. Electron affinity measures the energy change when an electron is added to a neutral atom, reflecting the tendency of the metal to gain electrons. The first ionization energy is the energy required to remove the outermost electron from a neutral atom, which is a critical factor in determining the metal’s reactivity and stability. For the present study, descriptors of all three generations were computed and employed for modeling.
All descriptor values can be found in Supporting Information File 1. Also, an example calculation of all descriptors for Al2O3 is given in Supporting Information File 1.
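To make the descriptor arithmetic concrete, the following is a minimal sketch of how the second-generation core-count descriptor (∑α)2 follows from the expressions in Table 1. The function name and the dictionary of metal properties are hypothetical, and the atomic numbers, valence electron counts, and period numbers are standard periodic-table values supplied here as assumptions rather than taken from Supporting Information File 1. The three oxides were chosen because their (∑α)2 values are listed in Table 2.

```python
# Hypothetical helper illustrating the Table 1 expressions for (sum alpha)^2.
# Assumed inputs: atomic number Z, number of valence electrons Zv, and
# period number PN of each metal (standard periodic-table values).
METALS = {
    "Be": {"Z": 4, "Zv": 2, "PN": 2},
    "Mg": {"Z": 12, "Zv": 2, "PN": 3},
    "Ca": {"Z": 20, "Zv": 2, "PN": 4},
}


def sum_alpha_squared(metal, n_metal, n_oxy):
    """Return (sum alpha)^2 for an oxide with n_metal metal and n_oxy oxygen atoms."""
    p = METALS[metal]
    lam = (p["Z"] - p["Zv"]) / p["Zv"]   # lambda: core / valence electrons
    mu = 1.0 / (p["PN"] - 1)             # mu = 1 / (PN - 1)
    alpha_metal = lam * mu               # alpha_metal = lambda * mu
    sum_alpha = alpha_metal * n_metal + 0.33 * n_oxy  # metal part + oxygen part
    return sum_alpha ** 2


for oxide, (metal, n_m, n_o) in {
    "BeO": ("Be", 1, 1),
    "MgO": ("Mg", 1, 1),
    "CaO": ("Ca", 1, 1),
}.items():
    print(oxide, round(sum_alpha_squared(metal, n_m, n_o), 2))
```

Under these assumptions the sketch gives 1.77, 8.01, and 11.09 for BeO, MgO, and CaO, which agrees with the (∑α)2 column of Table 2.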
Table 1: List of periodic table descriptors used for model development.
| No. | Generation | Mathematical expression | Description |
|---|---|---|---|
| 1 | first | MW | molecular weight of the metal oxide |
| 2 | first | Nmetal | number of metal atoms per molecule |
| 3 | first | Noxy | number of oxygen atoms per molecule |
| 4 | first | χ | metal electronegativity |
| 5 | first | ∑χ | total metal electronegativity in the specific metal oxide |
| 6 | first | ∑χ/nO | total metal electronegativity in the specific metal oxide relative to the number of oxygen atoms |
| 7 | first | χox | oxidation number of the metal |
| 8 | second | Zmetal | atomic number of the metal |
| 9 | second | Zvmetal | number of valence electrons of the metal |
| 10 | second | PNmetal | period number of the metal |
| 11 | second | λ = (Zmetal − Zvmetal)/Zvmetal | core environment of the metal, defined by the ratio of the number of core electrons to the number of valence electrons |
| 12 | second | μ = 1/(PNmetal − 1) | - |
| 13 | second | Vmetal | valence of the metal |
| 14 | second | αmetal = λ·μ | - |
| 15 | second | ∑αmetal = αmetal·Nmetal | - |
| 16 | second | ∑αoxy = Noxy·0.33 | - |
| 17 | second | ∑α = ∑αmetal + ∑αoxy | core count, gives a measure of the molecular bulk |
| 18 | second | εmetal = −αmetal + (0.3·Zvmetal) | electronegativity count of the metal |
| 19 | second | εoxy = −αoxy + (0.3·Zvoxy) | electronegativity count of oxygen |
| 20 | second | ∑ε = εmetal·Nmetal + εoxy·Noxy | total electronegativity count of the metal oxide |
| 21 | second | ∑ε/N | summation of epsilon relative to the number of atoms in the molecule |
| 22 | second | (∑α)2 | square of summation of alpha, gives a measure of molecular bulk |
| 23 | second | (∑ε/N)2 | summation of epsilon divided by the number of atoms squared |
| 24 | third | a0 | atomic radius of the metal (pm) |
| 25 | third | rion | crystal ionic radius of the metal (pm) |
| 26 | third | dmetal | density of the metal (g/cm3) |
| 27 | third | Ea | electron affinity (eV) |
| 28 | third | I1 | first ionization energy of the metal (eV) |

Splitting of the dataset
The selection of training and test sets was based on the principal component analysis score with guaranteed uniform distribution, as we previously reported. In this study, we used the same dataset-splitting method. In our previous study, we removed the compound CoO because its outlier behavior significantly impacted model quality. However, as we have proposed five new third-generation periodic table descriptors, in the present study we included CoO to check the modeling and prediction capability of the newly introduced descriptors along with the existing ones. The details of the training and test sets can be found in Supporting Information File 1.
nano-QSTR model development
The best subset selection (BSS) approach was used to identify the optimal combination of descriptors. The BSS tool can be accessed at https://teqip.jdvu.ac.in/QSAR_Tools/. It systematically evaluates all possible subsets of descriptors to determine the best combination based on a specified criterion, providing a comprehensive search for the most predictive model. This method was preferred over stepwise regression analysis through backward elimination because BSS ensures that the chosen subset is truly optimal by considering all possible models, whereas stepwise regression may overlook some combinations because of its iterative nature. Afterward, the selected descriptors were employed to develop the final model using a multiple linear regression (MLR) statistical tool, which can be accessed at https://teqip.jdvu.ac.in/QSAR_Tools/. The Pearson correlation among descriptors was also checked to obtain a more dependable model and to reduce the possibility of intercorrelation among the descriptors.
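The exhaustive search described above can be illustrated with a short sketch. This is not the BSS tool itself: the function names, the use of R2 as the selection criterion, and the intercorrelation cut-off of 0.8 are assumptions made for illustration, and the data are synthetic.

```python
from itertools import combinations

import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns coefficients and R2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2


def best_subset(X, y, k, max_corr=0.8):
    """Score every k-descriptor subset and return the one with the highest R2.

    Subsets containing a descriptor pair whose absolute Pearson correlation
    exceeds max_corr (an assumed threshold) are skipped.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    best_cols, best_r2 = None, -np.inf
    for cols in combinations(range(X.shape[1]), k):
        if any(corr[i, j] > max_corr for i, j in combinations(cols, 2)):
            continue
        _, r2 = fit_mlr(X[:, list(cols)], y)
        if r2 > best_r2:
            best_cols, best_r2 = cols, r2
    return best_cols, best_r2


# Synthetic demonstration: the response depends only on columns 1 and 3.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
y_demo = 2.0 * X_demo[:, 1] - 3.0 * X_demo[:, 3] + 0.01 * rng.normal(size=20)
print(best_subset(X_demo, y_demo, k=2))
```

Because every combination is fitted, the cost grows combinatorially with the number of descriptors, which is manageable for a pool of a few dozen descriptors and small subset sizes.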
Calculation of RASTR descriptors and development of nano-qRASTR model
RASTR is a method that integrates the ideas of read-across and QSTR for q-RASTR analysis (here we are modeling nanomaterials, hence the term nano-qRASTR). This method calculates similarity- and error-based RASTR descriptors for training and test sets. The RASAR-Desc-Calc-v2.0 tool employs three similarity-based techniques to produce 15 descriptors, namely, SD_Activity, SE, CVact, MaxPos, MaxNeg, Abs Diff, Avg. Sim, SD_Similarity, CVsim, gm (Banerjee-Roy coefficient), gmAvg. Sim, gmSD_Similarity, Pos.Avg.Sim, and Neg.Avg.Sim. These descriptors are essential for identifying structural similarities and predicting biological activity. The tool’s algorithm uses the weighted standard deviation of predicted values, the coefficient of variation of computed predictions, the average similarity level of close training compounds for each query molecule, and other advanced metrics to ensure accurate predictions. Further details about the tool and its features can be found at https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home.
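The idea behind similarity- and error-based descriptors can be sketched in a few lines. This is a simplified illustration and not the RASAR-Desc-Calc algorithm: the Laplacian-kernel form of the similarity, the parameters `k` and `gamma`, and the unweighted statistics are all assumptions, and only a handful of the descriptors are shown.

```python
import numpy as np


def rastr_descriptors(x_query, X_train, y_train, k=3, gamma=1.0):
    """Simplified similarity/error descriptors for one query compound.

    Assumptions: similarity is a Laplacian kernel, exp(-gamma * L1 distance);
    the k most similar training compounds act as the close source compounds;
    all statistics are unweighted.
    """
    sim = np.exp(-gamma * np.abs(X_train - x_query).sum(axis=1))
    close = np.argsort(sim)[::-1][:k]        # indices of the most similar compounds
    s, y = sim[close], y_train[close]
    n = len(close)
    return {
        "Avg.Sim": s.mean(),                 # mean similarity to the close compounds
        "SD_Similarity": s.std(ddof=1),      # spread of those similarities
        "CVsim": s.std(ddof=1) / s.mean(),   # coefficient of variation of similarity
        "SD_Activity": y.std(ddof=1),        # spread of the observed responses
        "SE": y.std(ddof=1) / np.sqrt(n),    # standard error of the observed responses
    }
```

When descriptors are computed for a training compound, that compound would normally be excluded from its own set of source compounds so that the descriptor does not leak its observed response.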
After computing the RASTR descriptors for both the training and test sets, these descriptors were merged with existing periodic table descriptors. Feature selection was then performed using the BestSubsetSelection_v2.1 tool, which can be found at https://teqip.jdvu.ac.in/QSAR_Tools/. This tool produces a comprehensive set of model combinations for a user-specified number of descriptors while ensuring that the intercorrelation does not exceed a certain threshold. The MLR-based nano-qRASTR model was evaluated using the MLRPlusValidation 1.3 software package, which can be found at https://teqip.jdvu.ac.in/QSAR_Tools/.
Validation, applicability domain, and Y-randomization
The nano-QSTR and nano-qRASTR models were validated by measuring the goodness-of-fit and by internal validation through leave-one-out cross-validation (Q2). The goodness-of-fit of the models was measured using the coefficient of determination (R2), which indicates how well the model’s predictions match the actual data. Internal validation was performed using the leave-one-out cross-validation (LOO-CV) method.
This technique involves removing one data point at a time from the dataset, building the model on the remaining data, and then predicting the excluded data point. The process is repeated for each data point, and the Q2 metric is calculated to assess the model’s predictive accuracy. Details of the validation metrics can be found in our previous works.
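The leave-one-out procedure described above can be sketched as follows for an MLR model. The function names are hypothetical, and the sketch uses the common convention Q2 = 1 − PRESS/SS, where PRESS is the sum of squared leave-one-out prediction errors and SS is the sum of squared deviations of the observed responses from the training-set mean; the exact definitions used in the study are given in the cited works.

```python
import numpy as np


def _fit(X, y):
    """Least-squares MLR coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def _predict(coef, X):
    return coef[0] + X @ coef[1:]


def r2(X, y):
    """Goodness-of-fit: coefficient of determination on the training data."""
    pred = _predict(_fit(X, y), X)
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out cross-validated Q2 (assumed convention: 1 - PRESS / SS)."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i            # drop compound i
        coef = _fit(X[keep], y[keep])            # refit on the remaining compounds
        press += (y[i] - _predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)
```

For a least-squares model, each leave-one-out error is at least as large in magnitude as the corresponding fitted residual, so Q2 never exceeds R2; a large gap between the two is a sign of overfitting or of influential observations.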
We also examined the applicability domain (AD) using the leverage technique to generate the Williams plot. A Y-randomization study was performed to determine whether the model was obtained by chance. It entailed refitting the model 100 times with the dependent variable values rearranged while the original independent variables were kept constant. The study was performed with “MLR Y-Randomization Test 1.2”, available at https://teqip.jdvu.ac.in/QSAR_Tools/. Following the Y-randomization procedure, the mean values of R2 and Q2 were calculated for the 100 randomly generated models.
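As an illustration of the leverage approach, the sketch below computes hat values for training or query compounds and a critical leverage h*. The function names are hypothetical. The cut-off h* = 3(p + 1)/n, with p descriptors and n training compounds, is the commonly used convention and is assumed here, as is the inclusion of an intercept column in the design matrix.

```python
import numpy as np


def leverages(X_train, X_query):
    """Hat (leverage) values of query compounds relative to the training set.

    Assumption: the design matrix includes an intercept column.
    """
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    core = np.linalg.pinv(A.T @ A)                    # (A^T A)^-1
    return np.einsum("ij,jk,ik->i", Q, core, Q)       # h_i = q_i^T (A^T A)^-1 q_i


def critical_leverage(n_train, n_descriptors):
    """Commonly used cut-off h* = 3 (p + 1) / n (assumed convention)."""
    return 3.0 * (n_descriptors + 1) / n_train
```

In a Williams plot, these leverages are drawn against standardized residuals; a compound with a leverage above h* lies outside the descriptor space covered by the training set.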
External dataset for data gap filling and prediction reliability
Our prepared external dataset consists of 35 MONPs, for which the toxicity to zebrafish was predicted. The quality of the external predictions was also checked with the “Prediction Reliability Tool”, which applies the AD to the external predictions based on three criteria: (1) The mean absolute error is calculated for leave-one-out predictions using the ten most similar training compounds for each query molecule. (2) The standardization approach determines the applicability domain based on similarity. (3) The proximity of the predicted value of the query compound to the experimental mean training response is evaluated.
Results and Discussion
nano-QSTR toxicity model
Equation 1 was developed with the BSS-MLR approach for the inhibition of ZHE1 hatching enzyme activity.
The first descriptor represents the total metal electronegativity in a specific metal oxide and shows a negative correlation with the inhibition of the ZHE1 hatching enzyme. In this case, an increase in electronegativity results in a decrease in toxicity. For instance, SnO2 has a %EI of 7.12 while having a total metal electronegativity of 3.56. In contrast, the total metal electronegativity of WO3 is 1.65, and its observed %EIzebrafish is 42.72. The descriptor (∑α)2 gives a measure of the molecular bulk and has a positive correlation with enzyme inhibition. CeO2 has an (∑α)2 value of 12.50 and a %EI value of 2.56; in contrast, TiO2 has an (∑α)2 value of 143.76 and a %EI value of 13.28. The last descriptor in our nano-QSTR model is the atomic radius, a0. The model presents a negative coefficient for the atomic radius (−0.439), suggesting that nanomaterials composed of atoms with larger radii are associated with a decrease in %EIzebrafish. A larger atomic radius might indicate weaker bonding and less effective interaction with the enzyme or its substrate, leading to less enzyme inhibition. This could be due to the diffuse nature of the outer electrons in larger atoms, which might reduce the efficiency of electronic interactions essential for binding or catalytic activity.
Our nano-QSTR model suggests that the inhibition of ZHE1 enzymatic activity in zebrafish is influenced negatively by the total electronegativity of the metals and the atomic radius of the nanomaterial components but positively by the molecular bulk of the nanomaterials. Electronegativity and atomic size determine the reactivity and contact strength of nanomaterials with biological systems, whereas the molecular bulk affects the mechanism of inhibition through steric effects.
nano-qRASTR toxicity model
To improve the statistical quality of the nano-QSTR model, we computed read-across descriptors from the modeled descriptors. All descriptors were then merged and employed for modeling using the BSS-MLR approach. Equation 2 presents the developed nano-qRASTR model.
Like the nano-QSTR model, the nano-qRASTR model contains the descriptor (∑α)2 with a positive contribution to the toxicity. In addition, there are two new descriptors from RASTR, namely, SE(LK) and CVsim(LK). “SE” stands for standard uncertainty in the observed response values for the chosen proximate source compounds related to each reference compound. It has a positive contribution to our model with a coefficient of +5.10. The effect of SE(LK) can also be observed in our training set. ZnO has the highest %EI value (42.72) in our training set, and it also has the highest SE(LK) value of 11.47. Conversely, In2O3 has an SE(LK) value of 2.21, and its experimental %EI value is only 7.12. CVsim(LK), which stands for the coefficient of variation of the similarity values, has a negative contribution to the model. In our dataset, CVsim(LK) did not vary much. However, Al2O3 has a relatively large CVsim(LK) value (1.25), while Mn2O3 has a relatively small CVsim(LK) value of 1.06; their corresponding %EI values are 3.44 and 17.2, respectively.
The quality of the nano-qRASTR model was also checked against the criteria of Golbraikh and Tropsha, with all metrics falling within the stipulated thresholds.
The Y-randomization test was also performed to check whether the model was obtained by chance. After shuffling the response values while keeping the descriptor values fixed, 100 random models were generated. The average R2 value of these models is 0.20 and the average Q2 value is −0.60, both well below the threshold of 0.5, suggesting that our original model was not developed by chance (details in Supporting Information File 1).
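The Y-randomization procedure can be sketched as below. This is an illustration under stated assumptions and not the “MLR Y-Randomization Test 1.2” tool: the function names and the fixed random seed are hypothetical, and only R2 is tracked for brevity. The response vector is permuted while the descriptor matrix is left untouched, as described in the Methods.

```python
import numpy as np


def r2_fit(X, y):
    """R2 of a least-squares MLR model with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def y_randomization(X, y, n_trials=100, seed=0):
    """R2 values of models refitted on permuted responses (descriptors fixed)."""
    rng = np.random.default_rng(seed)
    return np.array([r2_fit(X, rng.permutation(y)) for _ in range(n_trials)])
```

If the mean R2 of the randomized models is far below the R2 of the original model, the original descriptor-response relationship is unlikely to be a chance correlation.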
The scatter plot (Figure 1a) shows that all MONPs lie close to the best-fit line of experimental versus predicted toxicity values, which further supports the validity of the model. A Williams plot (Figure 1b) was used to verify the prediction reliability through applicability domain analysis with the leverage approach. One training compound (Fe3O4) is above the critical leverage value and is therefore considered an influential X outlier. One test compound also has a leverage higher than h* and is therefore considered outside of the AD.
Figure 1: Scatter plot (a) and Williams plot (b) for the nano-qRASTR model. The red dashed line indicates the highest Hat or leverage value, that is, the h* cut-off line.
The SHAP plot (Figure 2) indicates that (∑α)2 has a predominantly positive effect on the predictions of the model, as the SHAP value increases with increasing values of (∑α)2. The descriptor SE(LK) shows a more pronounced positive influence on the predicted values. This is consistent with the positive coefficient in our regression equation, and the slight trend from blue to red dots suggests a correlation between feature values and impact. Conversely, CVsim(LK) predominantly affects the model predictions negatively, as evidenced by its SHAP values lying mainly on the left side.
Figure 2: SHAP plot for the nano-qRASTR model.
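For a linear model such as MLR, SHAP values have a simple closed form when the descriptors are treated as independent: the contribution of descriptor j to a prediction is its coefficient multiplied by the deviation of its value from the background mean. The sketch below illustrates this relationship; the function name is hypothetical, and it is not claimed that the SHAP plot in Figure 2 was produced this way.

```python
import numpy as np


def linear_shap(coefs, X_background, X_explain):
    """SHAP values of a linear model, assuming independent descriptors.

    phi[i, j] = coefs[j] * (X_explain[i, j] - mean of descriptor j in the background)
    """
    coefs = np.asarray(coefs, dtype=float)
    return (X_explain - X_background.mean(axis=0)) * coefs
```

With this form, a descriptor with a positive coefficient has SHAP values that rise with its value, and one with a negative coefficient has SHAP values that fall, which matches the sign pattern described for the three descriptors.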
Mechanisms of ZHE1 enzyme inhibition
The incorporation of third-generation descriptors significantly improves the predictive power of the nano-qRASTR model. MONPs with higher metal electronegativity may interfere more strongly with cellular functions of zebrafish, but this does not invariably heighten toxicity; in some instances, it may mitigate oxidative stress and membrane disruption, thereby diminishing toxic effects. Conversely, MONPs with larger atomic radii and crystal ionic radii tend to exhibit a lower surface area-to-volume ratio, which can reduce their cellular interactions and uptake. This reduction in uptake can lead to less cellular dysfunction and toxicity. Larger atomic radii may result in MONPs that are less likely to penetrate cell membranes, thereby decreasing their potential to cause cellular damage and toxicity. However, MONPs with increased molecular bulk can enhance toxicity via several mechanisms. They can physically damage cell membranes, potentially causing cell death. Their size may lead to alternative, more detrimental cellular uptake pathways or provoke harmful responses by accumulating on cell surfaces. Such MONPs might also elevate oxidative stress by triggering the production of reactive oxygen species, which damage cellular components. They can obstruct vital biological processes and, through aggregation, cause localized toxicity to zebrafish. Additionally, their size affects biodistribution and clearance, with larger MONPs tending to accumulate within the zebrafish organism, further exacerbating toxicity (Figure 3). In zebrafish, these mechanisms can manifest in several ways, affecting not only individual cells but also developmental processes. The implications for zebrafish embryos include potential deformities, impaired development, and mortality. Employing zebrafish as a biological model facilitates the evaluation of toxicity, offering an integrative perspective on the hazards that MONPs may present in aquatic ecosystems and living organisms.
Figure 3: Mechanism of zebrafish hatching enzyme inhibition by MONPs according to the developed models. The figure is “Created with BioRender.com” (https://biorender.com/) with a purchased academic license. This content is not subject to CC BY 4.0.
Comparison with previously published models
Compared to our previous nano-QSTR model (with the three reported validation metrics at 0.68, 0.74, and 0.70), the current nano-qRASTR model improves these three critical metrics by 0.01, 0.02, and 0.05, respectively. Although these improvements might seem minimal, it is crucial to note that the preceding study modeled only 23 MONPs, excluding CoO, which significantly impacted the quality of the model because of its outlier behavior. In contrast, the current study successfully models all 24 MONPs without compromising the model’s quality and predictability. This suggests that the nano-qRASTR approach is a suitable choice for modeling small and complex datasets.
Predictions for 27 out of 35 MONPs were within the AD, indicating that the nano-qRASTR model confidently predicts 77.14% of the MONPs (Table 2). However, predictions for eight MONPs were considered unreliable as they fell outside the AD. For the MONPs within the AD, the predicted enzyme inhibition (%EI) in zebrafish ranges from 32.42% to 76.16%. Within this spectrum, Ta2O3 exhibits the highest toxicity, while V2O3 shows the least.
Table 2: Predicted values for an external dataset employing the nano-qRASTR model.
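| Metal oxide | (∑α)2 | SE(LK) | CVsim(LK) | Predicted %EIzebrafish | AD status |
|---|---|---|---|---|---|
| Ag2O | 544.29 | 8.58 | 0.73 | 128.34 | out |
| Au2O | 994.14 | 15.14 | 2.11 | 225.01 | out |
| Au2O3 | 1036.20 | 15.14 | 2.11 | 232.33 | out |
| BaO | 32.83 | 9.87 | 0.58 | 47.63 | in |
| BeO | 1.77 | 10.01 | 0.57 | 43.14 | in |
| Bi2O3 | 52.27 | 6.58 | 0.75 | 32.45 | in |
| CaO | 11.09 | 9.60 | 0.54 | 42.89 | in |
| CdO | 36.97 | 9.85 | 0.43 | 49.94 | in |
| Co2O3 | 86.92 | 6.49 | 0.98 | 35.54 | in |
| Ga2O3 | 52.02 | 6.58 | 0.73 | 32.63 | in |
| GeO2 | 8.96 | 9.97 | 0.56 | 44.24 | in |
| HfO2 | 58.68 | 9.93 | 0.68 | 51.40 | in |
| HgO | 66.10 | 8.65 | 0.33 | 49.92 | in |
| IrO2 | 84.46 | 9.00 | 0.43 | 53.80 | in |
| MgO | 8.01 | 9.59 | 0.54 | 42.28 | in |
| MnO2 | 20.19 | 9.99 | 0.54 | 46.52 | in |
| Mo2O3 | 461.82 | 9.80 | 1.05 | 116.71 | out |
| Nb2O3 | 440.58 | 9.22 | 0.87 | 112.16 | out |
| OsO2 | | | | | |

The columns (∑α)2, SE(LK), and CVsim(LK) are the modeled descriptors.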