Pre-implantation biopsy plays a central role in kidney graft evaluation and in decisions on whether the kidneys can be used for transplantation. However, the assessment of pre-implantation kidney biopsies is not standardized in terms of either the technical procedures adopted or the pathologists’ evaluations, and harmonization of this process is needed [4]. Currently, tissue samples may be obtained by core-needle biopsy or wedge biopsy. The most appropriate processing technique for these specimens (e.g. snap frozen vs rapidly processed) is still debated [18]. Different policies can have a significant impact on the final report, with possible under- or over-estimation of chronic damage in different renal compartments [19]. This can, in turn, influence the outcome of the graft [20], with the best correlation being described when pre-implantation biopsies are interpreted by experienced renal pathologists [21].
However, the most frequently encountered scenario involves relying on on-call general pathologists, who may have limited knowledge of nephropathology [12]. Moreover, reliance on general pathologists increases inter-observer variability. General pathologists typically assign higher scores for glomerulosclerosis and arterial thickness, which are the most important parameters for evaluating chronic renal damage [8]. To address this challenge, remote teleconsultation by renal experts can be solicited after the biopsy slides are digitized [22]. Once the slides have been scanned, it is also possible to apply computational tools [12]. Indeed, an AI-based tool that assists pathologists by improving accuracy and expediting their review could be highly beneficial.
The detection of glomerulosclerosis in pre-implantation biopsies is significantly associated with graft survival, with studies demonstrating the predictive role of glomerulosclerosis > 10% [23] and no incremental effect for values above that threshold [24]. This highlights the importance of subtle changes around this cutoff, which can be affected by inter-observer variability. Hence, AI-assisted detection of glomeruli, with reliable distinction between normal, ischemic and globally sclerotic glomeruli, can improve diagnostic assessment using whole-slide images [25]. Despite reported challenges with the segmentation and classification of certain renal structures (e.g. variable shapes/dimensions/internal architecture, interspersed nature within the renal parenchyma, and heterogeneity of pre-analytical variables), previous attempts to apply AI in renal pathology demonstrated high reliability of glomerular detection and classification (e.g. precision in classifying healthy vs sclerosed glomeruli ranging from 0.834–0.935 and 0.806–0.976, respectively) [26]. In addition, fibrosis and lumen narrowing of vascular structures (arteries and arterioles) are significantly associated with long-term graft survival, especially for mild-moderate (> 25%) arteriosclerosis [24]. Fortunately, AI-assisted segmentation from whole-slide images has demonstrated good reliability in discriminating blood vessels from tubules, with an accuracy and precision of 0.93 and 0.88, respectively [27]. This was confirmed by subsequent studies (accuracy 0.89), which also demonstrated that the algorithm needed significantly less time than the pathologists (2 min vs 20 min) [15].
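To illustrate why small counting differences matter around the 10% cutoff, the following minimal Python sketch computes the percentage of globally sclerotic glomeruli from glomerular counts and checks it against the threshold. The function names and the example counts are hypothetical and serve only as an illustration; they do not describe the implementation of any published tool.

```python
def glomerulosclerosis_percent(n_sclerotic: int, n_total: int) -> float:
    """Percentage of globally sclerotic glomeruli among all counted glomeruli."""
    if n_total <= 0:
        raise ValueError("n_total must be a positive number of glomeruli")
    if not 0 <= n_sclerotic <= n_total:
        raise ValueError("n_sclerotic must lie between 0 and n_total")
    return 100.0 * n_sclerotic / n_total


def exceeds_cutoff(n_sclerotic: int, n_total: int, cutoff_percent: float = 10.0) -> bool:
    """True when glomerulosclerosis is strictly above the cutoff (default: > 10%)."""
    return glomerulosclerosis_percent(n_sclerotic, n_total) > cutoff_percent


# Hypothetical biopsy with 20 glomeruli: a disagreement on a single
# glomerulus moves the case from 10% (not above the cutoff) to 15% (above).
observer_a = exceeds_cutoff(n_sclerotic=2, n_total=20)
observer_b = exceeds_cutoff(n_sclerotic=3, n_total=20)
```

In this hypothetical case, the two observers differ by one glomerulus only, yet they fall on opposite sides of the cutoff.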
In this study, the Galileo system was trained on a heterogeneous and multi-institutional cohort of renal core-needle and wedge biopsies that included a broad range of pre-analytical variables. The aim was to obtain a robust AI-assisted tool that could be generalized and employed in different settings, to accommodate the heterogeneity of cases encountered in routine clinical practice. Excellent precision and sensitivity were noted for Galileo during the training phase (81.96% and 94.39%, respectively), with the total area error restricted to only 2.81%. In the validation phase, performed on an external dataset annotated by a different panel of pathologists, the tool retained good precision and sensitivity (74.05% and 71.03%, respectively), with a further reduction of the total area error (2%). Even at these promising levels of performance, AI models can be computationally demanding, which can limit their wider use by on-call pathologists because of the potential need for dedicated high-performance workstations and the long computation times [28]. The use of cloud-based AI suites, like the one used in the present study, can significantly shorten processing times (2 min vs 22–31 min for pathologists), which is highly important in the transplantation setting. Another possible limitation of adopting AI could be the reluctance of pathologists to trust black box solutions [28], which might be mitigated by explainability methods. In this setting, the ability to visually represent the renal structures detected by the Galileo algorithm in an explainable manner greatly improved end-user acceptance and facilitated the creation of a final pathology report that integrated qualitative and quantitative findings. Although promising, the current version of the Galileo algorithm includes only five of the histological classes required for the interpretation of pre-implantation renal biopsies.
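For readers less familiar with the detection metrics reported above, precision and sensitivity can be derived from the numbers of true positive, false positive and false negative detections, as in the minimal Python sketch below. The counts are hypothetical and are not taken from the study; how a detection is matched to a pathologist's annotation is also left open here, since that criterion is an assumption that any real evaluation would need to specify.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of detected structures that match an annotated structure."""
    detected = true_positives + false_positives
    if detected == 0:
        raise ValueError("no detections: precision is undefined")
    return true_positives / detected


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of annotated structures that were detected (also called recall)."""
    annotated = true_positives + false_negatives
    if annotated == 0:
        raise ValueError("no annotations: sensitivity is undefined")
    return true_positives / annotated


# Hypothetical counts for one class (e.g. glomeruli) on one slide.
tp, fp, fn = 80, 20, 10
slide_precision = precision(tp, fp)      # 80 of 100 detections are correct
slide_sensitivity = sensitivity(tp, fn)  # 80 of 90 annotated structures are found
```

A high sensitivity with a lower precision, as in the training-phase figures, indicates that few annotated structures are missed while some detections do not correspond to an annotation.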
The evaluation of interstitial fibrosis and tubular atrophy (IFTA), not covered by Galileo in its current form, is highly subjective and shows low inter-observer reproducibility (Cohen’s kappa of 0.5 among 4 pathologists [29]), which makes it poorly suited to training an AI algorithm. Some authors have proposed overcoming this subjectivity by quantifying IFTA through image analysis methods that would not need human interaction/training, for example color space transformations and structural feature extraction from the images [30]. However, this approach has some limitations, including loss of information during the color space transformation, high stain variability (which prevents correct classification of all the renal structures), and errors in the segmentation of these structures, with consequent possible inaccurate quantification of interstitial fibrosis (which is based on the identification and subsequent removal of non-fibrotic regions from the tissue). In this setting, application of the adaptive stain separation method seems promising [15], and similar approaches will be implemented prospectively in the Galileo algorithm. Ancillary histological modifications at the glomerular (e.g. mesangial nodular expansion), vascular (e.g. hyalinosis and thrombotic microangiopathy) or tubular (e.g. acute damage/necrosis) level may be of interest to further refine risk stratification in the transplant setting, especially in deceased donors. Further training on larger case series with rare histological instances will be carried out to allow Galileo to recognize ancillary but useful modifications of the renal parenchyma. Moreover, further applications of the Galileo algorithm on additional external datasets will help corroborate its reliability and generalizability in the routine clinical setting, as well as its impact on outcomes (e.g. graft survival).
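As a reminder of how the agreement statistic mentioned above is obtained, the following minimal Python sketch computes Cohen's kappa for two raters who grade the same cases. It is an illustration under stated assumptions: the grades and case lists are hypothetical, and Cohen's kappa is defined for a pair of raters, so how the cited study summarized agreement among four pathologists is not specified here and is not reproduced by this sketch.

```python
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same grade by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies, summed over grades.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical grade")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical IFTA grades assigned by two pathologists to four biopsies.
pathologist_1 = ["mild", "mild", "moderate", "severe"]
pathologist_2 = ["mild", "moderate", "moderate", "severe"]
kappa = cohens_kappa(pathologist_1, pathologist_2)
```

In this hypothetical example the raters agree on three of four cases (observed agreement 0.75), while the chance-corrected agreement is lower, which is why kappa rather than raw agreement is usually reported.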