Optimizing biopsy decisions in PI-RADS 3 lesions: cross-institutional validation of a local clinical risk model

Our study demonstrates that a locally developed and calibrated risk model can safely reduce unnecessary biopsies in men with PI-RADS 3 lesions, achieving the highest net benefit among all evaluated strategies. Importantly, both normalized ADC and PSA density showed robust clinical utility in risk-averse settings.

Our findings provide a framework for tailoring biopsy decisions based on individual risk scenarios. The risk threshold represents the maximum acceptable probability of missing csPCa. In risk-averse settings—where avoiding missed csPCa is paramount and there is minimal concern about biopsy-related complications—clinicians and patients may opt for a lower threshold (e.g., 5%). In such cases, normalized ADC ≤ 0.81 or PSA density ≤ 0.08 ng/ml/ml could serve as viable alternatives to the full multivariable risk model. Conversely, in biopsy-averse scenarios—where minimizing unnecessary procedures and potential harms (e.g., overdiagnosis or complications) takes precedence—a higher threshold (e.g., 15%) may be more appropriate, with our multivariable risk model offering optimal guidance.
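To make the role of the risk threshold concrete, the following minimal sketch computes net benefit using the standard decision-curve definition (true positives per patient minus false positives per patient, weighted by the odds of the threshold). The function name and the cohort of 1000 men are illustrative assumptions; only the 8% csPCa prevalence is taken from this study, and the sketch is not the analysis code used here.

```python
def net_benefit(true_pos: int, false_pos: int, n: int, threshold: float) -> float:
    """Standard decision-curve net benefit of a biopsy strategy.

    true_pos  : biopsied patients who have csPCa
    false_pos : biopsied patients who do not have csPCa
    n         : all patients in the cohort, biopsied or not
    threshold : risk threshold (threshold probability), strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * odds


# Hypothetical cohort: 1000 men at the 8% csPCa prevalence reported here.
# "Biopsy all" detects all 80 cancers at the cost of 920 unnecessary biopsies.
n, cancers = 1000, 80
nb_all_at_5 = net_benefit(cancers, n - cancers, n, 0.05)
nb_all_at_15 = net_benefit(cancers, n - cancers, n, 0.15)

print(f"biopsy all, 5% threshold:  {nb_all_at_5:.4f}")
print(f"biopsy all, 15% threshold: {nb_all_at_15:.4f}")
```

In this hypothetical cohort, biopsying everyone has a positive net benefit at a 5% threshold but a negative one at 15%, where biopsying no one (net benefit 0) would be preferable. This is why a selective strategy matters more as the threshold rises.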

Our locally developed risk model outperformed previously published risk calculators despite using similar clinical parameters. This advantage stems from better fit and calibration to the local clinical environment, including shared imaging protocols, referral patterns, and patient demographics within the same metropolitan area. This improved performance was anticipated in our study design, as validation challenges with existing risk models suggested that localized approaches may provide more reliable predictions [8, 25]. However, implementing such models requires substantial infrastructure, including local databases for model fitting, ongoing monitoring, and periodic recalibration. This trade-off between optimal performance and implementation complexity must be considered when selecting a diagnostic strategy. Simpler approaches such as PSA density and normalized ADC offer readily implementable alternatives while maintaining good diagnostic performance in risk-averse settings.

Our model development strategy intentionally incorporated PI-RADS 4–5 lesions in the training cohort to leverage their higher csPCa prevalence for robust parameter estimation. This approach was based on the hypothesis that csPCa in PI-RADS 3 lesions shares fundamental clinical and biological characteristics with csPCa found in PI-RADS 4/5 lesions. To mitigate the potential spectrum bias introduced by this methodology, we implemented a dedicated recalibration step that adjusted risk predictions to align with the observed 8% csPCa prevalence within the PI-RADS 3 subgroup. Importantly, our validation cohort comprised exclusively patients with PI-RADS 3 index lesions, providing a stringent assessment of the model’s clinical utility in this specific target population, independent of the initial development approach.
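One common way to align predictions with a target prevalence is intercept-only recalibration (calibration-in-the-large), in which every prediction is shifted by the same constant on the logit scale. The sketch below illustrates that idea under stated assumptions: the text above does not specify which recalibration technique was used, so this is not necessarily our procedure, and the function name and input probabilities are hypothetical.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_intercept(probs, target_prevalence, tol=1e-10):
    """Shift all predictions by one constant on the logit scale so that
    their mean equals target_prevalence. Inputs must lie strictly in (0, 1).
    Returns (recalibrated probabilities, logit shift)."""
    logits = [logit(p) for p in probs]
    lo, hi = -20.0, 20.0
    # The mean prediction increases monotonically with the shift,
    # so bisection finds the unique solution.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_pred = sum(expit(z + mid) for z in logits) / len(logits)
        if mean_pred > target_prevalence:
            hi = mid
        else:
            lo = mid
    shift = (lo + hi) / 2.0
    return [expit(z + shift) for z in logits], shift


# Hypothetical raw predictions from a model trained on a higher-prevalence
# cohort, recalibrated to the 8% csPCa prevalence of the PI-RADS 3 subgroup.
raw = [0.10, 0.20, 0.30, 0.40]
recalibrated, shift = recalibrate_intercept(raw, 0.08)
print([round(p, 3) for p in recalibrated], round(shift, 3))
```

Because the shift is constant on the logit scale, the ranking of patients (and therefore discrimination) is unchanged; only the absolute risk levels move, which is what determines who falls above or below a given risk threshold.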

PSA density emerged as a particularly robust parameter, confirming findings from previous validation studies [7,8,9,10,11,12,13]. Our suggested optimal cutoff of 0.08 ng/ml/ml is lower than commonly reported cutoffs, reflecting our focus on a risk-averse setting aimed at minimizing missed csPCa cases. Previous studies have explored various PSA density thresholds. Of particular note are two large-volume cohorts: Pellegrino et al. found that a PSA density cutoff of ≥ 0.10 ng/ml/ml, corresponding to a threshold probability of 10%, would have avoided 32% of biopsies while missing 7% of csPCa cases [12]. At a threshold probability of 5%, their corresponding PSA density cutoff was 0.075 ng/ml/ml, which closely aligns with our threshold of 0.08 ng/ml/ml at the same risk threshold. Contrasting results are found in a recent multi-institutional cohort where a PSA density cutoff of 0.15 ng/ml/ml would have avoided 58% of biopsies while missing csPCa in 7%, achieving higher net benefit than 0.10 ng/ml/ml even at a 5% risk threshold [13].

Normalized ADC values showed comparable performance to PSA density and may overcome the limited reproducibility of absolute ADC measurements across different scanner platforms. Barrett et al. demonstrated that normalized ADC values (referred to as ADC ratio) are more robust to different b-value combinations and show a stronger inverse relationship with Gleason score compared with absolute ADC values [26]. Our findings support previous work by Hermie et al. using the same normalization approach for risk stratification in PI-RADS 3 lesions, though without suggesting a specific cutoff [11]. While results for normalized ADC in our and previous studies are promising, this parameter remains far less validated than PSA density, warranting further multi-institutional studies to confirm its robustness.

Our findings demonstrated poor predictive value of lesion volume for biopsy decisions in PI-RADS 3 lesions, aligning with Osses et al. [9]. While Lim et al. found MRI-lesion size differences between csPCa and non-csPCa in a mixed cohort, their modest AUC of 0.698 indicates limited clinical utility for biopsy decisions [10]. Similarly, though Ayranci et al. advocated using lesion size for PI-RADS 3 biopsy decisions, their low AUC of 0.585 for distinguishing PCa from benign tissue, combined with the lack of data on discriminating csPCa from the combined group of insignificant cancer and benign findings, undermines this recommendation [15]. Other authors have suggested volume thresholds based on the 0.5 ml cutoff commonly used in radical prostatectomy specimens to define csPCa [23, 27]. However, the accuracy of MRI-based measurements may be limited, particularly in PI-RADS 3 lesions where inflammatory changes can mimic tumor appearance and extent and thus limit the practical value of this parameter.

Our study has several important limitations. First, our use of PI-RADS v2.0 rather than the current v2.1 may affect generalizability to current practice. Second, the high proportion of patients with previous negative biopsies in our cohort differs from contemporary practice, where MRI typically precedes initial biopsy. As a result, our observed csPCa prevalence (8%) was markedly lower than that reported in contemporary cohorts (19% in meta-analyses; 95% CI 14–25%) [28]. This discrepancy limits the generalizability of our findings to MRI-first diagnostic pathways. In predominantly biopsy-naïve cohorts, fewer patients may safely avoid unnecessary biopsies, and optimal risk thresholds would require recalibration. Nonetheless, our study provides compelling proof-of-concept evidence that locally developed and calibrated risk models can outperform generic approaches. We encourage future research to build upon our methodological framework by tailoring risk models to institution-specific case mixes, thereby refining optimal thresholds for biopsy decisions in current practice.

Conclusions

A locally developed and calibrated risk model achieved the highest net benefit for biopsy decisions in PI-RADS 3 lesions, validated across two institutions within the same metropolitan area. While this approach requires infrastructure for model development, both normalized ADC and PSA density showed robust clinical utility as readily implementable alternatives in risk-averse settings.
