Evolving or immutable - phase I solid tumor trials in the era of precision oncology

Rule-based DES dominate phase I oncology trials to date, with 3 + 3 representing the primary dose escalation design [3]. While there are advantages to utilizing the 3 + 3 design, namely ease of use and safety (e.g., identification of clinically relevant toxicity and low rates of treatment-related death) [4], there are also important limitations. One of the primary limitations of the 3 + 3 design is that it is designed for CT with a monotonic relationship between dose, toxicity and anticipated response [5]. Under that assumption, defining the maximum tolerated dose (MTD) of an agent is crucially important for identifying the recommended phase 2 dose (RP2D) for subsequent drug development. Novel therapeutics such as TT and IO possess different intrinsic relationships between dose, toxicity and efficacy; the RP2D of such an agent may be the biologically effective dose (BED) rather than the MTD [6]. Another primary limitation of the 3 + 3 design is that it requires a prespecified attribution of toxicity for each dose level. This is easier to define for CT based upon the anticipated toxicity from a CT class (e.g., neuropathy and hematologic toxicity from platinum agents), but may be more unpredictable when attempting to define anticipated toxicities from TT or IO. In our analysis, TT (47.6%) and IO (22%) were more commonly utilized than CT (6.9%). Newer DES, which rely on Bayesian principles, allow emerging toxicity at a particular dose level to inform optimal toxicity thresholds. Our analysis reflects some shift in DES, with rule-based DES being utilized in 89% of studies (80.5% using the 3 + 3 design), compared with 96.7% of phase I oncology studies published between 1991 and 2006 [1, 7]. However, there is still room for improvement, and healthcare authorities have recognized this need for reform.
In 2021, the FDA Oncology Center of Excellence announced Project Optimus, an initiative to reform dose selection and optimization for novel therapeutic agents in the era of PO [8]. This initiative serves as an acknowledgement by the FDA that, in the modern therapeutic landscape, establishing the maximum tolerated dose by rule-based DES such as 3 + 3 may no longer be optimal for defining doses for further testing. Project Optimus is ongoing and encourages strategies such as trial designs that include model-based DES and comparison of multiple doses (e.g., parallel dose-response).
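To make the rule-based character of the 3 + 3 design concrete, the following is a minimal Python sketch of one commonly described textbook variant. The exact rules are not spelled out in this article and differ between protocols, so the thresholds, the function names (three_plus_three_decision, run_escalation) and the example toxicity counts are illustrative assumptions, not the rules used by any study in this analysis. DLT here stands for dose-limiting toxicity.

```python
def three_plus_three_decision(n_treated: int, n_dlt: int) -> str:
    """Next action at the current dose level under an assumed textbook 3 + 3 rule.

    n_treated: patients evaluated at this dose level (3 or 6)
    n_dlt:     patients at this level with a dose-limiting toxicity
    """
    if n_treated == 3:
        if n_dlt == 0:
            return "escalate"   # 0/3 DLTs: move to the next dose level
        if n_dlt == 1:
            return "expand"     # 1/3 DLTs: treat 3 more patients at this level
        return "stop"           # 2 or more of 3: dose exceeds the MTD
    if n_treated == 6:
        # 1/6 or fewer DLTs: escalate; 2 or more of 6: dose exceeds the MTD
        return "escalate" if n_dlt <= 1 else "stop"
    raise ValueError("cohorts are evaluated after 3 or 6 patients")


def run_escalation(dlts_per_cohort):
    """Walk through successive 3-patient cohorts (hypothetical DLT counts).

    Returns a list of (dose_level, n_treated, n_dlt, decision) tuples.
    """
    history = []
    level, treated, dlts = 1, 0, 0
    for cohort_dlts in dlts_per_cohort:
        treated += 3
        dlts += cohort_dlts
        decision = three_plus_three_decision(treated, dlts)
        history.append((level, treated, dlts, decision))
        if decision == "stop":
            break
        if decision == "escalate":
            level, treated, dlts = level + 1, 0, 0
    return history


# Hypothetical example: 0 DLTs, then 1 DLT, then 0 DLTs in the expansion, then 2 DLTs.
for step in run_escalation([0, 1, 0, 2]):
    print(step)
```

Note that the decision depends only on DLT counts at the current dose level. The rule never uses efficacy or pharmacodynamic data, and never pools toxicity information across dose levels, which is the gap that model-based DES are intended to address.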

Nearly half of studies included in this analysis did not report the race of participants (213/437 studies, 48.7%). It is challenging to draw meaningful conclusions about racial representation when a large proportion of trials do not report these data. Additionally, the absence of patient race identification precludes gathering data about possible differences in dosing for different subpopulations. Based on the predominance of white patients in these studies, it is unclear whether dosing information can be generalized to other races. While the reported gender of study participants was fairly balanced (51.7% male, 48.3% female), the majority of patients on studies in this analysis identified as white (61.7%), and only 6.5% of patients identified as black. The “other” category, which captured Hispanic or Latino patients, represented only 6.1% of patients. The greatest proportion of studies were based in North America (46.5%), and based on United States census data from July 2023, 75.5% of Americans identify as white, 13.6% of Americans identify as black, and 19.1% of Americans identify as Hispanic or Latino [9]. Based on our analysis, minority populations are still being underrepresented in clinical trials. Since 25.6% of analyzed studies were conducted on 2 or more continents, we would expect more diversity in the patient population. However, no trials were conducted on the African continent and few were conducted in South America, further restricting diversity.

Additionally, despite surveying a heavily pretreated patient population with a median of 3 prior lines of therapy, most studies (71.8%) required an ECOG performance status (PS) of 1. The functional status of the real-world population of patients who have received multiple therapies may not reflect the high-functioning population able to enter a trial, limiting generalizability of results. Notably, the subjective nature of PS may undermine its ability to accurately predict which patients are suitable for a particular therapy. Nonetheless, the ASCO-Friends of Cancer Research Performance Status Work Group conducted a simulation study which demonstrated that including relatively small numbers of ECOG 2 participants had only modest effects on treatment hazard ratio and study power, and that expanding eligibility may lead to shorter trial duration as a result of faster accrual. The working group has set forth recommendations to consider expanding PS eligibility [10]. This is important in the era of precision oncology, as novel agents with toxicities that differ from those of traditional cytotoxic chemotherapy may be more tolerable in frailer patients. A more precise definition of PS may also be helpful, such as Karnofsky PS, which has more categories than the more commonly used ECOG PS.

Among all lab value inclusion criteria analyzed except albumin, at least 40% of inclusion cutoff values were not clearly specified (Table 2). Clearly identifying the clinical characteristics required for entering a study is critical to accurately defining the study population eligible for a particular therapy. In the case of albumin, 93.1% of studies did not state its use as an inclusion criterion. Prior work evaluating mortality rates of patients in phase I trials has suggested that lower albumin levels (3.3 g/dL) are associated with higher rates of death within 90 days, independent of ECOG PS [11]. Lower albumin levels have been associated with decreased survival rates in numerous studies [12,13,14,15]. Additionally, a previous investigation of factors associated with precision oncology trial participation found a significant association between lower albumin levels and a lower likelihood of enrollment in genotype-matched trials [16]. Greater utilization of albumin as a study inclusion criterion may aid in appropriately excluding patients at high risk for clinical deterioration.

As the aim of phase I oncology trials has expanded beyond safety to signal-finding, ECs have been increasingly utilized [17]. Use of ECs and size of ECs (> 20 patients) in these trials have been associated with drug success in later stages of development [18]. Our data lend further credence to this idea, as the use of ECs was significantly associated with progression to phase II testing in multivariate analysis. However, we cannot account for the fact that planned ECs may have been eliminated when the drugs showed little sign of activity in the escalation phase, thus skewing our results. In this analysis, industry-funded studies were much more likely to include ECs than non-industry-funded studies (p < 0.01). Given the strong association of EC use in phase I trials with subsequent phase II testing, it is plausible that differential EC utilization was the underlying reason why phase I industry trials appeared more likely to lead to phase II testing than non-industry trials. Our analysis demonstrates the continued increase in utilization of ECs, with 40% of all studies including such cohorts (24% in prior analysis [17]). Among studies utilizing ECs, the primary objective was listed explicitly in 76.4% of cases. This represents an improvement from prior analyses in which the primary objective was not stated clearly. Despite the increasing utilization of ECs, sample size justification for the cohorts was not provided in most trials (71.1%). In the studies which justified EC size, a target ORR threshold was utilized in 20.3% of trials. FDA guidance for EC use in phase I trials suggests that clear sample size justification in the statistical analysis plan may facilitate more seamless development of a drug based on more concrete signals of anti-tumor activity [19]. In our estimation, this is a glaring area of weakness in current phase I oncology trial design that can be remedied quite easily.
Given the increasing use of biomarker selection for ECs (46.6% in our analysis), target response rate thresholds should be easier to estimate. Simple response threshold-based sample size justification (e.g., Simon’s two-stage design) should become a mainstay of statistical analysis plans for studies involving ECs, given this approach will optimize signal-finding in phase I oncology trials.
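As an illustration of what a response threshold-based justification involves, the sketch below computes the operating characteristics of a Simon two-stage design using only the Python standard library. The response rates (an uninteresting rate of 10% and a target rate of 30%) and the design parameters are hypothetical values chosen for illustration, not figures from this analysis. The function name simon_two_stage and its arguments are likewise assumptions. The design stops for futility if r1 or fewer responses are seen among the first n1 patients, and declares the drug promising only if more than r responses are seen among all n patients.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))


def simon_two_stage(r1: int, n1: int, r: int, n: int, p: float) -> dict:
    """Operating characteristics of a two-stage design at true response rate p."""
    n2 = n - n1
    # Probability of early termination: r1 or fewer responses in stage 1.
    pet = binom_cdf(r1, n1, p)
    # Probability of declaring the drug promising: the trial continues
    # (x1 > r1) and total responses x1 + x2 exceed r.
    p_promising = sum(
        binom_pmf(x1, n1, p) * (1.0 - binom_cdf(r - x1, n2, p))
        for x1 in range(r1 + 1, n1 + 1)
    )
    expected_n = n1 + (1.0 - pet) * n2
    return {"pet": pet, "p_promising": p_promising, "expected_n": expected_n}


# Hypothetical design: stop if <= 1/10 respond; promising if > 5/29 respond.
under_null = simon_two_stage(r1=1, n1=10, r=5, n=29, p=0.10)
under_alt = simon_two_stage(r1=1, n1=10, r=5, n=29, p=0.30)

print(f"Type I error:            {under_null['p_promising']:.3f}")
print(f"Power:                   {under_alt['p_promising']:.3f}")
print(f"Early stop prob. (null): {under_null['pet']:.3f}")
print(f"Expected N (null):       {under_null['expected_n']:.1f}")
```

Under these assumed inputs, the type I error stays below 5% and the power is above 80%, with an expected enrollment of about 15 patients if the drug is inactive. A statistical analysis plan could report exactly these quantities as the justification for EC size.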

Limitations

We note the following limitations in our analysis. First, hematologic malignancies were not represented in this analysis given the inherent differences between most drugs used in hematologic versus solid tumor malignancies. It is an established practice to separate malignancies in this fashion in previously published studies. Second, we acknowledge the possibility of discrepancies in methodology between the two data abstractors. This was mitigated by an independent third reviewer who established 98.7% concordance between the work of the primary abstractors. Third, we relied on trial information presented in publications and published on clinicaltrials.gov instead of study protocols, given the variability in access to full-length study protocols. Since data were only considered from published manuscripts, there was inherent publication bias (towards positive studies) in the analysis. To mitigate this risk, we included a diverse array of journals (with impact factors ranging from low to high) with a track record of publishing phase I studies. Fourth, we restricted studies with drug combination therapies to those in which only the study drug dose was escalated and data were provided for one dose-escalation cohort; we excluded phase I/II studies and studies with multiple dose cohorts reported separately or multiple drug combinations reported separately. We may have underrepresented more recent novel therapeutics by excluding these types of studies.

However, we restricted our analysis to phase I-only trials in order to assess the evolving landscape of these trials as directly as possible. The number of studies included in the analysis (N = 437) reduces the concern about the generalizability of our conclusions.
