The estimated annual financial impact of gene therapy in the United States

Human subjects and code availability

This study is exempt from IRB review because it uses public, de-identified aggregate data. The code written to generate all estimates is available from the study authors upon request.

Summary of methods

As outlined in Fig. 1, we first identify all existing late-stage clinical trials of gene therapies, that is, those in phase 2/3 or phase 3. We then estimate each trial’s likelihood of success and year of approval, and compute spending on the successful therapies by summing the product of their expected prices and number of patients. We describe the separate tasks required for our analyses in the following subsections: (1) identification of the number of gene therapies currently in the clinical trial process and their associated diseases and therapeutic areas; (2) estimation of the probabilities of success of these trials; (3) estimation of the time to approval; (4) simulation of the expected number of patients treated by these therapies if approved; and (5) estimation of the expected market prices of the approved therapies.

Fig. 1: A flowchart showing the performance of the simulation.

After extracting the information on each disease from the clinical trial databases, we simulate whether the disease will obtain an approval. If it fails to do so, the simulation ends for this disease in this iteration. Otherwise, we estimate the expected number of patients to be treated, compute the corresponding cost of treatment, and store the results. At each step of the computation, we source data from the published literature and impute missing information.

Gene therapies in clinical trials and associated diseases and therapeutic areas

We use clinical trial metadata from the Citeline Trial Trove database and the U.S. National Library of Medicine’s ClinicalTrials.gov database to determine the number of gene therapies currently under development. We downloaded data from the Citeline database and isolated any trials tagged with ‘gene therapy’ under the ‘therapeutic class’ field. We supplemented this information by searching for trials on the ClinicalTrials.gov main page using the keywords ‘gene therapy’, and then reading the trial description to determine if the trial was related to a gene therapy. All database queries were made on or before December 31, 2019. Clinical trials from both sources were merged before filtering for those clinical trials that were in either phase 2/3 or phase 3 of the development process and were not known to be ‘compassionate uses’ of the treatment. Compassionate use refers to the administration of investigational treatments outside of a clinical trial to treat patients with serious or immediately life-threatening diseases or conditions when there are no comparable or satisfactory alternative treatment options. We exclude compassionate use from this study because those results are rarely used as data points in the clinical development process, and such uses often occur outside of clinical trial settings [14, 15]. Clinical trials without a U.S. trial site were included in the dataset because the FDA can grant marketing approval using evidence from foreign clinical trials under 21 CFR 312.120 [16]. We removed repeated entries of the same trial. We then identified diseases, therapeutic areas, and patient ages targeted by each gene therapy.

This process yielded 109 unique trials investigating 57 distinct diseases, listed in Table A1 in the Supplementary Materials. We classified diseases into three categories: cancer (oncology), rare disease, and general disease. The distribution of diseases and the clinical trials by category and therapeutic area are shown in Table A1. Most trials and diseases were categorized in the area of oncology, followed by rare diseases. These therapeutic areas are notoriously risky for development. Only 3.1% of drug development programs in oncology and 6.2% in rare diseases go from phase 1 to approval, compared to the baseline of 13.8% across all drugs and indications [5].

Probability of success estimates simulation

We define a gene therapy development program as a set of clinical trials conducted by a sponsor when testing a therapeutic for efficacy against a disease. We considered whether a gene therapy would be developed for a disease by simulating correlated random successes for each gene therapy program and observing whether at least one approval took place. This computational method assumes that clinical trials are always perfectly correlated within the same development program. It can be argued that different gene therapy treatments for a disease are highly correlated, since they operate on similar platforms (e.g., CAR-T or in-vivo gene delivery using adeno-associated virus vectors), even though different gene sequences may be targeted. To reflect this association, we assumed a correlation of 90% between development programs in our simulation. A sensitivity analysis, however, demonstrated that our computations are insensitive to this parameter.
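As an illustration of how correlated program outcomes could be simulated, the following is a minimal sketch, not the study's code. It assumes a one-factor Gaussian copula, with the 90% correlation applied to latent normal variables rather than to the binary outcomes themselves; the text does not specify the mechanism, and the function name and arguments are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def simulate_disease_approval(pos, n_programs, rho=0.9, n_iter=100_000, seed=0):
    """Estimate the probability that at least one program for a disease succeeds.

    pos        -- phase 3 to approval probability of success for each program
    n_programs -- number of development programs targeting the disease
    rho        -- assumed correlation between the programs' latent variables
    """
    rng = np.random.default_rng(seed)
    # A program succeeds when its latent standard normal falls below this cut-off,
    # so that each program succeeds with marginal probability `pos`.
    threshold = NormalDist().inv_cdf(pos)
    # One shared factor per iteration plus one idiosyncratic shock per program.
    common = rng.standard_normal((n_iter, 1))
    idiosyncratic = rng.standard_normal((n_iter, n_programs))
    latent = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * idiosyncratic
    success = latent < threshold
    # The disease obtains a therapy if at least one program is approved.
    return success.any(axis=1).mean()


p_single = simulate_disease_approval(0.285, n_programs=1)
p_three = simulate_disease_approval(0.285, n_programs=3)
```

With a high correlation, adding programs raises the chance of at least one approval far less than it would under independence, which is one way to see why the results may be insensitive to moderate changes in this parameter.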

The phase 3 to approval probability of success (PoS3A) for each disease was informed by prior studies on the probabilities of success by therapeutic area of drug development programs from the MIT Laboratory for Financial Engineering’s Project ALPHA website [17]. These estimates were derived from over 55 thousand drug development programs between January 2000 and January 2020, and computed using the path-by-path method introduced in Wong et al. [5]. The PoS3A values used in this study’s simulations are as follows: Autoimmune/Inflammation, 48.5%; Cardiovascular, 50.1%; Central Nervous System (CNS), 37.0%; Metabolic/Endocrinology, 45.7%; Oncology, 28.5%; and Ophthalmology, 45.9%. The mapping of diseases to therapeutic areas is shown in Table A2.

Time to approval simulation

An estimate of the time to approval for gene therapy treatments was determined in order to assess the patient impact and cost over time of the treatment. Gene therapies require approval from the FDA through the biologics licensing application (BLA) pathway. Typically, companies submit a BLA to the FDA after the end of the clinical trial period. Our estimate assumed that the time between the end of the last clinical trial for the disease and the submission of the BLA was a variable drawn from a triangular distribution between 0 and 365 days, with a median of 182.5 days. This was informed by the practical knowledge that it takes an average of 6 months to prepare the documents for the BLA submission [5].

There is an additional lag time between the submission of the BLA and the FDA’s decision. The FDA has 60 days to decide if it will follow up on a BLA filing [18], and it can take another 10 months to deliver its decision [19]. This implies the maximum possible time between BLA submission and FDA approval will be 12 months. Thus, our estimate assumed that the time between the BLA submission and the FDA decision would also be drawn from a triangular distribution between 0 and 365 days, with a median of 182.5 days. These assumptions are also valid for therapies that use the priority review pathways. This estimate also assumed that the BLA would be filed only after the last clinical trial for a disease had ended. Trials with missing declared end dates had their end dates imputed by adding random durations to the trial start date, drawn from a gamma distribution fitted to clinical trials with complete date information in the data (see Fig. 2).
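The two lags and the end-date imputation can be sketched as follows. This is a hedged illustration under stated assumptions: the gamma shape and scale are placeholder values, not the fitted parameters, and the function names are hypothetical. For a symmetric triangular distribution the mode equals the median, so 182.5 days is passed as the mode.

```python
import numpy as np


def simulate_approval_lag_days(n_iter, rng):
    """Days from the end of the last trial to the FDA decision."""
    # Trial end to BLA submission: triangular on [0, 365] with mode 182.5 days.
    trial_to_bla = rng.triangular(0.0, 182.5, 365.0, size=n_iter)
    # BLA submission to FDA decision: same distribution, drawn independently.
    bla_to_decision = rng.triangular(0.0, 182.5, 365.0, size=n_iter)
    return trial_to_bla + bla_to_decision


def impute_trial_duration_days(n_missing, rng, shape=2.0, scale=400.0):
    """Draw durations for trials with missing end dates.

    shape and scale are PLACEHOLDERS; the study fits them to trials
    with complete date information.
    """
    return rng.gamma(shape, scale, size=n_missing)


rng = np.random.default_rng(0)
lags = simulate_approval_lag_days(100_000, rng)
durations = impute_trial_duration_days(10, rng)
```

Under these assumptions the total lag is bounded by two years and averages about one year.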

Fig. 2: The empirical distribution of duration against our fitted gamma distribution.

Diseases with a prior approved therapy were automatically considered to be approved as of December 31st, 2020. For some diseases, their last clinical trial ended before January 2017, and no subsequent approval or product launch was observed. Diseases that matched this criterion were treated as though they had failed.

Number of patients simulation

This simulation captures the number of new and existing patients treated over time, conditioned on the disease receiving an approved gene therapy. We considered only the superset of patient segments listed in the clinical trials for each disease. For example, if there were two clinical trials, one targeting ‘patients above the age of 40’ and the other targeting ‘patients above the age of 18’, only the latter was considered when estimating the patient population for the disease. If insufficient information about the sub-population was given, it was assumed that all the patients with that disease were eligible.

Incidence and prevalence

We searched medical journals and online data repositories, such as the Surveillance, Epidemiology, and End Results (SEER) website and cancer.net, for the number of currently affected patients and the number of new patients per year for each indication. If this search yielded a direct estimate of the patient population, we used it in our model; otherwise, we multiplied the prevalence and incidence rates of the disease by the population of the U.S., which was assumed to be 327.7 million [20].

In cases where estimates for the disease incidence were available but not the prevalence, we combined the incidence of the disease (i.e., i new patients a year) and the disease survival rate (i.e., p% of the people with a disease will be alive after k years) to obtain the steady-state estimate of the prevalence (j) using Eq. 1. Alternatively, there were diseases for which the prevalence was available but not the incidence. In these cases, we estimated the incidence from the prevalence by rearranging Eq. 1 to yield Eq. 2.

$$\text{Prevalence} \, (j)=\frac{ki}{1-p}$$

(1)

$$\text{Incidence} \, (i)=\frac{j(1-p)}{k}$$

(2)

To do so, we assumed that the number of patients would be constant through the years at a level j (that is, ki new patients are added over k years and j(1 − p) patients will die over the same period, and therefore ki = j(1 − p) will determine the number of patients that is constant over time). The number of patients for each disease is presented in Table A3. These estimates were adjusted to avoid double-counting in cases of overlapping patient populations, e.g., the number of patients for ‘Spinal Muscular Atrophy’ is the difference between ‘Spinal Muscular Atrophy’ and ‘Spinal Muscular Atrophy I’ (a sub-category of the former).
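The steady-state relation ki = j(1 − p) can be checked with a short sketch. The function names and the example figures (1,000 new patients a year, 60% five-year survival) are illustrative assumptions, not values from the study.

```python
def prevalence_from_incidence(i, p, k):
    """Steady-state prevalence j from ki = j(1 - p).

    i -- new patients per year
    p -- fraction of patients alive after k years (0 <= p < 1)
    k -- horizon of the survival rate, in years
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("survival fraction p must lie in [0, 1)")
    return k * i / (1.0 - p)


def incidence_from_prevalence(j, p, k):
    """Steady-state incidence i from the same relation, rearranged."""
    return j * (1.0 - p) / k


# Hypothetical disease: 1,000 new patients a year, 60% alive after 5 years.
j = prevalence_from_incidence(1_000, 0.60, 5)
i_back = incidence_from_prevalence(j, 0.60, 5)
```

The round trip from incidence to prevalence and back recovers the starting incidence, which confirms that the two expressions are rearrangements of one another.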

Treatment of patients over time simulation

This simulation assumes that newly diagnosed patients were treated immediately upon diagnosis and that the proportion of existing patients who seek treatment do so in such a way that the existing stock of patients will decline exponentially, with a half-life of λ. Mathematically, the proportion of existing patients that seek treatment between time t and t + δ after approval is given by E (t, δ, λ), where:

$$E\left(t,\delta ,\lambda \right)={e}^{-\frac{t\,\ln 2}{\lambda }}-{e}^{-\frac{\left(t+\delta \right)\,\ln 2}{\lambda }},\quad t \, > \, 0$$

(3)

We assumed that 25% of the existing stock of patients would seek treatment in the first year of this simulation. This required that the half-life be set to 28.91 months, which in turn implied that 95% of all patients who were diagnosed prior to the approval of the gene therapy would want treatments within 10.5 years. A sensitivity analysis was performed on this assumption to determine its impact on the results.
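The half-life quoted above follows from the exponential-decay assumption. As a sketch, reading E(t, δ, λ) as exp(−t ln 2/λ) − exp(−(t + δ) ln 2/λ) with time in months, the function names below being hypothetical:

```python
import math


def proportion_seeking(t, delta, lam):
    """Share of the initial patient stock seeking treatment between t and t + delta."""
    decay = math.log(2.0) / lam
    return math.exp(-t * decay) - math.exp(-(t + delta) * decay)


def half_life_from_first_year(fraction, months=12.0):
    """Half-life (months) such that `fraction` of the stock is treated within `months`."""
    # Solve 1 - 2 ** (-months / lam) = fraction for lam.
    return -months * math.log(2.0) / math.log(1.0 - fraction)


lam = half_life_from_first_year(0.25)        # about 28.91 months
first_year = proportion_seeking(0.0, 12.0, lam)
months_to_95 = lam * math.log2(20.0)         # time until 5% of the stock remains
```

Setting the first-year share to 25% gives a half-life of about 28.91 months, and 95% of the stock is reached after roughly 125 months, in line with the figure of about 10.5 years.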

Patient penetration simulation

It is unlikely that all patients with a prevalent case of a disease will receive gene therapy treatments. This may be due to ineligibility, or lack of awareness of the treatment, among other reasons. We labelled the percentage of the patients that received gene therapy treatments in any given period as the ‘patient penetration rate,’ and modelled this rate using a ramp function, ρ (t, Θmax, Tmax). The ramp function is frequently used by industry to model the rate of adoption of a product or technology [11]. It is given by:

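$$\rho \left(t,{\Theta }_{\max },{T}_{\max }\right)=\left\{\begin{array}{ll}\frac{{\Theta }_{\max }\,t}{{T}_{\max }}, & 0\le t\le {T}_{\max }\\ {\Theta }_{\max }, & \text{otherwise}\end{array}\right.$$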

(4)

We assumed different ramp functions for diseases belonging to our three categories: rare disease, ‘general’ or chronic disease, and cancers. For rare diseases, we assumed that more patients, faced with improved prospects of survival, would be willing to enrol in new treatments quickly after approval. In addition, since the number of patients with individual rare diseases is relatively small, insurers may be more willing to cover these therapies and manufacturers more able to cope with a larger proportion of patients. Given this, the mean of the maximum penetration rate, µθ, was assigned a high value of 40%, and the mean of the ramp-up period, µT, was assigned a low value of 6 months.

On the other hand, many chronic diseases in our general category are seldom deadly, while affecting a larger number of patients, even in the millions. Since an acceptable standard of care is often available for these conditions, patients may be less inclined to use new treatments due to a lack of certainty in benefit and durability. Thus, this study assumed that the maximum penetration rate for gene therapies approved to treat general chronic diseases would be 1%, and the ramp-up period, 5 years.

As an intermediate case, cancers have characteristics that fall between these two extremes, but in general, they are more like the rare disease category. We therefore assigned values of 10% to the maximum penetration rate and 12 months to the ramp-up period. All variances were set to 10% of their means in order to model a moderate level of uncertainty in our numbers. This assumption did not affect our mean estimates of the number of affected patients or spending on gene therapy.

The net number of patients to be treated for the disease at time t after the approval of a gene therapy is given by:

$$\text{Patients}_{t}=\rho \, (t,{\Theta }_{\max },{T}_{\max })\times [\text{New} \, \text{Patients}_{t}+E (t,\delta ,\lambda ) \times \, \text{Existing Patients}_{0}]$$

(5)

We did not consider the effect of market competition among different therapies for the same disease, or the effect of patient type on the expected number of treated patients. This is in part because it is hard to determine the expected patterns of use without an existing approval. Instead, for each treatment-disease pair, we modelled the fraction of the population that would be eligible for treatment, assuming independence.
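The number of treated patients in a given month can be sketched by combining the ramp function with the exponential drawdown of the existing stock. This is an illustration under stated assumptions: time is measured in months, the existing stock is taken as the stock at approval, and the example patient counts are hypothetical. The category values are the means given in the text (rare disease: 40% and 6 months; general disease: 1% and 5 years; cancer: 10% and 12 months).

```python
# Mean ramp parameters per category: (maximum penetration, ramp-up period in months).
RAMP_MEANS = {
    "rare": (0.40, 6.0),
    "general": (0.01, 60.0),
    "cancer": (0.10, 12.0),
}


def ramp(t, theta_max, t_max):
    """Penetration rate: linear rise to theta_max at t_max, flat afterwards."""
    return theta_max * min(max(t, 0.0) / t_max, 1.0)


def proportion_seeking(t, delta, lam):
    """Share of the stock at approval seeking treatment between t and t + delta."""
    return 2.0 ** (-t / lam) - 2.0 ** (-(t + delta) / lam)


def patients_treated(t, new_patients, existing_at_approval, category,
                     lam=28.91, delta=1.0):
    """Net patients treated in the period starting t months after approval."""
    theta_max, t_max = RAMP_MEANS[category]
    seeking = new_patients + proportion_seeking(t, delta, lam) * existing_at_approval
    return ramp(t, theta_max, t_max) * seeking


# Hypothetical rare disease: 50 new patients a month, 10,000 existing at approval.
treated_month_12 = patients_treated(12.0, 50.0, 10_000.0, "rare")
```

Because the ramp multiplies both the new and the existing patients, early months contribute little even when the existing stock is large.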

Expected market pricing simulation and QALYs gained from gene therapy treatment

The cost to the healthcare system of providing the gene therapy for a disease for all patients treated at time t after approval is given by C(t), where

$$C\left(t\right)=\text{Patients}_{t}\times \text{Price per Patient}$$

(6)

The price of each treatment is crucial to computing the expected total spending, and is a source of considerable uncertainty because the gene therapies that are the subject of this analysis are generally not yet approved, and consequently not yet priced by their manufacturers. We address this uncertainty by estimating expected prices using well-established methods. The Institute for Clinical and Economic Review (ICER) is an independent nonprofit organization that evaluates the clinical and economic value of healthcare innovation. ICER calculates the expected prices of new therapies based on the relative benefits and costs to the patient reported in pivotal clinical trials. ICER does this by comparing the expected quality-adjusted life-years (QALYs) with and without the treatment, then multiplying the difference in QALYs (∆QALY) by a constant value of a life-year gained, typically set between $50 thousand and $150 thousand per ∆QALY [21].

$$\text{Price per Patient}=\text{Price per } \Delta \text{QALY}\times \Delta \text{QALY}$$

(7)

At the time of our analysis, ICER had published several reports containing estimates of QALYs gained by patients treated with existing gene therapies, including those with vision loss associated with biallelic RPE65-mediated retinal disease following treatment with voretigene neparvovec® [2], and with SMA Type I following treatment with onasemnogene abeparvovec-xioi® [22]. These reports computed the ∆QALY using the results of the clinical trials that formed the basis for FDA approval to estimate the potential improvements in the quality of life and life expectancy of treated patients. Replicating the ICER method for all the clinical trials under consideration in this paper was infeasible, since most of these trials had neither been completed nor reported pivotal trial results for FDA approval during the timeframe of our analysis. As an alternative, we developed a mathematical model based on a modification of the ICER method, calibrated using the pricing of gene therapies currently approved for use in the U.S. market, to estimate the expected increase in QALYs from gene therapy for each disease in our sample.

The Appendix describes our method for estimating the expected QALYs gained from gene therapy. The next subsection describes our method of estimating the price per QALY gained from gene therapy.

Price per ∆QALY

To estimate as realistic a market price of new gene therapy as possible, we calibrated our assumed price per ∆QALY with the two currently available gene therapies priced in the U.S. market as of January 2020: onasemnogene abeparvovec-xioi, priced at $2.1 million per patient [23], and voretigene neparvovec, priced at $0.425 million per eye treated [24]. Separately, betibeglogene autotemcel, marketed as Zynteglo and sold at a cost of 1.6 million Euros (approximately $1.8 million), had been approved in the European Union at the time of our analysis, and was approved in the U.S. in August 2022 with a price of $2.8 million for a one-time dose. To improve the precision of our estimates, we added the two CAR-T therapies also approved and available in the U.S. market: tisagenlecleucel, marketed as Kymriah and approved in 2017, priced at $0.475 million for a one-time dose [25], and axicabtagene ciloleucel, marketed as Yescarta and approved in 2019, priced at $0.373 million for a one-time dose [25]. We calibrated the price per ∆QALY by minimizing the mean-squared error (MSE) between the estimated price given the expected change in QALY and the actual price. We reported the mean absolute percentage error (MAPE) between the estimated price and the actual price in addition to the MSE. To account for potential ∆QALY differences between the gene therapies and the CAR-T therapies, we performed two separate calibrations. We assumed that the price per ∆QALY for general diseases was identical to that for cancerous indications.

Considering only the therapies approved in the U.S. through January 2020, we estimated a price per E(∆QALY) of $101,663 (MSE: 2.18 × 10^9, MAPE: 11.2%) for rare diseases and $40,797 (MSE: 1.77 × 10^10, MAPE: 44.2%) for other diseases. Using all the data points, the price per E(∆QALY) for rare diseases increases to $114,781 (MSE: 1.70 × 10^12, MAPE: 108%). In this paper, we used the former value in our calculations, since it has a smaller MSE and better reflects current prices in the U.S. This value gives us pricing estimates of $2.09 M per patient for onasemnogene abeparvovec-xioi and $0.470 M per eye for voretigene neparvovec, which are consistent with the prices we observe in the real world.
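The rare-disease calibration can be approximated with a closed-form least-squares fit through the origin. This is a sketch under assumptions rather than the study's procedure: it assumes a single proportional coefficient with no intercept, and it uses the rounded incremental QALY estimates reported later in this section (20.56 and 4.63), so the result differs slightly from the published $101,663.

```python
def calibrate_price_per_qaly(prices, delta_qalys):
    """Coefficient c minimizing the squared error of price = c * delta_qaly."""
    numerator = sum(p * q for p, q in zip(prices, delta_qalys))
    denominator = sum(q * q for q in delta_qalys)
    return numerator / denominator


# Observed U.S. prices: onasemnogene abeparvovec-xioi, voretigene neparvovec (per eye).
prices = [2_100_000.0, 425_000.0]
# Rounded expected incremental QALYs from the model for the two indications.
delta_qalys = [20.56, 4.63]

price_per_qaly = calibrate_price_per_qaly(prices, delta_qalys)
estimated_prices = [price_per_qaly * q for q in delta_qalys]
```

With these inputs the fitted coefficient is close to $101,600 per ∆QALY, and the implied prices are approximately $2.09 million and $0.47 million.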

Our calibrated price per E(∆QALY) for cancerous indications is just slightly below ICER’s $50 thousand to $100 thousand range for ‘intermediate care value’. The higher price per E(∆QALY) for rare diseases reaffirms the general belief that developers of treatments for rare diseases should be compensated more for their elevated research and development risk and the lower financial prospects of serving a small population of patients. It is assumed that the clinical cost of delivering the gene therapy is a negligible fraction of the overall cost of development (though it is considerably higher than the delivery cost of conventional therapeutics). It is also likely that the outside option cost will be similar.

The expected increases in QALY computed by our model were also close to those reported by ICER for these treatments [1, 2]. For example, we estimated that treatments for Spinal Muscular Atrophy Type 1 and Leber Congenital Amaurosis due to RPE65 Mutations provided 20.56 and 4.63 incremental QALYs, whereas ICER estimates onasemnogene abeparvovec-xioi and voretigene neparvovec to provide 12.23 to 26.58 and 1.3 to 2.7 incremental QALYs, respectively.

ICER also provides a range of ∆QALY estimates corresponding to different age groups. To compare our estimates, we followed their definition of age groups: a minor is defined as a patient below the age of 18, and an elderly patient as one who is older than 62 years old. The remaining patients were defined as adults. We used the distribution of ages to produce a weighted average estimate. To estimate the expected changes in QALY for Spinal Muscular Atrophy Type 1 and Leber Congenital Amaurosis due to RPE65 Mutations, we deliberately applied the same methods and assumptions used for all other diseases, even though these numbers were directly available from ICER reports. This calibration of price per ∆QALY corrects for potential biases in our data and, as a result, allows our price estimates to be more realistic.

Finally, one million iterations of our simulation were performed to compute the mean number of gene therapy patients and their total spending. At this number of iterations, the computed mean was expected to be within 1.89% of the true mean 95% of the time. The 5th and 95th percentiles of the computed values were reported as the lower and upper bounds, respectively.
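One common way to summarize simulation output of this kind is sketched below. It assumes the precision statement refers to a normal-approximation 95% interval for the mean, which the text does not confirm, and it uses placeholder lognormal draws in place of the study's simulated spending.

```python
import numpy as np


def summarize(samples, z=1.96):
    """Mean, relative 95% half-width of the mean, and 5th/95th percentiles."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    # Half-width of the normal-approximation confidence interval for the mean.
    half_width = z * samples.std(ddof=1) / np.sqrt(samples.size)
    return {
        "mean": mean,
        "relative_error_95": half_width / abs(mean),
        "lower_bound": np.percentile(samples, 5),
        "upper_bound": np.percentile(samples, 95),
    }


rng = np.random.default_rng(0)
# PLACEHOLDER draws standing in for per-iteration total spending.
demo = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)
stats = summarize(demo)
```

The relative error shrinks with the square root of the number of iterations, so quadrupling the iterations halves it.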

Sensitivity analyses

To test the sensitivity of our results to initial conditions and assumptions, we simulated ±20% changes in the following variables and analyzed their impact on our results:

1. The maximum penetration rate in the ramp function, Θmax

2. The time to maximum penetration rate in the ramp function, Tmax

3. The amount of QALY gained in each disease

4. The price per ∆QALY

5. The phase-3-to-approval probability of success (PoS3A)

6. The number of new patients of each disease

7. The number of existing patients of each disease

8. The time from phase 3 to BLA

9. The time from BLA to approval

For each of these factors, we considered its impact on the peak monthly spending and on the cumulative spending on patient treatment from January 2020 to December 2034. We also explored how the variables might change the timing of peak monthly spending. Additional details are provided in the Appendix.
