Master statistical analysis plan: attractive targeted sugar bait phase III trials in Kenya, Mali, and Zambia

Prevalence outcomes

The prevalence of malaria infection among participants aged 6 months and older, detected by RDT, will be analyzed using a multi-level (variance components) model constructed on a generalized linear model framework with a Bernoulli likelihood and a logit link function. Random intercepts will be included for each study cluster, and the study arm will be included as a fixed effect coded categorically as 0 for arm A and 1 for arm B. The analyst will be blinded to the true assignment until the results are presented. The model will take the form below, where \(p_{ij}\) is the probability of positivity at the individual level (i indexes individuals within clusters and j indexes clusters), \(\alpha\) is the global intercept, \(X_{ij}\) is the arm assignment for individual i in cluster j, \(\beta^{arm}\) is the arm effect to be estimated, \(u_j\) are random intercepts for the cluster, and \(\sigma\) is the standard deviation of the random intercept distribution:

$$\mathrm{logit}(p_{ij})=\alpha +\beta^{arm}X_{ij}+u_j$$

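where the likelihood is of the form (with \(y_{ij}\) denoting the binary RDT result for individual i in cluster j):

$$y_{ij} \sim \mathrm{Bernoulli}(p_{ij})$$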

And the random intercepts are assumed to follow a normal distribution:

$$u_j \sim N\left(0,\sigma^2\right)$$

Model results will be presented as the estimates of \(e^{\alpha}\) and the odds ratio \(e^{\beta^{arm}}\) from the model above, together with the standard deviation or variance of the random effects distribution. 95% confidence intervals for the odds ratio and \(e^{\alpha}\) estimates, as well as z-statistics and p-values for each coefficient, will be presented.
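As an illustration of how a logit-scale coefficient maps to the reported quantities, the following minimal sketch converts an arm coefficient and its standard error into an odds ratio, a Wald 95% confidence interval, a z-statistic, and a two-sided p-value. The function name and the numeric inputs are hypothetical and are not trial results; the Wald interval is an assumption made for illustration, since the plan does not state which interval method will be used.

```python
import math

def odds_ratio_summary(beta, se, z_crit=1.959964):
    """Summarise a logit-scale coefficient (beta) with standard error (se).

    Returns the odds ratio exp(beta), a Wald 95% confidence interval,
    the z-statistic, and a two-sided normal p-value.
    """
    z = beta / se
    return {
        "odds_ratio": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "z": z,
        # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
        "p_value": math.erfc(abs(z) / math.sqrt(2)),
    }

# Hypothetical arm coefficient and standard error (illustration only).
summary = odds_ratio_summary(beta=-0.30, se=0.12)
print(summary)
```

The same transformation applies to the intercept, where exp(alpha) is the odds of positivity in arm A for a cluster whose random intercept equals zero.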

Routine clinical incidence

The incidence of clinical malaria obtained from passive case detection will be analyzed as total incidence using a generalized linear model framework with a Poisson likelihood and a log link function. The incidence will be summed for all months of follow-up within each study cluster, and the study arm will be included as a fixed effect coded categorically as 0 for arm A and 1 for arm B. Exposure will be the population of the cluster as assessed during enumeration. The analyst will be blinded to the true assignment until the results are presented. The model will take the form below, where \(y_i\) is the total incidence at the cluster level where only aggregated data are available (i indexes clusters), \(\alpha\) is the global intercept, \(X_i\) is the arm assignment for cluster i, \(\beta^{arm}\) is the arm effect to be estimated, \(\mathrm{exposure}_i\) is the person-time at risk for cluster i, and \(\lambda_i\) refers to \(\log E(y_i)\):

$$\log E(y_i)=\alpha +\beta^{arm}X_i+\log(\mathrm{exposure}_i)$$

where the likelihood is of the form:

$$y_i \sim \mathrm{Poisson}\left(e^{\lambda_i}\right),\quad \lambda_i=\log E(y_i)$$

Model results will be presented as the estimates of \(e^{\alpha}\) and incidence rate ratios (IRR) from the model above and, where random effects are included, the standard deviation or variance of the random effects distribution. 95% confidence intervals for the IRR and \(e^{\alpha}\) estimates, as well as z-statistics and p-values for each coefficient, will be presented. Results will be reported as incidence rates and incidence rate ratios with their associated 95% confidence intervals and p-values.
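For a Poisson model with a log link, a log-exposure offset, and a single binary arm covariate, the maximum likelihood estimates have a closed form: each arm's rate is total cases divided by total exposure, and the variance of the log IRR is the sum of the reciprocals of the two arms' case totals. The sketch below uses that closed form with hypothetical cluster counts and populations; the function name and data are illustrative assumptions, and the calculation does not account for any between-cluster variation beyond the Poisson assumption.

```python
import math

def cluster_level_irr(cases_a, exposure_a, cases_b, exposure_b, z_crit=1.959964):
    """Incidence rate ratio (arm B vs. arm A) from cluster-aggregated counts.

    cases_* are lists of summed cases per cluster; exposure_* are the
    matching cluster populations (or person-time) used as the offset.
    """
    total_a, total_b = sum(cases_a), sum(cases_b)
    rate_a = total_a / sum(exposure_a)
    rate_b = total_b / sum(exposure_b)
    irr = rate_b / rate_a
    # Wald standard error of log(IRR) under the Poisson likelihood.
    se_log_irr = math.sqrt(1 / total_a + 1 / total_b)
    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "irr": irr,
        "se_log_irr": se_log_irr,
        "ci_low": math.exp(math.log(irr) - z_crit * se_log_irr),
        "ci_high": math.exp(math.log(irr) + z_crit * se_log_irr),
    }

# Hypothetical cluster totals (illustration only, not trial data).
result = cluster_level_irr(
    cases_a=[120, 95, 140], exposure_a=[1000, 900, 1100],
    cases_b=[80, 70, 90], exposure_b=[1000, 950, 1050],
)
print(result)
```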

The outcome will also be checked for the distributional assumption that the mean and variance of the outcome are similar after conditioning on a cluster (e.g., are the within-cluster mean and variance similar); if the variance is substantially larger, a negative binomial likelihood will be considered.

Where individual-level data are available for this outcome, a similar approach will be followed, but focused instead on cumulative incidence and using a variance components model. The model will take the form below, where \(y_{ij}\) is incidence at the individual level (i indexes individuals within clusters and j indexes clusters), \(\alpha\) is the global intercept, \(X_{ij}\) is the arm assignment for individual i in cluster j, \(\beta^{arm}\) is the arm effect to be estimated, \(u_j\) are random intercepts for the cluster, \(\mathrm{exposure}_{ij}\) is the person-time at risk for individual i in cluster j, \(\lambda_{ij}\) refers to \(\log E(y_{ij}|u_j)\), and \(\sigma\) is the standard deviation of the random intercept distribution:

$$\log E\left(y_{ij}|u_j\right)=\alpha +\beta^{arm}X_{ij}+u_j+\log(\mathrm{exposure}_{ij})$$

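where the likelihood is of the form:

$$y_{ij} \sim \mathrm{Poisson}\left(e^{\lambda_{ij}}\right),\quad \lambda_{ij}=\log E\left(y_{ij}|u_j\right)$$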

And the random intercepts are assumed to follow a normal distribution:

$$u_j \sim N\left(0,\sigma^2\right)$$

Results will be presented as the incidence rate ratio (IRR), corresponding 95% confidence interval, and p-value based on the z-statistic. The primary outcome will also be checked for the distributional assumption that the mean and variance of the outcome are similar after conditioning on a cluster (e.g., are the within-cluster mean and variance similar); if the variance is substantially larger, a negative binomial likelihood will be considered.

Parity

Daily female vector mosquito survival determined by parity is the main entomological outcome of the trial. The primary analysis will be conducted using parity data at the individual mosquito level with a multi-level (variance components) model constructed on a generalized linear model framework with a Bernoulli likelihood and a logit link function. Random intercepts will be included for each entomological study cluster and for each sampling household, and the study arm will be included as a fixed effect coded categorically as 0 for arm A and 1 for arm B. A simple model will first be considered as an unadjusted analysis, which includes only an intercept, a fixed effect for study arm as described, and nested random effects for household and study cluster. A more fully adjusted model will also be used to account for the complex sampling design by which mosquitoes are captured for parity analysis. This model will include fixed effects for collection location (indoors vs. outdoors), time since intervention, and calendar month as a seasonality adjustment. Additional random effects will be considered for the catch team/HLC individual.

The models will generally take the form below, where \(p_{ij}\) is the probability of parity at the individual mosquito level (i indexes individual mosquitoes within clusters and j indexes clusters), \(\alpha\) is the global intercept, \(X^{arm}_{ij}\) is the arm assignment for individual i in cluster j, \(\beta^{arm}\) is the arm effect to be estimated, \(X^{indoors}_{ij}\) indicates that the individual mosquito was caught indoors, \(\beta^{indoors}\) is the effect of indoor collection on parity relative to outdoor collection, \(X^{time}_{ij}\) is a continuous measure of time since the start of the trial, and \(\beta^{time}\) is meant to capture an overarching time trend; this variable can also be interacted with the study arm fixed effect to produce an estimate of the difference in time trend by study arm. \(X^{month}_{ij}\) represents a series of dummy variables for the month in which individual mosquitoes were caught, and \(\beta^{month}\) represents the series of monthly intercepts, intended to capture seasonal variation in parity. \(u_j\) are random intercepts for the cluster, \(\sigma\) is the standard deviation of the cluster random intercept distribution, \(h_k\) are random intercepts for houses (k indexes households), and \(\sigma_h\) is the standard deviation of the household random intercept distribution:

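$$\mathrm{logit}(p_{ij})=\alpha +\beta^{arm}X^{arm}_{ij}+\beta^{indoors}X^{indoors}_{ij}+\beta^{time}X^{time}_{ij}+\sum_{m}\beta^{month}_{m}X^{month}_{ijm}+u_j+h_k$$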

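where the likelihood is of the form (with \(y_{ij}\) denoting the binary parity status of mosquito i in cluster j):

$$y_{ij} \sim \mathrm{Bernoulli}(p_{ij})$$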

And the random intercepts are assumed to follow a normal distribution:

$$u_j \sim N\left(0,\sigma^2\right),\quad h_k \sim N\left(0,\sigma_h^2\right)$$

Model results will be presented as the estimates of α and the odds ratio for the arm above and the standard deviation or variance of the random effects distributions. 95% confidence intervals for the odds ratio and α estimates as well as z-statistics and p-values for each coefficient will be presented.

Analysis based on cluster summaries will also be considered. All parity measurements within each cluster will be summarized as a single proportion. The cluster estimates of the proportion parous will be compared across arms using Student’s t-test. Results will be presented as mean parity and standard deviation of parity as well as t-statistic and p-value. 95% CIs for mean parity will also be presented for each arm.
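The cluster-summary comparison described above can be sketched as follows: each cluster is reduced to one proportion parous, and the two arms' cluster proportions are compared with the equal-variance (Student's) two-sample t statistic. The counts and function names are hypothetical; the p-value would be read from a t distribution with the returned degrees of freedom, which is left out here because the Python standard library does not provide it.

```python
import math
from statistics import mean, variance

def cluster_proportions(counts):
    """Summarise each cluster as one proportion parous.

    counts is a list of (parous, dissected) tuples, one tuple per cluster.
    """
    return [parous / dissected for parous, dissected in counts]

def student_t(arm_a, arm_b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(arm_a), len(arm_b)
    df = n_a + n_b - 2
    # Pooled variance across the two arms' cluster summaries.
    pooled_var = ((n_a - 1) * variance(arm_a) + (n_b - 1) * variance(arm_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(arm_b) - mean(arm_a)) / se, df

# Hypothetical (parous, dissected) counts per cluster, illustration only.
arm_a = cluster_proportions([(60, 100), (55, 100), (65, 100), (58, 100)])
arm_b = cluster_proportions([(45, 100), (50, 100), (40, 100), (48, 100)])
t_stat, df = student_t(arm_a, arm_b)
print(round(t_stat, 3), df)
```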

Mosquito abundance

The analysis of data on mosquito abundance, derived from capture of adult Anopheles spp. mosquitoes via CDC UV light traps placed indoors and outdoors near houses overnight, will be constructed on a generalized linear model framework with a Poisson likelihood and a log link function. Random intercepts will be included for each entomological study cluster, and study arm will be included as a fixed effect coded categorically as 0 for arm A and 1 for arm B. A simple model will first be considered as an unadjusted analysis, which includes only an intercept, a fixed effect for study arm as described, and cluster-level random effects. Autoregressive terms may also be considered, with appropriate lags determined by temporal partial autocorrelation functions.

The model will take the form below, where \(y_{ij}\) is the count of adult Anopheles spp. mosquitoes caught on an individual trap night (i indexes individual trap nights within clusters and j indexes clusters), \(\alpha\) is the global intercept, \(X^{arm}_{ij}\) is the arm assignment for trap night i in cluster j, \(\beta^{arm}\) is the arm effect to be estimated, \(X^{indoors}_{ij}\) indicates that the trap-night observation was indoors, \(\beta^{indoors}\) is the effect of indoor collection on mosquito density/abundance relative to outdoor collection, \(X^{month}_{ij}\) represents a series of dummy variables for the month in which the mosquitoes were caught, and \(\beta^{month}\) represents the series of monthly intercepts. \(u_j\) are random intercepts for the cluster, and \(\mathrm{exposure}_{ij}\) is the number of trap nights corresponding to the particular \(y_{ij}\) observation for trap night i in cluster j; generally this will be equal to one, and where it equals one for all observations the \(\log(\mathrm{exposure}_{ij})\) term may be omitted. \(\lambda_{ij}\) refers to \(\log E(y_{ij}|u_j)\), and \(\sigma\) is the standard deviation of the random intercept distribution:

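$$\log E\left(y_{ij}|u_j\right)=\alpha +\beta^{arm}X^{arm}_{ij}+\beta^{indoors}X^{indoors}_{ij}+\sum_{m}\beta^{month}_{m}X^{month}_{ijm}+u_j+\log(\mathrm{exposure}_{ij})$$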

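where the likelihood is of the form:

$$y_{ij} \sim \mathrm{Poisson}\left(e^{\lambda_{ij}}\right),\quad \lambda_{ij}=\log E\left(y_{ij}|u_j\right)$$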

And the random intercepts are assumed to follow a normal distribution:

$$u_j \sim N\left(0,\sigma^2\right)$$

Results will be presented as the incidence rate ratio (IRR), corresponding 95% confidence interval, and p-value based on the z-statistic. This outcome will also be checked for the distributional assumption that the mean and variance of the outcome are similar after conditioning on cluster (e.g., are the within-cluster mean and variance similar); if the variance is substantially larger, a negative binomial likelihood will be considered.
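One simple way to carry out the mean-variance check described above is to compute the variance-to-mean ratio of the counts within each cluster; under a Poisson distribution this ratio is expected to be close to one. The sketch below is illustrative only: the data are hypothetical, the function name is an assumption, and no cut-off for "substantially larger" is implied, since the plan does not specify one.

```python
from statistics import mean, variance

def dispersion_by_cluster(counts_by_cluster):
    """Variance-to-mean ratio of trap-night counts within each cluster.

    counts_by_cluster maps a cluster label to its list of per-night counts.
    Ratios well above 1 suggest overdispersion relative to Poisson.
    """
    ratios = {}
    for cluster, counts in counts_by_cluster.items():
        cluster_mean = mean(counts)
        # The ratio is undefined when no mosquitoes were caught at all.
        ratios[cluster] = (
            variance(counts) / cluster_mean if cluster_mean > 0 else float("nan")
        )
    return ratios

# Hypothetical trap-night counts (illustration only).
ratios = dispersion_by_cluster({
    "cluster_1": [2, 3, 1, 2, 2],
    "cluster_2": [0, 10, 1, 25, 4],
})
print(ratios)
```

In this made-up example the second cluster's variance far exceeds its mean, which is the pattern that would prompt consideration of a negative binomial likelihood.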

Sporozoite rate

Sporozoite rate, or the proportion of adult female Anopheles spp. captured during the trial that are sporozoite positive, will be analyzed using a multi-level (variance components) model constructed on a generalized linear model framework with a Bernoulli likelihood and a logit link function. Random intercepts will be included for each study cluster, and the study arm will be included as a fixed effect coded categorically as 0 for arm A and 1 for arm B. A simple model will first be considered as an unadjusted analysis, which includes only an intercept, a fixed effect for the study arm as described, and cluster-level random effects. A more fully adjusted model will also be used to account for the complex sampling design by which mosquitoes are captured for α-CSP ELISA. This model will include fixed effects for the capture method (HLC vs. CDC light trap), collection location (indoors vs. outdoors), time since intervention, and calendar month as a seasonality adjustment. Additional random effects will be considered for the catch team/HLC individual.

The models will generally take the form below, where \(p_{ij}\) is the probability of sporozoite positivity at the individual mosquito level (i indexes individual mosquitoes within clusters and j indexes clusters), \(\alpha\) is the global intercept, \(X^{arm}_{ij}\) is the arm assignment for individual i in cluster j, \(\beta^{arm}\) is the arm effect to be estimated, \(X^{HLC}_{ij}\) indicates that the individual mosquito was caught by HLC, \(\beta^{HLC}\) is the effect of HLC catch on sporozoite rate relative to CDC light trap, \(X^{indoors}_{ij}\) indicates that the individual mosquito was caught indoors, \(\beta^{indoors}\) is the effect of indoor collection on sporozoite rate relative to outdoor collection, \(X^{time}_{ij}\) is a continuous measure of time since the start of the trial, and \(\beta^{time}\) is meant to capture an overarching time trend; this variable can also be interacted with the study arm fixed effect to produce an estimate of the difference in time trend by study arm. \(X^{month}_{ij}\) represents a series of dummy variables for the month in which individual mosquitoes were caught, and \(\beta^{month}\) represents the series of monthly intercepts, intended to capture seasonal variation in sporozoite rate. \(u_j\) are random intercepts for the cluster, and \(\sigma\) is the standard deviation of the random intercept distribution:

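$$\mathrm{logit}(p_{ij})=\alpha +\beta^{arm}X^{arm}_{ij}+\beta^{HLC}X^{HLC}_{ij}+\beta^{indoors}X^{indoors}_{ij}+\beta^{time}X^{time}_{ij}+\sum_{m}\beta^{month}_{m}X^{month}_{ijm}+u_j$$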

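where the likelihood is of the form (with \(y_{ij}\) denoting the binary sporozoite status of mosquito i in cluster j):

$$y_{ij} \sim \mathrm{Bernoulli}(p_{ij})$$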

And the random intercepts are assumed to follow a normal distribution:

$$u_j \sim N\left(0,\sigma^2\right)$$

Model results will be presented as the estimates of α and the odds ratio above and the standard deviation or variance of the random effects distribution. 95% confidence intervals for the odds ratio and α estimates, as well as z-statistics and p-values for each coefficient, will be presented. Sporozoite rate will be directly estimated as the predicted probability of being sporozoite positive when captured via HLC, for each month, each study arm, and each collection location (indoors and outdoors). 95% prediction intervals for sporozoite rate will also be presented.
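To make the "predicted probability" step concrete, the sketch below applies the inverse logit to a fixed-effects linear predictor for one combination of arm, capture method, location, and month. All coefficient values and names are hypothetical, the time-trend term is omitted for brevity, and the cluster random intercept is set to zero, which is one possible convention and an assumption here, since the plan does not state how random effects enter the prediction.

```python
import math

def predicted_probability(coefs, covariates):
    """Inverse logit of the fixed-effects linear predictor.

    coefs maps term names to coefficients (including "intercept");
    covariates maps term names to covariate values for one scenario.
    The cluster random intercept is taken as zero (an assumption).
    """
    eta = coefs["intercept"] + sum(
        coefs[name] * value for name, value in covariates.items()
    )
    return 1 / (1 + math.exp(-eta))

# Hypothetical coefficients on the logit scale (not trial estimates).
coefs = {"intercept": -3.5, "arm": -0.4, "hlc": 0.2, "indoors": 0.1, "month_03": 0.3}

# HLC catch, indoors, in the month coded month_03, for each arm.
p_arm_a = predicted_probability(coefs, {"arm": 0, "hlc": 1, "indoors": 1, "month_03": 1})
p_arm_b = predicted_probability(coefs, {"arm": 1, "hlc": 1, "indoors": 1, "month_03": 1})
print(p_arm_a, p_arm_b)
```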

Human landing rate

The analysis of data on human landing/biting rate, derived from the capture of adult Anopheles spp. mosquitoes via HLC conducted indoors and outdoors near houses overnight, will be constructed on a generalized linear model framework with a Poisson likelihood and a log link function. Random intercepts will be included for each study cluster, and the study arm will be included as a fixed effect coded categorically as 0 for arm A and 1 for arm B. A simple model will first be considered as an unadjusted analysis, which includes only an intercept, a fixed effect for the study arm as described, and cluster-level random effects. Additional random effects will be considered for catch date, household, and/or HLC “catcher”, and autoregressive terms may also be considered, with appropriate lags determined by temporal partial autocorrelation functions.

The model will take the form below, where \(y_{ij}\) is the count of adult Anopheles spp. mosquitoes landing on an individual catcher during a specific night (i indexes individual catch-nights within clusters and j indexes clusters), \(\alpha\) is the global intercept, \(X^{arm}_{ij}\) is the arm assignment for catch-night i in cluster j, \(\beta^{arm}\) is the arm effect to be estimated, \(X^{indoors}_{ij}\) indicates that the catch-night observation was indoors, \(\beta^{indoors}\) is the effect of indoor collection on human landing relative to outdoor collection, \(X^{month}_{ij}\) represents a series of dummy variables for the month in which the mosquitoes were caught, and \(\beta^{month}\) represents the series of monthly intercepts. \(u_j\) are random intercepts for the cluster, and \(\mathrm{exposure}_{ij}\) is the number of catch-nights corresponding to the particular \(y_{ij}\) observation for catch-night i in cluster j; generally this will be equal to one, and where it equals one for all observations the \(\log(\mathrm{exposure}_{ij})\) term may be omitted. \(\lambda_{ij}\) refers to \(\log E(y_{ij}|u_j)\), and \(\sigma\) is the standard deviation of the random intercept distribution:

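$$\log E\left(y_{ij}|u_j\right)=\alpha +\beta^{arm}X^{arm}_{ij}+\beta^{indoors}X^{indoors}_{ij}+\sum_{m}\beta^{month}_{m}X^{month}_{ijm}+u_j+\log(\mathrm{exposure}_{ij})$$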

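where the likelihood is of the form:

$$y_{ij} \sim \mathrm{Poisson}\left(e^{\lambda_{ij}}\right),\quad \lambda_{ij}=\log E\left(y_{ij}|u_j\right)$$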

And the random intercepts are assumed to follow a normal distribution:

$$u_j \sim N\left(0,\sigma^2\right)$$

Results will be presented as the incidence rate ratio (IRR), corresponding 95% confidence interval, and p-value based on the z-statistic. This outcome will also be checked for the distributional assumption that the mean and variance of the outcome are similar after conditioning on cluster (e.g., are the within-cluster mean and variance similar); if the variance is substantially larger, a negative binomial likelihood will be considered. Human landing rate will be taken to be the predicted mean landing catch per day in each month, disaggregated by arm and by indoor vs. outdoor collection. 95% prediction intervals will also be calculated.

EIR

The analysis of the entomological inoculation rate will utilize data derived from capture of adult Anopheles spp. mosquitoes caught via HLC or CDC light trap indoors or outdoors only and will follow similar principles to the analysis of total sporozoite-positive mosquitoes. The analysis will be based on Student’s t-test. For this analysis, estimates of EIR will be made independently for each cluster by calculating an estimated annual EIR within each cluster according to the following formula.

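$$EIR=\sum_{i=1}^{n}\left\{\begin{array}{ll}\frac{s_i\sum_{j=1}^{m}b_{ij}}{\sum_{j=1}^{m}d_{ij}}; & \text{if } \sum_{j=1}^{m}d_{ij}>0\\ 0; & \text{otherwise}\end{array}\right.$$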

where EIR equals the number of infectious bites per person-night per year and n represents the number of months of the year. Where collections are not made during the full calendar year because the malaria transmission season is assumed to be short and infectious bites are not expected outside of the transmission season, zero will be substituted for the estimated number of infectious bites per person-day during these months, as shown in the formula above. In the formula above, \(b_{ij}\) represents the number of mosquitoes captured via HLC on person catch-night j during month i, \(s_i\) represents the estimated sporozoite rate for each cluster in month i, \(d_{ij}\) represents the number of person catch-days for person catch-night j in month i (which will generally be equal to one), and m represents the total number of observations (person catch-nights) of HLC conducted. EIR within each cluster will be summarized as a single annualized EIR estimate post-intervention. The cluster estimates of the EIR will be compared across arms using a Student’s t-test. Results will be presented as mean annualized EIR and standard deviation of annualized EIR, as well as t-statistic and p-value. 95% CIs for mean annualized EIR will also be presented for each arm. Should the distribution of EIR be substantially non-normal, a non-parametric test such as the Mann–Whitney U-test may be considered.
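The per-cluster calculation can be sketched as below, reading the formula from the symbol definitions in the text: for each month, the sporozoite rate is multiplied by the total HLC catch and divided by the total person catch-days, and months without collections contribute zero. The data structure, function name, and numbers are hypothetical. The sketch sums the monthly values exactly as defined; any further scaling (for example by days per month) is not stated in the text and is not applied.

```python
def cluster_eir(monthly_records, n_months=12):
    """Sum of monthly infectious-bite rates for one cluster.

    monthly_records maps a month number (1..n_months) to a dict with:
      "b": list of HLC catches per person catch-night,
      "d": list of person catch-days per person catch-night,
      "s": estimated sporozoite rate for the cluster in that month.
    Months that are missing or have zero catch-days contribute zero.
    """
    total = 0.0
    for month in range(1, n_months + 1):
        record = monthly_records.get(month)
        if record is None or sum(record["d"]) == 0:
            continue  # no collections: zero is substituted for this month
        total += record["s"] * sum(record["b"]) / sum(record["d"])
    return total

# Hypothetical cluster with collections in two months only.
eir = cluster_eir({
    1: {"b": [10, 14], "d": [1, 1], "s": 0.05},
    2: {"b": [4, 6], "d": [1, 1], "s": 0.02},
})
print(eir)
```

The resulting per-cluster values would then be the inputs to the arm comparison (Student's t-test, or a non-parametric alternative) described above.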
