Spatio-temporal characterization of phenotypic resistance in malaria vector species

The study site

The study was conducted in parts of the African continent that bear the largest burden of malaria risk and have reported cases of IR in malaria vectors [1]. The study focused on Nigeria, Burkina Faso, Ethiopia, Mali, Cameroon, Ghana, Tanzania, Côte d’Ivoire, and Mozambique due to the consistent availability and the reliability of data records on IR in malaria vector populations, spanning from 2005 to 2017.

Data

In this study, we used spatio-temporal data on IR in malaria vectors from the Vector Atlas (VA) database. This dataset encompasses georeferenced records detailing IR levels in mosquito vector populations across various classes of insecticides, namely pyrethroid, organochlorine, carbamate, and organophosphate. The IR state at each location is quantified as the percentage of vectors that survive (or perish) when subjected to susceptibility tests. Our study primarily focused on regions with extensive and consistent data records spanning over 15 years in Nigeria, Burkina Faso, Ethiopia, Mali, Cameroon, Ghana, Tanzania, Côte d’Ivoire, and Mozambique. By correlating the coordinates of these spatio-temporal IR data with corresponding IR driver values, we obtained information from various databases (Additional file 6: Table S5).

The IR driver variables were chosen based on a thorough review of existing literature and prior research endeavors around the modeling and mapping of IR distribution. The drivers encompass a range of critical factors that are known to influence the prevalence and spatial distribution of IR in mosquito vector populations. These factors include:

(1)

Human population density/count: The density and count of human populations in specific regions play a pivotal role in shaping IR dynamics. Areas with high human density often experience more extensive insecticide usage and consequently have a greater potential for the development and spread of IR [31].

(2)

Insecticide usage for vector control: The frequency and intensity of insecticide application for vector control significantly impact the selection pressure on mosquito populations. Areas with widespread insecticide use are more likely to witness the emergence and propagation of IR [23].

(3)

Agricultural activities: Agricultural practices can influence the presence of IR in malaria vectors. The use of insecticides in agriculture can contribute to the selection of resistance traits in mosquito populations [25].

(4)

Environmental/Climatic factors: Environmental and climatic conditions, such as temperature, humidity, and precipitation, can affect mosquito vector biology and behavior, which, in turn, can influence the development and spread of IR [5, 17].

A comprehensive summary of these IR drivers, along with their associated sources and relevance, is provided in Table 4. These drivers were carefully considered to construct a robust model that can effectively characterize the spatio-temporal dynamics of phenotypic resistance in malaria vector species.

Table 4 Potential drivers of insecticide resistance in malaria vectorsData exploration

Prior to extracting values of IR drivers from various databases, we excluded non-georeferenced records to ensure data consistency and reliability. Subsequently, we conducted an in-depth exploratory data analysis (EDA) for each class of insecticide. The primary objective of this analysis was to identify the key drivers that significantly influenced occurrence of confirmed IR state in both space and time. The criteria for defining confirmed IR state were based on the guidelines provided by the WHO, where confirmed IR state is characterized by vector populations exhibiting a mortality rate of less than 90% within 60 min following susceptibility testing [38]. It is important to note that other states, such as possible resistance (mortality ranging from 90 to 97%) and susceptible resistance (mortality between 98 and100%), were also considered [38]. However, our specific focus was solely on confirmed resistance.

We further assessed the strength of the correlation between the identified IR drivers and the confirmed resistance state. Additionally, we explored the existence of any correlation or clustering patterns among the IR drivers themselves. To achieve this, we employed several EDA techniques tailored to the nature of the data. For continuous data, we utilized principal component analysis (PCA), correlation tests, and cluster analysis. On the other hand, for categorical data, we performed chi-square tests. One of the key metrics used to quantify the strength of correlation was the correlation coefficient (r), expressed mathematically as:

$$r = \frac_-\overline\right)\left(_-\overline\right)}_-\overline)}^\sum _-\overline)}^}}$$

(1)

where. \(r\) = correlation coefficient \(_\) and \(\overline\) are the values and mean of x variable respectively \(_\) and \(\overline\) are the values and mean of y variable respectively.

The equation for chi-square statistic is.

$$^= \sum_^\frac_-_)}^}_}$$

(2)

where \(^\) is the chi-square statistic \(_\) is the observed frequency \(_\) is the expected frequency

To identify the key drivers using PCA, our first step involved identifying the principal components (PCs) that collectively explained over 80% of the variation in the data based on cumulative proportion. Once we had identified these essential PCs, we then selected IR drivers that exhibited loading exceeding 0.25 from each of the identified PCs. We then used cluster analysis (dendrograms) to obtain clusters of the IR drivers. These clusters were instrumental in determining which drivers were closely related. The selection process for variables to retain or drop within these clusters was informed by various criteria. We considered the strength and significance of the correlation of each driver with the confirmed IR, as well as the correlations among the drivers within the same cluster. Additionally, we considered the loadings derived from the PCA outputs. In the process of determining which drivers to include, we also conducted a literature review to help us make informed decisions on which IR drivers from the clusters were most relevant and essential for our study. After obtaining the refined subset of IR drivers for use in our analysis, we implemented cellular automata (CA) models. We employed the CA approach, which allows for consideration of the potential initiation of the IR state at different locations, providing a more comprehensive understanding of IR dynamics. The CA models were specifically designed for An. gambiae complex and An. arabiensis and were implemented separately for each class of insecticide under consideration.

Cellular automata (CA) approach

Cellular automata (CA) belongs to a group of spatially explicit models (SEM) which are constructed upon transition rules defining the progression of a geographical phenomenon [39]. SEM are computational methods rooted in geographical locations, allowing for replication of the dynamics of geographical phenomenon [40]. CA operates by utilizing a grid consisting of individuals cells each assigned a finite state. Over time, these cells undergo state changes governed by specified transition rules influenced by the state of neighboring cells. In the CA approach, the spatial domain is divided into grid cells, each initially assigned a state denoted as \(_^\). The state of cell (i,j) at t + ∆t is expressed as;

where \(_^\) is the cell state at time t, \(_^\) is the state of cells in its neighborhood, R represents the transition rules, and ∆t is the time step.

In our study, we extended the CA model to encompass the potential for the state of interest, which is confirmed resistance, to emerge in cells that do not belong to the immediate neighborhood cells at time t + ∆t. This confirmed IR state could initiate in any location beyond the neighboring cells, if the prevailing conditions in that specific location permit the onset of the confirmed resistance state at a given time t. This extension allows for accounting for the possibility that “confirmed resistance” may develop in areas that are not in direct proximity to the current location. It acknowledges that the spatial distribution of IR is influenced not only by neighboring cells but also by conditions and factors that may impact IR emergence in more distant locations within the grid. This enhanced model provides a more comprehensive understanding of how “confirmed resistance” spreads and evolves in space and time. We represented this novel concept as.

$$_^=\left\f(_^,_^,R), _^=0\\ f(_^,_^,R, _^), _^>0\end\right.$$

(4)

where \(_^\) is the cell state at time t, \(_^\) is the state of cells in its neighborhood, R represents the transition rules, and ∆t is the time-step \(_^\) is the cell which is not part of neighborhood cells and whose conditions at time t allows for the commencement of the state of interest, in this case, confirmed resistance to insecticide.

In the model, the appearance or disappearance of a confirmed IR state within a specific cell at a given time (t) depends on the current state of the IR drivers associated to that cell at time t. For instance, should temperature and other relevant IR drivers in the cell favor the prevalence of confirmed IR state to prevail, then it persists over time. However, if the temperature rises above a threshold where the vector with confirmed IR state cannot survive, applying the transition rules lead to the disappearance of confirmed IR state in that cell, at a given time. This t can be expressed mathematically as in Eq. 5;

$$_^=\left\ 1 \,if\, R\, or\, _^=True\\ 0\, otherwise\end\right.$$

(5)

where \(_^\) is the cell state at time t + ∆t. Here, a value of 1 signifies a cell in a confirmed IR state, indicating that the cell will maintain this state if the IR driver conditions are conducive to the persistence of confirmed IR within the mosquito population. Should these conditions deteriorate over time, cells previously in a confirmed IR state will lose this designation. Such changes in the IR driver conditions drive the dynamics of confirmed IR states. Similarly, this framework applies to the genesis of confirmed IR states in cells beyond the immediate vicinity. A cell transitions to a confirmed IR state if the IR driver conditions are favorable for the emergence of a mosquito population exhibiting confirmed IR.

Model implementation

From the subset of IR drivers obtained after conducting EDA, we formulated the transition rules for the CA model, with a focus on each species and insecticide class. These transition rules governed the interaction of different IR drivers within cells of interest, that is neighborhood cells and cells within the study area, to determine whether the state of IR in such cell transitioned to confirmed IR or not. This underscored the importance of considering the values of the IR drivers at each cell at time step to ascertain the expected dynamics of confirmed IR state.

To determine these transition rules and set appropriate thresholds and association between IR drivers and confirmed IR state, we employed various analytical techniques including descriptive statistics, cross-tabulations, boxplots, proportion tables, pivot tables, and charts. Through summary statistics and literature review, we established boundaries for the continuous factors driving IR. This process involved setting the observed minimum and maximum values of these IR drivers as the lower and upper thresholds, respectively. For categorical variables, thresholds were determined by analyzing how their categories correlated with the confirmed IR state, utilizing proportionate analyses, pivot tables, and charts for this purpose. To refine our dataset further, we employed boxplots to identify and subsequently exclude potential outliers. In our quest to discern the conditions conducive to the emergence of confirmed IR states in cells removed from their immediate neighborhood, we undertook exploratory data analysis. This step was crucial for observing trends in confirmed IR states across varying combinations of IR drivers, with cross-tabulations and pivot tables being instrumental in this endeavor. Additionally, we pinpointed specific IR drivers that, even in isolation, were linked to higher frequencies of confirmed IR prevalence. The significance of these IR drivers was ascertained through correlation tests, aiding in the identification of unique driver combinations that consistently foster confirmed IR states in mosquito populations. Literature reviews further reinforced our findings, confirming that certain IR driver combinations indeed promote confirmed IR states. For instance, we found that mosquito populations from areas practicing rice farming through irrigation consistently exhibited confirmed IR states, regardless of other agricultural methods employed. Other significant combinations identified include areas engaging in both irrigated rice and vegetable farming with temperatures spanning 15 to 38 °C; and locales with insecticide coverage (ITN use and IRS) surpassing 0.76, alongside vegetable farming and similar temperature ranges, as supported by literature [41, 42]. These exploratory analyses underscore the concept that IR states can propagate from their point of origin to neighboring areas, a phenomenon supported by various studies [30, 43,44,45] and indicative of a diffusion process. Furthermore, the transition rules derived from our analyses are detailed in the code, available in the section dedicated to data and material availability.

Once the transition rules were established, we divided several countries into grid cells measuring 5 by 5 km. We then extracted the values of the IR driving factors from various databases, utilizing the centroids of each cell for the period spanning from 2000 to 2018. These countries included Ethiopia, Cameroon, Burkina Faso, Uganda, and Nigeria. The data were available on various temporal scales, including monthly, yearly, or at intervals of several years. To standardize the temporal aspect, we converted the data to a yearly basis and interpolated them to create monthly time steps, facilitating the use of monthly intervals during the model implementation.

With the data of all IR drivers for each country for the entire period, we arranged them sequentially then converted the data to rasters. The resulting rasters were then stacked together, with centroids identifying each cell uniquely, while time identified each time step uniquely. Therefore, with transition rules formulated, and study areas gridded, and rasters for IR variables arranged sequentially, the next task was to identify the cell which contained the initial records of confirmed IR state for respective cell, determine its neighborhoods, then use the transition rules to establish whether the neighboring cells or those within the study area but not part of the neighborhoods cell would result into confirmed IR state or not. Therefore, to implement the CA models, we designated certain initial cells as points for the commencement of IR, assigning a value of 1 to cells with a confirmed IR state and 0 to those without. Subsequently, we executed the CA model based on the formulated rules, utilizing the Moore neighborhood configuration. Figure 10 illustrates the implementation of the model.

Fig. 10figure 10

Summary flowchart of insect resistance model implementation

We implemented the CA model for both An. gambiae complex and An. arabiensis species, encompassing each of the four insecticide classes. The CA models were executed using R version 4.0.5 [46].

Model assessment

Model fine-tuning emerged as a critical phase in our study, aiming to ensure the model’s capability to accurately reflect the dynamics of IR particularly within the An. gambiae complex and An. arabiensis found in several parts of Africa. For An. gambiae complex, we concentrated our fine-tuning efforts on Ethiopia, Cameroon, and Burkina Faso selected for their representation of Africa’s diverse agro-ecological landscapes. These countries served as the backdrop for adjusting our CA models to predict confirmed IR states for different classes of insecticides and vector species accurately. The fine-tuning process was thorough, with a primary focus on refining transition rules and thresholds, key determinants of the CA models’ ability to mimic the actual IR dynamics accurately. This refinement began with establishing transition rules based on identified thresholds, which are the ranges of predictor variables linked to confirmed IR states. Following this, the models were implemented, and their outputs were juxtaposed with actual IR case data to evaluate their accuracy. This evaluation informed further adjustments to the transition rules and thresholds, alongside the introduction of new rules catering to interactions among subsets of predictors, leading to iterative model implementations until the models proficiently mirrored the IR dynamics.

Subsequent to the fine-tuning phase, we proceeded to validate the models for confirmed IR within An. gambiae complex and An. arabiensis. Validation was executed using countries not previously involved in the EDA, rule formulation, or fine-tuning stages, such as Nigeria and Uganda for An. gambiae complex, and Cameroon for An. arabiensis. This approach, encompassing both spatial and temporal validation methods, aimed to assess if the models' predictions of IR locations aligned with actual reported IR incidences over time, thus ensuring the validation was conducted on an out-of-sample basis. Performance metrics, including the classification accuracy score, were employed to gauge the CA models’ effectiveness. This meticulous process of model fine-tuning and validation underscores our commitment to creating robust and accurate models capable of contributing significantly to the understanding and management of IR dynamics in mosquito populations across diverse African ecosystems.

$$Accuracy\, score= \frac_+_}_+_+_+_}$$

(6)

where; \(_\) is the true positives \(_\) is the true negatives \(_\) is the false positives \(_\) is the false negatives

In addition, we conducted a comparative analysis of the outputs of our model with existing models on mapping phenotypic resistance in Africa.

Comments (0)

No login
gif