 Research
 Open Access
 Published:
Predicting the spread and persistence of genetically modified dominant sterile male mosquitoes
Parasites & Vectors volume 14, Article number: 480 (2021)
Abstract
Background
Reproductive containment provides an opportunity to implement a stagedrelease strategy for genetic control of malaria vectors, in particular allowing predictions about the spread and persistence of (selflimiting) sterile and malebiased strains to be compared to outcomes before moving to (selfsustaining) genedrive strains.
Methods
In this study, we: (i) describe a diffusion–advection–reaction model of the spread and persistence of a single cohort of male mosquitoes; (ii) elicit informative prior distributions for model parameters, for wildtype (WT) and genetically modified dominant sterile strains (DSM); (iii) estimate posterior distributions for WT strains using data from published markrecapturerelease (MRR) experiments, with inference performed through the DelayedRejection Adaptive Metropolis algorithm; and (iv) weight prior distributions, in order to make predictions about genetically modified strains using Bayes factors calculated for the WT strains.
Results
If a single cohort of 5000 genetically modified dominant sterile male mosquitoes are released at the same location as previous MRR experiments with their WT counterparts, there is a 90% probability that the expected number of released mosquitoes will fall to < 1 in 10 days, and that by 12 days there will be a 99% probability that no mosquitoes will be found more than 150 m from the release location.
Conclusions
Spread and persistence models should form a key component of risk assessments of novel genetic control strategies for malaria vectors. Our predictions, used in an independent risk assessment, suggest that genetically modified sterile male mosquitoes will remain within the locality of the release site, and that they will persist for a very limited amount of time. Data gathered following the release of these mosquitoes will enable us to test the accuracy of these predictions and also provide a means to update parameter distributions for genetic strains in a coherent (Bayesian) framework. We anticipate this will provide additional insights about how to conduct probabilistic risk assessments of stagereleased genetically modified mosquitoes.
Graphical abstract
Background
Over the last 30 years conventional control strategies have reduced the global incidence rate of malaria. The efficacy of these strategies, however, appears to be waning. Global malaria incidence rates have barely changed since 2014, and progress has stalled at around 57 cases per 1000 population at risk despite increased expenditure on elimination and prevention between from 2014 to 2017 [1]. Consequently, the World Health Organisation predicts that morbidity targets for 2020, 2025 and 2030 will not to be met without significant improvements in vector control and chemoprevention.
Laboratory studies and modelling indicate that novel control strategies that use gene drives to force engineered alleles through vector populations with the aim to suppress these populations or render them unable to transmit plasmodium parasites could augment conventional strategies and significantly improve current rates of malaria control [2,3,4,5]. As with many novel technologies, however, the safety of genetic control methods is uncertain, and the potential for spread and ecosystemwide impacts makes the gene drive strategy contentious [6, 7].
To date, all research on genedrive modified mosquitoes (GDMMs) has been conducted in laboratories under strict physical, ecological, reproductive and/or molecular containment [8]. Current guidance recommends that if containment is to be relaxed it should be done in a “phasedrelease” strategy wherein data are gathered, risks are evaluated and containment is gradually lifted in a stepwise fashion if and when the risks are deemed acceptable [9, 10]. The possibility of longdistance dispersal [11] and biosecurity concerns with large semifield enclosure, however, may erode confidence in strategies that try to gradually lift ecological or physical containment.
Genetic and reproductive containment strategies may be more amenable to a gradualrelease strategy, and several approaches are currently being pursued [12,13,14]. In this paper we focus on the spatiotemporal dynamics that govern the spread and persistence of a single cohort of genetically modified, dominant sterile male (DSM) Anopheles coluzzii mosquitoes released in a small village in Burkina Faso. These sterile male mosquitoes are reproductively contained (but females that carry the sterilising construct are fertile) and could represent the first stage in a threestage pathway to malaria vector control using a gene drive that results in a malebiased sex ratio [15, 16].
In our analysis the spatiotemporal dynamics of DSM mosquitoes are driven by: (i) a diffusion process that accounts for the dispersal capacity of the insects; (ii) an advection process that accounts for the attraction of individuals to swarms, and (iii) a reaction process that accounts for the death of sterile male mosquitoes. These processes can be described mathematically by a partial differential equation (PDE) of the advection–diffusion–reaction type, which models how the distribution of mosquitoes in time and space is affected by a chemoattractant present in the environment. Models of this type have also been widely used within the environmental and ecological literature. In particular, such models have been used previously to model mosquitoes, for example the dispersal of Aedes albopictus in Reunion Island using a similar type of parabolic PDE [17].
Our objective is to predict the spread and persistence of the genetically modified mosquitoes ahead of a proposed field release. These predictions are an important component of an independent risk assessment conducted prior to the field release [18] that assessed the risk of vectorborne disease transmission [19] and could also help to inform the design of a postrelease monitoring strategy. Our approach implements the Bayesian paradigm; we elicit the prior distribution of model parameters from relevant experts, for both wildtype (WT) mosquitoes and their DSM counterparts. We obtain the posterior model parameter values for the WT parameters using the results of markreleaserecapture (MRR) experiments conducted by [20]. We then use Bayesian model averaging to predict the spread and persistence of DSM mosquitoes.
Methods
PDE model
The literature review conducted in [21] suggests that the behaviour of hostseeking mosquitoes can be categorised as: (i) plume finding, in which flight direction is either random (kinesis), determined by visual features in the absence of wind or deliberately upwind, downwind or crosswind (anemotaxis) when wind is present; and, (ii) plume tracking, where once the mosquito detects an odour plume, flight direction is deliberately upwind, or possibly determined by the odour gradient in windless conditions, in order to find the odour source (positive chemotaxis). More details on the properties of these search strategies can be found in [22].
In this analysis, we assume that village compounds (groups of closely spaced houses) provide a source of carbon dioxide (CO_{2}) that acts as an attractant for female Anopheles gambiae sensu strictu and Anopheles coluzzi mosquitoes [23, 24]. We further assume that male mosquitoes swarm in places where the probability of encountering receptive females is highest [25]; hence, swarms occur in and around compounds (as evident in Figure 1 of [20]). Consequently, high concentrations of attractant result in flights that (on average) are closer to the direction of the nearest compound for both males and females. Conversely, low concentrations have less influence on the mosquitoes and result in greater variability in flight direction. We do not incorporate the effect of wind speed or direction into the analysis (rather, we assume that the local dispersal dynamics are not greatly affected by the wind at this spatial scale), and we also assume that the attraction process is not influenced by the number of mosquitoes.
With these assumptions we developed an advection–diffusion–reaction model to describe the expected number of male WT and DSM mosquitoes,
with the following boundary conditions (Neumann reflecting conditions):
where \(U(\varvec{s})\) is a function that describes the strength of mosquito attraction at location s, and \({\alpha }\) enables us to modulate the effect of the utility function on the overall attraction. Utility functions like this have been used in similar ecological contexts to describe an attractive or repulsive flux [26].
In this model we use swarm sites as centers of attraction and assume that the strength of attraction decays according to a squaredsum exponential decay kernel
where \(\mathcal {L}\) are the known swarm locations (identified prior to the MRR experiment [20]) and \(\sigma\) is an unknown parameter that controls the range of attraction—that is, the distance beyond which the attraction of a swarm site \(\varvec{s}_\ell\) for a mosquito is negligible.
The theoretical solution to the partial differential Eq. 1 is the expected number of male mosquitoes at time t and location \(\varvec{s}=(\text {easting},\text {northing})\). The partial derivatives are denoted \(\lambda _{\varvec{t}}\) (with respect to time) and \(\lambda _{\varvec{s}\varvec{s}}\) (with respect to location). Because we use a numerical solver, we define a grid over the spatial domain. Each grid cell is therefore one areal unit. In this study, an areal unit is a square of 100 m^{2} of area. All process model parameters are summarised in Table 1.
Data
In this analysis we use the results of five MRR experiments [20] wherein approximately 5000 adult male WT mosquitoes, marked with a coloured powder, were released 2 h before swarming (around 16:00 h), at three different locations, on five different dates (Table 2). Mosquitoes were recaptured in three ways via: (i) swarm sampling in the evening (times were not specified, but we assume the sampling occurred between 19:00 and 21:00 h); (ii) pyrethroid spray catches (PSC) in the morning inside houses; and (iii) placement of humidified clay pots within rooms within houses, which were subsequently checked in the morning (between 06:00 and 07:00 h). All release and recapture sites were geolocated to allow for a spatial model to be used.
Recaptures were performed for 7 days after release for all experiments except the first one where recaptures were performed for 5 days. We used the results of the first four MRR experiments to derive posterior estimates of male WT model parameters, reserving the results of the fifth experiment to compare with model predictions.
Priors
Process model
A key challenge when conducting risk assessments of a novel technology is the lack of empirical information. Without any operational history from which success and failure rates might be estimated, quantitative risk predictions must (at least initially) be based on the testable predictions of domain experts. The Bayesian inference paradigm encourages the careful elicitation of prior information and provides a coherent mechanism for updating this information as data become available [27]. The experimental observations and expert elicitation, however, must be independent but carefully aligned for this process to work.
For this analysis we carefully elicited opinions on the mortality rate and dispersal distance of WT and DSM male mosquitoes from four independent recognised experts on malaria transmission by mosquitoes in Africa and genetic vector control methods; the independent experts were not involved in the development of the DSM mosquitoes [18]. The experts were encouraged to draw on their own research experience and knowledge of the published literature while constructing their assessments. Prior to the elicitation, experts were provided information on the genetic construct (originally incorporated into the G3 laboratory colony strain of Anopheles gambiae) and were told that the genetically modified strain would be repeatedly backcrossed with WT mosquitoes captured in the vicinity of a biosecure laboratory in order to gradually replace the G3 genetic background with that of the locally originated strain. The location of the laboratory was provided to each expert, but the species of local WT mosquitoes was not specified. For the mortality rate, we proposed that the experts answer questions about either the probability of mortality, the probability of survival or the life expectancy. All responses were converted to an estimate of the daily probability of survival (p) which was then converted to the mortality rate parameter (\(\mu = \log (p)\)) used in the PDE model (Additional file 1).
The model’s diffusion parameter (\({D}\)) was not elicited directly but calculated using the experts’ prior opinion on the average daily dispersal distance (d) of male mosquitoes. All experts agreed that a lognormal distribution provided the most appropriate description of their uncertainty for both WT and DSM strains. The model’s prior diffusion parameter was subsequently calculated using the relation \({D}= d^2/\pi t\) (see Additional file 2 for a detailed derivation). The geographical and temporal context of the elicitation was carefully prescribed, including a standardized time frame (dispersal over a 1day period), so that all experts’ opinions could be combined.
The model’s two chemotaxisrelated parameters are: (i) the relative strength of attraction towards the source of the attractant that a mosquito experiences at a location (\({\alpha }\)), with units \(\text{m}^{1} \text{R}^{1}\) where m is the distance in meters between the location and the source and R is a measure of the “reward” at the source, often expressed as the concentration of CO_{2} for female mosquitoes; and (ii) the distance beyond which a mosquito is negligibly attracted to a swarm site, called the range (\(\sigma\)) with units m. The attractiveness of a swarm site at distance \(\sigma\) from a mosquito is roughly 38% of its attractiveness at zero distance.
Finding prior information on these parameters proved difficult for the particular setting of this model, in particular for \({\alpha }\). We therefore chose a very broad prior distribution for \({\alpha }\) that spans several orders of magnitude, from \(1\%\) (1st percentile) to \(140\%\) (99th percentile) of the attraction at the source. For the range \(\sigma\) defined the same way we do, different articles on related species provide values ranging from 3 [28] to 18 m for single baiting or > 36 m for double baiting [29], or even 40 m [30]. We therefore used a lognormal distribution for this prior with positive probability across all these possibilities (Table 3).
Observation model
An important advantage of the Bayesian paradigm is that it allows us to develop more realistic hierarchical models that capture uncertainty in the biological and physical processes that drive a phenomenon, as well as the uncertainty associated with the measurement process with which these processes are observed. In this context, this allows us to reflect the uncertainty about how many mosquitoes any particular trap will catch given a known number of mosquitoes in the proximity of the trap.
Here we represent the observation model uncertainty using three, trapspecific, catchability parameters, denoted \(p_{\text{cfr}}\) for claypots, \(p_{\text{psc}}\) for insecticide spraying and \(p_{\text{ss}}\) for swarm sampling. These parameters define the probability of catching a mosquito given that it was in close proximity of the trap at the time of sampling [31]. Note that prior and posterior estimates of these parameters depend on the model resolution because the definition of “close proximity” is determined by the resolution of the raster (i.e. the grid cell size) over which the PDE model is numerically solved. Hence, all predictions must be made at the same resolution used for solving the PDE and updating the prior distributions.
The priors for the catchability parameters were chosen according to the following principles:

For \(p_{\text{ss}}\), a beta distribution with parameters \(\alpha_{\text{ss}} = 10\) and \(\beta_{\text{ss}} = 20\) (meaning on average one third of the mosquitoes in the swarm are caught, following the standard operating procedures of the MRR experiment, which requires one third of the swarm to be captured).

For \(p_{\text{psc}}\), a beta distribution with parameters \(\alpha_{\text{psc}} = 1.4\) and \(\beta_{\text{psc}} = 1\). This prior was chosen because the PSC procedure involves spraying an entire room and capturing all the dead mosquitoes in the room. The PDE model, however, is specified at a resolution of the compounds, which on average have 1.4 rooms. We assume that each room has on average the same number of male mosquitoes.

For \(p_{\text{cfr}}\), assuming this is the least efficient method but not having more information we deliberately selected a fairly low informative prior. So we chose a beta distribution with parameters \(\alpha_{\text{cfr}} = 1\) and \(\beta_{\text{cfr}} = 1\).
A complete list of process and observation model priors and their sources is summarised in Table 3.
Inference
WT male parameters
A Bayesian hierarchical model (BHM) is classically written:
where here \(\theta\) is the vector of parameters of the PDE model and the trap catchability parameters, \(\lambda\) refers to the expected abundance of male mosquitoes in the vicinity of the traps and y refers to the MRR data collected following the 12 releases (\(r = {1, \dots , 12}\)), observed with three different collection (trap) techniques (\(c = {1, \dots , 3}\)). Assuming conditional independence of the observations, the posterior distribution can be rewritten as:
where \(\theta _d\) is the list of parameters defined in Table 1.
For a given release r, we describe the number of available released mosquitoes at time t as a Poisson random variable with expectation \(\lambda _r(t)\), where the dependence of this expectation on location is suppressed:
where \(\lambda _r(t)\) is given by the PDE model described in Eq. 1. The data \(\mathsf{y}\) are the observed count of male mosquitoes in a given trap. Since the traps catch only a portion of the mosquitoes that occur in their immediate vicinity, a natural trap observation model is the binomial distribution:
where \(n_r\) is a realization of the Poisson random variable \(\lambda _r\) (the expected number of released mosquitoes in the trap’s immediate vicinity), and \(p_c\) is the catchability parameter, or “trap efficiency”, which is trap dependent. In this study we do not account for the possibility of false positive (insects or mosquito species incorrectly identified as released male mosquitoes) or false negative (released male An. gambiae mosquitoes incorrectly identified as something else) probabilities as they are assumed to be equal to zero [20].
Given the binomial likelihood, the posterior distribution is obtained by integrating over the possible Poisson outcomes:
Markov Chain Monte Carlo (MCMC) inference methods have been successfully applied to hierarchical Bayesian models, similar to the model described here, on many occasions (see, for example [32,33,34,35]). We found, however, that the standard randomwalk Metropolis MCMC routine was too slow to mix. Furthermore, in this context, more advanced methods, such as Hamiltonian Monte Carlo sampling, which require repeated likelihood computations along the proposal path, are inefficient because of the high computational costs entailed by the need to numerically solve the PDE model for each new proposal.
We found a reasonable compromise through the DelayedRejection Adaptive Metropolis (DRAM) algorithm, proposed by [36]. This MCMC algorithm uses a multivariate proposal distribution that automatically adapts to allow for posterior correlations between the parameters and identifies the directions of principal change along the ridges in the posterior surface. The acceptance rate of the DRAM algorithm is also improved by using a delayed rejection scheme where, instead of immediately advancing the chain following rejection of a parameter set, a second proposal is made that depends on both the current position of the chain and the rejected parameter set. We implemented DRAM by using the function modMCMC in the FME package [37] in R ([38]; https://www.Rproject.org/), using one delayed rejection step and updating the proposal distribution every 200 iterations. We run a total of three MCMC chains of 15000 samples each, with 5000 used as burnin. The convergence of the chains was assessed using Gelman’s \(\hat{R}\) criterion (see chapter 11 in [39]).
Posterior prediction validation
We evaluated the accuracy of the posterior model predictions by comparing them against the observed recaptures in the MRR experiment that we deliberately withheld from the inference procedure. The fifth MRR experiment was conducted at the same location as the first four experiments, and under similar conditions with two exceptions: (i) all marked male mosquitoes were released at a single location, and (ii) a much larger number of mosquitoes were released (Table 2). The experimental conditions and population dynamics might therefore be considered to be slightly different than those which prevailed during the first four experiments. If the model is nonetheless able to make sufficiently accurate predictions, then we may be more confident in its ability to be generalised to other similar release scenarios.
Bayesian model averaging
The field data for the WT male mosquito allows us to calculate the posterior distribution for the WT mortality and dispersal distance parameters. This in turn allows us to measure how well each expert’s prior distribution matches the posterior distribution, and weight each expert’s response accordingly. We do this by considering each expert’s prior as an alternative model and use the theory of model evidence [40] to calculate the Bayes factor [19, 41].
We assume that experts who are good at predicting outcomes with WT mosquitoes (i.e. those whose prior distributions are close to the posterior distributions) will also be good at predicting outcomes with DSM mosquitoes, and weight the experts’ DSM priors by the posterior probability of the models using Bayesian model averaging to obtain a weighted linear pool of expert opinion for the DSM mortality and dispersal distance (Fig. 1).
Results
WT parameters
Table 4 provides the summary statistics of the posterior distributions of the PDE model parameters (Fig. 2 provides both the priors and posteriors for comparison). The use of the MRR data allowed us to provide refined estimates for the parameters of the PDE model. The diffusion coefficient in particular has seen its uncertainty decrease, yielding a mean value of 127 m^{2} per day. The chemotaxis component has a posterior mean value of about 0.07, while the range parameter for the mosquito attraction is about 33.9 m.
The posterior samples of the dispersal and swarm attraction parameters are highly correlated such that high dispersal values are associated with low attraction, and viceversa. We anticipated this and deliberately chose a weakly informative prior for the attraction parameter to allow the data to drive their posterior estimates as far as possible. The largest observable dispersal, however, is clearly dictated by the distance between the sources of attractant (compounds) and the traps laid out in the field. In this instance, all traps were laid within 500 m of release sites and compounds (Fig. 1 in [20]), limiting the ability to infer the possibility of the much higher dispersal values represented in the expert prior (Fig. 2).
Spread and persistence of male WT mosquitoes
Figure 3 provides an overview of the expected evolution of the number of mosquitoes caught in the set of traps set for the fifth MRR (just 14 selected locations where at least 1 mosquito was caught during MRR 5). When comparing the model outcomes to the actual capture data, we note that of the 365 observations (of the number of mosquitoes caught in a given trap at a given time), only 11 observations report numbers outside the 90% credible interval given by the model, providing a good level of confidence on the ability of the model to capture the general dynamic and its ability to make usable predictions both in space and time about the spread and dispersal of WT strain mosquitoes. The reported rootmeansquare deviation (RMSE) for this crossvalidated experiment is 0.335.
DSM parameters
The weighted linear pool priors for DSM mortality and diffusion (weighted according to the Bayesian model averaging approach detailed above) are summarised in Table 5. The two parameters are quite different from the WT posterior estimates, with the DSM mortality prior sixfold higher than the WT posterior, and the dispersal multiple orders of magnitude bigger.
The difference between the posterior distributions for WT and DSM mortality reflects the effect of the Bayesian model averaging but also the significant reduction in uncertainty that occurs when moving from prior to posterior distributions (see Fig. 2), as often occurs in a Bayesian analysis. The linear pool of expert prior distribution for the dispersal distance of WT mosquitoes was also very much higher than its posterior distribution estimated using the data. It is possible therefore that the weighted linear pool prior for DSM dispersal will also prove to be an overestimate, noting of course that the posterior estimates of WT dispersal are conditional on the design of the MRR experiments which was established before this analysis.
Spread and persistence of male DSM mosquitoes
The predicted persistence of DSM mosquitoes following a release of a single cohort of 5000 males is summarised in Fig. 4. The large uncertainty captured in Table 5 is clearly reflected in these predictions: the mean expected survival is estimated to be about 9 to 10 days, with a 90% probability that the expected number of male DSM mosquitoes falls to < 1 by day 10. The 90% credible interval, however, is large, and there is a small probability (c. 0.05) that it could take as long as about 2 months for this to occur.
The dispersal of the DSM mosquitoes is balanced by the mortality rate. Because of this, the DSM are not expected to disperse far from their release site. The expected number of mosquitoes per cell is delineated by the red and orange contours in Fig. 5. For instance, while the spread is expected to extend on average to up to 500 m away from the release location by day 2 (with at most a 1% chance of finding a mosquito further away), the extent then contracts quickly to a limited area (< 250 m from release location) by day 5, to finally about 150 m away from the release location at day 12.
Discussion
This analysis uses a combination of mathematical modelling, Bayesian inference and expert elicitation to predict the spread and persistence of genetically modified DSM mosquitoes following the release of a single cohort of 5000 individuals. These predictions were made as part of an independent risk assessment [18, 42] (available at: https://targetmalaria.org/resources/library/?category=risks_andassessments) that also addressed the accidental release of fertile female mosquitoes. The results reported here helped determine the scope of the assessment and in particular provided the rationale for excluding the possibility of effects on nontarget organisms caused by changes in food web structures or the provision of ecological services: the predicted survival of the DSM population is too short, and the associated population size too small, to cause any noticeable effect on nontarget organisms or ecosystem processes.
The risk assessment was finalised and made public in May 2018, prior to the decision by the National Biosafety Agency (NBA) of Burkina Faso to authorise the field trial (July 2018), but after the submission by Target Malaria of a biosafety application to the NBA seeking authorisation for the release (November 2017; see: https://targetmalaria.org/resources/library/?category=developmentpathway). The field release itself occurred in July 2019 (see: https://tinyurl.com/y4qr9lc5). We anticipate that the results from this field experiment will soon be forthcoming and that they will provide an important opportunity to test the accuracy of the predictions presented here. The risk assessment results are case specific and do not generalise to other genetic control strategies that involve (for example) population modification strategies.
In this analysis we were able to validate the model predictions for WT mosquitoes by holding back a proportion of the observation data. This also enabled us to use Bayesian model averaging to identify experts who were more accurate in their prior predictions. By allowing the opinions of these experts to carry greater weight, we were also able to reduce uncertainty in the prior predictions for DSM mosquitoes. This approach assumes, however, that experts who make more accurate predictions about WT mosquitoes will also make better predictions about genetically modified mosquitoes. We will also be able to test this assumption once the results of the DSM field release are published.
Our results demonstrate how field observations greatly reduce the uncertainty between the prior information elicited with independent domain experts and the posterior distribution. Despite their relatively large uncertainty, our experience is that expertderived prior distributions are essential when attempting to run inference over the multidimensional parameter space that dynamic population models demand, and that careful elicitation will therefore continue to be an essential component of future risk assessment studies.
Our posterior estimates of WT dispersal are conditional on a series of MRR experiments that were conducted prior to the elicitation of the dispersal prior distributions. These priors played no part in the experimental design, and the difference between the prior and posterior distributions depends on this design. This difference may reflect conservative prior estimates of mosquitoes’ mean daily dispersal when in the vicinity of attractants (swarm locations), but it may also be an artefact of the MRR design that focussed efforts in and around village compounds to maximise the number of recaptures. As Epopa et al. [20] notes, however, further intensive sampling studies outside or around villages will be useful to ensure that posterior estimates of dispersal are not blind to longrange dispersal outcomes that are not witnessed because of the recapture site design. There is also the possibility of an impact of genetic modification on dispersal, either through lifehistory traits linked to vagility or perhaps hostseeking behavioural traits. If vagility is negatively impacted, then dispersal could be less for DSM compared to WT mosquitoes, but if hostseeking ability was instead curtailed, then dispersal could perhaps decrease for DSM relative their WT counterparts. As mentioned above, the analysis conducted here could be repeated to take into account the DSM MRR data that were in the planning stages at the time of the risk assessment.
We believe that dynamic population models will form a central component of any riskbased governance system for gene drivemodified mosquitoes. We therefore suspect that confidence in this governance structure will likely depend on the extent to which risk assessments are able to predict the spread and the persistence of mosquitoes carrying gene drive constructs, within the bounds of adequate accuracy. Critically, a staged release strategy provides the opportunity to compare the predictions of these types of models against observed outcomes, learn how to approach the modelling and inference challenges and gradually define the bounds of accuracy that regulators and stakeholders believe are adequate.
This staged learning is borne out by this analysis: the reproductive containment strategy utilised in the DSM mosquitoes provides a good, relatively simple, starting point for this type of analysis. The reaction dynamics in our model are greatly simplified by virtue of the fact that the released male mosquitoes are sterile. The dynamic models for the second stage in Target Malaria’s pathway, namely selflimiting male bias strains, will be more complicated because they must accommodate the birth processes and genetic dynamics that are not relevant here, and also other potentially important interspecific interactions [43]. The models for the third and final stage, that is selfsustaining driving male bias strains, will be more complicated again particularly because the interspecific interactions are likely to be more important and because the spatiotemporal scope of the analysis will be significantly larger.
Conclusion
Our analysis demonstrates the strength of the Bayesian approach in the context of staged learning. Given its qualities, we believe this paradigm is the most appropriate to handle the prediction and risk assessment challenges that novel, gene drivebased strategies for controlling malaria vectors will pose in the coming years.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request. The code used to generate the results is accessible from the following github repository ick003/ rRiskDSMspread.
Abbreviations
 BHM:

Bayesian hierarchical model
 cfr:

Clay pots resting catches
 DRAM:

DelayedRejection Adaptive Metropolis
 DSM:

Dominant sterile male
 GDMMs:

Genedrive modified mosquitoes
 MCMC:

Markov chain Monte Carlo
 MRR:

Markreleaserecapture
 NBA:

National Biosafety Agency
 PDE:

Partial differential equation
 PSC:

Pyrethroid spray catches
 SS:

Swarm sampling
 WT:

Wild type
References
 1.
World Health Organisation. World malaria report 2019. Geneva: World Health Organisation; 2019. https://www.who.int/publications/i/item/worldmalariareport2019. Accessed 1 Dec 2020.
 2.
Gabrieli P, Smidler A, Catteruccia F. Engineering the control of mosquitoborne infectious diseases. Genome Biol. 2014;15(11):535. https://doi.org/10.1186/s1305901405357.
 3.
Beaghton A, Hammond A, Nolan T, Crisanti A, Godfray HCJ, Burt A. Requirements for driving antipathogen effector genes into populations of disease vectors by homing. Genetics. 2017;205(4):1587–96. https://www.genetics.org/content/205/4/1587.
 4.
CarballarLejarazú R, Ogaugwu C, Tushar T, Kelsey A, Pham TB, Murphy J, et al. Nextgeneration gene drive for population modification of the malaria vector mosquito, Anopheles gambiae. Proc Natl Acad Sci USA. 2020;117(37):22805–14.
 5.
Simoni A, Hammond AM, Beaghton AK, Galizi R, Taxiarchi C, Kyrou K, et al. A malebiased sexdistorter gene drive for the human malaria vector Anopheles gambiae. Nat Biotechnol. 2020;38(9):1054–60. https://doi.org/10.1038/s4158702005081.
 6.
Ledord H, Callaway E. ‘Gene drive’ mosquitoes engineered to fight malaria. Nature. 2015. https://doi.org/10.1038/nature.2015.18858
 7.
Champer J, Buchman A, Akbari OS. Cheating evolution: engineering gene drives to manipulate the fate of wild populations. Nat Rev Genet. 2016;17(3):146–59. https://doi.org/10.1038/nrg.2015.34.
 8.
Akbari OS, Bellen HJ, Bier E, Bullock SL, Burt A, Church GM, et al. Safeguarding gene drive experiments in the laboratory. Science. 2015;349(6251):927–9. https://science.sciencemag.org/content/349/6251/927.
 9.
WHOTDR & FNIH. The Guidance Framework for testing of genetically modified mosquitoes. Geneva: World Health OrganizationTDR and the Foundation for the National Institutes of Health; 2014. http://www.who.int/tdr/publications/year/2014/guidefmrkgmmosquit/en/ . Accessed 01 Mar 2018.
 10.
National Academies of Sciences Engineering and Medicine. Gene drives on the horizon: advancing science, navigating uncertainty, and aligning research with public values. Washington DC: National Academies of Sciences Engineering and Medicine: The National Academies Press; 2016.
 11.
Huestis DL, Dao A, Diallo M, Sanogo ZL, Samake D, Yaro AS, et al. Windborne longdistance migration of malaria mosquitoes in the Sahel. Nature. 2019;574(7778):404–8. https://doi.org/10.1038/s4158601916224.
 12.
Klein TA, Windbichler N, Deredec A, Burt A, Benedict MQ. Infertility resulting from transgenic IPpoI male Anopheles gambiae in large cage trials. Pathog Glob Health. 2012;106(1):20–31. https://doi.org/10.1179/2047773212Y.0000000003.
 13.
Galizi R, Doyle LA, Menichelli M, Bernardini F, Deredec A, Burt A, et al. A synthetic sex ratio distortion system for the control of the human malaria mosquito. Nat Commun. 2014;5(1):1–8.
 14.
Nash A, Urdaneta GM, Beaghton AK, Hoermann A, Papathanos PA, Christophides GK, et al. Integral gene drives for population replacement. Biol Open. 2019;8(1). https://bio.biologists.org/content/8/1/bio037762.
 15.
Beaghton A, Beaghton PJ, Burt A. Vector control with driving Y chromosomes: modelling the evolution of resistance. Malar J. 2017;16(1):286. https://doi.org/10.1186/s1293601719327.
 16.
North AR, Burt A, Godfray HCJ. Modelling the potential of genetic control of malaria mosquitoes at national scale. BMC Biol. 2019;17(1):26. https://doi.org/10.1186/s1291501906455.
 17.
Dufourd C, Dumont Y. Modeling and simulations of mosquito dispersal. The case of Aedes albopictus. Biomath. 2012;1(2):1–7.
 18.
Hayes KR, Barry S, Beebe N, Dambacher JM, Barro PD, Ferson S, et al. Risk assessment for controlling mosquito vectors with engineered nucleases: sterile male construct final report. CSIRO; 2015. https://fnih.org/whatwedo/geneconvene/impact/technicaladvice. Accessed 01 Mar 2018.
 19.
Hosack GR, Ickowicz A, Hayes KR. Quantifying the risk of vectorborne disease transmission attributable to genetically modified vectors. R Soc Open Sci. 2020;8(3):201525.
 20.
Epopa PS, Millogo AA, Collins CM, North A, Tripet F, Benedict MQ, et al. The use of sequential markreleaserecapture experiments to estimate population size, survival and dispersal of male mosquitoes of the Anopheles gambiae complex in Bana, a west African humid savannah village. Parasit Vectors. 2017;10(1):376. https://doi.org/10.1186/s1307101723106.
 21.
Cummins B, Cortez R, Foppa IM, Walbeck J, Hyman JM. A spatial model of mosquito hostseeking behavior. PLoS Comput Biol. 2012;8(5):e1002500.
 22.
Pasternak Z, Bartumeus F, Grasso FW. Levytaxis: a novel search strategy for finding odor plumes in turbulent flowdominated environments. J Phys A Math Theor. 2009;42(43):434010.
 23.
de Jong R, Knols BGJ. Olfactory responses of hostseeking Anopheles gambiae s. s. Giles (Diptera: Culicidae). Acta Tropica. 1995;59:333–5.
 24.
Majeed S, Hill SR, Dekker T. Detection and perception of generic host volatiles by mosquitoes: responses to CO_{2} constrains hostseeking behaviour. R Soc Open Sci. 2017;4:160647.
 25.
Diabate A, Tripet F. Targeting male mosquito mating behaviour for malaria control. Parasit Vectors. 2015;8:347. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4485859/.
 26.
Moorcroft PR, Barnett A. Mechanistic home range models and resource selection analysis: a reconciliation and unification. Ecology. 2008;89(4):1112–9. https://doi.org/10.1890/061985.1.
 27.
Lindley DV. The philosophy of statistics. J R Stat Soc Ser D (The Statistician). 2000;49(3):293–337. https://doi.org/10.1111/14679884.00238.
 28.
McIver SB, McElligott PE. Effects of release rates on the range of attraction of carbon dioxide to some southwestern Ontario mosquito species. J Am Mosq Control Assoc. 1989;5(1):6–9.
 29.
Gillies MT, Wilkes TJ. The range of attraction of animal baits and carbon dioxide for mosquitoes. Studies in a freshwater area of West Africa. Bull Entomol Res. 1972;61(03):389.
 30.
Zhu L, Qualls WA, Marshall JM, Arheart KL, DeAngelis DL, McManus JW, et al. A spatial individualbased model predicting a great impact of copious sugar sources and resting sites on survival of Anopheles gambiae and malaria parasite transmission. Malar J. 2015;14(1):59.
 31.
Martin TG, Wintle BA, Rhodes JR, Kuhnert PM, Field SA, LowChoy SJ, et al. Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecol Lett. 2005;8(11):1235–46.
 32.
McGoff K, Mukherjee S, Pillai N. Statistical inference for dynamical systems: a review. Stat Surv. 2015;9:209–52. http://arxiv.org/abs/1204.6265.
 33.
Chkrebtii O, Campbell D, Calderhead B, Girolami M. Bayesian solution uncertainty quantification for differential equations. Bayesian Anal. 2016;11(4):1269–73. http://projecteuclid.org/euclid.ba/1480474948.
 34.
Wikle CK. Hierarchical Bayesian models for predicting the spread of ecological processes. Ecology. 2003;84(6):1382–94.
 35.
Ruggeri F, Sawlan Z, Scavino M, Tempone R. A hierarchical Bayesian setting for an inverse problem in linear parabolic PDEs with noisy boundary conditions. Bayesian Anal. 2017;12(2):407–33.
 36.
Haario H, Saksman E, Tamminen J. An adaptive metropolis algorithm. Bernoulli. 2001;7(2):223–42. http://projecteuclid.org/euclid.bj/1080222083.
 37.
Soetaert K, Petzoldt T. Inverse Modelling, Sensitivity and Monte Carlo Analysis in R using Package FME. 2010; 1–15. https://cran.rproject.org/web/packages/FME/index.html.
 38.
R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021. https://www.Rproject.org/.
 39.
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian data analysis. Boca Raton: CRC Press; 2014.
 40.
Robert CP. The Bayesian choice: from decisiontheoretic foundations to computational implementation. 2nd ed. Berlin: Springer Science & Business Media; 2007.
 41.
Kass RE, Raftery AE. Bayes factor. J Am Stat Assoc. 1995;90(430):773–95.
 42.
Hayes KR, Hosack GR, Ickowicz A, Foster S, Peel D, Ford J, et al. Risk assessment for controlling mosquito vectors with engineered nucleases: controlled field release for sterile male construct. Hobart: CSIRO; 2018. https://fnih.org/whatwedo/geneconvene/impact/technicaladvice. Accessed 01 Dec 2020.
 43.
Beeton NJ, Hosack GR, Wilkins A, Forbes LK, Ickowicz A, Hayes KR. Modelling competition between hybridising subspecies. J Theor Biol. 2020;486:110072. http://www.sciencedirect.com/science/article/pii/S0022519319304412.
Acknowledgements
This study would not be possible without the contribution of the domain experts—Heather Ferguson, Tovi Lehmann and Fred Gould. The authors would also like to acknowledge the numerous discussions and feedback received from Ron Thresher, Uffe Høgsbro Thygesen and Peter Caley. The authors would like to thank Jess Ford, Jeff Dambacher and Anders Gonçalves da Silva for helping facilitate the expert elicitation sessions. This analysis was funded by the Foundation for the National Institutes of Health and CSIRO Health and Biosecurity. The authors would also like to acknowledge CSIRO Data61 and Target Malaria for their help and support.
Author information
Affiliations
Contributions
AI carried out the original draft preparation, proposed the statistical model and ran the statistical analyses. SF wrote the code for the PDE model. KH and GH participated in expert elicitation. All authors participated on drafting and revision of the manuscript. All authors agree to be held accountable for the work performed herein. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
All elicitations were conducted with the informed consent of independent experts under ethics approval from the CSIRO Social Science Human Research Ethics Committee (reference LR06/2014). The markreleasecapture data were kindly provided by the Institut de Recherche en Sciences de la Santé, BoboDioulasso, Burkina Faso, and were collected with approval from the local institutional ethics committee (Centre Muraz Institutional Ethics Committee), reference number 0092012/CECM.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1.
Elicitation process, questions and answers.
Additional file 2.
Derivation of the relationship between the diffusion coefficient and the average dispersal distance.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Ickowicz, A., Foster, S.D., Hosack, G.R. et al. Predicting the spread and persistence of genetically modified dominant sterile male mosquitoes. Parasites Vectors 14, 480 (2021). https://doi.org/10.1186/s13071021049821
Received:
Accepted:
Published:
Keywords
 Entomological survey data
 Sterile male
 Expert elicitation
 Bayesian hierarchical model
 Monitoring