### Mosquito population model

We developed a discrete-time stochastic model that simulates the life cycle of a mosquito population. A schematic of the model is outlined in Fig. 1 and is described below.

The life cycle of a mosquito consists of four main stages: egg, larvae, pupae and adult. The first three stages are aquatic and, although there is variation in the wild, last approximately 5–14 days [38]. Larvae emerge from eggs and feed on algae, bacteria and other small particulate matter whilst developing through four separate larval instars, before metamorphosing into pupae and then finally adults. After maturation into the adult stage and having successfully mated, a female begins its first gonotrophic cycle. During the gonotrophic cycle, which lasts approximately 3 days, the mosquito takes a blood meal from a human or animal host, after which the blood is digested during a resting phase and used to develop the eggs in the ovaries. Once the eggs have matured, the mosquito is considered gravid and will search for a suitable site for oviposition. This process is then repeated until the eventual death of the adult.

As in previous models [28], to reduce complexity but maintain biological realism, we group the first two larval stages into a single class, early instar larval stage *E*, and the third and fourth stages into a late larval stage class *L.* Pupae *P* and adults *M* remain as single stages. Transition between early- and late-stage larvae and associated density-dependent mortality can be described as a binomial event occurring within discrete time \(\delta t\). The probability of mortality or transition can be defined as \(\delta t\) multiplied by the sum of the two respective rates, development to late-stage larvae (\(d_{E} )\) and density-dependent mortality \(\mu_{E} \left( {E,L} \right)\). A proportion \(\frac{{d_{E} }}{{\mu_{E} \left( {E,L} \right) + d_{E} }}\) of all the larvae transitioning out of *E* are expected to enter the state *L*, again simulated by sampling from the corresponding binomial distribution. Similarly, transitions between late-stage larvae and pupae can be described with a probability \(d_{E}\) and density-dependent mortality of \(\mu_{E} \left( {E,L} \right)\). Pupae then develop into adults with a probability of \(d_{P}\) and a density-independent probability of mortality \(\mu_{p}\). Half of all emerging adults are assumed to be female, and male adults are not explicitly modelled (we assume that there are enough males to successfully mate with all females). For simplicity and model tractability we assume that adults die at an age-independent rate \(\mu_{{\text{m}}}\) and have an approximate average life expectancy of 9–12 days in the wild [39]. Regular immigration of adult mosquitos from external populations is known to occur in most wild populations [40]; therefore, migration is modelled as a Poisson process with rate \(I\). The input of new larvae into the system comes from eggs laid at a constant rate, \(M\beta \delta t\), where \(\beta\) is a fitted parameter for the average daily egg-laying rate of adult females.

The dynamics of a mosquito population can thus be described as a series of stochastic equations where \({\text{Bin}}\left( {p,n} \right)\) denotes a binomial draw with \(p\) probability and \(n\) trials, and \({\text{Poisson}}\left( \lambda \right)\) a draw from a Poisson distribution with given lambda.

$$E\left( {t + \delta t} \right) = E\left( t \right) - B_{E} + M\beta \delta t,$$

$$L\left( {t + \delta t} \right) = L\left( t \right) - B_{L} + {\text{Bin}}\left( {\frac{{d_{E} }}{{\mu_{E} \left( {E,L} \right) + d_{E} }},B_{E} } \right),$$

$$P\left( {t + \delta t} \right) = P\left( t \right) - B_{P} + {\text{Bin}}\left( {\frac{{d_{L} }}{{\mu_{L} \left( {E,L} \right) + d_{L} }},B_{L} } \right),$$

$$M\left( {t + \delta t} \right) = M\left( t \right) + \frac{1}{2}{\text{Bin}}\left( {\frac{{d_{P} }}{{d_{P} + \mu_{P} }},B_{P} } \right) - {\text{Bin}}\left( {\mu_{M} \delta t,M} \right) + {\text{Pois}}\left( { I} \right),$$

(1)

where

$$B_{E} = {\text{Bin}}\left( {\left( {d_{E} + \mu_{E} \left( {E,L} \right)} \right)\delta t,E} \right),$$

$$B_{L} = {\text{Bin}}\left( {\left( {d_{L} + \mu_{L} \left( {E,L} \right)} \right)\delta t,L} \right),$$

$$B_{P} = {\text{Bin}}\left( {(d_{P} + \mu_{P} )\delta t,P} \right).$$

### Density-dependent mortality

Density dependence in mosquito larval stages results from the competition for resources during the early and late instars; however, pupae are not pressured by resources, as they do not feed. For simplicity, although adults may have additional density constraints, it is assumed in this model that adults are not constrained by density effects [41].

Density dependence for early-stage larvae *E*(*t*) and late-stage larvae *L*(*t*) at time \(t\) can therefore be described as \(\mu_{E} \left( {E\left( t \right),L\left( t \right)} \right)\) and \(\mu_{L} \left( {E\left( t \right),L\left( t \right)} \right)\).

To quantify the effect of egg-laying under different potential systems of density dependence, three functional forms were considered within the model:

A linear relationship between larval density and density-dependent mortality, as used by White et al*.* [28]:

$$\mu_{E} \left( {E\left( t \right),L\left( t \right)} \right) = \mu_{E}^{0} \left( {1 + \frac{E\left( t \right) + L\left( t \right)}{{K\left( t \right)}}} \right)^{ } ,$$

(2)

$$\mu_{L} \left( {E\left( t \right),L\left( t \right)} \right) = \mu_{L}^{0} \left( {1 + \gamma \frac{E\left( t \right) + L\left( t \right)}{{K\left( t \right)}}} \right).$$

Density-dependent mortality, which grows as some (fitted) power (\(\Omega\)) of larval density, resulting in \(\Omega\) determining the intensity of density dependence, where \(\Omega\) < 1 intensity grows at a decreasing rate, whilst \(\Omega\) > 1 intensity grows at an increasing rate.

$$\mu_{E} \left( {E\left( t \right),L\left( t \right)} \right) = \mu_{E}^{0} \left( {1 + \frac{E\left( t \right) + L\left( t \right)}{{K\left( t \right)}}} \right)^{ \Omega } ,$$

(3)

$$\mu_{L} \left( {E\left( t \right),L\left( t \right)} \right) = \mu_{L}^{0} \left( {1 + \gamma \frac{E\left( t \right) + L\left( t \right)}{{K\left( t \right)}}} \right)^{\Omega } .$$

Exponential density dependence, where mortality increases exponentially with larval population density:

$$\mu_{E} \left( {E\left( t \right),L\left( t \right)} \right) = \mu_{E}^{0} e^{{\left( {\frac{E\left( t \right) + L\left( t \right)}{{K\left( t \right)}}} \right)}} ,$$

(4)

$$\mu_{L} \left( {E\left( t \right),L\left( t \right)} \right) = \mu_{L}^{0} e^{{\left( {\gamma \frac{E\left( t \right) + L\left( t \right)}{{K\left( t \right)}}} \right)}} .$$

Here, \(\mu_{E}^{0}\) and \(\mu_{L}^{0}\) are the death rates for early and late instars at low densities, \(K\left( t \right)\) is the environmental carrying capacity at time \(t\) and \(\gamma\) is the difference in effect of density dependence on late-stage instars compared with early-stage instars. Larval carrying capacity (*K*) was assumed to be proportional to rainfall from the previous \(\tau\) days weighted by an exponential distribution with mean \(2\tau\), where \({\text{rain}}\left( t \right)\) in millimeters is the daily rainfall and \(\lambda\) is a village-specific scaling factor as used by White et al*.* [28]:

$$K\left( t \right) = \lambda \frac{1}{{\tau \left( {1 - e^{{ - {\raise0.7ex\hbox{$t$} \!\mathord{\left/ {\vphantom {t \tau }}\right.\kern-\nulldelimiterspace} \!\lower0.7ex\hbox{$\tau $}}}} } \right)}}\mathop \int \limits_{0}^{t} e^{{ - \left( {t - t^{\prime}} \right)/\tau }} {\text{rain}}\left( {t^{\prime}} \right){\text{d}}t^{\prime}.$$

(5)

As there were inconsistent records of temperature in the Garki project for the study sites, we assumed a single fixed value. Similarly to White et al. [28], we used the mean air temperature of 24 °C to obtain an approximate mean value for the water temperature of 28 °C, and based our priors on this value.

### Clumped egg-laying

To simulate clumped egg-laying, the input of eggs into the system (\(M\beta \delta t)\) in Eq. (1) is replaced by a stochastic “clumping” process. Here, the probability of each adult female laying eggs during any one time step \(\delta t\) is \(\frac{\delta t}{S}\), where \(S\) is the duration of the gonotrophic cycle. The total number of females laying eggs, \(E_{x}\), is thus sampled from a binomial distribution \(E_{x} = {\text{Bin}}\left( {\frac{\delta t}{S},M} \right)\). The number of eggs laid in that time step is then sampled from a Poisson with mean \(E_{x} n\), where \(n\) is the mean clutch size \({\text{Pois}}\left( {E_{x} n} \right)\). As \(n\), unlike \(\beta\), explicitly considers the gonotrophic cycle \(S\), the relationship between the two values is dependent on the average number of gonotrophic cycles within a mosquito's lifespan (\(\mu_{{\text{m}}}\)); therefore, approximately, \(n = \frac{{ \beta \left( {e^{{S\mu_{{\text{m}}} }} - 1} \right)}}{{\mu_{{\text{m}}} }}.\)

### Data

The model was fitted to rainfall and mosquito catch data obtained from the Garki project [39]. The project was a World Health Organization-funded study into the effects of malaria and vector interventions during the 1970s and presents one of the most detailed longitudinal data sets for combined mosquito and meteorological data. The data consist of environmental, demographic and mosquito catch data from 24 villages over an approximately 5-year period. Time series of *Anopheles gambiae* caught by pyrethrum spray in houses were extracted for eight villages and the data pooled to the village level. From these villages, two were from control areas where no interventions were carried out after the first rainy season, allowing the data to be split into two successive seasons, giving a total of ten population time series.

### Parameter estimation

We used an adaptive particle Markov chain Monte Carlo algorithm (pMCMC) to directly fit our stochastic model of mosquito dynamics to the data collected during the Garki project. pMCMC is a form of MCMC which uses a particle filter component to calculate a marginal likelihood for parameter acceptance/rejection within a Metropolis-Hastings algorithm [42, 43]. To account for the overdispersion and fractional sampling of the population, a beta-binomial likelihood function was used, with the parameter accounting for overdispersion \(r\) and the fraction of the population sampled \(p\) being fitted (see Additional file 1 for details of the pMCMC algorithm and likelihood methods).

The number of particles and chain length in a pMCMC has no set rules, and there is a trade-off between minimising the noise in the model and maintaining a realistic compute time. For each scenario of density dependence and egg-laying, we looked at the decrease in variation for set parameters when increasing the particle number and estimated the time to convergence in the chains. Therefore, the pMCMC chain was run with 150 particles for 2,000,000 iterations, with a 100,000-iteration burn-in. Attempts to increase the particle number and achieve convergence when fitting with an exponential density dependence were not successful, so full chains were not run. Priors were based on lab and field studies detailed in White et al. [28]. Approximate initial conditions for each of the villages were obtained by assuming deterministic equilibrium, allowing the value of each state variable to be derived from the value of a single estimate of the initial value of *E*, *L*, *P*, or *M* (see Additional file 1). This single parameter was denoted as *z* and estimated for each village. Most parameters were assumed to be common to all villages and were estimated by fitting the model to the data from all villages simultaneously, except village-/season-specific scaling factors \(\lambda_{{\left( {1 \ldots 10} \right)}}\) and initial condition parameters \(z_{{\left( {1 \ldots 10} \right)}}\), which were assumed to be village- and season-specific. Median values of the posterior distributions from the pMCMC chains for each parameter were then used to simulate the models for plotting, and 95% credible intervals were obtained by repeatedly sampling the joint posterior distribution.

### Mosquito population growth rate

*R*_{m} in the context of this paper is considered as the population growth rate in the low-density limit, as quantified by the number of adult female offspring produced per adult female. To calculate this, two values were derived: a raw value in the absence of density dependence and a second value which included the effects of density dependence. To obtain a raw analytical value we followed White et al*.* [28], where a female laying \(n\) viable eggs during an oviposition cycle can expect to oviposit approximately \(n\left( {e^{{ - S\mu_{{\text{m}}} }} + e^{{ - 2S\mu_{{\text{m}}} }} + \cdots } \right) = \frac{n}{{e^{{S\mu_{{\text{m}}} }} - 1}}\) eggs, where \(e^{{ - S\mu_{{\text{m}}} }}\), \(e^{{ - 2S\mu_{{\text{m}}} }}\), etc., is the proportion surviving one and two gonotrophic cycles, respectively. The fraction surviving to adulthood can then be defined respectively for clumped and non-clumped egg-laying as:

$$R_{{\text{m}}} = \frac{1}{2}\left( {\frac{n}{{e^{{S\mu_{{\text{m}}} }} - 1}}} \right)\left( {\frac{1}{{1 + \mu_{{\frac{E}{{d_{E} }}}}^{0} }}} \right)\left( {\frac{1}{{1 + \frac{{\mu_{L}^{0} }}{{d_{L} }}}}} \right)\left( {\frac{1}{{1 + \frac{\mu P}{{d_{L} }}}}} \right),$$

(6)

$$R_{{\text{m}}} = \frac{1}{2}\left( {\frac{ \beta }{{\mu_{{\text{m}}} }}} \right)\left( {\frac{1}{{1 + \mu_{{\frac{E}{{d_{E} }}}}^{0} }}} \right)\left( {\frac{1}{{1 + \frac{{\mu_{L}^{0} }}{{d_{L} }}}}} \right)\left( {\frac{1}{{1 + \frac{\mu P}{{d_{L} }}}}} \right).$$

(7)

However, clumped egg-laying means larval populations always experience some density-dependent competition; thus the raw value for *R*_{m} never predicts actual population growth. Accounting for minimal density dependence is analytically challenging, however. We therefore evaluated *R*_{m} numerically by simulating the model for a single female mosquito until its death in a system empty of other adults and recorded the number of female offspring surviving until adulthood. Simulations were repeated using multiple random samples of parameter sets sampled from the pMCMC posterior to estimate credible intervals. To account for stochastic variation, the mean was taken from 50 simulations of each parameter set.

### Effect on gene drive

To illustrate the effect on vector control, we looked at a test case of the impact on gene drive from changes in both density dependence and the addition of clumped egg-laying, by reconstructing the gene drive model formulated by Deredec et al. [20]. This model considers HEGs, a class of selfish genetic elements which are one of the most credible gene drive methods currently being developed. HEG efficacy is determined by the number of HEGs being used and their homing rate (the fraction of potential recipient heterozygous chromosomes that require the HEG). We ran the model using parameter estimates from our results for each of our scenarios and established the impact on the required HEGs and homing rate for elimination.