Predicting Culex pipiens/restuans population dynamics by interval lagged weather data

Background Culex pipiens/restuans mosquitoes are important vectors for a variety of arthropod borne viral infections. In this study, the associations between 20 years of mosquito capture data and the time lagged environmental quantities daytime length, temperature, precipitation, relative humidity and wind speed were used to generate a predictive model for the population dynamics of this vector species. Methods Mosquito population in the study area was represented by averaged time series of mosquitos counts captured at 6 sites in Cook County (Illinois, USA). Cross-correlation maps (CCMs) were compiled to investigate the association between mosquito abundances and environmental quantities. The results obtained from the CCMs were incorporated into a Poisson regression to generate a predictive model. To optimize the predictive model the time lags obtained from the CCMs were adjusted using a genetic algorithm. Results CCMs for weekly data showed a highly positive correlation of mosquito abundances with daytime length 4 to 5 weeks prior to capture (quantified by a Spearman rank order correlation of rS = 0.898) and with temperature during 2 weeks prior to capture (rS = 0.870). Maximal negative correlations were found for wind speed averaged over 3 week prior to capture (rS = −0.621). Cx. pipiens/restuans population dynamics was predicted by integrating the CCM results in Poisson regression models. They were used to simulate the average seasonal cycle of the mosquito abundance. Verification with observations resulted in a correlation of rS = 0.899 for daily and rS = 0.917 for weekly data. Applying the optimized models to the entire 20-years time series also resulted in a suitable fit with rS = 0.876 for daily and rS = 0.899 for weekly data. Conclusions The study demonstrates the application of interval lagged weather data to predict mosquito abundances with a feasible accuracy, especially when related to weekly Cx. pipiens/restuans populations.

Also precipitation has already been identified as another important factor influencing Cx. pipiens/restuans population dynamics, as high rainfall occurring several weeks before a capture event was positively correlated with the mosquito abundance [5,6]. The accurate mechanisms behind this relationship however are less well understood. It has been suggested that rainfall increases the water surface area and therefore possible oviposition sites [6,13]. Precipitation may also negatively affect mosquito capture, as the immature stages might get washed away by heavy rainfall. Geery [14] recorded a partial flushing of larvae (22-34% reduction) from catch basins in Cook County by rain events < 25 mm and an up to 91% reduction for strong rainfall of > 100 mm. Furthermore, precipitation may also decrease adult activity levels [15,16].
One of the classical approaches to investigate the relation between mosquito population dynamics and weather data is to correlate their time series [13,17]. In 2005 Curriero et al. [18] introduced cross-correlation maps (CCMs) as a tool to study the influence of environmental conditions during a time lagged interval (instead of using point lags, i.e., the conditions at a certain time point prior to the capture event) on the abundance of Ochlerotatus sollicitans, another Culicidae species. They demonstrated that lagged time intervals are better adapted to describe the mosquito abundance than point lags. Since then CCMs have been used to illustrate the correlations between various environmental factors and population size of Aedes sollicitans, Ae. vexans, Cx. pipiens, Cx. restuans and Cx. salinarius [5,19,20].
The aim of this study is to use interval lagged correlations between environmental factors and observed Cx. pipiens/restuans abundance, gained from CCMs, to develop a predictive model of mosquito population dynamics. This mosquito model should be able to simulate the seasonal cycle of Cx. pipiens/restuans populations and  help uncover the temporal scale influences on model performance by comparing daily versus weekly predictions. Such a model would be immensely useful for risk assessment studies i.e. of such arboviral diseases as mentioned above. An analogous study using CCMs and simulated population dynamics of midges was recently presented by the authors for Bluetongue disease in Austria [21]. In this study a mosquito model developed for a spatial scale representative for a major part of Cook County, and not for specific localized capture sites, is introduced. Such models encompassing larger areas are important tools for investigating mosquito born disease outbreaks and their intervention strategies.

Environmental data
The meteorological quantities used in this study were taken from the weather station of the Chicago O'Hare International Airport (WMO No. 72530 [23]) located at geographical longitude λ = 87.900°W, latitude = 41. 983°N and altitude h = 205 m. Air temperature T in°C, precipitation P in mm/day, relative humidity H in % (calculated from air and dew point temperature) and wind speed W in m/s were selected for this study as quantities potentially influencing mosquito abundance. Another quantity, influencing mosquito diapause, is the daytime length D in hours. It is a function of the sun's declination ε and the geographic latitude and was calculated using the CBM model described by Forsythe [24]: with the declination of the sun ε calculated as a function of the calendar day d: Figure 2 depicts the climate diagram calculated from temperature and precipitation data used in this study. Included parameters of the CCM and OPT model indicated by"+" or the used lag (not included quantities indicated by"-"), as well as their AIC, correlation coefficient r S and root mean square error RMSE.
Following the well known Köppen-Geiger climate classification [25], the climate conditions in Cook County were characterized by Dfa climate (D = continental or snow climate, f = fully humid, a = hot summers).

Statistical methods
Two different time scales were used in the analysis: daily and weekly data, wherefore we calculated the weekly means from the daily data. In a first step cross-correlation maps (CCMs) were applied to determine the maximal correlations between mosquito abundance and environmental quantities. CCMs illustrate the correlation coefficients between N i , the number of captured mosquitoes at time i and an environmental quantity X, averaged over a time period starting at time i − j (time lag 1) and ending at i − k (time lag 2), with j ≥ k. Thus, a CCM at the coordinates j and k illustrates In the case of j = k (plotted in the diagonal), the correlation coefficients are equal to those of a cross-correlogram. CCMs for the selected environmental quantities daytime length, temperature, precipitation, relative humidity and wind speed are depicted in Figure 3. Spearman's rank order correlation was applied because mosquito capture rates as well as some environmental quantities, especially daytime length and precipitation, are non-Gaussian distributed. The maximum time lag was set to 120 days and 20 weeks, respectively, to make sure that the maximum correlations were found. The correlation coefficient gained by correlating the environmental quantities at the capture event (r S (0, 0); i.e., no lags were applied) with the mosquito time series (using Spearman's rank correlation) is a result not only by the effect of those quantities on mosquito abundance, but also from the effects on their trapping probability. Thus, as the CCMs were used to describe the relation between the environmental quantities and the abundance of Cx. pipiens/restuans, the environmental quantities at the time of the capture were excluded. However, for the predictive model (see below) the influence of the environmental quantities on the trapping probability is likely to be important to replicate the capture rates. For the generation of the predictive model the data set was divided into training and test data. To archive a uniform distribution of the test data over the 20 years the data were split in even and odd years. The odd years were selected by a random process to be the test data set.
The interval lagged environmental quantities daytime length (D), temperature (T), precipitation (P), humidity (H) and wind speed (W) were integrated in a Poisson regression model to predict Cx.pipiens/restuans abundance: In this model the environmental conditions at the capture event (X i ) were also included, as they could affect mosquito activity and thus the capture rate. Please note that j and k for the lags of the five parameters do not necessarily have the same values. Non-relevant terms from this full model were removed in a stepwise procedure to identify the model with the lowest AIC (Akaike's Information Criterion [26]). The resulting model (CCM) was used to predict the mosquito abundance.
However, as the effect of the environmental conditions possibly interact, inclusion of the lags obtained from the CCMs might not result in the best optimal predictive model. Therefore, the next step was to generate a predictive model with a higher fit by identifying the best combinations of lags for the environmental quantities. Again, parameter combinations with less parameters than the full model were considered as well, and model fit was used to evaluate by the AIC. As it was not possible to test all different lag combinations, a genetic algorithm was used to identify the model with the highest fit. Genetic algorithms are heuristic techniques to solve optimization problems by emulating the processes of natural selection [27]. Here, the beginning and end of each time lag from the environmental quantities was subject to this process as well as the decision to include a certain parameter. The generated combination of time lags was implemented in the regression model. The AIC was chosen as the optimization criterium. For the daily data 5000 iterations were conducted, while 2000 were used for the weekly data. For both data sets the population size was set to 200 and the genetic algorithm was run 10 times, showing that all of those runs converged to the same results. The final model from the genetic algorithm with the optimal combinations of lags is called OPT (Table 1).
To test whether those regression models can be used to predict Cx. pipiens/restuans population dynamics we calculated Spearman's rank correlation coefficient r S (as the data were non-Gaussian distributed) and the root mean square error RMSE between the model predictions and the observations. All statistical analyses were conducted with the freely available statistical computing environment R [28]. The package genalg [29] was used to conduct the genetic algorithm.

Cross-correlation maps
The Cx. pipiens/restuans capture rates were positively correlated with daytime length, temperature and precipitation ( Figure 3). Humidity and wind speed were negatively correlated with mosquito abundance. The highest correlation was found for daytime length averaged from the fifth and fourth week prior the capture event with r S (5, 4) = 0.898 (daily r S (39, 28) = 0.873). Mosquito abundances were highly correlated with the mean temperature of the 2 weeks prior a capture event resulting in r S (2, 1) = 0.870. The highest daily correlation was estimated for the average temperature calculated from the last 18 days prior the capture day with r S (18, 1) = 0.853. Of the tested environmental quantities precipitation showed the weakest association with mosquito abundance. The highest correlation for this quantity was found for precipitation averaged over the 10 weeks prior to capture with r S (10, 1) = 0.487 (daily data: r S (76, 1) = 0.474). Interestingly, although precipitation was positively correlated with mosquito capture rates, humidity showed to be negatively correlated with mosquito capture rates. The highest effect was found for the humidity averaged over the time period of 15 to 2 weeks prior capture resulting in r S (15, 2) = −0.561 (daily data: r S (106, 23) = −0.562). A maximum negative correlation of r S (3, 1) = −0.621 (daily r S (23, 1) = −0.647) was found for wind speed averaged over 3 weeks (23 days) prior to capture.

Regression analysis
The CCM model calculated for the daily data set included all possible environmental quantities, with the exception of lagged wind speed ( Table 1). The CCM model for the weekly data was a much more reduced model, containing only the temperature during the capture week and the lagged quantities for daytime length, temperature, precipitation and humidity. To test whether the information gained from the CCMs could be used to predict the average seasonal cycle of female Cx. pipiens/restuans, an average mosquito year for both weekly and daily data, respectively, was compiled. The model predictions reproduced the average seasonal cycle of the mosquito abundances fairly accurately (Figure 4). The correlation of the observed mean seasonal cycle with the CCM model Summary of the regression model after the parameter optimization of the lags (OPT) from the daily and weekly data (trainings data set). For each quantity the regression coefficient β and the standard error of the regression coefficient SE, the test statistic (z value) and the p-value is given. The used environmental quantities were D -daytime length, T -temperature, Pprecipitation, H -relative humidity, and W -wind speed. The subscript 0,0 represents the environmental quantity at the time of capture. Two subscripts indicate the beginning and the end of the lagged time period.
was r S = 0.899 for the daily and r S = 0.917 for the weekly data set. Similar results were obtained for the entire 20 years time series (Figures 5 and 6). The OPT model obtained by the genetic algorithm included similar environmental quantities (Table 1).
Temperature and precipitation time lags were within the frame obtained by the CCMs, but span a shorter time period. The number of captured mosquitoes increased with temperature at the capture day, but there was also a negative effect with temperature shortly before the capture (Table 2). Thus, a higher temperature at the time of capture compared to the temperature at the lagged time interval results in higher capture rates than the other way around. Daytime length at the time of the capture and 13 weeks prior to the capture event were positively associated with the Cx. pipiens/restuns abundance, causing a shift in the seasonal cycle between the daytime length and the mosquito abundance of about 4 weeks. The lag for wind speed included in the OPT model was from rather far back in time. It comprises a time frame of about 3 to 4 months before the capture event. The time span in which humidity influenced the capture rates was shifted closer to the capture event and also enfolds the much narrower time span of approximately the month preceding the capture. Using the OPT model estimates obtained by fitting the model from to the training data set (Table 2), the mean seasonal cycle of the mosquito abundance as well as the mosquito dynamics of the last 20 years were predicted. The correlations of the observed versus predicted mean seasonal abundance resulted in r S = 0.906 for the daily and r S = 0.922 for the weekly data set (Figure 4). They were also well adapted to predict the course of the mosquito population over the last 20 years (Figures 5 and 6).
Both, the CCM and the OPT model, were able to predict the beginning and the end of the seasonal cycle as well as the inter-annual differences in the amplitude correctly. However, the models produced by the genetic algorithm are better adapted to predict mosquito abundances during mid-summer as they could reproduce the bimodal seasonal peak abundance of some of the years. The advantage of using the optimized time lags obtained by the genetic algorithm compared to implementing the results from the CCMs directly into a regression model is shown by the higher correlation coefficient and lower RMSE (both for the training as well as the test data sets) and by a significantly improved AIC (Table 1).

Discussion
Temperature dependent life expectancy and development rates make it difficult to distinctly assign the time lags influencing mosquito capture rates to the different developmental stages. Under summer field conditions adult Culex spp. have a mean lifespan of about 1 week [30,31]. Weather conditions within this period therefore have their main impact on adult mosquitoes. The duration of the aquatic stages also varies with temperature, and weather conditions from about 8 to 20 days prior to capture presumably affect the aquatic stage [7]; weather conditions from about 20 to 28 days affect the adult stage of the parent generation. Previous studies with CCMs considered a time period of about 4 weeks, which approximately represents the lifetime of a mosquito, inclusive all aquatic stages [5,18,19]. The results of this study however show that environmental factors have to be considered further back in time. As this exceeds the mean mosquito lifespan, it is likely that weather conditions occurring far back in time do not affect the current population directly, but affect previous generations. Due to exponential growth rates even small effects of weather conditions on a mosquito population could therefore result in vast effects in future generations.
Daytime length may be the most important factor generating the seasonal pattern of mosquito abundance as it regulates -together with temperature -the incidence of diapause. The conditions occurring during pupal development have been shown to determine whether the adult female undergoes diapause [11,32]. The results of the regression models revealed that maximum mosquito abundance was reached 4 weeks after the longest daytime length. This is also reflected by the results of the CCMs. The shift of 4 weeks between the photoperiod and the mosquito abundance peak may indicate that effects on the pupal stage of the parent generation may be more influential than effects on the current generation.
The time with the strongest effect of ambient temperature was at the time of the capture and the time shortly before this event, indicating that mostly adult mosquitoes were affected. Mosquito capture rates increased with increasing temperature, as already shown in previous studies [5,6]. Furthermore, it has been demonstrated that considerable changes in the temperature at the capture event compared to the previous days affect the number of captured mosquitoes. Chuang et al. [13] found a similar temperature effect for Cx. tarsalis and Ae. vexans as the authors described a positive influence of temperature at the week of the capture and a lesser negative effect of temperature with a 2 week lag. Thus it seems that female Cx. pipiens/restuans are strongly influenced by temperature changes. This might indicate that they delay flight activities (e.g. host searching) until more favorable temperature conditions occur and considerably decrease their activities at a sudden temperature drops. Sudden temperature changes could not only affect their activity, but could also possibly influence their survival rates.
Capture rates were influenced by rainfall accumulated over long time periods exceeding the typical mosquito life span. This indicates that the amount of precipitation during the previous generation had a stronger effect on the capture rates than the rain falling during the lifespan of the captured mosquitoes. As many of the potential breeding sites, such as shallow temporal ponds, only exist after a certain amount of rainfall, the increased number of breeding sites after rainfall had a positive effect on the number of captured mosquitoes weeks later. Those pools need to be sustained by rainfall for several weeks to ensure the survival of the aquatic stages. At our study site, the suburban area of Chicago, catch basins represent an important breeding site for mosquitoes [14,33]. In temporary as well as in stagnant water bodies rainfall increases the water volume. This causes a decreasing larval density, which results in increased development rates and decreased mortality rates [34,35]. Previous studies have shown that high amounts of rainfall decrease the number of mosquito larvae in catch basins dramatically [14,36]. Interestingly, this negative effect of precipitation on the larvae was not visible in our study where the effects of previous rainfall on the number of adult mosquitoes was investigated. It is possible that this negative effect on larvae, which is noticeable for about 4 days [36], was overlain by the subsequent longer lasting positive effects mentioned above.
In contradiction to the results from the CCMs, a high relative humidity in the month prior the capture event had a positive effect on mosquito capture rates. This shows that interrelations between environmental quantities may shroud the effect of one quantity on the mosquito capture rates when it is considered without others (as it was done in the CCMs). The regression analysis on the contrary allows to control for the effect of the other quantities included in the model. This positive effect of relative humidity was also found for several Culicidae species, showing that relative humidity influences mosquito activity patterns and the dynamics of oviposition [15,37].
Experiments by Hoffmann [38] indicate that a high wind speed does not reduce flight activity in mosquitoes, but rather impairs the mosquito orientation by deluding attracting stimuli like CO 2 and thus reducing capture rates. The results of this study indicate that the effects of wind on the parent generation may have a more important effect on the capture rates than on the current generation. The negative effect of elevated wind speed in the several weeks prior to the capture event may be caused by a lower chance for a blood meal during this time period.
Mosquito larvae control strategies conducted by the mosquito control agency in the study area were not accounted for in this study. Those strategies have been changed over the course of 20 years in the methods used as well as in efficiency. Aberrations of our model results from the observation data could thus be caused by changes in the used mosquito larvae control strategies. For further applications of the presented models one has to keep in mind that they represent a "managed" Cx. pipiens/restuans population. This is on the one hand a benefit, as in many inhabited areas (in the U.S.A.) there is some kind of mosquito control, especially in endemic areas of mosquito borne diseases like WNV. On the other hand, one has to be careful when adopting this model for other regions.
Cross-correlation maps have been proven to be useful tools in investigating time lagged associations between vector abundance and environmental factors [5,[18][19][20]. However, as pointed out by Cohnstaedt [39], there are two major disadvantages of CCMs. First, they do not consider that fact that adults at one time period are a function of the number of adults from the previous generation. And second, that lagged weather variables by a fixed time period ignores the temperature dependence of developmental rates. The first argument concurs with the results of this study. By extending in this study the maximum time lag we were able to reveal that environmental effects on previous generations are likely more important factors describing the number of captured mosquitoes than the effect of those variables on the current generation. With our analyses we are not able to counter the second argument, we can just be careful when interpreting the results found regarding their association to different developmental stages.

Conclusions
The final question is, whether the information gained from CCMs could be used to predict Cx. pipiens/ restuans population dynamics. The applicability of the models for daily and weekly predictions depends of course on ones expectations. On both time scales the models resulted in a good predictability of the seasonal cycle. Interannual differences in the mosquito abundance could in large part be reproduced. Especially the optimized model for the weekly data allowed to predict the mosquito abundances to a high degree, and could be of practical use, e.g. planning of mosquito control strategies and simulation of mosquito borne diseases.