As expected (e.g. see , but also Table 1), temperature -and its quadratic value (for LST)- influenced C. imicola population abundance. Veronesi et al. showed experimentally that temperature could influence the duration and survival of sub-adult stages in C. imicola . In our models, temperature from WS, when tested as a single predictor, explained 36% of the variability measured in the dataset investigated, which is ~10% higher than MODIS LST variance explanation. This range (from 27 to 36%) is lower than the variability in C. imicola catches explained on the basis of WS temperature and LST in previous studies, where values ranging between 34 to 40% were found . Combining temperature with a first-order autoregressive cofactor increased significantly the explanatory power of the model.
Whilst the strong effects of temperature and autoregressive cofactors were somewhat anticipated, we were surprised to find that including NDVI did not substantially improve the predictive power of the model. NDVI could indeed be believed to improve spatial predictions by highlighting areas where moisture conditions are more favourable to C. imicola larval stages [43–46]. In addition, it was previously a variable improving C. imicola distribution models . One possible explanation is the role played by artificial breeding sites in the direct vicinity of the traps. Indeed, all traps are placed in farms where a variety of artificial breeding sites can be found: mud surrounding local provision of livestock drinking water, local small streams of cleaning water, or even small-scale irrigated pastures. All these could provide C. imicola populations suitable habitats surrounding the trap, even in the absence of surrounding vegetation that could be detected at larger-scale by the remotely sensed NDVI signal. In addition, compared to other studies, NDVI may not be such a limiting factor in Sardinia where catches appear to be very abundant over the entire region. A similar approach developed at the fringe of the C. imicola distribution range may hence highlight a relatively stronger contribution of factors other than temperature in modelling C. imicola population dynamics. The fact that the model overestimated the populations in Tuscany suggests that the model would need further adjustments to account for a different range or value in the environmental conditions than those encountered in Sardinia alone.
It is obvious that the model simulations had lower predictive power than the initial statistical approach built to find estimates of cofactors (lower correlation coefficient and higher RMSE, Table 3). Indeed the reduction in predictive power can be explained by the fact that the simulations only use a constant initial population at t0 (here 13 individuals in March 2001) to feed the predictions over the entire time series whereas the predictions of the statistical model are estimated with the observed abundances measured at each previous time step. In other words, predictions from the model simulations are not made based on the observed population at the previous time step, but are based on the modelled population at the previous time step, hence the reduction in predictability. However, the comparative advantage is that the simulation demonstrated moderate to good predictability over space and time simply based on the spatio-temporal distribution of the predictors, and do not require field samplings of C. imicola to make the predictions.
Those simulations succeeded fairly well in reproducing the seasonality of the populations, the maintenance of C. imicola activity during winter, even at very low population levels, and the likely outcome of extinction at high elevation ( > 500 m). Even if the simulations were not able to fully quantify the level of the peaks of maximum abundances, they described very well the increase in population activity that occurs at the beginning of each season (Figure 3). The applied perspective of such a characteristic could be found in the development of a surveillance system that could predict seasonal vector abundances on the basis of the current temperature and could help to alert on periods of high risk of bluetongue disease transmission. The model predicted extinctions at high elevation (Additional file 2: Figure S2 and Additional file 3: Figure S3), and the maintenance of the activity of vectors in these areas would require renewed introductions. A further development of the model could account for external introduction of novel specimens. For example, it could be coupled to broad-scale wind density models such as presented in [47, 48], transport and trade networks , or to local-scale leptokurtic models, which describe the decrease in Culicoides spp. abundances as a function of the distance to the farm . Other factors, not accounted for in this study, are likely to influence C. imicola populations. One such factor may be the local density of livestock (horses, cattle, sheep and goats), which provides both hosts for blood feeding females, and breeding sites through the manure. Sardinia hosts approximately 3.9 million sheep and goats, the highest density in all of Italy. Although this number does not show strong seasonal fluctuation, grazing patterns are strongly seasonal in the most elevated part of the island, with sheep flocks free-grazing in the pastures. In contrast, most sheep are grazing in pastures in the direct vicinity of farms in the low-elevation parts of Sardinia. These factors may also influence the spatio-temporal pattern of C. imicola populations, but quantification of these effects is difficult due to a lack of high-resolution data on hosts and grazing patterns. In addition, Onderstepoort-type black-light traps catches do not accurately reflect host-seeking behaviour by biting midges in comparison to host-baited traps catches , hence limiting the use of our dataset to test for the local effect of hosts distribution on C. imicola populations abundance.
Average correlation coefficients between observations and predictions were very similar between the RS model and the WS model (Figure 2). The model with RS predictors has nevertheless the advantage that predictions can be made over all pixels without interpolation of observations such as is needed in the case of weather station data. Furthermore, interpolation tends to produce very continuous surface that do not fully reflect the local heterogeneities in temperatures (Figure 5A and B). This comparative advantage could not be quantified in our study, probably because the observation error in catches could be higher than the difference in predictions due to the differences in the type of temperature data. Another possible explanation is that local temperature at the level of the trap, influenced by local conditions (shelter and shading, local topography) could be as different from the RS data as it is from the WS data. This could only be evaluated thoroughly using local temperature measurements with data loggers. Overall, the distribution that we predicted at the seasonal peak is fairly similar to that observed by previous studies, (e.g. [13, 33]), with low-level population areas located at high altitudes.
Our modelling approach included at least three sources of uncertainties. The first one appears when an autoregressive cofactor is included and results from the interplay between process and observation error . We tried to take it into account by (1) quantifying the effect of observation error on coefficient estimates, (2) trying to quantify the influence of observation error on process error, and (3) including using a stochastic component in the modelling framework.
As expected, introducing different levels of observation error influenced both GLM coefficient estimates , and the extent of process error (Figure 4). Since we have little prior information on the magnitude of observational error, we decided to run our simulations with a fairly high level of observational error, introduced in the form of an error term sampled from a normal distribution with 0-mean and standard deviation equal to half the variance found in the whole dataset. This introduces a lot of variability in the predictions, and one way to reduce uncertainties could only be gained by experimental quantification of observation error. Other methods to include explicitly observation and process error are available (e.g. state-space models such as in  or else,  and  for implementations using the Bayesian techniques; mixed models, such as highlighted in ; hybrid models developed in ). Our approach was somewhat simpler, but allows quantification of the impact of observation error to be made under various modelling frameworks, such as for examples BRT , GLMM  or autologistic models .