A global set of Fourier-transformed remotely sensed covariates for the description of abiotic niche in epidemiological studies of tick vector species

Background Correlative modelling combines observations of species occurrence with environmental variables to capture the niche of organisms. Several authors have argued for the use of predictors that are ecologically relevant to the target species, instead of the automatic selection of variables. Without such biological background, the forced inclusion of numerous variables can produce models that are highly inflated and biologically irrelevant. The tendency in correlative modelling is to use environmental variables that are interpolated from climate stations, or monthly estimates of remotely sensed features.
Methods We produced a global dataset of abiotic variables based on the transformation by harmonic regression (time series Fourier transform) of monthly data derived from the MODIS series of satellites at a nominal resolution of 0.1°. The dataset includes variables, such as day and night temperature or vegetation and water availability, which could potentially affect physiological processes and are therefore surrogates for tracking the abiotic niche. We tested the capacity of the dataset to describe the abiotic niche of parasitic organisms by applying it to discriminate five species of the globally distributed tick subgenus Boophilus, using more than 9,500 published records.
Results With an average reliability of 82%, the Fourier-transformed dataset outperformed the raw MODIS-derived monthly data for temperature and vegetation stress (62% reliability) and other popular interpolated climate datasets, which had variable reliability (56%–65%). The transformed abiotic variables always had a collinearity of less than 3 (as measured by the variance inflation factor), in contrast with interpolated datasets, which had values as high as 300.
Conclusions The new dataset of transformed covariates could address the tracking of abiotic niches without the inflation of models that arises from internal issues with the descriptive variables, which appear when the variance inflation factor is higher than 10. The coefficients of the harmonic regressions can also be used to reconstruct the complete original time series, making them an adequate complement for ecological, epidemiological, or phylogenetic studies. We provide the dataset as a free download under the GNU general public license, as well as the scripts necessary to integrate other time series of data into the calculations of the harmonic coefficients.


Background
Various methods of species distribution modelling have been applied to arthropods of medical importance to understand the factors limiting their distributions [1][2][3][4]. These quantitative tools combine observations of species occurrence with environmental features (variously called "descriptive variables", "environmental variables", or "abiotic covariates") to capture the niche of the target species and then project a prediction on a geographic range. This approach is called correlative modelling [5,6]. Such projection is generally a map illustrating the similarity of the abiotic covariates in relation to the data used to train the model. Commonly, only the abiotic component of the niche (e.g., temperature, water vapour) is used to infer the niche of the target species, although for some species, it is necessary to include an explicit description of biotic factors, like the availability of hosts, which are necessary as a blood source. These abiotic covariates are thus used to gain information about which variables may affect the fitness of the species. Because information on abiotic variables can be produced on a timely basis, correlative modelling is a useful tool for resource managers, policy makers, and scientists.
A number of modellers have argued strongly for the use of predictors that are ecologically relevant to the target species, describing the biological and ecological constraints of the species in the spatial range to be modelled [4][7][8][9][10]. However, the rule seems to be the automatic selection of variables by the modelling algorithms, relying on the statistical values of model performance [11] rather than weighting them by ecological relevance. Without such biological background, the forced inclusion of numerous variables can produce models with highly reliable matching distributions that are statistically rather than biologically relevant. The tendency in correlative modelling is to use abiotic covariates that are interpolated from climate stations [12]. These datasets describe either the monthly values of a variable (e.g., mean temperature in March) or the relationships among the variables (e.g., rainfall in the warmest quarter). The overall usefulness of these datasets for global climate studies is not in question, but they may be affected by internal issues like collinearity [13,14] that influence the reliability of the resulting spatial projection. Collinearity refers to the non-independence of predictor variables, usually in a regression-type analysis. It is a common feature of any descriptive ecological dataset and can be a problem for parameter estimation because it inflates the variance of regression parameters and hence potentially leads to the wrong identification of predictors as relevant in a statistical model [14].
Tackling the complex challenges of decision-making about human and animal health requires the development of a monitoring and assessment system of the climate covering the Earth's dimensions. Such a system must be coherent, reliable, and ready for updating as new data are incorporated into the stream of observations. Ideally, it would supply indicators that account for climate changes and trends and for how they might affect the physiological processes of the organisms to be modelled. Remotely sensed products of Earth's processes are dynamic predictors suitable for capturing the niche preferences of some medically important arthropods [15]. Because of continuous temporal sampling, remotely sensed data provide a synoptic representation of the climate at the required spatial and temporal scales. However, the potential of such harmonised datasets to capture the abiotic niche of organisms has not yet been fully explored [16,17]. It has been suggested that weather patterns are better surrogates for the niche preferences of an organism than are the averaged and extreme values of some variables [18]. Incorporating such phenological descriptors of the abiotic niche would improve estimations of the abiotic preferences of the target organism. Studies have focused on the transformation of the time series of remotely sensed covariates via principal component analysis (PCA) or Fourier transformation [16][17][18]. These modifications of the time series of covariates retain the variability of the original dataset while removing the collinearity. This paper describes a dataset of remotely sensed covariates based on the transformation by harmonic regression (time series Fourier transform) of monthly data derived from the MODIS series of satellites. Such a dataset is internally coherent, has a small number of layers to reduce the inflation of the derived models, and includes information about day and night temperature, vegetation, and water availability.
This paper shows how the dataset was produced and provides the scripts necessary for further calculations. We also explicitly explored the performance of the dataset describing the abiotic niche of several species of ticks [19] and compared it with the results using other popular datasets of climate features. We provide the transformed dataset for free download under the GNU general public license serving the purpose of making specific data available to ecologists and epidemiologists.

A primer on harmonic regression
Harmonic regression is a mathematical technique used to decompose a complex signal into a series of individual sine and cosine waves, each characterised by a specific amplitude and phase angle. In the process, a series of coefficients describes the cyclical variation of the series, including its seasonal behaviour. A variable number of components can be extracted, but in general only a few terms are necessary to describe the annual, semi-annual, and smaller components of the seasonal variance. In summary, the harmonic regression produces an equation with coefficients that fit the seasonal behaviour of each pixel of a series of images. When the term for time is incorporated, the coefficients reconstruct the value of the environmental variable at that time. Most important, these coefficients can be used to describe the amplitude, peak timing, seasonal peaks, seasonal threshold, and many other features of a time series [20]. Thus, harmonic regression describes the pattern of the temporal variable to be measured, from which other phenological data can be obtained. It is potentially applicable to capturing the abiotic niche of an organism because the coefficients that result from the harmonic regression describe both the pattern (seasonal components) and the ranges of climate variables between defined time intervals. The general form of the harmonic regression used in this study is expressed in terms of the following quantities: Y is the value of the variable at a moment of the year, α_0 is the offset, a_i is the coefficient of the ith oscillation, L is the fundamental frequency, and x is the time-dependent variable. The coefficients of the harmonic regression are referred to here as "environmental covariates" because they explicitly represent the environmental niche that an organism may occupy.
The final form of the regression equation is

$$Y = A + B\sin(2\pi t) + C\cos(2\pi t) + D\sin(4\pi t) + E\cos(4\pi t) + F\sin(6\pi t) + G\cos(6\pi t)$$

where A, B, C, D, E, F, and G are the seven coefficients chosen to represent the complete time series, and t is the time of the year. Y represents the reconstructed value of a variable for the time t. Figure 1 displays the potential of the method to describe complex series of data. The first coefficient in the regression is the mean of the regressed variable. Each further pair of coefficients contributes to explain the complete series by determining the amplitude and the phase of periods of time that are half the length of the preceding period, e.g., twelve, six, three months, etc. Hypothetical examples in Figure 1 show how different phenological patterns are easily created, explaining the full potential of the method. Figure 1D displays real monthly values of temperature, randomly selected from two sites in the northern and southern hemispheres, compared with the weekly reconstruction of these actual series using the equation and the coefficients in Figure 1E, where "t" is the time of the year. The error of the fitted equations to the actual data is less than 1%, as measured by the residuals.
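As an illustration of the seven-coefficient equation, the following Python sketch fits A to G to twelve monthly values and reconstructs the series at an arbitrary time. It is not the authors' script (which is written in R and provided as Additional file 1). The function names `fit_harmonics` and `reconstruct` are hypothetical, t is assumed to be expressed as a fraction of the year, and the fit assumes equally spaced samples covering exactly one year, in which case ordinary least squares reduces to simple projections onto each sine and cosine term.

```python
import math


def fit_harmonics(values, n_harmonics=3):
    """Return [A, B, C, D, E, F, G] for equally spaced samples over one year.

    Assumes sample i is taken at t = i / n (fraction of the year). Under that
    assumption the sine and cosine terms are orthogonal, so each least-squares
    coefficient is a plain projection of the data onto its term.
    """
    n = len(values)
    if n <= 2 * n_harmonics:
        raise ValueError("need more than 2 * n_harmonics samples")
    times = [i / n for i in range(n)]
    coeffs = [sum(values) / n]  # A: the mean of the series
    for k in range(1, n_harmonics + 1):
        angle = [2.0 * math.pi * k * t for t in times]
        sin_coef = 2.0 / n * sum(y * math.sin(a) for y, a in zip(values, angle))
        cos_coef = 2.0 / n * sum(y * math.cos(a) for y, a in zip(values, angle))
        coeffs += [sin_coef, cos_coef]
    return coeffs


def reconstruct(coeffs, t):
    """Evaluate Y at time t (fraction of the year) from the coefficients."""
    y = coeffs[0]
    n_harmonics = (len(coeffs) - 1) // 2
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t
        y += coeffs[2 * k - 1] * math.sin(angle) + coeffs[2 * k] * math.cos(angle)
    return y


# Hypothetical monthly temperatures (illustrative values, not data from the paper).
monthly = [2.0, 3.5, 7.0, 11.0, 15.5, 19.0, 21.5, 21.0, 17.0, 12.0, 6.5, 3.0]
coefficients = fit_harmonics(monthly)
weekly = [reconstruct(coefficients, week / 52) for week in range(52)]
```

For irregularly sampled series or series with gaps, the projection shortcut no longer holds and a general least-squares solver would be needed instead.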
The interest of harmonic regression is that a few coefficients are able to reconstruct even daily values of the target variable (weekly in the example of Figure 1D). We claim that these coefficients retain the ecological meaning of the variable because, after reconstruction of the time series, standard features (such as "length of the summer", "peak of humidity in spring" or "number of days below 0°C") are still available using simple algebra [20]. The reduction of the time series by other methods, like principal components, destroys such seasonal components [21]. In correlative modelling, harmonic regression defines the abiotic niche with a few variables, therefore improving the reliability of the models because internally correlated variables, like time series, are not included [21].
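The "simple algebra" mentioned above can be sketched as follows. The example assumes the seven coefficients are ordered A to G as in the regression equation and that t is a fraction of the year. The function names are hypothetical, and the threshold count is only one of many features that could be derived. The amplitude and peak time of each harmonic follow from the identity B sin(θ) + C cos(θ) = R cos(θ − φ), with R = sqrt(B² + C²) and φ = atan2(B, C).

```python
import math


def reconstruct(coeffs, t):
    """Evaluate the harmonic regression at time t (fraction of the year)."""
    y = coeffs[0]
    for k in range(1, (len(coeffs) - 1) // 2 + 1):
        angle = 2.0 * math.pi * k * t
        y += coeffs[2 * k - 1] * math.sin(angle) + coeffs[2 * k] * math.cos(angle)
    return y


def harmonic_amplitude_phase(coeffs, k):
    """Return (amplitude, time of first peak) of the k-th harmonic."""
    sin_coef, cos_coef = coeffs[2 * k - 1], coeffs[2 * k]
    amplitude = math.hypot(sin_coef, cos_coef)
    # The k-th harmonic peaks where 2*pi*k*t equals the phase angle.
    peak_time = (math.atan2(sin_coef, cos_coef) / (2.0 * math.pi * k)) % (1.0 / k)
    return amplitude, peak_time


def days_below(coeffs, threshold, n_days=365):
    """Count the days of a reconstructed year with a value below threshold."""
    return sum(1 for d in range(n_days) if reconstruct(coeffs, d / n_days) < threshold)


# Hypothetical coefficients: mean 10, annual cycle peaking at mid-year.
example = [10.0, 0.0, -10.0, 0.0, 0.0, 0.0, 0.0]
annual_amplitude, annual_peak = harmonic_amplitude_phase(example, 1)
cold_days = days_below(example, 5.0)
```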

The series of data
All the data were obtained from the NASA Earth Observations (NEO) web server (http://neo.sci.gsfc.nasa.gov/about/). The mission of NEO is to provide an interface to browse and download satellite data from NASA's constellation of Earth Observing System satellites. Over 50 different global datasets are represented with daily, weekly, and monthly snapshots. NEO is part of the EOS Project Science Office located at the NASA Goddard Space Flight Center.
Four series of data were targeted because of their potential to describe the abiotic niche of parasitic organisms: the Land Surface Temperature, either at day or night (LSTD, LSTN); the Normalised Difference Vegetation Index (NDVI); and the Leaf Area Index (LAI). The first expresses the temperature at the ground surface with a precision of one decimal. We worked out both LSTD and LSTN because the phenological curve of these datasets can address calculations of the total accumulated temperature over a given threshold, which is important in the detection of habitat. The NDVI is a measure of the photosynthetic activity of plants. Its value has been proven in the field of large-scale monitoring of vegetation cover, and it has been extensively used as a descriptive variable of the habitat for medically important arthropods [22,23]. NDVI thus represents an adequate source of data to cope with the water component of the arthropod life cycle, assessing temporal aspects of vegetation development and quality [23,24]. However, the relationship between NDVI and vegetation can be biased in low-vegetated areas, unless the soil background is taken into account [25]. The LAI defines an important structural property of a plant canopy, the number of equivalent layers of leaf vegetation relative to a unit of ground area [26]. This feature is important for the abiotic niche of an organism because it measures how the ground is protected against the sun and its evaporative capacities.
The four series of covariates (LSTD, LSTN, NDVI, and LAI) were obtained from the NEO website at a resolution of 0.1°, from October 2000 to December 2012 at 8-day intervals. The available sets of images have been already processed by the MODIS team, with improved cloud masking and adequate atmospheric correction and satellite orbital drift correction applied. Such processing is extremely important because the raw data are free of pixels contaminated by clouds or ice, which avoids interpretation errors. We prepared one month composites from the 8-day images, using the method of the maximum pixel value, to obtain the largest area without gaps in pixels. Data were filtered using a Savitzky-Golay smoothing filter [27]. One of the problems with applying remotely sensed imagery to the detection of abiotic niche is the existence of gaps at regions near the poles because of the long-lasting accumulation of snow, ice, or clouds. The effects are larger in the northern hemisphere because of the proximity of inhabited lands to the North Pole. The detection of these gaps and filling them with estimated values may be unreliable if the number of consecutive gaps is too long [28]. Some regions in the far North were not included in the final set of images because they were covered by snow, clouds, or ice for periods longer than 4 months.
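The maximum-pixel-value compositing step can be sketched as below. This is a minimal illustration, not the authors' processing chain: each 8-day image is assumed to be a small grid held as a list of rows, `None` is an assumed marker for a missing (cloud, snow, or ice) pixel, and the function name is hypothetical. The Savitzky-Golay smoothing and the exclusion of long gaps are not covered here.

```python
def max_value_composite(images):
    """Combine several grids into one, keeping the maximum valid value per pixel.

    A pixel stays None in the composite only if it is missing in every image,
    which is why this method tends to minimise gaps.
    """
    images = list(images)
    n_rows, n_cols = len(images[0]), len(images[0][0])
    composite = [[None] * n_cols for _ in range(n_rows)]
    for grid in images:
        for r in range(n_rows):
            for c in range(n_cols):
                value = grid[r][c]
                if value is None:
                    continue
                if composite[r][c] is None or value > composite[r][c]:
                    composite[r][c] = value
    return composite


# Two hypothetical 8-day images of a 2 x 2 area falling in the same month.
first = [[0.31, None], [0.42, 0.18]]
second = [[0.35, None], [None, 0.12]]
monthly_composite = max_value_composite([first, second])
```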
Monthly values of each variable were subjected to harmonic regression. We performed the harmonic regressions in the R development framework [29] together with the packages "raster" [30] and "TSA" [31]. Seven coefficients for each variable were extracted from the annual time series. A script is provided as Additional file 1, illustrating the production of the coefficients of the harmonic regression. The coefficients representing the yearly, 6-month, and 3-month signals were selected from the harmonic regressions. Thus, seven layers of coefficients of each variable could reconstruct the complete original time series and constitute the environmental covariates proposed in this paper to describe the abiotic niche of organisms.
A RGB composition of the four sets of harmonic coefficients is included in Additional file 2: Figure S1.

Comparison of performance of the environmental variables
We aimed to demonstrate that (i) the coefficients of the harmonic regression have a significantly smaller collinearity than the original MODIS-derived time series and other popular climate datasets commonly used in correlative modelling, and (ii) that the performance of the harmonic coefficients in describing the abiotic niche of parasitic organisms is better than other products commonly used for this purpose. Collinearity is a statistical phenomenon of a dataset of spatial covariates [14]. Two or more variables in a multiple regression model may be highly correlated and then inflate the reliability of the model. In our application, the typical situation involves the use of time series of covariates that are strongly correlated (e.g., the temperature in one month is expected to be very similar to the values of the following month). A special situation exists when covariates are grid interpolations of climate point records.
In this case, the problems are magnified because the interpolation algorithms use a set of discrete, irregularly spaced sites (the meteorological stations) and the temporal series of covariates will exhibit a high collinearity. We assessed collinearity of the covariates with the variance inflation factor (VIF), which is a measure of correlation between pairs of variables [32]. Values of VIF > 10 denote a potentially problematic collinearity within the set of covariates, indicating that these covariates should be removed from model development [33]. A VIF = 1 indicates that the variables are orthogonal. VIF was calculated with the package "fmsb" [34] for R on the monthly values of LSTD, LSTN, NDVI, and LAI, as well as the derived harmonic coefficients. To compare with other popular products used in the inference of the abiotic niche, we computed the VIF of the monthly values of temperature and rainfall of Worldclim (www.worldclim.org) and the so-called "bioclimate variables" from the same source, which are calculated ratios among some significant variables [35] at the same spatial resolution as the remotely sensed data.
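For a pair of covariates, the VIF can be sketched as 1 / (1 − r²), where r is the Pearson correlation between the two variables. This matches the general definition 1 / (1 − R²) when one covariate is regressed on a single other covariate. The sketch below is an assumption about how a pairwise value could be computed, not a reproduction of the "fmsb" implementation, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def pairwise_vif(x, y):
    """Variance inflation factor for a pair of covariates: 1 / (1 - r^2)."""
    r_squared = pearson_r(x, y) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear pair
    return 1.0 / (1.0 - r_squared)


# Hypothetical pixel values of two consecutive monthly temperature layers.
march = [4.1, 9.8, 15.2, 21.0, 26.3]
april = [6.0, 11.5, 16.9, 22.4, 27.1]
vif_march_april = pairwise_vif(march, april)
problematic = vif_march_april > 10  # threshold used in the text
```

A value of 1 indicates orthogonal variables, and consecutive months of a smooth seasonal variable would be expected to exceed the threshold of 10 by a wide margin.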
The performance of the models built with these abiotic covariates was tested on a dataset of the reported world distribution of ticks of the subgenus Boophilus. This database of tick distribution has a global extent and is therefore appropriate for an explicit test of the environmental covariates. These ticks have a recent history of introduction by the trade movements of livestock [19], and some species are sympatric and thus may have similar preferences for defined portions of the abiotic niche [36]. Thus, the reported world distribution of boophilid ticks is a demanding statistical problem of discrimination among species because some of them may share a portion of the available ecological niche. We used the known distribution data for Rhipicephalus (B.) annulatus, R. australis, R. decoloratus, R. geigyi, and R. microplus, which consist of 9,534 records for the five species. Few details are known about the distribution of R. kohlsi, and it was removed from further calculations. Details of the compilation of the original dataset have been provided [36], but the dataset has been updated with new records from Africa and South America published after the date of the original compilation. Figure 2 shows the spatial distribution of the world records of the five species.
We wanted to discriminate among the five species of ticks as a proof of concept, using different datasets. This application is intended to allow inferences regarding the abiotic conditions behind an observed distribution of an organism, not to project such inferences onto the spatial domain but to correctly classify the set of records. The best set of abiotic covariates will produce the best description of the abiotic niche of these species of ticks, thus allowing the best discrimination among species. We built a discriminant analysis with the records of the five species of ticks and the different datasets of environmental covariates.
Figure 2 The reported distribution of 9,534 records of ticks of the subgenus Boophilus. Only records with a pair of coordinates were included in the map and considered for further computations. Records from Asia lack such reliable georeferencing and were not included.
Details of the discriminant analysis approach to distribution models or epidemiological issues have been addressed elsewhere [37,38]. We used a standard (linear) approach to the discriminant analysis, which uses a common (within-) covariance matrix for all groups. We used stepwise variable selection to control which variables are included in the analysis. We used the discriminant scores, the distance to the mean of that classification, and the associated probability to assign the classification of each record of ticks included in this study. The performance of such models is traditionally assessed by calculating the area under the curve (AUC) of the receiver operating characteristic [39], a plot of the sensitivity (the proportion of correctly predicted known presences, also known as absence of omission error) vs. 1 − specificity (the proportion of incorrectly predicted known absences, or the commission error) over the whole range of threshold values between 0 and 1. The model AUC thus calculated is compared to the null model, which is an entirely random predictive model with AUC = 0.5, and models with an AUC above 0.75 are normally considered useful [40]. Using this method, the commission and omission errors are therefore weighted with equal importance for determining the performance of the model. In addition to the calculation of AUC, we explicitly evaluated the percentage of correctly determined records of ticks, using the different sets of abiotic covariates.
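The two performance measures can be sketched as follows. The AUC is computed here through its equivalence with the probability that a randomly chosen presence receives a higher score than a randomly chosen absence (ties count one half), which gives the same value as integrating the ROC curve over all thresholds. The scores and labels are hypothetical, the function names are illustrative, and the discriminant analysis itself is not reproduced.

```python
def auc(presence_scores, absence_scores):
    """Area under the ROC curve from pairwise comparisons of scores."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def percent_correct(true_labels, predicted_labels):
    """Percentage of records assigned to the correct species."""
    hits = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * hits / len(true_labels)


# Hypothetical model scores for records of one species (presences) and of
# the remaining species (treated as absences).
model_auc = auc([0.9, 0.4], [0.6, 0.1])
useful = model_auc >= 0.75  # rule of thumb quoted in the text

# Hypothetical classification of four records.
accuracy = percent_correct(
    ["R. microplus", "R. annulatus", "R. geigyi", "R. microplus"],
    ["R. microplus", "R. geigyi", "R. geigyi", "R. microplus"],
)
```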
To capture the abiotic niche and thus discriminate the five species of ticks, we used (i) the coefficients of the harmonic regression of LSTD and NDVI; (ii) the same set of (i) plus the coefficients of the harmonic regression of LAI; (iii) remotely sensed monthly averages of LSTD and NDVI; (iv) the same set in (iii) after removal of the pairs of covariates with VIF > 10; (v) monthly averages of temperature and rainfall obtained from Worldclim; (vi) bioclimate variables from the Worldclim dataset; and (vii and viii) monthly Worldclim values and bioclimate variables after removal of the covariates with VIF > 10, respectively. No attempts were made to include LSTN in these efforts because it parallels the phenology of LSTD. We are aware that NDVI is not highly correlated with rainfall, but it is commonly used as a surrogate of drought conditions [41], and its performance can therefore be compared with rainfall estimates.

Results
Table 1 includes the collinearity values among the seven coefficients of the harmonic regressions of each series of remotely sensed covariates over the complete Earth's surface. The calculation of collinearity between LSTD and LSTN was omitted because they express the same variable either at day or night and are obviously highly correlated. The collinearity among the harmonic environmental variables was lower than 3 for every possible combination, an indication that all of these covariates could be used together to train models without inflation of the resulting inference. However, the monthly series of remotely sensed covariates had values of VIF higher than 200 (Tables 2, 3 and 4), and the maximum statistically allowable is around 10. The transformation of the monthly series of remotely sensed covariates removes the collinearity while retaining its complete ecological meaning. Tables 5 and 6 show the VIF values for the monthly series of interpolated temperature and rainfall, respectively. A total of 45% of monthly combinations of temperature and 6% of monthly combinations of rainfall produced VIF values higher than 10.
Table 1 footnote: Collinearity was calculated as the variance inflation factor. Values lower than 10 are indicative of low collinearity and could be used together in models of the environmental niche. The number after the letters of the variables indicates the ordinal coefficient in the harmonic regression of the variable.
The "bioclim" variables were also affected by the collinearity (Table 7). Some combinations of these covariates produced high VIF values, including combinations of variables related to temperature (e.g., annual mean, mean of coldest quarter, seasonality, annual range, maximum and mean of warmest quarter, minimum and mean of driest quarter) and a few combinations of rainfall (wettest period and quarter and driest period and quarter) that are intuitively correlated. Table 8 reports the results of the discriminant analysis trained with different combinations of environmental covariates applied to the dataset of the world distribution of the ticks of the subgenus Boophilus. The table includes data on both the percentage of records correctly identified by each model and the AUC values, a measure of general reliability. All the models performed variably, but the best overall performance was obtained for the Fourier-derived covariates including seven coefficients of LSTD and NDVI and the first five coefficients of LAI, with 82.4% correct determinations. This model produced the best discrimination between R. annulatus and R. geigyi, with almost 70% of records of the former correctly determined. The performance of discriminant analysis decreased if only the

Discussion
Increased availability of species distribution and environmental datasets, combined with the development of sophisticated modelling approaches, has resulted in many recent reports evaluating the distributions of health-threatening arthropods [42][43][44][45][46]. This capture of the environmental niche represents an inference of the recorded distribution of the organism, which can then be projected into a different spatial or temporal framework. The capture of the abiotic niche comes with some methodological caveats, however: (i) it is necessary to select a set of descriptive covariates with an ecological meaning for the organism to be modelled [7]; (ii) these covariates must be free of statistical issues that could affect the process of inference [47]; (iii) they must cover the widest geographical range [48]; and (iv) they should be ideally prepared with the same resolution. It is commonly the case that points (i) and (ii) may be mutually exclusive, i.e., the ecologically relevant covariates are indeed highly correlated, therefore leaving only ecologically inappropriate covariates for environmental inference. The automatic selection of the covariates that render the best model, which has become popular in recently available modelling algorithms [49], introduces further unreliability in the modelling process. A large evaluation of how to deal with collinearity in environmental covariates [14] concluded that none of the purpose-built methods yielded much higher accuracies than those that ignore collinearity. As a rule, collinearity must be removed before the building of the models because it cannot be handled by further methods.
Footnote to the collinearity tables: Collinearity was calculated as the variance inflation factor. Values higher than 10 are indicative of high collinearity.
Footnote to Table 7: The names Bio1 to Bio19 are the names defined by the Worldclim dataset, namely annual mean temp., mean diurnal range, isothermality, temp. seasonality, max temp. of warmest month, min. temp. of coldest month, temp. annual range, mean temp. of wettest quarter, mean temp. of driest quarter, mean temp. of warmest quarter, mean temp. of coldest quarter, annual precipitation, precipitation of wettest month, precipitation of driest month, precipitation seasonality, precipitation of wettest quarter, precipitation of driest quarter, precipitation of warmest quarter, precipitation of coldest quarter.
We produced a dataset of environmental variables based on the harmonic regression of remotely sensed time series of day and night temperature, vegetation stress, and leaf area index. This dataset aims to meet the statistical rules of internal coherence when applied to the detection of the environmental niche of organisms. Our goal was to produce a homogeneous set of uncorrelated variables, retaining the complete ecological meaning and covering the complete Earth's surface. We obtained the raw data from a reliable source that ensures the best pre-processing, which makes for a consistent and homogeneous set of raw variables. The meaning and the potential of the harmonic regression to capture the phenology of the climate have already been pointed out [20]. We evaluated the performance of the harmonic regression coefficients with a dataset of world records of boophilid ticks, which is a challenging problem for such techniques because these species have a pan-Tropical and Mediterranean distribution [50]. In some cases, the trade movements of livestock introduced and spread species far away from the original ranges [51]. We demonstrated that the covariates derived from the harmonic regression better captured the abiotic niche of several species of ticks than did the monthly raw set of descriptors or interpolated gridded climate, which have been traditionally used for this purpose [52][53][54]. We are aware that the nominal spatial resolution of 0.1° may be too coarse for some applications focusing on local or regional issues, which could require a higher resolution. The choice of such resolution is a balance between complete coverage of the Earth's surface and processing requirements in terms of time and computer resources. Such resolution is similar to that of a previous set focusing on remotely sensed data from the AVHRR series of sensors [55]. However, MODIS is more attractive for epidemiological applications than AVHRR because of its better spectral and temporal resolutions [55].
One source of unreliability is the inference from inadequate sets of descriptive covariates, which in some cases may include a high collinearity [14]. We are considering collinearity in the context of a statistical model that is used to estimate the relationship between one response variable (the species in our application) and a set of descriptive covariates. Examples include regression models of all types, classification and regression trees, and neural networks. Coefficients of a regression can be estimated, but with inflated standard errors [56] that result in inaccurate tests of significance for the predictors, meaning that important predictors may not be significant, even if they are truly influential [14]. Extrapolation beyond the geographic or environmental range of sampled data is prone to serious errors because patterns of collinearity are likely to change. Obvious examples include use of statistical models to predict distributions of species in new geographic regions or changed climatic conditions, giving the impression of a well-fitted model to which tests of model reliability are "blind" [21,57,58].
Generalised sets of covariates produce an unmanageable level of uncertainty in species distribution models that cannot be ignored. The use of sound ecological theory and statistical methods to check predictor variables can reduce this uncertainty, but our knowledge of species may be too limited to make more than arbitrary choices. Data reduction methods are usually employed to remove these correlations and provide one or more transformed images without such correlation, which can then be used in further analyses or applications. One ordination approach commonly applied to multi-temporal imagery is PCA [59], but explicit measures of seasonality are lost in the ordination process. PCA thus achieves data reduction at the expense of biological descriptiveness. Alternative methods that retain information about seasonality include polynomial functions [10] and temporal Fourier analysis [17,18]. The Fourier transformation of remotely sensed variables has been proposed as a reliable approach to define the niche of organisms [18,19,60] because it retains the complete variability of the original time series as well as the ecological meaning. Temporal harmonic regression transforms a series of observations taken at intervals over a period of time into a set of (uncorrelated) sine curves, or harmonics, of different frequencies, amplitudes, and phases that collectively sum to the original time series. A high-resolution version of AVHRR data converted into Fourier derivatives, focused on the western Palearctic, was made available commercially [54], and a general algorithm to handle MODIS images and decompose them into harmonics was already available [18]. Our application is thus the first to provide a statistically suitable, internally coherent set of variables with ecological meaning, aimed at describing the abiotic niche of organisms and covering the complete Earth's surface. While this new set of environmental descriptors has been developed to delineate the associations of parasites with abiotic traits and how these traits can shape potential distributions, it would potentially benefit ecologists and epidemiologists in the capture of the abiotic niche of other organisms.
Table 8 footnote: For some of these datasets of descriptive covariates, the analysis was repeated with every variable included (e.g., the 12 months of average temperature) and after the highly correlated variables were removed. A discriminant analysis was conducted, and its reliability evaluated by the percent of records correctly predicted and the area under the curve (AUC). The AUC is a general measure of model performance and does not consider individual results of true positives for each species. Therefore, some models may perform better for a particular species while having a general low AUC. The percent of correctly determined records of each species is also included.

Conclusions
The set of environmental covariates described in this study covers the complete Earth and lacks internal issues that may inflate the models derived. It targets capturing the abiotic niche of organisms, with potential applications in a variety of fields in ecology, epidemiology, and phylogeography. The tests, applied to a worldwide collection of records of five species of ticks with overlapping spatial distributions, demonstrated that the environmental variables derived from a harmonic regression better discriminated the species, and therefore their abiotic niche. These variables outperformed the reliability of other sets of environmental covariates and did not inflate the models as a result of the collinearity of the descriptors, which was measured by the VIF. The usefulness of interpolated gridded covariates is not in question in many fields, but it must be stressed that they offer limited value for describing the abiotic niche of ticks because the application of statistical rules may force removal of ecologically relevant covariates describing such a niche. We have made the set of coefficients of the harmonic regressions available for free download and provided the scripts necessary either to reproduce the workflow or to apply the methodology to new sets of time variables.