Population genetic structure of the malaria vector Anopheles funestus, in a recently re-colonized area of the Senegal River basin and human-induced environmental changes

Background Anopheles funestus is one of the major malaria vectors in tropical Africa. Because of several cycles of drought events that occurred during the 1970s, this species had disappeared from many parts of sahelian Africa, including the Senegal River basin. However, this zone has been re-colonized during the last decade by An. funestus, following the implementation of two dams on the Senegal River. Previous studies in that area revealed heterogeneity at the biological and chromosomal level among these recent populations. Methods Here, we studied the genetic structure of the newly established mosquito populations using eleven microsatellite markers in four villages of the Senegal River basin and compared it to another An. funestus population located in the sudanian domain. Results Our results presume Hardy Weinberg equilibrium in each An. funestus population, suggesting a situation of panmixia. Moreover, no signal from bottleneck or population expansion was detected across populations. The tests of genetic differentiation between sites revealed a slight but significant division into three distinct genetic entities. Genetic distance between populations from the Senegal River basin and sudanian domain was correlated to geographical distance. In contrast, sub-division into the Senegal River basin was not correlated to geographic distance, rather to local adaptation. Conclusions The high genetic diversity among populations from Senegal River basin coupled with no evidence of bottleneck and with a gene flow with southern population suggests that the re-colonization was likely carried out by a massive and repeated stepping-stone dispersion starting from the neighboring areas where An. funestus endured.


Background
Anopheles funestus Giles, 1900 is one of the major malaria vectors in tropical Africa with An. gambiae sensu stricto and An. arabiensis [1], being the primary malaria vector in some areas [2][3][4][5]. In Senegal, An. funestus has been already described in all bio-geographic zones [6] and exhibits a predominant role in malaria transmission [7][8][9]. Following the recurrent droughts that have occurred during the 1970s, this species had disappeared from many parts of sahelian Africa, including the Senegal River basin, in consequence of the disappearance of its specific breeding sites [10]. However, after more than three decades of absence, the re-emergence of An. funestus was reported at the beginning of this century in the low valley of the Senegal River [11]. The hydroagricultural implementations following the start-up of the Diama dam are highly suspected to create favourable breeding places for the re-establishment of An. funestus [9,12]. This re-colonization had thus given rise to the fear of recrudescence of the transmission and incidence of malaria in this area. An entomological survey carried out thereafter in the Senegal River basin showed heterogeneity of An. funestus populations in their anthropophily, densities and parity [9]. Because this species is known to be biologically and genetically highly polymorphic [13][14][15][16][17], these observations led us to suspect a potential genetic structure within the newly established An. funestus populations. Likewise, significant chromosomal differentiation not linked to geographical distance has been reported between the An. funestus populations from the Senegal River basin and those from the sudanian domain [12].
In the present study, we studied the genetic structure of the An. funestus populations in the Senegal River basin using microsatellite DNA markers. This set of molecular markers were demonstrated to be suitable tools for population genetics studies within this mosquito [17][18][19][20]. We aimed at investigating the genetic diversity and the genetic structure of the newly established An. funestus populations in the Senegal River basin in comparison to a more southern population of this species.

Study sites and mosquito collection
The study was carried out in four villages located in the Senegal River Basin, in sahelian domain (Mbilor, Gankette Balla, Diaminar Keur Kane, Loboudou) and in the village of Dielmo, located in the sudanian domain ( Figure 1). The re-emergence of An. funestus was previously observed in the four selected villages from the Senegal River basin and biological heterogeneities were observed between sites [9]. The village of Mbilor (16°29' N, 15°33' W) is situated in the low valley of the Senegal River. A retention basin made by the Senegalese Sugarcane Company and derived from the Senegal River represents the unique source of permanent water, which represents the main breeding site of anopheline mosquitoes. The village of Gankette Balla (15°58'N, 15°55'W), Diaminar Keur Kane (16°00'N, 15°5 4'W) and Loboudou (15°57'N, 15°55'W) are situated on the shores of The Guiers Lake. The association between the shores of the Guiers Lake and the development of fresh water plants represent the main breeding sites for An. funestus, being the predominant anopheline species collected [9]. Mosquito collections were carried out between October 2003 and September 2004 in these four villages.
Mosquitoes from the Senegal River basin populations were compared to Dielmo (13°45'N, 16°25'W), located in a sudanian domain of central Senegal, on the marshy bank of a small permanent stream, where anopheline mosquitoes breed all year round. Specimens from Dielmo were collected during the rainy season in 2000, between July and October. All mosquitoes were collected by indoor pyrethrum spraying in randomly selected human dwellings. After collection, An. funestus were identified morphologically [21] and stored individually in labelled tubes with desiccant for further molecular processing.
DNA extraction, molecular species identification and microsatellite genotyping Genomic DNA was extracted from wings and legs of each individual mosquito following the protocol described by Morlais et al. [22]. DNA was then resuspended in sterile water in individual tubes. Morphological identification of An. funestus sensu stricto was confirmed by molecular methods [23,24]. Eleven microsatellite loci were selected from published An. funestus sequence data [25][26][27][28], based on their high polymorphism and no evidence for null alleles. These microsatellites were FunL, FunO, AFUB11, AFND19, AFND20, AFND32, FunQ, AFND40, AFND5, FunG and FunF.
PCR was performed for each locus individually in a 25 μl reaction volume. The reaction mixture contained 1 X PCR buffer containing 1.5 mM MgCl 2 (Promega, France), 0.2 mM of each dNTP (Eurogentec, Belgium), 10 pmol of each primer, 0.1U of Taq DNA polymerase (Promega, France) and approximately 5-10 ng of template DNA. The forward primer was labelled in 5' with either by Fam, Vic, Ned or Pet fluorescent markers to allow multiplex electrophoresis (Eurogentec, Belgium). Amplification was performed in a GeneAmp 9700 thermocycler (Applied Biosystems, France). Cycling conditions were: an initial denaturation step at 94°C for 2 min followed by 36 cycles of 30 s at 94°C, 30 s at 54°C, 30 s at 72°C and a final elongation step of 10 min at 72°C. Amplified products were diluted and pooled for genotyping in AB3130 sequencer (Applied Biosystems, France). Alleles were sized relatively to an internal size standard using GENEMAPPER v4.0 (Applied Biosystems, France).

Data analysis
For each locus and each population sample, heterozygosity was computed using GENETIX v.4.05 [29] and the number of alleles were computed using FSTAT v.2.9.3.2 [30]. FSTAT and GENEPOP v.4.0.3 [31] were used to assess the deviation from Hardy-Weinberg equilibrium at each locus, each population sample, and overall as indicated by the inbreeding coefficient (F IS ). Linkage disequilibrium between pairs of microsatellite loci was assessed using FSTAT v.2.9.3.2 [30]. Significance was tested using the randomization approach implemented in FSTAT with Bonferroni-adjusted P-values.
Genetic differentiation between populations was assessed by estimating Wright's F-statistics [32], calculated according to Weir & Cockerham [33]. Statistical significance of F ST was assessed using G-based exact tests for genotypic differentiation [34], available in GENEPOP.
Because demographic instability such as recent population bottleneck and/or expansion might bias genetic differentiation, heterozygosity tests were implemented to test for Mutation-Drift Equilibrium (MDE) using BOTTLENECK V1.2.02 [35]. At selectively neutral loci, the expected heterozygosities calculated from allele frequency data (He) and from the number of alleles and sample sizes (Heq) are expected not to be significantly different (He Heq) in a population at MDE. If a population experiences a bottleneck, rare alleles will be lost by genetic drift, and Heq will decrease faster than He (He > Heq). The resulting apparent excess of heterozygotes is an indicator of a recent bottleneck event, while the opposite (He < Heq) may suggest population expansion. Estimates of expected heterozygosity under MDE were calculated using two mutation models for microsatellite evolution: the Two Phased Mutation model (TPM) [36] and the Stepwise Mutation Model (SMM) [37]. Wilcoxon signed rank tests were used to determine whether deviations from MDE were statistically significant.
We applied a Bayesian model-based clustering algorithm to infer population structure and to assign individuals (probabilistically) to clusters without a priori knowledge of population units and limits. This approach, implemented in STRUCTURE v2.2 [38], uses individual multilocus genotype data to cluster individuals into K groups while minimising Hardy-Weinberg disequilibrium and gametic phase disequilibrium between loci within groups [39]. In STRUCTURE v 2.2, the number of distinct genetic clusters in the data set (K) was estimated from 1 to 12 by the posterior log probability of data under each K, Ln [Pr(X|K)] [38]. Each run carried out 500,000 iterations after a burn-in period of 40,000, using the admixture model and correlated allele frequencies.
To check for convergence of the Markov chain Monte Carlo (MCMC), we performed over 5 replicates for each value of K and then checked the consistency of results [40]. The estimated number of clusters (K) was taken to be the value of K with the highest Pr (X|K) [39].
Groups of populations were defined according to STRUCTURE results and the hierarchical genetic variation existing within groups of populations and within populations was analyzed by Analysis of Molecular Variance (AMOVA) using Arlequin version 3.5 [41].
The correlation between genetic and geographical distances was assessed by the regression of on the logarithm (ln) of geographical distance [42], and tested using the Mantel test available in GENEPOP. Dispersion ability was estimated by the "neighborhood size," Dσ2, that is, the product of the effective population density (D) by the average squared parent offspring distance (σ 2 ) [42].
Kruskal-Wallis test was used to compare the average number of alleles and the average proportions of heterozygosity between populations using XLSAT Pro 2006 W .
The Bonferroni correction procedure [43] was applied to evaluate significance when multiple tests were performed.

Genetic variability
Genotypes of 235 An. funestus females were scored at 11 microsatellite loci. These microsatellite loci were highly polymorphic with a number of distinct alleles per locus ranging from 6 (FUNQ, AFND5) to 15 (AFND32, FUNL) in our five populations ( Table 1). The average number of alleles per locus ranged from 6.8 to 7.7 and was not significantly different among populations (P = 0.83). Mean observed heterozygosity across all loci ranged from 0.68 to 0.72 and was not significantly different among populations (P = 0.89). Heterozygosity tests were performed to explore     (Table 2).

Hardy-Weinberg and linkage disequilibrium
When the pooled samples were analyzed as a single population, 3 (AFUB11, AFND19 and FUNQ) of 11 loci showed significant deviations from Hardy-Weinberg equilibrium due to significant heterozygote deficiency (Table 1). When considering all loci, no deviation from Hardy-Weinberg equilibrium was observed within each population studied. However, significant deviation from Hardy-Weinberg expectations was observed for loci AFUB11 and AFND19 respectively in populations of Mbilor and Diaminar after the Bonferroni correction was applied. These deviations from Hardy-Weinberg expectations were associated with heterozygote deficiency (Table 1). No linkage disequilibrium was observed in any pair of loci (6600 permutations, P > 0.05).

Genetic differentiation and structure analysis
The levels of genetic differentiation between pairs of populations were estimated by F ST values. Table 3 shows F ST estimates for all pairwise population comparisons. The values of F ST between pairwise population comparisons for all loci ranged from 0 (Diaminar-Gankette) to 0.0519 (Dielmo-Mbilor). The highest F ST estimates were obtained between the most distant populations. F ST estimates were highly significant (P <10 -4 ) between Dielmo and the populations of the Senegal River basin and between Mbilor and the populations around Guiers Lake area (Diaminar, Gankette and Loboudou). After all, only three comparisons were not significant after Bonferroni correction: Diaminar-Loboudou (P > 0.005), Diaminar-Gankette (P > 0.005), and Gankette-Loboudou (P > 0.005).
Structure v 2.2. provided consistent results over 5 replicated runs tested for each K. The probability of the data (LnPr(X|K)) greatly increased from K = 1 to K = 2, and then reached a maximum value at K = 3, after which the values decreased gradually (Figure 2). Thus, in agreement with results based on F ST , the most likely number of genetic clusters in the dataset is K = 3 ( Figure 2).
The analysis of molecular variance (AMOVA) of all the eleven microsatellites then confirmed the differentiation and structure analysis with the variation within individuals, among individuals within populations, among populations within groups and among groups being 92.73%, 4.21%, 0.08%, and 2.97% respectively (Table 4). AMOVA showed that the variation among populations within groups explained only 0.06% of the total variance while the variation from among individuals within populations and within individuals explained 4.21% and 92.73% of the total variation, respectively.

Isolation by distance
Isolation by distance was tested and showed a positive and significant (R 2 = 0.76, P = 0.02) correlation between F ST /(1-F ST ) and the logarithm of distance, when considering the five villages (Dσ 2 = 7.23) (Figure 3)

Discussion
This study revealed genetic stability among the populations of Anopheles funestus in the Senegal River basin. Moreover, our results showed high levels of genetic diversity within the re-emergent populations of the Senegal River basin and a permanent population in the Sudanian domain. The comparable levels of genetic diversity between both areas confirmed the genetic stability of the newly established populations from the Senegal River basin. Furthermore, geographical distance seems to be the key factor for population genetic divergence, although, other factors could potentially play a role in the genetic differentiation among Senegal River basin populations. The high levels of gene flow denote important mosquito migration among populations. The observed genetic stability of An. funestus populations from the Senegal River basin was at least unexpected. Indeed, after several decades of periodic droughts, and recent re-emergence of An. funestus, we predicted a genetic signal of bottleneck followed by population expansion, as reported elsewhere in An. gambiae [44]. The absence of such demographic signals may be explained by the noteworthy effective population size of the An. funestus population, which could promptly foster population equilibrium [45]. For instance, the large population size in Anopheles gambiae prevented any impact on the mosquito population structure after strong population selection (i.e. vector control measures, dry seasons) [46,47]. On the other hand, high migration rates between populations can also erase any genetic signal of natural selection [48]. Therefore, the absence of signal for bottleneck and demographic expansion may indicate that the re-colonization was massive and/or seasonally repeated. We then hypothesize that the gene flow between Senegal River basin and other An. funestus populations may occur each year at the rainy season, when breeding sites are numerous and close enough to allow mosquito dispersal among populations.
Our results showed low but significant levels of genetic differentiation between populations of the Senegal River basin and those from the sudanian area. High levels of gene flow have been repeatedly reported on An. funestus and our estimates are consistent with previous studies on genetic structure of An. funestus populations in Senegal and other parts of Africa using microsatellites [17,18,20,49,50]. The important gene flow between An. funestus populations from the Senegal River basin and the sudanian population (252 km apart), revealed by our analysis, indicate the existence of continuous populations of this malaria mosquito are inter-connected. Such observations were already reported in the other genetic studies in the populations of An. funestus [19,49,51,52] and An. gambiae s.l [53][54][55][56]. Therefore, these results reassert that the recolonization of An. funestus in the Senegal River basin was probably carried out by a step by step dispersion starting from the neighbouring areas where An. funestus had not disappeared.
The genetic analysis clearly distinguished between sahelian and sudanian populations. The genetic tests of isolation by distance suggested that genetic differences observed between domains are linked to the geographical distance. This is a common pattern in Anopheles [44,57,58] and particularly in Anopheles funestus [18,20,50,51]. In contrast, An. funestus populations from the Senegal River basin may be subdivided into two distinct genetic entities: populations around Guiers lake area (Diaminar, Gankette, Loboudou) and populations from the low valley of Senegal River (Mbilor). The existence of these three genetic entities (two in the Senegal River basin and one from the sudanian domain) was confirmed by different genetic approaches (i.e. Structure and AMOVA). When Dielmo was excluded from the isolation by distance analysis, geographical distance did not explain the genetic differentiation observed between Mbilor and populations around Guiers Lake area. Thereby, other factors rather than geographical distance should play a key role into the population structuration of An. funestus populations in the Senegal River basin. Chromosomal differentiation detected between An. funestus populations of the low valley of Senegal River and the Guiers Lake area (difference of frequency for the inversion 3La) [12] may be implicated in the structure observed as demonstrated in the An. funestus populations of Burkina Faso [17]. Because paracentric inversions are involved in the adaptation to various environments, the chromosomal differentiation detected between An. funestus populations of the low valley of Senegal River and the Guiers Lake area [12] could be the consequence of different breeding sites, themselves consequences of different environmental changes induced by human activity.

Conclusions
Our study showed the existence of three genetically different subpopulations of An. funestus: populations around Guiers Lake area, populations from the low valley of Senegal River, and populations from Dielmo. The high genetic diversity among populations from the Senegal River basin coupled with no evidence of bottleneck and with a gene flow with the southern population suggested that the re-colonization was likely carried out by massive and repeated steppingstone dispersions starting from the neighbouring areas where An. funestus endured. Geographical distance is not the only factor involved in shaping of the genetic structure observed between the An. funestus populations from the low valley of the Senegal River and The Guiers Lake area and we hypothesize that the different breeding sites created by human activities may have shaped chromosomal structuration and may explain the restricted but still occurring gene flow. Our study is therefore indicative of adaptation of malaria vectors to the environment modified by humans.