Evidence for genetic differentiation at the microgeographic scale in Phlebotomus papatasi populations from Sudan

Background Cutaneous Leishmaniasis (CL) is endemic in Sudan. It is caused by Leishmania major parasites and transmitted by Phlebotomus papatasi sandflies. Recently, uncommon clinical manifestations of CL have been reported. Moreover, L. donovani parasites that cause Visceral Leishmaniasis (VL) have been isolated from CL lesions of some patients who contracted the disease in Khartoum State, Central Sudan with no history of travelling to VL endemic sites on south-eastern Sudan. Because different clinical manifestations and the parasite behaviour could be related to genetic differentiation, or even sub-structuring within sandfly vector populations, a population genetic study was conducted on P. papatasi populations collected from different localities in Khartoum State known for their uncommon CL cases and characterized by contrasting environmental conditions. Methods A set of seven microsatellite loci was used to investigate the population structure of P. papatasi samples collected from different localities in Khartoum State, Central Sudan. Populations from Kassala State, Eastern Sudan and Egypt were also included in the analyses as outgroups. The level of genetic diversity and genetic differentiation among natural populations of P. papatasi was determined using FST statistics and Bayesian assignments. Results Genetic analyses revealed significant genetic differentiation (FST) between the Sudanese and the Egyptian populations. Within the Sudanese P. papatasi populations, one population from Gerif West, Khartoum State, exhibited significant genetic differentiation from all other populations including those collected as near as 22 km. Conclusion The significant genetic differentiation of Gerif West P. papatasi population from other Sudanese populations may have important implication for the epidemiology of leishmaniasis in Khartoum State and needs to be further investigated. Primarily, it could be linked to the unique location of Gerif West which is confined by the River Nile and its tributaries that may act as a natural barrier for gene flow between this site and the other rural sites. The observed high migration rates and lack of genetic differentiation among the other P. papatasi populations could be attributed to the continuous human and cattle movement between these localities.


Background
Female sandflies of the genus Phlebotomus in the Old World are vectors of Leishmania parasites, the causative agent of different clinical forms of leishmaniasis [1]. The disease is endemic in 88 countries in five continents with a total of 350 million people at risk [2].
The parasites within the Leishmania donovani complex usually invade the macrophages of the liver, spleen and bone marrow causing the severe symptoms of the fatal Visceral Leishmaniasis (VL) [3,4]; while the parasites of L. major complex which cause Cutaneous Leishmaniasis (CL) invade the subcutaneous reticulo-endothelial system and cause self-healing lesions that leave life-long scars [5,6].
Phlebotomus papatasi (Scopoli, 1786) is the principal vector of Leishmania major in the Old World [1]. It is a widely distributed species found in variable habitats and associated with a wide range of vertebrate hosts [7,8].
Due to their poor dispersal capacity, populations of P. papatasi are expected to show some genetic structuring along their geographical range as a result of adaptation to local habitats and limited gene flow [9]. Such genetic variability could play an important role in the epidemiology and clinical manifestations of leishmaniasis since it affects the vectorial capacity of the vector [10,11].
Previous studies focusing on potential population substructuring in P. papatasi used a number of molecular markers and provided contrasting results. For example, isoenzyme analyses clearly separated Western Mediterranean P. papatasi populations from those from eastern countries but, predictably, failed to reveal differences between urban and rural sandfly populations in Morocco [12]. Mitochondrial DNA sequences coding for cytochrome b (mtDNA cyt b) and sequences from the second internal transcribed spacer of the ribosomal DNA (rDNA ITS2) were both used to study the population structure of P. papatasi populations from North Africa and the Mediterranean sub-region countries [13,14]. The analyses of mtDNA cyt b sequences suggested some level of genetic differentiation among widely separated populations and revealed a pattern of isolationby distance between populations from Syria, Egypt, Israel/ Palestine and Turkey [13]. The ND4 mtDNA region was also used in combination with the rDNA ITS2 to study the population structure of P. papatasi from 18 countries from North Africa, the Mediterranean sub-region, Saudi Arabia and India, but revealed no clear phylogeographic structure between those populations. However, signs of restricted gene flow were found among populations from Iran, Egypt, Syria, Yemen and Turkey [14].
Microsatellite markers combined with Bayesian statistic analysis were recently used to study the population structure of Phlebotomus papatasi populations in countries from the North-African and the Mediterranean subregions. This study confirmed the occurrence of highly significant genetic differentiation between some populations [15]. However, the geographical scale of the study did not allow for detecting possible genetic differentiation at the local level, which may be the most relevant for explaining the observed patterns of variation in epidemiologically relevant traits observed in some regions.
Thus, so far, genetic differentiation among P. papatasi populations could not be demonstrated at the local geographical level, despite evidence suggesting that it may occur. For example, Schmidt and Schmidt (1963) observed marked morphometric variations within the Egyptian populations of P. papatasi suggestive of substructuring [16]. Using isoenzyme analysis, Kassem et al. reported the presence of polymorphisms at many isoenzyme loci among populations of P. papatasi from Egypt but no significant genetic differentiation could be detected [17,18]. It must be stressed that the possibility of genetic differentiation in P. papatasi populations at the local scale has not yet been explored using microsatellite markers which, with their higher mutation rates, should be comparatively much more informative than isoenzymes and other sequence loci [19].
In Sudan, CL is caused by L. major and transmitted by P. papatasi [20,21]. Before the 1970s, the disease was confined to the western parts of the country. Thereafter, major epidemics occurred along the River Nile and the disease became endemic in many regions of the country [21]. The usual clinical forms of the disease usually heal spontaneously without the need for medical treatment, a matter which discourages patients from attending the health centres, and therefore obscures the incidence of the disease [22]. Visceral Leishmaniasis (VL) which is caused by L. donovani and transmitted by P. orientalis is known to be endemic in eastern and southern parts of Sudan [4,23], with few scattered foci along the White Nile and Darfur [24]. Recently, uncommon clinical manifestations of CL that did not heal spontaneously nor responded to usual drugs have been reported [25]. Moreover, L. donovani parasites have been isolated from CL lesions of some patients who contracted the disease in Khartoum State, Central Sudan, with no history of travelling to VL endemic sites [26]. Recent studies have demonstrated the possibility of genetic exchanges between different strains and species of the Leishmania parasites [27][28][29] which may further complicate the epidemiology of the disease since hybrid parasites may adapt differently to the vector and reservoir hosts [30].
Because different clinical manifestations and the parasite behaviour may also be related to genetic differentiation or sub-structuring within sandfly vector populations [11,31], we conducted a population genetic study of P. papatasi populations in Sudan from broad to local geographical scale. This was done using a set of 5 microsatellite markers, especially developed for P. papatasi [32], and given the paucity of such markers, by testing and using additional markers developed for P. perniciosus a related phlebotomine species [33].
P. papatasi populations were collected from different localities characterized by the distribution of the atypical CL cases. The level of genetic diversity and genetic differentiation among natural populations of P. papatasi was determined using F ST and Bayesian assignments. Identifying potential factors leading to genetic differentiation and structuring in P. papatasi populations might improve our understanding of the epidemiology of the disease and help develop appropriate control strategies.

Study sites
A total of 126 male specimens of P. papatasi, which are typically easier to identify morphologically than females in this and other phlebotomine species, were collected from five different localities in Sudan separated by 20 to 440 Km ( Figure 1, Table 1). Individuals from a colonized population from Egypt (EGY) (~1600 Km from Khartoum State) were also included. The Sudan study sites were selected according to the distribution of the uncommon CL cases. They also represent different ecological zones separated by the River Nile and its tributaries. Four sites were located in Khartoum State; namely, Moyleih (MO), a semi-desert area on the west bank of the White Nile, El-Trais (TR), an irrigated area on the west bank of the White Nile; Sirougia village (SR) is a green rich irrigated area with many horticultural schemes on the east bank of the River Nile, northern Khartoum and Gerif West (GW) used to be green farms on the west bank of the Blue Nile, but recently changed into a residential area (also known as El-Manshya). Cutaneous leishmaniasis is not common in Moyleih and El-Trais areas; however some of the patients with CL lesions due to L. donovani were from these sites (Pers. Comm. Prof M. M. Mukhtar, Institute of Endemic Diseases, Sudan).Sirougia is a well-known endemic focus of CL where previous epidemic occurred beside some of the uncommon CL cases came from the nearby villages (Pers. Comm. Prof A. M. Musa, Institute of Endemic Diseases, Sudan). Gerif West is not known to be endemic for cutaneous leishmaniasis. However, some of the uncommon CL and CL due to L. donovani cases were from this site. The fifth Sudan study site was the village of Wad Shariefai (SHR) in Kassala State, Eastern Sudan which is located in the immediate vicinity of the dry savannah habitats of the Tajouj Forest. Wad Shariefai village is a site where three closely-related species belonging to the subgenus Phlebotomus phlebotomus (P. papatasi, bergeroti and duboscqi) were found sympatrically. Due to the far location of this Eastern site it is expected that P. papatasi from this site would be genetically different from P. papatasi populations from Khartoum Sate, Central Sudan.

Microsatellite PCR amplification
Samples were processed and DNA was extracted as previously described [34]. Microsatellite loci were amplified using the PCR primers originally developed by Hamarsheh et al. and Aransay et al. [32,33]. The PCR reactions were performed in a gradient thermocycler (PT200, MJ) for 30 cycles. The PCR mix contained 1.5 to 3μl of genomic DNA (5-8ng/μl) extracted from individual male sandfly, 5x PCR buffer (Promega, Madison, WI), 15pmol of each primer, 200mM of each dNTPs, 1.2μl of 25mm MgCl 2 and 0.5U Taq W DNA polymerase (Promega, Madison, WI) in 20μL total reaction volume. The thermal cycling conditions were an initial hold at 95°C for 2min, followed by 30 cycles of 94°C for 30s, 55°C for 30s and 72°C for 30s and a final extension at 72°C for 5min. For amplifications involving the primer AA13, a touchdown PCR programme was performed. The programme steps were as mentioned above except that the number of cycles was increased to 35 and the starting annealing temperature was 59°C, which was then decreased by 1°C for the first five cycles. For correct sizing of polymorphic PCR products, the forward primers were labelled with 5'-fluorescent tags. Details of the primers sequences; repeat motif, annealing temperature, and fluorescent labelling are given in Table 2.
PCR product multiplexing PCR product multiplexing was done according to the amplified allele size and the fluorescent dye labelling of each product in order to create two primer sets. Set A contained a mix of Pap1, Pap2, AA13, and AA82, while set B was a mix of Pap4, AA18, and AA24. About 16 samples with different dilutions (1:10, 1:20, and 1:50) were initially genotyped at Eurofins MWG Operon Company, Ebersberg, Germany. According to the results from the first genotyped samples, a suitable dilution factor was chosen and the two sets of sample microplates were prepared, tightly packaged in dry ice and submitted for subsequent genotyping in the aforementioned company.

Microsatellite data analysis
Because the repeat arrays of all the microsatellite loci consist of trinucleotide repeats (except for Pap2 which consist of dinucleotide repeats), the allele sizes were expected to differ predominantly by multiples of three nucleotides. Therefore any exceptions were treated as a result of insertions or deletions (indels) of odd numbers of bases in the flanking region.
The data were screened for the presence of stutters, null alleles and large allele dropout using the software Microchecker version 2.2.3 available at http://www. microchecker.hull.ac.uk [35]. Then the data were analyzed using ARLEQUIN version 3.5 available at http:// cmpg.unibe.ch/software/arlequin3 [36]. The programme was used to calculate allele frequencies at each microsatellite locus and to test whether they conformed to Hardy-Weinberg equilibrium (HWE) in each population. The same programme was used to estimate genetic differentiation (F ST ) between populations and the level of gene flow (Nm) among them, as well as assessing whether patterns of genetic divergence among P. papatasi populations followed an isolation-by-distance pattern. The genotype assignment test implemented in ARLEQUIN programme was used to assign individual sandfly genotypes to their population of origin by computing the log-likelihood of each individual multi-locus genotype in each population sample, assuming that the individual belongs to that population [36]. The significance of all tests (P <0.05) was obtained with 1000 permutations and 100000 Markov chain steps.   Hidden population sub-structure within P. papatasi populations were further investigated using the Bayesian clustering algorithm implemented in the software STRUCTURE version 2, available at http://pritch.bsd. uchicago.edu [37]. The programme was run using an admixture model with a burn-in period of 10000 iterations, followed by 100000 iterations of Markov chain Monte Carlo (MCMC) repeats for each setting of K from 1 to 10 and no prior information about the population structure was provided. After calculations, the value of the ad hoc ΔK was plotted against the assumed number of populations (K = 10), the highest peak of the graph was considered as the number of possible subpopulations in the data set.

Allelic richness and heterozygosity
The three markers Pap1, Pap2 and Pap4 showed varying degrees of heterozygosity among the different study populations; although the two markers Pap3 and Pap5 did not work well with the Sudanese P. papatasi populations even after several trials with different amplification conditions. Amongst the markers for P. perniciosus, AA13, AA18, AA24 and AA82, were successfully amplified in P. papatasi. However, these markers tended to be less polymorphic than in the original species. The total number of alleles per locus ranged from 1 to 5 with locus AA18 being monomorphic in all populations. In addition, each of the two loci AA13 and AA24 were found to amplify two different polymorphic loci with distinct allele size ranges (i.e. AA13a: 81-96bp and AA13b 129-132bp; AA24a: 141-147 and AA24b: 255-273) which were not found to be in linkage disequilibrium. This suggests that these two loci were the results of duplications to different unlinked regions of the genome; thus they were treated as two additional loci thereby increasing the total number of loci examined to nine (Table 3).
There was no evidence for scoring error due to stuttering, large allele dropout or null alleles after checking the dataset with Microchecker software. Except for locus AA18 which was monomorphic in all P. papatasi populations, the other eight studied microsatellite loci were found to be polymorphic. The total number of alleles per locus ranged from 2 to 5 with an average of 2.77 alleles per locus and an average gene diversity (mean expected heterozygosity) of 0.355 ± 0.226 ( Table 3).
The six P. papatasi populations were found to be polymorphic at 88.9% of the amplified loci. Allele frequency distribution between the 6 P. papatasi populations varied from locus to locus, with some loci having similar distributions and others being highly variable. For example, allele 128 for locus Pap1 was the most common allele among all Sudanese populations while allele 125 was the most common among the Egyptian population. For locus Pap2, allele 85 was the most common among populations from Sudan while allele 89 was the most common in a population from Egypt. Private alleles were detected at locus Pap1 in the MO population, locus AA24a in the SR population, loci AA13a and AA24b in the SHR population, and locus Pap2 in population EGY. Locus AA18 was monomorphic and was excluded from subsequent analyses (Table 4).
There was a considerable variation in mean heterozygosity in P. papatasi populations, including the Egyptian laboratory colony, despite its long history of colonization. Surprisingly, this population was the most variable with approximately 0.493 heterozygosity (i.e. on average 49% of the individuals were heterozygous at any given locus). Meanwhile, MO population was the most heterogeneous among all the Sudanese populations (40% heterozygosity). The most uniform population in the study was TR with 31.5% heterozygosity (Table 5).

Linkage disequilibrium and Hardy-Weinberg equilibrium
As few as 14 out of 168 comparisons between pairs of loci were found to be in significant linkage disequilibrium. However, none of those comparisons remained significant after Bonferroni correction.
SR, and GW populations, and locus AA24a within SR population. This could be due to population subdivision rather than the existence of null alleles since amplification was successful for all individuals. Significant deviations from HWE due to homozygote deficiency were found in locus Pap1 within GW population and locus AA13b in all populations. The Egyptian P. papatasi population was monomorphic for two loci (Pap4, and AA13a) and the GW population was monomorphic for AA13a ( Table 6). Because of their consistent deviation  from HWE among all populations, the two loci Pap4 and AA13b were excluded from subsequent analyses.

Estimates of genetic differentiation between populations
The genetic differentiation between pairs of populations, as measured by F ST , ranged from 0.007 to 0.233. Significant F ST estimates (F ST =0.147-0.233) were observed between all P. papatasi populations from Sudan when compared to those from Egypt (  Figure 2). As expected, F ST estimates translated into high estimates of migration rate, Nm, among populations from MO and TR, Southwest Khartoum State. High gene flow estimates were also observed between populations from MO, TR and SR and SHR, despite the large geographical distance between them. However, very little gene flow was observed between those same populations and GW which is approximately geographically equidistant to the other three sites in Khartoum State (Table 7).
A Mantel test showed that there was a positive but non-significant correlation (r = 0.08, P > 0.05) between linearized estimates of genetic differentiation F ST (F ST / (1-F ST )) and geographical distances (log km) (Figure 3).

Population structure
The STRUCTURE genotype assignment programme clustered our study populations into two groups ( Figure 4). The dataset was further analyzed by sorting out individuals between the two suggested subpopulations as belonging to subpopulation A (grey colour) and subpopulation B (black colour) based on the likelihood of their genotype to belong to one or the other population ( Figure 5). Most of the individuals in the Egyptian population (88%) were assigned to subpopulation B, while most of the individuals in the Sudanese populations belonged to subpopulation A. However, the observed heterogeneity in population assignment -e.g. 83% of TR and 73% of MO belong to the population Asuggested that the two clusters do not represent reproductively isolated populations ( Table 8).
The same result was also observed when attempting to assign individual sandflies to the different collection sites by computing the log-likelihood of the genotype of each individual based on the allelic frequencies of each population. According to the population assignment test, about 80% of the Egyptian sandfly individuals were correctly assigned to their population of origin. However, none of the individual sandflies from the Sudanese populations of MO and TR were assigned to the Egyptian one (Table 9).

Discussion
This study further validates the use of microsatellites for studying subtle patterns of genetic differentiation between populations at the microgeographic scale. In such situations, microsatellite markers with their highmutation rates and high allelic diversity are known to perform much better than slow evolving allozymes and other nuclear and mitochondrial markers [38,39]. However, useful microsatellites are neither highly abundant nor easily isolated from arthropod species [40]. Homologous microsatellite loci have sometimes been amplified in species related to the one they were isolated from using the same primers, but amplification success strongly depends on the divergence time separating the species concerned [41].
Here, given the low number of microsatellite loci available for P. papatasi, additional loci originally designed for P. perniciosus were tested and successfully crossamplified. As expected, some of these loci produced fewer alleles than in the original species (P. perniciosus). For example, the total number of alleles per locus for these loci ranged from 4 to 9 in 13 Spanish populations of P. perniciosus with a sample size ranging from 13 to 38 but here in our study allelic richness ranged from 1 to 5. Surprisingly, two primer pairs AA24 and AA13 amplified more than one locus in P. papatasi with distinct allele size-ranges and genetically unlinked, indicating a duplication of the genome in the region of those repeat sequences in P. papatasi. These loci were polymorphic in most P. papatasi populations, which contrasts with previous results in P. perniciosus where locus AA24 was not significantly polymorphic in 12 out of 13 studied populations, and locus AA13 was monomorphic in 5 of them [42].These variations suggest strikingly different evolutionary trajectories. Future comparative genomic studies might help elucidate the phylogenetic relationships and genome rearrangements characterizing these sandfly vector species.
Interestingly, no significant pattern of isolation-bydistance was observed amongst the 6 P. papatasi populations sampled from Central and Eastern Sudan, and Egypt. This result is due to the unexpected significant genetic differentiation found at the local geographical scale between the Sudanese populations that interfered with broader geographical patterns of genetic differentiation. Importantly, the P. papatasi population from GW was significantly genetically differentiated from that of SHR, located 400km away in Eastern Sudan. But the   Figure 1), a matter that may constitute a further barrier to gene flow for the GW population. It is noteworthy that uncommon clinical manifestations of CL were also reported from SR which used to be an endemic site for common CL cases. However, P. papatasi population from this site was significantly genetically differentiated from GW than from the other two sites from Khartoum State (MO and TR). Previous studies demonstrated the presence of different genetic groups among L. major strains from different locations in the same country which could be related to different P. papatasi subpopulations [43,44]. Whether the parasite strains causing these complications in SR and GW belong to the same genetic groups or not, remains to be further explored. It is well known that populations in close proximity are genetically more similar than distant populations and that high gene flow usually precludes local adaptation [45]. Therefore, the lack of interpopulation genetic differentiation among Khartoum State populations (MO, TR and SR) could be attributed to the flies' own dispersal abilities, to human transportation and potentially to wind dispersal. P. papatasi is known to be capable of flying up to 2km per night in open desert where they can be also be carried by air currents. Moreover, sandflies usually swarm over their host during feeding and usually mating takes place on the host; thus they may be capable of movement along with animal herds [46,47]. It should be stated that the Moyleih area (MO) hosts a famous cattle market where animals are brought from Western and Southern Sudan either for slaughter in the nearby "Ghunawa" abattoir or to be herded across the Nile bridge towards the "Kadaro" abattoir on the Eastern bank of the Nile very near the village of Sirougia (SR). In addition, the seasonal movement of the camel-herding nomads northward during the wet season and southward at the end of the wet season [48] may contribute to the dispersal of sandfly over a wide range.
The close genetic similarity among sandfly populations from Khartoum State (MO, TR and SR) and Wad Shariefai (SHR), Eastern Sudan, despite the large geographical distance, could be attributed to the prevailing environmental conditions such as wind speed, humidity and temperatures that may play an important role in sandfly dispersal [1]. Although each of the studied sites has their own peculiar environmental conditions, they all lie within the dry savannah belt [49]. Moreover MO and TR, although they are apparently desert areas, used to be covered by thick savannah forests that may have had similar environmental conditions as SHR [50]. Since the mid-1970s the whole Western Sudan, including the area west of the White Nile, suffered from continuous drought periods that lead to deterioration of the plant cover, soil, water and biodiversity [50][51][52]. Previous studies on the distribution and phenology of P. papatasi indicate that this species can withstand extreme environmental conditions and survive even in areas with extreme temperature and aridity [53][54][55]. The prevailing conditions within deep wall-cracks and small muddy huts provide high humidity and cooler temperatures suitable for sandfly proliferation in these sites. All these factors may explain the genetic similarity observed between distant sandfly populations since flies can disperse largely freely between vast expenses of largely similar habitats.
Low genetic differentiation between P. papatasi populations separated by large geographical distances has also been reported in previous studies based on different molecular markers [14,56,57]. In addition, population genetic studies of other important sandfly species demonstrated the homogeneity of these populations in a radius of 20km while a certain degree of structuring was observed in a zone of approximately 1km [58]. It is worth mentioning that little genetic differentiation was also shown in populations of P. orientalis in Sudan using 30 RAPD markers. In Table 9 Number and percentage of individual Phlebotomus papatasi sandflies assigned to each collection site in Sudan and Egypt using the log-likelihoods approach in ARLEQUIN addition, populations from Dinder National Park, South-Eastern Sudan were found to be genetically very similar to population from SR, Khartoum State [59]. It has been recently postulated that latitude is the main factor determining the degree of genetic diversity among P. papatasi populations [13]. This study also revealed significant genetic differentiation and restricted gene flow between the Egyptian EGY population and all the Sudanese populations, which suggests some degree of isolation-by-distance despite the overall lack of a significant correlation between the geographical distance and estimates of genetic differentiation. The Egyptian P. papatasi colony also showed unexpected high degree of heterozygosity compared to the Sudanese field populations, although long colonization and adaptation to laboratory conditions are usually thought to minimize the genetic variability in laboratory populations [60,61]. Unfortunately, no detailed information was available about the EGY colony, although it may be speculated that this diversity may be a result of its augmentation with new collections, which is a common practice in sand fly colonization. It must be mentioned that in previous studies, high levels of polymorphism and heterozygosity were also observed in P. papatasi colony from Sinai, Egypt even after the 33 generations [62].

Conclusion
The significant differentiation of the GW Phlebotomus papatasi population from the other Sudanese populations may have important consequences on the epidemiology of leishmaniases in Khartoum State and warrants further investigation. Such genetic variations could potentially correlate with differences in the vectorial capacity of the sandfly vector to transmit different strains of the parasites or even hybrids. Future studies using a larger number of microsatellite loci, more populations and more specimens per population are also needed to further elucidate patterns of genetic differentiation of P. papatasi in Sudan. This knowledge would improve our understanding of the epidemiology of the disease in Sudan, and ultimately would result in improved disease and vector control programs.