Evidence of temporal stability in allelic and mitochondrial haplotype diversity in populations of Glossina fuscipes fuscipes (Diptera: Glossinidae) in northern Uganda

Glossina fuscipes fuscipes is a tsetse species of high economic importance in Uganda where it is responsible for transmitting animal African trypanosomiasis (AAT) and both the chronic and acute forms of human African trypanosomiasis (HAT). We used genotype data from 17 microsatellites and a mitochondrial DNA marker to assess temporal changes in gene frequency for samples collected between the periods ranging from 2008 to 2014 in nine localities spanning regions known to harbor the two forms of HAT in northern Uganda. Our findings suggest that the majority of the studied populations in both HAT foci are genetically stable across the time span sampled. Pairwise estimates of differentiation using standardized FST and Jost’s DEST between time points sampled for each site were generally low and ranged between 0.0019 and 0.1312 for both sets of indices. We observed the highest values of FST and DEST between time points sampled from Kitgum (KT), Karuma (KR), Moyo (MY) and Pader (PD), and the possible reasons for this are discussed. Effective population size (Ne) estimates using Waple’s temporal method ranged from 103 (95 % CI: 73–138) in Kitgum to 962 (95 % CI: 669–1309) in Oculoi (OC). Additionally, evidence of a bottleneck event was detected in only one population at one time point sampled; Aminakwach (AM-27) from December 2014 (P < 0.03889). Findings suggest general temporal stability of tsetse vectors in foci of both forms of HAT in northern Uganda. Genetic stability and the moderate effective population sizes imply that a re-emergence of vectors from local residual populations missed by control efforts is an important risk. This underscores the need for more sensitive sampling and monitoring to detect residual populations following control activities.


Background
Tsetse flies (Diptera: Glossinidae) are vectors of human African trypanosomiasis (HAT) and animal African trypanosomiasis (AAT), two diseases that exert a significant constraint on human health, animal production and agricultural livelihood in vast areas of rural sub-Saharan Africa. The occurrence of the two diseases follows the restricted distribution of the tsetse flies across 37 countries, covering more than nine million km 2 between 14°North and 20°South [1,2]. In 2000, the African Union recognized trypanosomiasis as "one of Africa's greatest constraints to socioeconomic development" [3].
HAT is among the most debilitating diseases on the continent, with an estimated 70 million people at risk [4]. The HAT disease presents in two different forms; the Rhodesian form that is restricted to eastern and southern sub-Saharan Africa and is caused by Trypanosoma brucei rhodesiense, and the Gambian form that is restricted to central and western Africa and is caused by T. b. gambiense. The two trypanosome subspecies are identical morphologically but display stark differences in epidemiological features, and require different diagnosis and treatment methods [5,6]. AAT, on the other hand, stands as a major obstacle to the development of more efficient and sustainable livestock production systems in tsetse-infested areas [7]. The Programme on African Animal Trypanosomiasis (PAAT) estimates that AAT causes approximately 3 million cattle deaths per year with farmers using approximately 35 million doses of costly trypanocidal drugs [8]. Overall, annual losses in agriculture alone have been estimated at over US $5 billion [9].
In Uganda, about two thirds of the total land area is infested with tsetse flies [10], and this is the only country known to sustain active transmission of both forms of human pathogenic trypanosomes; T. b. gambiense in the northwest and T. b. rhodesiense in the southeast [11]. There is a growing body of evidence that the disease ranges are expanding [12], and thereby narrowing a disease-free belt of less than 150 km just north of Lake Kyoga [13,14]. Overlap of the two diseases in northcentral districts of Uganda [15] will complicate diagnosis and treatment activities, and provide new challenges, as recombination between the two trypanosome forms can occur in tsetse flies and could lead to novel pathologies [16,17]. This highlights the need for increased understanding and implementation of vector control and disease prevention measures in this region.
The primary vector of both human and animal forms of trypanosomiasis in Uganda is the tsetse species Glossina fuscipes fuscipes, where it is responsible for an estimated 90 % of all disease cases. Glossina f. fuscipes belongs to the Palpalis group of tsetse, which inhabits low bushes or forests at the edges of riverine habitats, and like other tsetse species, has a unique reproductive strategy where larvae develop in utero. Populations appear to respond to seasonal weather patterns, often disappearing during the bi-annual dry season from sites where they were previously abundant [18], which can cause seasonal bottlenecks and thereby result in large temporal changes in gene frequencies due to genetic drift. However, it has also been hypothesized that larval development in utero can stabilize population sizes during dry periods [19], which would result in stable gene frequencies due to low rates of genetic drift. Thus, our understanding of population fluctuations remains incomplete. Likewise, knowledge of tsetse dispersal is also inadequate. Ecological studies using capture-releaserecapture data show that tsetse species have a high capacity for dispersal, but genetic data indicate surprisingly high differentiation among populations [20]. Studying temporal patterns of genetic variation can provide insight into some of these knowledge gaps and the apparent contradictions between ecological and genetic data [21]. Temporal population genetics data can also provide estimates of effective population sizes [22], and probability of population bottlenecks, and thereby aid in the choice of spatial and temporal parameters in vector control programmes [23].
Previous work on G. f. fuscipes temporal population dynamics has focused on southern, central and southeastern Uganda and has demonstrated stability, with little evidence of seasonal variation in population size [24,25]. However, regions north of Lake Kyoga have not yet been investigated, and remain a high priority because tsetse from this region are genetically distinct [26][27][28], and it is where the two forms of HAT will likely merge in the near future [13,14]. This region also differs significantly in climate, especially in annual precipitation [29]. Southern Uganda has a somewhat cooler climate and is less humid, with mean annual rainfall near Lake Victoria often exceeding 2100-3000 mm; the high temperature varies by 2-3°C over the year, with a mean daily high being around 26°C. In the north, the rainfall is between 1000 and 2000 mm, and temperature varies by 5°C over the year, with the mean daily high being 29°C [30]. The unique genetic and climatic background of northern Uganda may therefore create different tsetse population dynamics, and thus change the best strategy of vector control for the region.
In this study, we investigate temporal changes in nuclear allele and mitochondrial haplotype frequencies of G. f. fuscipes populations in areas north of Lake Kyoga by examining temporal samples collected across multiple locations in both the T. b. gambiense and T. b. rhodesiense disease belts. We do this by evaluating genetic stability at one mitochondrial marker and 17 microsatellite loci so as to contribute reliable scientific data around which sustainable vector control programs can be planned and implemented.
Assuming that G. f. fuscipes goes through approximately 8 generations per year [32,33], our temporal separation ranged from 22 to 52 generations. DNA was extracted from tsetse legs using PrepGEM Insect DNA Extraction kit (ZyGEM New Zealand, 2013), following the manufacturer's protocols.

Nuclear microsatellite amplification and genotyping
We evaluated patterns of nuclear DNA (nDNA) genetic diversity using 17 microsatellite loci extensively evaluated for minimal allele dropouts and null alleles and optimized for use in G. f. fuscipes in previous studies [24,25,27]. Amplifications were performed with fluorescently labeled forward primers (6-FAM, HEX and NED) using a touchdown PCR (10 cycles of annealing at progressively lower temperatures from 60°C to 51°C, followed by 35 cycles at 50°C) in 13.0 μl reaction volumes containing 2.6 μl 5× PCR buffer, 1.

Microsatellite marker validation and summary statistics
We evaluated microsatellite genotypes with Genepop v 4.2 [34] to test for deviation from Hardy Weinberg equilibrium (HWE), linkage disequilibrium (LD), and to estimate inbreeding coefficients (F IS ) using the Markov chain method [35] with 100,000 dememorizations, 1,000 batches and 10,000 iterations per batch. Arlequin v 3.5 [36] was used to calculate observed (H O ) and expected (H E ) heterozygosity for each population. Significance values for all multiple tests and comparisons were adjusted using the Benjamini-Hochberg method in preference to the Bonferroni method because of lower incidence of false negatives [37]. We used the program FSTAT v 3.1 [38] to calculate allelic richness and to provide locus and population-specific estimates of microsatellite allele frequencies.
Mitochondrial DNA amplification, sequencing, and summary statistics A 570 bp fragment of the region spanning across the mitochondrial DNA (mtDNA) COI and COII gene was PCR-amplified using the primers COII-F1 (5'-CCT CAA CAC TTT TTA GGT TTA G -3') and COI-R1 (5'-GGT TCT CTA ATT TCA TCA AGT A -3') as described by [27] with an initial denaturation step at 95°C for 5 min, followed by 40 cycles of annealing at 50°C, and a final extension step at 72°C for 20 min. We used a reaction volume of 13.0 μl containing 1 μl of template genomic DNA, 2.6 μl 5× PCR buffer, 1.1 μl 10 mM dNTPs, 0.5 μl 10 mM primers, 1.1 μl 25 mM MgCl2, and 0.1 μl (U/μl) GoTaq polymerase (Promega, USA). The PCR products were purified using ExoSAP-IT (Affymetrix Inc., USA) as per the manufacturer's protocol. Sequencing was carried out for both forward and reverse strands on the ABI 3730xL automated sequencer at the DNA Analysis Facility on Science Hill at Yale University (http://dna-analysis.research.yale.edu/). Sequence chromatograms were inspected by eye and sequences trimmed to remove poor quality data using Geneious v 6.1.8 (Biomatters, New Zealand). The forward and reverse strands were used to create a consensus sequence for each sample, and the sequences trimmed to a length of 404 bp and aligned in Geneious v 6.1.8. DnaSP v 5.10.01 [39] was used to calculate mtDNA haplotype diversity (H d ) and nucleotide diversity (π).

Temporal genetic analyses nDNA
We determined the overall proportion of the variance attributable to differences in sampling dates for all 18 sampled points using the analysis of molecular variance (AMOVA) implemented in Arlequin v 3.5; first, we ran AMOVA with two groups containing all sites at generation zero pooled together to form one group and sites with multiple generations forming another. Then, we performed AMOVA with multiple groups by having each temporal sample (generation 0, 22, 27, 29, 35, 38, 48 and 52) forming a separate group.
Genetic differentiation between temporal samples from the same population was quantified by computing pairwise F ST values in Arlequin v 3.5, and Jost's D EST [40] using DEMEtics [41] in R [42]. It is well documented that high-allelic diversity markers like microsatellites can lead to underestimates of F ST [40,43,44]. To account for this potential bias, we standardized F ST values using the formulae developed by Hedricks [43], and used Jost's D EST . P-values and confidence intervals for Jost's D EST were also calculated based on 1000 bootstrap resamplings. To test for correlation of differentiation indices and time since first sampling, we ran a linear regression of standardized F ST and D EST against number of generations separating time points sampled in JMP® v11.0 (SAS Institute Inc., Cary, NC, USA, 1989-2007).

mtDNA
We employed the same AMOVA strategy outlined above to determine the overall proportion of the variance attributable to differences in sampling dates for the mitochondrial data. We also quantified genetic differentiation between temporal samples from the same population by computing pairwise Φ ST values between temporal samples in Arlequin v3.5, and used the same program to perform Fisher's exact test of sample differentiation based on haplotype frequencies with 10,000 iterations and 1000 dememorization steps.

Effective population size (Ne) estimates and tests for bottlenecks
To provide an estimate of effective populations (Ne) of each site, we employed two methods using the nuclear microsatellite markers; the modified temporal method [45] based on [46], and the linkage disequilibrium (LD) method, as implemented in NeEstimator v 2.01 [47]. The temporal method relates the standardized variance of allele frequencies across generations to the effective population size [48,49] while the LD method uses the level of non-random associations among alleles at different loci [50,51] to estimate the action of genetic drift, and thus Ne.
Evidence of population bottleneck events was tested using two methods, both of which are implemented in the program BOTTLENECK [52]. The first method tested for an excess of heterozygosity relative to observed allelic diversity [53], and was performed separately for each sample and microsatellite locus. Simulation of heterozygosity at mutation-drift equilibrium distributions assumed the two-phase mutation model (TPM) with 70 % single-step mutations and 30 % of multiple-step mutation, as recommended for microsatellite loci [54], and significance was assessed using Wilcoxon's signed rank test, as recommended for fewer than 20 loci [52]. The second method tested for a bottleneck-induced mode shift in allele frequency distributions that were based on equal increments of 0.1 [55].

Summary statistics nDNA
A total of 404 G. f. fuscipes tsetse flies were genotyped using 17 microsatellite loci. Allelic richness ranged from 3.45 in population AM to 4.50 in KR and was generally similar for samples of the same locality analyzed from different time points, except for KR which differed slightly between generation 0 (4.50) and generation 35 (4.08) ( Table 1). None of the pairs of loci analyzed exhibited significant LD after correction for multiple testing, which confirmed previous work that these loci are unlinked. However, locus Pg17 exhibited significant departures from Hardy-Weinberg Equilibrium after correction for multiple testing in most of the populations (results not shown), and was therefore dropped from all further analyses. mtDNA For mitochondrial DNA sequences, we recovered a total of 26 haplotypes from the 18 sampled points. Haplotype diversity (H d ) was moderately high across samples; the number of haplotypes (h) ranged from 2 to 6. Nucleotide diversity (π) was highest in samples from KR (at generations 0 and 35), and lowest in samples from AM (at generation 27) and KT (at generation 22). However, both H d and π remained similar for temporal samples except for KT and PD (Table 1).

Temporal genetic variation and population stability
Results of AMOVA using both microsatellite and mitochondrial DNA frequencies suggested that differences between samples from different time points do not explain a significant amount of the overall genetic variation. On the other hand, differences among sites contributed significantly to the overall variation ( Table 2). STRUCTURE assignment shows that genetic structure remains homogenous from one time point to the next (Fig. 2), and that genetic distance is correlated with spatial distance rather than temporal distance.
nDNA Overall, the allele frequency changes were minimal for most of the sites and appear homogenous among temporal samples from the same locations (Fig. 3). Standardized pairwise F ST (F' ST ) for microsatellites between time points from the same site were generally low  (Table 3) (Table 3).

mtDNA
The same trend was apparent for the mitochondrial DNA haplotype frequencies; some sampled localities gained and others lost haplotypes across time points (Fig. 5). More critically, though, the frequencies of the most common haplotypes remained fairly homogenous     (Table 4). Following the Benjamini-Hochberg FDR correction, only one site (AM at generation 27) tested positive for bottleneck events at P < 0.0389 under the TPM model (Table 4); however, OC at generation 0 and KO at generation 29 were significant at P < 0.05 (Table 4). Additionally, tests for recent bottlenecks using the mode-shift indicator approach indicated that allele frequency distributions in all populations except AM-27 (Table 4) approximated the expected normal L-shape, thus showing no loss of rare alleles via drift as might be expected in a population which has undergone a recent bottleneck event.

Discussion
We carried out an evaluation of changes in gene frequencies at microsatellite loci and at an mtDNA marker  to provide insight into the temporal stability of G. f. fuscipes populations that harbor the two forms of HAT in northern Uganda. Generally, temporal population genetics studies are valuable in evaluating population stability and persistence, especially under changes in environment or following a perturbation [56]. Tsetse flies are an interesting study system because they exist in highly structured meta-populations that are sensitive to external environmental change, and because they act as vectors of dangerous human and animal diseases. The presence of the two forms of HAT in Uganda north of Lake Kyoga and the narrowing gap between the two disease belts provides added impetus for monitoring population dynamics of tsetse flies in the region. Although previous work on G.f. fuscipes from southern and central Uganda indicated population stability [24,25], there has not been any work in the genetically unique tsetse populations found north of Lake Kyoga. Temporal stability was also demonstrated for other closely related tsetse fly species like G. pallidipes in Kenya [21]. One limitation in previous studies has been short time intervals between samples of only 1-2 years. In this study, we sampled at comparatively longer time intervals, and have demonstrated similar stability of G. f. fuscipes in northern Uganda, the region where the two forms of HAT will likely merge in the near future.
Estimates of genetic diversity in both nDNA and mtDNA indicate stable populations that are currently under migration-drift equilibrium. Microsatellite results indicate relatively stable molecular diversity indices (A R , H O and H E) , while MtDNA sequences show stable h, H d , and π except for some populations such as PD and KT ( Table 2). These findings indicate stable populations that currently might be under migration-drift equilibrium. Generally low F IS estimates indicate mostly random mating within these populations. The pattern of reduced F IS values in the second sample at all sites except MY suggests a general increase in outbreeding in recent years, potentially driven by migration.
Pairwise estimates of genetic divergence and AMOVA results in both the nDNA and the mtDNA analyses indicated temporal stability of sampled populations, with low estimates of genetic differentiation between temporal samples from the same locality. Exceptions included sites KT, MY and PD, which showed moderate differentiation using both standardized F ST and Jost's D EST. The moderate differentiation observed in the temporal sites KT and PD might be partially attributable pallidipes [57], G.palpalis gambiensis [58], and even in G. f. fuscipes in Uganda [27]. Thus the differentiation we observed in KT and PD may indicate micro-geographic structure between sampling sites rather than temporal instability. On the other hand, two sites (MY and KR) represent potential regions of instability. Both F' ST and Φ ST values for these sites were highly significant (See Table 3). Temporal sampling efforts for these two sites were not geographically highly separated (5.86 km for MY and 1.22 km for KR), and therefore, these may represent the only areas of instability observed in our study.
Microsatellite-based population size estimates were variable and generally associated with large confidence intervals. Our Ne estimates were lower than most estimates from Uganda obtained by [24] but similar to some localities most proximal to our study sites, such as Masindi (MS). However, our estimates were slightly higher than estimates from southern Uganda around the Lake Victoria basin obtained by [25], and similar to estimates obtained by [59] in other riverine species like G. palpalis palpalis in West Africa. Relatively large Ne estimates coupled with evidence of low temporal differentiation indicates genetic drift does not strongly impact these populations [60]. Although Ne is generally difficult to estimate and is affected by many possible biases, the temporal method we have used here is suitable because high allelic richness (Table 1) should outweigh the upward bias in temporal method estimates of Ne [61]. Furthermore, the large sampling intervals used decreases the bias caused by overlapping generations and age structure [62]. The apparently low Ne estimated at some sites, such as KT, may be due to micro-geographic population structure rather than to a true shift in allele frequencies by genetic drift through time, and therefore may represent a false signal of low Ne.
Despite the tendency of G. f. fuscipes to exist in discrete patches and the potential for seasonal fluctuations in population size presented by the climate of northern Uganda, there is little evidence of genetic bottlenecks in the populations we studied; only population AM at generation 27, sampled in December 2014, showed evidence of a bottleneck. Our finding of limited bottleneck events is congruent with previous studies in southern Uganda [24,25,33]. The apparent lack of extensive seasonal variation in abundance supports the hypothesis that larval development in utero may help to relieve tsetse from harsh environmental conditions during reproduction, and hence stabilizes populations [19]. Furthermore, the networks of rivers, streams and other semi-permanent water bodies common in this region may drive overall stability. We hypothesize that waterways may be facilitating connectivity and 'rescue effects' between populations. Glossina f. fuscipes are known to generally disperse along waterways, following riverbeds or the edges of gallery forests, where they are able to survive low humidity conditions during dry seasons. Additionally, tsetse flies in northern Uganda have not been subjected to extensive control measures. A prolonged period of civil unrest in the region led to a disruption of control efforts and a breakdown in social infrastructure [63], and later control activities were decentralized and are currently managed by underfunded district health and veterinary authorities [64]. At the height of the insurgencies, the populace was displaced into Internally Displaced Persons' (IDPs) camps leaving tsetse populations in natural conditions for approximately two decades. This time frame was also free of major drought [65], which potentially stabilized population sizes and distribution in the region.
In summary, we find temporal stability in gene diversities of most tsetse populations sampled, which lends further credence to the hypothesis that larval development in utero helps to stabilize populations during dry periods, and as a result, decreases seasonal variation in tsetse number. Additionally, large populations of pupa, which develop in the ground over a period of weeks, may also help to ensure the continuity of tsetse populations and would contribute to reducing the variance in genetic changes over time [24]. For northern Uganda, population stability is also probably aided by the presence of many year-round rivers and streams, and historical happenstance, which left tsetse populations in natural conditions free from control efforts for the last several decades.

Conclusions
The findings point to general temporal stability of tsetse vectors in both foci of the two forms of HAT in northern Uganda. Additionally, migration and gene flow is probably important in shaping the observed genetic stability, which suggests that area-wide control strategies are more desirable than localized efforts. Furthermore, genetic stability and moderate effective population sizes suggest there is a significant risk of re-emergence from local residual populations missed by control efforts. This underscores the need for more sensitive sampling techniques to detect residual populations when tsetse densities decrease due to vector control to avoid a population rebound effect when control and monitoring activities are relaxed. Our results, especially the Ne estimates, may also be useful in evaluating the effectiveness of future control efforts, for instance, by estimating reduction in Ne resulting from control.