Population genetics of Enterocytozoon bieneusi in captive giant pandas of China

Background Most studies on Enterocytozoon bieneusi are conducted based on the internal transcribed spacer (ITS) region of the rRNA gene, whereas some have examined E. bieneusi population structures. Currently, the population genetics of this pathogen in giant panda remains unknown. The objective of this study was to determine the E. bieneusi population in captive giant pandas in China. Results We examined 69 E. bieneusi-positive specimens from captive giant pandas in China using five loci (ITS, MS1, MS3, MS4 and MS7) to infer E. bieneusi population genetics. For multilocus genotype (MLG) analysis of E. bieneusi-positive isolates, the MS1, MS3, MS4, and MS7 microsatellite and minisatellite loci were amplified and sequenced in 48, 45, 50 and 47 specimens, respectively, generating ten, eight, nine and five types. We successfully amplified 36 specimens and sequenced all five loci, forming 24 MLGs. Multilocus sequence analysis revealed a strong and significant linkage disequilibrium (LD), indicating a clonal population. This result was further supported by measurements of pairwise intergenic LD and a standardized index of association (I S A) from allelic profile data. The analysis in STRUCTURE suggested three subpopulations in E. bieneusi, further confirmed using right’s fixation index (F ST). Subpopulations 1 and 2 exhibited an epidemic structure, whereas subpopulation 3 had a clonal structure. Conclusions Our results describe E. bieneusi population genetics in giant pandas for the first time, improving the current understanding E. bieneusi epidemiology in the studied region. These data also benefit future studies exploring potential transmission risks from pandas to other animals, including humans.


Background
Diarrheal disease is the second leading cause of morbidity and mortality in children under 5 years old (including infants) in low-income countries [1,2]. The illness can often be traced to the unicellular fungus microsporidian, recognized as a category B biodefense agent on the National Institutes of Health list [3]. Enterocytozoon bieneusi is among the most common microsporidia; in immunocompetent individuals, the anthropozoonotic pathogen may be asymptomatic, but it has been implicated in self-limiting diarrhoea and malnutrition. In immunosuppressed individuals, however, persistent diarrhoea, abdominal pain, and weight loss are common symptoms [3][4][5]. Enterocytozoon bieneusi can also infect various invertebrates and vertebrates, including domestic animals, through a wide variety of transmission routes: water-borne, food-borne, anthroponotic and zoonotic [6,7]. Currently, effective commercial treatments for E. bieneusi infection do not exist, although some reports have suggested that the fungicide fumagillin is effective. However, this drug is toxic to mammals [6].
Enterocytozoon bieneusi cannot be identified based on morphology because its spores are too small and do not have characteristic staining properties, making them indistinguishable. Moreover, in vitro cultivation of the fungus has been unsuccessful thus far [8,9]. Thus, molecular screening is currently the only option. ITS gene analysis is widely used in molecular epidemiological studies of genetically diverse E. bieneusi, identifying over 200 genotypes [10,11]. ITS analysis is adequate for genotyping and epidemiological studies of E. bieneusi in the absence of recombination. However, ITS analysis has major limitations if genetic recombination occurs at any reproduction stage [12]. As a result, multilocus genotyping (MLST) has been developed, targeting three microsatellites (MS1, MS3 and MS7) and a minisatellite (MS4) to provide high-resolution E. bieneusi genotyping based on length polymorphisms and single nucleotide polymorphisms (SNP) [13]. Together with ITS analysis, multilocus genotyping has been gradually applied to E. bieneusi genetic characterization. The results contribute to the establishment of effective control measures by enabling elucidation of transmission routes, characterization of the public health consequences of animal-derived E. bieneusi isolates, and identification of the sources of human E. bieneusi infections.
For E. bieneusi, MLST has been applied to humans, non-human primates, swine, Asiatic black bears, red kangaroos, dairy and native beef, foxes, and red-bellied tree squirrels [14][15][16][17][18][19][20][21]. However, no population genetic analysis is available on E. bieneusi in giant pandas. Therefore, this study aimed to amplify the 12 ITS sequences in E. bieneusi-positive isolates from captive giant pandas of China using micro/minisatellite markers and characterize them using MLST. The resultant data on population genetic structure will help to prevent or reduce microsporidiosis occurrence in giant pandas through the development of efficient control strategies. These data can also be used to compare E. bieneusi population genetics across multiple hosts, improving risk assessment of potential human and livestock infection through infected giant pandas.
A population genetic analysis of allelic profiles was also performed. In Arlequin, we applied exact tests with Markov chain parameters to allelic profile data to assess pairwise intergenic LD. Standardized index of association (I S A ) values were calculated using LIAN 3.7 (http://guanine.evolbio.mpg.de/cgi-bin/lian/lian.cgi.pl) on five-loci haplotypes. I S A equal to zero or with a negative value indicates randomly mating populations and alleles in linkage equilibrium (LE). An I S A value greater than zero indicates a non-panmictic population structure is exhibiting LD. The variance of pairwise differences (V D ) was also calculated as another test of LD; when less than L (95% critical value for V D relative to the null hypothesis of panmixia), the population is panmictic and is in LE. When V D > L, the population is nonpanmictic and has some LD [22]. Intra-/intergenic LD, I S A , neutrality, and recombination events were combined to assess E. bieneusi population structure.

Substructure analysis
Allelic profile data were analyzed to determine subpopulations of E. bieneusi genotypes, using K-means partitional clustering and the admixture model in the Bayesian analysis tool STRUCTURE version 2.3.4 (http://web.stanford.edu/ group/pritchardlab/structure.html) [23]. Twenty simulation runs were conducted for each of K value = 2-8, with 10 4 and 10 5 replicates of the Markov chain Monte Carlo simulation discarded as burn-in. Wright's fixation index (F ST ), a measure of population divergence, was calculated in Arlequin; F ST = 0 indicates similar polymorphisms across all markers and F ST = 1 indicates high levels of between-population divergence. We also performed a median-joining analysis implemented in the software Network version 5.0 (http://www.fluxus-engineering.com/sharenet.htm) to estimate the potential for geographic segregation.

Results and discussion
Among the 69 ITS-positive E. bieneusi isolates, the MS1, MS3, MS4 and MS7 loci were successfully amplified in 48, 45, 50 and 47 specimens, respectively, yielding ten, eight, nine, and five types (Table 1). MS1 exhibited the highest sequence polymorphism and MS7 the lowest. A previous study of E. bieneusi population genetics in humans reported the highest resolution for MS1 and the lowest for MS3 and MS4 [14]. Another study conducted in non-human primates also suggested that MS1 had the highest resolution and MS3 the lowest [16]. In contrast, research on E. bieneusi populations in swine revealed that MS3 and MS4 possessed higher resolution than the other tested loci [15]. These inter-study differences may result from variation in hosts and sample size. The Hd and intragenic linkage disequilibrium shown in Table 2 indicate that all five markers were in significant LD. Assessment of intragenic recombination revealed that ITS and MS4 exhibited one and two recombination events (Rms), respectively, and LD was incomplete in these two loci, whereas the remaining three markers (with complete LD) did not have Rms ( Table 2).
All five loci were amplified and sequenced for 36 specimens, forming 24 MLGs with an Hd of 0.77 (Table 1). The 36 contigs had 49 polymorphic sites, with nucleotide diversity Pi = 0.00439 and an average number of nucleotide differences k = 8.379. In this study, the 36 isolates yielded 24 MLGs with a Hd value of 0.994 or 0.768, depending on whether the analysis used polymorphic or segregating sites. The results indicate that E. bieneusi genotypes are relatively abundant in our giant panda population. However, nucleotide diversity was very low (0.033 based on polymorphic sites, 0.004 based on segregating sites). This pattern of high Hd but low nucleotide diversity suggests the presence of a bottleneck followed by population expansion in E. bieneusi infecting giant pandas.
The observed ZnS, a measure of LD, was 0.254. Fisher's exact test found 339 significant pairwise correlations out of 1176, with 111 remaining significant after a Bonferroni correction. In contrast, chi-squared tests found 605 significant correlations, with 300 remaining significant after Bonferroni corrections. Coalescence simulations with 1000 replicates showed that average ZnS was 0.120, and its 95% confidence interval ranged from 0.066 to 0.462. The probability for expected ZnS ≤ 0.254 was 0.200 (i.e. the observed ZnS had a 0.200 probability of being higher than the expected ZnS if the former was within the 95% confidence interval). Linkage disequilibrium was strong and significant (|D'|: Y = 0.9652 -0.0204X), but a negative slope in the |D'| graph (Fig. 1) revealed a decline in an LD with greater nucleotide distance, leading to the possibility of genetic recombination. Accordingly, we also detected a minimum of four Rms, but we note that they could be due to genetic changes. To further confirm the occurrence of recombination, we utilized GENECONV, SiScan, and MaxChi analyses implemented in RDP 4 software. SiScan analysis did not detect any Rm, whereas both GENECONV and MaxChi detected two Rms (all three tests ignored indels).
The results of Fu's neutrality test shown in Table 3 indicate that the population experienced molecular selection. Moreover, the observation of Fs = -6.680 (the infinite model of mutation) suggested that a great number of alleles were present that probably resulted from a recent population expansion.
We found an I S A of 0.0903 and a V D (0.9725) greater than L (0.7652) ( Table 4), indicating a non-panmictic population exhibiting LD. Furthermore, our Monte Carlo simulations revealed a P MC (significance for obtaining this value in 1000 permutations) below the threshold for retaining the null hypothesis of panmixia (Table 4). Thus, all three tests indicated a non-panmictic   population is exhibiting LD. To exclude the possibility that the LD was due to clonal expansion of one or more MLGs, we also calculated I S A and V D for MLGs another way, considering isolates with the same MLG as a single individual. The I A S value was 0.0408 and V D value was still higher than L (Table 4). Thus, the population of giant pandas in our study was clonal with strong LD and limited recombination, which suggests that MLGs are relatively stable across time and location. Thus, they should be used for tracking the transmission of microsporidiosis.
The LD analysis of the allelic profiles from subpopulations 1, 2 and 3 revealed positive I S A for subpopulation 1 and 3, along with and V D > L with insignificant P MC . However, subpopulation 2 had negative I S A and V D < L ( Table 4). To confirm the existence of epidemic population structure, we then recalculated I S A for MLGs (isolates with the same MLG were treated as one individual) of the three subpopulations. Subpopulations 1 and 2 had negative I S A and V D < L with insignificant P MC , whereas subpopulation 3 had positive I S A and V D > L. Thus, both subpopulations 1 and 2 were in LE and had epidemic population structure, whereas subpopulation 3 remained in LD (despite non-significant P MC ), exhibiting a clonal population structure. This outcome was similar to a previous study in AIDS patients, showing one E. bieneusi subpopulation with a clonal structure and another with an epidemic population structure [24]. A clonal population structure of E. bieneusi was also found in non-human primates and swine [15,16].
All the clusters formed mainly came from Sichuan province in the substructure analysis (Fig. 2b). Thus, no clear geographically segregated subpopulation was observed. While, subpopulation structure analysis showed that subpopulation 1 consisted of isolates only from adults (> 5.5 years); subpopulation 2 had admixture of isolates from adults and yearlings (< 1.5 years), and adults isolates dominated; subpopulation 3 mainly consisted of isolates from juveniles (1.5-5.5 years) which may have contributed to the significance for LD and I S A analyses of each population. In the median-joining

Conclusions
We used ITS and micro/minisatellite markers to analyze 69 E. bieneusi isolates from captive giant pandas of China. We found an overall clonal population of E. bieneusi with strong LD and limited recombination. Within the larger  a Plot of mean likelihood L (K) and variance per K value from STRUCTURE on a dataset containing 36 specimens for five loci. b-d Plots for detecting the number of K groups that best fit the data, where (d) represents the peak delta K value of 3 Fig. 4 Median-joining analysis of the multilocus sequence contigs from 36 E. bieneusi isolates in this study. The size of the circles is proportional to the frequency of each of the 24 multilocus genotypes obtained based on segregating sites. The yellow, green, pink, purple, blue, grey and brown colours in circles represent the isolates from Sichuan, Zhejiang, Guangdong, Shandong, Fujian, Hunan provinces, and Shanghai city, respectively population, three subpopulations were present, with two exhibiting epidemic structure (subpopulations 1 and 2), whereas the third exhibited clonal structure (subpopulation 3). Additional large-scale MLST studies on E. bieneusi in other hosts should be conducted to characterize the pathogen's population genetics better. Such data should prove highly beneficial for characterizing E. bieneusi risk factors, including transmission, host specificity, and exact clinical presentation.