First molecular description, phylogeny and genetic variation of Taenia hydatigena from Nigerian sheep and goats based on three mitochondrial genes

Background Cysticercosis caused by the metacestode larval stage of Taenia hydatigena is a disease of veterinary and economic importance. A considerable level of genetic variation among isolates of different intermediate hosts and locations has been documented. Generally, data on the genetic population structure of T. hydatigena is scanty and lacking in Nigeria. Meanwhile, similar findings in other cestodes like Echinococcus spp. have been found to be of epidemiological importance. Our aim, therefore, was to characterize and compare the genetic diversity of T. hydatigena population in Nigeria based on three mitochondrial DNA markers as well as to assess the phylogenetic relationship with populations from other geographical regions. Methods In the present study, we described the genetic variation and diversity of T. hydatigena isolates from Nigerian sheep and goats using three full-length mitochondrial genes: the cytochrome c oxidase subunit 1 (cox1), NADH dehydrogenase subunit 1 (nad1), and NADH dehydrogenase subunit 5 (nad5). Results The median-joining network of concatenated cox1-nad1-nad5 sequences indicated that T. hydatigena metacestodes of sheep origin were genetically distinct from those obtained in goats and this was supported by high FST values of nad1, cox1, and concatenated cox1-nad1-nad5 sequences. Genetic variation was also found to be higher in isolates from goats than from sheep. Conclusions To the best of our knowledge, the present study described the genetic variation of T. hydatigena population for the first time in Nigeria using full-length mitochondrial genes and suggests the existence of host-specific variants. The population indices of the different DNA markers suggest that analysis of long mitochondrial DNA fragments may provide more information on the molecular ecology of T. hydatigena. We recommend that future studies employ long mitochondrial DNA sequence in order to provide reliable data that would explain the extent of genetic variation in different hosts/locations and the biological and epidemiological significance.

Background Cestodes belonging to the genus Taenia infect a wide range of intermediate host species where the metacestode larval stage causes cysticercosis or coenurosis. A few members of this genus such as Taenia solium, T. saginata and T. asiatica are zoonotic and are responsible for taeniosis in humans while others are of veterinary importance.
The tapeworm T. hydatigena uses canids (primarily dogs) as definitive hosts. The metacestode larval stage of T. hydatigena infects mostly domestic animals such as goats, sheep and pigs, resulting in cysticercosis; an infection of veterinary importance that causes huge economic loses especially in livestock production as a result of mortality [1], condemnation of infected organs and carcasses, and the financial cost of diagnosis and inspection [2][3][4][5].
Taenia hydatigena has a global distribution with a prevalence range of 0.1-32.0%, differing between countries and hosts [4][5][6]. The epidemiology is such that prevalence is usually higher in sheep and goats in most African and European countries compared to countries in Asia [1,4,5], and higher in pigs in Asian and South American countries than in countries of other regions [4,6,7].
Globally, there exists a dearth of data on the genetic variation of T. hydatigena. Although a few studies conducted mostly by using partial cox1 and nad1 mitochondrial gene sequences have reported considerable levels of genetic variation among T. hydatigena populations from different geographical regions and hosts [4,[26][27][28][29], the epidemiological implications if any are far from clear and more studies in this regard may further help us understand the genetic variation or genetic population structure of T. hydatigena that could provide better insight on the disease epidemiology.
Further, the limitation of partial gene sequences in inferring the phylogenetic status of taeniids has been largely emphasized for Echinococcus [30,31]. Therefore, the available data based on partial gene sequences may have limitation in understanding the extent and significance of the genetic variation observed among T. hydatigena. Also, some studies have suggested that complete mitochondrial genes such as nad6, nad5, atp6, nad3 and nad2 may have some advantage over the conventional mitochondrial markers like cox1 and nad1 in investigating the molecular ecology of Taenia spp. [28,32]. Therefore, in the present study, our aim was to identify and investigate the genetic variation of T. hydatigena population in Nigeria based on the cox1, nad1 and nad5 mitochondrial genes, compare the genetic variation between the genes, and infer the phylogenetic relationships with T. hydatigena populations from other geographical regions.

Methods
Nigeria is a West African country with a population of over 180 million people. It is made up of 36 states and a Federal Capital Territory located in Abuja. These states are further grouped into six geopolitical zones (North-East, North-Central, North-West, South-East, South-South, and South-West) due to ethnicity and common ancestry. The vegetation cover is majorly rainforest in the south and savannah in the north. Borno State is located in the North-East zone of the country and home to about 4 million small ruminants (sheep and goats) [33] that are managed majorly by traditional method of livestock farming.
Isolates (n = 32) analyzed in this study were collected postmortem from goats (n = 24) and sheep (n = 8) in the months of November and December 2018 during routine examination at an abattoir in Maiduguri, Borno State, northeastern Nigeria (11°50′41″N, 13°8′89″E). All isolates were of liver origin.

DNA extraction, amplification and sequencing
Genomic DNA was extracted from a cut piece of each cysticercus using Qiagen Blood and Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. PCR was performed in a final 25 µl reaction mixture containing 12.5 μl Premix Ex Taq ™ version 2.0 (Takara Bio, Kusatsu, Japan), 10 pmol of each primer, 0.5 μl of genomic DNA extract (c.20-200 ng), and RNAse free water up to the final 25 μl volume. The reaction was as follows: initial denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 40 s, and elongation at 72 °C for 60-90 s and a final extension at 72 °C for 10 min. Amplicons were visualized by electrophoresis in 1.5% (w/v) agarose gels in 1× TAE (40 mM Tris-acetate, 2 mM EDTA, pH 8.5), stained with GelRed ™ , viewed under UV light and the products sequenced (Beijing Tsingke Biotechnology Co., Ltd., Beijing, China). Primers used for amplifying the complete cox1, nad1 and nad5 genes were designed by Primer Premier 5 software based on the full mitochondrial genome sequence of T. hydatigena (GenBank: NC_012896) ( Table 1).

Molecular analysis
Sequences were assembled stepwise with the help of DNAstar v7.1 program and Unipro UGENE v1.29.0 software while making sure that the overlap sequences were identical and then viewed manually for correction of any nucleotide misread followed by alignment in BioEdit v7.2.6 [34]. The identity of each isolate was confirmed with their nucleotide sequence in the Gen-Bank database using the NCBI BLAST algorithm (https ://blast .ncbi.nlm.nih.gov/Blast .cgi). Nucleotide and haplotype diversity indices were estimated in DnaSP v6 [35]. Median-joining network [36] was inferred based on the sequences of mitochondrial cox1, nad1, nad5, and concatenated cox1-nad1-nad5 genes using Pop-ART (http://popar t.otago .ac.nz). Pairwise nucleotide difference was calculated using MEGA 7 [37]. Population neutrality indices, Tajima's D [38] and Fu's Fs [39] were calculated using DnaSP v6 [35]. F ST was calculated using the Arlequin 3.5.2.2 software package [40]. The Bayesian phylogenetic relationship of the Nigerian T. hydatigena isolates with other Taenia spp. was carried out based on cox1-nad1-nad5 (4083 bp) concatenated sequences with MrBayes v.3.1.2. The General Time Reversible model with a proportion of invariable sites and a gamma-shaped distribution of rates across sites (GTR + I + G) was used as the best-fit model of sequence evolution as determined by JModelTest [41]. Markov Chain Monte Carlo (MCMC) sampling was used to assess the posterior distribution of parameters with a chain length of 2,000,000 states, and 10% was discarded as 'burn-in' . Parameters were logged every 1000 states. TreeView v.1.6.6. (http://taxon omy.zoolo gy.gla.ac.uk/rod/treev iew.html) was used to display the phylogenetic tree.

Results
All 32 isolates (24 goats and 8 sheep) were identified as T. hydatigena. Overall, 29, 24 and 27 isolates were successfully amplified for nad1, cox1 and nad5, respectively. A BLAST search of the resulting nucleotide sequences showed > 99% similarity with T. hydatigena sequences deposited in the GenBank database.

Sequence variation
On analysis of the complete cox1 (1620 bp), nad1 (894 bp) and nad5 (1569 bp) mitochondrial nucleotide sequences, we observed 24, 24 and 33 polymorphic sites, respectively for each gene, of which 62.5% (15/24), 75% (18/24) and 36.36% (12/33) were parsimony informative, respectively. The number of haplotypes (Hap) observed was 10, 10, 9 and 9 for nad1, cox1, nad5 and cox1-nad1-nad5 sequences, respectively ( Table 2). According to the median-joining network, the haplotype aNIG1 appeared centralized for nad1 gene sequences and constituted 24.13% (7/29) of the total population with 1-7 mutational differences from the other haplotypes (Fig. 1a). For cox1 network, no central haplotype was observed while for nad5, bNIG9 was at the center of the network with 1-10 mutational steps from the other haplotypes constituting 22.22% (6/27) of the total population and made up mostly of sheep isolates (Fig. 1b). Both nad1 and nad5 central haplotype comprised isolates from both hosts but did not constitute the majority of the population (Fig. 1a, b). In addition, nad5 and cox1 network showed distinct haplotypes for sheep isolates (Fig. 1b, c). Interestingly, on analysis of the concatenated sequences of all three genes, all sheep isolates formed two distinct haplotypes (Fig. 1d).

Phylogenetic analyses
Phylogenetic analysis showed low posterior probability (pp) values indicating weak nodal support (< 0.80) for the different haplotypes (Fig. 2). Also, haplotype NIG3 appeared as a sister taxon to other T. hydatigena cluster (the same haplotype according to the median-joining network also presented at least, a 15-point mutation difference from its closest neighbour, Fig. 1d). The cluster showed three sub-clades with varying pp values of which two Chinese reference isolates formed the first sub-clade with a pp value of 0.99 compared to other sub-clades (< 0.80) which were constituted by haplotypes from Nigeria (Fig. 2).

Comparison of Nigerian T. hydatigena isolates with populations from other geographical regions
Due to the scarcity of full-length mitochondrial DNA (mtDNA) data from other geographical locations as in this study, we could not employ the full-length mitochondrial genes for comparison. Nonetheless, on trimming, a final dataset of 59 nad1 sequences of 435 bp (including 29 sequences from this study) (Additional file 1: Table S4) was used to compare the geographical relatedness of the Nigerian isolates and those from different regions. Results of the analysis showed 32 haplotypes, a high Hd of 0.9573, and a π of 0.00853. Overall neutrality indices showed significantly negative values (Tajima's D = − 1.81180, Fu's Fs = − 25.300). Furthermore, the median-joining network presented two major haplogroups (Fig. 3) that comprised haplotypes from different hosts and locations (Fig. 3). In addition, F ST values of T. hydatigena population based on pairwise comparison showed significant differences only between isolates of Nigerian and European origin (F ST = 0.116, P = 0.009) and European and Asian/Middle Eastern isolates (F ST = 0.088, P = 0.05) (Additional file 1: Table S5).

Discussion
Despite the fact that cysticercosis due to T. hydatigena causes inconceivable damage in livestock production in endemic countries [1][2][3][4], the genetic population structure and the epidemiological significance of the observed genetic variation is poorly understood [4,29]. In Nigeria, previous epidemiological surveys have shown that cysticercosis caused by T. hydatigena in sheep and goats could reach as high as 20-30% [10,13] and below 2% in pigs [7], causing remarkable setbacks in livestock production [3]. Similarly, in other African countries, a number of epidemiological studies have also reported high prevalence of cysticercosis in ruminants [4,42,43].
Meanwhile, mtDNA has been widely employed in investigating the intraspecific variation of metazoans due to the absence of recombination, maternal inheritance, conserved structure, higher mutation rate and a relatively high evolutionary rate [44][45][46][47]. Another method for studying intraspecific variation or genetic diversity in cestodes is the use microsatellite DNA, which has been reported to be highly informative and also commonly used in genetic population studies [48][49][50]. However, in this study, we report the genetic diversity of T. hydatigena isolates from sheep and goats collected from a slaughterhouse in Nigeria based on complete sequences for three mitochondrial genes, cox1, nad1 and nad5, in contrast to the commonly employed partial gene sequences [4,28,51,52].
Our analysis of the three mitochondrial genes revealed a considerable degree of genetic variation. Higher haplotype diversity was recorded for goat isolates and was highest for cox1 gene while nucleotide diversity was similarly higher in goats and highest for nad1 gene. Overall, the diversity and neutrality indices based on concatenation of all three genes were higher in goats and demonstrated inconsistency with population expansion which suggest the possibility of a bottleneck event in the course of evolution. However, the diversity indices of the individual genes cox1 and nad1 were comparable to estimates from Iranian and Italian (Sardinia) T. hydatigena populations from sheep and goats [28] but higher than Palestinian (West Bank) T. hydatigena populations of sheep origin [51]. In Palestine, lower prevalence and transmission rate of cysticercus tenuicollis (the larval stage of T. hydatigena), smaller area of Palestine, low small ruminant population (about 1.5 million) [53] as well as the prevailing management system have been suggested to influence the genetic diversity of T. hydatigena [28,51]. In contrast, Iran and Sardinia (Italy) experience a higher prevalence and transmission rate of cysticercosis, higher population of small ruminants (Iran ≥ 50 million, Sardinia ≥ 3 million), with traditional farming methods still being applied in raising sheep and goats [28]. These differences could possibly have an influence on the genetic diversity of T. hydatigena, such that a higher prevalence in intermediate hosts, as seen in Italy and Iran, may result in multiple infections (cysts from different intermediate hosts) in definitive hosts favouring sexual reproduction and consequently higher genetic variation within the population.
Although the significant differences in the number of examined Nigerian T. hydatigena isolates (from both intermediate hosts) could have influenced the outcome of the neutrality and nucleotide diversity indices, the hypothetical existence of host-specific strains which has been reported cannot be completely ruled out [28,54,55] Based on the individual genes, the median-joining network featured a centralized haplotype only for nad1 and nad5 genes but these did not constitute the majority of the population. On concatenation of the three genes, which is believed to be more informative than the individual genes, no centralized haplotype was found. However, two distinct haplotypes were revealed in all sheep isolates. Meanwhile, the formation of a founder haplotype (29.2% of the population) based on partial cox1 (324 bp) gene was reported in a pooled population of T. hydatigena from different countries and hosts [28]. Conversely, analysis of the nad1 gene in the latter study did not reveal a centralised haplotype. The same was true for Italian isolates from sheep, goats and wild boars [28]. In Palestine, the formation of a founder haplotype was also observed in sheep based on cox1 fragment (444 bp) and constituted about 55% of the examined population [51]. However, in a more recent investigation on the genetic diversity of T. hydatigena isolates from Turkish sheep and goats based on nad1 fragment (471 bp), no centralised haplotype was also observed [52]. Be that as it may, clarification of such discrepancies and improved phylogenetic resolution rest mostly on the analysis of longer or complete mtDNA sequences.
We observed high F ST values between T. hydatigena populations from sheep and goats, suggesting genetic differentiation. Meanwhile, reports of host specificity have been previously documented between isolates from sheep and goats in Iran [54] and India [55], such that T. hydatigena metacestodes from both hosts were found to be morphologically different. Similarly, analysis of the biochemical components of cysticerci from pigs and goats also suggested genetic differences [56]. More so, pairwise comparisons in the present study between goats and sheep isolates yielded a significant F ST value. This is in-line with previous F ST values suggesting genetic differentiation between sheep and goat isolates from Italy, China and Greece, as well as between pig isolates from Italy, China and Poland [28], which further supports the idea of genetic distinction among T. hydatigena populations infecting different hosts. Then again, the observed pairwise fixation values and population indices could have been influenced by the small sample size analysed in the present study as sample size has been found to influence estimates of population indices [38].
Phylogenetic analysis of the concatenated gene sequences showed that all isolates from Nigeria were correctly identified as T. hydatigena as they clustered closely with other T. hydatigena isolates from other regions with relative distances from other Taenia spp. (Fig. 2). However, haplotype NIG3 which consists only of isolates from goats formed a separate clade (Fig. 2). The weak nodal support observed between Nigerian haplotypes is similar to previous observations of low posterior probability values (0.51-1.00) and bootstrap values (79%) between T. hydatigena haplotypes from Iran and Tanzania [4,27].
As previously suggested [56], strain specificity in T. hydatigena, may present a similar feature like that of Echinococcus in which case, deep insight may be provided potentially on examination of longer mtDNA fragments and possibly nuclear genes of larger datasets comprising isolates from different hosts and geographical locations.
Furthermore, analysis of the trimmed nad1 gene sequence dataset (435 bp) from T. hydatigena isolates from other geographical regions and hosts revealed a population network that was consistent with population expansion based on the significant negative Tajima's D and Fu's F, indicating the presence of rare haplotypes as well as the characteristic star-shaped configuration with a centrally placed haplotype similar to previous observation by Boufana et al. [28]. In the network, some haplotypes were formed by isolates from different hosts and locations. Conversely, the distinct haplotypes of Nigerian sheep origin previously classified based on 4083 bp of mitochondrial DNA, formed a new haplotype that comprised isolates from different hosts (including Nigerian goat isolates) and locations, thus questioning the reliability of partial gene sequences in investigating the genetic population structure or resolving the phylogenetic status of T. hydatigena. Moreover, the pitfall of short gene sequences in inferring the phylogenetic status and the genetic diversity of cestodes has been emphasized for Echinococcus spp. [57]. Further, the outcome of longer DNA fragment analysed in the present study demonstrated that using either of the genes alone is insufficient to describe the intraspecific variation existing between T. hydatigena, as the 4083 bp of the mitochondrial genome provided a better resolution as indicated by the phylogenetic network and population indices.
The influence of geography or animal husbandry practices could not have plausibly explained the genetic differences observed between T. hydatigena isolates in this study. This is because small ruminants (sheep and goats) in Nigerian pastoral communities are mostly raised in the same environment and similar animal husbandry system. Moreover, each geopolitical zone of the country share similar geography (vegetation and climatic conditions), and in this study, the examined ruminants originated from the same state.

Conclusion
The present study, to the best of our knowledge, described the genetic variation of T. hydatigena in Nigeria for the first time using full-length mitochondrial gene sequences and suggests the possible existence of hostspecific strains. The scarcity of complete T. hydatigena mitochondrial cox1, nad1 and nad5 sequences from different geographical regions in the GenBank repository limited a wide-range comparison and interpretation. However, the results suggest that longer DNA fragments or complete mitochondrial genome analysis may provide a better resolution in our understanding of the genetic diversity of T. hydatigena and the possible epidemiological significance.
Additional file 1: Table S1. Taenia hydatigena mitochondrial cox1 gene nucleotide sequence polymorphism and corresponding amino acid changes among haplotypes from sheep and goats. Table S2. Taenia hydatigena mitochondrial nad1 gene nucleotide sequence polymorphism and corresponding amino acid changes among haplotypes from sheep and goats. Table S3. Taenia hydatigena mitochondrial nad5 gene nucleotide sequence polymorphism and corresponding amino acid changes among haplotypes from sheep and goats.