Comparative mitogenomics supports synonymy of the genera Ligula and Digramma (Cestoda: Diphyllobothriidae)

Background After observing differences in the number of reproductive complexes per proglottid within the genus Ligula, the genus Digramma was erected. However, the validity of Digramma has been previously questioned due to a low variability in the cox1, nad1 and ITS rDNA sequences between the two genera. We undertook a study to greatly increase the amount of sequence data available for resolution of this question by sequencing and characterizing the complete mitogenomes of Digramma interrupta and Ligula intestinalis. Results The circular mtDNA molecules of Digramma interrupta and Ligula intestinalis are 13,685 bp and 13,621 bp in size, respectively, both comprising 12 PCGs, 22 tRNA genes, two rRNA genes, and two mNCRs. Both mitogenomes exhibit the same gene order and share 92.7% nucleotide identity, compared with 85.8–86.5% to the most closely related genus Dibothriocephalus. Each gene from D. interrupta and L. intestinalis is almost of the same size, and the sequence identity ranges from 87.5% (trnD) to 100% (trnH, trnQ and trnV). NCR2 sequences of D. interrupta and L. intestinalis are 249 bp and 183 bp in length, respectively, which contributes to the main difference in length between their complete mitogenomes. A sliding window analysis of the 12 PCGs and two rRNAs indicated nucleotide diversity to be higher in nad5, nad6, nad2, nad4 and cox3, whereas the most conserved genes were rrnL and rrnS. Lower sequence identity was also found in nad2, nad4, nad5, nad6 and cox3 genes between the two diphyllobothriids. Within the Diphyllobothriidae, phylogenetic analysis indicated Ligula and Digramma to be most closely related to one another, forming a sister group with Dibothriocephalus. Conclusions Owing to higher nucleotide diversity, the genes nad2, nad4, nad5, nad6 and cox3 should be considered optimal candidates to use as molecular markers for population genetics and species identification between the two closely related species. The phylogenetic results in combination with the comparative analysis of the two mitogenomes, consistently support the congeneric status of L. intestinalis and D. interrupta. Electronic supplementary material The online version of this article (10.1186/s13071-018-2910-9) contains supplementary material, which is available to authorized users.


Background
Based upon its paraphyly, differences in the position of the genital pore, the presence of an external seminal vesicle and the absence of a uterine sac in the Diphyllobothriidea Kuchta, Scholz, Brabec & Bray, 2008, the order Pseudophyllidea van Beneden in Carus, 1863 was suppressed and the Diphyllobothriidea and Bothriocephalidea Kuchta, Scholz, Brabec & Bray, 2008 were proposed [1][2][3].
Cholodkovsky [9] erected the genus Digramma Cholodkovsky, 1915 after observing differences in the number of reproductive complexes contained within each proglottid when studying the genus Ligula Linnaeus, 1758. In China, Ligula spp. are distributed in the Qinghai-Tibet Plateau with Schizothoracinae fishes serving as the primary second intermediate host. Digramma spp. are found across the rest of China where the goldfish Carassius auratus acts as their common second intermediate host [10]. However, the validity of Digramma has been questioned due to a low level of difference between the species of two genera in the cox1, nad1 and ITS rDNA sequences [11][12][13]. Thus Digramma is considered to be synonymous with Ligula [2]. Owing to the fact that only one gene and a limited number of isolates were included in that study, more sequence data and a greater range of taxa from different genera are required for a more comprehensive phylogenetic analysis of the Diphyllobothriidea [13].
This study, therefore, aimed to sequence and characterize the complete mitogenomes of Digramma interrupta and Ligula intestinalis and to perform phylogenetic analysis to investigate whether or not these two diphyllobothriids are congeneric using mitogenomic data. Differences within the mitochondrial genes were also compared to determine which genes would be suitable for the design of molecular markers as a means to differentiate D. interrupta from L. intestinalis.

Specimen collection and DNA extraction
Plerocercoids of D. interrupta and L. intestinalis were isolated from the body cavity of Carassius auratus collected from Liangzi Lake in Hubei Province and Gymnocypris selincuoensis from Siling Lake, Tibet, China, respectively. The tapeworms were preserved in 80% ethanol and stored at 4°C. Total genomic DNA was extracted from the posterior region of a single tapeworm using a TIANamp Micro DNA Kit (Tiangen Biotech, Beijing, China), according to the manufacturer's instructions. DNA was stored at -20°C for subsequent molecular analysis. The morphological identification of specimens was confirmed using sequence data from the ITS2 rDNA region [13,14] and the cox1 gene [11].
Amplification and DNA sequencing PCR was carried out as described previously [15,16], with minor modifications. Five degenerate primer sets (Additional file 1: Table S1) were designed to primarily amplify partial sequences of the nad5, cytb, nad2, cox1 and rrnS genes. The sequenced fragments were subsequently used to design primers specific for the amplification and sequencing of the whole mitogenome. PCR reactions were performed in a 20 μl reaction mixture, containing 7

Sequence annotation and analyses
The mitogenome was annotated broadly following the procedure described previously [15,16]. The amplified fragments were initially checked by BLASTN [17], before being assembled manually in a stepwise manner. The annotation was recorded in a Word document with the help of the Geneious program [18], using the mitogenome of Dibothriocephalus latus (syn. Diphyllobothrium latum) (NC_008945) as the reference sequence. PCGs were found by searching for ORFs (employing genetic code 9, echinoderm mitochondrial; flatworm mitochondrial) and the nucleotide alignments against the selected reference genome in Geneious. rrnL and rrnS were annotated in a similar way, via comparison with homologs using Geneious. ARWEN [19] and MITOS [20] web servers were employed to identify and fold all tRNAs. Similarly, the NCBI submission file (*.sqn) and tables of statistics for mitogenomes were generated using a home-made GUI-based program, MitoTool [21]. MitoTool was also used to calculate codon usage and relative synonymous codon usage (RSCU) for the 12 PCGs of D. interrupta and L. intestinalis and the results were sorted and imported into ggplot2 [22] to draw the RSCU figure. A Tandem Repeats Finder [23] was used to predict tandem repeats (TR) in the major non-coding regions (mNCRs), and the secondary structures of NCR1 and TR were folded by Mfold software [24]. Non-synonymous (dN) and synonymous (dS) mutation rates among the 12 PCGs of D. interrupta and L. intestinalis were computed using KaKs_-Calculator [25] utilising a modified Yang-Nielsen algorithm. DnaSP v5 [26] was adopted to conduct sliding window analyses. A sliding window of 500 bp and step size of 25 bp was implemented to estimate nucleotide divergence Pi (π) between the alignments of the mitogenomes of D. interrupta and L. intestinalis.

Phylogenetic analyses
The mitogenomes of 35 cestodes, covering five orders and ten families (Additional file 2: Table S2) were obtained from GenBank and were used, along with the two new mitogenomes generated in this study, to create the phylogenetic reconstruction. Two trematodes, Dicrocoelium chinensis (NC_025279) and Dicrocoelium dendriticum (NC_025280), were used as outgroups. All 36 genes (12 PCGs, 2 rRNAs and 22 tRNAs) were used for phylogenetic inference and were extracted from GenBank files using MitoTool. PCGs were aligned in batches using MAFFT and integrated into our own in-house GUI-based program, BioSuite [27], adopting codon-alignment mode. All RNA genes (rRNA and tRNA) were aligned using a structural alignment algorithm Q-INS-i incorporated into MAFFT-with-extensions software [28]. Gaps and ambiguous sites were deleted using GBlocks [29] integrated by BioSuite with default settings. BioSuite was subsequently used to concatenate the sequences into a single alignment and generate phylip and nexus format files. GTR+I+G was chosen as the optimal model of nucleotide evolution for all datasets based on the Akaike information criterion by ModelGenerator [30]. Two analytical methods were performed: maximum likelihood (ML) and Bayesian inference (BI). The ML analysis was performed in RAxML GUI [31] using a ML+rapid bootstrap (BP) algorithm with 1000 replicates. BI analysis was performed in MrBayes 3.2.6 [32] with default settings, 6 × 10 6 metropolis-coupled MCMC generations, and 1000 sample frequency. Stationarity was considered to have been reached when the average standard deviation of split frequencies was below 0.01, ESS (estimated sample size) was above 200, and PSRF (potential scale reduction factor) approached 1. Bayesian posterior probability (BPP) values were calculated in a consensus tree, after discarding the first 25% of samples as burn-in. Finally, the resultant trees were visualised and annotated by iTOL [33] with the help of several dataset files generated by MitoTool.

Genome organization and base composition
The length of the circular mtDNA molecules of D. interrupta (GenBank accession number: MF671697) and L. intestinalis (GenBank accession number: MF671696) was 13,685 bp and 13,621 bp, respectively. Both mitogenomes were composed of 12 PCGs, 22 tRNA genes, two rRNA genes and two mNCRs (major non-coding regions), all of which were transcribed from the same strand (Fig. 1). As commonly reported for flatworms [34], the two mitogenomes lacked the atp8 gene. The gene order of the two mitogenomes was identical, conforming to the synapomorphic gene arrangement of the order Diphyllobothriidea [35]. The A+T content of the mitogenomes of D. interrupta and L. intestinalis were 67.9% and 67%, respectively, which is in accordance with that of other cestodes (Additional file 2: Table S2).

Protein-coding genes and codon usage
Concatenated PCGs of the mitogenome of D. interrupta and L. intestinalis were both 10,062 bp in size, with an A+T content of 67.5% and 66.6%, respectively ( Table 2). The high A+T content was mainly concentrated on the third codon position (72.4% for D. interrupta and 70.1% for L. intestinalis). The start and termination codons within the mitogenome of D. interrupta and L. intestinalis were identical to one another. GTG was identified as the initial codon for cox3, and ATG for the rest of the 12 PCGs (Table 1). For each PCG, however, all selected Diphyllobothriidea species shared the same start codon (Additional file 3: Table S3), indicating that it may be a synapomorphy within this order. Amongst the termination codons, five out of 12 were identified as TAA, six as TAG, while cox3 used a truncated T stop codon. Codon usage, RSCU, and codon family proportion (corresponding to the amino acid usage) among D. interrupta and L. intestinalis were investigated (Additional file 4: Figure S1). The most abundant codon families were Phe, Leu2, and Ile within the two mitogenomes, which show a preference for the A+T-rich synonymous codons (Additional file 4: Figure S1). This corresponds to the high A+T bias of the two diphyllobothriid mitogenomes.
We measured the selective pressure acted upon amino acid replacement mutations by the ratio of non-synonymous (dN) to synonymous (dS) substitutions for all 12 PCGs of D. interrupta vs L. intestinalis.
Although the values (dN/dS) of atp6 (0.113), nad5 (0.111) and nad2 (0.110) genes were higher than cox1 a b Fig. 2 a Ratios of non-synonymous (dN) to synonymous (dS) substitution rates estimated from individual protein-coding genes of Digramma interrupta and Ligula intestinalis. b Sliding window analysis of the alignment of complete mtDNAs of D. interrupta and L. intestinalis. The black line shows the value of nucleotide diversity in a sliding window analysis of window size 500 bp with step size 25 bp, and the value is inserted at its mid-point. Gene boundaries are indicated with a variation ratio per gene (below each gene) (0.007) and cox2 (0.008) genes (Fig. 2a), all PCGs were under strong negative (purifying) selection (dN/dS < 0.12).

Transfer and ribosomal RNA genes
The sizes of rrnL in both the D. interrupta and L. intestinalis mitogenome was 967 bp, with 68.4% and 67.9% A+T content, respectively. Similarly, the lengths of rrnS were 742 and 743 bp, with an A+T content of 68.2% and 67.5%, respectively ( Table 2). All 22 commonly found tRNAs were present in the mitochondrial genome of D. interrupta and L. intestinalis, adding up to a 1,410 bp total concatenated length in both mitogenomes ( Table 2). All tRNAs could be folded into the conventional cloverleaf structure, with the exception of trnS1 (AGN) and trnR, which lacked DHU arms. The absence of DHU-arms in trnS1 (AGN) and trnR has also been reported in the Caryophyllidea [35] and the Anoplocephalidae [36].

Non-coding regions
The two major non-coding regions (mNCRs), NCR1 and NCR2, were located between trnY and trnL1 and between nad5 and trnG, respectively. The mNCRs were situated in the same location as all diphyllobothriideans surveyed to date (see Additional file 3 in our recent paper [35]). The NCR1 sequences of the two mitogenomes of D. interrupta and L. intestinalis were 223 and 224 bp in length with a heightened A +T bias of 78.9% and 75%, whereas the NCR2 sequences were 249 and 183 bp in size with 73.9% and 74.9% A+T content, respectively ( Table 2). The NCR2 of D. interrupta contained six TRs (tandem repeats). Repeat units 1-5 were identical in nucleotide composition and size (34 bp). Repeat unit 6 was truncated with 29 bp (Fig. 3). TRs were also found in the NCR2 of L. intestinalis, and the consensus repeat (35 bp) was almost identical to that of D. interrupta, with an insertion of a single nucleotide A at the 17th position. Only four repeat units, however, could be found in the mitogenome of L. intestinalis, which contributed to the main difference in length of the complete mitogenome between the two genera. The last repeat unit was also truncated with 22 bp. Both NCR1 and the consensus repeat sequence in NCR2 of the two mitogenomes were

Phylogeny
Both methods (BI and ML) produced phylograms with concordant branch topologies, thus only the latter was shown (Fig. 4). The phylogenetic tree indicated the ordinal topology to be Caryophyllidea + (Diphyllobothriidea + (Bothriocephalidea + (Proteocephalidea + Cyclophyllidea))). Within the family Diphyllobothriidae, Ligula and Digramma clustered with maximum nodal support (BP = 100 and BPP = 1), which formed a sister group with the genus Dibothriocephalus. This clade clustered together with Diphyllobothrium, then forming a sister group with Spirometra.

Discussion
The ordinal topology of Caryophyllidea + (Diphyllobothriidea + (Bothriocephalidea + (Proteocephalidea + Cyclophyllidea))) was consistent with previously identified interordinal relationships of tapeworms, when reconstructed from the dataset of nucleotide or amino acid sequences from partial mitogenomes, large and small subunit rRNA genes, and a combination of the former two [37]. Additionally, this relationship identified between tapeworm groups was congruent with latest mitochondrial phylogenomics [35]. However, a sister-group relationship between the orders Diphyllobothriidea and Bothriocephalidea has been suggested based on mitochondrial phylogenomics [38,39]. This inconsistency may be due to the different methods of phylogenetics employed.
Within the family Diphyllobothriidae, the phylogenetic relationship of the three genera Spirometra, Diphyllobothrium and Dibothriocephalus was congruent with that of recent studies on the phylogenetics of Eucestoda based on mitogenomes, with the topology of Spirometra In the present study, however, Ligula and Digramma were closely related to one another with maximum nodal support, then forming a sister group with the genus Dibothriocephalus. Inferred by the ITS2 rDNA sequence [13] and the 18S rDNA gene [42], the complexes of Ligula and Digramma have also been shown to be closely related to Dibothriocephalus. Further studies using the concatenated nucleotide sequences of 18S rDNA + 28S rDNA + rrnL + cox1, have again demonstrated the genus Dibothriocephalus to be the sister group of Ligula [5]. These phylogenetic results suggest that Dibothriocephalus is the most closely related genus to Ligula and Digramma.
However, mitogenome sequence identity between D. interrupta and L. intestinalis is 92.7%, which is much higher than between either of these species and the represented members of Dibothriocephalus (85.8-86.5%). Furthermore, high mitogenome sequence identity was also found between the congeners in Dibothriocephalus (92.3%), Diphyllobothrium (99.2%) and Spirometra (99.3%). These results suggest that sequence differences between D. interrupta and L. intestinalis are of a degree expected between members of the same genus.
In one study, sequence identity of the cox1 and nad1 genes between D. interrupta and L. intestinalis has been shown to be 100% and 92.6% [11]; however, identity was deemed at 93.5% and 93.0% in the present study, respectively (Table 1). This inconsistency may be due to the use of partial sequence of cox1 and nad1 genes or resulting from the use of formalin-preserved specimens [12]. The gene cox1 is considered to be a useful barcode for metazoans [43], and widely employed for cestode studies [44][45][46][47]. The two mitochondrial genes cox1 and cytb have also been used to study the population genetic structure of L. intestinalis on a local and global scale [14]. However, a lower sequence identity was found in the nad2, nad4, nad5, nad6 and cox3 genes between D. interrupta and L. intestinalis (90-92%), in comparison to the moderate variation seen between the cox1, cytb and nad1 genes (Fig. 5). Additionally, the relatively looser selection pressure of nad5 (0.111) and nad2 (0.110) may accelerate the accumulation of non-synonymous substitutions, which would increase variation of the two genes [48]. These results suggest that the nad2, nad4, nad5, nad6 and cox3 genes should be considered as optimal candidates for genetic markers to be used for population genetics and species identification studies between the two closely related species, D. interrupta and L. intestinalis.

Conclusions
The complete mitogenomes of Ligula intestinalis and Digramma interrupta were sequenced and characterized. The mitogenomes of these two species show a higher identity to each other than to any species in closely related genera. The two mitogenomes consistently support D. interrupta to be a congeneric species with L. intestinalis. High sequence variation in the nad2, nad4, nad5, nad6 and cox3 genes between the two diphyllobothriids suggest that these five genes should be considered as optimal candidates for genetic markers when studying population genetics or looking to differentiate the two closely related species, D. interrupta and L. intestinalis.

Additional files
Additional file 1: Table S1. Primers used to amplify and sequence the mitochondrial genome of Digramma interrupta and Ligula intestinalis.