Complete mitochondrial genome of the giant liver fluke Fascioloides magna (Digenea: Fasciolidae) and its comparison with selected trematodes

Background Representatives of the trematode family Fasciolidae are responsible for major socio-economic losses worldwide. Fascioloides magna is an important pathogenic liver fluke of wild and domestic ungulates. To date, only a limited number of studies concerning the molecular biology of F. magna exist. Therefore, the objective of the present study was to determine the complete mitochondrial (mt) genome sequence of F. magna, and to assess the phylogenetic relationships of this fluke with other trematodes based on the mtDNA dataset. Findings The complete F. magna mt genome sequence is 14,047 bp. The gene content and arrangement of the F. magna mt genome are similar to those of Fasciola spp., except that trnE is located between trnG and the only non-coding region in the F. magna mt genome. Phylogenetic relationships of F. magna with selected trematodes were reconstructed by Bayesian inference (BI) based on the concatenated amino acid sequences of 12 protein-coding genes, which confirmed that the genus Fascioloides is closely related to the genus Fasciola; the intergeneric differences in amino acid composition between the genera Fascioloides and Fasciola ranged from 17.97 to 18.24 %. Conclusions The determination of the F. magna mt genome sequence provides a valuable resource for further investigations of the phylogeny of the family Fasciolidae and other trematodes, and represents a useful platform for designing appropriate molecular markers. Electronic supplementary material The online version of this article (doi:10.1186/s13071-016-1699-7) contains supplementary material, which is available to authorized users.


Background
Fascioloides magna (Bassi, 1875), the type and only species of the genus Fascioloides Ward, 1917, was first described as Distomum magnum in 1875 [1]. Later, in 1917, Ward erected the genus Fascioloides for Fasciola magna (Bassi, 1875) [2]. Fascioloides magna, known as the large American liver fluke, giant liver fluke or deer fluke, is an important digenetic trematode of the family Fasciolidae [3,4]. This species, which is of North American origin [5,6] and invasive in European countries [7], has high potential to colonize new geographic territories (a variety of wild and domestic ungulates [3,8-10]), and can establish expanding populations from a natural epidemic focus through translocated hosts [5,6,11]. Migration of immature F. magna flukes within the host body often leads to profound damage to the liver and other organ tissues [8,12], causing economic losses worldwide [13].
The consequences of infection of various intermediate and definitive hosts by F. magna have been intensively studied [8,12], but molecular research on this fluke has not received enough attention [4,9]. To date, a sequence of nuclear ribosomal DNA (rDNA) of F. magna was obtained in 2008 [14], and partial sequences of mitochondrial (mt) genes, such as cytochrome c oxidase subunit 1 (cox1) and NADH dehydrogenase subunit 1 (nad1), have been characterized [3]. According to these data, F. magna was divided into two mt haplotype groups [5,14,15]: the first haplotype represents isolates from western North America and Italy, and the second represents isolates from eastern North America and some European countries such as the Czech Republic, Poland and Croatia [3,5]. Recently, the F. magna transcriptome was reported, which provides a useful platform for further fundamental studies of this fluke [16], but the complete mt genome of F. magna is still unavailable.
Molecular tools using genetic markers in mitochondrial DNA (mtDNA) sequences have proven reliable for the identification and differentiation of trematode species [17-20]. In the present study, we determined the mitochondrial genome sequence of F. magna (Czech isolate) using a PCR-coupled sequencing technique combined with bioinformatic analysis, and for the first time assessed its phylogenetic relationship with selected trematodes based on the nucleotide and inferred amino acid sequences of the protein-coding genes.

Sampling and DNA extraction
Three adult F. magna worms were isolated from the livers of naturally infected red deer (Cervus elaphus) hunted in the Kokořínsko area, Czech Republic. Worms were washed in 0.1 M phosphate-buffered saline (PBS), pH 7.2, fixed in 70 % (v/v) ethanol and preserved at -20°C until further use. Total genomic DNA was extracted from individual F. magna specimens using sodium dodecyl sulfate (SDS)/proteinase K treatment [21] and column purification (Wizard® SV Genomic DNA Purification System, Promega, Madison, USA), according to the manufacturer's protocol.

Acquisition of ITS rDNA and sample identification
The internal transcribed spacer (ITS) rDNA region of each of the three F. magna specimens, spanning partial 18S rDNA, the complete ITS-1, 5.8S rDNA, ITS-2, and partial 28S rDNA, was amplified using primers BD1 (forward; 5'-GTC GTA ACA AGG TTT CCG TA-3') and BD2 (reverse; 5'-ATG CTT AAA TTC AGC GGG T-3') [22] and sequenced using the same primers. These F. magna samples had ITS-1 and ITS-2 sequences identical to the corresponding sequences available in GenBank (EF051080).

Long-range PCR-based sequencing of mt genome
The primers were designed based on relatively conserved regions of mtDNA sequences from Fasciola hepatica and Fasciola gigantica. The entire mt genome from a single specimen of F. magna was amplified in 5 overlapping fragments, using the primers shown in Additional file 1: Table S1.
PCR reactions were conducted in a total volume of 50 μl, using 25 μl PrimeStar Max DNA polymerase premix (Takara, Dalian, China), 25 pmol of each primer (synthesized by Genewiz, Suzhou, China), 0.5 μl of DNA template, and H2O, in a thermocycler (Biometra, Göttingen, Germany). PCR cycling conditions started with an initial denaturation at 98°C for 2 min, followed by 22 cycles of denaturation at 92°C for 18 s, annealing at 52-65°C for 12 s and extension at 60°C for 1-5 min, followed by denaturation at 92°C for 2 min, plus 25 cycles of 92°C for 18 s (denaturation), 50-67°C for 12 s (annealing) and 66°C for 3-6 min (extension), with a final extension step at 66°C for 10 min. A negative control (no DNA) was included in each amplification run. Amplicons (2.5 μl) were electrophoresed in a 2 % agarose gel, stained with Gold View I (Solarbio, Beijing, China) and photographed using a GelDoc-It TS™ Imaging System (UVP, USA).

Assembly, annotation and bioinformatics analysis
Sequences were assembled manually and aligned against the complete mt genome sequences of Fa. hepatica (GenBank accession No. NC_002546) and Fa. gigantica (NC_024025) using MAFFT 7.122 to infer the boundaries of each gene. Amino acid sequences of the 12 protein-coding genes were inferred using MEGA v.6.06 and NCBI translation table 21 (Trematode Mitochondrial Code). The tRNA genes were identified using the programs tRNAscan-SE [23] and ARWEN (http://130.235.46.10/ARWEN/) or by comparison with those from the Fa. hepatica and Fa. gigantica mt genomes. The two rRNA genes were identified by comparison with those of Fa. hepatica and Fa. gigantica.
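The translation step can be illustrated with a minimal sketch in plain Python (the study itself used MEGA). The function name `translate_table21` and the example sequence are hypothetical; the codon table follows NCBI translation table 21, which differs from the standard code in that TGA encodes Trp, ATA encodes Met, AAA encodes Asn, and AGA/AGG encode Ser.

```python
from itertools import product

# NCBI translation table 21 (Trematode Mitochondrial Code).
# Amino acids are listed for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNNKSSSS"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_table21(seq: str) -> str:
    """Translate a DNA sequence codon by codon; unknown codons become 'X'.

    Note: an initiating GTG is translated here as V, like any internal GTG;
    annotation tools usually report the initiator residue as M.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Hypothetical example sequence, not taken from the F. magna genome.
print(translate_table21("GTGAAATGAATATAG"))  # VNWM*
```

In this toy sequence, TGA and ATA are read as Trp and Met, and AAA as Asn, which is where a translation with the standard code would go wrong for trematode mt genes.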
A comparative analysis of the nucleotide sequences of each protein-coding gene, the amino acid sequences, two ribosomal RNA genes, 22 tRNA genes as well as non-coding regions (NCRs) among F. magna, Fa. hepatica and Fa. gigantica was conducted.
All inferred amino acid sequences were aligned using MAFFT 7.122. Poorly aligned sites and divergent regions of the alignment were eliminated using Gblocks Server v. 0.91b (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) with default settings, selecting the option of less strict conservation of flanking positions. The alignment was then converted into nexus format using Clustal X 1.83 and subjected to phylogenetic analysis using Bayesian inference (BI). A mixed model was used for the BI analysis in MrBayes 3.1.1 [24], because the most suitable model of amino acid evolution, JTT + G + F, selected by ProtTest 3.4 based on the Akaike information criterion (AIC) [25], was not available in the current MrBayes version. Four independent Markov chains were run for 10,000,000 Metropolis-coupled MCMC generations, sampling trees every 1,000 generations. The first 2,500 trees (25 %) were discarded as 'burn-in', and the remaining trees were used for calculating Bayesian posterior probabilities. The analysis was regarded as completed when the potential scale reduction factor was close to 1 and the average standard deviation of split frequencies was below 0.01. Phylograms were prepared using FigTree v. 1.42 [26].
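The sampling arithmetic behind the burn-in can be checked with a short sketch. The function name `burn_in_summary` is hypothetical, and the calculation ignores any tree recorded for the starting state at generation 0, which is an assumption of this example rather than something stated in the text.

```python
def burn_in_summary(generations: int, sample_every: int, burn_in_fraction: float):
    """Return (sampled, discarded, kept) tree counts for one MCMC run.

    Assumes one tree is saved every `sample_every` generations and that the
    starting state at generation 0 is not counted.
    """
    sampled = generations // sample_every
    discarded = int(sampled * burn_in_fraction)
    return sampled, discarded, sampled - discarded


# Values from the text: 10,000,000 generations, sampling every 1,000, 25 % burn-in.
sampled, discarded, kept = burn_in_summary(10_000_000, 1_000, 0.25)
print(sampled, discarded, kept)  # 10000 2500 7500
```

Under these assumptions, 10,000 trees are sampled per run, 2,500 are discarded and 7,500 remain for calculating posterior probabilities.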

Genome content and organization
The complete mt genome sequence of F. magna (GenBank accession no. KR006934) is 14,047 bp in length (Fig. 1) and contains 36 genes that are transcribed in the same direction, including 12 protein-coding genes (nad1-6, nad4L, cox1-3, atp6 and cytb), 22 tRNA genes and two rRNA genes (rrnL and rrnS). It lacks the atp8 gene (Table 1), consistent with the mt genomes of selected trematode species available in GenBank [17-19, 27, 28]. There is only one NCR in the F. magna mt genome, whereas the mt genomes of Fasciola flukes have two non-coding regions [17,27].
The arrangement of genes in the F. magna mt genome is similar to that of Fasciola spp. [17], except that the F. magna mt genome has only one non-coding region (NCR), located between trnE (13,355-13,422) and cox3 (1-645) (Table 1). The gene order of the F. magna mt genome is similar to that in species of the Paramphistomatidae, Notocotylidae, Echinostomatidae, Heterophyidae and Opisthorchiidae, but is distinct from that of some flukes of the Schistosomatidae (S. mansoni, S. spindale and S. haematobium) [29].
The nucleotide composition of the F. magna mt genome is clearly biased towards A and T. The total A + T content of F. magna mtDNA is 61.42 %, within the range reported for other trematode mt genomes (from 54.38 % in Paragonimus westermani Indian isolates [30] to 72.71 % in Schistosoma spindale [29]). The content of C is low (10.3 %) and that of T is high (44.0 %). The A + T content of individual genes or regions of the F. magna mt genome ranges from 48.48 % (trnL2) to 68.18 % (trnG) (e.g. nad3, 64.43 %; cox2, 59.7 %). All 12 protein-coding genes of F. magna mtDNA possess a lower A + T percentage than those of Fa. hepatica and Fa. gigantica [17,27], except for nad5 (Additional file 2: Table S2).
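Composition values of this kind can be computed per gene or per region with a few lines of Python. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and ambiguous bases are simply ignored, which is an assumption rather than the authors' stated procedure.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq: str) -> float:
    """A + T content of a sequence, in percent."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]


# Toy sequence (3 A, 5 T, 1 G, 1 C), not taken from the F. magna genome.
toy = "ATTTGCATTA"
print(base_composition(toy))
print(at_content(toy))
```

Applied to each annotated gene or region in turn, the same function would yield a per-gene table comparable to the A + T values reported above.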

Annotation of F. magna mt genome
In the mt genome of F. magna, the protein-coding genes had ATG or GTG as start codons and TAG or TAA as stop codons (Table 1). Half of the protein-coding genes of F. magna were initiated with GTG (nad4L, nad4, nad1, cox1, nad6 and nad5). Incomplete codons were not detected in the mt genome of F. magna.
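A simple consistency check of annotated coding sequences against these rules might look like the following sketch. The function name `check_orf` and the example sequences are hypothetical; the accepted start and stop codons are those listed above.

```python
START_CODONS = {"ATG", "GTG"}  # start codons reported for F. magna
STOP_CODONS = {"TAG", "TAA"}   # stop codons reported for F. magna


def check_orf(seq: str) -> dict:
    """Check start codon, stop codon and codon completeness of a coding sequence."""
    seq = seq.upper()
    return {
        "start_ok": seq[:3] in START_CODONS,
        "stop_ok": seq[-3:] in STOP_CODONS,
        "complete_codons": len(seq) % 3 == 0,  # False if a partial codon remains
    }


# Hypothetical sequences, not taken from the F. magna genome.
print(check_orf("GTGTTTGGGTAG"))  # all three checks pass
print(check_orf("ATGTTTGGGTA"))   # truncated stop codon, length not a multiple of 3
```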
Fig. 1 Organization of the mitochondrial genome of Fascioloides magna. The scales are approximate. All genes are transcribed in the clockwise direction, using standard nomenclature. "NCR" refers to the only non-coding region in F. magna data. The A + T content is shown for each gene or region of the mt genome and represented by colour
The 22 tRNA genes of the F. magna mt genome range from 57 to 69 bp in length. The structure of all tRNA sequences is similar to those of Fa. hepatica and Fa. gigantica [17,27]. The large ribosomal RNA gene (rrnL) and the adjacent small ribosomal RNA gene (rrnS) are located between trnT and cox2, and separated by trnC (9,456-9,518) (Table 1). The lengths of the rrnL and rrnS genes are 984 bp and 765 bp, respectively. The only NCR of the F. magna mt genome is 520 bp in length and is located between trnE and cox3. It contains two complete direct repeats: six copies of a 23-nt repeat A (AGA TAG GAT AGG CAT CTG GTA TA) and five copies of a 37-nt repeat B (GGT GCC CCC GGT GAA GGG GGA AAA GGA AGG TTG TAA G). There are five AB repeats, with one A at the end (located at positions 13,620-13,642).
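The repeat structure described for the NCR can be illustrated by counting motif copies in a sequence. The sketch below builds a synthetic array of five AB units followed by one A from the two repeat sequences given in the text; it does not use the real NCR sequence, and the helper names `count_copies` and `motif_starts` are hypothetical.

```python
import re

# Repeat motifs as given in the text (spaces removed).
REPEAT_A = "AGATAGGATAGGCATCTGGTATA"                # 23 nt
REPEAT_B = "GGTGCCCCCGGTGAAGGGGGAAAAGGAAGGTTGTAAG"  # 37 nt


def count_copies(region: str, motif: str) -> int:
    """Count non-overlapping exact copies of a motif in a region."""
    return region.upper().count(motif)


def motif_starts(region: str, motif: str) -> list:
    """0-based start positions of exact motif copies within the region."""
    return [m.start() for m in re.finditer(re.escape(motif), region.upper())]


# Synthetic repeat array: five AB units followed by a single A.
synthetic = (REPEAT_A + REPEAT_B) * 5 + REPEAT_A

print(len(REPEAT_A), len(REPEAT_B))            # 23 37
print(count_copies(synthetic, REPEAT_A))       # 6
print(count_copies(synthetic, REPEAT_B))       # 5
print(len(synthetic))                          # 323
print(motif_starts(synthetic, REPEAT_A)[:2])   # [0, 60]
```

The six A copies and five B copies together account for 323 nt of repeat sequence in this synthetic array. Only exact copies are counted, so degenerate repeats in a real sequence would need an approximate-matching approach.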

Comparative analysis among mt genomes of F. magna, Fa. hepatica and Fa. gigantica
The difference between complete mt genomes of F. magna and Fa. hepatica was 22.66 % (3,290 nt), which is  (Table 2). At the nucleotide level, sequence differences in protein-coding genes ranged from 13.1 to 24.2 % (between F. magna and Fa. hepatica) and from 12.8 to 26.2 % (between F. magna and Fa. gigantica), with cox1, nad1, nad4L and cytb being the most conserved genes, and nad6, nad5 and nad2 being the least conserved genes among the three species. At the amino acid level, sequence differences ranged from 9.2 to 25.4 % between F. magna and Fa. hepatica, and from 8.4 to 27.8 % between F. magna and Fa. gigantica: cox1, cytb, nad4L and nad1 were the most conserved protein-coding genes, while nad6, nad2 and nad5 were the least conserved.
Comparisons between the mt genomes of F. magna and Fasciola spp., at both nucleotide and amino acid levels, indicate that the most conserved and the least conserved genes in the Fasciolidae are cox1 and nad6, respectively. In addition, nad5 is highly variable, whereas nad4L and cytb are relatively conserved. These characteristics are in accordance with those of flukes of the families Paramphistomatidae and Notocotylidae [18,28].
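Percentage differences of the kind reported above can be obtained from a pairwise alignment as the proportion of differing positions. The following sketch is one possible way to compute it; the function name `percent_difference` is hypothetical, and skipping gapped columns is an assumption, since the text does not state how gaps were treated.

```python
def percent_difference(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of differing positions between two aligned sequences.

    Columns containing a gap in either sequence are skipped (an assumption
    of this sketch; other conventions count gaps as differences).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


# Toy alignments, not taken from the genomes compared in the text.
print(percent_difference("ATGCATGC", "ATGTATGA"))  # 25.0 (2 of 8 positions differ)
print(round(percent_difference("ATG-ATGC", "ATGCATGA"), 2))  # 14.29 (1 of 7)
```

The same function works for aligned amino acid sequences, so it could be applied to each protein-coding gene at both the nucleotide and the amino acid level.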
Nucleotide differences were also found in ribosomal RNA genes: between F. magna and Fa. hepatica

Phylogenetic analysis
In the phylogenetic tree inferred from the concatenated amino acid sequence dataset of all 12 mt proteins (Fig. 2), F. magna clustered with three Fasciola species with strong support (Bpp = 1). The closest family to the Fasciolidae is the Echinostomatidae, represented by Hypoderaeum sp. The taxonomic relationships of the selected trematodes are in concordance with the results of previous studies [17-19,28]. Each node received the maximum possible nodal support (Bpp = 1).
In several recent phylogenetic studies, F. magna was characterized only on the basis of partial 28S rDNA [31] and combined ITS1, ITS2 and nad1 sequences [32]. The relationship between the genera Fasciola and Fasciolopsis was considered to be very close, and the genetic relationship between F. magna and Fasciola jacksoni (or Fascioloides jacksoni) is disputed [31-33]. Further studies are warranted to determine the mt genome of Fa. jacksoni and resolve this controversy in the family Fasciolidae.

Conclusions
The present study determined the complete mt genome sequence of the pathogenic liver fluke F. magna and revealed its close relationship with species of Fasciola. The complete mt genome data of F. magna provide a resource for further investigations of the phylogeny, epidemiology, biology and population genetics of the family Fasciolidae and other trematodes.