The complete mitochondrial genome of a parasite at the animal-fungal boundary

Background: Sphaerothecum destruens is an obligate intracellular fish parasite which has been identified as a serious threat to freshwater fishes. Taxonomically, S. destruens belongs to the order Dermocystida within the class Ichthyosporea (formerly referred to as Mesomycetozoea), which sits at the animal-fungal boundary. Mitochondrial DNA (mtDNA) sequences can be valuable genetic markers for species detection and are increasingly used in environmental DNA (eDNA)-based species detection. Furthermore, mtDNA sequences can be used in epidemiological studies by informing detection, strain identification and geographical spread.
Methods: We amplified the entire mitochondrial (mt) genome of S. destruens in two overlapping long fragments using primers designed based on the cox1, cob and nad5 partial sequences. The mt-genome architecture of S. destruens was then compared to close relatives to gain insights into its evolution.
Results: The complete mt-genome of Sphaerothecum destruens is 23,939 bp in length and consists of 47 genes including 21 protein-coding genes, 2 rRNA, 22 tRNA and two unidentified open reading frames. The mitochondrial genome of S. destruens is intronless and compact with a few intergenic regions and includes genes that are often missing from animal and fungal mt-genomes, such as the four ribosomal proteins (small subunit rps13 and 14; large subunit rpl2 and 16), tatC (twin-arginine translocase component C), and ccmC and ccmF (cytochrome c maturation protein ccmC and heme lyase).
Conclusions: We present the first mt-genome of S. destruens, which also represents the first mt-genome for the order Dermocystida. The availability of the mt-genome can assist the detection of S. destruens and closely related parasites in eukaryotic diversity surveys using eDNA and assist epidemiological studies by improving molecular detection and tracking the parasite's spread. Furthermore, as the only representative of the order Dermocystida, its mt-genome can be used in the study of mitochondrial evolution of the unicellular relatives of animals.

Background
Introduced parasites can cause significant population declines in susceptible species, and generalist parasites in particular are more likely to be introduced, become established and expand their host range [1,2]. The eukaryotic parasite Sphaerothecum destruens is considered a true generalist [1] that can infect and cause high mortalities in freshwater fish species, including commercially important species such as carp and Atlantic salmon [3,4]. Sphaerothecum destruens has been recorded in North America [5][6][7], Europe [8][9][10][11][12] and China [10]. Sana et al. [10] provided data to support that S. destruens was introduced to Europe from China along with the accidental introduction of the invasive fish, topmouth gudgeon Pseudorasbora parva. Gozlan et al. [9] identified P. parva as a reservoir host for S. destruens, i.e. the parasite can be maintained in P. parva and can be transmitted to other fish species whilst not causing disease and mortality in P. parva. Since its introduction to Europe, P. parva has spread to at least 32 countries from its native range in China [13] and S. destruens has been detected in at least 5 introduced P. parva populations [8,10,12,14].
Sphaerothecum destruens is an asexually reproducing intracellular parasite with a direct life-cycle which involves the release of infective spores to the environment through urine and seminal fluids [15]. The spores can survive and release free-living zoospores in the environment at temperatures ranging from 4 °C to 30 °C [16]. Its capacity for environmental persistence and its generalist nature make this parasite a potential risk to fish biodiversity [17]. Thus, efficient detection of this parasite is essential. Molecular detection using the 18S rRNA gene is currently more efficient than traditional histology [18]. However, due to the thickened cell wall of S. destruens, molecular detection in hosts with low parasite numbers can be difficult [15]. Developing additional molecular markers, such as mitochondrial DNA markers, could improve detection, as there are multiple copies of mitochondria per cell (although there are also multiple copies of 18S rRNA genes per cell). Furthermore, mitochondrial genes are increasingly used for environmental DNA (eDNA)-based metabarcoding detection, so sequencing the mt-genome of this fish parasite could increase its detection in eDNA-based metabarcoding studies.
In addition to the importance of S. destruens as a potential disease risk for freshwater fishes, its taxonomic position is also evolutionarily important, as it belongs to the class Ichthyosporea (formerly referred to as Mesomycetozoea) which sits at the animal-fungal boundary (Fig. 1) [19]. The class Ichthyosporea consists of two orders, Dermocystida and Ichthyophonida, with S. destruens grouping within the former [15,19]. Phylogenomic studies placed S. destruens in a new clade termed "Teretosporea", comprising Ichthyosporea and Corallochytrium limacisporum [20]. Teretosporea was found to be the earliest-branching lineage in the Holozoa [20] and so can be used to provide clues into the origins of higher organisms and mtDNA evolution. Ichthyosporea are difficult to culture; therefore, genetic information is often scarce. For example, mitochondrial DNA sequences are lacking for all members of the order Dermocystida.
Here, we have sequenced and present the first complete mt-genome of a species of the Dermocystida, S. destruens, in order to develop new tools for the parasite's detection and provide insights into the parasite's genome architecture evolution.

DNA extraction and sequencing of Sphaerothecum destruens mitochondrial DNA
The S. destruens spores used were obtained from S. destruens culture in EPC cells [4]. Sphaerothecum destruens reproduces asexually so the cultured spores represent clones of a single organism. The partial 18S rRNA gene from this culture has also been sequenced, confirming that this is a culture of S. destruens ([4]; GenBank: MN726743). Total genomic DNA was isolated from S. destruens spores using the DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany). All the steps were performed per manufacturer's guidelines and DNA was eluted in 100 µl elution buffer and quantified using the Nanodrop (Thermo Fisher Scientific, Waltham, USA). A number of universal mtDNA primers for Metazoa and degenerate primers specific for cnidarians were used to amplify short gene fragments of S. destruens mtDNA. The primer pairs were successful in amplifying the short gene fragments of cox1 [21], cob [22] and nad5 [23] of S. destruens mtDNA. The mitochondrial fragments spanning cob-cox1 and cox1-nad5 were amplified using the primer pairs LR-COB-F (5′-ATG AGG AGG GTT TAG TGT GGA TAA TGC-3′) and LR-COX1-R (5′-GCT CCA GCC AAC AGG TAA GGA TAA TAA C-3′); and LR-COX1-R3 (5′-GTT ATT ATC CTT ACC TGT GTT GGC TGG AGC-3′) and LR-NAD5-R1 (5′-CCA TTG CAT CTG GCA ATC AGG TAT GC-3′), respectively, with two long-range PCR kits: the Long range PCR kit (Thermo Fisher Scientific) and the LA PCR kit (Takara, Clontech, Kusatsu, Japan). The remaining regions of the mitochondrial genome were amplified with the modified step-out approach [24]. The step-out approach used the primer Step-out3 (5′-AAC AAG CCC ACC AAA ATT TNN NAT A-3′) coupled with the species-specific primers LR-cob-R2 (5′-TCA ACA TGC CCT AAC ATA TTC GGA AC-3′) and LR-nad5-R4 (5′-TGG GGC AAG ATC CTC ATT TGT-3′).
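The primer sequences listed above can be sanity-checked with a few lines of Python. The sketch below is illustrative only: the helper names `clean` and `reverse_complement` are our own and are not part of the original protocol. It removes the triplet spacing and computes a reverse complement. For the sequences as printed, the reverse complement of LR-COX1-R shares its first 17 and last 11 nucleotides with LR-COX1-R3, which is consistent with the two primers annealing to the same cox1 site on opposite strands.

```python
# Hypothetical helpers for checking primer sequences (names are illustrative).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")  # N stays N (degenerate base)


def clean(primer: str) -> str:
    """Remove the triplet spacing used in the text and upper-case the bases."""
    return "".join(primer.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer sequences copied from the text (5' to 3').
LR_COX1_R = clean("GCT CCA GCC AAC AGG TAA GGA TAA TAA C")
LR_COX1_R3 = clean("GTT ATT ATC CTT ACC TGT GTT GGC TGG AGC")

rc = reverse_complement(LR_COX1_R)
print(len(LR_COX1_R), len(LR_COX1_R3))  # primer lengths in nucleotides
print(rc)                               # reverse complement of LR-COX1-R
print(rc[:17] == LR_COX1_R3[:17], rc[-11:] == LR_COX1_R3[-11:])
```

The same helpers apply to the degenerate Step-out3 primer, since `N` is mapped to itself.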

Gene annotation
Gene annotation of the mitochondrial genome of S. destruens was performed using the automated annotation tool MFannot (http://megasun.bch.umontreal.ca/cgi-bin/mfannot/mfannotInterface.pl), followed by visual inspection. Gene annotation was further checked by examining the amino acid sequences of the genes. Genes were translated using the mold, protozoan, and coelenterate mitochondrial code and the mycoplasma/spiroplasma code and aligned with homologous proteins using Clustal W with default options (gap open cost: 15; gap extend cost: 6.66). The 22 tRNA genes were further scanned and secondary structures were generated with MITOS [25]. The annotation for the tatC gene was further checked by predicting its secondary structure and comparing it to the secondary structure of two homologous proteins from Monosiga brevicollis and Oscarella carmela.
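The translation step can be illustrated with a minimal sketch. The mold, protozoan, and coelenterate mitochondrial code and the mycoplasma/spiroplasma code correspond to NCBI translation table 4, whose codon-to-amino-acid assignments differ from the standard code only in reading TGA as tryptophan rather than stop. The function names and the toy sequence below are hypothetical and are not taken from the S. destruens genome; the study itself relied on MFannot and Clustal W rather than on code like this.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(reassign=None):
    """Build a codon -> amino acid dict, optionally overriding some codons."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    table = dict(zip(codons, STANDARD_AAS))
    table.update(reassign or {})
    return table


STANDARD_TABLE = build_table()
MITO_TABLE_4 = build_table({"TGA": "W"})  # TGA encodes Trp instead of stop


def translate(dna, table):
    """Translate in frame 1 until the first stop codon; unknown codons become X."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


toy_orf = "ATGTCAAAATGATAA"  # made-up example: Met-Ser-Lys-(TGA)-stop
print(translate(toy_orf, STANDARD_TABLE))  # TGA read as stop
print(translate(toy_orf, MITO_TABLE_4))    # TGA read as Trp
```

Translating with the wrong table truncates any protein that contains an in-frame TGA, which is why the choice of genetic code matters before aligning against homologous proteins.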
tRNA phylogenetic analysis
tRNA replication was further investigated through phylogenetic analysis using the identified tRNAs from S. destruens and the reported tRNAs from its closest relative A. parasiticum (GenBank: AF538045 and AF538046; but note that the two species belong to two different orders). Prior to phylogenetic analysis, all tRNA sequences were modified [24]. Specifically, all tRNA sequences had their anticodon sequence and variable loops deleted, and CCA was added to all tRNA sequences in which it was missing. The sequences were then aligned using Muscle in Seaview [25,26] followed by visual inspection. A neighbour-joining tree was constructed in MEGA X [27], using 1000 bootstraps and p-distance to calculate evolutionary distance with the pairwise deletion option for a total of 56 sequences (22 from S. destruens and 24 from A. parasiticum (GenBank: AF538045 and AF538046)).
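Two of the steps described above are simple enough to sketch in Python: appending a missing 3′ CCA, and computing a p-distance with pairwise deletion, i.e. the proportion of differing sites among the alignment columns where neither sequence has a gap. This is a minimal illustration under our own assumptions (function names, `-` as the gap character, toy aligned sequences); the analysis in the study was carried out in Seaview and MEGA X, not with this code.

```python
def ensure_cca(trna: str) -> str:
    """Append the 3' CCA tail if the tRNA sequence does not already end with it."""
    trna = trna.upper()
    return trna if trna.endswith("CCA") else trna + "CCA"


def p_distance(a: str, b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: columns where either sequence has a gap are skipped
    for this pair only, rather than being removed from the whole alignment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned sequences (made up), 8 comparable sites with 2 differences.
print(ensure_cca("ACGTAC"))
print(p_distance("ACGT-ACCA", "ACGA-ATCA"))
```

A neighbour-joining tree is then built from the matrix of such pairwise distances.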

Gene content and organization
The mitochondrial genome of S. destruens was 23,939 bp long with an overall A+T content of 71.2% (Fig. 1). A list of gene order, gene length, and intergenic spacer regions of S. destruens mtDNA is given in Table 1. The nucleotide composition of the entire S. destruens mtDNA sequence is 40.8% thymine, 31% adenine, 19.7% guanine and 8.5% cytosine (detailed nucleotide composition is listed in Table 2). It consisted of a total of 47 genes including protein-coding genes (21), rRNA (2), tRNA (22) and two unidentified open reading frames (ORFs), with all genes encoded by the same strand in the same transcriptional orientation (Fig. 2).
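Composition figures of this kind are obtained by counting bases over the full sequence. A minimal sketch, assuming the assembled genome is available as a plain string; the function names and the ten-base toy sequence are illustrative and do not come from the study:

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of each unambiguous base (A, C, G, T) in a DNA sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100 * counts[base] / total for base in "ACGT"}


def at_content(seq: str) -> float:
    """A+T content as a percentage of unambiguous bases."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]


toy_sequence = "ATTTAGGCTA"  # made-up example: 3 A, 4 T, 2 G, 1 C
print(base_composition(toy_sequence))
print(at_content(toy_sequence))
```

Ambiguous bases are ignored here; a different convention (for example counting them in the denominator) would shift the percentages slightly.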

Ribosomal RNA and transfer RNA genes
Genes for the small and large subunits for mitochondrial rRNAs (rrnS and rrnL, respectively) were present. They
Twenty-two tRNA genes, including three copies of trnM, were identified in S. destruens mtDNA. The tRNA genes had a length range of 71-80 bp and their predicted secondary structures had a clover leaf shape (Fig. 3). The three copies of trnM (methionine) had the same length (71 bp) and the same anticodon (CAT). trnM1 was located 1713 bp from trnM2, whereas trnM2 and trnM3 were adjacent (Fig. 2). Two serine and two arginine tRNA genes were differentiated by their anticodon sequences: trnS1 (GCT) and trnS2 (TGA), which were 70% similar, and trnR1 (ACG) and trnR2 (TCT), which were 63% similar. All the tRNA secondary structures had a dihydrouridine (DHU) arm, a pseudouridine (TΨC) arm and an anticodon stem; trnS1 (GCT) additionally had a short variable loop. The TΨC loop and D-loop were composed of 7 and 7-10 nucleotides, respectively (Fig. 3).
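The anticodons quoted above can be related to the codons they pair with by taking the reverse complement. The sketch below does this for the methionine, serine and arginine tRNAs and looks up each codon in NCBI translation table 4 (the code in which TGA specifies tryptophan). It reports only the strict Watson-Crick partner; wobble pairing, which lets one tRNA read several codons, is deliberately ignored, and the variable and function names are our own.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 4 (TGA = Trp), codons ordered TTT, TTC, TTA, TTG, TCT, ...
TABLE_4_AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), TABLE_4_AAS))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def codon_for_anticodon(anticodon: str) -> str:
    """Watson-Crick partner codon (5'->3') of an anticodon given 5'->3' as DNA."""
    return anticodon.upper().translate(COMPLEMENT)[::-1]


# Anticodons as reported in the text.
ANTICODONS = {
    "trnM": "CAT",
    "trnS1": "GCT",
    "trnS2": "TGA",
    "trnR1": "ACG",
    "trnR2": "TCT",
}

for gene, anticodon in ANTICODONS.items():
    codon = codon_for_anticodon(anticodon)
    print(gene, anticodon, codon, CODON_TABLE[codon])
```

Each anticodon maps to a codon for the expected amino acid (M, S, S, R, R), which is a quick consistency check on the tRNA labels.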

Non-coding regions
The total length of the non-coding regions was 842 bp, distributed across 32 intergenic sequences ranging in size from 1 to 357 bp. Only two intergenic regions had lengths greater than 100 bp: (i) the non-coding region 1 (NCR 1) was 357 bp long and was located between the tatC and nad2 genes; and (ii) the non-coding region 2 (NCR 2) was 117 bp and was located between the trnL and ccmF genes (Fig. 2).
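As a worked piece of arithmetic, the figures above can be turned into proportions of the 23,939 bp genome. This is a back-of-the-envelope sketch: the variable names are ours, and the simple subtraction ignores gene overlaps, so it only approximates the coding portion quoted in the Discussion.

```python
GENOME_LENGTH_BP = 23_939   # total mt-genome length reported in the text
NONCODING_BP = 842          # summed length of the 32 intergenic sequences
NCR1_BP, NCR2_BP = 357, 117  # the two intergenic regions longer than 100 bp

noncoding_pct = 100 * NONCODING_BP / GENOME_LENGTH_BP
coding_pct_estimate = 100 - noncoding_pct  # ignores gene overlaps
large_ncr_share = 100 * (NCR1_BP + NCR2_BP) / NONCODING_BP

print(f"non-coding: {noncoding_pct:.2f}% of the genome")
print(f"coding (rough estimate): {coding_pct_estimate:.2f}%")
print(f"NCR 1 + NCR 2: {large_ncr_share:.1f}% of all non-coding sequence")
```

On these numbers the two long regions account for a little over half of all intergenic sequence, with the remaining 30 spacers sharing 368 bp.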

tRNA phylogenetic analysis
The phylogenetic analysis of the tRNAs of S. destruens and A. parasiticum showed that the majority of tRNAs grouped by species, with few interspecies groupings (Fig. 4). The phylogenetic results suggest that some of the tRNA genes of S. destruens could have evolved by gene recruitment; these genes were trnV (TAC) and trnL (TAG), indicated by the black arrow in Fig. 4. For A. parasiticum, gene recruitment is suggested for trnM, trnI, trnV, trnT and trnA (white arrow in Fig. 4), as already suggested by Lavrov & Lang [32].

Discussion
The mt-genome of Sphaerothecum destruens is remarkably compact when compared to other unicellular organisms in similar taxonomic positions and shows the presence of gene overlaps and an absence of both long intergenic regions and repeat sequences. The mt-genome of S. destruens has the highest coding portion, 96.4%, among the unicellular relatives of animals, with other members showing much smaller coding regions, e.g. M. brevicollis (47%) and A. parasiticum (20%). In addition, S. destruens had extensive gene loss, especially for ribosomal proteins, compared to species within the Filasterea and Choanoflagellatea, with only four ribosomal genes in its mitochondrial genome and only 22 tRNAs.
Table 4 Comparison of mt protein genes in Sphaerothecum destruens (SD) with its close relatives within the Ichthyophonida Amoebidium parasiticum (AP), the choanoflagellate Monosiga brevicollis (MB), and the Filasterea Capsaspora owczarzaki (CO) and Ministeria vibrans (MV). a Data for A. parasiticum and M. brevicollis from [28]; data for C. owczarzaki and M. vibrans from [32]
The presence of tatC in S. destruens represents the first record of this gene within the class Ichthyosporea. TatC has also been reported in M. brevicollis, a choanoflagellate representing the closest unicellular relatives to multicellular animals, and in multicellular animals such as the sponge O. carmela [29]. The tatC gene (also known as ymf16 and mttB) codes for the largest subunit of the twin-arginine transport system pathway and functions in the transport of fully folded proteins and enzyme complexes across membranes [33]. Support for its presence within the S. destruens mt-genome was based on sequence similarity and secondary structure comparisons to homologous proteins in M. brevicollis and O. carmela (Additional file 1: Figure S1). All three homologous tatC proteins have a Met initiation codon, with the tatC from S. destruens and M. brevicollis also having the same amino acids following the initiation codon (Ser and Lys). The overall amino acid sequence similarity between the tatC in S. destruens and its homologues in M. brevicollis and O. carmela was 21% and 16%, respectively, and all homologous genes had predicted secondary structures encompassing 6 transmembrane domains, consistent with their transmembrane localisation.
Ten genes displayed overlapping regions, with these regions ranging from 1 to 46 nucleotides. Similar levels of gene overlaps have been described in other species [34,35]. The trnN and rnl genes overlap by 46 nucleotides. The overlap is supported by the percentage similarity between the rnl sequences of S. destruens and M. brevicollis, which is 54% (Table 4). The genes nad3 and tatC overlap by 31 nucleotides and are 44% similar (Table 4). As transcription of the S. destruens mitochondrial genome has not been examined, the transcription mechanisms for these proteins can only be hypothesised. A potential mechanism could be the transcription mechanism described for ATPase subunits in mammalian mitochondrial genomes [36].
Fig. 3 The predicted secondary structures of 22 tRNAs of Sphaerothecum destruens mitochondrial DNA generated in MITOS [25]. The tRNA abbreviations are: trnA (transfer RNA alanine), trnL (transfer RNA leucine), trnM1-3 (transfer RNA methionine), trnC (transfer RNA cysteine), trnD (transfer RNA aspartic acid), trnE (transfer RNA glutamic acid), trnG (transfer RNA glycine), trnH (transfer RNA histidine), trnI (transfer RNA isoleucine), trnK (transfer RNA lysine), trnP (transfer RNA proline), trnR1-2 (transfer RNA arginine), trnS1-2 (transfer RNA serine), trnV (transfer RNA valine), trnW (transfer RNA tryptophan), trnY (transfer RNA tyrosine), trnN (transfer RNA asparagine) and trnT (transfer RNA threonine)
Fig. 4 Neighbour-joining tree based on pairwise distances among tRNA genes from Sphaerothecum destruens (SD) and Amoebidium parasiticum (AP, AF538045; AF*, AF538046). Nucleotides for anticodons and the variable loops were excluded from the analysis. Portions of the tree discussed in the text are indicated by the black and white arrows. Only bootstrap values above 50 are shown
The closest relative to S. destruens which has its mt-genome partially sequenced is A. parasiticum, which is a member of the order Ichthyophonida within the class Ichthyosporea [19]. In contrast to the mt-genome of S. destruens, the mt-genome of A. parasiticum is large (> 200 kbp) and consists of several hundred linear chromosomes [37]. To date, only 65% of the mt-genome of A. parasiticum has been sequenced [37]. In comparison to A. parasiticum, the mt-genome of S. destruens is at least eight times smaller, with all genes encoded by a single circular strand in the same transcriptional orientation. There is a remarkable difference in the coding portion of the genomes between the two species, with only 20% of the mt-genome of A. parasiticum coding for proteins compared to 93% in S. destruens. The mt-genome of S. destruens contains 47 intron-less genes (including two ORFs), while the mt-genome of A. parasiticum is intron- and gene-rich, with 44 identified genes and 24 ORFs [37].
Both S. destruens and A. parasiticum use the mitochondrial UGA (stop) codons to specify tryptophan and have multiple copies of the trnM gene. These observed tRNA gene replications are also reported in M. brevicollis, C. owczarzaki and M. vibrans [29,32,37]. Similar to M. brevicollis, the mitochondrial tRNAs in S. destruens did not have a truncated D or T loop structure. The trnS of A. parasiticum [28], M. brevicollis [28] and S. destruens does not have a nucleotide at position 8, which connects the aminoacyl and D stems of trnS, and in position 26 there is a pyrimidine (uracil) instead of a purine. The trnS gene in S. destruens also has an adenine instead of uracil in the second nucleotide of its D-loop.
Phylogenetic analysis of the available tRNA sequences of S. destruens and A. parasiticum suggests that some tRNAs of both species could have evolved by gene recruitment. For S. destruens these are trnV and trnL. Gene recruitment is a process by which a gene is recruited from one isoaccepting group to another, changing the tRNA identity [32]. Gene recruitment has been previously reported in A. parasiticum for trnM, trnI, and trnV [32]. It is important to note that due to the lack of mitochondrial genomes from close phylogenetic relatives of S. destruens, the results of this phylogenetic analysis are limited and must be interpreted with caution. In S. destruens, trnM1 and trnM3 share a higher nucleotide similarity (70%) than either shares with trnM2 (54% and 63%, respectively). The trnM replication in S. destruens could represent different functions of the methionine tRNAs in protein synthesis and initiation of translation [38]; however, the functional significance remains unknown.

Conclusions
Mitochondrial DNA sequences can be valuable genetic markers for species detection and are increasingly used in eDNA-based species detection. This is the first record of the mt-genome of S. destruens, an important pathogen to freshwater fishes, and the first mt-genome for the order Dermocystida. The availability of this mt-genome should help in the detection of S. destruens and closely related parasites in eukaryotic diversity surveys using eDNA. Due to the abundance of mitochondria within cells, mitochondrial DNA could also be used in epidemiological studies by improving molecular detection and tracking the spread of this parasite across the globe [11]. Furthermore, as the only sequenced representative of the order Dermocystida, its mt-genome can be used in the study of the mitochondrial evolution of the unicellular relatives of animals.