Biodiversity and host-parasite cophylogeny of Sphaerospora (sensu stricto) (Cnidaria: Myxozoa)

Background: Myxozoa are extremely diverse microscopic parasites belonging to the Cnidaria. Their life-cycles alternate between vertebrate and invertebrate hosts, predominantly in aquatic habitats. Members of the phylogenetically well-defined Sphaerospora (sensu stricto) clade predominantly infect the urinary system of marine and freshwater fishes and amphibians. Sphaerosporids are extraordinary due to their extremely long and unique insertions in the variable regions of their 18S and 28S rDNA genes and due to the formation of motile proliferative stages in the hosts' blood. To date, DNA sequences of only 19 species have been obtained and information on the patterns responsible for their phylogenetic clustering is limited.
Methods: We screened 549 fish kidney samples from fish of various geographical locations, mainly in central Europe, to investigate sphaerosporid biodiversity microscopically and by 18S rDNA sequences. We performed multiple phylogenetic analyses to explore phylogenetic relationships and evolutionary trends within the Sphaerospora (s.s.) clade, by matching host and habitat features to the resultant 18S rDNA trees. The apparent co-clustering of species from related fish hosts inspired us to further investigate host-parasite co-diversification, using tree-based (CoRe-PA) and distance-based (ParaFit) methods.
Results: Our study considerably increased the amount of 18S rDNA sequence data for Sphaerospora (s.s.) by sequencing 17 new taxa. Eight new species are described and one species (Sphaerospora diminuta Li & Desser, 1985) is redescribed, accompanied by sufficient morphological data. Phylogenetic analyses showed that sphaerosporids cluster according to their vertebrate host order and habitat, but not according to geography. Cophylogenetic analyses revealed a significant congruence between the phylogenetic trees of sphaerosporids and of their vertebrate hosts and identified Cypriniformes as a host group of multiple parasite lineages and with high parasite diversity.
Conclusions: This study significantly contributed to our knowledge of the biodiversity and evolutionary history of the members of the Sphaerospora (s.s.) clade. The presence of two separate phylogenetic lineages likely indicates independent historical host entries, and the remarkable overlap of the larger clade with vertebrate phylogeny suggests important coevolutionary adaptations. Hyperdiversification of sphaerosporids in cypriniform hosts, which have undergone considerable radiations themselves, points to host-driven diversification.
Electronic supplementary material: The online version of this article (10.1186/s13071-018-2863-z) contains supplementary material, which is available to authorized users.

Most members of Sphaerospora (s.s.) are coelozoic in the excretory system (predominantly in the renal tubules); only two histozoic taxa, i.e. Sphaerospora fugu (Tun, Yokoyama, Ogawa & Wakabayashi, 2000) and S. molnari, have been sequenced to date. Members of Sphaerospora (s.s.) are believed to have a life-cycle strategy similar to that of other myxozoans, alternating between vertebrate and invertebrate hosts [5,10]. Life-cycles were described for Sphaerospora truttae Fischer-Scherl, El-Matbouli & Hoffmann, 1986 [11] and Sphaerospora dykovae Gunter & Adlard, 2010 [12], but that of S. truttae was later shown to be incorrect, as the alternate spore stages from the two hosts did not have identical 18S rDNA sequences [13], while the invertebrate life-cycle stage of S. dykovae still requires molecular confirmation [14]. Vertebrate hosts of sphaerosporids are marine and freshwater teleost fishes as well as amphibians [4,5,15].
The evolution of parasites and their hosts is shaped by their reciprocal influence, and it was recently demonstrated that myxozoans and their invertebrate hosts show a high degree of phylogenetic congruence, likely because the latter represent the host group that was first acquired by myxozoans [16]. The cophylogenetic signal between myxozoans and their vertebrate hosts is more obscure, as the coevolutionary history of reciprocal adaptation of myxozoans and their intermediate vertebrate hosts is much shorter and is perceived as a "mixed signal" of invertebrate and vertebrate cophylogeny [16]. Myxozoans are potentially the fastest evolving metazoans on the planet [16-20], with the most radical nucleotide variability found in true sphaerosporids. This group also has a wide range of vertebrate hosts, making it an especially interesting case for cophylogenetic studies. A lack of data for sphaerosporids in a recent study evaluating reciprocal dependencies of the phylogenies of myxozoans and their vertebrate hosts led to inconsistent results for this clade [16].
In the present study, we screened fish kidneys specifically for sphaerosporids and provide descriptions of a wide spectrum of Sphaerospora spp., accompanied by data on host specificity and 18S rDNA sequences that significantly enrich the dataset for phylogenetic and cophylogenetic analyses. We reinvestigated host-parasite codivergence using an extended sphaerosporid dataset, allowing for a detailed analysis of interdependent phylogenies and a reconstruction of the evolutionary history of this special group of myxozoan parasites.

Sample collection and parasite morphology
In total, 549 fishes belonging to 28 species (542 from freshwater and 7 from marine habitats) were collected from various localities, mostly in the Czech Republic (Additional file 1: Table S1), between 2012 and 2017. Most fishes belonged to the orders Cypriniformes (424 fish, 16 species) and Perciformes (82 fish, 3 species). All captured fish were euthanized by an overdose of clove oil followed by neural pithing. Kidney samples were collected from all fishes using sterile scissors and scalpel blades, freshly squashed on grease-free glass slides and examined under an Olympus BX51 microscope. Kidneys containing presporogonic stages and Sphaerospora spores were documented digitally using an Olympus DP70 camera. Preliminary species identification followed published guidelines [3,21,22]. Spores were measured on digital images using ImageJ 1.48q (Wayne Rasband, http://imagej.nih.gov/ij, Java 1.7.0_45; 64 bit) in reference to graticule measurements. Spore measurements (in μm) include spore length (L) and thickness (T), polar capsule length (PL) and width (PW), as well as plasmodium length and width, given as the range followed by the mean in parentheses. The ratio of spore length to thickness (L/T) was also calculated to better describe spore shape.
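The reporting convention described above (range followed by the mean in parentheses, plus an L/T ratio) can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the example values are invented, and computing L/T from the mean length and mean thickness is an assumption, since the text does not state whether the ratio was derived from means or from individual spores.

```python
from statistics import mean


def summarise(values, decimals=1):
    """Format measurements (in micrometres) as 'min-max (mean)'."""
    lo, hi, avg = min(values), max(values), mean(values)
    return f"{lo:.{decimals}f}-{hi:.{decimals}f} ({avg:.{decimals}f})"


def length_thickness_ratio(lengths, thicknesses):
    """L/T ratio, assumed here to be mean length divided by mean thickness."""
    return mean(lengths) / mean(thicknesses)


# Invented example values, not measurements from the study.
spore_length = [8.0, 9.0, 10.0]
spore_thickness = [9.0, 10.0, 11.0]
print(summarise(spore_length))
print(round(length_thickness_ratio(spore_length, spore_thickness), 2))
```

An L/T ratio below 1 indicates spores that are thicker than they are long, which is how the ratio is used in the species remarks.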

18S rDNA amplification
For DNA analyses, all kidneys (including those considered uninfected by microscopical examination) were fixed in 400 μl TNES urea buffer [23]. Standard phenol-chloroform DNA extraction was performed after proteinase-K digestion and the obtained DNA was eluted in 50-100 μl DNase/RNase-free PCR grade water. We screened all kidney samples for myxozoan infection using general myxozoan primer sets for 18S rDNA (Erib1 + Erib10, followed by a second round PCR with MyxGP2F + Act1R; [14,24]; details in Additional file 2: Table S2, Additional file 3: Table S3, Additional file 4: Table S4). Sphaerospora 18S rDNA sequences were obtained by various combinations of general myxozoan, sphaerosporid and species-specific primer sets with specific amplification conditions (Additional file 2: Table S2 and Additional file 4: Table S4). Taq-Purple DNA polymerase (Top-Bio, Prague, Czech Republic) or the more sensitive TITANIUM Taq DNA polymerase (Takara Bio Europe/Clontech, Saint Germain en Laye, France) was used for PCR amplification (Additional file 2: Table S2). PCR products were extracted with the Gel/PCR DNA Fragments Extraction Kit (Geneaid Biotech Ltd., New Taipei City, Taiwan) and sequenced commercially (www.SEQme.eu). All obtained sequences were checked thoroughly for clear chromatograms. In cases of mixed sphaerosporid infection, amplicons were cloned into the pDrive vector using the PCR Cloning Kit (Qiagen, Hilden, Germany) and transformed into TOP10 chemically competent Escherichia coli cells (Life Technologies, Prague, Czech Republic). Plasmid DNA was isolated with a High Pure Plasmid Isolation Kit (Roche Applied Science, Mannheim, Germany) and three colonies from each PCR product were sequenced commercially. Newly generated sequences were submitted to BLAST (NCBI) for preliminary identification. Partial sequences were assembled in SeqMan II, DNAStar package v5.05 (DNASTAR Inc., Madison, Wisconsin, USA).

18S rDNA sequence alignment and analyses
Thirty-eight sphaerosporid 18S rDNA sequences were aligned in MAFFT v7.017 [25] implemented in Geneious v8.0.5 [26], using the L-INS-i algorithm and the 200PAM/k=2 scoring matrix with a gap opening penalty of 1.0 and an offset value of 0.1. Due to large insertions in the 18S rDNA variable regions, the alignment was edited manually in Geneious v8.0.5. The complete alignment including extensive species-specific insertions was 4813 bp long (Additional file 5). GC-content was calculated for all newly obtained 18S rDNA sequences using EditSeq, DNAStar package v5.05 (DNASTAR Inc., Madison, Wisconsin, USA) (Additional file 6: Table S5). Subsequently, nucleotides from extensive long insertions were deleted based on comparison with secondary structure-based alignments [8], resulting in a dataset of 3579 bp which was used for phylogenetic analyses. The distance matrix was produced in Geneious v8.0.5 from the same alignment file after excluding very short partial 18S rDNA sequences obtained from Sphaerospora sp. ex Gobio gobio (L.), Sphaerospora sp. ex Rutilus rutilus (L.), Sphaerospora elopi n. sp. and Sphaerospora hankai (Additional file 7: Table S6). A more restricted dataset consisting only of Sphaerospora spp. obtained from cypriniform hosts (15 taxa, 3768 bp), which allowed the inclusion of additional informative positions, was aligned as described above. Two independent 18S rDNA datasets were produced to calculate (i) the intraspecific divergence of Sphaerospora diversa n. sp. (3 sequences; 3112 bp; Additional file 8: Table S7); and (ii) the interspecific divergence of sphaerosporids obtained from R. rutilus (2 sequences; 917 bp; not shown).
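The two sequence statistics mentioned above, GC content and pairwise divergence, were computed in EditSeq and Geneious. The following Python sketch only illustrates what these quantities are. It assumes that gaps and ambiguous bases are ignored in both calculations, which may differ from the settings of the software actually used, and the function names and example strings are hypothetical.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (gaps and Ns ignored)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def percent_divergence(aligned_a, aligned_b):
    """Uncorrected pairwise divergence (%) between two aligned sequences.

    Assumption: columns with a gap or ambiguous base in either sequence
    are skipped rather than counted as differences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differences = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * differences / compared


# Toy alignment rows, not real 18S rDNA data.
print(gc_content("ATGC--"))
print(percent_divergence("ACGT-A", "ACGA-A"))
```

How insertion-rich columns are treated matters a great deal for sphaerosporids, because the long species-specific insertions dominate the alignment length.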

Phylogenetic analyses
Maximum likelihood analysis (ML) was performed using RAxML v7.2.8 [27] implemented in Geneious v8.0.5 with the GTR + Γ model of evolution and 500 bootstrap replicates. jModelTest [28] was used to select the best-fitting model of sequence evolution using the corrected Akaike information criterion. Maximum parsimony analysis (MP) was performed in PAUP* v4.b10 [29], using a heuristic search with random taxa addition, the ACCTRAN option and the TBR swapping algorithm, with bootstrap analysis of 1000 replicates; gaps were treated as missing data and all characters as unordered. Bayesian inference analysis (BI) was performed in MrBayes v3.2.6 [30] implemented in Geneious v8.0.5, using the GTR + Γ model. Posterior probabilities were estimated from 1,000,000 generations by two independent runs of simultaneous Markov Chain Monte Carlo chains, with every 100th tree saved. The 'burn-in' period was set to 10%, a length determined using Tracer v1.6 [31].
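The sampling scheme of the Bayesian analysis implies a fixed number of retained trees per run, which the following sketch works out. It assumes that the starting tree (generation 0) is also written to the sample and that the burn-in fraction is rounded down to a whole number of samples; both are assumptions for illustration, and the function name is hypothetical.

```python
def retained_samples(generations, sample_every, burnin_fraction, include_start=True):
    """Return (total, discarded, kept) sampled trees for a single MCMC run."""
    total = generations // sample_every + (1 if include_start else 0)
    discarded = int(total * burnin_fraction)  # rounded down
    return total, discarded, total - discarded


# Settings stated in the text: 1,000,000 generations, every 100th tree, 10% burn-in.
total, discarded, kept = retained_samples(1_000_000, 100, 0.10)
print(total, discarded, kept)
```

Under these assumptions each of the two runs contributes roughly 9,000 trees to the posterior probability estimates.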

Cophylogenetic analyses
Unavailability of mitochondrial data for certain hosts of the 38 sphaerosporids led to replacement of mitogenome sequences with those of closely related species (5 host taxa), while 3 sphaerosporid taxa were withdrawn from the analysis due to unavailability of closely related/congeneric host mitogenome sequences. In addition, 4 partial sphaerosporid 18S rDNA sequences were excluded due to their short length (details in Additional file 9: Table S8). Hence, for host-parasite cophylogenetic analyses, an alignment of 31 18S rDNA Sphaerospora (s.s.) sequences (3579 bp) was analysed in combination with an alignment of 24 complete host mitogenomes (15,591 bp) available on GenBank (NCBI) (November 2017). Parasite and host ML trees were produced using RAxML v7.2.8 as described above. Host-parasite cophylogeny was evaluated using an event-based tree topology method, CoRe-PA v0.5.1 [32], without a priori cost assignment, checking 10^4 cost sets using the simplex method on the quality function. Statistical significance was tested by randomizing host and parasite topologies (10,000 random trees used) under the proportional-to-distinguishable model. As a second method, independent of tree topologies, we determined the global fit based on patristic distances (Geneious v8.0.5, above-mentioned dataset) in ParaFit [33], implemented in the APE package v3.4 [34] in R v3.2.4 (R Core Team 2013).
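To make the distance-based global fit test more concrete, here is a simplified Python sketch of its logic: both distance matrices are converted to principal coordinates, combined through the host-parasite association matrix, and the resulting statistic is compared with values obtained after shuffling the host links of each parasite. This is not the APE implementation used in the study. The function names are hypothetical, the orientation of the link matrix (hosts in rows, parasites in columns) is assumed, negative eigenvalues are simply dropped rather than corrected, and the per-link F1/F2 tests are not included.

```python
import numpy as np


def principal_coordinates(dist):
    """Classical principal coordinates of a symmetric distance matrix.

    Simplification: only axes with positive eigenvalues are kept.
    """
    n = dist.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centring @ (dist ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(gower)
    keep = eigvals > 1e-9
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])


def parafit_global(host_dist, parasite_dist, links, n_perm=999, seed=0):
    """Global fit statistic and permutation P-value.

    links: 0/1 matrix with hosts in rows and parasites in columns.
    """
    hosts = principal_coordinates(host_dist)          # hosts x axes
    parasites = principal_coordinates(parasite_dist)  # parasites x axes

    def statistic(assoc):
        d = hosts.T @ assoc @ parasites
        return float(np.sum(d ** 2))  # trace(D'D)

    observed = statistic(links)
    rng = np.random.default_rng(seed)
    at_least_as_large = 1  # the observed value is counted once
    for _ in range(n_perm):
        # Shuffle the host links of each parasite independently.
        shuffled = np.column_stack(
            [rng.permutation(links[:, j]) for j in range(links.shape[1])]
        )
        if statistic(shuffled) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (n_perm + 1)
```

Because the observed value is counted among the permutations, the smallest P-value attainable with 999 permutations is 1/1000 = 0.001.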

Sphaerosporid species diversity and descriptions
Based on strongly divergent sequences in the variable sections of the 18S rDNA gene region, 17 new 18S rDNA sequences were obtained from the fish examined in the present study. Based on morphology, Sphaerospora diminuta Li & Desser, 1985 was identified and redescribed from Lepomis gibbosus (L.) caught in the Czech Republic (Fig. 1a and b, Table 2). Eight taxa are new species for which we provide morphological and molecular data. The lack of microscopically detectable mature and immature spores or the presence of mixed sphaerosporid infections prevented us from identifying the other eight sphaerosporids detected by 18S rDNA sequencing (Tables 1 and 2, Additional file 1: Table S1) from A. brama, Ctenopharyngodon idella (Valenciennes), Gobio gobio, Lota lota (L.), Sander lucioperca (L.), S. erythrophthalmus, Silurus glanis L. and R. rutilus. These may represent new species or species that have been previously described or recorded in these hosts (Table 2).
Pseudoplasmodium. Disporic pseudoplasmodia measuring 13.6-16.6 × 6.6-11.2 (15.0 × 8.8) (n = 9), with numerous refractile granules (Fig. 1b).

Remarks
Spore measurements, spore surface ornamentation, development of disporic pseudoplasmodia, host tissue localization and the number of polar filament coils match the original description of S. diminuta [35] and its later report [36] though motility of the pseudoplasmodia and higher number of spore surface striations (4-6 vs 3-4) reported by Lom et al. [36] were not observed in the present study (Table 2). In addition, extrasporogonic blood stages were observed by Lom et al. [36]. For the first time, we are providing 18S rDNA sequence data of this species. Another species, Sphaerospora ovophila Xiao & Desser, 1997 from the ovary of L. gibbosus significantly differs by spore measurements, number of polar filament coils, L/T ratio (  (Fig. 1d).

Remarks
Two sphaerosporids have been described from A. brama: Sphaerospora bramae El-Matbouli, Hoffmann & Kern, 1995, infecting the renal tubules, from Germany [39], and Sphaerospora masovica Cohn, 1902, infecting the gall-bladder and intestine, from Canada [40]. Both species are much smaller than the present species; nevertheless, posterior ridges are present on both S. bramae and S. abrami n. sp. (Table 2). Moreover, pseudoplasmodia of two sphaerosporids (without mature spores) were reported from the renal tubules of the same fish host [41,42]. In the present study, another species, Sphaerospora sp. ex A. brama (without morphological data), differs by over 11% from the 18S rDNA of S. abrami n. sp., which confirms their distinct species status. The unavailability of molecular data from previously reported species impedes further comparisons with the new species.
Sphaerospora bliccae n. sp.
Etymology: The species epithet "dentata" refers to the tooth-like pointed ridges of the posterior spore valve surface.

Remarks
The spore measurements and the development of disporic pseudoplasmodia differentiate S. dentata n.

Remarks
The low 18S rDNA sequence divergence (0.29-0.89% over 3,112 bp; Additional file 8: Table S7) amongst the isolates of S. diversa n. sp., together with similar spore measurements and L/T ratios < 1 (Table 2), confirms the conspecificity of these three isolates. So far, only "Sphaerospora leuciscusi" (nomen nudum) of Longshaw (2004) [44] has been described from the kidney of L. leuciscus, and Sphaerospora rota Zaika, 1961 has been reported from the kidney of Leuciscus leuciscus baicalensis, a subspecies of dace in Lake Baikal [45]. The present species has spore and polar capsule measurements similar to those of "S. leuciscusi", although it develops exclusively in monosporic pseudoplasmodia and has a different L/T ratio (Table 2). Spores of S. diversa n. sp. are significantly smaller than those of S. rota (Table 2), which also differs by a strongly protruding sutural edge, three small lateral protuberances and a prominent ridge on the posterior spore pole. Sphaerospora rota may represent a species complex, as it was also reported from the distantly related cypriniform fish Cobitis taenia L. and the salmoniform fish Brachymystax lenok (Pallas) [45]. Molecular data from both "S. leuciscusi" and S. rota are not available for comparison with our isolates. No sphaerosporid was previously described from S. cephalus and L. idus. Pseudoplasmodia of an undescribed sphaerosporid were reported in the renal tubules of S. cephalus [41], without morphological or DNA sequence data for species comparison.

Remark
Neither Elops saurus nor other elopid fishes have previously been reported to harbour sphaerosporids.

Remarks
Sphaerospora gutta n. sp. is similar to S. scardinii described from the same host when comparing spore measurements, development in mono- and disporic pseudoplasmodia, within-host localization and the number of polar filament coils (Table 2) [43]. However, fine ridges found at the posterior end of S. scardinii were never observed in our samples. For similar reasons, the present species differs significantly from S. dentata n. sp. (Table 2). 18S rDNA data confirm the distinct status of these two new species and another species, Sphaerospora sp. ex S. erythrophthalmus, which lacks morphological data (see below and Additional file 7: Table S6). The absence of 18S rDNA data from S. scardinii impedes comparison with this species. Undescribed Sphaerospora spp. were reported in the blood and kidney of S. erythrophthalmus but the lack of spore details and molecular data impedes further comparison [42,46,47].
Sphaerospora rutili n. sp.
Pseudoplasmodium. Mono- (Fig. 1m) and disporic (Fig. 1l)

Remarks
Sphaerospora rutili n. sp. is morphologically similar to "Sphaerospora ousei" (nomen nudum) of Longshaw (2004) [44], which also possesses two uninucleated sporoplasms and develops in mono- and disporic pseudoplasmodia within the renal tubules of roach. However, the ornamentation at the posterior spore end of S. rutili n. sp. was never observed in "S. ousei", which has completely smooth shell valves (Table 2) [44]. Moreover, "S. ousei" has slightly elongated spores, in contrast to the thicker spores of S. rutili n. sp. (Table 2). Another morphologically similar species, Sphaerospora poljanskii Kulemina, 1969, described from the same host, differs from the present species by larger spore dimensions, a split at the apical spore end, the shape of the polar capsules and the presence of two triangular posterolateral projections (Table 2) [48]. Another roach parasite, Sphaerospora minima Kaschkovsky, 1974, has smaller spore and polar capsule dimensions and spine-like ornamentation arranged in three lines at the posterior spore end, in contrast to the spores of S. rutili n. sp. (Table 2) [49]. A Sphaerospora sp. with ornamentation at the spore end and with similar spore dimensions (deduced from the figure scale-bar) was reported from the renal tubules of roach in South Bohemia, Czech Republic [50]. This is likely the same species as in the present study; however, further details on spore morphology and development are missing for species comparison. Lom et al. [41] reported two undescribed Sphaerospora spp. from R. rutilus from localities in the Czech Republic. Sphaerospora sp. 1 has nearly identical spore and polar capsule measurements and number of polar filament coils as S. rutili n. sp., though their L/T ratios are distinct (Table 2). Further details about the spore surface and development are missing for species comparison. Sphaerospora sp. 2 is similar to the present species due to its smooth spore surface, identical L/T ratio and ornamentation at the posterior end of the spore, but differs by smaller spore size and a higher number of polar filament coils (Table 2). Another species from roach, Sphaerospora sp. ex R. rutilus (present study), partially sequenced from the Czech Republic, differs by 3% (over 917 bp covering the V7 and V8 regions) from the 18S rDNA sequences of S. rutili n. sp. The lack of morphological details impedes species comparison. Several other reports of Sphaerospora spp. from the blood and the kidney of roach exist but without further morphological and molecular data [42,43,46]. Sphaerospora carassii Kudo, 1919 has also been described from roach but from different organs (gills, gall-bladder and intestine) and with different spore dimensions (Table 2) [40].

Remarks
This is the first record of sphaerosporid spores described from S. cephalus. Previously, only pseudoplasmodia were reported, and details on spore morphology and molecular data are unavailable for comparison [41]. In the present study, we found a morphologically and morphometrically similar Sphaerospora sp. in true minnows (Leuciscinae), i.e. S. diversa n. sp. (Table 2); however, the 18S rDNA sequences differ by 15% (see below and Additional file 7: Table S6), revealing them as two distinct species.

Pathology
None of the screened fish showed macroscopic or microscopic pathological changes in fresh smears. Infection levels with spore-forming stages were mild and only a limited number of parasites were visible in the tubular lumen.
Having attempted various primer combinations, we found that the following sets most successfully amplified sphaerosporid 18S rDNA sequences: (i) the general 18S rDNA primer combination Erib1 + Erib10 followed by a second round PCR with a new primer combination for freshwater Sphaerospora spp., SphFWSSU1243F + SphFWSSU3418R (present study); (ii) the Sphaerospora-specific general primer combination PsSSU1850F + Erib10 followed by a second round PCR with PsSSU2110F + Erib10 [5]; and (iii) the general 18S rDNA primer combination Erib1 + Erib10 followed by a second round PCR with MyxGP2F + Act1R [14,24] (details in Additional file 2: Table S2). A combination of prolonged primer extension time and the highly efficient TITANIUM Taq polymerase considerably improved the outcome of PCRs.

Phylogenetic relationships within the Sphaerospora (s.s.) clade
The phylogenetic tree of 18S rDNA sequences including all newly sequenced taxa (Fig. 3) shows that all new sequences cluster within the Sphaerospora (s.s.) clade, allowing us to consider them "true sphaerosporids". The new sequences cluster into two distinct clades: (i) a basal "primary marine" clade of sphaerosporids from marine teleosts (i.e. Lineage A in [5]); and (ii) all other sphaerosporids (i.e. Lineage B in [5]). The latter, larger clade is subdivided into 3 distinct subclades (Fig. 3): (i) a clade of sphaerosporids from amphibians; (ii) the "secondary marine" clade of sphaerosporids with spores containing 4-12 sporoplasms (vs otherwise commonly 2) from marine habitats; and (iii) a "freshwater clade" of sphaerosporids from freshwater fishes, which includes the type-species S. elegans. The freshwater clade is further divided into three subclades: (i) sphaerosporids from cypriniform hosts; (ii) species from siluriform hosts; and (iii) a subclade of species from mixed fish host families. Sphaerospora molnari, the only histozoic parasite of the freshwater clade for which 18S rDNA sequences are available, forms a distinct sublineage. Sphaerospora diminuta produces a long branch within the mixed host freshwater subclade. Phylogenetic clustering of sphaerosporids did not reflect geography; however, host habitat (freshwater vs marine) and host group (at the ordinal level) showed a clear pattern in certain clades. Sphaerosporids from the same host order clustered together in the same clade or in sister clades (e.g. Cypriniformes, Centrarchiformes, Mugiliformes, Siluriformes and Anura) (Fig. 3). However, this trend was not observed at the host family level as, for example, sphaerosporids from Gobionidae and Xenocyprididae grouped inside species of Leuciscidae and Cyprinidae, respectively (Fig. 4). Moreover, sphaerosporids from Leuciscidae and Cyprinidae clustered in more than one clade within the phylogenetic tree.

Cophylogeny analyses
The phylogenetic analysis of vertebrate mitochondrial sequence data revealed a tree topology that is in accordance with recent phylogenomic studies [51,52], apart from the position of Takifugu rubripes (Eupercaria: Tetraodontiformes: Tetraodontidae) which clustered outside Eupercaria and basal of Percomorphaceae. However, we did not exclude this species from tree reconciliation analysis. In the sphaerosporid phylogenetic tree used for cophylogeny, species clustering was unaltered after excluding six species (see "Cophylogenetic analyses" in Methods section; Additional file 9: Table S8).
The tree topology-based analysis performed in CoRe-PA detected significant congruence between the phylogenetic trees of sphaerosporids and their vertebrate hosts (Fig. 5), with 18 cospeciation events (estimated cost for cospeciation = 0.105) calculated from a dataset of 24 hosts and 31 parasites. The quality of the reconstruction was 1.8830165 × 10^-11 with a total cost of 7.578. CoRe-PA estimated 35 sorting (cost 0.054) and three host switching (cost 0.631) events: (i) from S. erythrophthalmus to A. brama; (ii) from Gasterosteus aculeatus L. to L. lota; and (iii) from a common perciform ancestor to Merlangius merlangus (L.), where all parasites established and diversified successfully. Based on the sampling performed in this study, sphaerosporid diversity is presently highest in cypriniforms. In half of the investigated cypriniform hosts, two independent sphaerosporid lineages are present. Furthermore, the analysis showed that the oldest cypriniforms already had three independent parasite lineages, indicating an extremely successful radiation of sphaerosporids in this host group. Global fit analysis detected 19 (F1.stat) or 24 (F2.stat) statistically significant coevolving host-parasite pairs, depending on the statistics used (calculating ligand-receptor relation importance in F1.stat or using a non-permutated matrix in F2.stat), and resulted in a global fit of 0.2281377 with a highly significant P-value of 0.001 over 999 permutations (Additional file 10: Table S9).
Due to the difficulty of amplifying the strongly divergent sequences and extremely long, species-specific insertions by PCR, a condition that is further complicated by myxozoan co-infections in kidneys, Sphaerospora (s.s.) 18S rDNA sequence data have long been scarce [5]. Based on the development of new primers ([5], present study), the Sphaerospora (s.s.) clade was enlarged from 19 [5,7,9,53] to 36 species. Based on 18S rDNA sequence divergence criteria proposed for other myxozoans [54-56], < 1% divergence was considered conspecific for S. diversa n. sp. (3 sequences; this study) and > 1% was considered interspecific variation [56]. However, species-specific long insertions in 18S rDNA cause extremely high sequence divergence (1.87-59.00%) in sphaerosporids, thereby greatly facilitating the differentiation even of closely related species. Phylogenetic analyses of the enriched dataset showed clustering of the newly obtained sequences in previously established clades, and their GC content matched the previously recognized difference between the two main sphaerosporid clades [5]. However, some additional key findings were revealed in this study. Sphaerospora elopi n. sp. from an evolutionarily older teleost, E. saurus (Elopiformes), presently represents the most basal species of "primary marine" sphaerosporids. The "anadromous host" clade of Bartošová et al. [5] was enriched by Sphaerospora spp. from the freshwater fishes L. lota, S. lucioperca and L. gibbosus. Sphaerospora truttae is the only species with anadromous hosts (Salmo salar L. and Salmo trutta L.) in this clade but infects its hosts only in freshwater [57]. Moreover, since S. elegans 18S rDNA was sequenced from Gasterosteus aculeatus from an isolated freshwater site (A. Holzer, pers. comm.) and Pomoxis nigromaculatus (Lesueur, 1829) is a freshwater species, this clade can be considered a "true" freshwater clade, justifying the changed attribute "mixed host clade". The substantial enrichment of biodiversity and sequence data for sphaerosporids from cypriniform hosts (12 new species) allows the clustering of a large number of sphaerosporids from closely related hosts to be interpreted and statements on host specificity to be made. The long branch created by S. diminuta probably represents a novel sublineage rather than a phylogenetic artefact, as its variable regions (specifically V4 and V5) and GC content are distinct from those of the other sphaerosporids (Additional file 6: Table S5). However, Sphaerospora sp. from P. nigromaculatus clusters as sister to S. diminuta, which infects another centrarchiform host, L. gibbosus. Further taxon sampling from this fish family could resolve the long branch position of S. diminuta in the future.
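The divergence criterion discussed above (< 1% conspecific, > 1% interspecific) can be written as a one-line rule. The sketch below applies it to divergence values quoted in this article. The function name is hypothetical, and treating a value of exactly 1% as interspecific is an assumption, since the criterion as stated leaves that case open.

```python
def classify_divergence(percent, threshold=1.0):
    """Label an 18S rDNA divergence (%) using the 1% criterion.

    Assumption: a value exactly at the threshold is treated as interspecific.
    """
    return "conspecific" if percent < threshold else "interspecific"


# Values quoted in the text: S. diversa isolates (0.29 and 0.89%),
# the two roach sphaerosporids (3%), S. abrami vs Sphaerospora sp.
# ex A. brama (11%), and the 15% difference involving S. diversa.
for value in (0.29, 0.89, 3.0, 11.0, 15.0):
    print(value, classify_divergence(value))
```

With divergences of 1.87-59.00% between sphaerosporid species, most comparisons fall far from the threshold, which is why the criterion is easy to apply in this group.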

Coevolution of species of Sphaerospora (s.s.) and their vertebrate hosts
Phylogenetic clustering of sphaerosporids according to host order led us to investigate host-parasite codivergence in this clade of myxozoans and to unravel the evolutionary history of sphaerosporids. Cophylogenetic analyses showed highly significant congruence between the phylogeny of sphaerosporids and their vertebrate hosts, by both tree topology-based and distance-based methods. Although distance-based methods are considered less biased [87], a previous analysis of a smaller dataset of 19 hosts and 19 sphaerosporids [16] did not yield a significant outcome when using 16S mtRNA data, likely because this limited host dataset showed similar distances between taxa. Holzer et al. [16] showed that full mitogenome host data improve the outcome of distance-based methods but had only limited parasite sequences available, and mitogenome data were not analysed at the species level. In our mitogenome-based host phylogeny, all taxa except Takifugu rubripes (Temminck & Schlegel) clustered according to the most updated fish phylogeny inferred using genomic data of nearly 2000 fishes [52]. The improved taxon sampling and more informative host dataset used in the present study hence considerably improved the outcome of cophylogenetic analyses. Especially interesting is the finding that cypriniforms are a "preferred" host group with multiple parasite lineages in individual hosts. This appears to support the finding that hyperdiverse host fish groups (Ostariophysi and Percomorpha) [88] show a pronounced potential for parasite diversification [16], also in sphaerosporids. A higher potential for parasite sharing between closely related hosts [89,90] and host-driven diversification were observed in Sphaerospora spp. in leuciscinids in the present study. Closely related cypriniforms are among the most abundant fish groups in European freshwaters [91,92], often sharing the same habitat. This allows diversification of relatively host-specific taxa such as Sphaerospora (s.s.) spp., hence explaining the high biodiversity of sphaerosporids in these habitats, though sampling bias cannot be excluded at present [93].

Evolutionary history of sphaerosporids and their alternate hosts
Holzer et al. [16] suggested that sphaerosporids likely have a marine origin and may have settled in "archiannelid" (chaetopterids or sipunculids) invertebrate hosts.
The present study appears to further indicate the presence of two independent entries of sphaerosporids into archiannelids: (i) at the base of the primary marine clade; and (ii) at the root of all other sphaerosporids. This suggestion is based on the observation that elopiform fishes (Teleostei) are the oldest vertebrate hosts in the primary marine clade [51,52], while tetrapods, which originated earlier than teleosts, occupy this position in the large clade harbouring all other Sphaerospora spp. It is possible that the archiannelid acquired as host in the primary marine lineage was maintained as a single host until teleosts evolved in the marine realm, while the large sphaerosporid clade appears to have an evolutionary history similar to that of most other myxozoan clades, which accommodate cartilaginous fish as their first host group [94-96], followed by lineages in tetrapods and finally mirroring the evolution of teleosts [16]. To support this idea, it would be essential to sequence sphaerosporids from evolutionarily old fish lineages such as the Chondrichthyes or even the Cyclostomata. A single species, Sphaerospora araii Arthur & Lom, 1985, was described from a ray, Raja rhina Jordan & Gilbert, 1880 [78], but our newly developed primer sets may be able to uncover and sequence further species in cartilaginous fishes. We believe that sphaerosporids from cartilaginous fishes represent missing links that could confirm the phylogenetic congruence of sphaerosporids and their vertebrate hosts and contribute further information on their common evolutionary history.

Conclusions
The present study aimed at elucidating the phylogeny and evolutionary history of Sphaerospora (s.s.), based on a greatly enlarged (almost doubled) dataset of difficult-to-amplify 18S rDNA sequences. Larger datasets including information on new host groups and habitats provided important data explaining parasite phylogenetic clustering. We report a very narrow host specificity for sphaerosporids. Sphaerospora diversa n. sp., sequenced from three closely related leuciscinid species, showed low sequence divergences, presumably reflecting initial host-driven diversification, while the remainder of the newly sequenced species were strictly host-specific. Cypriniforms are characterized by multiple parasite lineages, indicating successful parasite diversification within this host group. Cophylogenetic analyses revealed significant phylogenetic congruence between sphaerosporids and their vertebrate hosts. Based on cophylogenetic analysis, we suggest that parasite entry into invertebrate hosts occurred twice independently during sphaerosporid evolution. Sequencing of sphaerosporids from cartilaginous fish or other evolutionarily older vertebrate groups could substantially support this idea and further elucidate the evolutionary history of this group of fast evolving myxozoans.