Mitochondrial genome of Hypoderaeum conoideum – comparison with selected trematodes

Hypoderaeum conoideum is a neglected but important trematode. The life cycle of this parasite is complex: snails serve as the first intermediate hosts; bivalves, fishes or tadpoles serve as the second intermediate hosts; and poultry (such as chickens and ducks) act as definitive hosts. In recent years, H. conoideum has caused significant economic losses to the poultry industry in some Asian countries. Despite its importance, little is known about the molecular ecology and population genetics of this parasite. Knowledge of the mitochondrial (mt) genome of H. conoideum can provide a foundation for phylogenetic studies as well as epidemiological investigations. The entire mt genome of H. conoideum was amplified in five overlapping fragments by PCR and sequenced, annotated and compared with mt genomes of selected trematodes. A phylogenetic analysis of concatenated mt amino acid sequence data for H. conoideum, eight other digeneans (Clonorchis sinensis, Fasciola gigantica, F. hepatica, Opisthorchis felineus, Schistosoma haematobium, S. japonicum, S. mekongi and S. spindale) and one tapeworm (Taenia solium; outgroup) was conducted to assess their relationships. The complete mt genome of H. conoideum is 14,180 bp in length, and contains 12 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes and one non-coding region (NCR). The gene arrangement is the same as in Fasciola spp., with all genes being transcribed in the same direction. The phylogenetic analysis showed that H. conoideum had a relatively close relationship with F. hepatica and other members of the Fasciolidae, followed by the Opisthorchiidae, and then the Schistosomatidae. The mt genome of H. conoideum should be useful as a resource for comparative mt genomic studies of trematodes and for DNA markers for systematic, population genetic and epidemiological studies of H. conoideum and congeners.


Background
Echinostomatid trematodes comprise a group of at least 60 species [1], some of which are of socioeconomic significance in animals. Hypoderaeum conoideum (Bloch, 1782) is an important member of the family. This echinostomatid was originally found in the intestines of birds and is known to infect chickens, ducks and geese in many countries around the world [2][3][4]. It has also been found to infect humans and cause echinostomiasis in Thailand [5,6]. Freshwater snails, Planorbis corneus, Indoplanorbis exustus, Lymnaea stagnalis, L. limosa, L. ovata and L. rubiginosa, act as first intermediate hosts and shed the cercariae; bivalves, fishes or tadpoles can act as second intermediate hosts [3,5].
The accurate identification of species and genetic variants of Hypoderaeum conoideum will be central to investigating its biology, epidemiology and ecology, and also has implications for the diagnosis of infections. Although morphological features are used to identify this and other trematodes, such characters are not always reliable [7]. Due to these constraints, various molecular methods have been established for specific identification [7]. For instance, PCR-based techniques using genetic markers in nuclear ribosomal (r) and mitochondrial (mt) DNA have been widely used [7]. The sequences of the first and second internal transcribed spacers (ITS-1 and ITS-2 = ITS) of nuclear rDNA have been particularly useful for specific identification, based on consistent levels of sequence difference between species and little variation within individual species [7], while the mitochondrial gene cox1 has been used for studying genetic variation and relationships among different species [8][9][10]. As a basis for the development of molecular tools to study H. conoideum populations (irrespective of developmental stage), we have characterized the complete mt genome of this parasite, compared this genome with those of selected trematodes and undertaken a phylogenetic analysis of concatenated amino acid sequence data for 12 protein-coding genes to assess the genetic relationship of H. conoideum with these other trematodes.

Parasites and DNA isolation
H. conoideum adults were collected from the intestine of a naturally infected free-range duck in Hubei province, China, in accordance with the Animal Ethics Procedures and Guidelines of Huazhong Agricultural University. These worms were washed in physiological saline and identified morphologically according to existing morphological descriptions [11]. A reference specimen was stained and mounted [12] and the remaining specimens were fixed in 70% (v/v) ethanol and stored at −20°C until use [8]. Total genomic DNA was extracted from one specimen using the E.Z.N.A.® Tissue DNA Kit. To provide further identification for this specimen, the ITS-2 region was amplified and sequenced [13]; it was identical to a reference sequence available for H. conoideum (GenBank accession no. KJ944311.1).
Amplification and sequencing of partial cox1, cox3, nad4, nad5 and rrnS
Initially, ten oligonucleotide primers (Table 1) were designed to regions of the mt genome of Fasciola hepatica [14], in order to amplify short fragments from the cox1, cox3, nad4, nad5 and the small subunit of ribosomal RNA (rrnS) genes (Table 1). PCR (25 μl) was performed in 10 mM Tris-HCl (pH 8.4), 50 mM KCl, 4 mM MgCl2, 200 mM each of dNTP, 50 pmol of each primer, 2 U Taq polymerase (Takara) and 2.5 μl genomic DNA or H2O (no-DNA control) in a thermocycler (Biometra) under the following conditions: an initial denaturation at 94°C for 5 min, followed by 30 cycles of 94°C/1 min, 47-50°C/30 s (depending on primer pair) and 72°C/1 min, followed by a final extension of 72°C/7 min. Amplicons were sent to Sangon Company (Shanghai, China) for sequencing by using the same forward and reverse primers (separately) as used in PCR.

Long-PCR amplification and sequencing
Ten additional primers (see Table 1) were then designed from the sequences obtained, and used to amplify genomic DNA (~40-80 ng) from five regions (see Table 1) by long-PCR; PCRs (25 μl) were performed in a reaction buffer containing 2 mM MgCl2, 1× LA Taq Buffer II, 0.4 mM dNTP mixture, 0.8 μM of each primer, 2.5 U LA Taq polymerase (Takara) and 2.5 μl of genomic DNA or H2O (no-DNA control) for 35 cycles of 94°C/30 s (denaturation), 50°C/30 s (annealing) and 72°C/1 min per kb (extension). Amplicons were cloned into pGEM-T-Easy vector (Promega, USA) according to the manufacturer's protocol; inserts were amplified by long-range PCR (employing vector primers M13 and M14) and then sequenced using a primer-walking strategy [15].
against the mt genome sequences of other available trematodes (including F. hepatica) using the program Clustal X v.1.83 [16] to infer gene boundaries. The open reading frames (ORFs) were identified using ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) employing the flatworm mitochondrial genetic code. Translation initiation and termination codons were identified as described previously [14,17,18]. The secondary structures of the 22 tRNA genes were predicted using tRNAscan-SE and/or manual adjustment [9,19]. The two rRNA genes were identified by comparison with those from the mt genome of F. hepatica [14]. Amino acid sequences of the protein-coding genes were obtained by using the flatworm mt code, and aligned using the program MUSCLE [20] employing default settings.
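The conceptual translation of ORFs with the flatworm mt code can be illustrated with a minimal sketch. It assumes that the "flatworm mitochondrial genetic code" corresponds to NCBI translation table 9 (echinoderm and flatworm mitochondrial code), in which TGA encodes Trp, AAA encodes Asn, and AGA/AGG encode Ser; the function name `translate` and the toy sequences are illustrative only and are not taken from the study.

```python
# Sketch only: conceptual translation under NCBI translation table 9
# (assumed here to be the "flatworm mt code" referred to in the text).
BASES = "TCAG"

# Amino acids in NCBI codon order (TTT, TTC, TTA, TTG, TCT, ...).
# Differences from the standard code: TGA = W, AAA = N, AGA/AGG = S.
TABLE_9 = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNNKSSSS"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    b1 + b2 + b3: TABLE_9[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases are reported as 'X'. Alternative
    initiation codons (e.g. GTG) are translated literally, not as Met.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Toy ORF (hypothetical): ATG AAA TGA AGA TAA
    print(translate("ATGAAATGAAGATAA"))
```

Under the standard code the toy ORF would terminate at TGA; under table 9 it is read through as Trp, which is why the choice of genetic code matters when inferring gene boundaries.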

Sliding window analysis of nucleotide variation
Sequence variability between H. conoideum and F. hepatica was assessed by sliding window analysis using the software DnaSP v.5 [21]. The sliding window analysis was implemented as described previously [22].
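The idea behind a sliding window analysis of two aligned sequences can be sketched as follows. This is not the DnaSP implementation; it is a simplified illustration in which, for a pair of sequences, nucleotide diversity (π) in a window is taken as the proportion of compared sites that differ. The window length (300 bp) and step size (10 bp) are assumed defaults chosen for illustration, as the values used in the study are not stated here, and the function name `sliding_window_pi` is hypothetical.

```python
# Sketch only: pairwise nucleotide diversity in sliding windows.
# Window and step defaults are assumptions, not values from the study.

def sliding_window_pi(seq_a, seq_b, window=300, step=10):
    """Return (window midpoint, pi) tuples for two aligned sequences.

    Alignment columns with a gap ('-') in either sequence are skipped.
    A window with no comparable sites yields NaN.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        compared = 0
        differences = 0
        for x, y in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if x == "-" or y == "-":
                continue
            compared += 1
            if x.upper() != y.upper():
                differences += 1
        pi = differences / compared if compared else float("nan")
        results.append((start + window // 2, pi))
    return results


if __name__ == "__main__":
    # Toy alignment (hypothetical): identical first half, divergent second half.
    a = "AAAAAAAAAA"
    b = "AAAAACCCCC"
    for midpoint, pi in sliding_window_pi(a, b, window=5, step=5):
        print(midpoint, pi)
```

Plotting π against the window midpoint, with gene boundaries overlaid, gives the kind of profile used to compare variability among genes.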
Overlapping nucleotides between the mt genes of H. conoideum ranged from 1 to 40 bp (Table 2), which is the same as for other trematodes, such as F. hepatica [14] and O. felineus [22]. The mt genome of H. conoideum has 26 intergenic spacers, each at least 1 bp in length (Table 2). The nucleotide contents in the mt genome are: 18.92% (A), 11.71% (C), 42.46% (T) and 26.91% (G). The A + T content of protein-coding genes and rRNA genes ranged from 59.65% (rrnS) to 68.63% (nad3) (Table 3), and the overall A + T content of the mt genome is 61.4%.
Table 2 footnotes: The inferred length of amino acid sequence of 12 protein-coding genes: (1) number of amino acids; (2) initiation and termination codons; (3) intergenic nucleotides; (4) initiation or termination positions of ribosomal RNAs defined by adjacent gene boundaries.
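The nucleotide composition reported above can be checked arithmetically, and the same calculation applied to any sequence, with a short sketch. The percentages are those given in the text; the function names `base_composition` and `at_content` and the toy sequence are illustrative assumptions.

```python
# Sketch only: base composition and A + T content of a nucleotide sequence.
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T, ignoring any other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq):
    """Return the combined percentage of A and T."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]


# Composition reported in the text for the H. conoideum mt genome (%).
reported = {"A": 18.92, "C": 11.71, "T": 42.46, "G": 26.91}
reported_at = round(reported["A"] + reported["T"], 2)

if __name__ == "__main__":
    print(reported_at)            # A + T from the reported percentages
    print(at_content("AATTCG"))   # toy sequence (hypothetical)
```

The reported A and T percentages sum to 61.38%, consistent with the overall A + T content of 61.4% stated in the text.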

Transfer RNA and ribosomal RNA genes, and non-coding regions
The H. conoideum mt genome encodes 22 tRNAs; all of them have a typical cloverleaf structure. The length of the 22 tRNA genes ranges from 60 bp to 75 bp (Table 2). There are intergenic and overlapping nucleotides between adjacent tRNA genes (Table 2). The rrnS and rrnL are 751 bp and 979 bp in length, respectively (Table 2). The location of rrnS is between tRNA-Cys and cox2, and that of rrnL is between tRNA-Thr and tRNA-Cys, which is the same as in other trematodes. In contrast to some other trematodes that have two AT-rich regions, such as F. hepatica and F. gigantica [14,23], O. felineus [22] and S. haematobium [24], there is only one AT-rich region (348 bp) in the mt genome of H. conoideum, which is located between tRNA-Glu and cox3 (Figure 1 and Table 2), with an A + T content of 60.19% (Table 3).

A comparison of nucleotide variability between H. conoideum and F. hepatica
A sliding window analysis of H. conoideum and F. hepatica using complete mt genomes showed the nucleotide diversity Pi (π) for 12 protein-coding genes (Figure 2). It indicated that the highest level of the mt sequence variability was within the gene atp6, and the lowest was within nad5. In our study, the most conserved protein-coding genes are cox1, nad2 and nad5, and the least conserved are atp6 and nad3.

Phylogenetic relationships
We used concatenated amino acid sequence data representing 12 mt protein-coding genes of H. conoideum, eight other digeneans (C. sinensis, F. gigantica, F. hepatica, O. felineus, S. haematobium, S. japonicum, S. mekongi and S. spindale) and one tapeworm (T. solium) for a selective analysis of genetic relationships (Figure 3). The tree reveals two large clades with strong support (100%): one contains four members representing two families (Fasciolidae and Opisthorchiidae) and H. conoideum; the other clade contains four members of the Schistosomatidae. In the present analysis, H. conoideum had a relatively close genetic relationship with F. hepatica and other members of the Fasciolidae, followed by Opisthorchiidae, and then the Schistosomatidae. There was no difference in tree topology using the ML, MB and MP methods of analysis (not shown).
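The concatenation step that precedes such an analysis can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: per-gene amino acid alignments are joined taxon by taxon into one supermatrix, and an uncorrected pairwise distance (p-distance) is computed as a crude indicator of relatedness. The taxon labels, the three-residue toy alignments and the function names `concatenate` and `p_distance` are all hypothetical.

```python
# Sketch only: concatenating per-gene alignments and computing p-distances.
# All sequences below are invented toy data, not real mt protein sequences.

def concatenate(alignments, taxa):
    """Join per-gene alignments (dicts of taxon -> aligned sequence).

    A taxon missing from a gene alignment is padded with gap characters.
    """
    parts = {taxon: [] for taxon in taxa}
    for gene in alignments:
        length = len(next(iter(gene.values())))
        for taxon in taxa:
            seq = gene.get(taxon, "-" * length)
            if len(seq) != length:
                raise ValueError("sequences within a gene must be aligned")
            parts[taxon].append(seq)
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with gaps."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    taxa = ["Hc", "Fh", "Ts"]  # hypothetical labels for three taxa
    toy_cox1 = {"Hc": "MKV", "Fh": "MKI", "Ts": "MRI"}
    toy_nad1 = {"Hc": "LLS", "Fh": "LLS", "Ts": "FLT"}
    matrix = concatenate([toy_cox1, toy_nad1], taxa)
    print(matrix["Hc"], p_distance(matrix["Hc"], matrix["Fh"]))
```

A distance matrix of this kind is the input to distance-based tree building; model-based methods instead work directly on the concatenated alignment.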

Discussion
The present characterization of the mt genome of H. conoideum provides a basis for addressing questions regarding the biology, epidemiology and population genetics of Hypoderaeum spp. In addition, it will also assist in supporting taxonomic studies of Hypoderaeum spp. of other animals (e.g., chickens, ducks, geese and humans) as well as in tracking life cycles by identifying larval stages in different intermediate hosts using molecular tools. Assisted by sliding window analysis, PCR primers could be selectively designed to regions conserved among different trematode species and flanking variable regions in the mt genome that are informative (based on sequencing from a small number of individuals from particular populations). PCR-coupled single-strand conformation polymorphism (SSCP) analysis [29] could then be employed to screen large numbers of individuals representing different populations and, based on such an analysis, samples representing all detectable genetic variability could be selected for subsequent sequencing and analyses. Such an approach has been applied to study the genetic make-up of the blood fluke S. japonicum from seven provinces in China [30,31]. Now that the H. conoideum mt genome is available, it would be interesting to undertake a comprehensive study of this morphospecies from various host species from different countries by integrating morphological data with PCR-based genetic analyses of adult worms and larval stages (from intermediate hosts) to begin to understand the epidemiology and ecology of H. conoideum.
Figure 3 Phylogenetic relationship of Hypoderaeum conoideum with selected trematodes, based on concatenated amino acid sequence data representing 12 protein-coding genes by neighbor-joining analysis, using Taenia solium as an outgroup. Nodal support values are indicated (%); the bar indicates amino acid substitution per site.
In addition to conducting targeted mt genetic analyses, it would also be useful to include analyses of sequence variability in the two internal transcribed spacers (ITS-1 and ITS-2), 18S and 28S of nuclear ribosomal DNA, because these markers usually allow specific identification of trematodes. Importantly, although H. conoideum is recognized as a species, it is possible that cryptic species of this taxon might exist. This proposal could be tested using the mt markers defined here, together with ITS-1 and/or ITS-2.

Conclusions
Our analysis showed that H. conoideum is genetically more closely related to F. hepatica than to the other trematodes included in the analysis. The mt genome of H. conoideum should be useful as a resource for comparative mt genomic studies of trematodes and for DNA markers for systematic, population genetic and epidemiological studies of H. conoideum and congeners.