The complete mitochondrial genome of the gullet worm Gongylonema pulchrum: gene content, arrangement, composition and phylogenetic implications

Gongylonema pulchrum (Nematoda: Gongylonematidae), a thread-like spirurid gullet worm, infects a range of mammalian definitive hosts, including cattle, pigs, equines, goats, primates and humans, and can cause gongylonemiasis. In the present study, the complete mitochondrial (mt) genome of G. pulchrum was obtained using Long-range PCR and subsequent primer walking. The phylogenetic position of G. pulchrum within the Spiruromorpha was established using Bayesian analyses of the protein-coding genes at the amino acid level. The length of this AT-rich (75.94%) mt genome is 13,798 bp. It contains 12 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and one non-coding region. The gene arrangement is the same as those of Thelazia callipaeda (Thelaziidae) and Setaria digitata (Onchocercidae), but distinct from that of Heliconema longissimum (Physalopteridae). Phylogenetic analyses, based on the concatenated amino acid sequence data for all 12 protein-coding genes using Bayesian inference (BI) method, showed that G. pulchrum (Gongylonematidae) was more closely related to Spirocerca lupi (Spiruroidea) than other members of the infraorder Spiruromorpha. The present study represents the first mt genome sequence for the family Gongylonematidae, which provides the opportunity to develop novel genetic markers for studies of epidemiology, population genetics and systematics of this nematode of human and animal health significance.

markers in portion of mitochondrial (mt) DNA and the internal transcribed spacer (ITS) regions of nuclear ribosomal DNA (rDNA), have found utility for taxonomic and epidemiological studies of G. pulchrum [8], there is still a paucity of information on G. pulchrum in different hosts and countries around the world.
Due to its maternal inheritance, fast rate of evolutionary change, lack of recombination and relatively conserved genome structures [9], mt genomes have been widely used as genetic markers for population genetic structure and the study of phylogenetic relationships among nematodes and trematodes [10]. In particular, concatenated amino acid sequences, derived from the protein-coding genes, lend themselves for assessing systematic relationships of parasitic nematodes [11][12][13][14][15][16][17][18][19]. The objectives of the present study were to characterize the mt genome of G. pulchrum, the first representative of the family Gongylonematidae, and to assess the phylogenetic position of this zoonotic nematode in relation to other nematodes of the infraorder Spiruromorpha.

Ethics statement
The performance of this study was strictly according to the recommendations of the Guide for the Care and Use of Laboratory Animals of the Ministry of Health, China, and our protocol was reviewed and approved by the Research Ethics Committee of Northwest A&F University.

Parasite collection
Adult specimens of G. pulchrum were collected from the oesophagus of a naturally infected goat in Shenmu, Shaanxi province, China, with no specific permits being required by the authority for the sample collection.

Genomic DNA extraction
The gullet worms were washed extensively in physiological saline, identified morphologically to species according to existing keys and descriptions [20], fixed in ethanol and then stored at −20°C until use. The midbody section of each worm was used for the isolation of total genomic DNA using proteinase K digestion and mini-column purification (TIANamp Genomic DNA Purification System, TIANGEN). The identity of the specimen was verified by sequencing regions of the ITS-1 and ITS-2 rDNA using an established method [8]; the two regions were 99.7% and 100% identical to previously published sequences for G. pulchrum from Bos taurus in Iran (GenBank accession nos. AB495392 and AB513721, respectively).

Long-PCR and sequencing
Using three pairs of specific primers (Table 1) designed according to relatively conserved regions within the cox1, cox2 and cox3 regions of nematodes within order Spirurata (Figure 1), three overlapping amplicons of the complete mt genome were amplified by Long-PCR [21]. PCRs were conducted in 25 μl reaction volumes containing 2 mM MgCl 2 , 0.2 mM each of dNTPs, 2.5 μl 10× Taq buffer, 2.5 μM of each primer and 0.5 μl LA Taq DNA polymerase (5 U/μl, TaKaRa). PCR cycling conditions were as follows: 92°C for 2 min (initial denaturation), then 92°C for 10 s (denaturation), 45°C for 30 s (annealing), and 60°C for 8 min (extension) for 9 cycles, followed by 92°C for 10 s, 45°C for 30 s (annealing), and 60°C for 9 min (extension) for 25 cycles, with a cycle elongation of 10 s for each cycle and a final extension at 60°C for 10 min. No-template and known-positive controls were included in each run. Amplicons were column-purified using TIANgel Midi Purification Kit (TIANGEN, Beijing, China). Following an electrophoretic analysis of quality, purified amplicons were sequenced using a primer walking strategy [21] with primers listed in Additional file 1 by Invitrogen Company (Shanghai, China).

Sequence analyses
Sequences were assembled manually and aligned against the complete mt genome sequences of Spirocerca lupi [22] using the computer program MAFFT 7.122 [23] to identify gene boundaries. Each gene was translated into its amino acid sequence using the invertebrate mitochondrial genetic code in MEGA 5 [24]. The translation initiation and termination codons were identified to avoid gene overlap and to optimize the similarity between the gene lengths of closely related species of the infraorder Spiruromorpha. The program tRNAscan-SE [25] was used to find tRNA and infer their secondary structure, putative secondary structures of 19 tRNA genes were identified, and the remaining three tRNA genes (tRNA-Arg, tRNA-Ser AGN and tRNA-Ser UCN ) were inferred by recognizing potential secondary structures and anticodon sequences by eye [26]. Two rRNA genes were predicted by comparison with those of closely related nematodes of the infraorder Spiruromorpha [15,22].

Phylogenetic analyses
Amino acid sequences inferred from published mt genomes representing 12 species of the infraorder Spiruromorpha (Brugia malayi: GenBank accession no. NC_004298; Wuchereria bancrofti: JN367461; Chandlerella quiscali: NC_014486; Loa loa: NC_016199; Acanthocheilonema viteae: NC_016197; Onchocerca flexuosa: NC_016172; Onchocerca volvulus: AF015193; Dirofilaria immitis: NC_005305; Setaria digitata: NC_014282; S. lupi: KC305876; Thelazia callipaeda: JX069968; Heliconema longissimum: NC_016127) were included in the present analysis, using Toxascaris leonina (NC_023504) as the outgroup [27]. 12 amino acid sequences were separately aligned using MAFFT 7.122 and then concatenated, with ambiguously aligned regions excluded using Gblocks 0.91b (doc) [28] with the default parameters using the options for a less stringent selection. Phylogenetic analyses were conducted using Bayesian inference (BI) method. The JTT+G+F model of amino acid evolution was selected as the most suitable model of evolution by ProtTest 2.4 [29] based on the Akaike information criterion (AIC). As the JTT model is not implemented in the current version of MrBayes, an alternative model, CpREV, was used in BI and four chains (three heated and one cold) were run simultaneously for the Monte Carlo Markov Chain. Two independent runs for 1,000,000 metropolis-coupled MCMC generations were used, sampling a tree every 100 generation in MrBayes 3.1.1 [30]. At the end of each run,  Figure 1 Organization of the mitochondrial genome of Gongylonema pulchrum. Scale is approximate. All genes have standard nomenclature except for the 22 tRNA genes, which are designated by the one-letter code for the corresponding amino acid, with numerals differentiating each of the two leucine-and serine-specifying tRNAs (L 1 and L 2 for codon families CUN and UUR, respectively; S 1 and S 2 for codon families UCN and AGN, respectively). All genes are transcribed in the clockwise direction. 'AT' indicates the non-coding region.
the average standard deviation of split frequencies was less than 0.01. In addition, the potential scale reduction factor (~1) was examined to ensure that the convergence had been achieved. A 50% majority rule consensus tree was obtained from BI. Of 10,000 trees, the first 2,500 trees represented burn-in and the remaining trees were used to calculate Bayesian posterior probabilities (Bpp). Phylograms were drawn using the program FigTree v.1.4 [31].

Results and discussion
The complete mt genomic sequence of G. pulchrum (GenBank accession no. KM264298) was 13,798 bp in size ( Figure 1). It contains 12 protein-coding genes (cox1-3, nad1-6, nad4L, atp6 and cytb), two ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes and one non-coding (control or AT-rich) region, but lacks the atp8 gene ( Table 2). The gene content and arrangement are the same as those of T. callipaeda (Thelaziidae) [15] and S. digitata (Onchocercidae) [32], but distinct from those of D. medinensis (Dracunculidae) and H. longissimum (Physalopteridae) [12]. All genes are transcribed in the same direction. In addition, the mt genome of G. pulchrum has 24 intergenic regions, ranging from 1 to 78 bp in length. The longest region (78 bp) is between tRNA-Pro and tRNA-Asp genes ( Table 2). The nucleotide content of the entire mt genome sequence of G. pulchrum is biased toward A+T (75.94%), in accordance with mt genomes of other spirurid nematodes (Table 3). AT-and GC-skews of the whole mt genome were calculated for G. pulchrum and other spirurid nematodes studied to date ( Table 4). The composition of the mt genome sequence of G. pulchrum was strongly skewed towards T (AT skew = −0.413), and G (GC skew was 0.448) ( Table 3). The most common initiation codon for G. pulchrum is TGG, followed by ATG, ATT, GTT and GTG (4, 3, 2, 2, 1 genes, respectively; Table 2). The most frequent complete termination codon is TAA (4 genes); nad4 is terminated with the codon TAG. Of the remaining genes, nad1, nad3, nad5, atp6 and cox2 are terminated with the abbreviated stop codon T, whereas cytb and nad2 are terminated with the abbreviated stop codon TA. This is consistent with the arrangement in the mt genomes of other nematodes [27,[33][34][35].
Twenty-two tRNA genes were predicted from the mt genome of G. pulchrum and varied from 52 to 59 bp in length. Twenty of the 22 tRNA genes (excluding two tRNA-Ser) have a predicted secondary structure with a 3-5 bp DHU arm and a DHU loop of 7-9 bases, in which the variable TψC arm and loop are replaced by a "TV-replacement loop" of 8-10 bases. As seen in almost all other nematode mtDNAs [36], the tRNA-Ser gene of G. pulchrum mt genome is equipped with a TψC arm and loop but lacks the DHU arm and loop, consisting of a 6-8 bp TψC arm, TψC loop of 4-6 bases and a variable loop of 4 bases. The majority of nematode mtDNA sequences usually contain two non-coding regions with significant size difference [37], but there is only one non-coding region (AT-rich region) in the mt genome of G. pulchrum which is located between cox3 and tRNA-Ala ( Figure 1 and Table 2), with 81.11% of A+T content (Table 3). Furthermore, this region is devoid of consecutive sequences of [A] and [T], and there are no AT dinucleotide repeat sequences; these repeat regions have been reported in the mt genome of A. simplex s.l. and S. digitata [32,34].
Identification and differentiation of G. pulchrum has traditionally been based on morphological features. However, these criteria are often insufficient for specific identification and differentiation, particularly at the larval and/or egg stages. Molecular tools, using genetic markers in mt  cox1 and ITS-1 region of nuclear rDNA, have been used to support clinical diagnosis and to assist in undertaking molecular identification and epidemiological investigations of G. pulchrum [8]. Because sequence polymorphism (heterogeneity) in ITS rDNA sequences occurs within individual spirurid specimens [38], mt genome sequences appear to be better solutions for such studies. Additionally, the cox1 sequences of G. pulchrum were further divided into multiple haplotypes and two groups of haplotypes (i.e. those from a majority of sika deer, wild boars and Japanese macaques and those from cattle and zoo animals, were clearly differentiated) [8]. Nonetheless, the cox1 is a relatively conserved mt gene in nematodes [13,16,39], and, to date, there is no genetic information for G. pulchrum from other mt genes. Phylogenetic analyses of G. pulchrum with related nematodes of the infraorder Spiruromorpha were performed  by BI based on concatenated mitochondrial amino acid sequences of 12 protein-coding genes ( Figure 2). Gongylonema pulchrum (Gongylonematidae) formed the sister group to Spirocerca lupi. Together they formed the sister group to a clade composed of Setariidae and Onchocercidae. Thelazia callipaeda (Thelaziidae) took an early diverging position to the above mentioned taxa, whereas Heliconema longissimum (Physalopteridae) and Toxascaris leonina took an unresolved position at the root of the tree (Figure 2). Many studies have indicated that the mtDNA sequence is a valuable genetic marker for phylogenetic studies [11,12,14,17]. The mt genome sequence of G. pulchrum could promote to reassess the systematic relationships within the spirurid nematodes using mt genomic datasets. Over the last decades, there have been considerable debate concerning the systematics of members of the spirurid nematodes (including superfamilies Filarioidea, Physalopteroidea, Spiruroidea and Thelazoidea) [40]. Some studies using nuclear small subunit (SSU) rDNA and mtDNA sequences have indicated that S. lupi (Spirocercidae) is the sister taxon of T. callipaeda (Thelaziidae), suggesting that S. lupi belongs to the superfamily Thelazioidea [41,42], but this finding was contradicted by other studies that used the same markers [43,44]. The results of the present study support that the Spirocercidae (represented by S. lupi) was more closely related to the family Gongylonematidae (represented by G. pulchrum) than to other families within Spiruromorpha, indicating that both families Spirocercidae and Gongylonematidae belong to superfamily Spiruroidea, consistent with conclusions of a previous study [40]. Given this utility of mt genomic datasets, further work should include sequencing of mt genomes of other spirurid nematodes in order to reconstruct the phylogenetic relationships of spirurid nematodes.

Conclusions
The present study determined the complete mt genome sequence of G. pulchrum, and ascertained its phylogenetic position within the infraorder Spiruromorpha. The complete mt genome represents the first sequenced mt genome of any member of the family Gongylonematidae. It will provide an important resource for the design of novel primers for the study of epidemiology, population genetics and systematics of Gongylonematidae.