Mitochondrial genome of Babesia orientalis, apicomplexan parasite of water buffalo (Bubalus babalis, Linnaeus, 1758) endemic in China

Background Apicomplexan parasites of the genus Babesia, Theileria and Plasmodium are very closely related organisms. Interestingly, their mitochondrial (mt) genomes are highly divergent. Among Babesia, Babesia orientalis is a new species recently identified and specifically epidemic to the southern part of China, causing severe disease to water buffalo. However, no information on the mt genome of B. orientalis was available. Methods Four pairs of primers were designed based on the full genome sequence of B. orientalis (unpublished data) and by aligning reported mt genomes of B. bovis, B. bigemina, and T. parva. The entire mt genome was amplified by four sets of PCR. The obtained mt genome was annotated by aligning with published apicomplexan mt genomes and Artemis software v11. Phylogenetic analysis was performed by using cox1 and cob amino acid sequences. Results The complete mt genome of B. orientalis (Wuhan strain) was sequenced and characterized. The entire mt genome is 5996 bp in length with a linear form, containing three protein-coding genes including cytochrome c oxidase I (cox1), cytochrome b (cob) and cytochrome c oxidase III (cox3) and six rRNA large subunit gene fragments. The gene arrangement in B. orientalis mt genome is similar to those of B. bovis, B. gibsoni and Theileria parva, but different from those of T. orientalis, T. equi and Plasmodium falciparum. Comparative analysis indicated that cox1 and cob genes were more conserved than cox3. Phylogenetic analysis based on amino acid sequences of cox1, cob and cox1 + cob, respectively, revealed that B. orientalis fell into Babesia clade with the closest relationship to B. bovis. Conclusions The availability of the entire mt genome sequences of B. orientalis provides valuable information for future phylogenetic, population genetics and molecular epidemiological studies of apicomplexan parasites.


Background
Mitochondria are essential organelles within cells and are responsible for energy transduction, metabolism, cell growth and survival [1][2][3]. Inside mitochondria, there is a genome called mitochondrial (mt) genome. Mt genomes are present in almost all eukaryotic cells and have remarkable variations in size, structure, and organization [4][5][6].
The largest mt genome has been found in muskmelons with an estimated size of 2400 kb [7][8][9]. The smallest mt genome of only 6 kb in length has been reported in an apicomplexan parasite (Plasmodium) [10,11].
The structure of mt genome contains two major types, the linear form and the circular form. The circular forms are usually present in animal mt genomes with the size ranging from 15 kb to 20 kb, containing 12-13 proteincoding genes, 22 transfer RNA (tRNA) genes and two ribosomal RNA (rRNA) genes, and gene arrangements in the genomes are extremely stable [12]. The linear forms have been documented in many apicomplexan parasites, including Plasmodium, Babesia, Theileria and Eimeria [5,10,11,13]. Compared with animal mt genomes, the mt genomes of apicomplexan parasites encode only three protein-coding genes (cytochrome c oxidase subunits I [cox1] and III [cox3], and cytochrome b [cob]) and six fragments of large subunit rRNA genes [14]. The gene arrangements are also different among animal mt genomes and apicomplexan parasites mt genome.
Meanwhile, in apicomplexan parasites, mitochondrial protein-coding genes have been extensively used as genetic markers for phylogenetic analysis at different taxonomic levels, serving as an ideal model for gene rearrangement, and evolutionary studies [13,15]. Most of the phylogenies of apicomplexan parasites were constructed using cox1 or cob alone, however, in some cases, a combination of cox1 and cob was employed to evaluate the phylogenetic relationships [13,16]. In addition, mt genome sequences are also very valuable for population genetic studies [17][18][19] as reported in Plasmodium vivax, Plasmodium knowlesi, Trypanosoma cruzi [20][21][22].
B. orientalis is a tick-borne, intra-erythrocytic protozoan parasite causing buffalo babesiosis characterized by fever, anemia, icterus, haemoglobinuria and high mortality [23,24]. This species was first reported in 1987 and then identified as a new species named Babesia orientalis in 1997 [25]. The new species was discovered initially based on the differences in morphology, transmission, pathogenicity and endemic areas, compared to Babesia bigemina and Babesia bovis [25], and later confirmed by the phylogenetic analysis based on 18S rRNA and heat shock protein 70 (HSP70) genes [26,27]. The disease caused by B. orientalis is one of the most important parasitic diseases of buffalo in central and south China, resulting in enormous economic losses [27,28]. In spite of its importance, very limited information was available about this parasite, especially at the molecular level, including mt genome sequences and structures.
In the present study, B. orientalis (Wuhan strain) mt genome was determined and annotated. The structure was characterized and compared with those of related species. In addition, the evolution of structural divergence in the apicomplexan mt genomes was discussed.

Parasite cultivation
Babesia orientalis (Wuhan strain) was cultivated according to the protocol of He et al. [29]. In brief, two, 1year-old water buffalo, free of B. orientalis infection as confirmed by microscopy and real-time PCR [29], were splenectomized 14 days prior to B. orientalis infection. Each buffalo was subcutaneously injected with 4 ml of B. orientalis-infected blood (Wuhan strain, percentage parasitized erythrocytes, PPE 1%). Blood samples were collected everyday to monitor the parasitemia until PPE reached 3%.
All the experimental animals were housed, fed and given clean drinking water in accordance with the stipulated rules for the regulation of the administration of affairs concerning experimental animals of P.R. China.

B. orientalis mitochondria DNA sequencing
The blood from experimentally infected buffalo was collected in EDTA. Parasite genomic DNA was extracted from B. orientalis-infected blood using QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. The terminal inverted repeat (TIR) regions in the beginning and end of B. orientalis mt genome, 1-181 bp and 5835-5996 bp, were obtained by using B. bovis mt genome sequence to blast the full genome sequence of B. orientalis (unpublished data). Four pairs of primers were designed based on these two mtDNA sequences (1-181 bp and 5835-5996 bp) of B. orientalis and by aligning reported mt genomes of B. bovis (EU075182 and AB499088), B. bigemina (AB499085), and T. parva (Z23263 and AB499089) ( Table 1 and Figure 1). The entire mt genome was amplified by P1 and R4. In order to confirm the mt genome, four sets of PCR were processed using primers P1 and R1, P2 and R2, P3 and R3, P4 and R4, respectively ( Figure 1). Amplified products were purified and then ligated into the pMD19-T vector (TaKaRa Biotechnology), and the recombinant clones were sequenced using the ABI PRISM 377 DNA sequencer by following the manufacturer's instructions. The vector primers M13 (-47) and M13 (-48), as well as PCR primers, were used for the sequencing of mt genome.

Gene annotation and sequence analysis
Nucleotide sequences of B. orientalis mt genome (Gen-Bank accession no. KF218819) were aligned with published mt genome sequences of Plasmodium falciparum (M76611), B. bovis (AB499088), T. parva (AB499089), and T. annulata (NT167255) by MAFFT (version 7) [30,31] with manual corrections. Protein-coding genes were predicted by comparing the previously annotated sequences from these four related species. To identify putative rRNA genes, mt DNA sequences or annotated rRNA gene fragments from the four related species were used as queries by pair-wise comparison (Blastn) in NCBI. The entire B. orientalis mt genome was subjected to tRNAscan SE 1.21 (http://lowelab.ucsc.edu/tRNAscan-SE/) using Mito/Chloroplast model option and Nematode Mito model for analyzing the existence of tRNA gene. Results from both models were compared, final annotation was determined according to B. bovis mt genome annotation. B. orientalis mt genome was also annotated using appropriate mitochondrial codes in Artemis software v11 [32,33]. Nucleotide sequence identities were calculated by pairwise comparison between 13 apicomplexan parasites including B. orientalis, another six Babesia species, five Theileria species and P. falciparum for each of the three protein-encoding genes. The data sets of cox1 (472 aa), cox3 (229 aa) or cob (358 aa) were aligned using MAFFT (version 7) employing the FFT-NS-i algorithm [31,34]. The alignment was manually edited using BioEdit 7.1.11 [35]. The nucleotide identities were determined through BioEdit 7.1.11.

Phylogenetic analysis
The concatenated amino acid sequences of cox1, cox3 and cob from 15 apicomplexan parasites (Table 2) were used for similarity analysis. For phylogenetic analysis, cox3 was not included, because cox3 has been presented

Characterization of B. orientalis mt genome
The full length mt genome of B. orientalis was amplified by P1 and R4, and a 5996 bp fragment was obtained. In order to confirm the mt genome sequence, four overlapping fragments were amplified by using primer sets P1-R1, P2-R2, P3-R3 and P4-R4 with expected sizes of 1749 bp, 1655 bp, 1831 bp and 838 bp, respectively. These four overlapping fragments covered the entire genome of B. orientalis (Figure 1). Each of the PCR products were then cloned into pMD19-T vector, and sequenced. A 5996 bp mt genome sequence of B. orientalis (Wuhan strain) was obtained by assembling all the sequenced fragments.
Sequence analysis indicated that B. orientalis mt genome was arranged in a linear form. It containing three protein-coding genes, cox1 (cytochrome c oxidase I), cob (cytochrome b), cox3 (cytochrome c oxidase III) and six large subunit (LSU) rRNA gene fragments, but not any tRNA genes, which is consistent with those of other apicomplexan parasites studied to date (Figures 1 and 2) [11]. The mt genomes of apicomplexan parasites usually contain terminal inverted repeat (TIR) sequences with the size of around 440-450 bp [5]. However, in the mt genome of B. orientalis, 181 bp and 161 bp TIRs were identified from the beginning and the end, respectively ( Figure 1). cox1, the first and fourth rRNA large subunit fragments (L1 and L4) are encoded by one strand of the mt genome, whereas cox3, cob, the second, the third, fifth and sixth fragments of large subunit rRNA genes (L2, L3, L5 and L6) are encoded by another strand   ( Figure 2a). The arrangement and predicted transcriptional direction of three protein-coding genes are the same as that of B. bovis (Figure 2b), B. gibsoni (not shown), and Theileria parva (Figure 2c), however, it greatly differed from that of T. orientalis, T. equi (not shown) and P. falciparum (Figure 2d) [5].
In this study, BLAST analysis of the entire mt genome sequence of B. orientalis to NCBI databases revealed that the mt genome sequence of B. orientalis was most similar to those of B. bovis (EU075182 and AB499088), with an identity of 87%. The other related species were B. caballi, B. bigemina and B. gibsoni, showing their mt genome sequence identities of 86%, 85% and 84% with that of B. orientalis, respectively. However, nucleotide identities were greatly different for the sequences of three protein-coding genes cox1, cox3 and cob, when compared among different species. Pairwise comparison has been carried out among 13 different apicomplexan species including B. orientialis, another six Babesia species, five Theileria species and P. falciparum (see Table 3). The nucleotide sequence differences among 13 species ranged from 38.7% to 99.8% for cox1 (Table 3a), from 28.8% to 92.5% for cox3 (Table 3b) and 32.7% to 99.9% for cob (Table 3c), respectively. These results indicated that cox1 and cob genes were more conserved than cox3 gene. The different extent of conservation in the protein-coding genes suggested that mt genome sequences could be used as a gene marker for population evolutionary studies.

Phylogenetic analysis
The majority of phylogenetic studies in the phylum apicomplexa have utilized 18S rRNA genes [27,42], which allow the analysis of the ancient relationships or strains and species differentiation by focusing on the highly conserved regions defining the critical secondary structure, or the more variable internal transcribed spacer regions, respectively. Because of different extent of conservation of three protein-coding genes and important function of mt genome, the extent of sequence difference might reflect the phylogenetic relationship across species of apicomplexan parasites. So the mitochondrion sequences may provide an alternative approach to conduct these studies [5,10,13].
To analyze the phylogenetic relationship of B. orientalis with other apicomplexan parasites, phylogenetic trees were constructed with the amino acid sequences of cox1, cob and cox1 + cob using neighbor-joining (NJ), maximum likelihood (ML), maximum parsimony (MP) and Bayesian phylogenetic methods. The trees obtained from three data sets by different methods were consistent with no significant changes in the topology or in the bootstrap values (Figures 3a,b, and c). The NJ trees constructed with cox1, cob and cox1 + cob sequences have been presented as a representative (Figure 3). All three NJ trees displayed the same topology with high bootstrap values. B. orientalis appeared in the Babesia clade, and it's most close to B. bovis. However, the bootstrap value was higher in the tree constructed by cox1 + cob as compared to those of cox1 and cob. These results suggested that the combined amino acid sequences of cox1 and cob may be more reliable in studying evolutionary relationships than the sequences of single gene. The relationship of B. orientalis with other apicomplexan parasites revealed by all trees from mt genome sequences were consistent with that from the previous phylogenetic trees based on 18S rRNA and heat shock protein 70 (HSP70) gene [26,27]. These results demonstrated that the mt genome sequences are useful for the phylogenetic studies of apicomplexan parasites.

Conclusion
In this study, we first reported the 5996 bp linear mitochondrion genome of B. orientalis. This mt genome contains 6 LSU rRNA gene fragments and three proteincoding genes, but no tRNA gene. Gene arrangement in the mt genome of B. orientalis is similar to those of B. bovis, B. gibsoni, and T. parva, but different from those of T. orientalis, T. equi and P. falciparum. Phylogenies based on the amino acid sequences of cox1 or cob alone and cox1 + cob combined all indicated that B. orientalis is closest to B. bovis, which is in agreement with the previous phylogenetic studies of B. orientalis. The availability of the entire mt genome sequences of B. orientalis provides valuable information for future phylogenetics, population genetics and molecular epidemiological studies of apicomplexan parasites.