Analysis of kinetoplast cytochrome b gene of 16 Leishmania isolates from different foci of China: different species of Leishmania in China and their phylogenetic inference

Background Leishmania species belong to the family Trypanosomatidae and cause leishmaniasis, a geographically widespread disease that infects humans and other vertebrates. This disease remains endemic in China. Due to the large geographic area and complex ecological environment, the taxonomic position and phylogenetic relationship of Chinese Leishmania isolates remain uncertain. A recent internal transcribed spacer 1 and cytochrome oxidase II phylogeny of Chinese Leishmania isolates has challenged some aspects of their traditional taxonomy as well as cladistics hypotheses of their phylogeny. The current study was designed to provide further disease background and sequence analysis. Methods We systematically analyzed 50 cytochrome b (cyt b) gene sequences of 19 isolates (16 from China, 3 from other countries) sequenced after polymerase chain reaction (PCR) using a special primer for cyt b as well as 31 sequences downloaded from GenBank. After alignment, the data were analyzed using the maximum parsimony, Bayesian and netwok methods. Results Sequences of six haplotypes representing 10 Chinese isolates formed a monophyletic group and clustered with Leishmania tarentolae. The isolates GS1, GS7, XJ771 of this study from China clustered with other isolates of Leishmania donovani complex. The isolate JS1 was a sister to Leishmania tropica, which represented an L. tropica complex instead of clustering with L. donovani complex or with the other 10 Chinese isolates. The isolates KXG-2 and GS-GER20 formed a monophyletic group with Leishmania turanica from central Asia. In the different phylogenetic trees, all of the Chinese isolates occurred in at least four groups regardless of geographic distribution. Conclusions The undescribed Leishmania species of China, which are clearly causative agents of canine leishmaniasis and human visceral leishmaniasis and are related to Sauroleishmania, may have evolved from a common ancestral parasite that came from the Americas and may have split off earlier than the other old world Leishmania. Our results also suggest the following: the isolates GS7, GS1 and XJ771 occur as part of the L. donovani complex; the JS1 isolate is L. tropica; and the isolate GS-GER20 identified as Leishmania gerbilli is close to KXG-2 which is L. turanica.


Background
The leishmaniases are a group of vector borne diseases that are caused by flagellate of the genus Leishmania, which is transmitted by the bite of the sandfly, and affect as many as 12 million people worldwide with 1.5-2 million new cases each year in 88 countries [1]. The genus Leishmania consists of nearly 30 species of morphologically similar kinetoplastid protozoa, and approximately 20 of these species are responsible for a spectrum of human diseases that ranges from mild to fatal infections [2,3].
The complexity of the taxonomy and phylogenetic relationships of the Chinese Leishmania was due to the extensive geographic area and complex ecological environment. Identification of species responsible for different leishmaniasis and clinical manifestation remains uncertain. The strains from cutaneous leishmaniasis (CL) in Xinjiang Uygur Autonomous Region (Xinjiang) especially in Karamay is closely related to L. tropica with analysis of SSU rDNA gene [11,12], whereas the pathogen identified as Leishmania infantum [13,14] or Leishmania turanica [15] from the same geographic region Karamay also could cause CL. However, L. turanica is nonpathogenic to humans, according to Strelkova et al. [16]. The parasites of some visceral leishmaniasis (VL) cases in Sichuan and Gansu provinces were L. donovani or undescribed species Leishmania sp. [17][18][19]. VL and CL have been reported in China to date the species of Leishmania comprises much more than that. The isolates in China were more heterogeneous than previously thought, requiring the reassignment of some isolates into different groups as described by Lu et al. [20].
In previous ITS1 and CO II study [17,19], we summarized the four endemic Leishmania species in China: L. donovani, L. infantum, Leishmania gerbilli, and L. turanica. We also noted that there might be an undescribed Leishmania species endemic in China and highlighted that the isolate IPHL/CN/77/XJ771 from Bachu County, Xinjiang, is L. donovani instead of L. infantum. To elucidate the phylogeny, evolution and epidemiology of interesting group of strains in China, further studies of more genes are required.
Cyt b is one of the cytochromes involved in the electron transport process of the mitochondrial respiratory chain is considered one of the most useful genes for phylogenetic work [34]. Marco et al. proved that the cyt b gene sequencing can precisely identify the Leishmania spp. for all of the local stocks that are well characterized by multi-locus enzyme electrophoresis (MLEE), the current gold standard [32]. Phylogeny and sequence variation of the genus Leishmania has also been discussed successfully with cyt b sequencing [34,35]. In this paper, the cyt b gene of Leishmania from China was sequenced and analyzed using bioinformatics methods. Moreover, the phylogenetic relationships were reconstructed using cyt b sequences obtained by this study and download from the GenBank database. We then discuss in detail the implications of relationships between strains in China and other locations.

Leishmania isolates
A population of cloned promastigotes (including 16 Chinese Leishmania isolates and three isolates from other countries) was stored in liquid nitrogen, and kept at the Department of Parasitology, Western China School of Preclinical and Forensic Medicine, Sichuan University. All of the Leishmania isolates used in this study are listed in Table 1. The promastigotes were cultivated in medium 199 supplemented with 15% heat-inactivated fetal bovine serum (HIFBS) at 28°C. Approximately 1-5 × 10 9 promastigotes were collected at room temperature by centrifugation at 3300 × g for 10 min and washed with phosphate-buffered saline. DNA extraction and polymerase chain reaction (PCR) Total genomic DNA of the parasite was extracted by proteinase K digestion and phenol/chloroform/isoamyl alcohol extraction procedures followed by ethanol precipitation to purify the extracted DNA as described by Sambrook and Russell [39]. PCR was performed to generate a fragment spanning cyt b kinetoplast DNA (kDNA) between the forward primer COIIIF (5 0 -TAAT ACGACTCACTATAGTTTATATTG ACATTTTGTWG ATT-3 0 ) and the reverse primer MURF4R (5 0 -GGGTTT TCCCAG TCACGACGAATCTCTCTCTCCCTT −3 0 ) [35]. The PCR protocols for amplification were: 94°C for 3 min followed by 35 cycles of 94°C for 30s, 58°C for 30s, and 72°C for 1.5 min, followed by a final elongation step at 72°C for 10 min. The amplified products were purified on a 2.0% agarose gel stained with ethidium bromide, using a commercial DNA purification kit according to the manufacturer's protocol. The purified PCR product was then sequenced. The DNA sequences of each individual and each species were deposited in the GenBank database under accession numbers (HQ908255-HQ908273).

Phylogenetic analyses
Phylogenetic hypotheses of Leishmania were generated with cyt b kDNA segments using two types of commonly applied phylogenetic techniques: heuristic searches using maximum parsimony (MP) analyses performed with the program PAUP* program and Bayesian inference (BI) using the MrBayes v.3.2 program [46]. In both MP and BI analyses, gaps were treated as missing data. For heuristic searches under parsimony, invariant characters were removed from the dataset. Each search involved 10 random additional replicates, one tree held at each step, with tree bisection and reconnectin branch swapping, steepest descent on, and a maximum of 10,000 saved trees. Nonparametric bootstrapping was used to generate phylogeny confidence values [47], with 1,000 pseudoreplicates using a heuristic tree search for each pseudoreplicate. Trypanosoma brucei (M94286) was used to root the trees. Prior to Bayesian analyses, the best-fit model of evolution, TIM3 + G, was selected using jModeltest v. 0.1.1 [48] under the Bayesian information criterion [49], following recent recommendations [50]. We estimated the posterior probability distributions by allowing four incrementally heated Markov chains (default heating values) to proceed to four million generations, and with samples were taken every 200 generations. Analyses were repeated beginning with different starting trees to ensure that the analyses were not restricted from the global optimum [51]. Convergence was first tested by examining the average deviation of the split frequencies of the two runs, in order to determine whether the two runs had converged. MCMC convergence was also explored by examining the potential scale reduction factor (PSRF) convergence diagnostics for all model parameters (provided by the sump and sumt commands). The first one million generations before this chain became stationary were discarded, and the remaining samples from the independent runs were pooled to obtain the final approximation of the posterior tree distribution. Sequence alignments were additionally inferred from uncorrected p-distances through NJ networks (Neighbor Net) obtained by SplitsTree 4 [52,53]. This software can detect the alternative evolutionary paths supported by the sequence alignments, and as such, they do not enforce the single bifurcating dendrogram. To yield a single phylogeny hypothesis, the posterior distribution was summarized as a 50% majority rule consensus.

Base composition and nucleotide substitution patterns
The size of the newly determined cyt b fragments is shown in the A base stationarity test showed insignificant differences among the taxa in base composition bias in the data (χ 2 = 85.386150, df = 108, p = 0.94687017). The pdistances among the 10 isolates (10 isolates, SD1, SC10H2, GS6, SC6, GS3, GS2, XJ801, SC11, SC9, GS5; the isolates SD1, SC10H1, GS6, SC6 and GS3 share the same sequence) in China were ranged from 0.000 to 0.023 (mean = 0.010), which are smaller than the distances between these isolate and any other known species. These 10 isolates were then classified into the Leishmania sp. group. The divergence between Leishmania sp. and other Leishmania species ranged from 0.051 (Leishmania sp. versus L.tarentolae) to 0.131 (Leishmania sp. vs. L. turanica and L. gerbilli), with an average of 0.096, a value that is larger than that within Leishmania sp. group. The L. donovani complex group contains seven haplotypes: GS1, GS7, XJ771, PP75a (the same sequence with CRE69), 2S-25M-C2 (the same sequence shared with IPT1), PP75 (the same sequence with DD8). The distance within the strains of L. donovani complex ranged from 0.001 (the strain GS7 of China vs. L. chagasi PP75a and L. infantum CRE69) to 0.013 (L. archibaldi GEBRE1 vs. L. chagasi PP75 and L. donovani DD8), which are smaller than the distances between these strains and other known species. The average distance in this group as a whole is 0.006. The p distances among all species except Leishmania sp. and T. brucei were from 0.002 (between L. turanica and L. gerbilli) to 0.136 (between Leishmania arabica and L. equatoresnsis). Most pairwise comparisons mentioned above had divergence values < 0.136, with an average of 0.106 (Table 2).
For the BI analyses, the likelihood value of the 50% majority consensus tree (Figure 2 In addition to the common phylogenetic relationships among the different species shown by the MP tree and BI tree, the network (Figure 3) calculated by SplitsTree 4 also indicated a clear evolutionary path with a high value. Leishmania sp. and L. tarentolae share most of their evolutionary paths.  Table 1. The letters (a, b, c, d, e, f, g, and h) indicate sharing of a haplotype.

Discussion
As a part of worldwide Leishmania population, the phylogenetics of Chinese isolates with analysis of the cyt b genetic sequences of 16 Leishmania isolates was discussed in this paper which demonstrated similarities and differences compared with previous data [17,19] and keep the genus evolutionary unity and integrity over large geographic ranges and time periods.

Leishmania sp. of China
Most interestingly 10 Chinese strains, representing 6 closely related haplotypes, could not be assigned to any of the so far described species of Leishmania, a finding that is congruent with our earlier ITS1 and COII studies [17,19]. These Leishmania sp. isolates were most closely related to the lizard-infecting L. tarentolae (Figures 1, 2, 3).
It was reported that one of these isolates SC6 was collected from patients with VL in Nanping County of Sichuan Province, was infected successfully 8 dogs (8/12) and its amastigotes were detected in their bone marrow smears [55]. Another isolate SC10H2 was proved that it clustered with the pathogen of canine leishmaniasis in Beichuan County, Sichuan Province, China based on the 17S RNA gene [18]. The non-pathogenic to humans L. tarentolae has been classified as subgenus L. (Sauroleishmania) on the basis of biological criteria and different genes [4,8,10,34]. In such cases, we can conclude that the undescribed Leishmania species which is clearly a causative agent of canine leishmaniasis and human VL do exist in China are related to the Sauroleishmania. However, the more lizard parasites are required to confirm whether Leishmania sp. is assigned to the Sauroleishmania.
The pairwise genetic distance analysis (Table 2) and phylogenetic network (Figure 3) suggest that the cyt b sequences of the Chinese/tarentolae group (Leishmania sp. and L. tarentolae) are closer to the Viannia clade than the older world Leishmania. This finding is in contrast to that of our ITS1 study [17] and other studies: as an OWL species branching from within New World taxa, L. tarentolae (Sauroleishmania) are closer to the Leishmania subgenus than to the Viannia subgenus based on different DNA marks (polA and RNA polymerase II, 7SL RNA, hsp 70) [4,8,10]. It is well knows that different genes can have different evolutionary histories and be influenced by selection and horizontal gene transfer, and the phylogenies are also prone to sampling bias; therefore, more genes of diverse geographic original strains would be needed to elucidate the phylogeny, evolution, and epidemiology of the Chinese/tarentolae group.
The isolates of Leishmania sp. were collected from different foci (plain, desert and hill), and the longest Figure 2 The 50% majority-rule consensus tree inferred from Bayesian inference of cyt b dataset using MrBayes v. 3.2. The numbers at the nodes represent Bayesian posterior probabilities; TIM3 + G was selected using jModeltest v. 0.1.1. Trypanosoma brucei (M94286) is the outgroup. The tree is based on haplotypes (identical haplotypes are presented by one strain). Strain information is shown in Table 1. The letters (a, b, c, d, e, f, g, and h) indicate sharing of a haplotype.
distance between isolates is more than 2000 miles (from Shandong to the Xinjiang) ( Figure 4). Meanwhile, different species were found in the same area. The isolate XJ801 of Leishmania sp. is from Kashi city of Xinjiang. The isolate 801 identified as L. donovani based on ITS1 sequences by Wang et al. [16] and Yang et al.   Table 1 and Figures 1, 2 was built with 1000 bootstrap replicates. It was algorithm, excluding all conserved site. Distances were calculated using the Kimura 2-parameter distance. Trypanosoma brucei (M94286) is the outgroup. Each A-F panel is drawn to the scale indicated and expressed as dissimilarity per nucleotide counted over variable sites (Figures 1-2 Table 1.
Kashi [20]. As such, the Leishmania isolates in China were more heterogeneous, further epidemiologic survey and more strains are required in Kashi.

L. donovani complex of China
Analysis in the current study revealed that the cyt b sequences of GS1, GS7 and XJ771 clustered with other species of L. donovani complex (PP = 1.00). On the basis of MLEE of the representative isolates from the plain, hill, and desert regions of China, Xu et al. were first to identify the causative agents responsible of VL as Leishmania donovani sensu lato and L. infantum [56]. The results based on sequences of cyt b, ITS1 [17] and COII [19] sequences clearly proved the existence of L. donovani in China. However, L. donovani or L. infantum standard isolates cannot be distinguished from L. donovani complex isolate using the cyt b gene in the BI and MP trees. These findings aren't consistent the ITS1 study showing three isolates clustered with L. donovani and a clear classification within subspecies between Leishmania donovani donovani and Leishmania donovani infantum. Therefore, the inter-specific variation of the ribosomal RNA gene ITS1 was inferred to be more suitable than mtDNA segment cyt b for studying the phylogenetic relationships among subspecies. Of course, we can't exclude the possibility that the different interspecific variation between ITS1 and cyt b are calculated by choosing the different samples or numbers of the isolates or strains.

L. turanica of China
Our cyt b data demonstrate that the isolates KXG-2 and GS-GER20 clustered with L. turanica (AB434675) from central Asia, findings that are congruent with those of our earlier studies [17,19] and then clustered with L. arabica from western Asia, a finding that agrees with that of Asato et al. [34] (Figures 1, 2, 3). The definitive hosts of L. gerbilli, L. turanica, L. arabica are rodents of the Old World [57]. Using MLEE methods, the isolate KXG-2 was identified as L. turanica [16], and the isolate GS-GER20 was identified as L. gerbilli [58]. In the 1990s, L. turanica and L. gerbilli were identified in rodents or sandflies in Karamay, Xinjiang and L. turanica was proved to be pathogenic in both monkeys and humans in the laboratory, Phlebotomus mongolensis and Phlebotomus andrejevi were its major vectors [16]. We considered the isolates KXG-2 and GS-GER20 to be L. turanica and L. gerbilli, respectively via the cyt b gene sequences.

L. tropica of China
The species of the L. tropica complex cause the urban form of Old World CL. In Iran, Iraq, and India, it is transmitted by Phlebotomus papatasi. This species is rarely reported in China. The fact that the isolate JS1 was collected from Jiangsu Province clustered with L. tropica, which agrees with the results of our earlier study based on the COII gene [19]. Lu et al. used random amplified polymorphic DNA data to suggest a close relationship between the isolate JS1 and L. tropica (K27) [59]. Thus, we infer that the isolate JS1 may be L. tropica. However to further confirm this inference, more data such as host specificity, life cycle, and biochemical analysis will be needed.

Evolution inference and epidemiology of China
In our analysis, Leishmania cyt b sequences are consistent with the genus Leishmania that contains three subgenera: Leishmania, Sauroleishmania and Viannia [4]. Based on the suggestion that mammalian Leishmania did not evolve from those of lizards but vice versa [60,61], Lukeš et al. [54] proposed that the ancestor of the new world Leishmania evolved in South America and then migrated via the Bering land bridge to Asia via multiple independent genetic loci. The Leishmania lineage would have been dispersed throughout central and/or Southeast Asia, where a major diversification gave rise to L. aethiopica, L. major, L. gerbilli, L. turanica, L. tropica, and the L. donovani complex. The isolates from China were absent in this analysis. However Fraga et al. thought this theory puts L. tarentolae (Sauroleishmania) in an illogical position. Our data suggest that Leishmania sp. of the pathogen of VL and CanL clustering with L. tarentolae (Sauroleishmania) was in the same "illogical position". The maximum parsimony consensus tree ( Figure 1) and splitstree (Figure 3) supports the idea of a common origin with the Viannias subgenus, whereas the Bayesian tree (Figure 2) show the Chinese/taretolae group clustered together with species of Leishmania subgenus. This ambiguous position of L. tarentolae had been discussed by Luyo-Acero et al. based on the same DNA marker cyt b [35] that L. tarentolae clustered with Viannia in the NJ tree consisting with the minicircle phylogenetic analysis [62], and clustered with Leishmania in the MP tree supported by ATPase 6 gene [63]. The position of L. equatorensis as falling outside the Leishmania clade in the parsimony tree is supported by the phylogeny suggested by Cupolillo et al. [64]. However the Chinese/tarentolae group which was not described by Lukeš et al. [54], may have evolved from a common ancestral parasite that came from the Americas and may split off earlier than the other OWL.
Leishmaniasis remains endemic in China, especially in the west and northwest frontier regions. The epidemic foci of VL in China were classified into three types according to different geographical origin, infective agent, and clinical evidences, i.e., plain foci, hill foci, and desert foci [20]. Human VL and CL occur in China, most being VL along with rare CL cases [56,[65][66][67]. VL was one of the most important parasitic diseases occurring in over 17 Chinese provinces in 1951 [68]. Since the condition has come under control, currently, VL is mainly prevalent in six provinces in northwest China [69] (Figure 4). This study proved that the evolution hypothesis of Tian and Chen related to the Chinese Leishmania isolates from different epidemic foci was limited and lacked integrity [70]. In fact, the Chinese Leishmania species occurs as the multiple species L. donovani (L. donovani donovani, L. donovani infantum), L. turanica, L. tropic, Leishmania sp. and so forth, and some of these such as L. donovani and L. turanica were shared with neighbouring countries including India, Russia, and Uzbekistan.

Conclusions
The current study investigated the Chinese Leishmania parasites using cyt b sequence data. Undescribed Leishmania species which are clearly causative agents of CanL and human VL do exist in China and are related to the Sauroleishmania subgenus, may have evolved from a common ancestral parasite that came from the Americas and split off earlier than the other OWL. Our cyt b results also suggest the following: the isolates GS7, GS1 and XJ771 occur as part of the L. donovani complex; the isolate JS1 is L. tropica; and the isolate KXG-2 is close to the isolate GS-GER20, which is L. turanica and L. gebilli respectively. The results of the current study indicate that the isolates from China may have had a more complex evolutionary history. In the future, we will build upon the currently described data set to gain more insight into the fascinating spectrum of Chinese Leishmania.