Genetic diversity, natural selection and haplotype grouping of Plasmodium vivax Duffy-binding protein genes from eastern and western Myanmar borders

Background Merozoite proteins of the malaria parasites involved in the invasion of red blood cells are selected by host immunity and their diversity is greatly influenced by changes in malaria epidemiology. In the Greater Mekong Subregion (GMS), malaria transmission is concentrated along the international borders and there have been major changes in malaria epidemiology with Plasmodium vivax becoming the dominant species in many regions. Here, we aimed to evaluate the genetic diversity of P. vivax Duffy-binding protein gene domain II (pvdbp-II) in isolates from the eastern and western borders of Myanmar, and compared it with that from global P. vivax populations. Methods pvdbp-II sequences were obtained from 85 and 82 clinical P. vivax isolates from the eastern and western Myanmar borders, respectively. In addition, 504 pvdbp-II sequences from nine P. vivax populations of the world were retrieved from GenBank and used for comparative analysis of genetic diversity, recombination and population structure of the parasite population. Results The nucleotide diversity of the pvdbp-II sequences from the Myanmar border parasite isolates was not uniform, with the highest diversity located between nucleotides 1078 and 1332. Western Myanmar isolates had a unique R391C mutation. Evidence of positive natural selection was detected in pvdbp-II gene in P. vivax isolates from the eastern Myanmar area. P. vivax parasite populations in the GMS, including those from the eastern, western, and central Myanmar as well as Thailand showed low-level genetic differentiation (FST, 0.000–0.099). Population genetic structure analysis of the pvdbp-II sequences showed a division of the GMS populations into four genetic clusters. A total of 60 PvDBP-II haplotypes were identified in 210 sequences from the GMS populations. Among the epitopes in PvDBP-II, high genetic diversity was found in epitopes 45 (379-SIFGT(D/G)(E/K)(K/N)AQQ(R/H)(R/C)KQ-393, π = 0.029) and Ia (416-G(N/K)F(I/M)WICK(L/I)-424], Ib [482-KSYD(Q/E)WITR-490, π = 0.028) in P. vivax populations from the eastern and western borders of Myanmar. Conclusions The pvdbp-II gene is genetically diverse in the eastern and western Myanmar border P. vivax populations. Positive natural selection and recombination occurred in pvdbp-II gene. Low-level genetic differentiation was identified, suggesting extensive gene flow of the P. vivax populations in the GMS. These results can help understand the evolution of the P. vivax populations in the course of regional malaria elimination and guide the design of PvDBP-II-based vaccine.


Background
Malaria is a major public health concern in the Great Mekong Subregion (GMS) in Asia [1]. In this region, Myanmar has the heaviest malaria burden and contributes to over 50% of the malaria cases in the GMS [2][3][4]. In recent years, intensified malaria control efforts have resulted in substantial improvement of the malaria situation in most GMS countries. One noticeable change in malaria epidemiology is the emerging dominant status of P. vivax in most endemic areas of the GMS. The formation of liver hypnozoites in P. vivax patients responsible for relapses requires radical treatment that targets both asexual blood stages and liver-stage hypnozoites. Currently, chloroquine-primaquine (CQ-PQ) is the frontline therapy for P. vivax in all GMS countries [5]. However, degraded efficacy of CQ-PQ for P. vivax treatment has been notified in several sites in Myanmar [5][6][7][8]. This might be among the causes responsible for the recently increased P. vivax incidence in northeastern Myanmar bordering China's Yunnan province [5,9]. The development of an effective malaria vaccine is considered important for integrated malaria control. However, one of the main obstacles for a successful vaccine design to enable global protection is the extensive genetic diversity of vaccine candidate genes [10]. In this regard, to design an effective malaria vaccine, it is essential to determine the genetic diversity of the vaccine targets in parasite populations from different malaria-endemic areas.
Blood-stage replication of malaria parasites is the cause of clinical malaria, and vaccines targeting blood-stage parasites are designed to alleviate the clinical symptoms. The formation of tight junction between the Plasmodium merozoite and the host red blood cell (RBC) is central to the invasion process of the parasite. This process in P. vivax involves the Duffy-binding protein (PvDBP), a microneme protein, and the Duffy antigen receptor for chemokines (DARC) on the RBC membrane [11,12]. Although P. vivax could infect DARC-negative individuals, these cases are rare; no alternative ligands for P. vivax binding to reticulocytes have been identified yet [13][14][15][16]. The presence of naturally-acquired antibodies to PvDBP in sera of residents from endemic area that block RBC invasion suggests that this protein is a potential target for effective antibody response [17][18][19]. Naturally-acquired binding-inhibitory antibodies directed against PvDBP contained both strain-specific and strain-transcending components and were associated with a reduced risk of clinical P. vivax malaria. Furthermore, these antibodies against PvDBP could block erythrocyte binding of PvDBP domain II (PvDBP-II) and inhibit merozoite invasion of erythrocytes [20][21][22][23][24], which justifies PvDBP as one of the most promising targets for blood-stage vaccine development against P. vivax.
PvDBP is a protein of 140 kDa and is divided into seven different regions (regions I-VII). The central binding motif necessary for DARC adhesion of PvDBP is mapped to a 170 amino-acid stretch located in the N-terminal cysteine-rich region (PvDBP-II, amino acids 291-460) [25]. Pvdbp-II encoding this protein domain also shows the highest degree of genetic diversity as compared to the rest of the pvdbp sequence. Although the cysteine residues are conserved in P. vivax populations, the remaining amino acids of PvDBP-II are highly polymorphic in P. vivax field isolates from endemic areas such as Brazil, Colombia, South Korea, Thailand and Myanmar, suggesting that this region is under positive natural selection [3,[26][27][28][29]. It has been reported that these polymorphic residues within the PvDBP-II region do not alter its capacity to bind DARC, but some of them affect the immune recognition of PvDBP, suggesting that genetic diversity may be responsible for immune evasion [30,31]. Consequently, the polymorphic nature of pvdbp-II represents a major limitation in successful design of a protective vaccine. Therefore, understanding the nature and genetic polymorphism in pvdbp-II among P. vivax isolates from different geographic areas is important for the rational design of vaccines against vivax malaria.
In this study, the pvdbp-II sequences of P. vivax isolates from eastern and western Myanmar borders were analyzed for genetic diversity, natural selection, recombination, haplotype prevalence and population differentiation, and were compared with the global pvdbp-II sequences. Whereas limited polymorphism of pvdbp existed in the field P. vivax isolates from eastern and western Myanmar, the pvdbp-II region was found to be under natural selection. Besides, the observed low-level genetic differentiation among different P. vivax populations in the GMS, based on the pvdbp-II sequence analysis, supports a general design of an effective vaccine to cover these endemic areas.

Blood sample collection
A total of 163 and 95 blood samples were collected from patients with acute P. vivax malaria attending clinics in an eastern Myanmar border township (Laiza, Kachin State) in 2016 and a western Myanmar border township (Paletwa, Chin State) in 2017, respectively (Fig. 1). Plasmodium vivax infection was confirmed by microscopic examination of thin and thick blood smears. After obtaining written informed consent from participants and guardians in case of minors, finger-prick blood (~ 100 µl) was collected on Whatman 3M filters (Whatman, Shanghai, China), dried and individually stored at − 20 °C for subsequent use. This study was approved by the ethics committees from the Department of Medical Research, Myanmar Ministry of Health and Sports, the China Medical University, and the University of South Florida.

PCR, cloning and sequencing of the pvdbp-II gene
Genomic DNA was extracted from dried blood spots using a QIAamp DNA mini kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. The pvdbp-II sequence corresponding to nucleotides 715-1704 bp (amino acid positions 239-568 aa) in the Salvador I (Sal I) reference sequence (GenBank: M37514) was amplified by PCR using primers pvdbp-II-F (5′-ACC ACG ATC TCT AGT GCT ATT ATA-3′) and pvdbp-II-R (5′-ATT TGT CAC AAC TTC CTG AGT ATT-3′). PCR reactions included 1× KOD-Plus-Neo buffer, 200 μM dNTPs, 1 mM MgSO 4 , 250 nM of each primer, 0.4 units of KOD-Plus-Neo DNA polymerase (Toyobo, Shanghai, China), and 1.0 μl genomic DNA template. The PCR was performed using the following conditions: 95 °C for 5 min, 44 cycles at 95 °C for 30 s, 55 °C for 30 s, and 68 °C for 60 s, followed by extension at 68 °C for 5 min. The PCR products were analyzed by 1.2% agarose gel, purified with TaKaRa MiniBEST DNA Fragment Purification Kit (TaKaRa, Dalian, China), ligated into a pMD18-T plasmid vector (TaKaRa), and transformed into Escherichia coli DH5α competent cells (TaKaRa). Colony PCR was performed with the primers M13(-20)F (5′-TGT AAA ACG ACG GCC AGT-3′) and M13R (5′-CAG GAA ACA GCT ATG ACC-3′) to identify positive clones with the correct inserts. The colony PCR products having the correct DNA fragment were sequenced using the primers pvdbp-II-F and pvdbp-II-R at a commercial laboratory (Beijing Genomics Institute, Beijing, China). To verify the sequencing accuracy, at least two clones from each isolate were sequenced in both DNA strands. Nucleotide sequences are deposited in GenBank under accession numbers MN233407-MN233488 for the western Myanmar isolates and MN233489-MN233573 for the eastern Myanmar isolates.

Recombination and linkage disequilibrium (LD)
The recombination parameter (C), which includes the effective population size and probability of recombination between adjacent nucleotides per generation, and the minimum number of recombination events (Rm) were measured using DnaSP [42]. LD between different polymorphic sites was computed in terms of the R 2 index using DnaSP for the eastern and western Myanmar border isolates.

Genetic differentiation, haplotype network, and population genetic structure analysis
The genetic differences in populations was investigated evaluating the rate of fixation (F ST ) by analysis of molecular variance (AMOVA) implemented in ARLEQUIN v3.5.2.2 software [43]. The Median-Joining method in NETWORK v5.0.0.3 [44] was used to establish genealogical relationship among the global PvDBP-II haplotypes. STRU CTU RE 2.3.4 was used to define the genetic structure of P. vivax parasite populations from the GMS based on the pvdbp-II [45]. The Bayesian approach was employed to identify the optimum number of clusters (K). All sample data were run for values K = 2-9 using the admixture ancestry model. Each run was implemented with a 'burn-in' period of 50,000 iterations and 100,000 Markov Chain Monte Carlo (MCMC) replications. The most likely number K in the data was estimated by calculating Delta K-values and identifying the K-value that maximizes the log probability of data, |ln''(K)|. The most probable K-value was then calculated according to Evanno's method by using the webpage interface STRU CTU RE Harvester (http://taylo r0.biolo gy.ucla.edu/struc tureH arves ter/). Distruct 1.1 was used to graphically display results produced by the STRU CTU RE software [46].

Analysis of polymorphism associated with B-and T-cell epitopes
To evaluate the possibility that diversity in PvDBP-II within the eastern and western Myanmar border isolates may have arisen from host's immune pressure, the genetic diversity in predicted or known B-and T-cell epitopes and MHC-binding regions in PvDBP-II was examined [26,30]. Polymorphism of each region was analyzed by using DnaSP as described above.

Pvdbp-II genetic diversity in the Myanmar border P. vivax populations
Out of the 258 P. vivax isolates collected from the eastern (n = 95) and western (n = 163) Myanmar borders, the 990 bp pvdbp-II gene fragment covering codons 239-568 was successfully amplified and sequenced in 85 and 82 samples, respectively. The 85 eastern border samples had 21 single nucleotide polymorphisms (SNPs), of which 4 are synonymous and 17 non-synonymous (Table 1). Among them, 20 are segregating sites, and 15 are parsimony informative. The 82 western border samples contained 16 SNPs, of which 15 are non-synonymous. All 16 SNPs are segregating sites, and 14 are parsimony informative. The overall haplotype diversity for the two sample sets was similar at 0.850 and 0.930 for the eastern and western border populations, respectively ( Table 1). The observed pairwise nucleotide diversity for eastern and western Myanmar isolates is 0.006 and 0.004, respectively ( Table 1). The highest peaks of nucleotide diversity within the pvdbp-II region were identified between nucleotide positions 1078-1332 (codons 360-444) (Fig. 2a). Meanwhile, 40% of the polymorphic amino acid sites were also accumulated within this segment (Fig. 2b). Sliding window plots of nucleotide diversity revealed π value ranging from 0 to 0.029 for the eastern isolates and from 0 to 0.021 for the western isolates, respectively (Fig. 2). These results indicated that the central segment for codons 360-444 of the pvdbp-II gene is more polymorphic than the rest of the pvdbp domain II.
Most of the PvDBP-II amino acid changes were found in high frequencies (> 50%) in both populations,

Evidence of natural selection
The neutrality tests (Tajima's D, and Fu and Li's D*, and F*) did not identify significant deviation from zero in pvdbp-II in both border populations (   identified similar patterns in the western Myanmar population, albeit the P-values did not reach statistical significance (P > 0.05) (Fig. 3a-c).
To determine whether selection has played any role in the evolution of pvdbp-II in the two parasite populations, we compared the d N and d S values (Table 3). In both populations, d N /d S > 1 was detected, but Fisher's exact Z-test only showed significance for the western Myanmar isolates (P = 0.033). Sliding window plots of d N /d S for the pvdbp-II sequences showed a similar pattern with a significant excess of d N over d S detected at the 1116-1164 bp region (codons 372-388) in the western Myanmar population (Fig. 3d).

Recombination
In the global P. vivax populations except that from South Korea, which had recent vivax malaria resurgence, the minimum numbers of recombination events between adjacent polymorphic sites (Rm) were ≥ 5 ( Table 4). The   LD index R 2 also declined across the analyzed region in eastern and western Myanmar populations, suggesting that intragenic recombination may also be a possible factor contributing to the increased diversity of pvdbp-II gene (Fig. 4). It is also noteworthy that the R 2 value decreased much more rapidly in the western Myanmar parasites, suggesting a much large parasite population and higher level of intragenic recombination in western Myanmar.

Population differentiation
F ST values were calculated to assess the genetic differentiation of pvdbp-II among the global P. vivax populations. Though separated by more than 400 km, the eastern and western Myanmar P. vivax population had little genetic differentiation with an F ST value of 0.067 (Table 5). Similarly, all P. vivax populations from the GMS had little genetic differentiation, whereas high levels of genetic differentiation were detected between the GMS populations and Asian Pacific population (PNG) and the South American population (Colombia).

Relationship of parasites from the GMS
To identify the population structure of the GMS parasite populations, STRU CTU RE analysis was performed using parasite populations from the eastern and western borders of Myanmar, central Myanmar (from the Mandalay region) and Thailand. Analysis of the |Ln''(K)| curve, Delta K plot and the distribution of clusters amongst haplotypes indicated that the haplotypes were optimally grouped into four clusters (K = 4; Fig. 5). All four parasite populations from the GMS had admixed haplotypes, with the most common haplotypes found with a percentage of 79.1%, 76.1%, 81.2%, and 70.1% in western Myanmar border, eastern Myanmar border, central Myanmar and Thailand populations, respectively. A haplotype network was constructed to establish the relationships among the pvdbp-II haplotypes from the four GMS P. vivax populations mentioned above. A total of 60 haplotypes were identified from 210 pvdbp-II sequences with haplotype prevalence ranging from 0.48 to 17.14%, of which 65% was represented by single parasite isolates. These haplotypes could be roughly grouped into four clusters (Fig. 6). Parasites with the Sal I reference haplotype (H1) had a 2.3% prevalence, and it was only detected in the western and central Myanmar parasite populations. Haplotype 4 (H4) was shared among all GMS populations, with an overall prevalence of 4.8%. Haplotypes 8,28,34,and 37 were shared between two of the four populations, whereas Haplotypes 15, 25, and 31 were shared among the eastern Myanmar, western Myanmar and Thai populations. Of these haplotypes, H25 has an observed frequency of 17.14%, and it is the dominant haplotype of the eastern Myanmar parasite population with a frequency of 29.41% (Fig. 6).

Polymorphisms associated with B-and T-cell epitopes
To determine whether there is a link between positive natural selection and host immune pressure, we examined the genetic polymorphism in identified or predicted B-and T-cell epitopes and MHC-binding regions [26,30]. All the predicted B-and T-cell epitopes and MHC binding regions were polymorphic, and had d N > d S (Table 6). Meanwhile, significant P-value of d N > d S was detected in epitopes 45 (P = 0.023), 48 (P = 0.028) and Ia (P = 0.036) of PvDBP-II protein by Z-test (Table 6). Particularly, high-level nucleotide diversity was found in epitopes 45 (π = 0.029) and Ia (π = 0.028), which contained polymorphic residues at 339, 340, 341, 345, and 346, as well as 372, 374, and 379, respectively. Though the Tajima's D-values for the two epitopes were all positive, they were not significant. These results are consistent with the hypothesis that natural selection acts on epitopes in PvDBP-II and is responsible for the observed diversity of pvdbp-II [3,26,36].  [17, 27-29, 34, 35, 37]. In addition, the two highest peaks of nucleotide diversity within the pvdbp-II sequences in the Myanmar border parasite isolates were localized between Cysteine 5 and 7 of PvDBP-II, which was consistent with several previous reports [28,30,31,47]. The D384G mutation is fixed in the eastern Myanmar border population and reached 92.68% in the western Myanmar isolates. D384G is also highly prevalent in isolates from central Myanmar (85.2%), Thailand (76.7%), Sri Lanka (94%), Iran (61.3%), and Brazil (85%), but it is not prevalent in PNG and Colombia [17, 27-29, 34, 35, 37]. Although eastern and western Myanmar isolates had similar amino acid changes with populations from central Myanmar and Thailand, some mutations found in the Thai isolates (S351C, I367T, S398T, T404R, Q433K and R436T) and central Myanmar isolates (I310L, K386Q, R391H, K455I, K473R, P475A, C477G, Q486E, R490K, D528G, V533M, K541T and A545V) were not identified in either eastern or western Myanmar isolates [3,29]. It has been reported that amino acids 417, 437 and 503 are responsible for forming a discontinuous epitope of PvDBP-II, which was also the main target of inhibitory antibodies [48,49]. In this regard, polymorphisms at these residues, either single or combined, could affect the binding efficiency of inhibitory antibodies to PvDBP and help the parasites to evade host immune attacks. All these three mutations were found in eastern and western Myanmar isolates. The N417K mutation was dominant with high frequencies in both eastern (63.5%) and western (87.8%) Myanmar, whereas the W437R mutation was found considerably high in the eastern (62%) but low (2%) in western Myanmar. The I503K variant was present at a similar prevalence in the eastern (47%) and western Myanmar (49%) populations. All these mutations suggest that antibody responses against PvDBP in these regions might be compromised.
The pvdbp-II has been found to be subject to strong selection in previous studies [17, 27-29, 34, 35, 37]. Although all neutrality tests and the d N /d S test did not  [26]. In particular, high-level genetic diversity and positive selection were identified in peptide Ia, predicted to be exposed on the surface of the PvDBP protein [26]. Significant positive values observed for the neutrality tests for the codons 363-399 and codons 370-392 regions of pvdbp-II suggest a decrease in population size. This appears to be consistent with the scaling up of malaria control in these endemic areas, which has resulted in substantial declines of malaria incidence in recent years. However, the region of codons 542-553 did not follow the same trends with the Fu and Li's D* test, which showed a significant negative value for this domain, suggesting an excess of singletons, probably generated by particular selective sweeps. Thereby, the codons 542-553 region of pvdbp-II may be necessary for maintaining the binding ability of PvDBP-II to DARC.
Taken together, the current study supports the theory that strong balancing selection probably imposed by host immunity may have occurred in the pvdbp-II region in the GMS parasite populations. The high value of the recombination parameters observed in global populations suggested that meiotic recombination might have occurred in PvDBP-II. Recombination contributes to genetic diversity of P. vivax and may also be the cause for the variation of pvdbp-II as reported in several previous studies [26,50]. The existence of recombination events and the decline of LD index R 2 with increasing distance between nucleotide sites support that meiotic recombination plays a role in generating diversity in pvdbp-II among the Myanmar parasite isolates. This result further corroborated previous reports The relatively lower number of the recombination events observed in the eastern Myanmar border isolates might be correlated with decreased P. vivax transmission intensity as this region is moving towards malaria elimination. The F ST index was used to access population differentiation due to genetic structure, and F ST values over 0.25 generally refer to highly differentiated populations [51]. Whereas major population differentiation was observed between geographically well-separated parasite populations (e.g. GMS vs Colombia, GMS vs PNG), little population differentiation was observed in parasite from the GMS, as evidenced by the low F ST values. This may reflect extensive gene flow among these GMS parasite populations in the recent past, which is further boosted by intensive human migration between these GMS countries, especially in the border regions [2]. Both STRU CTU RE and network analyses further confirmed the close relationships among the four GMS populations. The PvDBP-II haplotypes were grouped into four closely related clusters and are shared by all sites.
Genetic diversity of vaccine candidate antigens usually causes poor clinical efficacy of malaria vaccines [52,53]. Thus, an effective PvDBP-II vaccine should include alleles that induce host immune responses that are sufficiently broad to cover the existing antigenic diversity. However, because of higher genetic diversity of P. vivax compared to P. falciparum, generating a broad crossreactive immune response against highly polymorphic asexual stage antigens faces even greater challenges [54]. Our observation that only 15% of PvDBP-II haplotypes were shared among the four populations from the GMS suggests that antigenic diversity will need to be taken into account for a PvDBP-based vaccine. In addition, the Sal I PvDBP-II haplotype was restricted to western and central Myanmar populations at a low frequency (4.76%), indicating that a PvDBP-II vaccine designed based on this reference strain may not work in the GMS. It is noteworthy that the GMS parasite populations did show highly prevalent haplotypes (e.g. H25 at 17.14%), which may serve as the starting point for vaccine development.

Conclusions
This study shows that pvdbp gene is genetically diverse in both the eastern and western Myanmar P. vivax isolates, which is comparable to global P. vivax populations. Part of the pvdbp-II domain is under positive selection, while multiple recombination events further favor diversity. Little genetic differentiation identified among the four populations in the GMS suggests extensive gene flow between these areas. These findings provide important information for understanding the P. vivax population structure and its evolution in endemic areas of the GMS, and allow investigators to select dominant haplotypes for designing an effective bloodstage vaccine.