Genetic diversity, haplotypes and allele groups of Duffy binding protein (PkDBPαII) of Plasmodium knowlesi clinical isolates from Peninsular Malaysia

Background: The monkey malaria parasite Plasmodium knowlesi is now recognized as the fifth species of Plasmodium that can cause human malaria. Like the region II of the Duffy binding protein of P. vivax (PvDBPII), the region II of the P. knowlesi Duffy binding protein (PkDBPαII) plays an essential role in the parasite’s invasion into the host’s erythrocyte. Numerous polymorphism studies have been carried out on PvDBPII, but none has been reported on PkDBPαII. In this study, the genetic diversity, haplotypes and allele groups of PkDBPαII of P. knowlesi clinical isolates from Peninsular Malaysia were investigated.
Methods: Blood samples from 20 knowlesi malaria patients and 2 wild monkeys (Macaca fascicularis) were used. These samples were collected between 2010 and 2012. The PkDBPαII region of the isolates was amplified by PCR, cloned into Escherichia coli, and sequenced. The genetic diversity, natural selection and haplotypes of PkDBPαII were analysed using the MEGA5 and DnaSP ver. 5.10.00 programmes.
Results: Fifty-three PkDBPαII sequences from human infections and 6 from monkeys were obtained. Comparison at the nucleotide level against P. knowlesi strain H as reference sequence showed 52 synonymous and 76 nonsynonymous mutations. Analysis of the rates of these mutations indicated that PkDBPαII was under purifying (negative) selection. At the amino acid level, 36 different PkDBPαII haplotypes were identified. Twelve of the 20 human and 1 monkey blood samples had mixed haplotype infections. These haplotypes were clustered into 2 distinct allele groups. The majority of the haplotypes clustered into the large dominant group.
Conclusions: Our present study is the first to report the genetic diversity and natural selection of PkDBPαII. Hence, the haplotypes described in this report can be considered as novel. Although a high level of genetic diversity was observed, the PkDBPαII appeared to be under purifying selection. The distribution of the haplotypes was skewed, with one dominant major and one minor group. Future study should investigate PkDBPαII of P. knowlesi from Borneo, which hitherto has recorded the highest number of human knowlesi malaria.


Background
Malaria is caused by blood protozoa of the genus Plasmodium. Historically, four species of Plasmodium are known to cause human malaria: Plasmodium falciparum, P. vivax, P. malariae, and P. ovale. However, about a decade ago, P. knowlesi, a malaria parasite of macaque monkeys, was reported to cause a large number of human infections in Sarawak, Borneo Island [1]. Since then, human knowlesi malaria has been reported in other parts of Borneo Island, Peninsular Malaysia, and in many other countries in Southeast Asia [2]. Imported knowlesi malaria cases have been reported in European countries and Japan due to ecotourism programs to the forested areas of Southeast Asia. Now, P. knowlesi is recognised as the fifth species of Plasmodium that can cause human malaria.
Invasion of a malaria parasite into its host erythrocyte depends on the interaction between the parasite's protein and the corresponding receptor on the surface of the erythrocyte. Plasmodium vivax and P. knowlesi use the Duffy blood group antigen as a receptor to invade erythrocytes [3]. The Duffy binding proteins of P. vivax (PvDBP) and P. knowlesi (PkDBP) are located on their merozoites. PvDBP and PkDBP are members of the erythrocyte-binding protein family which also includes the P. falciparum EBA-175 [4]. PvDBP and PkDBP are large proteins and can be divided into seven regions (I-VII). Region II contains the critical motifs for binding to the erythrocyte Duffy antigen.
PkDBP is encoded by an α-gene and therefore is more specifically known as PkDBPα. This distinguishes it from two other highly homologous proteins in P. knowlesi, β and γ. Region II of the β and γ proteins has a different binding specificity from that of PkDBPα: it binds to rhesus erythrocytes but does not bind to the Duffy antigen of human erythrocytes [5].
PvDBP has been suggested to be an important vaccine candidate antigen against vivax malaria because it elicits strong immune responses in humans. Design of vaccine against vivax malaria must take into consideration the nature and genetic polymorphism of PvDBP. Region II of PvDBP (denoted as PvDBPII) in P. vivax isolates from different geographical regions such as Colombia, South Korea, Papua New Guinea, Thailand, Iran and Myanmar has been shown to be highly polymorphic, and numerous PvDBPII haplotypes and allele groups have been identified [6][7][8][9][10][11].
It has been observed that antibodies raised against PkDBPαII could inhibit P. knowlesi invasion of human and rhesus erythrocytes in vitro [12]. Therefore, like PvDBPII for vivax malaria, PkDBPαII may also be a candidate vaccine antigen against knowlesi malaria. Whilst many polymorphism studies on PvDBPII have been carried out, none has hitherto been reported for PkDBPαII. In this report, we present our findings on the genetic diversity, haplotypes and allele groups of PkDBPαII in P. knowlesi clinical isolates from Peninsular Malaysia.

Blood samples
Human blood samples used in this study were collected from 20 patients who were infected with Plasmodium knowlesi at the University of Malaya Medical Centre and several private clinics in Peninsular Malaysia. Two blood samples from P. knowlesi-infected wild monkeys (Macaca fascicularis) were provided by the Wildlife Department of Federal Territory, Peninsular Malaysia. These blood samples were collected between 2010 and 2012 from various states in Peninsular Malaysia (

Extraction of DNA
Total DNA of the P. knowlesi was extracted from each blood sample using the QIAGEN Blood DNA Extraction kit (QIAGEN, Hilden, Germany). In each extraction, 100 μl of blood was used. The extracted DNA was suspended in water to a final volume of 50 μl.

Analysis of PkDBPαII sequences
Alignment of the 60 PkDBPαII sequences [53 of human origin, 6 of monkey origin, and 1 reference strain (strain H, GenBank Accession No. M90466)] was performed using the CLUSTAL-Omega programme, which is available online (http://www.ebi.ac.uk/Tools/msa/clustalo). Both the nucleotide and the deduced amino acid sequences were aligned and analysed. A phylogenetic tree was constructed using the Neighbour Joining method implemented in MEGA5 [13], with 1000 bootstrap replicates used to test the robustness of the tree.

PkDBPαII sequence polymorphism analysis
DnaSP ver. 5.10.00 [14] was used to perform polymorphism analysis on the 60 PkDBPαII sequences. The number of segregating sites (S), haplotype diversity (Hd), nucleotide diversity (π), and average number of pairwise nucleotide differences within the population (K) were calculated. The π was also calculated on a sliding window of 100 bp, with a step size of 25 bp, to estimate the stepwise diversity across PkDBPαII. The rates of synonymous (Ks) and non-synonymous (Kn) substitutions were estimated and compared by the Z-test (P < 0.05) in MEGA5 using Nei and Gojobori's method [15] with the Jukes and Cantor correction. Under purifying (negative) selection, non-synonymous mutations are usually not advantageous, so Kn will be less than Ks (Kn/Ks < 1). Under positive selection, however, non-synonymous mutations can be advantageous and Kn will exceed Ks (Kn/Ks > 1). To test the neutral theory of evolution, Tajima's D [16] and Fu and Li's D and F [17] tests were carried out using DnaSP 5.10.00. In the Fu and Li's tests, P. vivax PvDBPII (GenBank Accession No. M90466) was used as outgroup.
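The summary statistics named above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: it assumes gap-free aligned sequences of equal length, treats every differing site as a difference, and uses Nei's sample-size-corrected formula for Hd. The function names and the toy sequences are hypothetical and are not data from this study.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Count sites at which two aligned, equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def diversity_stats(seqs):
    """Return S, K, pi and Hd for aligned sequences (assumes no gaps)."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of alignment columns holding more than one state.
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    # K: mean number of differences over all n(n-1)/2 sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    # pi: K expressed per site.
    pi = k / length
    # Hd: n/(n-1) * (1 - sum of squared haplotype frequencies).
    freqs = [count / n for count in Counter(seqs).values()]
    hd = n / (n - 1) * (1 - sum(f * f for f in freqs))
    return {"S": s, "K": k, "pi": pi, "Hd": hd}


def sliding_pi(seqs, window=100, step=25):
    """Yield (start, end, pi) for each full window; positions are 1-based."""
    length = len(seqs[0])
    for start in range(0, length - window + 1, step):
        chunk = [seq[start:start + window] for seq in seqs]
        yield start + 1, start + window, diversity_stats(chunk)["pi"]


# Hypothetical toy alignment, used only to exercise the functions.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "ACGTACGT"]
stats = diversity_stats(toy)
windows = list(sliding_pi(toy, window=4, step=2))
```

With real data, the 100 bp window and 25 bp step described above would be passed as `window=100, step=25`. Values may differ slightly from DnaSP output because of how that programme handles gaps and window placement.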

Results
Nested PCR amplification of the human and monkey blood samples produced DNA fragments of 1053 base pairs. The sequence of each fragment was trimmed to 921 bp, based on the PkDBPαII region identified by Singh et al. [18]. This trimmed sequence encoded an amino acid sequence 307 residues in length. A final total of 59 sequences (GenBank Accession No. KC597079-KC597137) were obtained. DNA sequence analyses were conducted to determine nucleotide diversity and genetic differentiation. The average number of pairwise nucleotide differences (K) for the PkDBPαII was 11.736. The overall haplotype diversity (Hd) and nucleotide diversity (π) for the 60 PkDBPαII sequences were 0.986 ± 0.008 and 0.013 ± 0.002, respectively. Detailed analysis of π with a sliding window plot (window length 100 bp, step size 25 bp) revealed that diversity ranged from 0.003 to 0.034. The highest peak of nucleotide diversity was within nucleotide positions 600-750, whereas the most conserved region was within nucleotide positions 260-360 (Figure 1).
Analysis and comparison at the nucleotide level against P. knowlesi strain H as reference sequence showed mutations at 128 positions among the Peninsular Malaysia isolates. Fifty-two of these mutations were synonymous and 76 were nonsynonymous. To determine whether natural selection contributed to the diversity in the PkDBPαII, the rates of nonsynonymous (Kn) and synonymous (Ks) substitutions were estimated. Kn (0.00952) was found to be lower than Ks (0.00278) and the Kn/Ks ratio was 0.347, suggesting that purifying (negative) selection may be occurring in the PkDBPαII. Similarly, the Z test (Ks > Kn; P < 0.05) also indicated purifying selection on PkDBPαII. In the tests of departure from neutrality, the Tajima's D statistic was −2.085 (P < 0.05), indicating expansion in population size (Figure 2). Among the 70 polymorphic sites, 66 showed monomorphic mutation (changed into one amino acid type) and 4 showed dimorphic mutation [changed into two amino acid types: 221 (Q, T), 302 (N, Y), 303 (F, S), 307 (A, I)]. The amino acid sequences could be categorised into 36 different haplotypes (H1-H36), with haplotype 2 having the highest frequency (19/60). Twelve of the 20 human and 1 monkey blood samples had mixed haplotype infections (Table 2). Phylogenetic tree analysis revealed that the haplotypes could be clustered into 2 main allele groups (Figure 3).
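For readers unfamiliar with Tajima's D, the statistic compares the mean number of pairwise differences (K) with the number of segregating sites (S) scaled by sample size (n). A negative value means K is lower than expected from S under neutrality. The sketch below follows the standard published formula. It is an illustration only, not the DnaSP code, and the input values in the usage lines are hypothetical rather than taken from this study.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and mean pairwise differences k."""
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    # Harmonic-type sums over 1..n-1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    # Constants of the variance estimator.
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical inputs: 10 sequences, 16 segregating sites, K = 3.888889.
d_example = tajimas_d(10, 16, 3.888889)
```

The function only reproduces the point estimate. The significance level reported in the text (P < 0.05) comes from the test as implemented in DnaSP.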

Discussion
The DBPs of P. vivax and P. knowlesi play an essential role in erythrocyte invasion by mediating irreversible binding with their corresponding receptor, the Duffy antigen receptor for chemokines (DARC), on the surface of erythrocytes [19]. The DBP elicits strong immune responses in humans and therefore has been suggested to be an important vaccine candidate antigen [12,20]. The critical erythrocyte-binding motif of DBP has been located in region II of the protein. The P. vivax PvDBP has been observed to have a high degree of genetic polymorphism, and most of the polymorphism is located in region II. Although the cysteine and some hydrophobic amino acid residues in PvDBPII are conserved in P. vivax populations from different geographic regions, the remaining amino acid residues are highly polymorphic [6][7][8][9][10][11][21][22][23][24]. It has further been revealed that PvDBPII is under positive selection pressure, which results in allelic variation as a mechanism for immune evasion. On the other hand, despite the emerging importance of P. knowlesi as a human pathogen, no study has thus far been carried out on PkDBPαII. Our present study is the first to report the diversity and natural selection of PkDBPαII.
The PkDBPαII analysed in this study is based on the region defined by Singh et al. [18] using structural biology analysis. In their analysis, twelve C residues (positions 16,29,36,45,99,176,214,226,231,235,304,306), which form six disulphide bridges, have been suggested to be involved in the folding of PkDBPαII for interaction with DARC. Multiple alignment of the 60 PkDBPαII amino acid sequences (Additional file 1) in our study revealed that these 12 residues were highly conserved. Apart from these conserved C residues, the Y94, N95, K96, R103, L168 and I175 residues are required for recognition of DARC on human erythrocytes [18]. Again, our multiple sequence alignment showed high conservation of these residues in the PkDBPαII. In fact, Y94, N95, K96 and R103 fall within the most conserved region in the PkDBPαII (nucleotide positions 260-360 in the gene, Figure 1).
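A conservation check of this kind can be sketched in a few lines of Python. The sketch assumes gap-free aligned amino acid sequences and 1-based residue positions. The function name and the toy sequences are hypothetical, and this is not the procedure used to produce Additional file 1.

```python
from collections import Counter


def residue_conservation(aligned_seqs, positions):
    """Map each 1-based position to (most common residue, fraction of sequences carrying it)."""
    result = {}
    for pos in positions:
        # Collect the residue found at this position in every sequence.
        column = [seq[pos - 1] for seq in aligned_seqs]
        residue, count = Counter(column).most_common(1)[0]
        result[pos] = (residue, count / len(column))
    return result


# Hypothetical toy alignment: positions 2 and 5 are invariant, position 3 is not.
toy_alignment = ["MCKYC", "MCRYC", "MCKYC"]
conservation = residue_conservation(toy_alignment, [2, 3, 5])
```

For the alignment described above, `positions` would be the twelve cysteine positions listed in the text, and a fraction of 1.0 at a position would indicate that the residue is identical in all sequences.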
The PkDBPαII (K = 11.736; Hd = 0.986; π = 0.01274) in Peninsular Malaysia seemed to be more diverse than P. vivax PvDBPII in some neighbouring countries, such as Myanmar (K = 7.851; Hd = 0.875; π = 0.00790), Korea (K = 2.878; Hd = 0.775; π = 0.00299) and Sri Lanka (Hd = 0.922; π = 0.00982) [7,11,25]. Generally, high levels of genetic diversity in malaria parasite surface antigens are attributed to positive natural selection, for example, by the host immune system [26]. Positive selection has been reported in PvDBPII of isolates in Myanmar, Korea, Sri Lanka and Thailand [7,9,11,25]. In contrast, the PkDBPαII in Peninsular Malaysia was found to be under purifying (negative) selection. The reason for this finding is unclear, but it may be associated with the host of the parasite. For example, it has been reported that the rhoptry-associated protein 1 (RAP-1) antigen of non-human primate malarial parasites (P. knowlesi, P. cynomolgi, P. inui and P. fieldi) showed evidence of negative selection, which was not found in two human malarial parasites (P. falciparum, P. vivax) [27]. Another possible reason for the purifying selection on PkDBPαII is population expansion of P. knowlesi, as evidenced by the Tajima's D statistic. In Peninsular Malaysia, deforestation in many areas has increased the contact of monkeys and forest-dwelling Anopheles vectors with humans. The expansion of monkey and vector populations and their ecological niches into human habitation may therefore result in the population expansion of P. knowlesi [28].
Phylogenetic studies on P. vivax isolates based on PvDBPII revealed the occurrence of multiple haplotypes in a geographical location, and these haplotypes could be categorized into several allele groups. For example, 25 haplotypes from Thailand were organized into 5 groups [9]. In Papua New Guinea, 27 haplotypes clustered into 3 dominant groups [9], whereas in Korea, 13 haplotypes were clustered into 3 groups [7]. Our study revealed the occurrence of 2 distinct allele groups of PkDBPαII in Peninsular Malaysia (Figure 3). Interestingly, allele group I was more dominant as it contained the majority (30/36) of the haplotypes. The minor allele group II had only 6 haplotypes. Two monkey blood samples were included in this study but all haplotypes from the monkeys were grouped in allele group I. The reference strain H, which was isolated in 1965 in Peninsular Malaysia from the first reported case of human P. knowlesi infection [29], was also grouped in allele group I. This shows that despite temporal separation of almost 50 years between strain H and the recent isolates of this study, no major genetic differences had occurred in the PkDBPαII.

Conclusions
Our present study is the first to report the genetic diversity and natural selection of PkDBPαII. Hence, the haplotypes described in this report can be considered as novel. Although a high level of genetic diversity was observed, the PkDBPαII appeared to be under purifying (negative) selection. Two allele groups of haplotypes were obtained. However, the distribution was skewed, with the majority of the haplotypes clustered in a large dominant group. Our study was carried out on Peninsular Malaysia isolates only. Therefore, future study should investigate PkDBPαII of P. knowlesi from Borneo, which hitherto has recorded the highest number of human knowlesi malaria.

Additional file
Additional file 1: Full amino acid sequence alignment of PkDBPαII. Amino acid residues identical to those of the reference sequence (strain H) are indicated by dots. The twelve conserved cysteine (C) residues are marked in yellow. The conserved Y94, N95, K96, R103, L168 and I175 residues required for recognition of DARC on human erythrocytes are highlighted in green.