Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum
Parasites & Vectors volume 10, Article number: 320 (2017)
The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions.
We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima’s D) and integrated haplotype score (iHS) to identify genes under selection.
Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima’s D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively.
We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known genomic signatures of selection from drug pressure and host immunity. This is evidence that P. falciparum populations explore common adaptive strategies that can be targeted for the development of new interventions.
Malaria is a parasitic disease responsible for morbidity in approximately 214 million people globally and an estimated 438,000 deaths . In the last two decades, there have been significant decline in malaria prevalence due to interventions with drug, long-lasting insecticidal nets and other vectorial interventions. However, the disease remains a significant public health problem in sub-Saharan Africa (sSA) which harbours over 90% of the worldwide burden . The endemicity in sub-Saharan Africa is heterogeneous, with some regions heading towards elimination phase while hotspots of transmission remain in others. This heterogeneity is imposed possibly by differences in environments from rainfall, seasonality, human and vector hosts as well as varied levels of interventions. One of the countries which remain heavily affected is Nigeria. Nigeria bears the highest burden of malaria (mostly from Plasmodium falciparum infections) globally and interventions with drugs and anti-vector tools suffer from logistic difficulties associated with the economy and size of the country as well as extensive infection movements . This has been compounded by antimalarial drug resistance, evasion of host immune response and lack of highly efficacious vaccines [2,3,4]. It has been shown that high drug pressure and immunity select for adaptive parasite strains that sustain transmission (2, 5). Thus, several P. falciparum genes encoding drug or immune targets are under natural selection and bear signatures of directional or balancing selection [5,6,7]. For a more effective management of interventions, it is important to determine patterns of variation due to adaptation of the parasite to interventions and host environments. While directional forces cause adaptively important genetic variants of the parasite to increase in frequency leading to high fixation rates and the appearance of selective sweeps around affected loci, balancing selection brings the parasite’s favoured alleles to an intermediate equilibrium, where they are maintained as genetic polymorphisms. Although population genomics of P. falciparum explored in other West African populations have shown strong directional selection around known drug resistance genes and signatures of balancing selection on multiple candidate vaccine antigens [8,9,10,11,12], inferences of local adaptive mechanisms require looking more representatively at individual populations such as Nigeria.
Recent advances in next-generation genomics have proved useful in individual sequencing of P. falciparum in order to identify genes with polymorphic site frequency spectra consistent with selection. However, such technologies are still infeasible in resource-poor settings where malaria is endemic and large data sets are required. High-throughput sequencing on samples pooled from different individuals is a strategy that has been adopted to characterize genetic variability in diploid Drosophila at a small fraction of cost to detect single nucleotide polymorphisms (SNPs) and their estimated frequencies with a variance comparable with individual sequencing . To understand the genome-wide patterns of selection in Nigeria, we sequenced pooled isolates of P. falciparum and identified genes that are most variable and potentially under strong directional and balancing selection from antimalarial drug use and host immunity.
Participants 2 years and above seeking care for uncomplicated malaria at three health centres - Aramoko-Ekiti (AMK), Lekki (LEK) and Badagry (BDG) - in southwestern Nigeria were recruited during the malaria season between March and September, 2013 (Additional file 1: Figure S1), [samples were collected as part of the Plasmodium Diversity Network in Africa (PDNA) project]. All recruited malaria cases had a temperature of > 37.5 °C on presentation or history of fever in the previous 48 h, and a minimum of 5000 P. falciparum parasites/μl estimated by thick film examination. Up to 5 ml of venous blood was collected from each participant in sterile ethylene diamine tetracetic acid (EDTA)-coated vacutainer tubes after which participants were treated with artemether-lumefantrine according to Nigeria’s treatment policy . Collected samples were transported in an ice-cold container to the UK Medical Research Council Laboratories, The Gambia for further analyses.
Parasite examination and pooled sequencing
Thick and thin blood films were prepared and examined under the microscope (Olympus CX21, Hamburg, Germany) for confirmation of falciparum infections. Plasmodium falciparum-infected blood was depleted of white blood cells by size selection and hydrophoresis using CF11 columns to reduce the presence of human DNA . Thick and thin blood films were prepared before and after cellulose filtration to confirm complete removal of leucocytes. Extraction of genomic DNA was carried out using Qiagen Mini Kit (Qiagen, Hilden, Germany). The extracted DNA samples were stored at -20 °C for subsequent analysis. The quality as well as quantity of each DNA sample was determined using nanodrop (Thermo Fisher Scientific, MA, USA). Real-time PCR was carried out to determine the quantity of parasite DNA relative to human DNA contained in each sample using the procedures described by Veron et al. . To further deplete human DNA extracted with the parasite DNA, NEBNext Microbiome DNA Enrichment Kit (New England Biolabs, MA, USA) was used. Methylated human DNA was removed from the DNA mixture by binding to the methyl-CpG binding domain of human MBD2-Fc protein . Concentrations of individual parasite samples were normalized to 500,000 parasites/μl in a final volume of 920 μl of pooled DNA. Enzymatic shearing, end-repair and adapter-ligation leading to library preparation were carried out using NEBNext library prep kit (New England Biolabs, MA, USA). Library DNA was loaded on Ion 318™ chip (Thermo Fisher Scientific, MA, USA) and run on the Ion Personal Genome Machine (Thermo Fisher Scientific, MA, USA).
Data processing: alignment, variant calling and quality filtering
Reads generated were aligned to P. falciparum 3D7 genome using mapping and assembly with quality (MAQ) tools. We excluded reads of poor mapping quality, and derived a list of SNPs [and small insertions and deletions (indels)] based on uniquely mapping reads and acceptable levels of coverage (minimum 10, maximum 2000). In addition, we applied a filter to rule out error-prone variant calls, based on a threshold of Q20. Allele frequencies in the population were determined for all SNPs, by analyzing all genotyped samples. Non-reference (variant) allele frequency (NRAF) was computed as the proportion of genotyped samples whose allele was not the reference allele. The reading frame and exon boundaries were determined from the PlasmoDB 13.0 annotation of the 3D7 genome (www.plasmodb.org).
Identifying genes under balancing and directional selections
The allele frequency-based neutrality test, Tajima’s D , integrated haplotype score , and the ratio of nonsynonymous and synonymous polymorphisms (πN/πS) were applied to identify genes under selection. DnaSP 5.1  was used to perform calculation of Tajima’s D ratios. Evidence of recent directional selection was obtained from the standardised haplotype score using WHAMM . Extreme positive iHS scores (> 2.5) suggested selective sweeps along the ancestral allele signaling strong directional selection around that site.
High quality sequence data was obtained from genomic DNA pooled from 100 clinical isolates of P. falciparum. Genome-wide short-read sequences were generated yielding a median short sequence read length of 182 bp. Alignment to P. falciparum 3D7 reference genome indicated an average genome-wide coverage depth of 5.7× (Table 1). The sequencing technology yielded 488 million bases and total usable reads of 3,090,240. Variant calls were made at 56,784 polymorphic sites and 13,784 SNPs were identified after quality filtering. There was > 65% more read coverage in the genic regions than in the intergenic regions. Five thousand one hundred and twenty-two genes had at least 3 SNPs and were considered informative for frequency-based analyses. In all, 3122 genes were analysed across the 14 chromosomes of the parasite after excluding genes in the sub-telomeric regions (var, rifin and stevor genes). Each SNP was classified as synonymous or nonsynonymous according to whether an amino acid change occurred when substituting the reference allele with the non-reference allele at that SNP in the 3D7 reference genome sequence, without any other changes. Comparison between nonsynonymous and synonymous SNPs for some known target genes of antimalarial drugs and immunity is presented in Fig. 1.
Evidence of positive directional selection
Standardised integrated haplotype score (iHS) was used to examine evidence of recent directional selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified (Table 2). Two of such genes were chloroquine resistance transporter (CRT) on chromosome 7 and multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in dihydrofolate reductase (DHFR) in chromosome 4 and MDR5 genes on chromosome 13 with only 2 and 3 SNPs respectively identified within the iHS window (Table 3). However, there was no evidence of recent directional selection in dihydropteroate synthase (DHPS) gene on chromosome 8. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. However, the specific loci targeted by the selective sweep were unknown.
Identifying signatures of balancing selection
Tajima’s D values were mostly negative (Additional file 2: Table S1), with a mean value of -0.86. One hundred and twelve genes (3.59%) had positive Tajima’s D values (Fig. 2). Six genes that have been previously associated with selection from acquired immunity had significantly high Tajima’s D values (Table 4).
This investigation adopted pooled DNA sequencing to understand genome-wide patterns of selection and genetic variability that had only been previously achievable through individual sequencing [8, 12]. Regions of the genome under recent positive and balancing selections, representing signatures of adaptation to drug pressure and host immunity, were identified. Earlier reports showed no significant variance or population sub-structuring in the study areas , eliminating the possibility of false-positive results for signals of selection owing to population structure.
Drug pressure is a powerful selective force in natural parasite populations. Removal of drug pressure exposes resistant parasites to increased competition leading to a decline in the frequency of resistance conferring selection , as reported in Malawi where after just a decade of non-use, chloroquine (CQ) cleared 100% of asymptomatic P. falciparum infections . Similarly, the prevalence of CQ-resistant parasites in coastal Tanzania decreased after only two and a half years of CQ withdrawal . However, in keeping with the observation of high distribution of CQ-resistant parasites in western Kenya after a period of drug withdrawal , the results of the present study showed that the population of resistant parasites in the Nigerian region studied did not reduce significantly after years of CQ withdrawal. Instead, high extended haplotype scores (iHS) were found for SNPs around chloroquine resistant transporter (CRT) and multidrug resistance (MDR1) genes. These observations are consistent with positive selection from long periods of chloroquine use leading to depletion of chloroquine-susceptible parasites . Strong directional selection in these genes essentially suggests that the re-introduction of the drug for malaria treatment cannot be considered yet as drug resistant strains still predominate the population. This is contrary to previous observations which showed complete  or partial  reversal of P. falciparum populations to chloroquine susceptibility after a period of drug withdrawal. In populations where malaria is endemic, transmission is high, and acquired immunity is extensive, asymptomatic adults, who rarely become ill and who generally do not receive therapy, may provide a favorable environment for susceptible parasites to persist in the population. Malaria intervention efforts in Nigeria might have resulted in lower transmission and less acquired immunity, shrinking refugia and allowing resistant alleles to become fully fixed in the population, as was observed in South America  and Southeast Asia .
There was no strong evidence of selection associated with dihydrofolate reductase (DHFR) and dihydropteroate synthase (DHPS), the two major genes targeted by sulfadoxine-pyrimethamine (sulfadox). This supports reports that the drug is still considerably effective for malaria treatment . Thus, its use as Intermittent Preventive Treatment (IPT) of pregnant women remains effective. However, the detection of weak selection on DHFR gene implies selection acting in the drug resistance locus as sulfadox is still widely available from the informal pharmaceutical sector. There was also a major selective sweep on chromosome 6 supporting previous reports from Senegal  and The Gambia . This is the first evidence of the Chromsome 6 sweep beyond the populations of Senegal and The Gambia which are geographically linked. It therefore could be marking a common selective event in the history of P. falciparum in West Africa. However, further investigation is needed to understand the origin and mechanisms of selection in the chromosome region.
Against a genomic background in which most genes had negative values of Tajima’s D, this study, in agreement with observations in Senegal  and The Gambia , has identified six genes to be under strong balancing selection. These are some of the antigens that had been previously proposed as vaccine candidates [8, 9]. However, not all genes with high positive values of Tajima’s D are under the effect of balancing selection as demographic events such population contraction could also lead to values greater than zero. Thus, the parasite surface genes identified to be under balancing selection may require further validation for selection from immunity or diversity of human erythrocyte receptors. Immunological analyses of the allelic protein products of these identified genes should be prioritized.
While pool-seq has been shown here to be a useful approach in population genomics of P. falciparum, it is worthy of note that haplotype information may be traded off under multiplexed sequencing. Although this may be a necessary price for a cost-effective and accurate characterization of SNP frequencies, it should not be considered insignificant. Also, loss of resolution on haplotype structure makes it difficult to separate demographic from complete selective events as patterns of linkage disequilibrium (LD) cannot be resolved . Futher application of this approach on P. falciparum evolutionary analyses will benefit from a more detailed comparison of allele frequency spectrum obtained with that from individual sequencing.
We have demonstrated the use of pooled DNA sequencing to understand genome-wide patterns of selection of P. falciparum previously achievable through individual sequencing, identifying regions of the parasite genome under the influence of drug and host immune selection.
Chloroquine resistance transporter
Erythrocyte binding antigen
Integrated haplotype score
Mapping and assembly with quality
Merozoite surface protein
Non-reference allele frequency
Serine repeat antigen
World Health Organization. World malaria report 2015. Geneva: World Health Organization; 2015.
Noedl H, Se K, Schaecher K, Smith BL. Evidence of artemisinin-resistant malaria in western Cambodia. N Engl J Med. 2008;359:2619–20.
Dondorp AM, Nosten P, Yi D, Das A, Phyo J, Tarning KM, et al. Artemisinin resistance in Plasmodium falciparum malaria. N Engl J Med. 2009;361:455–67.
Phyo AP, Nkhoma K, Stepniewska EA, Ashley S, Nair R, McGready C, et al. Emergence of artemisinin-resistant malaria on the western border of Thailand: a longitudinal study. Lancet. 2012;379:1960–6.
Volkman SK, Sabeti PC, DeCaprio D. A genome-wide map of diversity in Plasmodium falciparum. Nat Gen. 2007;39:13–119.
Mu J, Awadalla P, Duan J, McGee KM, Keebler J, Seydel K, et al. Genome-wide variation and identification of vaccine targets in the Plasmodium falciparum genome. Nat Gen. 2007;39:126–30.
Weedal GD, Preston BM, Thomas AW, Sutherland CJ, Conway DJ. Differential evidence of natural selection on two leading sporozoite stage malaria vaccine candidate antigens. Int J Parasitol. 2008;37:77–85.
Mobegi VA, Duffy CW, Amambua-Ngwa A, Loua KM, Laman E, Nwakanma D, et al. Genome-wide analysis of selection on the malaria parasite Plasmodium falciparum in West African populations of differing infection endemicity. Mol Biol Evol. 2014;31(6):1490–9.
Amambua-Ngwa A, Tetteh KK, Masnke M, Gomez-Escobar N, Stewart LB, Deerharke ME, et al. Population genomic scan for candidate signatures of balancing selection to guide antigen characterization in malaria parasites. PLoS Gen. 2012;8(11):e1002992.
Mobegi VA, Loua KM, Ahouidi AD, Satoguina J, Nwakanma DC, Amambua-Ngwa A. Population genetic structure of Plasmodium falciparum across a region of diverse endemicity in West Africa. Malar J. 2012;11:223.
Mackinnon MJ, Read AF. Immunity promotes virulence evolution in a malaria model. PLoS Bio. 2004; doi:10.1371/journal.pbio.0020230.
Chang HH, Park DJ, Galinsky KJ, Schaffner SF, Ndiaye D, Ndir O, et al. Genomic sequencing of Plasmodium falciparum malaria parasites from Senegal reveals the demographic history of the population. Mol Biol Evol. 2012;29:3427–39.
Feretti L, Ramos-Onsins SE, Pérez-Enciso M. Population genomics from pool sequencing. Mol Ecol. 2013;22:5561–76.
Federal Ministry of Health. National Malaria Control Programme in Nigeria. Annual report; 2005. p. 1–27.
Venkatessan M, Amaratunga C, Campino S, Auburn S, Koch O, Lim P, et al. Using CF11 cellulose columns to inexpensively and effectively remove human DNA from Plasmodium falciparum-infected whole blood samples. Malar J. 2012;11:41.
Veron V, Legrand E, Yrinesi J, Volney B, Simon S, Bernard C. Genetic diversity of msp3α and msp1_b5 markers of Plasmodium vivax in French Guiana. Malar J. 2009;8:40.
Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009;462:315–22.
Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123:585–95.
Voight BF, Kudaralli S, Wen X, Pritchard JK. A map of recent positive selection in the human genome. PLoS Bio. 2006;10(1371):e72.
Rozas J, Librado P, Sanchez-Del Barrlo JC, Messeguer X, Rozas R. DnaSP version 5 help contents. Bioinformatics. 2009;25(11):1451–2.
Oyebola MK, Idowu ET, Nyang H, Olukosi YA, Otubanjo OA, Nwakanma DC, et al. Microsatellite markers reveal low levels of population sub-structuring of Plasmodium falciparum in southwestern Nigeria. Malar J. 2012;13:493.
Walliker D, Hunt P, Babiker H. Fitness of drug-resistant malaria parasites. Acta Trop. 2005;94:251–9.
Kublin JG, Cortese JF, Njunju EM, Mukadam RA, Wirimma JJ, Kazemba PN, et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009;462(7271):315–22.
Temu MJ, Kaale E, Marawti M. Storage of antimalarials at household level and associated factors in Kiromo ward, Bagamoyo Tanzania. Afr Health Sci. 2006;6:43–8.
Zhong D, Afrane Y, Githeko A, Chui L, Menge D, Yan G. Molecular epidemiology of drug-resistant malaria in Western Kenyan highlands. BMC Infect Dis. 2008;8:105.
Cortese JF, Caraballo A, Contreras CE, Plowe CV. Origin and dissemination of Plasmodium falciparum drug-resistance mutations in South America. J Infect Dis. 2002;186:999–1006.
Nash D, Nair S, Mayxay M, Newton P, Guthmann JP, Nosten F, et al. Selection strength and hitch-hiking around two antimalarial resistance genes. Proc R Soc Lond B Biol Sci. 2005;272:1153–61.
Omole MK, Onademuren OT. A survey of antimalarial drug use practices among urban dwellers in Abeokuta, Nigeria. Afr J Biomed Res. 2010;13:1–7.
Park DJ, Lukens AK, Neafsey DE, Schaffner SF, Chang HH, Valim C, et al. Sequence-based association and selection scans identify drug resistance loci in the Plasmodium falciparum malaria parasite. Proc Natl Acad Sci USA. 2012;109:13052–7.
We are grateful to the patients who contributed samples for this study and to the staff of the local health facilities, The Medical Research Council Unit The Gambia and The Nigerian Institute of Medical Research, Nigeria.
This work was supported by the MDR/DFID Career Development Fellowship and Plasmodium Diversity Network in Africa Centenary Award (1103-PF-PDN-GMSN) from MRC Centre for Genomics and Global Health (CGGH) to AA-N.
Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional files.
Ethics approval and consent to participate
Informed consent was obtained from individuals (parental consent and/or assent from under-16 participants) prior to recruitment. Ethical approval with reference number IRB/12/209 was obtained from the Nigerian Institute of Medical Research (NIMR) ethics committee.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Oyebola, K.M., Idowu, E.T., Olukosi, Y.A. et al. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum . Parasites Vectors 10, 320 (2017). https://doi.org/10.1186/s13071-017-2260-z
- Dihydrofolate Reductase
- Balancing Selection
- Chloroquine Resistance Transporter (CRT)
- Integrated Haplotype Score