- Open Access
Analysis of population genetic structure of Indian Anopheles culicifacies species A using microsatellite markers
Parasites & Vectorsvolume 6, Article number: 166 (2013)
Anopheles culicifacies sensu lato is an important vector of malaria in Southeast Asia contributing to almost 70% of malaria cases in India. It exists as morphologically similar sibling species A, B, C, D and E with varied geographical distribution patterns. Vector control measures have been difficult for this important vector as the sibling species have developed varying levels of resistance to the currently used insecticides. In view of the importance of this vector, we developed and validated a set of microsatellite markers and the same were used to analyze the population genetic structure of five different geographical populations of An. culicifacies A.
Anopheles culicifacies A samples were collected from different localities across India, and genotyping was performed using eight microsatellite markers on ABI Prism 310 Genetic Analyzer. Several statistical analyses were performed to ascertain the genetic diversity that exists within and between the populations.
The markers were found to be moderately polymorphic in the populations. Genetic analysis indicated significant genetic differentiation between the majority of the population pairs analyzed and was not found to be related to the geographical distances between populations.
This is the first and successful attempt to test the microsatellite markers developed for population genetic analysis of An. culicifacies A. Host feeding and breeding habits of species A suggest that factors other than ecological and geographical barriers were responsible for the genetic differentiation that has been observed between the populations.
Anopheles culicifacies sensu lato has a wide distribution in India and extends into the west to Ethiopia, Yemen, Iran, Afghanistan and Pakistan, and in the east to Bangladesh, Myanmar, Thailand, Cambodia and Vietnam and is also found in Nepal and southern China in the north and extends to Sri Lanka in the south [1, 2]. It is an important malaria vector in India, Sri Lanka and in the countries west of India. It is responsible for 60-70% of new malaria cases in India . This taxon has now been recognized as a species complex with five members provisionally designated as species A and B , C , D  and E . Species, A, B, C and D, were recognized following the observation of a total absence or significant deficiency of heterozygotes in natural populations for the alternate arrangements observed in polytene chromosomes due to paracentric inversions. The fifth species, species E, was identified by correlating Y-chromosome polymorphisms of sons and the sporozoite positivity of mothers.
There is little information available about the population structure and gene flow that occurs between and/or within An. culicifacies sibling species populations in India. Several studies were carried out to examine the population structure of vectors in Africa and other countries, namely An. gambiae s. s. , An. arabiensis, An. funestus, An. darlingii and many other vectors, using microsatellite markers and mitochondrial genes. Recently, two major vectors found in India, An. stephensi an urban malaria vector  and An. baimaii, a vector in north eastern states  were analysed for population genetic structure and gene flow using microsatellite markers and mitochondrial DNA genes respectively. Microsatellites are highly polymorphic genetic markers used extensively for studying genetic structure of populations. Realizing the importance of knowledge on the population structure and gene flow for implementing insecticide resistance management strategies for the control of An. culicifacies sibling species, microsatellite markers developed in our laboratory  were used in this study to understand the population structure of An. culicifacies species A populations, and the results are reported in this paper.
Sample collection and species identification
Anopheles culicifacies samples were collected from five different localities in India, namely the states (districts); Haryana (Sonepat), Gujarat (Kheda), Karnataka (Bijapur), Rajasthan (Jodhpur), and Uttar Pradesh (Allahabad). Specific spatial details of the collection sites are given in Figure 1 and were selected to represent different eco-settings that prevail in India. Collections of indoor-resting mosquitoes were made by hand-collection using suction tube and torchlight. Specimens collected from resting sites were confirmed as An. culicifacies on a morphological basis, following the mosquito identification key by Christophers . For sibling species identification, semi-gravid females were used for cytotaxonomic identification and genotyping. Ovaries from individual semi-gravid females were pulled out and stored in modified Carnoy’s fixative (1:3 glacial acetic acid and methanol) and the carcasses of the mosquito were stored in isopropanol for DNA isolation. The two vials of each mosquito were given a corresponding code number for species correlation. Ovaries were processed for the preparation of polytene chromosomes and diagnostic inversion genotype analyses  were used for sibling species identification. In cases where specimens were not in a suitable stage for cytological identification, 28S-D3 – PCR assay was used .
Genomic DNA extraction and Microsatellite genotyping of field collected samples
The DNA was isolated from individual mosquitoes following the procedure described previously  and stored at −20°C until further use. In total thirty-one microsatellite markers were isolated in An. culicifacies A in our previous investigation . Randomly, eight microsatellite markers, which were found to be polymorphic in the laboratory reared specimens and in one field population of An. culicifacies were used in the present study. Linkage relationship of these loci is not known. Each of the primer pairs belonging to the selected loci (Table 1), were labeled at 5′ with TET™, HEX™ or 6-FAM™ fluorescent dyes (Microsynth Corporation, Switzerland). Multiplex PCRs were set up by grouping 2 or 3 primer pairs together depending on their annealing temperatures, dyes, and sizes and PCR amplifications. The PCR primers and procedures followed in the study were same as described earlier . The resultant amplified products were resolved using an ABI Prism 310 Genetic Analyzer (Applied Bio systems). Alleles were sized relative to an internal fluorescent standard marker and the results were analyzed using the Genescan software version 3.1 (Applied Bio systems).
Random mating i.e., agreement to Hardy-Weinberg Equilibrium (HWE) within each of the populations was tested by using the Arlequin v.2.0 software . An unbiased estimate of the exact P-value for each locus was computed using the Markov chain method of Geo and Thompson  with the forecasted chain length of 100,000 steps and dememorization steps of 1000. The observed proportion of heterozygote deficiencies (D) and the frequencies of null alleles (r) that caused the heterozygote deficiencies were estimated following the method described in Chakraborty et al., . For calculation of D and r, the following equations were used: D = (HE-Ho)/HE and r = (HE-Ho)/(HE + Ho); where HE is expected heterozygote frequency and HO is observed frequency of heterozygotes.
Genetic variability among the populations was measured by Wright’s F-statistics . Further, F-statistics values were also calculated according to the method of Weir and Cockerham using the GenAlex v.5.4 (MS Excel-based genetic analysis tool) . The significance of FST values was tested using the formula described by Workman and Niswander where Chi square = 2NFST (where N is the population size with n-1 degrees of freedom for ‘n’ subpopulations) .
The effective migration rate (Nm) was estimated according to Slatkin’s (1987) formula; Nm = 1/4 [(1/ FST)-1] . To investigate, if levels of differentiation were related to geographical distances, the regression of FST (1-FST) on the natural log (ln) of geographical distance was used .
Morphologically identified An. culicifacies specimens were collected from five different localities in India that were selected to best represent the diversity of Indian geography and its ecology (Figure 1). Of the specimens identified, both cytotaxonomically and by PCR methods, a total of 104 species A samples were analyzed for the population structure of An. culicifacies in India. Details of the number of samples from each of the five localities are given in Table 1 and the location of the study sites are shown in Figure 1.
Distribution and the level of genetic diversity
A total of 104 samples were analyzed for eight microsatellite markers, in order to understand various population genetic parameters that might be influencing the present genetic make-up of An. culicifacies. Genetic diversity using microsatellite markers was compared by examining the number of alleles and its heterozygosity observed among the five different populations of species A. Various population parameters such as the number of samples from each of the five locations, total number of alleles observed (N), expected and observed heterozygosity (Ho, He), deficiency of number of heterozygotes (D), etc. were analyzed (Table 2). Among the populations examined, a total of 69 alleles were observed among the eight loci. The observed loci were found to be moderately polymorphic among all the populations. In all the five populations of species A, the maximum number of alleles observed ranged between 4 and 17. Of the eight microsatellite loci, ACAV1B213 was found to be highly polymorphic with 17 alleles. This marker had a maximum of 12 alleles in Sonepat and Kheda populations, while 8, 9, and 11 alleles were observed in Bijapur, Allahabad, and Jodhpur, respectively. The AcAVIIB46 and AcAVB93A loci were least polymorphic with the maximum number of alleles being four. The four alleles of AcAVIIB46 were observed in Bijapur and Kheda populations while AcAVB93A displayed four alleles only in Bijapur. The weighted average of number of alleles between the eight markers among all the samples ranged from 2.5 to 10.6, while between the five different populations the range of average alleles varied from 4.41-5.74. Among the five populations of species A, the average of expected heterozygosity ranged from 0.42 - 0.87 per locus suggesting that expected heterozygosity does not significantly vary between the populations to the average of the total expected heterozygosity observed cumulatively in the total population. The observed heterozygosity among the eight loci ranged from 0.199 to 0.418 but did not differ significantly among the populations (0.182 to 0.4706) (F-0.53, d.f.-6, P > 0.05).
Hardy-Weinberg equilibrium (HWE)
The genotype frequencies at several loci did not conform to HWE. For the majority of markers, the observed number of heterozygotes was less than the expected number and this deficiency was statistically significant (P < 0.05). A total of 40 tests (8 loci, 5 populations) in species A for conformance to HWE at the locus level within the populations were performed. Of these tests, 29 (72.5%) tests showed a significant deviation from HWE in species A. Furthermore, in all the cases heterozygote deficiency was observed (Table 2). The loci ACAV1B213, ACA61 and ACAVB93A showed a significant heterozygote deficiency in all the five populations, while the locus ACA11B5 showed a significant departure in four out of the five populations. Loci AcAB93 and AcAVIIB46 showed heterozygote deficiency in three populations while loci AcA59 and AcAVIIIB40 showed heterozygote deficiency in only two populations. Presence and frequency of null alleles was tested using the method of Chakraborty et al.  (Table 3). Average estimate of null allele frequency ranged from 0.02 to 1 with a maximum average of 0.45 in all the populations. Locus AcAVB93A showed the maximum frequency of null alleles in all the populations, while ACA11B5 contributed the least (Table 3).
Correlation between the population divergence and isolation of populations by distance
The five populations exhibited remarkably similar allelic distributions on averaging all loci but differential patterns were observed among the loci in four out of eight loci (Table 3). A single allele of 206 bp of the marker ACA59 was predominant in all the five populations of species A, and with reference to the ACAVIB213 locus, each of the populations had a different predominant allele (Table 3). There were also alleles unique to one or the other populations, but the frequencies of these alleles were very low (data not shown). The maximum frequency of alleles at each of the loci ranged from 0.2 to 0.9 in the populations (Table 3).
Genetic variability between species A populations was studied using F-statistics. The pair-wise genetic differentiation among species A populations were found to be significant in eight out of the ten combinations and ranged from 0.0853 – 0.2483. The populations sampled from five different geographical regions represented northern, western, eastern, and southern parts of the country (Figure 1). The shortest distance between the two populations was 400 km between the districts Kheda and Jodhpur, while the longest was 1730 km between the districts Sonepat and Bijapur. A high level of genetic differentiation was observed between the species A populations. The mean FST for all the markers and all the populations was 0.155. The FST values were non-significant for two population comparisons, namely Kheda and Sonepat; and Bijapur and Sonepat, which indicates a low genetic differentiation between these populations. For all other population comparisons, the values were significant suggesting a high genetic differentiation. Furthermore, to understand the role of geographical distance in generating the genetic distance between the sampled populations, the Mantel test was performed. The test revealed no significant correlation between the pair-wise FST/(1 - FST) against the natural logarithm of pair-wise geographical distance (r2 = 0.3952; Figure 2), suggesting that the population genetic structure of An. culicifacies in India did not conform to the isolation-by-distance model. This was also evident from the analysis of molecular variance which is calculated based on the FST values, and is found to be greater within the pairs of populations analyzed (87%) than that among the populations (13%) (Figure 3).
The Nm-values were calculated based on the FST values for all the comparisons and the values are tabulated in Table 4. In all the pair-wise population combinations, except for two comparisons (Sonepat vs. Allahabad; and Jodhpur vs. Sonepat), Nm values were >1 with the maximum of 7.6 between Sonepat and Kheda. The observation of low Nm values suggests the greater genetic differentiation between the populations. Thus, from the analysis of Nm values, it appears that maximum differentiation was in Allahabad vs. Sonepat comparisons (Nm = 0.83) and Jodhpur vs. Sonepat (Nm = 0.70). The two comparisons with maximum Nm values observed in this study are Kheda vs. Sonepat (7.6), and Kheda vs. Bijapur (5.6) which are distantly located, at 1000 and 1400 km apart, respectively, and it is only for these two population comparisons, that the FST values are non- significant. The measure of gene flow again supports that the genetic differentiation observed is not related to geographical distance.
All the eight microsatellite markers examined in An. culicifacies species A populations were polymorphic. Maximum number of alleles at each locus ranged from 4 to 10 with the exception for one marker (ACAVVIB213), which exhibited 17 alleles, suggesting that these markers are moderately polymorphic in the populations. Except for AcAVIIB46 and AcAVB93A with the least number of alleles, the other markers did not exhibit all the alleles in any of the five populations. Significant deviations from the Hardy-Weinberg expectations for these markers were observed in the populations studied. The deficiency of heterozygotes observed in these populations could be due to population sub-structuring or due to the presence of null alleles, i.e. mutation/s in the primer sites that may prevent annealing of primers to the DNA samples. This results in either total failure of amplification (homozygous for null allele), or most commonly one strand will be amplified and the sample will be scored as a homozygote, which in fact is a heterozygote . During genotyping, a few DNA samples did not amplify for one or the other marker. In order to rule out the problems of multiplexing, these samples were also genotyped separately by setting up an individual PCR. As these samples were normally amplified in some other combinations, non-amplification is necessarily not due to the poor quality of DNA or other general PCR procedural problems. Grouping together of different gene pools (Wahlund effect) and non-random mating within the populations i.e. presence of sub-populations as a reason for the deficiencies is not being favored because (i) collections were of only indoor-resting mosquitoes, collected from both human dwellings and cattle sheds, and mosquitoes were collected from structures within a single village or one or two more which are within 5–10 km range, (ii) the extensive cytotaxonomic studies carried out in different parts of the country did not indicate any sub-populations and (iii) that D values up to 10% do not indicate the presence of sub-populations . The D-values observed in this study are significantly low. Therefore, presence of null alleles is being considered to be mainly responsible for the deficiencies of heterozygotes observed at these loci. However, it is not known whether the markers not being in Hardy-Weinberg equilibrium has had any influence on the population genetic parameters estimated in this study. Keeping this in view and that the markers were randomly selected not knowing whether they represent the entire genome, a few conservative conclusions are being drawn from the F-statistics data. FST values have shown a great genetic differentiation between the pairs of populations analyzed. Low Nm values (<1) between Haryana and Uttar Pradesh, and Rajasthan and Haryana suggest a limited gene flow. The geographical distances of these two population pairs were less than those observed for Gujarat and Haryana, and Gujarat and Karnataka which had maximum Nm values (>5 and 7 respectively).
Vindhyachal Mountain ranges which pass through central India (Maharashtra and Madhya Pradesh States) separate northern and southern parts of India. Karnataka is located in the southern part of India. There are no known geographical barriers that exist between the other four populations studied. Furthermore, anopheline species in general are known to have limited flight range. For species A, which is predominantly zoophagic (maximum anthropophilic index observed was 3-4%), An. culicifacies being a rural mosquito and agriculture being practiced extensively in these areas indicate free availability of cattle for blood feeding. Irrigation channels are preferred breeding habitat for species A. This suggests that other than geographical or ecological, some other barriers are playing a role for differential genetic differentiation levels observed between these populations. Residual sprays with effective insecticides alter and interfere with population sizes. At the time of collection of mosquitoes, no spray operations were going on in sites in Uttar Pradesh, Haryana and Rajasthan states. In Gujarat and Karnataka, malathion was being sprayed in the study sites, which even at present is effective on An. culicifacies populations in these areas. Therefore, the effect of insecticide sprays on the mosquito population sampling cannot be ruled out.
This is the first and successful attempt made to study the population genetic structure of An. culicifacies using microsatellite markers. Genetic analysis of five different populations of species A was carried out with eight microsatellite markers. FST values indicated significant genetic differentiation between the majorities of the population pairs analyzed. We hope that these results may add a further step in understanding the dynamics of the vector species for planning effective vector control activities based on population genetic structure.
Ramachandra Rao T: The anophelines of India. 1984, Delhi: Malaria Research Center (ICMR)
World Health Organization Regional Office for South-East Asia: Anopheline species complexes in South and South-east Asia. 2007, Delhi: World Health Organization
Green CA, Miles SJ: Chromosomal evidence for sibling species of the malaria vector Anopheles (Cellia) culicifacies Giles. J Trop Med Hyg. 1980, 83 (2): 75-78.
Subbarao SK, Vasantha K, Adak T, Sharma VP: Anopheles culicifacies Complex: Evidence for a New Sibling Species, Species C. Ann of the Entomol Soc America. 1983, 76 (6): 985-988.
Vasantha K, Subbarao SK, Sharma VP: Anopheles culicifacies complex: population cytogenetic evidence for species D (Diptera: Culicidae). Ann of the Entomol Soc America. 1991, 84 (5): 531-536.
Kar I, Subbarao SK, Eapen A, Ravindran J, Satyanarayana TS, Raghavendra K, Nanda N, Sharma VP: Evidence for a new malaria vector species, species E, within the Anopheles culicifacies complex (Diptera: Culicidae). J Med Entomol. 1999, 36 (5): 595-600.
Lanzaro GC, Toure YT, Carnahan J, Zheng L, Dolo G, Traore S, Petrarca V, Vernick KD, Taylor CE: Complexities in the genetic structure of Anopheles gambiae populations in west Africa as revealed by microsatellite DNA analysis. Proc Nat Acad Sci. 1998, 95 (24): 14260-14265. 10.1073/pnas.95.24.14260.
Temu EA, Yan G: Microsatellite and mitochondrial genetic differentiation of Anopheles arabiensis (Diptera: Culicidae) from western Kenya, the Great Rift Valley, and coastal Kenya. Am J Trop Med Hyg. 2005, 73 (4): 726-
Choi KS, Koekemoer LL, Coetzee M: Population genetic structure of the major malaria vector Anopheles funestus s.s. and allied species in southern Africa. Parasites & Vectors. 2012, 5: 283-10.1186/1756-3305-5-283.
Mirabello L, Vineis J, Yanoviak S, Scarpassa V, Povoa M, Padilla N, Achee N, Conn J: Microsatellite data suggest significant population structure and differentiation within the malaria vector Anopheles darlingi in Central and South America. BMC Ecol. 2008, 8 (1): 3-10.1186/1472-6785-8-3.
Vipin , Dube M, Gakhar SK: Genetic differentiation between three ecological variants (‘type’, ‘mysorensis’ and ‘intermediate’) of malaria vector Anopheles stephensi (Diptera: Culicidae). Insect Sci. 2010, 17 (4): 335-343.
Sarma DK, Prakash A, O’Loughlin SM, Bhattacharyya DR, Mohapatra PK, Bhattacharjee K, Das K, Singh S, Sarma NP, Ahmed GU: Genetic population structure of the malaria vector Anopheles baimaii in north-east India using mitochondrial DNA. Malaria J. 2012, 11: 76-10.1186/1475-2875-11-76.
Sunil S, Raghavendra K, Singh OP, Malhotra P, Huang Y, Zheng L, Subbarao SK: Isolation and characterization of microsatellite markers from malaria vector, Anopheles culicifacies. Mol Ecol Notes. 2004, 4 (3): 440-442. 10.1111/j.1471-8286.2004.00698.x.
Subbarao SK: Anopheline species complexes in south-east Asia. 1998, : WHO Regional Offices for South-East Asia New Delhi
The Fauna of British India, including Ceylon and Burma Diptera Vol IV Family Culicidae Tribe Anopheline. Edited by: Christophers SR. 1933, London: Taylor & Francis
Subbarao SK, Vasantha K, Sharma VP: Cytotaxonomy of certain malaria vectors in India. 1988, Oxford: Clarendos Press
Raghavendra K, Cornel AJ, Reddy BP, Collins FH, Nanda N, Chandra D, Verma V, Dash AP, Subbarao SK: Multiplex PCR assay and phylogenetic analysis of sequences derived from D2 domain of 28S rDNA distinguished members of the Anopheles culicifacies complex into two groups, A/D and B/C/E. Infect Genet Evol. 2009, 9 (2): 271-277. 10.1016/j.meegid.2008.12.007.
Coen E, Strachan T, Dover G: Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila. J Mol Biol. 1982, 158 (1): 17-35. 10.1016/0022-2836(82)90448-X.
Schneider S, Roessli D, Excoffier L: Arlequin: a software for population genetics data analysis. User manual ver. 2000, 2: 2496-2497.
Guo SW, Thompson EA: A Monte Carlo method for combined segregation and linkage analysis. Am J Hum Genet. 1992, 51 (5): 1111-
Chakraborty R, Andrade M, Daiger SP, Budowle B: Apparent heterozygote deficiencies observed in DNA typing data and their implications in forensic applications. Ann Hum Genet. 1992, 56 (1): 45-57. 10.1111/j.1469-1809.1992.tb01128.x.
Wright S: Vol. 4: Variability within and among natural populations. 1978, Chicago [etc.]: University of Chicago Press
Weir BS, Cockerham CC: Estimating F-statistics for the analysis of population structure. Evolution. 1984, 38 (6): 1358-1370. 10.2307/2408641.
Workman PL, Niswander JD: Population studies on southwestern Indian tribes. II. Local genetic differentiation in the Papago. Am J Hum Genet. 1970, 22 (1): 24-49.
Slatkin M: A measure of population subdivision based on microsatellite allele frequencies. Genetics. 1995, 139 (1): 457-462.
Fo R: Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics. 1997, 145 (4): 1219-1228.
Lehmann T, Hawley WA, Kamau L, Fontenille D, Simard F, Collins FH: Genetic differentiation of Anopheles gambiae populations from East and west Africa: comparison of microsatellite and allozyme loci. Heredity. 1996, 77: 192-200. 10.1038/hdy.1996.124.
Financial support for this study (Project ID no: 990484) was received from WHO/TDR under Molecular Entomology Committee. The authors would like to thank all technical staff of NIMR for their assistance in conducting the study.
This manuscript was approved by NIMR publication committee (Approval no: 009/2013).
The authors declare they have no competing interests.
SKS conceived and RK, OPS and SKS planned the study. RK made the field collections and NN identified all mosquito samples used in study. SS was involved in setting up all experiments. OPS and NR performed data analysis. SS, NR and SKS drafted the manuscript. All authors read and approved the final version of the manuscript.