DNA barcoding of morphologically characterized mosquitoes belonging to the subfamily Culicinae from Sri Lanka

Background Vectors of mosquito-borne diseases in Sri Lanka, except for malaria, belong to the subfamily Culicinae, which includes nearly 84% of the mosquito fauna of the country. Hence, accurate and precise species identification of culicine mosquitoes is a crucial factor in implementing effective vector control strategies. During the present study, a combined effort using morphology and DNA barcoding was made to characterize mosquitoes of the subfamily Culicinae for the first time from nine districts of Sri Lanka. Cytochrome c oxidase subunit 1 (cox1) gene from the mitochondrial genome and the internal transcribed spacer 2 (ITS2) region from the nuclear ribosomal DNA were used for molecular characterization. Results According to morphological identification, the field collected adult mosquitoes belonged to 5 genera and 14 species, i.e. Aedes aegypti, Ae. albopictus, Ae. pallidostriatus, Aedes sp. 1, Armigeres sp. 1, Culex bitaeniorhynchus, Cx. fuscocephala, Cx. gelidus, Cx. pseudovishnui, Cx. quinquefasciatus, Cx. tritaeniorhynchus, Cx. whitmorei, Mansonia uniformis and Mimomyia chamberlaini. Molecular analyses of 62 cox1 and 36 ITS2 sequences were exclusively comparable with the morphological identifications of all the species except for Ae. pallidostriatus and Aedes sp. 1. Although the species identification of Armigeres sp. 1 specimens using morphological features was not possible during this study, DNA barcodes of the specimens matched 100% with the publicly available Ar. subalbatus sequences, giving their species status. Analysis of all the cox1 sequences (14 clades supported by strong bootstrap value in the Neighbor-Joining tree and interspecific distances of > 3%) showed the presence of 14 different species. This is the first available DNA sequence in the GenBank records for morphologically identified Ae. pallidostriatus. Aedes sp. 1 could not be identified morphologically or by publicly available sequences. Aedes aegypti, Ae. albopictus and all Culex species reported during the current study are vectors of human diseases. All these vector species showed comparatively high diversity. Conclusions The current study reflects the significance of integrated systematic approach and use of cox1 and ITS genetic markers in mosquito taxonomy. Results of DNA barcoding were comparable with morphological identifications and, more importantly, DNA barcoding could accurately identify the species in the instances where the traditional morphological identification failed due to indistinguishable characters of damaged specimens and the presence of subspecies. Electronic supplementary material The online version of this article (10.1186/s13071-018-2810-z) contains supplementary material, which is available to authorized users.


Background
Correct species recognition of mosquito vectors is a vital component in implementing effective vector control strategies. Morphology based taxonomy and molecular characterization are the two major approaches used in species identification. Taxonomic keys used in morphological identification of mosquitoes mainly depend on the external features of adult females and fourth-instar larvae where the specimens must be handled and stored cautiously without damaging the external features, which is not practically possible all the time. Moreover, this approach needs expertise and is time consuming. Also, it does not identify the factors such as genetic variations and phenotypic plasticity that can have an impact on species level identification [1]. Alternatively, DNA barcoding or molecular characterization has become increasingly popular as an efficient method of species identification since it produces results with high precision even from a part of the specimen, within a short period of time [1]. Molecular taxonomy enables researchers to identify mosquitoes to species or subspecies level, understand genetic diversity and make predictions on evolution and phylogenetic relationships. Among all the barcoded insect groups, mosquitoes are the most intensely barcoded [2], probably because of their importance as vectors of many life threatening human diseases.
A region of the cytochrome c oxidase subunit 1 (cox1) gene located in the mitochondrial genome and a region in the nuclear ribosomal DNA internal transcribed spacer 2 (ITS2) have been the most frequently used genetic markers in DNA barcoding of mosquitoes. cox1 has been used as the only molecular marker in characterizing 37 Canadian, 62 Indian, 122 Chinese, 32 Pakistani and 45 Singaporean mosquito species [3][4][5][6][7]. A study conducted in India has used the same marker in investigating the molecular evolution of mosquito vectors of medical and veterinary importance [8]. The ITS2 region has been used in distinguishing closely related mosquito species belonging to various genera of the subfamily Culicianae, such as Culex [9] and Aedes [10]. Another barcode region used in mosquito barcoding studies is cox2 of the mitochondrial genome [11,12]. D3 expansion segment (~1000 bp) a coding region located in the large subunit of the nuclear ribosomal DNA is also used as a molecular marker in mosquito barcoding studies since it shows a higher inter-/intraspecific variations.
Molecular markers, as a group, will provide more reliable information on the genetic variation between and within a species. Sri Lankan anophelines that included major and potential vectors of malaria have been studied successfully using both cox1 and ITS2 markers [13]. A study has used both these marker regions in barcoding the vector mosquitoes Ae. aegypti, Ae. albopictus, Culex tritaeniorhynchus, Cx. vishnui and Cx. pseudovishnui [14]. A combination of cox1, ITS2 and D3 has been used in another study to recognize sibling species of several mosquito species complexes [15]. However, researchers state that characterization of mosquitoes through integrated approaches using both morphological and molecular identification is the best approach for species identification [5].

Study sites
Mosquitoes were collected from nine administrative districts located in the three main climatic zones in Sri Lanka, i.e. Kandy, Matale and Nuwara-Eliya in wet zone (annual rainfall > 2500 mm rainfall); Badulla and Kurunegala in the intermediate zone (annual rainfall 1750-2500 mm rainfall) and Ampara, Batticaloa, Jaffna and Mannar in dry zone (annual rainfall < 1750 mm rainfall) Ampara, Batticaloa and Jaffna were coastal areas covered with little vegetation, few paddy fields and few human settlements. Kurunegala had little vegetation but a river running along the collection site. A dense vegetation cover was observed in the vicinity of Badulla and Matale study sites. Kandy site had a moderate vegetation cover and was surrounded by a significant human settlement. Mannar site was largely surrounded by paddy fields with few human settlements (Fig. 1).

Mosquito collection
Adult mosquitoes were collected using cattle baited trap-huts, backpack aspirators, light traps and BG Sentinel traps. Dried specimens (a minimum of 10 individual mosquitoes) were used in morphological and molecular identifications. Mosquitoes were morphologically identified to generic and species level using standard taxonomic keys [26][27][28][29]. Voucher specimens of each species were pin mounted for a reference collection at the Invertebrate Systematic Diversity Facility (ISDF) at the Department of Zoology, Faculty of Science, University of Peradeniya, Peradeniya, Sri Lanka (UP/ZOO/BT011 -KU015). The reminder were dried and stored for molecular characterization.

DNA extraction, polymerase chain reaction (PCR) and sequencing
Genomic DNA was extracted from head and thoracic regions of each individual using following the method described by Livak [30]. Cytochrome c oxidase subunit 1 (cox1) gene was amplified using forward primer C1-J-1718 (5'-GGA GGA TTT GGA AAT TGA TTA GTT CC-3') and reverse primer C1-N-2191 (5'-CCC GGT AAA ATT AAA ATA TAA ACT TC-3') [31]. ITS2 region was amplified using the forward primer 5'-ATC ACT CGG CTC ATG GAT CG-3' and the reverse primer 5'-ATG CTT AAA TTT AGG GGG TAG TC-3' [32]. Each amplification was performed in 15 μl that included 1 μl of DNA template, 1.5 μl of 10× KAPA buffer A (Cape Town, South Africa), 0.12 μl of KAPA Taq, 0.12 μl of 2.5 mM dNTP mix, 0.75 μl of 50 mM MgCl 2 , 0.51 μl of each primer (10 mmol) and 10.49 μl of ddH 2 O. PCR parameters were 95°C for 5 min and 35 cycles of 94°C for 30 s, 51°C (for cox1) / 55°C (for ITS2) for 40 s and 72°C for 45 s, followed by a final extension step of 72°C for 10 min. PCR products were run in 1.5% agarose gel stained with ethidium bromide and visualized in a gel imaging system. PCR products showing positive clear bands were purified using QIAquick® PCR Purification kits (Hilden, Germany) according to the manufacturers' protocol. A maximum of three PCR positive samples of each species were sequenced bi-directionally at the Department of Molecular Biology and Biotechnology, Faculty of Science, University of Peradeniya.

DNA sequence analysis
The trace files/chromatograms of cox1 and ITS2 sequences were manually edited using BioEdit software. Sequences of low quality were excluded during data analysis. Consensus sequences were aligned using Clustal W in BioEdit software. Once the alignment was completed, species identification was confirmed by comparison to publicly available sequence data in GenBank using Basic Logical Alignment Search Tool (BLAST) [https://blast. ncbi.nlm.nih.gov/Blast.cgi] and the Barcode of Life Database (BOLD) interface [www.boldsystems. org]. The number of parsimony informative sites, number of variable sites, number of haplotypes, haplotype diversity and GC content were analyzed using the DNA Sequences Polymorphism software (DnaSP, version 5.1.10). MEGA version 6.0 was used to calculate intraspecific and interspecific pairwise sequence divergence using the Kimura-2 parameter distance model [33]. Neighbor-Joining (NJ) phylogenetic trees of cox1 and ITS2 sequences were constructed in MEGA 6.0 using Kumura-2 Parameter distances. Branch support of NJ trees were assessed by bootstrapping with 1000 replicates. Codon positions included 1st + 2nd + 3rd + noncoding regions. All the haplotype sequences of cox1 and ITS2 were deposited in the GenBank database (see below).

Results
According to morphological identification, the mosquitoes collected belonged to 14 species of 5 genera Aedes,  (Table 1). A total of 62 cox1 sequences obtained from all 14 species and 36 ITS2 sequences obtained from only 10 species were used in genetic diversity and phylogenetic analysis. The fragment sizes of cox1 was 428 bp and that of ITS2 sequences ranged from 335 bp (Ae. aegypti) to 403 bp (Ae. albopictus) ( Table 1). ITS2 sequences generated for Cx. bitaeniorhynchus, Cx. fuscocephala, Cx. gelidus and Cx. whitmorei were not considered for analysis as they were not of good quality. Fifty cox1 and 29 ITS2 haplotypes were obtained (Table 1).
Species identifications were performed using cox1 and ITS2 sequences compared to already available sequences in GenBank and BOLD system (Additional file 1: Table  S1). Sequences obtained for Aedes aegypti, Aedes albopictus, all 7 Culex species, Mansonia uniformis and Mimomyia chamberlaini were 100% corresponding with their morphological species identifications. cox1 and ITS2 sequences generated for morphologically identified Armigeres sp. 1 specimens showed 99% similarity to the publicly available Armigeres subalbatus sequences, confirming its distinct species status.
Sequence data for cox1 and ITS2 markers from Aedes pallidostriatus were not available in the GenBank or BOLD systems to compare the sequence data generated for the morphologically identified Aedes pallidostriatus. The closest available sequences were from Aedes ochraceus (92% cox1 sequence similarity and 90% ITS2 similarity). Aedes sp. 1 could not be identified using taxonomic keys due to damages in identification features in the samples collected. According to molecular data, Ae. vexans was the closest species to Aedes sp. 1 specimens, with a cox1 similarity of 94% and ITS2 similarity of 82%. Hence, the species level identification of Aedes sp. 1 could not be confirmed using the available morphological and molecular data.

Phylogenetic analysis
The NJ tree constructed using the cox1 sequences formed 14 strongly supported clades (bootstrap value of 100%) 100% compatible with morphological identification (Figs. 2 and 3). All 7 species of genus Culex clustered together with Mi. chamberlaini specimens to form a single major clade, highlighting their close relationship. These two genera also share many common morphological characters. Haplotypes of Aedes aegypti and Ae. albopictus formed a separate clade while Ae. pallidostriatus and Aedes spp. 1 grouped into another clade with Ae. ochraceus and Ae. vexans sequences. Figure 4 shows the NJ tree constructed using the ITS2 sequences generated for the 10 mosquito species (except Culex bitaeniorhynchus, Culex fuscocephala, Culex gelidus and Culex whitmorei). All 10 species formed 10 strongly supported clades, each representing individual species. Further, 5 genera formed 5 major clades.

DNA polymorphism analysis of cox1 sequences
Cox1 sequences were AT rich and the G+C content was 0.333. The number of variable bases was 145, accounting for 33.88% of the total number of sites. Among these variable sites, 142 were parsimony informative sites and only 3 represented singleton mutations. These nucleotide variations were heavily skewed to the third codon position which had 112 variable sites and 1 singleton mutation. First codon position had 28 variable sites and As shown in Table 2, the mean intraspecific K2P distances for all the species were less than 2%. The maximum distance was seen among the haplotypes of Cx. fuscocephala which was 1.4% while Ae. albopictus, Cx. quinquefasciatus and Mi. chamberlaini reported the lowest mean intraspecific distance of 0.2%. The interspecific distances ranged from 6.8% between Cx. whitmorei and Cx. tritaeniorhynchus to 21.6% between Ma. uniformis and Ae. albopictus (Table 2).

Discussion
The present study provides both morphological and molecular characterization of a collection of mosquitoes belonging to subfamily Culicinae in Sri Lanka for the first time. Using the traditional morphology-based taxonomy, belonging to the subfamily Culicinae, collected from Sri Lanka during the study (red labels) and the sequences retrieved from the GenBank database (back labels). Only nodal support > 70% is shown 14 mosquito species belonging to the genera Aedes, Armigeres, Culex, Mansonia and Mimomyia were identified. Three Aedes species were identified to the species level: Ae. aegypti, Ae. albopictus and Ae. pallidostriatus. The species referred to as Aedes sp. 1 and Armigeres sp. 1 could not be identified into its species level due to physical damages.
Seven Culex species, i.e. Cx. bitaeniorhynchus, Cx. fuscocephala, Cx. gelidus, Cx. pseudovishnui, Cx. quinquefasciatus, Cx. tritaeniorhynchus and Cx. whitmorei, one Mansonia species, Ma. uniformis, and one Mimomyia species, Mi. chamberlaini, were recognized with the aid of taxonomic keys. DNA barcoding conducted with cox1 and ITS2 confirmed the identity of these species and the sequence similarity with the publicly available sequences in the GenBank and BOLD system was 99-100%, except for Ae. pallidostriatus and Aedes sp. 1. There were no available sequences for Ae. pallidostriatus and ours was the first GenBank record of this species. Although Ae. vexans showed the closest sequence similarity to Aedes sp. 1, its species identity could not be finalized using both molecular data and morphological identification. Although species status of Armigeres samples was not identified morphologically, sequence data clearly support its identification as Ar. subalbatus. Fig. 3 NJ phylogenetic tree based on Kimura 2-parameter genetic distance model for cox1 sequences of all the 50 haplotypes of 14 species belonging to the subfamily Culicinae, collected from Sri Lanka during the study (red labels) and the sequences retrieved from the GenBank database (back labels); continuation of Fig.2. Only nodal support > 70% is shown Fig. 4 NJ phylogenetic tree based on Kimura 2-parameter genetic distance model for ITS2 sequences of all the 29 haplotypes of 10 species belonging to the subfamily Culicinae, collected from Sri Lanka during the study (red labels) and the sequences retrieved from the GenBank (back labels). Only nodal support > 70% is shown The NJ phylogenetic trees constructed using both cox1 and ITS2 sequences, and genetic distance analysis, further supported the species identity and displayed the phylogenetic relationships between the species. Each individual species was represented by well supported clades (> 98% bootstrap support) confirming the morphological identification of 14 species. Interspecific distance of more than 3% between the cox1 sequences is considered as the threshold in differentiating species [1,34] and this has been applied in many mosquito phylogenetic studies [4][5][6]13]. According to the genetic distance estimates of cox1 sequences, the intraspecific distances of all the species identified during the present study was less than 3% (ranged between 0.2-1.4 %) while the interspecific distances were above 3% (6.8-21.6%). Mosquito barcoding studies have previously reported divergence ranges similar to the present study [4][5][6]13]. Ten of the mosquitoes with ITS2 sequences (except Cx. bitaeniorhynchus, Cx. fuscocephala, Cx. gelidus and Cx. whitmorei) were represented by more than one ITS2 haplotype. Previous studies, had also reported ITS2 haplotype variations within a single species [35][36][37]. However, a study on anopheline mosquitoes from Sri Lanka reported the presence of species-specific ITS2 haplotypes only [13].
Aedes pallidostriatus and Aedes sp. 1 showed highest sequence similarity to Ae. ochraceus and Ae. vexans, respectively. Based on the morphological features, Ae. aegypti and Ae. albopictus belong to subgenus Stegomyia while Ae. pallidostriatus, Ae. ochraceus and Ae. vexans belong to subgenus Aedimorphus. In the phylogenetic tree based on cox1 sequences, the two members of the subgenus Stegomyia formed a single clade while Aedes sp. 1 grouped with the members of the subgenus Aedimorphus. Therefore, the molecular data strongly suggest that Aedes sp. 1 is a member of Aedimorphus.
Culex species recognition is mainly based on morphology of adult females and fourth-instar larvae. However, absence and overlapping of morphological characters have often been identified as factors that lead to misidentification of Culex species [38]. Hence, molecular characterization is important in accurate and precise identification of them. The present study provides a basis for Culex species identification in Sri Lanka using molecular approach. The mean cox1 intraspecific distance ranged between 0.2-1.4% and the interspecific distance ranged between 7.0-11.2% for Culex species. Similar interspecific distances have been reported for Culex species from eastern Amazonian Ecuador [39] and India [8]. The intraspecific distances evaluated for 22 Culex species in Argentina and Brazil varied between 0. 09-3.00% [38] and that for 13 Culex species in Turkey between 0-0.8% [40].
Accurate and precise identification of mosquito vectors and determination of their genetic diversity is important especially in determining the vectorial capacities and planning vector control strategies. Among the mosquitoes studied, Ae. aegypti and Ae. albopictus are the primary and secondary vectors, respectively, of dengue and chikungunya, Cx. tritaeniorhynchus is the major vector of Japanese encephalitis (JE) and Cx. quinquefasciatus major vector of filariasis in Sri Lanka. Also, JE  [22], and these too were barcoded during the current study. According to our results, almost all the species tested showed high genetic diversity which, in turn, demands a greater attention since uniform control measures might not work in the same manner for all the populations of a single species. The present study highlights the importance of molecular characterization in species recognition of mosquitoes which can be successfully incorporated to future development and implementation of vector control strategies.

Conclusions
The study showed that molecular characterization can be successfully employed for species identification of Culicinae mosquitoes. Results of DNA barcoding, using a combination of the genetic markers cox1 and ITS2, were comparable with morphological identifications and more importantly, DNA barcoding could accurately identify the species in the instances where the traditional morphological identification failed due to damaged specimens and indistinguishable characters. The cox1 and ITS2 sequences generated and submitted to the Gen-Bank database could be used as reference sequences in future mosquito taxonomic studies. High genetic diversities observed in vectors of mosquito-borne diseases such as dengue, chickungunya and Japanese encephalitis should be taken into account in planning future vector control programmes in the country. Implementation of vector control programmes must be planned cautiously as uniform control measures may not be equally effective for genetically different populations.

Additional file
Additional file 1: Table S1. Morphologically identified mosquito species, cox1 (fragment size of 428 bp) and ITS2 sequences (fragment sizes are separately listed for each species) generated from them, and the GenBank accession numbers for relevant submissions. Details of the closest publicly available sequences are also presented for comparison.