Identification of wild-caught phlebotomine sand flies from Crete and Cyprus using DNA barcoding

Background Phlebotomine sand flies (Diptera: Psychodidae) are vectors of Leishmania spp., protozoan parasites responsible for a group of neglected diseases called leishmaniases. Two sand fly genera, Phlebotomus and Sergentomyia, contain species that are present in the Mediterranean islands of Crete and Cyprus where the visceral (VL), cutaneous (CL) and canine (CanLei) leishmaniases are a public health concern. The risk of transmission of different Leishmania species can be studied in an area by monitoring their vectors. Sand fly species are traditionally identified using morphological characteristics but minute differences between individuals or populations could be overlooked leading to wrong epidemiological predictions. Molecular identification of these important vectors has become, therefore, an essential tool for research tasks concerning their geographical distribution which directly relates to leishmaniasis control efforts. DNA barcoding is a widely used molecular identification method for cataloguing animal species by sequencing a fragment of the mitochondrial gene encoding cytochrome oxidase I. Results DNA barcoding was used to identify individuals of five sand fly species (Phlebotomus papatasi, P. similis, P. killicki, Sergentomyia minuta, S. dentata) circulating in the islands of Crete and Cyprus during the years 2011–2014. Phlebotomus papatasi is a known vector of zoonotic CL in the Middle East and it is found in both islands. Phlebotomus similis is the suspected vector of Leishmania tropica in Greece causing anthroponotic CL. Phlebotomus killicki was collected in Cyprus for the first time. Sergentomyia minuta, found to present intraspecific diversity, is discussed for its potential as a Leishmania vector. Molecular identification was consistent with the morphological identification. It successfully identified males and females, which is difficult when using only morphological characters. A phylogenetic tree was constructed based on the barcodes acquired, representing their genetic relationships along with other species from the area studied. All individuals identified were clustered according to their species and subgenus. Conclusions Molecular identification of sand flies via DNA barcoding can accurately identify these medically important insects assisting traditional morphological tools, thus helping to assess their implication in Leishmania transmission.


Background
Sand flies (Diptera: Psychodidae) are small (body length < 3 mm) haematophagous insects and vectors of the protozoan parasites Leishmania spp. In the Old World, sand flies of the genus Phlebotomus are involved in an epidemiological cycle where a female sand fly that feeds on a Leishmania-infected reservoir host can become infected and transmit the parasite while feeding on its next target [1]. Leishmania spp. can cause a group of diseases called leishmaniases which, in the Mediterranean Basin appear in two forms: visceral (VL) and cutaneous (CL). There are estimations that leishmaniases are responsible for 20 to 30 thousand deaths worldwide each year [2]. The known pattern of co-evolution between parasites and their vectors renders necessary the evolutionary study by means of species identification and the vectorial capacity of the sand flies in an area [3]. Monitoring sand flies in a region is, therefore, one of the most important steps towards predicting and controlling the disease.
Crete and Cyprus, in the southeastern Mediterranean, are foci of both forms of leishmaniasis. In Crete, cases of VL and canine leishmaniasis (CanLei) (both caused by Leishmania infantum) are quite frequent while cases of CL (caused by Leishmania tropica) are re-emerging [4]. Phlebotomus (Larroussius) neglectus is a proven vector of L. infantum which is abundant on the island [4][5][6][7][8][9] but the vector of L. tropica is not yet implicated. Moreover, Phlebotomus (Paraphlebotomus) similis has been found in CL foci in Crete and it is the suspected vector of L. tropica, like its sister species Phlebotomus (Paraphlebotomus) sergenti elsewhere, since their systematic relationship implies similar vectorial capacity [10][11][12]. In Cyprus, Phlebotomus (Larroussius) tobbi is the vector of L. infantum causing CanLei [13,14]. Leishmania donovani, a recent introduction to Cyprus, causes both VL and CL and it was found in a CanLei case in a mixed infection with L. infantum [15,16]. Vectorial capacity of sand flies circulating in Cyprus and their role in the L. donovani transmission cycle is yet to be determined [14]. Sand flies of the genus Sergentomyia are present throughout Greece [17] and Cyprus [14,18]. Sergentomyia (Sergentomyia) minuta is the predominant species found in the Mediterranean having a doubtful taxonomic status presenting high levels of intraspecific diversity correlated with geographical distribution [19]. Genus Sergentomyia is not deeply studied for its implication in the transmission of Leishmania but its species appear capable of feeding on rodents [20].
It is evident that proper identification of sand flies in an area can help assess the risk of spread of leishmaniasis. Although it is quite demanding, morphological identification is the traditional method that sand fly taxonomists use. It requires careful preparation of specimens after a field trip and a high degree of expertise [21]. Nevertheless, accurate results are achieved if taxonomic keys are updated regularly. Most keys are over 35 years old and do not correspond to intraspecies phenotypic plasticity although they are quite useful in initial species clustering. Moreover, the presence of subpopulations at early stages of genetic divergence, with no significant morphological changes compared to the main population, could be overlooked using traditional identification. This mechanism of speciation could lead to cryptic species unknown to date [22]. Furthermore, for many sand fly groups males or females, based on their morphology alone, can be impossible to identify [23]. For example, females of P. similis and P. sergenti are separated by comparing slight differences in the pharynx which can be confusing even for an experienced taxonomist since the keys that are used for morphological identification are based on type-species individuals. There are closely related species within the genus Lutzomyia whose females are indistinguishable, leading to the usage of wing morphometrics to solve these problems [24]. Wings of sand flies are, however, quite delicate and often get lost during field samplings.
Molecular identification can tackle identification problems. There are no prerequisites asked (i.e. gender, developmental stage) and can be fast and more reliable compared to morphology. DNA barcoding was created aiming to build a universal library of specific sequenced fragments of the mitochondrial gene that encodes the cytochrome c oxidase subunit 1 (cox1 or "barcode") that will correspond to species, helping the scientific community to answer systematic questions [25]. In sand fly taxonomy research, DNA barcoding (i.e. identification via cox1 sequencing) is the second most used molecular identification method. It is quite popular in the New World and it advances rapidly in the Old World [23]. The method has helped to reveal cryptic sand fly species [26] and it has also been used to distinguish female sand flies between closely related species [22]. In Greece, recently, two different studies used the method to successfully identify sand flies in VL/CL/CanLei foci [27,28].
This study presents the molecular identification of five wild caught Phlebotomus and Sergentomyia sand fly species from Crete and Cyprus based on DNA barcoding. A phylogenetic analysis, based on cox1 sequences, showed the systematic relationships of the sand flies caught with other sand flies circulating around the Mediterranean Basin. Furthermore, the publication of the sequences acquired through this study will help towards enriching the sand fly barcode library. Presence and possible vectorial capacity status of these medically important insects is discussed concerning the studied areas. Since DNA barcoding can save time and win on accuracy when compared to morphological identification, it could accompany traditional identification with morphological tools in order to verify questionable results. That way, possible mistakes or systematic discrepancies could be resolved and all individuals studied can be placed in their respective taxa.

Sand fly DNA extraction, PCR and sequencing
The Qiagen QIAamp DNA micro kit (Qiagen, Hilden, Germany) was used to extract sand fly DNA. Barcoding region of cox1 gene was amplified using primers LCO1490/ HCO2198 [31] under previously described conditions [25,32]. PCR products resulting from cox1 amplification were purified using Qiagen QIAquick PCR Purification kit. PCR primers were used in double stranded sequencing which was performed in CEMIA SA (Larisa, Greece). Sequencing results quality was checked by eye and identity of all sequences was confirmed by BLAST™ queries. CodonCode Aligner™ (v. 3.7.1 CodonCode Corporation (Centerville, MA, USA) software was used for editing the sequences.

Dataset
A dataset was created and used for phylogenetic analyses which included a total of 108 sequences. The set was comprised of the 31 cox1 sequences from the present study, 31 Larroussius cox1 sequences from Crete and Cyprus [27], 45 cox1 sequences of related sand fly taxa derived from GenBank™ and one Culex pipiens cox1 sequence as outgroup, also derived from GenBank™. The sequences were translated into amino acids using MEGA  [33] and no stop codons were observed. Multiple sequence alignments were performed using CLUSTALW [34] as implemented in MEGA. Genetic distances were calculated using Tamura-Nei model [35], also in MEGA.

Phylogenetic analyses
The optimal partitioning scheme (unpartition or codon partition) and the best-fit nucleotide substitution model for each partition were identified using the Partition Finder (PF) v.1.1.1 [36]. PF was ran two different times with the models of molecular evolution restricted to those that are available in either MrBayes or RAxML, using the greedy search algorithm, linked branch lengths in calculations of likelihood scores, and the Bayesian information criterion (BIC) for selecting among alternative partitioning strategies. The models that include both a parameter for among-site rate heterogeneity (G) and a parameter for invariant sites (I) were ignored, because the adding of a proportion of invariable sites creates a strong correlation, making it impossible to estimate both parameters reliably. Another drawback of the model is that the estimate of the propotion of invariable sites (p0) is very sensitive to the number and divergences of the sequences included in the data [37]. Phylogenetic inference analyses were conducted using Bayesian inference (BI), and maximum likelihood (ML) methods. The BI analysis was performed in MrBayes (v.3.2.6; [38]) with four runs and eight chains per run for 10 7 generations sampling every 100th generation. This generated an output of 10 5 trees. Several MCMC convergence diagnostics were used to check for convergence and stationarity. The first 25% of the trees (25% "burn-in" in Bayesian terms) were discarded as a measure to sample from the stationary distribution and avoid the possibility of including random, suboptimal trees. A majority rule consensus Bayesian tree was then calculated from the posterior distribution of trees, and the posterior probabilities were calculated as the percentage of samples recovering any particular clade [39], where probabilities higher than 95% were considered indicative of significant support.
ML analyses were conducted with RAxML v.8.1.21 [40] using RAxMLGUI v.1.5 [41] under the models selected in PF analyses where parameters were estimated independently for each partition. The best ML tree was selected from 500 iterations and the confidence of the branches of the best ML tree was assessed based on 1000 thorough bootstrap replicates.

Identification of wild-caught sand flies
Thirty one sand flies (16 females and 15 males) from Crete and Cyprus were identified by morphology: 11 individuals as P. papatasi, 6 as P. similis, 2 as P. killicki, 6 as S. minuta and 6 as S. dentata. No intraspecies peculiarities were observed among the individuals examined. DNA barcoding resulted in cox1 sequences for each sample and BLAST™ queries confirmed their identity (Table 1).

Sequence analysis
The dataset contained 636 bp of cox1 sequences. Variable sites were 245 while parsimony informative sites were 217. The pairwise genetic distances ranged from 0 to 25%. After grouping obtained sequences, according to species (Table 2), Phlebotomus spp. individuals had mean genetic distance between 6 and 21% while members of the genus Sergentomyia had less extreme distance range (13-17%). The highest intraspecific mean genetic distance (intraspecies delimitation), based on our dataset, was set as 3%. This value was calculated among the S. minuta individuals analyzed. On the other hand, the respected lowest interspecific value (interspecies delimitation) was set as 6%. This was assessed when calculating the distance between P. syriacus and its sister species P. neglectus. Phlebotomus papatasi sequences from Crete had 1% mean genetic distance from the ones from Cyprus. As for the S. minuta sequences, there was a 4% mean genetic distance between the sand flies from Greece and Cyprus and a 3.3% between S. minuta sequences from Crete and mainland Greece. Intraspecies mean genetic distance of all S. minuta sequences, regardless their origin, was less than 1%. The analysis of PF supported the partitioning of the dataset in the three codon positions. The nucleotide substitution model selected for each data partition were: for MrBayes SYM + I for the first codon position, HKY + I for the second and GTR + G for the third codon position, whereas GTR + G for each one of the codon positions for RAxML.

Phylogenetic analyses
Maximum likelihood (-lnL = 4340.76), and Bayesian inference analysis (arithmetic mean -lnL = 4352.85) produced similar topologies. Considering the Bayesian analysis, the MCMC convergence diagnostics (average standard deviation of split frequencies, the plot of the generation versus the log probability of the data, the average Potential Scale Reduction Factor, and the minimum value of minimum Estimated Sample Sizes) revealed no clues of non-convergence, indicating stationarity, that is, there should be no tendency of increase or decrease over time.
The Bayesian phylogenetic tree presented in Fig. 1 demonstrates a clear separation between the two genera, Phlebotomus and Sergentomyia, and also the subgenera Phlebotomus, Paraphlebotomus, Transphlebotomus and Sergentomyia, respectively. All species were separated with each one having its own branch. Sequences for all newly collected individuals clustered together with those already published for the respective species of the same or different locality forming groups of the same subgenus. The phylogenetic groupings provided by the tree, coupled with the aforementioned sequencing queries against GenBank™, confirmed the molecular and morphological identification of the sampled sand flies.

Discussion
The present study used DNA barcoding to identify 31 wild-caught sand flies, belonging to five species, from Crete and Cyprus. The species identified are P. papatasi (from both Crete and Cyprus), P. similis (from Crete), P. killicki (from Cyprus), S. minuta (from both Crete and Cyprus) and S. dentata (from Cyprus). After obtaining the cox1 sequences necessary for DNA barcoding, a larger dataset of cox1 barcodes was constructed in order to place the new ones in a phylogenetic tree that would describe the relationships between sand fly species around the Mediterranean Basin and conclusively verify their identity (Fig. 1).
Phlebotomus papatasi is the vector of L. major that causes zoonotic CL in humans in many countries in Africa and Asia [42]. It can be found locally all around southern Europe but there are no studies of its vectorial role in Greece [12]. Given that P. papatasi has an established presence in Crete and mainland Greece [4,7,8,43,44] as well as in Cyprus [14,18], population monitoring should continue to determine whether it can act as the vector of L. major in these areas in case infected rodent reservoirs are introduced. In the present study, P. papatasi individuals were well separated with high posterior probability (Fig. 1) from other species groups and clustered together with published cox1 barcodes of the same species. Phlebotomus papatasi belongs to the phylogenetically distinct subgenus Phlebotomus, members of which have been quite often used in phylogenetic analyses [18,28,45,46]. In a study where 22 populations of P. papatasi from 16 countries were subjected to phylogenetic analysis, it was shown that samples from Crete and Cyprus shared haplotypes implying close genetic relationships albeit their insular isolation [47]. The present study encountered two individuals that share a haplotype (data not shown). This fact, along with the 1% mean genetic distance between the samples of P. papatasi from Greece and Cyprus of the present study, adds more supporting information to that conclusion. Additionally, their clear morphological homogeneity suggests that these two populations, most likely, do not constitute different taxa. All individuals analyzed in the present study represented a single clade although they were derived from quite diverse geographical locations (India, Ethiopia, Israel). This observation indicates that more studies are needed to investigate the possibility of cryptic or sibling taxa within this geographically diverse species group.
Phlebotomus similis is the sister species of P. sergenti, a proven vector of L. tropica [11,48,49]. Due to its abundance in CL foci in Greece, it is believed that this is the species transmitting L. tropica in the country [50]. Sand fly samplings in Greece, in the past, reported the presence of P. sergenti. However, Depaquit et al. [11] suggested that the species found is actually P. similis and that this is the sole Paraphlebotomus species found in the country able to transmit L. tropica. In fact, since 2002, no published sand fly samplings in Greece have reported P. sergenti [7,16,27]. Additionally, sequencing analysis showed that P. similis individuals from this study exhibit a 12% distance from P. sergenti sequences derived from Algeria, separating these species in a clear manner. This is the first time that cox1 sequences of P. similis are deposited in GenBank, a first step to start monitoring P. similis populations to resolve their systematic status via DNA barcoding. Molecular studies of the subgenus Paraphlebotomus based on nuclear [51] and mitochondrial markers [45,46] have come to same topology conclusions as the present study.
Phlebotomus killicki was recently described as a member of the subgenus Transphlebotomus and it was found in locations in Crete and Turkey, sites 500 km apart, along with P. anatolicus [30]. In Cyprus, the presence of P. economidesi was reported along with that of P. mascittii [14,52] but its presence on the island should be reevaluated [30]. Phlebotomus economidesi was also found in Turkey [30]. This is the first report of P. killicki in Cyprus; where this species was found in sympatry with P. economidesi. However, further samplings will determine whether there are other sympatry phenomena, as for example with P. mascittii [30]. As for the phylogenetic relationships between the available P. killicki individuals, the samples from Cyprus were not separated from those collected in Turkey or Crete [30]. Another Transphlebotomus Fig. 1 Bayesian inference (BI) tree (number above branches represent BI posterior probabilities and bootstrap support from a maximum likelihood (ML) analysis as BI/ML). Culex pipiens was used as the outgroup. Species names correspond to Table 1. Individuals from the present study appear in bold species, P. canaaniticus, occurs in the Middle East [53]. Phlebotomus mascittii is the Transphlebotomus species presenting the widest geographical distribution compared with other species within that subgenus (from Germany and Austria [54,55] to Crete [4,7]). Transphlebotomus is a subgenus that is closely related to the subgenera Adlerius and Larroussius [56] which both contain Leishmania vectors [57] and have being thoroughly studied [27,[58][59][60]. It is suggested that since populations of Transphlebotomus spp. sand flies are detected more often than before, more studies for their vectorial role should be conducted.
Sergentomyia minuta and S. dentata are known to be present in mainland Greece [7,44] as well as in the islands of Crete [6][7][8] and Cyprus [14,18]. Sergentomyia minuta has a doubtful taxonomic status and a quite unique intraspecific variability [19] which should be evaluated together with its possible vectorial capacity. All Sergentomyia samples sequenced in this work clustered by genus, species and locality, respectively, confirming previous studies [28,61]. The S. minuta clade (Fig. 1) presented a geographical locality-based separation and the three sub-clades created suggest that speciation events could be underway. Genetic distances between the three sub-clades (Cyprus, mainland Greece, Crete) exceed 3% while all three intraclade distances did not exceed 1%, evidence that supports the above-mentioned isolation based on this dataset. Of particular note, Sergentomyia sp. individuals were caught using sticky traps [9], mostly among rock pile wall formations which provide resting and hiding places for lizards. This genus contains sand fly species that feed on reptiles and transmit Sauroleishmania, a parasite infecting reptiles [10]. Such rock formations also provide a perfect nesting habitat for sand flies. Since sand flies are known to be weak flyers, they tend to reside close to their blood meal sources [1]. Given that, it is suggested that S. minuta populations, in their specific habitats, may have been isolating themselves faster than other species thus creating such an intraspecies diversity as the one demonstrated here (Fig. 1). As this evidence underlines the possible existence of cryptic taxa, further samplings for individuals in those areas are being conducted. The genetic isolation between individuals from Crete and Cyprus is not supported with morphological findings indicating that further multigene molecular work, accompanied with morphospecies clustering, would provide further information on this interesting issue. There are an increasing number of studies that detect Leishmania DNA in sand flies belonging to the genus Sergentomyia [62][63][64][65] but detection of parasite DNA in sand flies is not sufficient evidence for a species to be classified as vector [66]. A study in Senegal examined whether Sergentomyia sand fly species were vectors of L. infantum in a CanLei endemic focus where Phlebotomus species are absent or significantly under-represented. It was concluded that S. dubia and S. schwetzi are the possible vectors of the parasite in the studied area [67] based on accepted vectorial criteria [68]. As a result, it appears that Sergentomyia needs to be studied more thoroughly since there are more taxonomic and vectorial capacity questions requiring answers [19,61].

Conclusions
This study constitutes the molecular identification of the sand fly species caught in two VL/CL/CanLei foci, Cyprus and Crete, in the southeastern Mediterranean, using DNA barcoding. It makes a contribution towards understanding the systematic status of P. similis, the suspected vector of L. tropica in Greece. This is the first time DNA barcoding has been applied to this important species and the derived barcodes are added to the GenBank database. The presence of P. killicki is reported for the first time in Cyprus and possible newly arising taxa within the S. minuta phylogenetic clade are demonstrated. Regarding the P. papatasi individuals caught in the two islands, it was shown that although the two populations are geographically unassociated, they show no morphological or DNA barcoding-based differences. As more barcodes are added to the database, identification/clustering process of sand flies and their molecular systematics will be accurately resolved. Since sand fly species are quite important in the transmission of Leishmania parasites as well as other pathogens, their geographical distribution and vectorial capacity must be extensively evaluated. DNA barcoding helps towards that direction by putting the stepping stone to the comprehension of taxa.