Molecular characterization of freshwater snails in the genus Bulinus: a role for barcodes?

Background Reliable and consistent methods are required for the identification and classification of freshwater snails belonging to the genus Bulinus (Gastropoda, Planorbidae) which act as intermediate hosts for schistosomes of both medical and veterinary importance. The current project worked towards two main objectives, the development of a cost effective, simple screening method for the routine identification of Bulinus isolates and the use of resultant sequencing data to produce a model of relationships within the group. Results Phylogenetic analysis of the DNA sequence for a large section (1009 bp) of the mitochondrial gene cytochrome oxidase subunit 1 (cox1) for isolates of Bulinus demonstrated superior resolution over that employing the second internal transcribed spacer (its2) of the ribosomal gene complex. Removal of transitional substitutions within cox1 because of saturation effects still allowed identification of snails at species group level. Within groups, some species could be identified with ease but there were regions where the high degree of molecular diversity meant that clear identification of species was problematic, this was particularly so within the B. africanus group. Conclusion The sequence diversity within cox1 is such that a barcoding approach may offer the best method for characterization of populations and species within the genus from different geographical locations. The study has confirmed the definition of some accepted species within the species groups but additionally has revealed some unrecognized isolates which underlines the need to use molecular markers in addition to more traditional methods of identification. A barcoding approach based on part of the cox1 gene as defined by the Folmer primers is proposed.

Results: Phylogenetic analysis of the DNA sequence for a large section (1009 bp) of the mitochondrial gene cytochrome oxidase subunit 1 (cox1) for isolates of Bulinus demonstrated superior resolution over that employing the second internal transcribed spacer (its2) of the ribosomal gene complex. Removal of transitional substitutions within cox1 because of saturation effects still allowed identification of snails at species group level. Within groups, some species could be identified with ease but there were regions where the high degree of molecular diversity meant that clear identification of species was problematic, this was particularly so within the B. africanus group.

Conclusion:
The sequence diversity within cox1 is such that a barcoding approach may offer the best method for characterization of populations and species within the genus from different geographical locations. The study has confirmed the definition of some accepted species within the species groups but additionally has revealed some unrecognized isolates which underlines the need to use molecular markers in addition to more traditional methods of identification. A barcoding approach based on part of the cox1 gene as defined by the Folmer primers is proposed.

Background
Freshwater snails belonging to the genus Bulinus act as intermediate hosts in the life cycle of the widespread and debilitating parasitic disease schistosomiasis in Africa, Madagascar and adjacent regions (see Fig. 1). Schistosome species within the S. haematobium group which depend on snails from Bulinus for transmission include three human pathogens (S. haematobium, S. intercalatum and S. guiniensis) and five others which may infect wild and domestic ruminants (S. bovis, S. curassoni, S. mattheei, S. leiperi and S. margrebowiei). The relationship and interaction between schistosomes and snails is very specific and compatibility may differ over quite small geographical ranges [1]. A reliable taxonomy of the genus Bulinus is a funda-mental prerequisite for understanding the epidemiology of this disease.
The thirty six species within the genus have been placed into four species 'groups' [2]; the B. africanus group, B. forskalii group, B. reticulatus group and the B. truncatus/tropicus complex. For the most part, species have been classified on the basis of their morphology although, in recent decades, the study of ploidy [3], allozymes [4][5][6] and DNA methods [7][8][9][10][11][12][13] have all played an increasing role in species discrimination. Morphological characters, whilst adequate to allocate a specimen to a species group are sometimes unreliable when used to classify at higher resolution [8,10] especially within the B. africanus group. Consequently, there is a requirement for a robust system of identification and classification to supplement more traditional approaches. The data presented here represent initial steps towards achieving consistency and uniformity in the identification process. Nuclear (its2) and mitochondrial (cox1) sequences are compared with respect to their ability to resolve species and species group relationships within Bulinus and cox1 is used to screen 81 isolates and place them in a phylogenetic context.

Samples
The available samples selected for study represented as diverse a group as possible within the genus Bulinus and comprised of both collected and donated material that took the form of specimens from the field and maintained cultures which were stored in liquid nitrogen, ethanol or used fresh (Table 1). Those specimens preserved in ethanol were left in TE buffer pH 7.4 overnight in order to per-fuse out any remaining alcohol from within the tissue that might interfere with subsequent extraction techniques.

Extraction of genomic/mitochondrial DNA
Total genomic DNA was extracted from whole snail tissue in a manner similar to that outlined by Stothard et al [7] with minor modification. Snail tissue was homogenized in lysis buffer (100 mM Tris, 1.4 M NaCl, 16 mM EDTA, 2% hexadecyltrimethylammonium bromide [CTAB]). To this was added 20 μl of proteinase K (20 mg/ml) and the whole mixed in a rotisserie incubator at 55°C for 1.5 to 2.0 hours. Subsequently, an equal volume of chloroform/ isoamyl alcohol (24:1) was added to the digest and gently mixed. Tubes containing the digest were then spun at 13,000 rpm for 10 minutes. The upper aqueous layer was removed using 'wide bore' pipette tips and nucleic acids were precipitated in 'Analar' grade absolute ethanol. Following precipitation for 15 minutes at -20°C, the DNA was centrifuged again at 13,000 rpm to form a pellet. The absolute ethanol was removed and the pellet washed in 70% ethanol before a final centrifugation at 13,000 rpm. The ethanol was then discarded and the pellet dried in a dry heating block at 90°C for 5 minutes before dissolution in an appropriate amount of purified water.

Amplification of cox1 and its2 fragments
The partial cox1 fragment was amplified in one, two or more sections from the mitochondrial component of extracted total genomic DNA using the polymerase chain reaction (PCR) and various combinations of the primers that are shown in Table 2 (see additional file 1, for PCR and sequencing primers that proved successful with particular isolates and Fig. 2 for approximate primer locations). Amplicons of its2 were generated in a single section using two primers ETTS1 and ETTS10 (see Table 2). Either an Applied Biosystems GeneAmp PCR System 2400 or 2700 thermal cycler were used throughout the project in combination with GE Healthcare 'Ready-To-Go' PCR beads. Upon reconstitution with an appropriate volume of template, primer and pure water to a total of 25 μl, each dissolved bead forms a solution containing 200 μM of each dNTP, 10 mM Tris-HCl, 50 mM KCl and 1.5 mM of MgCl 2 . Cycling conditions for both cox1 and its2 PCR reactions were as follows: one cycle of 94°C for 5 min, 45 cycles of 94°C for 15 sec, 40°C for 30 sec and 72°C for 45 sec (in the case of cox1 this was increased to 1 min for amplicons over 1000 bp) and a final single cycle of 72°C for 7 min. PCR fragments were separated on a 1% agarose gel and bands were excised using a scalpel blade. The recovered DNA was purified for sequencing using a QIAquick Gel Extraction Kit (Qiagen). Following quantification and a check for purity with a Nanodrop ND-1000 Spectrophotometer (Nanodrop Technologies Inc), sequencing reactions were performed directly on each PCR product using an Applied Biosystems Big Dye Kit ver-

Phylogenetic analysis of sequence data
Following basic editing and analysis of compiled cox1 and its2 sequences on Sequencher 4.6, the sequences were used to perform BLAST searches [14] via the National Center for Biotechnology Information against GenBank and EMBL sequence databases in order to ensure that parasitic and other potential contaminant sequences had not been obtained in error. Sequences were then aligned and analysed using MEGA 3.1 [15] where alignment was undertaken using Clustal W [16]. The cox1 data for all taxa were analysed solely as nucleotides and subject to analysis by both Neighbour-Joining and Minimum Evolution methods using Kimura's 2 parameter model (K2P) for pair-wise distance calculations as this accommodates the difference in the rate of accumulation between transitions and transversions. The Minimum Evolution algorithm employed Close-Neighbour-Interchange (CNI -level 1) to explore the most optimal topology with the initial, temporary tree obtained by Neighbour-Joining. All gaps were deleted from the dataset using the 'complete deletion' option in MEGA 3.1 and the invertebrate mitochondrial code was used throughout. Nucleotide sequence data for its2 was analysed in a similar manner except that p-distance was employed instead of K2P.
The different forms of analysis were subject to bootstrapping (1000 repeats) as a means of testing the reliability of individual branches within the generated tree. Biomphalaria glabrata was used as an outgroup taxon upon which to root the structures. Sequence saturation for cox1 was visualized graphically, using the program DAMBE [17] that allows the number of differences between isolates or species in terms of transitional and transversional substitutions to be plotted against pair-wise distance values.

Analysis of its2 sequence data
Twenty nine samples were selected for sequencing of an amplicon containing the 3' end of the 5.8S gene and the entire its2 region. This was undertaken in order to compare the phylogenetic signal of a nuclear marker with that of cox1 (see Fig. 3   5' GCA TAC TGC TTT GAA CAT CG 3' Forward [27] nasutus nasutus and B. nasutus productus was not possible using the its2 data and sequence differences in this region are unlikely to be of value for the detection of hybridization events among isolates with overlapping geographic ranges. The its2 sequence data for B. wrighti were unusual in having two unique insertions relative to other Bulinus isolates, a short 16 bp insertion between fragment positions 219 and 234 and a larger 63 bp insertion between positions 248 and 310.

Analysis of cox1 sequence data
The sequence of the mitochondrial cox1 gene within Bulinus was found to be highly variable. The nucleotide composition regarding this genus was AT rich (69.4%) which is in close agreement with previous work [12]. In order to test which areas of the cox1 amplicon gave the best phylogenetic signal, the fragment was analysed in its entirety, using the first 644 bp and also the final 387 bp. Stothard and Rollinson [9] and Jones et al [12] used the latter section of the cox1 sequence and this corresponds to the region encompassed by primers Asmit1 (AT1) and Asmit2 (AT2) shown in Fig. 2. The first 644 bp covers an area comparable to the sequence bounded by the Folmer primers LCO1490 (CO1) and HCO2198 (CO2) [18] and which has been used previously for barcoding studies [19]. Figure 4 shows three Neighbour-Joining trees generated for each region using both transitional and transversional substitutions. Resolution of the tree improves as progressively longer stretches of sequence are included in the analysis. However, saturation analysis of the 'Asmit' and 'Folmer' regions show that both are subject to saturation of transitional events suggesting that inclusion of transversional substitutions only in the calculation would present a more accurate estimate of isolate relationships within the genus. Figure 5 shows a graph where mean, pair-wise genetic distances of the two areas calculated from transversional substitutions only are compared with the corresponding distance matrix co-ordinates for the complete fragment. In this manner, the variability of genetic distance figures for the 'Asmit' and 'Folmer' regions relative to the complete fragment may be shown. For small genetic distance values correspondence is good between the three regions, however, as distances increase in size those associated with the 'Asmit' fragment tend to drift away from the corresponding complete sequence derived genetic distance values within the mid-range of the 'x' axis. Distances of the 'Folmer' region also exhibit a degree of variation when compared with those of the complete fragment but to a smaller extent. Additionally, a short span of genetic distance values between approximately 0.058 and 0.064 is entirely missing within the dataset. To a lesser degree this is reflected in a smaller region between 0.013 and 0.018. Figure 6 shows the same procedure undertaken for transitional substitutions where correspondence of distance values for the 'Asmit' and 'Folmer' areas again 'drifts' with increasing genetic distance relative to the complete fragment. However, whilst the overall number of pairs contributing to each distance figure is considerably less than in Fig. 5, the numerical variation of distance values present in the matrices is far Locations of primers used for the cox1 fragment  greater, particularly between the ranges 0.035 to 0.077. Neighbour-Joining and Minimum Evolution trees were therefore generated for the dataset using transversions from the complete fragment sequence in order to obtain the maximum resolution possible. The resultant trees contained divisions corresponding to the four species groups (see Figs 7 &8).

B. africanus species group
Classification within this complex is probably the most difficult of the four Bulinus snail species groups. The cox1 sequence data for samples within this group revealed an extensive degree of genetic variation throughout the continent (see the designation 'A' in Figs 7 &8). There are three areas within the data set that may be used as reference points for interpretation of Figs 7 &8. The first is identified by sample code A23, a snail from Kinyasini, Unguja, which on the basis of previous work [8,9,20] Table 1 for origins of isolates.  Table 1 for origins of isolates. morphological differences are often difficult to determine, hence samples A9 and A10 identified as B. globosus can be distinguished from A28 and A29, B. africanus. The two species differ in the penis sheath, which is bigger in B. africanus being longer and/or thicker than the preputium [2] but such characters have been found unreliable for species discrimination [10].
Kenyan specimens from the Kisumu region used by Raahauge and Kristensen [10] have been included in the present study (A11 to A13 & A33). Utilizing RAPD profiles and PCR-RFLP results, these authors concluded that those samples labelled in Figs 7 &8 as A11 to A13 all appeared to be local variants of the same species and this conclusion is confirmed in the present analysis with all of the snails identified as B. globosus. In addition, they also showed that another sample screened in their study (ADC farm, Kisumu), labelled in Figs 7 &8 as A33, was different from the other Kisumu specimens and this has also been supported.

B. forskalii species group
The group as a whole splits into two main sections, namely, snails from West Africa i.e. Cameroon, Bukina Faso, Niger, Senegal, São Tomé and Angola (F53 to F58) and those from the eastern side, East Kenya, Pemba, Mafia and Mauritius (F59 to 65). Additionally, isolate F52 from Uganda although sharing a common ancestor with East and West African B. forskalii group isolates, is quite distinct from the other species. However, only one specimen has been examined and more samples from this locality in Uganda are required for further study. The its2 tree showed F52 to have a closer relationship with East African members of the B. forskalii group. Within this group certain isolates are of known species and can be used as reference points, namely F58, which is B. forskalii, F61 and F62 both being B. cernicus from Mauritius and F63 to F65 Comparison of regional cox1 genetic distances using transversions only Fewer isolates of the B. forskalii group have been tested and analysed as compared with the B. africanus group and so it might be imprudent to draw too many specific conclusions from the data. Africa as it appears that there may be more than one species involved.
Comparison of regional cox1 genetic distances using transitions only  Table 1

B. wrighti
Neighbour-Joining tree for Bulinus isolates using the 'Folmer' cox1 fragment, transversions only  Table 1 (Fig. 8). The observation by Brown [2] that there were indications of a "significant biological difference" between B. tropicus and B. natalensis is supported. The cluster representing B. truncatus in Figs 7 &8 has very short branch lengths implying that the hybridization event which Goldman et al. [3] suggested gave rise to this tetraploid is a relatively recent phenomenon.

B. reticulatus group
There are only two recognised species within this group and it has only been possible to examine one of them, B. wrighti. This species has a characteristic cox1 sequence which positions the group close to the B. truncatus/tropicus complex.

Barcoding
The sequence information presented here is not a typical 'barcode' in so far as the sequence is longer than most barcodes which are usually around 650 bp in length [19,23,24]. Moreover, the generated PCR fragments have been amplified and sequenced using a wide variety of different primers due to the highly variable nature of the sequence. A pan-species group/isolate barcode in the usual 'sense' might be possible to locate but requires a common set of primers to be designed which will amplify all species within the genus for a particular cox1 region and that the area so determined mirrors the results generated by the current longer sequence. For this reason, a Neighbour-Joining tree (Fig. 9) has been generated for the area of Bulinus cox1 which corresponds approximately with the 'Folmer' barcode region. Agreement of Fig. 9 with Figs 7 &8 whilst not identical is very close and provides hope that this region could be used for barcoding Bulinus species in future.
The advantages [19] and disadvantages [25] over the use of barcoding, and the utilization of a single mitochondrial gene such as cox1 for identification and phylogenetic purposes have been the subject of considerable debate. It is accepted that a range of both nuclear and mitochondrial markers are required to provide a more accurate estimate of evolutionary history. However, a technique for routine screening which is relatively quick, cost effective and reproducible is essential given that resources are limited. The selected area of sequence from the cox1 gene appears to achieve this by forming a framework upon which a classification can begin to be constructed. Previous studies have used cox1 in the phylogenetic evaluation of Bulinus, namely, Stothard & Rollinson [9], Stothard et al. [11] and Jones et al. [12] and, in this respect, the current project is not unique, however, the sequences used in the present paper are three times the length of those previously employed and the range of Bulinus isolates/species much more extensive. The study is ongoing and undoubtedly as more taxa are acquired and sequences added to the database, the shape of the Neighbour-Joining and Minimum Evolution trees as shown in Figs 7 &8 will progressively alter and become more informative.