Regional biogeography of microbiota composition in the Chagas disease vector Rhodnius pallescens

Background Triatomine bugs are vectors of the protozoan parasite Trypanosoma cruzi, which causes Chagas disease. Rhodnius pallescens is a major vector of Chagas disease in Panama. Understanding the microbial ecology of disease vectors is important in the development of vector management strategies that target vector survival and fitness. In this study we examined the whole-body microbial composition of R. pallescens from three locations in Panama. Methods We collected 89 R. pallescens specimens using Noireau traps in Attalea butyracea palms. We then extracted total DNA from whole-bodies of specimens and amplified bacterial microbiota using 16S rRNA metabarcoding PCR. The 16S libraries were sequenced on an Illumina MiSeq and analyzed using QIIME2 software. Results We found Proteobacteria, Actinobacteria, Bacteroidetes and Firmicutes to be the most abundant bacterial phyla across all samples. Geographical location showed the largest difference in microbial composition with northern Veraguas Province having the most diversity and Panama Oeste Province localities being most similar to each other. Wolbachia was detected in high abundance (48–72%) at Panama Oeste area localities with a complete absence of detection in Veraguas Province. No significant differences in microbial composition were detected between triatomine age class, primary blood meal source, or T. cruzi infection status. Conclusions We found biogeographical regions differ in microbial composition among R. pallescens populations in Panama. While overall the microbiota has bacterial taxa consistent with previous studies in triatomine microbial ecology, locality differences are an important observation for future studies. Geographical heterogeneity in microbiomes of vectors is an important consideration for future developments that leverage microbiomes for disease control.

In this study, we evaluated patterns of whole-body microbiota of a Chagas disease vector. Chagas disease, caused by the kinetoplastid protozoan parasite Trypanosoma cruzi, is transmitted between a wide range of potential mammalian hosts and humans by hematophagous (blood-feeding) triatomine insect vectors. Despite widespread control programmes, Chagas disease remains a significant health threat to millions of inhabitants in Latin America, particularly those that live in poverty [26]. The idea of using bacterial symbionts of triatomine bugs to control Chagas disease has long been proposed [27]. Recent studies describe microbial community composition within triatomines [23,[28][29][30][31][32][33], including R. pallescens from Colombia [22] and Panama [34], and a sister species, R. prolixus [35]. Studies thus far describe triatomine microbiota as having low complexity in terms of diversity and species-specific patterns [23,29,30], yet the microbiota for many taxa remain to be studied.
Infection of triatomines with trypanosomes has been associated with reduction in gut microbial diversity [28,35], and blood meal identity may influence composition of the predominant bacterial taxa [30,31]. However, other important comparisons among triatomines are lacking, such as differences between location and habitat type. Microbial composition variation between different geographical locations has been observed in ticks [15,18,20,24,25] and with mixed observations in mosquitoes [16,36]. Habitat has also been shown to be a main driver of microbial species composition in mosquitoes [37,38].
There are more than 150 species of triatomines, with different distributions, habitat requirements and life histories that can impact microbiota, or microbiota that can affect vectorial capacity. This complexity requires extensive research into different triatomine microbiomes. Currently, we still lack basic microbial community composition descriptions for many triatomine species and these large gaps in our knowledge make informed research for vector biocontrol difficult. Therefore, advancing research in triatomine microbiota is crucial for gaining a better understanding of T. cruzi infection, triatomine vector capacity, and developmental biology.
Rhodnius pallescens is the main vector Trypanosoma cruzi and T. rangeli in Panama [39] where they are widely distributed throughout Panama and into the neighboring countries of Costa Rica and Colombia. Rhodnius pallescens is commonly found and associated with Attalea butyracea palms [40,41] that are common and widely distributed across many different habitats in Panama [40,41]. The sylvatic behavior has complicated control efforts [42]. New endemic regions are still being described [43][44][45] and a darker chromatic variation of R. pallescens associated with distinct genetic groups of T. cruzi and T. rangeli has also been found recently in Santa Fe District, Panama [44].
Here, we describe the whole-body bacterial microbiota of wild-caught R. pallescens from three separate geographical locations in Panama. We used the entire triatomine body to encompass all potential microbial taxa relevant to R. pallescens that could affect their fitness and survival as a benchmark for future studies of localized anatomy microbiota. We hypothesize that both habitat type and geographical location will be associated with differences in whole-body microbiota composition similar to previous findings in other vectors [15,16,18,20,24,25,37,38]. We further hypothesize that complex environments, such as forest patches, will be associated with a more diverse microbiota composition in insects in more homogenous environments (i.e. cattle pastures), as previous studies in insects show that environmental diversity leads to increased microbial diversity [46][47][48]. We used Illumina 16S rRNA amplicon sequencing to characterize and evaluate the bacterial microbiota of R. pallescens between different locations and habitats, comparing infection status, age class, and primary blood-meal source to evaluate a range of variables that may be associated with whole-body bacterial community composition.

Sample collection and DNA extraction
All R. pallescens evaluated specimens (n = 89) were collected in Panama, Central America using Noireau traps [49] in Attalea butyracea palms (the main habitat of this species) and placed directly in 95% molecular grade ethanol before use. We sampled from a total of 8 palms, in three habitats (pasture, peridomestic and peridomesticforest), from three geographic locations in lowland moist tropical forest (Las Pavas and Trinidad de las Minas) and moist tropical forest (Santa Fe and Veraguas) (Fig. 1). We consider peridomestic to be home yards or areas within 100 m of a dwelling and peridomestic-forest to be patches of regenerated forest within a peridomestic landscape matrix. Samples from Las Pavas, La Chorrera District (9.104167° N, 79.885833° W) (n = 27 from two habitats) and Trinidad de las Minas, Capira District (8.775556° N, 79.995833° W) (n = 32, from one habitat) were from a previous study examining blood meals [50]. We further collected 30 samples from four sites comprising three habitats located in Santa Fe District, Veraguas (8.509232° N, 81.077800° W) from 8-11 July 2017. The Santa Fe region has recently been described as a new endemic focus for Chagas disease in Panama, where a dark morph of R. pallescens predominates [43,44]. All specimens were nymphs, primarily N3 and below (92%, 95% CI: 84.4-96.39%), with the exception of one male from Trinidad de las Minas (Additional file 1: Table S1). DNA was extracted from whole specimens following Kieran et al. [50]. Briefly, samples were macerated and digested overnight in digest buffer with proteinase K and extracted with phenol-chloroform-isoamyl alcohol. Extractions were reconstituted in TLE buffer (10 mM Tris, pH 8; 0.1 mM EDTA), and impurities were removed with Sera-Mag SpeedBeads ™ (Thermo Fisher Scientific, Waltham, MA, USA; [51]) with a final reconstitution in 30 µl TLE.

DNA amplification and sequencing
We amplified bacterial 16S rRNA DNA using the S-D-Bact-0341-b-S-17 (5′-CCT ACG GGN GGC WGC AG-3′) forward and S-D-Bact-0785-a-A-21 (5′-GAC TAC HVG GGT ATC TAA TCC-3′) reverse primer pair [52] to which we added modifications following previous studies [50,53,54]. We added Illumina TruSeq sequences to the 5′-end of the forward (Read 1) and reverse (Read 2) primer creating fusion primers. We synthesized 8 forward and 12 reverse fusion primers, each with a unique variable length (5-8 bp) index sequence between the 16S and TruSeq sequences. We then performed two rounds of PCR. For the first-round we performed replicate PCRs using 12.5 μl reactions of KAPA HiFi HotStart Kits (Kapa Biosystems, Wilmington, MA, USA) consisting of 2.5 μl of 5× buffer, 0.375 μl of 10 mM dNTPs, 0.25 μl hot start Taq, 5.4 μl molecular grade water, 1 μl of 5 μM forward primer, 1 μl of 5 μM reverse primer, and 2 μl of DNA. The ranges of DNA sample concentrations were from 10 ng/ μl to 60 ng/μl. Each DNA sample had a unique primerindex combination with the following thermocycler conditions: 98 °C for 3 min, followed by 30 cycles at 95 °C for 30 s, 63 °C for 1 min, 72 °C for 1 min and a final extension at 72 °C for 5 min. Amplification success was verified on a 1.5% agarose gel.
Amplicons were pooled in equal concentrations and cleaned using a 1:1 ratio of SPRI-beads and reconstituted  [55]. We used 25 μl reaction of KAPA HiFi HotStart Kits using 5 μl of 5× Buffer, 0.75 μl of 10 mM dNTPs, 0.5 μl HotStart, 3.75 μl molecular grade water, 2.5 μl of 5 μM forward primer, 2.5 μl 5 μM reverse primer, and 10 μl of 16S amplicon pool. We performed two replicate PCRs with the following thermocycler conditions: 98 °C for 2 min, followed by 10 cycles at 98 °C for 30 s, 60 °C for 30 s, 72 °C for 30 s and a final extension at 72 °C for 5 min. Library product was cleaned with Sera-Mag SpeedBeads ™ (1:1 ratio) and pooled with other uniquely indexed samples prior to sequencing.
For blood-meal source data, we used previous data from Kieran et al. [50] and for newly collected samples, we amplified 12S rRNA gene following Kieran et al. [50]. All libraries were sent to the Georgia Genomics and Bioinformatics Core (http://dna.uga.edu) for sequencing on an Illumina MiSeq using a v3 PE300 kit (Illumina, San Diego, CA, USA). We also screened for the presence of T. cruzi and T. rangeli amplifying telomeric kinetoplastid DNA with Tc189 and Tr primers [56]. Samples were also verified for T. cruzi using 121/122 primers targeting the kinetoplastid minicircle [57]. Amplification success was verified on a 1.5% agarose gel.

Data processing and analysis
We demultiplexed the amplicon indices using Mr. Demuxy 1.2.0 (https ://pypi.org/proje ct/Mr_Demux y/) and resulting fastq files were imported into Geneious 10.0.1 [58]; (https ://www.genei ous.com) where we trimmed primers, paired and merged the reads using FLASH [59]. Subsequent data were exported as fastq files for importation into Qiime2 [60]. The quality of the sequences was checked and filtered using QIIME2 v. 2018.8 plugin DADA2 [61] and chimeric sequences were removed. The remaining forward sequences were truncated to a final length of 292 bp and the reverse sequences were truncated to a final length of 240 bp. Amplicon sequence variants (ASV) were analyzed using the q2-diversity Qiime2 plugin to calculate multiple alpha diversity metrics, including Shannonʼs index H' , Simpsonʼs index D s , Chao1, Faith's phylogenetic diversity, and observed-ASV's.
The Qiime2 plugin q2-phylogeny was used to complete a multiple sequence alignment to reconstruct rooted and unrooted phylogenetic trees from the filtered alignment based on maximum-likelihood approximation with FastTree 2 [62]. Alpha and beta diversity metrics were assigned using the q2-diversity Qiime2 plugin. An alpha rarefaction was used to evaluate sampling depth, and the data was rarefied at 1000 sequences per sample, removing two samples and retaining 87 samples for final analyses.
The Qiime2 plugin q2-feature-classifier was used to align the sequences against the Greengenes 13.8 database [63]. OTUs were identified from phyla down to the genus level, we removed archaea, chloroplasts, mitochondria, not available (NAs), and uncharacterized taxa at the kingdom level.
Alpha diversity (species diversity) was calculated using Shannonʼs (species richness), Simpson's (evenness or relative abundance), and Chao1 (estimate of diversity from abundance) diversity metrics. To compare alpha diversities from individuals across location, habitat type, and infection status, a one-way analysis of variance (ANOVA) and post-hoc Tukeyʼs honest significant difference (HSD) tests for multiple comparisons were performed to evaluate differences in taxonomic abundance and alpha diversities. A P-value less than 0.05 was considered statistically significant.
Beta diversity (compositional variation) was calculated for the whole-body microbiota comparison between triatomines across location, habitat type, and infection status using Bray-Curtis dissimilarity. Bray-Curtis is based on shared OTU counts between individuals. Finally, we used non-metric multidimensional scaling (nMDS) [64] to visualize differences between the microbial communities, and a permutational MANOVA for hypothesis testing [65]. All diversity analyses and visualizations were conducted using qiime2 artifact outputs in R (v. 3.5.1) and with the packages phyloseq [66], vegan [67], dplyr [68], ggplot2 [69] and metacoder [70]. To further assess our findings of a location effect we removed older age classes (N4, N5, Adult) and repeated the analysis (n = 80).

16S rRNA sequences and classification of entire microbiota community
We obtained a total of 4,995,733 16S rRNA V3-V4 region sequences from 101 samples, including the negative controls of molecular grade water and positive controls of E. coli. After quality filtering and the exclusion of 14 samples due to low read numbers, the number of sequences obtained per sample ranged from 1000 to 34,792 reads, with a mean frequency of 9811.63. The total number of OTUs within the 87 final samples was 4033, with the top 4 phyla consisting of Proteobacteria (60.67%), Actinobacteria (16.93%), Bacteroidetes (9.55%) and Firmicutes (4.11%) out of the total phyla present in the dataset. Rhodnius pallescens, in Las Pavas and Trinidad de las Minas respectively, is primarily composed of Proteobacteria (71.63% and 76.11%, respectively), Actinobacteria (6.13% and 15.88%, respectively), and Bacteroidetes (13.98% and 2.66%, respectively) (Fig. 2, Additional file 2: Table S2, Additional file 3: Figure S1). This contrasts with specimens from northern Veraguas with the most abundant phylum shifting from Proteobacteria (26.43%) to Actinobacteria (27.56%) and introducing more Firmicutes (11.48%). At the family-level (Fig. 3, Additional file 3: Figure S1), the top 3 taxa overall were Anaplasmataceae (45.76%), Pseudonocardiaceae (6.04%), Moraxellaceae (2.77%), although these relative proportions differ by location (Additional file 4: Table S3, Additional file 3: Figure S1). Most notably, as seen in Fig. 3, Anaplasmataceae was the dominant family throughout samples from Las Pavas (48.30%) and Trinidad de las Minas (72.51%) (Fig. 4, Additional file 5: Table S4, Additional file 3: Figure  S1) but is not present within samples collected in northern Veraguas. The high abundance of Anaplasmataceae was due to a single genus, Wolbachia spp., comprising greater than 70% and 42% of the composition in more than half the specimens from Trinidad de las Minas and Las Pavas, respectively (Additional file 6: Table S5). No differences in microbial composition were detected between triatomine age class (Shannon, F (5, 85) = 1.07, P > 0.09) or primary blood-meal source (Shannon, F (9, 77) = 1.07, P > 0.38).

Infection rates
Trypanosma cruzi and T. rangeli were detected in sampled vectors at all locations (Additional file 1:

Discussion
Here, we characterized the bacterial microbiota of 87 wild individuals of the Chagas disease vector Rhodnius pallescens from three populations in Panama. We explored comparisons in composition between location, microhabitat, nymphal stage, T. cruzi infection, and blood-meal status. Overall, the microbiota of R. pallescens exhibited relatively low complexity in its bacterial composition which is consistent with other triatomine studies [23,29,30]. Proteobacteria has also been found to be the most abundant phylum in other vector species [1,[71][72][73][74][75] including the triatomines R. neglectus, R. prolixus, Triatoma vitticeps, T. infestans, T. brasiliensis, T. pseudomaculata, Dipetalogaster maximus and Panstrongylus megistus [29,30,73], while a predominance of Actinobacteria has been found previously in R. pallescens [22], both consistent with our findings. Common bacterial genera found in other triatomines include Burkholderia, Dietzia, Gordonia, Williamsia [22,30,73], Actinomycetospora, Arsenophonus, Corynebacterium, Rhodococcus, Staphylococcus [23,30,31], and Enterococcus, Enterobacteriaceae, Bacillus [23,33]. Of these only Actinomycetospora was one of the top 20 genera found across all studied sites (Additional file 5: Table S4). Enterobacteriaceae and Bacillus were found at all sites, but at much lower abundance (0.34-6.03% and 0.21-0.76%, respectively) than found by Waltmann et al. (36.7% and 2%, respectively) [33]; however, the present study differs in that we analyzed wild and not laboratory-reared samples. Dietzia and Gordonia were each in the top 20 taxa for Las Pavas and Trinidad de las Minas (Additional file 5: Table S4). Arsenophonus was detected with a very low abundance in only a single specimen from northern Veraguas Province. All other taxa were found across all sites, but at lower abundance (Additional file 19: Table S6) than in other studies.
High levels of Proteobacteria observed in specimens from Panama Oeste Province localities are due to the very high levels of Wolbachia sp. Specimens from northern Veraguas Province, interestingly, did not have any Wolbachia present. It has been estimated that Wolbachia infects 52% of all aquatic insect species [76] and can infect a high proportion of the number of individuals in a species [77]. However, while many arthropod species may be infected with Wolbachia, a majority of the individuals within a species may not be. In a comparative study, Sazama et al. [76] found that less than half of the individuals were infected in most (69%) Wolbachia-infected species. Wolbachia has been found previously in Rhodnius sp. [29,34] and is common in hematophagous insects [78], but has not been found in other triatomines [29,31,73]. In triatomines [29] and sandflies [74,79,80], the role of Wolbachia remains unknown. In mosquitoes, Wolbachia can affect reproduction and insecticide resistance among others [7], thus creating opportunities for vector biocontrol. However, without further knowledge of the role of Wolbachia in triatomines, identifying microbes that can serve as effective control agents for triatomines requires further research. To this end, we need more characterized microbiomes under multiple environmental conditions, and functional analysis of microbial taxa.
In our study, geographical location was associated with differences between microbial communities of R. pallescens. This observation is most evident between the two most disparate geographical locations (northern Veraguas vs Panama Oeste localities). This observation may be the result of quite different environments between these locations. Veraguas Province is located in the highlands of the western isthmus of Panama where the climate is cooler (mean temperature of 21 °C) and more humid with mountainous topography. In the two Panama Oeste area locations (Las Pavas and Trinidad de las Minas) the topography is flatter with warmer temperatures (mean 27 °C). Of particular interest is that the evaluated specimens from Santa Fe District in northern Veraguas correspond with a darker chromatic variation of R. pallescens infected by specific genetic groups of T. rangeli and T. cruzi [44]. Although the genetic characteristics of this population have not been studied, the reported phenotypic differences and the differences found in their microbial composition could be explained by the presence of this dark chromatic variant in this geographical region. However, it is not known if this dark variant represents a separate geographical population, a new subspecies, or a new distinct species of R. pallescens.
Geographical differences in microbiota have been observed in ticks [15,18,20,24,25], but not in mosquitoes where species-specific microbiota is thought to be stable [36]. One study in triatomines did not observe any difference in the microbiota between three distant locations in the southern USA for Triatoma protracta [23], which is a trend that has been confirmed [81] and opposed [82] in other hemipterans. As in other insect species, taxonomic clade-wide stability of microbiota composition with regard to one environmental variable or another does not appear to be consistent. Generalizations about geographical variation in insect microbiomes will have to remain at the species level for now. However, this is still very much an open question in triatomine microbiota research.
Contrary to our expectations, there was no significant difference in bacterial community composition between T. cruzi-infected and uninfected individuals. This contrasts with previous studies that have found significant differences between microbiomes of T. cruzi-positive and negative individuals [23,30]. However, our small sample size, limited number of palms sampled (n = 8) with skewed infection ratios, and a skewed abundance of younger stage nymphs (N1-N3), may confound true observable differences. Furthermore, detectable levels of differences may be localized to a portion of the triatomine gut where T. cruzi develops; this deserves further study and experimental controls. A similar situation may occur during infection with T. rangeli, which can colonize not only the insect intestine but also the hemocoel and salivary glands [83]. We also did not find any significant difference in the microbiota as a result of the dominant blood-meal source found, as previously observed [31]. This could be due to the complexity of variables that influence the microbiota. While bloodmeal source potentially has an effect on the bacterial composition in the gut, these samples often have mixed blood-meal sources with differing abundance (Additional file 1: Table S1; [50]), making discrete differences difficult to observe. Habitat associated microbial composition was found to be different between peridomestic and peridomestic-forest; however, this observation represents a few peridomestic-forest samples from northern Veraguas only and likely is an artifact of location difference. Similarly, microbial composition between age classes showed no differences, but because this dataset is highly skewed toward a couple of nymphal classes, distinctions are impossible to detect. This study examined the whole-body microbiota, which may obscure anatomically localized differences observed in other studies and, on the whole, result in a more comprehensive measure of microbial composition where local environment has a bigger impact.

Conclusions
In conclusion, we examined the whole-body microbiota of Rhodnius pallescens, which can serve as a benchmark for future comparative studies examining the microbiota of specific organs or anatomical regions. Interestingly, the largest difference in R. pallescens microbial community composition was between geographical locations. While we did not find any definitive differences in relation to other variables (e.g. habitat type, age class, bloodmeal source, infection status) these remain important aspects of vector biology that require further study. The effects of geographical environmental diversity can be minimized through the use of more comparative studies using laboratory-reared insects and controlled studies to tease apart more complex variables such as blood-meal sources and infection status.