- Open Access
Nanopore adaptive sampling for targeted mitochondrial genome sequencing and bloodmeal identification in hematophagous insects
Parasites & Vectors volume 16, Article number: 68 (2023)
Blood-feeding insects are important vectors for an array of zoonotic pathogens. While previous efforts toward generating molecular resources have largely focused on major vectors of global medical and veterinary importance, molecular data across a large number of hematophagous insect taxa remain limited. Advancements in long-read sequencing technologies and associated bioinformatic pipelines provide new opportunities for targeted sequencing of insect mitochondrial (mt) genomes. For engorged hematophagous insects, such technologies can be leveraged for both insect mitogenome genome assembly and identification of vertebrate blood-meal sources.
We used nanopore adaptive sampling (NAS) to sequence genomic DNA from four species of field-collected, blood-engorged mosquitoes (Aedes and Culex spp.) and one deer fly (Chrysops sp.). NAS was used for bioinformatical enrichment of mtDNA reads of hematophagous insects and potential vertebrate blood-meal hosts using publically available mt genomes as references. We also performed an experimental control to compare results of traditional non-NAS nanopore sequencing to the mt genome enrichment by the NAS method.
Complete mitogenomes were assembled and annotated for all five species sequenced with NAS: Aedes trivittatus, Aedes vexans, Culex restuans, Culex territans and the deer fly, Chrysops niger. In comparison to data generated during our non-NAS control experiment, NAS yielded a substantially higher proportion of reference-mapped mtDNA reads, greatly streamlining downstream mitogenome assembly and annotation. The NAS-assembled mitogenomes ranged in length from 15,582 to 16,045 bp, contained between 78.1% and 79.0% A + T content and shared the anticipated arrangement of 13 protein-coding genes, two ribosomal RNAs, and 22 transfer RNAs. Maximum likelihood phylogenies were generated to further characterize each insect species. Additionally, vertebrate blood-meal analysis was successful in three samples sequenced, with mtDNA-based phylogenetic analyses revealing that blood-meal sources for Chrysops niger, Culex restuans and Aedes trivittatus were human, house sparrow (Passer domesticus) and eastern cottontail rabbit (Sylvilagus floridanus), respectively.
Our findings show that NAS has dual utility to simultaneously molecularly identify hematophagous insects and their blood-meal hosts. Moreover, our data indicate NAS can facilitate a wide array of mitogenomic systematic studies through novel ‘phylogenetic capture’ methods. We conclude that the NAS approach has great potential for broadly improving genomic resources used to identify blood-feeding insects, answer phylogenetic questions and elucidate complex pathways for the transmission of vector-borne pathogens.
Hematophagous insects are major disease vectors that transmit a wide variety of pathogens to their blood-meal hosts. From a One Health perspective, blood-feeding insects are responsible for considerable morbidity and mortality across global human, livestock and wildlife communities . Yet, despite their global importance, vector-borne diseases remain difficult to study and adequately control. Vector-borne pathogens are often maintained in complex enzootic transmission cycles which frequently involve numerous species of both vector and vertebrate hosts. Even among groups of hematophagous insects (e.g. mosquitoes) there exists a wide diversity of disparate species which may individually exhibit substantial differences in their host feeding preferences and other natural history characteristics which impact their ability to serve as disease vectors (i.e. vector competence) [2,3,4,5].
From a broader perspective, the global diversity of hematophagous insect species presents a challenge to vector-borne disease research, as accurate species identifications are difficult in the absence of trained entomologists and region-specific taxonomic keys. For these reasons, straightforward methodologies that leverage molecular species barcoding without requiring a priori knowledge are of special interest to the global One Health community. Fortunately, the advancement of molecular tools and bioinformatic pipelines throughout the past decades has helped to inform many complex aspects of vector-borne pathogen transmission [6,7,8,9]. In particular, DNA barcoding—accomplished through PCR amplification and sequencing of select phylogenetically informative loci—has facilitated species identification efforts across hematophagous insect taxa [10, 11] and has been successfully applied to the molecular identification of vertebrate hosts from arthropod blood meals [12,13,14,15].
Across metazoan organisms, mitochondrial (mt) genes have emerged as the most popular targets for species barcoding efforts. In comparison to most nuclear genes, those encoded by mt genomes (mitogenomes) are well-suited for molecular species identification and phylogenetic reconstruction due to their relatively conserved evolutionary origins, indispensable cellular functions, predominantly uniparental inheritance and elevated mutation rates combined with very low rates of recombination . Animal mitogenomes are circular in structure, relatively small in size (approx. 16 kb) and contain the same core set of 13 protein-coding genes (PCGs), 22 transfer RNAs (tRNAs) and two ribosomal RNAs (rRNAs) [16, 17]. Furthermore, mitogenomes are present at many copies within each cell, making them convenient targets for molecular analyses using a variety of biological sample types [17,18,19,20,21,22,23]. Sequence data of numerous mt loci (e.g. those encoding cytochrome c oxidase subunit 1 [COI or cox1], cytochrome b [cyt-b] and mtDNA D-loop) are well-documented for their phylogenetic and phylogeographic utility, having variable rates of evolution that can be leveraged to test taxonomic hypotheses and elucidate evolutionary histories [10, 11, 15, 24, 25]. The majority of barcoding efforts, including those focused on blood-feeding insects and blood-meal analysis, have leveraged traditional PCR followed by Sanger sequencing of either the COI and cyt-b genes [10, 11, 15, 26,27,28,29,30]. In particular, the Barcoding of Life Initiative has identified the COI gene as an ideal locus for the global standardization of sequence-based animal species identification . Although such molecular barcoding has proved useful for identifying particular taxonomic lineages, it remains limited in its generation of singular-gene sequence data and potential for negative results due to PCR failure.
The weaknesses of PCR-based species barcoding and blood-meal analysis are largely overcome using de novo next-generation metagenomic sequencing approaches. Previously, high-throughput applications (i.e. Illumina sequencing) have been shown to provide the ample depth-of-coverage required for de novo taxonomic classification of blood-meal hosts [26, 32, 33]. Despite technological and bioinformatic advancements, second-generation sequencing methods (e.g. Illumina sequencing) are time-consuming, frequently require large brick-and-mortar sequencing laboratories and can be cost prohibitive. In the context of the global burden of vector-borne diseases, second-generation metagenomic approaches are largely out of reach due to their limited availability.
Nanopore sequencing technologies have increased accessibility, both financially and computationally, to genomic data. In particular, the Oxford Nanopore Technologies (ONT; Oxford, UK) MinION sequencing platform is a portable device that can be used both in laboratory and field settings for a variety of sequencing applications [34,35,36]. An important aspect of the MinION platform is that it sequences individual DNA or RNA molecules across a microfluidic flow cell. During a MinION experiment, single-molecule nucleotide sequences of the sequencing library are typically produced at a rate of approximately 450 bases per second, and hundreds of DNA/RNA fragments are analyzed simultaneously. Given the novelty of nanopore-based real-time sequencing, a variety of bioinformatic tools have been developed that effectively leverage the single-molecule aspect of the technology. For example, nanopore adaptive sampling (NAS) is a powerful bioinformatic method that selectively sequences individual DNA, complementary DNA (cDNA) or RNA molecules in real-time [23, 37, 38]. NAS utilizes real-time mapping against a user-specified reference file to compare nucleotides of individual DNA molecules (approx. 200–400 bp, every approx. 0.4 s) as they are being sequenced. This dynamic method has a wide number of applications and can selectively enrich DNA or RNA targets of interest (e.g. particular genes, RNA species, mitogenomes, etc.) and reject unwanted molecules during a sequencing experiment [37, 38].
Combined with the long-read sequencing capabilities of the MinION platform, NAS is particularly well-suited for the selective enrichment and assembly of complete mitogenomes in animal species . Moreover, NAS mitogenome sequencing can effectively capture mtDNA sequences having at least 75% identity with a given reference; thus, the method can be used for mitogenome assembly of a variety of taxa for which references are absent (Fig. 1) . For these reasons the NAS method has great potential as a molecular/bioinformatic tool that can be leveraged to advance One Health research efforts focused on hematophagous insects and their blood-meal hosts. Here, we use NAS to molecularly characterize the mitogenomes of four mosquito species and a biting deer fly, and we show proof-of-concept of its utility for blood-meal analysis. We posit that NAS holds great promise for a variety of vector-borne disease research projects aimed at elucidating hematophagous insect diversity and associated pathogen transmission pathways.
Mosquitoes were opportunistically collected using dry ice-baited CDC Miniature Light Traps (model 512; John W. Hock Company; Gainesville, FL, USA) hung approximately 6 feet above ground in Ramsey County, St. Paul, Minnesota, USA. Mosquito traps were set at 1700 hour and recovered at 1000 hour the following day. A single blood-fed Chrysops deer fly was opportunistically collected while conducting fieldwork within Washington County, Minnesota, USA (Afton State Park; as approved by the Minnesota Department of Natural Resources: permit 202145). All specimens were cold anesthetized and morphologically examined under a stereomicroscope. Blood-fed female mosquitoes were separated from non-target insects, and initial taxonomic identifications were based on standard morphological features . For downstream nucleic acid extractions and sequencing experiments, female mosquito specimens were visually inspected to ensure that they appeared fully engorged and free of visible eggs, which suggested that all mosquitoes had obtained their blood meals within roughly 24 h of specimen collection. All specimens were submerged in RNAlater (Sigma-Aldrich, St. Louis, MO, USA) and subsequently preserved at – 80 °C prior to nucleic acid extraction and molecular analyses. A list of the specimens examined is provided in Additional file 1: Table S1.
DNA extraction and ONT library preparation
Genomic DNA was individually extracted from a small subset of blood-fed Culex and Aedes mosquitoes, including Culex restuans (n = 3), Culex territans (n = 1), Aedes vexans (n = 1) and Aedes trivittatus (n = 1), and from one deer fly (Chrysops niger; n = 1), using a Qiagen DNeasy Blood and Tissue Kit following manufacturer’s instructions (Qiagen, Hilden, Germany). The resulting extracts were quantified using a Qubit 4 Fluorometer (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA). Genomic DNA libraries for ONT sequencing were produced for each specimen using Sequencing Ligation Kits SQK-LSK109 and SQK-LSK110, following standard ONT protocol. For each sample, 0.3–1.5 μg of initial DNA was used for ONT sequencing library construction. Samples were barcoded using the ONT kit EXP-NBD104, pooled, and five libraries were sequenced until completion for approximately 22–48 h on a MinION Mk1b device using R9.4 flow cells. All sequencing was performed on a Linux desktop computer with the following specifications: Intel C600/X79 series i9-10920X 12 core; Linux 5.4.0–77-generic x86_64; Ubuntu 18.04; Nvidia Quadro RTX 4000 GPU. The five sequencing experiments are denoted here as follows: A: C. niger NAS (2 barcodes, with 1 barcoded sample consisting of midges not included in the present study); B: Cx. restuans and Cx. territans barcoded NAS; C: Ae. vexans NAS (no barcode); D: Cx. restuans unenriched control sequencing (no barcode and no NAS); and E: Cx. restuans and Ae. trivittatus barcoded NAS (2 additional barcoded samples not included in the present study) (see Table 1).
Sequencing was initiated using the MinKNOW GUI, v4.3.20, software (ONT) in which adaptive sampling is integrated. Reference files containing publicly available mitogenome sequences and select barcoding genes (i.e. those encoding COI, cyt-b) were compiled in FASTA format and used for NAS-based enrichment. For NAS sequencing experiments A (C. niger), B (Cx. restuans and Cx. territans) and E (Cx. restuans and Ae. trivittatus), we manually curated a reference FASTA containing mitogenome assemblies of related (i.e. genus level) blood-feeding insects, as well as an assortment of mammalian and avian mitogenomes based on species considered to be potential blood-meal hosts for our study location (a list of GenBank accession numbers is provided in Additional file 2: Table S2). For NAS experiment C (Ae. vexans), we elected to use the NCBI RefSeq download of the complete mt genome database (downloaded 16 Sept 2021). Real-time basecalling was performed using the FAST basecalling model in Guppy and mapped to a provided reference file using the NAS. After completion of the sequencing runs, post-hoc high-accuracy basecalling was performed using Guppy (v5.0.11; release with GPU-enabled basecalling for Ubuntu) prior to downstream analyses. The NAS software pipeline provides an optional immediate readout of reads mapping to particular reference sequences, and real-time alignment data can be recorded in the resulting sequencing summary output text file. For experiments C and E, this optional readout was utilized to extract individual sequences based on successfully aligned reads contained in the sequencing summary output text file. For the remaining NAS experiments (A and B), targeted mtDNA reads were mapped using Minimap2 (v2.17-r941) and indexed using SAMtools (v1.9) in post-hoc bioinformatic analyses.
NAS versus control sequencing
A control nanopore sequencing experiment (experiment D) was conducted to test the effectiveness of NAS. This experiment generated sequences from genomic DNA of a single mosquito (Cx. restuans) without NAS for approximately 45 h. During the sequencing runs, basecalling was accomplished using the FAST model, and post-hoc basecalling was conducted using the high-accuracy model of Guppy to keep all other sequencing variables (other than NAS) equivalent. Sequences from experiment D were filtered based on a quality score of 10 and read lengths between 1 and 16 kb using NanoFilt v2.7.1 . Filtered reads were subsampled at random using seqtk v1.3 (https://github.com/lh3/seqtk) and used in comparison analyses. Randomly sampled reads were mapped to the Cx. pipiens mitogenome using the program Minimap2 v2.17  and the percentage of assembled reads were noted. Experiment D was compared to NAS experiment B for Cx. restuans and Cx. territans, using the same filter parameters, random subsampling technique and mapping program.
Mitogenome assemblies and phylogenetics
Reads were filtered for mitogenome assembly using NanoFilt with a minimum Q-score of 10 and read lengths of between 1 and 16 kb . Filtered reads for individual samples were de novo assembled using Flye v2.8.3 following methods outlined in Wanner et al. . Subsequent mitogenome assemblies were polished to produce more accurate assemblies using Medaka v1.4.3 (https://github.com/nanoporetech/medaka). Open reading frames (ORFs) were identified and annotated using MITOS2 web server (http://mitos2.bioinf.uni-leipzig.de/index.py) with specification for metazoan RefSeq and invertebrate genetic code . Each annotated mitogenome was visually inspected using Geneious Prime software v2021.2.2, and aligned against a published and annotated mitogenome of another member in each genus. Where necessary, the start and end positions of particular genes were manually adjusted based on those of previously characterized mitogenomes. Polished and assembled mitogenomes were visualized using OGDRAW v1.3.1 (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html) .
After annotation of the mt genomes, COI gene sequences were used in phylogenetic analyses to verify species identification. COI sequences were downloaded from GenBank for each appropriate species (Culex, Aedes and Chrysops) to build phylogenies. Appropriate outgroup samples also were downloaded from GenBank and included in each analysis to root the phylogeny. Sequences were aligned using MAFFT v7.475 , and subsequent alignments were used in a maximum likelihood analysis with the following parameters in RAxML v8.211 : (i) model of evolution was set to GTRGAMMAI; and (ii) analyses were performed using 1000 bootstraps. The resulting best tree for each analysis was visualized using FigTree v1.4.1 (https://github.com/rambaut/figtree). Initial putative blood-meal identifications were determined by real-time alignment to vertebrate mt genomes during each sequencing experiment (see Additional file 2: Table S2). Following post-hoc high-accuracy basecalling, confirmation of blood-meal host identifications were achieved using Minimap2 and SAMtools to re-map reads across vertebrate mt genomes to characterize successfully mapped reads. Average coverage was calculated for the human blood meal (experiment A), and phylogenetic trees were generated (using the same parameters described above) for the house sparrow (Passer domesticus) and eastern cottontail (Sylvilagus floridanus) blood meals to determine statistical nodal support as mtDNA coverage was considerable in reads derived from these vertebrate hosts.
Output and performance of nanopore sequencing experiments
Genomic DNA was isolated from five species of blood-fed insects (species identifications were confirmed using COI barcoding, as described below): C. niger (n = 1), Cx. restuans (n = 3), Cx. territans (n = 11), Ae. vexans (n = 1) and Ae. trivittatus (n = 1). Molecular data for each specimen was generated across five ONT sequencing runs (Table 1). The number of bases generated for each sample ranged from 200 Mb to 5 Gb, and the number of reads generated ranged from 0.6 million to 10.8 million. Average read length (N50) was lower in sequences generated from NAS runs (347–536 bp) than those generated in the control experiment (1.96 kb), as expected (Table 1). Raw FASTQ files generated for each sequencing run were submitted to the Sequence Read Archive (SRA) at NCBI (Bioproject: PRJNA775614; Biosamples: SAMN22604479-SAMN22604483; SAMN22888850-SAMN22888851).
Comparison of NAS and control sequencing for mitogenome enrichment
Our comparison of the NAS method versus a control nanopore sequencing experiment (i.e. NAS off; experiment D) revealed that approximately 65.8% and 75% of reads in the NAS active experiment (experiment B: Cx. territans and Cx. restuans, respectively) mapped to the Cx. pipiens reference, whereas 1.8% of reads in the control experiment (experiment D) mapped to the same reference (Table 2). Given that experiment D consisted of a single mosquito, we performed sequence subsampling analyses and found similar percentages (NAS: > 65% reads mapped; control: approx. 2% reads mapped; Additional file 3: Table S3).
Assembly and annotation of mt genomes
For each species sequenced using NAS enrichment, we assembled a single, circular contig from aligned mtDNA reads. Mean assembly coverage for each mitogenome is reported as follows: C. niger: 56×; Cx. restuans: 612×; Cx. territans: 1342×; Ae. trivittatus: 46×; Ae. vexans: 750×. In addition to these successfully assembled mitogenomes, we also attempted assembly using aligned reads generated during the control experiment (experiment D: Cx. restuans). Here, we noted that 19 contigs of high coverage (1846×–2428×) and short lengths (426 bp–958 bp) were generated during Flye assembly and this ultimately precluded our ability to obtain a high-quality mitogenome in the absence of NAS-based enrichment. The five NAS-derived mitogenomes were all of an intermediate size, ranging from 15,582 to 16,045 bp in total length (Fig. 2; Additional file 4: Figure S1). Nucleotide bias was also consistent across all five mitogenomes and ranged from 78.1% A + T to 79.0% A + T in total nucleotide content.
Annotation of our assemblies indicated that each encoded a total of 37 genes, which included the expected assemblage of PCGs (n = 13), tRNAs (n = 22) and rRNAs (n = 2). A short region lacking gene content (i.e. control region) at the suspected site of replication initiation was also observed in each sequenced mitogenome. Among our Aedes and Culex mitogenomes, the majority strand (J-strand) was found to encode nine PCGs and 13 tRNAs, while the minority strand (N-strand) was found to encode four PCGs, nine tRNAs and both rRNA genes. Overall, this organization of gene content as well as the specific order of genes was found to be identical among our three mosquito mitogenomes and consistent with that reported in other members of the Culicidae family [46, 47]. Our C. niger deer fly mitogenome differed slightly in that the J-strand encoded nine PCGs and 14 tRNAs, with the N-strand encoding the remaining four PCGs, eight tRNAs and both rRNAs. Importantly, this organization is consistent with previously sequenced mitogenomes for dipteran members of the suborder Brachycera . Complete annotated mitogenome assemblies were submitted to the NCBI Organelle database (Additional file 2: Table S2).
Molecular identification of insect species and blood meals
Genetic identification for each vector species was determined using the complete COI gene that was annotated from the assembled mitogenomes reported herein (experiments A-C) or from consensus sequences generated by mapping to a COI gene reference (experiments D and E). The resulting best trees generated from a bootstrapped maximum likelihood analysis are reported in Additional file 5: Figure S2. Each vector COI sequence generated for this study was statistically supported and formed a monophyletic clade of the appropriate species.
A blood meal identified as human was successfully sequenced from C. niger (experiment A). We recovered over 200 mtDNA sequences of the human blood meal, and mapped reads primarily ranged from approximately 100 to approximately 8 kb in length, with a single read spanning nearly the entire human mitogenome (16,123 bp; Fig. 3). Experiment E generated blood-meal sequencing reads for two vector species. For the Cx. restuans vector, 16 reads mapped to the house sparrow (P. domesticus) mitogenome, ranging in length from 106 to 2475 bp. Of these 16 mapped reads, three reads mapped to the control region (CR) of the house sparrow mt genome. A maximum likelihood analysis revealed that the sequences generated from the CR formed a statistically supported clade with P. domesticus (1.0% divergence within the P. domesticus clade; Fig. 4a; Additional file 6: Table S4). The Ae. trivittatus vector from experiment E resulted in a blood-meal host identified as an eastern cottontail rabbit (S. floridanus) with a single read (955 bases long) mapping to the 12S gene. A phylogenetic analysis of the 12S gene supports the molecular identification of the blood meal to S. floridanus (2.0% divergence within the S. floridanus clade; Fig. 4b).
In the present study we demonstrate the dual utility of NAS for mitogenome assembly and blood-meal identification in hematophagous insects. Using data generated through NAS, we successfully enriched mtDNA, and assembled and annotated entire mt genomes of four mosquito species (Ae. vexans, Ae. trivittatus, Cx. restuans, Cx. territans) and the black deer fly (C. niger) (Fig. 2). Of these, the mt genome assemblies for Ae. trivittatus, Cx. restuans, Cx. territans, and C. niger are the first to become available for the respective species. Using a meta-mt barcoding approach, we confirmed the vertebrate sources of the blood meals of Cx. restuans (house sparrow), Ae. trivittatus (eastern cottontail rabbit) and C. niger (human) (Figs. 3, 4; experiments A and E).
NAS is a relatively new bioinformatic tool , thus comparisons to traditional nanopore sequencing runs are essential. The analyses presented herein and those of others [23, 37, 38, 49, 50] clearly show that NAS consistently enriches reference sequences (herein, mt sequence data) at levels ranging from 0.96-fold to 5.4-fold versus control ONT experiments. The NAS experiments conducted herein successfully captured entire mitogenomes of the hematophagous insects, with coverages ranging from 56× (C. niger) to 1342× (Cx. territans). Moreover, the mitogenome of the control run (experiment D; NAS off; Cx. restuans) was not fully assembled into a single, circular contig. This finding is interesting given we secured an estimated coverage of over 23,000 sequences from the control experiment (based on reference-guided mapping to the Cx. pipiens mt genome). In light of our failure to assemble a complete circular contig from these control data, we posited that there were likely numerous nuclear mtDNA segments (NUMTs) that survived filtering, ultimately resulting in the generation of 19 short, linear contigs during the Flye assembly process. To evaluate this further, we subsampled these mapped approximately 23,000 reads from the control run (experiment D) and characterized them using BLAST (Basic Local Alignment Search Tool) against the NCBI nucleotide database. For a majority of these subsampled reads, we observed BLAST matches to Culex mtDNA which consisted of only a short percentage (approx. 5–10%) of the overall read query. Further analysis of the flanking regions on these subsampled reads most closely matched against nuclear regions of Cx. pipiens pallens (NCBI assembly accession: GCA_016801865.1) and Cx. quinquefasciatus (NCBI assembly accession: GCA_015732765.1) whole-genome assemblies, suggesting that these reads did indeed contain NUMTs with mt pseudogenes. NUMTs are common across eukaryotes and have been reported in a variety of mosquito taxa [46, 51, 52]. Our findings suggest that targeting mtDNA using NAS—which bases its enrichment through reference-based mapping of the initial 200–400 bp for each read sequenced—may help resolve the issue of inadvertent NUMT capture, as NUMTs contained within long stretches of nuclear DNA would not map against our mtDNA references and be rejected from complete sequencing using our approach. Thus, NAS can work to streamline sequencing and assembly for mitogenomes while avoiding the confounding effects of NUMTs that remain challenging with traditional molecular methods.
Similar to Wanner et al. , the results reported herein clearly document the utility of NAS for the enrichment and downstream assembly of complete mt genomes. In light of these observations, we posit that NAS will be a particularly useful tool for a multitude of studies, ranging from the inter- and intra-species characterization of mitogenome variation to biodiversity assessments of a wide variety of taxa wherein mt-based species barcoding is routinely conducted . For hematophagous insects in particular, taxonomic identification can be challenging for both novice and skilled entomologists using morphological characters alone, especially for cryptic species. Moreover, given the range of expansions associated with climate change and anthropogenic-driven introductions, traditional morphological keys for a given geographic area may be unable to resolve species status for resident hematophagous insects. For these reasons, a robust molecular-based approach that facilitates the taxonomic identification of blood-feeding insects will greatly aid vector-borne disease research and vector biosurveillance. In the present study, we demonstrate how our NAS approach will directly contribute to such efforts by beginning with unidentified field-collected specimens and successfully producing complete mitogenome assemblies and molecular identifications for our sampled species. We highlight the natural history of the insects sampled herein to demonstrate the power of our end-to-end NAS mtDNA approach.
Aedes vexans is a cosmopolitan species and often represents the most common mosquito species collected in the Upper Midwest, USA . It is considered a competent vector of a variety of pathogens, including West Nile virus, Eastern and Western encephalitis virus, Zika virus, St. Louis encephalitis virus, Rift Valley fever virus and Dirofilaria immitis, although is rarely considered a species of significant disease concern [54,55,56]. Aedes trivittatus shares many ecological characteristics with Ae. vexans and is an abundant summer floodwater mosquito throughout the American Midwest; Ae. trivittatus appears to feed chiefly on mammals, including the eastern cottontail rabbit (S. floridanus), [57,58,59], a feeding preference which is supported by our blood-meal analysis findings. Similar to Ae. vexans, Ae. trivittatus is likely a competent vector of both West Nile virus  and D. immitis , although its precise role in pathogen transmission in North America has not been well-characterized. The Ae. trivittatus mt genome assembly provided herein and deposited into NCBI (accession number OL471015) is the first to be reported for this species. Culex restuans is distributed throughout the USA and Canada [39, 62]; it is increasingly being recognized as an important vector of West Nile virus, particularly in urban and suburban areas , and is likely capable of transmitting St. Louis encephalitis virus . Notably, adult female Cx. restuans mosquitoes are largely indistinguishable from North America's primary West Nile virus vector, Cx. pipiens, based on external morphological features [65, 66]. The Cx. restuans mt genome assembly provided herein (accession number OL351548) is the first to be reported for this species and represents an important resource for the molecular differentiation of these medically important West Nile virus vectors. Culex territans is widespread in the Eastern USA where it feeds on frogs . It transmits a hepatozoon parasite to frogs but is unlikely to serve a major role in mammalian disease transmission . The Cx. territans mt genome assembly provided herein and deposited on NCBI (accession number OL351549) is the first reported for this species. The deer fly C. niger is widely distributed in Eastern USA and south-eastern Canada and is associated with marshy areas where its larvae feed on organic matter in soil . It is a common pest of livestock . Other members of the genus are involved in the mechanical transmission of Fransicella tularensis in Western USA . The C. niger mt genome assembly provided herein and deposited on NCBI (accession number OL351550) is the first to be reported for this species.
Regarding our NAS-based recovery of the mitogenomes of the C. niger deer fly, the only available mt genome for reference within the genus was from C. silvifacies collected from China. Our de novo assembly of the C. niger mitogenome is separated by a genetic distance of approximately 9.4% Kimura-2 Parameter values from C. silvifacies. Likewise, we used the mt genome for Cx. pipiens collected from Tunisia as reference for our Cx. restuans and Cx. territans NAS sequencing. The de novo assembled mitogenomes generated herein for Cx. restuans and Cx. territans have a genetic distance of approximately 5.5% and approximately 8.0% from Cx. pipiens, respectively. These results show that the NAS method can be used with distantly related, yet congeneric species references to successfully identify species having no reference available; we coin this approach as ‘phylogenetic capture.’ The broader implications of these findings is that NAS-based recovery of mtDNA sequences requires only a single mt genome from a given genus. It is also possible that by strategically selecting reference sequences for NAS that effectively span the phylogenetic distance of a given clade (i.e. basal and terminal clades, ≥ 1 members of a polytomy, 1 of 2 sister taxa, etc.) will enhance the discovery and identification of taxa wherein reference sequence data are absent or unavailable. We note that the NAS reference file for experiment C (Ae. vexans) included all publicly available mitogenomes on the NCBI organelle database at the time of our experiment (11,982 mitogenome sequences; accessed 16 Sept 2021). We observed no indication that the MinKNOW software or NAS method was impacted by this number of reference sequences. The implication of this observation is that a single mitogenome reference file spanning the tree of life can be used for the NAS phylogenetic capture method. If accurate, this approach has far-reaching implications for species barcoding and molecular systematics that extend beyond the scope of the analyses presented herein (e.g. pathogen discovery).
We were unable to recover blood-meal identifications for mosquitoes sampled in NAS experiments B (Cx. territans, Cx. restuans) and C (Ae. vexans). In experiment B, one possible explanation for this observation is that blood meals may have been from a vertebrate source not included in our initial enrichment FASTA. For example, Cx. territans is known to feed from a variety of vertebrate sources, including reptiles and amphibians , groups which were not represented as enrichment references during this sequencing experiment. To overcome this potential limitation during subsequent experiments C and E, we elected to increase the number of representative mitogenomes and taxa in our references to increase the likelihood of positive blood meal-associated read recovery. As reviewed by Kent , there are several variables which either individually, or combined, could additionally influence successful blood-meal identification using molecular methods, including: (i) time between feeding event and sample preservation (longer times result in more host blood being digested); (ii) mammalian hosts having enucleated red blood cells (RBCs; successful identification dependent upon leukocytes present in a mammalian sample; increased chance of success if host has nucleated RBCs [e.g. birds, reptiles, amphibians]); and (iii) a high percentage of hematophagous insect DNA within extracts (bulk, whole individual DNA extracts result in majority of DNA originating from the hematophagous insect). Despite these observations, NAS for blood-meal identification has several advantages over traditional PCR approaches, in particular: (i) NAS is not confounded by PCR inhibitors (i.e. heme) present in blood ; and (ii) the method can detect multiple hosts within a heterogeneous blood meal when using a reference file of substantial phylogenetic breadth. Moreover, because nanopore sequencing is a single-molecule technology, NAS data negate the need for cloning and Sanger sequencing of mixed, multi-host PCR products that are the result of PCR primers targeting highly conserved regions.
Our NAS data consisted of enriched mtDNA sequences ranging from hundreds to thousands of nucleotide bases of a given sample, and downstream bioinformatic processes yielded de novo mitogenome assemblies for four insects for which no previous mitogenome existed (i.e. unavailable on public nucleotide repositories). The vast majority of available sequences for mt-based molecular barcoding consist of COI sequences (approx. 9.87 M) managed by the Barcode of Life Data System (BOLD; www.boldsystems.org). Thus, when combined with the BOLD database, NAS mitogenome sequencing is particularly useful for the molecular identification of insect vectors (as demonstrated above). Select mtDNA genes that are frequently used for species barcoding (i.e. genes encoding COI, cyt-b) are easily extracted from de novo assembled mitogenomes of hematophagous insects (and their corresponding blood meals), for rapid BLAST and phylogenetic analyses facilitated by the BOLD initiative. We anticipate that the availability of complete mitogenome assemblies will increase exponentially with the usage of single-molecule long-read sequencing methodologies (e.g., ONT and Pacific Biosciences [Menlo Park, CA, USA]). Reference mitogenomes of hematophagous insects will provide important insights into the evolutionary histories of disease vectors and will greatly assist with accurate taxonomic identification efforts of cryptic species.
Nanopore sequencing through adaptive sampling is a revolutionary approach that can be leveraged to address many biological questions [37, 49, 50, 73,74,75,76]. Long-read single-molecule sequencing is ideally suited for recovering entire mitogenomes across the tree of life, thus opening the door to enhanced mt barcoding. Here, we demonstrate how NAS can be utilized to dually identify hematophagous insects and their blood-meal hosts through complete mitogenome sequencing and characterization of recovered mt barcoding genes. Our data indicate that NAS can generate sequence data enriched for mt reads over unenriched ‘traditional’ nanopore sequencing to improve downstream mitogenome assemblies. Importantly, the NAS-based strategies outlined here can be leveraged to sequence a wide diversity of arthropod taxa, as publicly available mitogenome references at the genus level are sufficient to capture mt reads for related species whose complete mitogenome has not yet been sequenced (i.e. ‘phylogenetic capture’). Given the well-documented ease of use and portability of ONT MinION instruments [34, 77, 78], we envision that similar NAS-based mt barcoding efforts, in addition to metagenomic pathogen surveillance, can be performed within individual research laboratories, including field-based locations, worldwide. Thus, we posit that NAS-based surveillance of hematophagous insects will greatly advance global One Health research projects and biosurveillance initiatives.
Availability of data and materials
All raw sequence data generated during the course of this study have been deposited within the NCBI SRA biorepository and are publicly available (Bioproject: PRJNA775614; Biosamples: SAMN22604479-SAMN22604483; SAMN22888850-SAMN22888851). Mitochondrial genome assemblies resulting from our research have been deposited at GenBank (NCBI Accession OL351547-OL351550, OL471015).
Basic local alignment search tool
Barcode of Life Data System
Cytochrome c oxidase subunit 1
Nanopore adaptive sampling
Nuclear mitochondrial DNA segment
Oxford Nanopore Technologies Inc
Pérez de León AA, Mitchell RD, Watson DW. Ectoparasites of cattle. Vet Clin North Am Food Animal Prac. 2020;36:173–85.
Gómez-Díaz E, Figuerola J. New perspectives in tracing vector-borne interaction networks. Trends Parasitol. 2010;26:470–6.
Kent RJ. Molecular methods for arthropod bloodmeal identification and applications to ecological and vector-borne disease studies. Mol Ecol Resour. 2009;9:4–18.
Hardy JL, Houk EJ, Kramer LD, Reeves WC. Intrinsic factors affecting vector competence of mosquitoes for arboviruses. Annu Rev Entomol. 1983;28:229–62.
Bartholomay LC, Michel K. Mosquito immunobiology: the intersection of vector health and vector competence. Annu Rev Entomol. 2018;63:145–67.
Quick J, Grubaugh ND, Pullan ST, Claro IM, Smith AD, Gangavarapu K, et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat Protoc. 2017;12:1261–76.
Naveca FG, Claro I, Giovanetti M, de Jesus JG, Xavier J, de IaniFC M, et al. Genomic, epidemiological and digital surveillance of Chikungunya virus in the Brazilian Amazon. PLoS Negl Trop Dis. 2019;13:e0007065.
Hill SC, de Souza R, Thézé J, Claro I, Aguiar RS, Abade L, et al. Genomic surveillance of Yellow Fever virus epizootic in São Paulo, Brazil, 2016–2018. PLoS Pathog. 2020;16:e1008699.
Giovanetti M, Faria NR, Lourenço J, Goes de Jesus J, Xavier J, Claro IM, et al. Genomic and epidemiological surveillance of Zika Virus in the Amazon region. Cell Reports. 2020;30:2275–83.
Beebe NW. DNA barcoding mosquitoes: advice for potential prospectors. Parasitology. 2018;145:622–33.
Jinbo U, Kato T, Ito M. Current progress in DNA barcoding and future implications for entomology. Entomol Sci. 2011;14:107–24.
Alcaide M, Rico C, Ruiz S, Soriguer R, Muñoz J, Figuerola J. Disentangling Vector-Borne transmission networks: a universal DNA barcoding method to identify vertebrate hosts from arthropod bloodmeals. PLoS ONE. 2009;4:e7092.
Mukabana WR, Takken W, Knols BGJ. Analysis of arthropod bloodmeals using molecular genetic markers. Trends Parasitol. 2002;18:505–9.
Reeves LE, Gillett-Kaufman JL, Kawahara AY, Kaufman PE. Barcoding blood meals: new vertebrate-specific primer sets for assigning taxonomic identities to host DNA from mosquito blood meals. PLoS Negl Trop Dis. 2018;12:e0006767.
Townzen JS, Brower AVZ, Judd DD. Identification of mosquito bloodmeals using mitochondrial cytochrome oxidase subunit I and cytochrome b gene sequences. Med Vet Entomol. 2008;22:386–93.
Gissi C, Iannelli F, Pesole G. Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity. 2008;101:301–20.
Boore JL. Animal mitochondrial genomes. Nucleic Acids Res. 1999;27:1767–80.
Cameron SL. Insect mitochondrial genomics: implications for evolution and phylogeny. Annu Rev Entomol. 2014;59:95–117.
Crampton-Platt A, Timmermans MJTN, Gimmel ML, Kutty SN, Cockerill TD, Vun Khen C, et al. Soup to tree: the phylogeny of beetles inferred by mitochondrial metagenomics of a Bornean rainforest sample. Mol Biol Evol. 2015;32:2302–16.
Crampton-Platt A, Yu DW, Zhou X, Vogler AP. Mitochondrial metagenomics: letting the genes out of the bottle. GigaSci. 2016;5:15.
Romero PE, Weigand AM, Pfenninger M. Positive selection on panpulmonate mitogenomes provide new clues on adaptations to terrestrial life. BMC Evol Biol. 2016;16:164.
Timmermans MJTN, Dodsworth S, Culverwell CL, Bocak L, Ahrens D, Littlewood DTJ, et al. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics. Nucleic Acids Res. 2010;38:e197.
Wanner N, Larsen PA, McLain A, Faulk C. The mitochondrial genome and epigenome of the golden lion Tamarin from fecal DNA using nanopore adaptive sequencing. BMC Genomics. 2021;22:726.
Bradley RD, Baker RJ. A Test of the genetic species concept: cytochrome-b sequences and mammals. J Mammal. 2001;82:960–73.
Creedy TJ, Andújar C, Meramveliotakis E, Noguerales V, Overcast I, Papadopoulou A, et al. Coming of age for COI metabarcoding of whole organism community DNA: towards bioinformatic harmonisation. Mol Ecol Resour. 2021;00:1–15.
Borland EM, Kading RC. Modernizing the toolkit for arthropod bloodmeal identification. Insects. 2021;12:1–27.
Molaei G, Andreadis TG, Armstrong PM, Bueno R, Dennett JA, Real SV, et al. Host feeding pattern of Culex quinquefasciatus (Diptera: Culicidae) and its role in transmission of West Nile virus in Harris County, Texas. Am J Trop Med Hyg. 2007;77:73–81.
Santiago-Alarcon D, Havelka P, Schaefer HM, Segelbacher G. Bloodmeal analysis reveals avian Plasmodium infections and broad host preferences of Culicoides (Diptera: Ceratopogonidae) vectors. PLoS ONE. 2012;7:e31098.
Schnell IB, Thomsen PF, Wilkinson N, Rasmussen M, Jensen LRD, Willerslev E, et al. Screening mammal biodiversity using DNA from leeches. Curr Biol. 2012;22:R262–3.
Videvall E, Bensch S, Ander M, Chirico J, Sigvald R, Ignell R. Molecular identification of bloodmeals and species composition in Culicoides biting midges. Med Vet Entomol. 2013;27:104–12.
Costa FO, Carvalho GR. New insights into molecular evolution: prospects from the barcode of life initiative (BOLI). Theory Biosci. 2010;129:149–57.
Kieran TJ, Gottdenker NL, Varian CP, Saldaña A, Means N, Owens D, et al. Blood meal source characterization using Illumina sequencing in the Chagas disease vector Rhodnius pallescen (Hemiptera: Reduviidae) in Panama. J Med Entomol. 2017;54:1786–9.
Muturi EJ, Dunlap C, Tchouassi DP, Swanson J. Next generation sequencing approach for simultaneous identification of mosquitoes and their blood-meal hosts. J Vector Ecol. 2021;46:116–21.
Blanco MB, Greene LK, Rasambainarivo F, Toomey E, Williams RC, Andrianandrasana L, et al. Next-generation technologies applied to age-old challenges in Madagascar. Conserv Genet. 2020;21:785–93.
Hoenen T, Groseth A, Rosenke K, Fischer RJ, Hoenen A, Judson SD, et al. Nanopore sequencing as a rapidly deployable ebola outbreak tool. Emerg Infect Dis. 2016;22:331–4.
Jain M, Olsen HE, Paten B, Akeson M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 2016;17:239.
Kipp EJ, Lindsey LL, Khoo BS, Faulk C, Oliver JD, Larsen PA. Enabling metagenomic surveillance for bacterial tick-borne pathogens using nanopore sequencing with adaptive sampling. bioRxiv. 2021. https://doi.org/10.1101/2021.08.17.456696.
Payne A, Holmes N, Clarke T, Munro R, Debebe BJ, Loose M. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat Biotechnol. 2021;39:442–50.
Darsie RF, Ward RA. Identification and geographical distribution of the mosquitoes of North America, north of Mexico. Gainesville: University Press of Florida; 2005.
De Coster W, D’Hert S, Schultz DT, Cruts M, Van Broeckhoven C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics. 2018;34:2666–9.
Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100.
Donath A, Jühling F, Al-Arab M, Bernhart SH, Reinhardt F, Stadler PF, et al. Improved annotation of protein-coding genes boundaries in metazoan mitochondrial genomes. Nucleic Acids Res. 2019;47:10543–52.
Greiner S, Lehwark P, Bock R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res. 2019;47:59–64.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.
Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3.
Behura SK, Lobo NF, Haas B, deBruyn B, Lovin DD, Shumway MF, et al. Complete sequences of mitochondria genomes of Aedes aegypti and Culex quinquefasciatus and comparative analysis of mitochondrial DNA fragments inserted in the nuclear genomes. Insect Biochem Mol Biol. 2011;41:770–7.
Luo Q-C, Hao Y-J, Meng F, Li T-J, Ding Y-R, Hua Y-Q, et al. The mitochondrial genomes of Culex tritaeniorhynchus and Culex pipiens pallens (Diptera: Culicidae) and comparison analysis with two other Culex species. Parasites Vectors. 2016;9:406.
Wang K, Li X, Ding S, Wang N, Mao M, Wang M, et al. The complete mitochondrial genome of the Atylotus miser (Diptera: Tabanomorpha: Tabanidae), with mitochondrial genome phylogeny of lower Brachycera (Orthorrhapha). Gene. 2016;586:184–96.
Gan M, Wu B, Yan G, Li G, Sun L, Lu G, et al. Combined nanopore adaptive sequencing and enzyme-based host depletion efficiently enriched microbial sequences and identified missing respiratory pathogens. BMC Genomics. 2021;22:1–11.
Martin S, Heavens D, Lan Y, Horsfield S, Clark MD, Leggett RM. Nanopore adaptive sampling: a tool for enrichment of low abundance species in metagenomic samples. Genome Biol. 2022;23:11.
Hlaing T, Tun-Lin W, Somboon P, Socheat D, Setha T, Min S, et al. Mitochondrial pseudogenes in the nuclear genome of Aedes aegypti mosquitoes: implications for past and future population genetic studies. BMC Genet. 2009;10:1–12.
Ding Y-R, Li B, Zhang Y-J, Mao Q-M, Chen B. Complete mitogenome of Anopheles sinensis and mitochondrial insertion segments in the nuclear genomes of 19 mosquito species. PLoS ONE. 2018;13:e0204667.
Dunphy BM, Rowley WA, Bartholomay LC. A taxonomic checklist of the mosquitoes of Iowa. J Am Mosq Control Assoc. 2014;30:119–21.
Gendernalik A, Weger-Lucarelli J, Garcia Luna SM, Fauver JR, Rückert C, Murrieta RA, et al. American Aedes vexans mosquitoes are competent vectors of Zika Virus. Am J Trop Med Hyg. 2017;96:1338–40.
Ndiaye EH, Fall G, Gaye A, Bob NS, Talla C, Diagne CT, et al. Vector competence of Aedes vexans (Meigen), Culex poicilipes (Theobald) and Cx. quinquefasciatus say from senegal for West and East African lineages of Rift Valley fever virus. Parasit Vectors. 2016;9:94.
Turell MJ, Dohm DJ, Sardelis MR, O’Guinn ML, Andreadis TG, Blow JA. An update on the potential of North American mosquitoes (Diptera: Culicidae) to transmit West Nile virus. J Med Entomol. 2005;42:57–62.
Molaei G, Andreadis TG, Armstrong PM, Diuk-Wasser M. Host-feeding patterns of potential mosquito vectors in connecticut, USA: molecular analysis of bloodmeals from 23 species of Aedes, Anopheles, Culex, Coquillettidia, Psorophora, and Uranotaenia. J Med Entomol. 2008;45:1143–51.
Nasci RS. Variations in the blood-feeding patterns of Aedes vexans and Aedes trivittatus (Diptera: Culicidae)1. J Med Entomol. 1984;21:95–9.
Pinger RR, Rowley WA. Host preferences of Aedes trivittatus (Diptera: Culicidae) in Central Iowa. Am J Trop Med Hyg. 1975;24:889–93.
Tiawsirisup S, Platt KB, Evans RB, Rowley WA. A comparision of West Nile virus transmission by Ochlerotatus trivittatus (COQ.), Culex pipiens (L.), and Aedes albopictus (Skuse). Vector-Borne Zoonotic Dis. 2005;5:40–7.
Pinger RR. Presumed Dirofilaria immitis infections in mosquitoes (Diptera: Culicidae) in Indiana, USA. J Med Entomol. 1982;19:553–5.
Strickman D, Darsie RF. The previously undetected presence of Culex restuans (Diptera: Culicidae) in Central America, with notes on identification. Mosquito Syst. 1988;20:21–7.
Johnson BJ, Robson MG, Fonseca DM. Unexpected spatiotemporal abundance of infected Culex restuans suggest a greater role as a West Nile virus vector for this native species. Infect Genet Evol. 2015;31:40–7.
Monath TP, Tsai TF. St. Louis encephalitis: lessons from the last decade. Am J Trop Med Hyg. 1987;37:40S-59S.
Apperson CS, Harrison BA, Unnasch TR, Hassan HK, Irby WS, Savage HM, et al. Host-Feeding Habits of Culex and Other Mosquitoes (Diptera: Culicidae) in the Borough of Queens in New York City, with characters and techniques for identification of Culex Mosquitoes. J Med Entomol. 2002;39:777–85.
Ebel GD, Rochlin I, Longacker J, Kramer LD. Culex restuans (Diptera: Culicidae) relative abundance and vector competence for West Nile virus. J Med Entomol. 2005;42:838–43.
Bartlett-Healy K, Crans W, Gaugler R. Temporal and Spatial Synchrony of Culex territans (Diptera: Culicidae) with their Amphibian hosts. J Med Entomol. 2008;45:1031–8.
Desser SS, Hong H, Martin DS. The life history, ultrastructure, and experimental transmission of Hepatozoon catesbianae n. comb, an apicomplexan parasite of the bullfrog, Rana catesbeiana and the mosquito, Culex territans in Algonquin Park Ontario. J Parasitol. 1995;81:212–22.
Pechumann L. The insects of Virginia: No. 6. Horse flies and deer flies of Virginia (Diptera: Tabanidae). Va Polytechnic Inst State Univ Res Div Bull. 1973;81:1–92.
Weiner TJ, Hansens EJ. Species and numbers of bloodsucking flies feeding on hogs and other animals in southern New Jersey. J New York Entomol Soc. 1975;83:198–202.
Jellison WL. Tularemia geographical distribution of “Deerfly Fever” and the biting fly Chrysops discalis Williston. Public Health Rep. 1950;65:1321–9.
Akane A, Matsubara K, Nakamura H, Takahashi S, Kimura K. Identification of the heme compound copurified with deoxyribonucleic acid (DNA) from bloodstains, a major inhibitor of polymerase chain reaction (PCR) amplification. J Forensic Sci. 1994;39:362–72.
Miller DE, Sulovari A, Wang T, Loucks H, Hoekzema K, Munson KM, et al. Targeted long-read sequencing identifies missing disease-causing variation. Am J Hum Genet. 2021;108:1436–49.
Marquet M, Zöllkau J, Pastuschek J, Viehweger A, Schleußner E, Makarewicz O, et al. Evaluation of microbiome enrichment and host DNA depletion in human vaginal samples using Oxford Nanopore’s adaptive sequencing. Sci Rep. 2022;12:4000.
Lin Y, Dai Y, Liu Y, Ren Z, Guo H, Li Z, et al. Rapid PCR-based nanopore adaptive sequencing improves sensitivity and timeliness of viral clinical detection and genome surveillance. Front Microbiol. 2022;13:929241.
Cheng H, Sun Y, Yang Q, Deng M, Yu Z, Zhu G, et al. A rapid bacterial pathogen and antimicrobial resistance diagnosis workflow using Oxford nanopore adaptive sequencing method. Briefings Bioinform. 2022;23:bbac453. https://doi.org/10.1093/bib/bbac453.
Maestri S, Cosentino E, Paterno M, Freitag H, Garces JM, Marcolungo L, et al. A rapid and accurate MinION-based workflow for tracking species biodiversity in the field. Genes. 2019;10:468.
Pomerantz A, Peñafiel N, Arteaga A, Bustamante L, Pichardo F, Coloma LA, et al. Real-time DNA barcoding in a rainforest using nanopore sequencing: opportunities for rapid biodiversity assessments and local capacity building. GigaScience. 2018;7:033.
We thank Suzanne Stone for assistance within the molecular laboratory and related logistics. The Minnesota Supercomputing Institute provided essential computational and data storage resources. The staff of the Wildlife Rehabilitation Center of Minnesota (Roseville, MN) and Luciano Caixeta kindly provided access to forested lands and the University of Minnesota dairy barn, respectively, for insect collecting. Financial support for CMB and JB was kindly provided by the University of Minnesota, College of Veterinary Medicine Summer Scholars program with funds originating from the Office Of The Director of the NIH under Award Number T35OD011118 and the Office of Graduate Programs, College of Veterinary Medicine. We thank Tiffany Wolf and Erin Burton for providing dissecting microscopes for morphological identifications of insects. We used BioRender (www.biorender.com) to create figures presented herein.
This research presented herein was supported in part by the Office Of The Director of the NIH under Award Number T35OD011118, the Office of Graduate Programs, College of Veterinary Medicine, University of Minnesota and startup funds provided to Peter A. Larsen through the Minnesota Agriculture, Research, Education, Extension, and Technology Transfer (AGREETT) program.
Ethics approval and consent to participate
Consent for publication
All authors consent to publication.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1:
Table S1. Vector specimens examined for each sequencing experiment. SRA accession number includes raw FASTQ files from each experiment. NCBI organelle accession numbers are only available for the four vector species that had fully assembled mitochondrial genomes generated in this study.
Additional file 2:
Table S2. Genbank accession numbers that were included in the reference FASTA file used during NAS sequencing runs. Experiment A included the only publically available representative of the genus Chrysops (C. silvifacies), a species that is not native to the study region of North America; various potential vertebrate hosts and a whole genome assembly for the bacterial pathogen Francisella tularensis were also included as enrichment references. Experiment B contained related mosquito species, an assortment of possible bird and mammal blood meals and possible pathogen targets of interest. Experiment E contained closely related vector species and possible bird, mammal, reptile and amphibian blood meals. Experiment C is not included in this table as we used the NCBI RefSeq download for the entire mitochondrial genome database provided by NCBI (https://ftp.ncbi.nlm.nih.gov/refseq/release/mitochondrion/; accessed 16 Sept 2021).
Additional file 3:
Table S3. Sequences were randomly subsampled starting at 9000 sequences. The resulting subsampled files were mapped to the mitogenome of Culex pipiens using Minimap2.
Additional file 4:
Figure S1. Mitogenome map for the black deer fly, Chrysops niger, sequenced during nanopore adaptive sampling Experiment A. Orientation of gene transcription is denoted with arrows and A + T content across the mitogenome is depicted on the innermost circle in light gray.
Additional file 5: Figure S2.
Maximum likelihood phylogenies generated based on the COI gene for hematophagous insects sequenced with NAS. Phylogenies were generated using 1000 bootstrap replicates with statistically supported nodes (≥ 75 bootstrap value) depicted with an asterisk (*). Consensus sequences generated in the present study are denoted in red with their corresponding experimental run; comparable sequences obtained through the Barcode of Life Data System (BOLD; www.boldsystems.org). a Maximum likelihood tree for deer flies in the genus Chrysops b Maximum likelihood tree for Aedes mosquitoes c Maximum likelihood tree for Culex mosquitoes.
Additional file 6:
Table S4. Sequences downloaded from GenBank (species and accession numbers) used in the phylogenies of blood meals sequenced from experiment E.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Kipp, E.J., Lindsey, L.L., Milstein, M.S. et al. Nanopore adaptive sampling for targeted mitochondrial genome sequencing and bloodmeal identification in hematophagous insects. Parasites Vectors 16, 68 (2023). https://doi.org/10.1186/s13071-023-05679-3
- Mitochondrial genomes
- Molecular barcoding
- Phylogenetic capture