Gene discovery and differential expression analysis of humoral immune response elements in female Culicoides sonorensis (Diptera: Ceratopogonidae)

Background Female Culicoides sonorensis midges (Diptera: Ceratopogonidae) are vectors of pathogens that impact livestock and wildlife in the United States. Little is known about their biology on a molecular-genetic level, including components of their immune system. Because the insect immune response is involved with important processes such as gut microbial homeostasis and vector competence, our aims were to identify components of the midge innate immune system and examine their expression profiles in response to diet across time. Methods In our previous work, we de novo sequenced and analyzed the transcriptional landscape of female midges under several feeding states including teneral (unfed) and early and late time points after blood and sucrose. Here, those transcriptomes were further analyzed to identify insect innate immune orthologs, particularly humoral immune response elements. Additionally, we examined immune gene expression profiles in response to diet over time, on both a transcriptome-wide, whole-midge level and more specifically via qRTPCR analysis of antimicrobial peptide (AMP) expression in the alimentary canal. Results We identified functional units comprising the immune deficiency (Imd), Toll and JAK/STAT pathways, including humoral factors, transmembrane receptors, signaling components, transcription factors/regulators and effectors such as AMPs. Feeding altered the expression of receptors, regulators, AMPs, prophenoloxidase and thioester-containing proteins, where blood had a greater effect than sucrose on the expression profiles of most innate immune components. qRTPCR of AMP genes showed that all five were significantly upregulated in the alimentary canal after blood feeding, possibly in response to proliferating populations of gut bacteria. Conclusions Identification and functional insight of humoral/innate immune components in female C. sonorensis updates our knowledge of the molecular biology of this important vector. 
Because diet alone influenced the expression of immune pathway components, including their effectors, subsequent studies of the role of innate immunity in biological processes such as gut homeostasis and life history are being pursued. Furthermore, since the humoral response is a key contributor to gut immunity, manipulating immune gene expression will help uncover genetic components of vector competence, including midgut barriers to infection. The results of such studies will serve as a platform for designing novel transmission-blocking strategies. Electronic supplementary material The online version of this article (doi:10.1186/1756-3305-7-388) contains supplementary material, which is available to authorized users.


Background
Culicoides biting midges (Diptera: Ceratopogonidae) are nuisance pests and some species are important vectors of disease-causing viruses, protists, and nematodes. In the US, Culicoides sonorensis transmits bluetongue virus and epizootic hemorrhagic disease virus to wild and domestic ruminants (e.g. sheep, deer, cattle), and has also shown potential to vector other viruses [1,2]. While both sexes of midges feed on sugars in the form of extrafloral nectar, female C. sonorensis midges are anautogenous, requiring blood meals to initiate egg development. Since this process also serves as a means of pathogen acquisition from infected hosts, only female midges are disease vectors.
Arthropod vectors utilize physical and physiological defenses to combat microbes that may be present in the blood or sugar meal and to maintain homeostatic balance in gut bacterial populations. Physical defenses include the peritrophic matrix, which forms around the ingested blood meal and partitions microbes such as bacteria by size exclusion [3]. A second line of defense involves the innate immune response, composed of humoral and cellular components that act locally (e.g., epithelia, proximal to microbes) and/or systemically (i.e., fat body and hemolymph). Three major conserved signaling pathways that orchestrate the insect humoral immune response have been elucidated in model organisms such as fruit flies and mosquito vectors: Imd (Immune deficiency), Toll and JAK/STAT (Janus kinase/signal transduction and activators of transcription) [4]. In some dipteran flies, the Imd pathway is activated when peptidoglycan cell wall components of Gram-negative bacteria directly bind transmembrane peptidoglycan recognition protein (PGRP) receptors, pattern recognition receptors (PRRs) that are present on a variety of cells, especially barrier epithelia and fat body [4]. Imd activation results in the synthesis of antimicrobial peptides (AMPs) such as Diptericin via the Relish transcription factor [5]. The Toll pathway is activated by peptidoglycan components of Gram-positive bacterial cell walls and fungal glucans, and thus primarily responds to infections with these classes of microorganisms [4]. In the insect hemocoel, binding of these microbe-associated molecular patterns (MAMPs) to circulating PRRs triggers an extracellular serine protease cascade that eventually results in intracellular activation of NF-κB response elements and the transcription of Toll-induced AMPs. Alternatively, fungal proteolytic activity also activates the Toll pathway via the protease Persephone [6]. 
In the JAK/STAT pathway, three components, the Domeless receptor, the Janus kinase Hopscotch, and the transcription factor STAT, are at least partly involved in antiviral defenses in various flies [7,8]. More recently, evidence has mounted that implicates both the Imd and Toll pathways in the dipteran antiviral defense repertoire as well, including defense against entomopathogenic viruses and arboviruses [9,10].
AMPs are small, potent, antimicrobial effectors that are quickly synthesized by the insect fat body, hemocytes or epithelia in response to pathogen or microbe exposure [11,12]. A majority are cationic at physiological pH, which facilitates interactions with microbial cell envelope components [13]. Immune studies in important insect vectors have demonstrated AMP upregulation in response to pathogen challenge either by natural or artificial routes. Anopheles gambiae presented with bacteria and malaria parasites upregulate defensin in the midgut and carcass and express this AMP in the salivary glands during late stages of infection [14][15][16]. Sandflies express AMPs in response to Leishmania infection and some AMPs, such as Attacin, are involved in antitrypanosomal responses in tsetse flies [17][18][19]. AMPs and other effectors also participate in population control of non-pathogenic gut microbes. Larval dipterans are exposed to environmental bacteria through normal feeding activities and often harbor these indigenous microbiota transstadially [20][21][22][23]. Populations of gut-associated microbiota in adult insects are tightly regulated and reflect a balance between the immune response and bacterial tolerance [24][25][26]. In several vectors, a tripartite relationship between gut bacteria, pathogens, and the vector innate immune response has been demonstrated, including the impact such associations have on vector competence [27][28][29]. Thus, knowledge of the humoral response of blood feeding vectors helps not only in understanding their biology, but can also reveal mechanisms underlying refractoriness.
Innate immune responses in biting midges, including AMP expression, have not been investigated. In our previous work, we sequenced and annotated the transcriptome of adult female C. sonorensis and examined the responding transcriptome profiles of whole midges during various feeding states. In the current study, we identified and describe the components of the humoral immune response including receptors, signaling molecules and effectors from the Toll, Imd, and JAK/STAT pathways. Furthermore, we examined their differential activation on a transcriptome-wide level in whole female midges under different feeding states (teneral, blood and sucrose feeding over time). The gut-specific expression of selected AMPs in response to blood and sugar meals was quantified over time, and we found that blood feeding alone highly induced expression of five AMPs in the alimentary canal. This is the first description of these pathways in the midge, and likewise is the first look at temporospatial expression of AMP genes in relation to diet source. The role of these immune pathways in gut microbial ecology and vector competence in midges is discussed.

Humoral immune gene discovery
The adult female midge reference transcriptome has been described previously [30]. In brief, female midges were unfed (teneral) or were exposed to different diets (blood or sucrose) and sampled at early (2, 6, 12 h post ingestion, pooled) or late (36 h post ingestion) time intervals. Total RNA from whole midges was used to prepare indexed, temporospatial-specific sequencing libraries, which were deep sequenced on an Illumina HiSeq 2000. A de novo transcriptome was constructed and can be downloaded from the Transcriptome Shotgun Assembly deposited at DDBJ/EMBL/GenBank under the accession GAWM00000000 and bioproject 238338. The transcriptome comprises 19,041 unigene assemblies that can be found in the GenBank nucleotide database under the following accessions: GAWM01000001-GAWM01019041. Homology-based annotation of the unigene set was carried out through comparisons to Aedes aegypti and Culex quinquefasciatus datasets, and the non-redundant protein database at GenBank. For the current study, functional signatures were determined by alignment to the InterPro (www.ebi.ac.uk/interpro) and ImmunoDB (cegg.unige.ch/insect/immunodb) databases to check for domains and orthologs, respectively, and to confirm correct annotation along with complete ortholog structure/function. Essentially, these methods were used to determine whether the deduced amino acid sequences of the unigenes contained the complete domains and motifs associated with each immune component's function and structure as defined in other arthropods.
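The accession range quoted above can be expanded into individual unigene identifiers when cross-referencing the accessions cited in later sections. The following is a minimal sketch, assuming the identifiers follow the pattern visible in the range (the prefix GAWM01 followed by a six-digit, zero-padded counter); the function name accession_range is hypothetical and introduced only for illustration.

```python
def accession_range(prefix: str, first: int, last: int, width: int = 6) -> list:
    """Return accessions built as prefix + zero-padded counter, first..last inclusive."""
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


# Expand GAWM01000001-GAWM01019041 (assumed naming pattern, see lead-in).
unigenes = accession_range("GAWM01", 1, 19041)

print(len(unigenes))   # number of unigene accessions in the range
print(unigenes[0])     # first accession
print(unigenes[-1])    # last accession
```

A membership check such as "GAWM01000519" in unigenes can then confirm that an accession cited in the text falls inside the deposited range.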

Transcriptome-wide expression profiles of humoral immune genes
Humoral immune genes were identified by searching the gene annotations and assigned GO terms, and by applying knowledge from other arthropod systems. Digital genome-wide gene expression profiles for female midges under different feeding and temporal conditions were described previously [30]. Briefly, treatment groups comprised teneral midges (unfed, 2 d old) and midges fed either 10% sucrose or blood and collected to represent early (2, 6, 12 h post-ingestion, pooled) or late (36 h post-ingestion) conditions. Two biological replicates of each of these five treatment groups were collected and analyzed to determine condition-specific global gene expression profiles. Pairwise comparisons were made between and within diet source across time using the Tuxedo software package as we previously described [30], and statistically significant differences in gene expression were reported (P ≤ 0.01).
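The pairwise comparison logic described above can be illustrated with a short sketch: compute a log2 fold change between two conditions and retain genes at the P ≤ 0.01 threshold. This is not the Tuxedo pipeline itself; the record layout, gene names, FPKM values and p-values below are hypothetical and serve only to show the filtering step, and all FPKM values are assumed to be non-zero.

```python
import math

# Hypothetical expression records (illustrative values, not data from the study).
records = [
    {"gene": "geneA", "fpkm_teneral": 2.0, "fpkm_blood_early": 16.0, "p_value": 0.0004},
    {"gene": "geneB", "fpkm_teneral": 10.0, "fpkm_blood_early": 11.0, "p_value": 0.40},
    {"gene": "geneC", "fpkm_teneral": 8.0, "fpkm_blood_early": 1.0, "p_value": 0.008},
]


def log2_fold_change(baseline: float, treatment: float) -> float:
    """log2(treatment / baseline); assumes both FPKM values are greater than zero."""
    return math.log2(treatment / baseline)


def significant_genes(rows, alpha=0.01):
    """Return (gene, log2 fold change) for rows with p_value <= alpha."""
    return [
        (r["gene"], log2_fold_change(r["fpkm_teneral"], r["fpkm_blood_early"]))
        for r in rows
        if r["p_value"] <= alpha
    ]


hits = significant_genes(records)
for gene, lfc in hits:
    direction = "up" if lfc > 0 else "down"
    print(f"{gene}: log2FC = {lfc:.2f} ({direction}regulated)")
```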

Alignments of AMP genes
Multiple alignments of deduced peptide sequences were performed using CLC Genomics Workbench (www.clcbio.com). Insect sequences downloaded from NCBI were manually trimmed, inspected, and aligned with CLUSTALW.

Antimicrobial peptide expression in C. sonorensis alimentary canal
Culicoides sonorensis midges (AK colony) were reared at the US Department of Agriculture Arthropod-Borne Animal Diseases Research Unit and maintained at 26°C, 70-80% relative humidity, with a 12:12 h light-dark photoperiod. One- to two-day-old adult female midges were allowed to feed ad libitum for 1.5 h on a 10% sucrose solution or for 1 h on defibrinated sheep blood (Colorado Serum Company, Denver, CO) via an artificial membrane. Each feeding trial was replicated three times. At 3, 8, 12, and 24 h post feeding, midges (n = 15/time point per replicate) were anesthetized with carbon dioxide and removed for processing. The alimentary canal was dissected from each midge and pooled by time point for homogenization in Tri-Reagent (Ambion). Total RNA extraction was performed using a modified manufacturer's protocol incorporating 1-bromo-3-chloropropane in the extraction step and overnight ethanol precipitation. RNA quality was analyzed with a Nanodrop spectrophotometer and cDNA was synthesized from 500 ng total RNA using the QuantiTect Reverse Transcription kit following the manufacturer's instructions (Qiagen, Valencia, CA). qRT-PCR detection was performed using a 5 PRIME RealMasterMix SYBR ROX kit (5 Prime, Gaithersburg, MD) according to the manufacturer's protocol and run in 10 μl reactions consisting of primers diluted to a final concentration of 250 nM and cDNA templates diluted 1:10. To minimize variability, pipetting was performed using an Eppendorf epMotion 5070 platform and reactions were run in triplicate on a Mastercycler ep realplex thermal cycler (Eppendorf, Hauppauge, NY) with the following parameters: 95°C for 2 min, followed by 40 cycles of 95°C for 15 s, 60°C for 20 s, 60°C for 15 s. Primer sequences are listed in Additional file 1, and include the reference gene EF1b [GenBank: GAWM01010754], which was previously identified as a candidate since it is not differentially expressed across teneral, sucrose-fed or blood-fed midges [30]. 
CT values were analyzed using the Relative Expression Software Tool [31], which allows for group-wise comparison and statistical analysis of relative expression while accounting for differences in primer efficiencies.
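The efficiency-corrected relative expression that underlies this kind of analysis can be sketched as follows. The example assumes the commonly used efficiency-corrected ratio, in which the target gene's amplification efficiency raised to its CT difference (control minus sample) is divided by the same quantity for the reference gene; efficiencies are expressed as amplification factors (2.0 corresponds to perfect doubling per cycle). The function name and all CT and efficiency values are hypothetical, and the statistical testing performed by the software is not reproduced here.

```python
def relative_expression(e_target, ct_target_control, ct_target_sample,
                        e_ref, ct_ref_control, ct_ref_sample):
    """Efficiency-corrected expression ratio of a target gene, sample vs. control.

    e_target, e_ref: amplification factors per cycle (2.0 = 100% efficiency).
    ct_*: threshold cycle values for control (e.g. teneral) and sample groups.
    """
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Hypothetical values: the target crosses threshold 3 cycles earlier in the
# sample, while the reference gene is unchanged between groups.
ratio = relative_expression(
    e_target=2.0, ct_target_control=28.0, ct_target_sample=25.0,
    e_ref=2.0, ct_ref_control=20.0, ct_ref_sample=20.0,
)
print(f"Relative expression (sample vs. control): {ratio:.1f}-fold")
```

With a lower target efficiency (for example 1.9 instead of 2.0), the same CT shift yields a smaller fold change, which is why primer efficiencies are taken into account.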

Results and discussion
Components of the C. sonorensis humoral immune system in the transcriptome
The adult female transcriptome consists of 19,041 unigenes as described previously [30]. A search of the assigned Gene Ontology (GO) terms for "humoral" and "immune" returned 52 and 125 unigenes, respectively. However, searching GO terms did not reveal all critical components of the pathways described below, and subsequent manual curation and searching using public resources revealed a total of 217 unigenes (~1.1% of the adult female midge transcriptome) that make up or are involved in the insect humoral immune response. Three major conserved pathways in insect humoral immunity were revealed: Imd (Immune deficiency), Toll and JAK/STAT (Janus kinase/signal transduction and activators of transcription), with all or most components such as receptors, signaling intermediates, transcriptional regulators, effectors and regulators. All critical components were identified for Toll and JAK/STAT, but we did not identify two signaling components of the Imd pathway (Imd, FADD). Below we introduce and describe the detailed components of the midge humoral immune response.
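The keyword search of GO annotations and the proportion reported above can be illustrated with a brief sketch. The annotation table below is a hypothetical stand-in (unigene identifiers and GO term names are invented for illustration); only the counts 217 and 19,041 come from the text.

```python
# Hypothetical annotation table: unigene ID -> assigned GO term names.
annotations = {
    "unigene_1": ["humoral immune response", "defense response to bacterium"],
    "unigene_2": ["innate immune response"],
    "unigene_3": ["chitin metabolic process"],
}


def search_go(table, keyword):
    """Return sorted unigene IDs with at least one GO term containing keyword."""
    keyword = keyword.lower()
    return sorted(
        uid for uid, terms in table.items()
        if any(keyword in term.lower() for term in terms)
    )


humoral_hits = search_go(annotations, "humoral")
immune_hits = search_go(annotations, "immune")
print(humoral_hits, immune_hits)

# Share of the transcriptome assigned to the humoral immune response,
# using the counts reported in the text (217 of 19,041 unigenes).
share_percent = 100 * 217 / 19041
print(f"{share_percent:.1f}% of unigenes")
```

A keyword search of this kind only finds what the annotations already name, which is consistent with the need for manual curation noted above.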

Imd pathway
The Imd pathway is part of the dipteran humoral antibacterial response that is activated when meso-diaminopimelic acid-containing peptidoglycan (DAP-PGN) binds transmembrane long-form peptidoglycan recognition proteins (PGRPs) [4,32]. We confirmed the identity of seven long-form PGRPs in the midge transcriptome (Table 1). For immune signal transduction to ensue, activated PGRPs act through the adaptor Imd and subsequently FADD, which are two death-domain proteins that interact with DREDD (a Caspase-8 homolog). Interestingly, we did not identify orthologs for either Imd or FADD in the C. sonorensis transcriptome, although these have been identified in other Nematocera [33], but a DREDD ortholog was identified [GenBank: GAWM01000519]. In Drosophila, DREDD cleaves the inhibitory domain from phosphorylated Relish, and the rel domain then translocates to the nucleus to induce expression of effectors such as antimicrobial peptides (AMPs) [32]. Relish is phosphorylated by a parallel component of the Imd pathway involving IAP (inhibitor of apoptosis), TAB2 (TAK1-associated binding protein 2), and several kinases, such as TAK1 (transforming growth factor-activated kinase 1) and the IKK complex [4]. Orthologs for all components of this branching part of the Imd pathway were found in the transcriptome, including IAP2 [GenBank: GAWM01008211], TAB2 [GenBank: GAWM01006076], TAK1 [GenBank: GAWM01010356; GenBank: GAWM01012184], the ird5 ortholog IKK-beta [GenBank: GAWM01013537] and the key ortholog IKK-gamma, also known as Kenny [GenBank: GAWM01018250]. Two non-allelic sequences for TAK1 were identified (Table 1). This MAP3K also modulates the branch point between the Imd and JNK (c-Jun N-terminal kinase) pathways, by phosphorylating the IKK complex and JNKK (Jun kinase kinase), respectively [34]. We also identified two Relish orthologs, with one [GenBank: GAWM01014885] likely being either rel-2 (a rel-1 paralog), or possibly a truncated isoform of rel-1 [GenBank: GAWM01014884].
Regulation of the Imd pathway in insects includes both basal and inducible regulators that modulate the timing and amplitude of the immune response, respectively [35]. In C. sonorensis, we identified the inducible negative regulators PIRK (poor Imd response upon knock-in, also known as PIMS or RUDRA) [GenBank: GAWM01010231] and PGRP-SC2/SC3 (a short-form scavenger type of circulating PGRP) [GenBank: GAWM01018647] as well as the basal negative regulators Caspar (also known as FAS-associated factor 1, FAF1) [GenBank: GAWM01012793] and Caudal [GenBank: GAWM01004228].

Toll pathway
Unlike the Imd pathway of the humoral immune response, the Toll pathway functions solely in the systemic (e.g. fat body and hemolymph) recognition of microbes in insects. This is because microbial MAMPs (e.g. Lys-type peptidoglycan or fungal glucans) do not directly bind Toll receptors but instead are pre-processed by circulating PRRs including PGRP-SA and Gram-negative binding proteins 1 and 3 (GNBP1, GNBP3; also known as Beta-1,3 Glucan Binding Proteins). These interactions start a protease cascade that eventually cleaves circulating pro-Spaetzle to the Toll-binding cytokine Spaetzle, after which signal transduction and effector expression ensue [36,37]. The upstream humoral components of the Toll pathway that were identified in C. sonorensis include PGRP-SA [GenBank: GAWM01018051], three GNBP1 orthologs [GenBank: GAWM01002165; GenBank: GAWM01003712; GenBank: GAWM01004143] and GNBP3 [GenBank: GAWM01011997], three putative Spaetzle orthologs [GenBank: GAWM01001358; GenBank: GAWM01006049; GenBank: GAWM01012721], all without a signal peptide, and one Spaetzle-1 ortholog, complete with signal sequence [GenBank: GAWM01015015] (Table 2).
All cell-associated components of the insect Toll pathway were identified in the C. sonorensis transcriptome. Insect Toll receptors have characteristic extracellular N-terminal leucine-rich repeats (LRR), at least two flanking cysteine-rich motifs (CRR) and intracellular Toll/IL-1 receptor (TIR) domains [38]. We identified two putative Toll receptors, which were complete except for CRR motifs [GenBank: GAWM01015594; GenBank: GAWM01019001], and three complete Toll receptors [GenBank: GAWM01015706; GenBank: GAWM01013057; GenBank: GAWM01013058] (Table 2). Intracellular Toll signaling involves three death-domain containing proteins including the adaptor MyD88 and the mammalian IRAK1 and IRAK4 orthologs Pelle and Tube, respectively [4]. Complete orthologs for MyD88 [GenBank: GAWM01018790], Pelle [GenBank: GAWM01001221; GenBank: GAWM01011117] and Tube [GenBank: GAWM01007838] were found in the transcriptome. CsPelle and CsTube both contain typical death and kinase domains, and CsPelle has the GD dipeptide motif while CsTube has the RD dipeptide motif [39]. In Drosophila Tube, the kinase function has been evolutionarily lost; however, Tube proteins from the nematocerans Aedes aegypti [GenBank: AAEL007642], Culex pipiens [GenBank: CPIJ013746] and now C. sonorensis retain complete kinase domains [39,40]. Transcription of effector molecules in the insect Toll pathway is controlled by the Rel-inhibitor and IkappaB ortholog Cactus, and the Rel-1 transcription factors Dorsal or Dif. In the C. sonorensis transcriptome, we identified a complete Cactus ortholog [GenBank: GAWM01009580], containing typical ankyrin repeats, as well as several Dorsal orthologs. The CsDorsal sequences represent at least two dorsal genes and possibly two additional splice forms (Table 2), and the sequences were highly similar to that from the single-copy dorsal gene in mosquitoes (data not shown) [41].

Antimicrobial peptides
When Toll and Imd pathways are activated, their transcription factors (e.g. Dorsal, Relish) translocate to the nucleus and bind NF-κB promoters upstream of effector genes, such as those encoding antimicrobial peptides (AMPs). Full sequences for several AMPs were present in the midge transcriptome (Table 1). Two members of the Attacin superfamily were identified, with one having the full characteristics of insect Attacins [GenBank: GAWM01017969], bearing two C-terminal glycine-rich (G) domains in tandem (Figure 1A). The other Attacin-like glycine-rich AMP [GenBank: GAWM01008443] had only one G domain, showing high similarity to the G1 domain of other dipteran glycine-rich AMPs (Figure 1A). This short CsAttacin-like AMP is not a Diptericin since it lacks both an N-terminal proline-rich P-domain and a pentaglycine repeat domain, which are characteristic of fly Diptericins [42,43]. In mosquitoes, glycine-rich short AMPs annotated as "Attacins" also bear only one G domain (Figure 1A). Therefore, these nematoceran Attacin-like AMPs categorically are neither Diptericins nor Attacins, but rather represent another member of this AMP family. We infer that CsAttacin-like AMP is likely a truncated paralog of CsAttacin, rather than being an ortholog of the short Attacin-like AMPs in mosquitoes. A single midge Cecropin was identified and is 58 amino acids in length including the signal peptide [GenBank: GAWM01000005] (Figure 1B). Cecropins have alpha-helical peptide structures that form pores in bacterial cell envelopes [11,44,45]. CsCecropin contains numerous conserved, positively charged amino acid residues (mainly lysine and arginine), which comprise a characteristic motif associated with this AMP class and are important in interactions with negatively charged bacterial cell membranes. The CsCecropin deduced amino acid sequence was most similar to Cecropins from other Nematocera.
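The cationic character noted above for CsCecropin can be estimated from a deduced sequence by counting charged residues. The following is a minimal sketch, assuming a simple count of lysine and arginine minus aspartate and glutamate (histidine and the terminal groups are ignored); the peptide shown is a hypothetical placeholder, not the CsCecropin sequence.

```python
POSITIVE = set("KR")   # lysine, arginine
NEGATIVE = set("DE")   # aspartate, glutamate


def approximate_net_charge(peptide: str) -> int:
    """Crude net charge: (K + R) - (D + E); ignores His and terminal groups."""
    seq = peptide.upper()
    positives = sum(residue in POSITIVE for residue in seq)
    negatives = sum(residue in NEGATIVE for residue in seq)
    return positives - negatives


# Hypothetical peptide used only to demonstrate the calculation.
example_peptide = "GKKLRKEGD"
net_charge = approximate_net_charge(example_peptide)
print(f"Approximate net charge: {net_charge:+d}")
```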
Two paralogous Defensins [GenBank: GAWM01019039; GenBank: GAWM01019040] were identified from the C. sonorensis transcriptome and only shared 34.8% sequence identity (Table 1, Figure 1C). Like other insect Defensins, both CsDefensins contain six conserved cysteines which are critical to the secondary structure of this AMP and the interaction with the bacterial envelope [46].
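A pairwise identity figure such as the one reported for the two CsDefensins is derived from an alignment, and the conserved cysteines can be counted directly from the sequences. The sketch below assumes identity is calculated over aligned columns in which neither sequence has a gap (other definitions use a different denominator and give different values); the aligned sequences are hypothetical placeholders, not the CsDefensin sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def count_cysteines(sequence: str) -> int:
    """Number of cysteine (C) residues in a sequence; gap characters are ignored."""
    return sequence.upper().count("C")


# Hypothetical aligned fragments used only to demonstrate the calculation.
seq_a = "ACDC-GKC"
seq_b = "ACECAGRC"
identity = percent_identity(seq_a, seq_b)
print(f"Identity: {identity:.1f}%")
print("Cysteines:", count_cysteines(seq_a), count_cysteines(seq_b))
```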

JAK/STAT pathway
The JAK/STAT pathway is involved in antiviral defense in insects, as well as cell proliferation, differentiation and development in flies such as Drosophila [10]. Viral infection causes upregulation of the cytokine Upd (unpaired), which is a ligand for the receptor Domeless (Dome). Pathway activation ensues after dimeric Dome receptors change conformationally and cause the autophosphorylation, and activation, of the JAK-kinase Hopscotch (Hop). Hop goes on to phosphorylate Dome, which provides STAT docking sites, after which Hop phosphorylates the SH2 domains on recruited STATs. Phosphorylated STAT dimers translocate to the nucleus and induce expression of target genes. We identified all components of the JAK/STAT pathway in the C. sonorensis transcriptome, including STAT [GenBank: GAWM01013279]. The mechanism by which the STAT-induced genes control viral amplification remains unknown, but reverse-genetic approaches have shown that hop mutant Drosophila have higher Drosophila C virus (DCV) loads [7]. Similarly, RNAi knockdown of either dome or hop results in higher susceptibility to dengue virus infection in mosquitoes [8]. Two negative regulators of the JAK/STAT pathway are SOCS (suppressor of cytokine signaling), which prevents STAT activation by binding phosphorylated Hopscotch or by blocking docking sites on Dome receptors, and PIAS (protein inhibitor of activated STAT), which blocks STAT from accessing binding sites upstream of target genes [47]. Complete sequences for the JAK/STAT negative regulators SOCS and PIAS were identified. One of the Culicoides SOCS [GenBank: GAWM01008465] was structurally homologous to Drosophila SOCS36E, which has been confirmed to be a JAK/STAT repressor in flies [48], and the other [GenBank: GAWM01008657] is a possible ortholog of SOCS7. 
The two complete orthologs for the SUMO ligase PIAS [GenBank: GAWM01011450; GenBank: GAWM01011451] contain all the domains associated with the transcription-blocking functions of this inhibitor [49] (Table 3).

Other immune related genes
Other humoral immune components and effectors were found in the transcriptome, including hemolymph defense molecules such as thioester-containing proteins (TEPs) and prophenoloxidase (PPO). Insect TEPs are active in the systemic response to invasive microbes, and help in opsonization for subsequent clearance by phagocytosis [50]. We identified two TEP3 orthologs [GenBank: GAWM01009528; GenBank: GAWM01016118] in C. sonorensis. In mosquitoes, TEP3 has been shown to be involved in both the antibacterial and antiparasitic (antimalarial) defense [50]. PPO zymogen is stored in insect hemocytes and is activated via a serine protease cascade to phenoloxidase (PO) in response to microbial challenge. PO enzymes oxidize phenols to orthoquinones, which polymerize into melanin, and this results in melanization of invading microbes or wounds [4]. Two complete PPO paralogs were found in the Culicoides transcriptome [GenBank: GAWM01015170; GenBank: GAWM01010754]. Mosquitoes have from nine (A. gambiae) to ten (Ae. aegypti) genes coding for PPOs, and members of this family have been implicated in refractoriness to Plasmodium infection in A. gambiae [51].

Dietary effects on transcriptome-wide expression of humoral immune genes
Many of the Imd, Toll and JAK/STAT genes were differentially expressed after female midges fed on blood or sucrose. The humoral immune response to diet is not due to direct stimulation, but rather is likely mediated through alteration of the gut microbial community. Such a circuitous influence on the gut epithelial immune response has been shown in mosquitoes, where diet causes proliferation of gut bacteria, which produce immunostimulatory MAMPs such as peptidoglycan (PGN). The diet promotes bacterial proliferation by two mechanisms: (1) directly, where the meal provides nutrients to support microbial growth or (2) indirectly, where components of the meal, such as free heme in blood, block the activity of reactive oxygen species that otherwise suppress gut flora populations [52,53]. The MAMPs produced by proliferating gut bacteria activate both local responses by binding PRRs on epithelial cells (e.g., Imd responses) and systemic responses in the hemocoel (e.g., Imd or Toll on the fat body), mediated through second messengers [54,55] or by PGN diffusing into the hemolymph [56,57]. A tripartite interaction between gut microbes, innate immune responses and vector competence for pathogens has been demonstrated in mosquitoes and other hematophagous insects [27,29,[58][59][60]. In our gene expression analyses of the midge transcriptomes, we found that altered expression of humoral immune genes was more often associated with blood feeding than sugar feeding. Ongoing studies in our laboratory have shown that the blood meal induces proliferation of midge gut bacteria, and these microbial-ecological dynamics are being assessed in more detail (data not shown).
Expression of most of the Imd pathway genes changed significantly after female midges fed on blood or sucrose (P ≤ 0.01; Table 4). After feeding on blood or sucrose, three PGRP genes were upregulated and three were downregulated, showing no clear pattern of response for these receptors. 
With regard to Imd cell signaling components, all were significantly upregulated after either early or late blood feeding (or both) except for TAB2 and TAK1, which were not differentially expressed. kenny and ird5 were downregulated in late blood-fed midges relative to expression levels after early blood feeding (P ≤ 0.001; Table 4).
Genes involved with feedback modulation of the Imd response were also differentially expressed in blood-fed midges. In early blood-fed midges, the negative regulators pirk and caspar were downregulated, which would permit early transcription of Imd-response target genes, including AMPs [35]. Expression of the transcription factor relish was downregulated in early blood-fed midges (Table 4, Figure 2), which may represent a feedback mechanism to modulate the amplitude of the immune response. Transcripts coding for the scavenger amidase PGRP-SC2/SC3 were upregulated in late blood-fed midges, where this amidase may act as a negative regulator to suppress excessive Imd stimulation [35]. All five AMP target genes were differentially expressed following blood feeding (Table 4, Figure 2). Genes for the AMPs attacin-like and attacin were both highly upregulated after blood feeding, but differed in their temporal expression patterns, with attacin-like being late-induced and attacin being early-induced (Table 4, Figure 2). Sucrose feeding also caused upregulation of each gene, with similar patterns in temporal expression. Both defensin paralogs were upregulated in midges in the early sucrose-fed transcriptomes, but they differed in their upregulation in response to blood feeding, with one being early-blood and one being late-blood induced (Table 4, Figure 2). Expression of cecropin was upregulated by both sucrose and blood feeding, with early responses being significantly higher than late (P ≤ 0.009; Table 4).
Figure 2 Transcriptome-wide differential expression analyses of selected Culicoides sonorensis Imd and antimicrobial peptide genes. Early transcriptomes are 2, 6, 12 h post ingestion (pooled) and late transcriptomes are 36 h post ingestion, for each diet (blood or sucrose). Teneral midges were newly emerged and unfed. Log10 FPKM values are indicated in the legend of the heat map. Further description of these genes can be found in Table 1, and fold-change values and statistics can be found in Table 4.
Many Toll pathway components were differentially expressed in midges following blood or sucrose feeding (Table 5). PGRP-SA was downregulated in both early and late blood-fed midges, but most of the other upstream signaling components were significantly upregulated (P ≤ 0.01) in either blood-fed or sugar-fed midges (except one of the GNBP1 orthologs; Table 5, Figure 3). In contrast, toll receptors and dorsal transcription factors were downregulated after blood feeding (Table 5). The signaling components myd88, pelle, tube and cactus were all upregulated in early blood-fed midges. Systemic responses to conditions in the gut suggest that there is communication between these two body compartments in midges. Further, the expression patterns of Toll components could be a glimpse into feedback mechanisms designed to quell the systemic response to proliferating gut microbiota, which are immunostimulatory yet are not invasive or threatening to the midge.
The negative regulators of the JAK/STAT pathway, SOCS and PIAS, were upregulated in midges early after blood feeding (P ≤ 0.000009; Table 6). In addition, although hop was upregulated after early blood feeding, expression of STAT transcription factors was downregulated in early blood-fed midges. Except for the dome receptors, expression levels of all JAK/STAT components returned to baseline (teneral) levels at 36 h post ingestion (Table 6). Suppression of the JAK/STAT pathway by blood feeding alone could play an important role in the infection success of arboviruses present in the blood meal, since the expression of some antiviral genes is regulated by STAT [8]. We are currently investigating whether this early downregulation occurs locally in gut epithelial cells, which serve as the midge's primary line of defense against arboviruses.
Expression of other systemic immune components also changed significantly after blood feeding (P ≤ 0.00001). This included tep3 [GenBank: GAWM01016118], which was downregulated nearly 4-fold early after blood feeding but then returned to baseline expression levels 36 h after the blood meal, and two genes for prophenoloxidase (ppo). One ppo gene [GenBank: GAWM01010754] was upregulated nearly 9-fold in early blood-fed midges and over 1000-fold in late blood-fed midges. However, the other C. sonorensis ppo [GenBank: GAWM01015170] was downregulated over 16-fold in early blood-fed midges before returning to the baseline expression level at 36 h post-blood feeding.
Figure 3 Transcriptome-wide differential expression analyses of selected Culicoides sonorensis Toll genes. Early transcriptomes are 2, 6, 12 h post ingestion (pooled) and late transcriptomes are 36 h post ingestion, for each diet (blood or sucrose). Teneral midges were newly emerged and unfed. Log10 FPKM values are indicated in the legend of the heat map. Further description of these genes can be found in Table 2, and fold-change values and statistics can be found in Table 5.
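As an illustration of how fold-change statements such as "9-fold up" or "16-fold down" relate to FPKM values, the following is a minimal sketch only. The function name, the pseudocount and the FPKM numbers are hypothetical and are not taken from Tables 4-6; the reported statistics come from the study's differential expression analysis, not from this calculation.

```python
def signed_fold_change(fpkm_fed, fpkm_teneral, pseudocount=1e-3):
    """Return a signed fold change relative to the teneral baseline.

    Positive values mean upregulation (e.g. 9.0 = 9-fold up);
    negative values mean downregulation (e.g. -16.0 = 16-fold down).
    The pseudocount (an assumption of this sketch) avoids division by
    zero when a transcript is undetected in one condition.
    """
    ratio = (fpkm_fed + pseudocount) / (fpkm_teneral + pseudocount)
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical FPKM values chosen only to mirror the magnitudes in the text.
up = signed_fold_change(90.0, 10.0, pseudocount=0.0)    # 9-fold up
down = signed_fold_change(1.0, 16.0, pseudocount=0.0)   # 16-fold down
print(up, down)
```

The signed convention keeps up- and downregulation on a symmetric scale, which is one common way to tabulate such values; a log2 ratio would serve the same purpose.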

AMP expression in the gut of female C. sonorensis
As a follow-up to the transcriptome-wide analysis of innate immune gene expression in female C. sonorensis, we performed tissue-specific qRTPCR analysis of AMP expression in the alimentary canal ("gut"). The aim was to more finely assess the temporo-spatial expression of these effector genes after blood and sucrose feeding. In congruence with our whole-midge transcriptome-wide expression analyses (Table 4, Figure 4), blood feeding resulted in upregulation of all five AMPs in the gut (Figure 4). The attacin-like AMP was significantly upregulated in late blood-fed midges, while attacin was upregulated early and sustained through 24 h post-blood ingestion (Figure 4A and B, respectively). On a whole-midge level, the two defensin genes showed different patterns of expression, with defensin m.9997 being upregulated early after blood feeding and defensin m.9998 being induced late after blood feeding (Table 4). However, in the gut-specific qRTPCR analysis, both defensin genes showed similar patterns of upregulation after blood feeding, and the fold-increase was significantly different from teneral midge expression levels at 12 and 24 h after blood feeding (Figure 4C and D). This suggests that the differential expression patterns seen in the transcriptome-level analyses would be attributable to tissues other than the alimentary canal, possibly the fat body. Both local (gut) and systemic (fat body) defensin responses to the ingested blood meal have been reported in other hematophagous arthropods [61][62][63]. Midge cecropin was upregulated at all four time points after blood ingestion, and expression levels were significantly different from teneral midges at 12 and 24 h post-blood feeding (Figure 4E). Sucrose feeding did not result in significant upregulation of attacin-like, attacin or cecropin in the alimentary canal, but did induce expression of both defensin genes.
The expression of these AMPs after the blood meal is likely a consequence of altered microflora populations, whose proliferation would have an immunostimulatory effect on the gut epithelial cells. Intriguingly, this suggests that blood feeding alone indirectly impacts the conditions of the gut and, putatively, the vector competence of midges for pathogens in the blood meal.
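The relative AMP expression values in Figure 4 were calculated by the method of Pfaffl [31], with teneral midges as the calibrator and EF1b as the reference gene. A minimal sketch of that efficiency-corrected ratio is given below; the function name, the amplification efficiencies and the threshold-cycle (Ct) values are hypothetical and do not come from the study's data.

```python
def pfaffl_ratio(e_target, ct_target_calibrator, ct_target_sample,
                 e_ref, ct_ref_calibrator, ct_ref_sample):
    """Efficiency-corrected relative expression ratio (Pfaffl model).

    ratio = E_target ** (Ct_target_calibrator - Ct_target_sample)
            / E_ref ** (Ct_ref_calibrator - Ct_ref_sample)

    E is the amplification efficiency per cycle (2.0 = perfect doubling).
    A ratio above 1 means the target gene is upregulated in the sample
    relative to the calibrator after normalizing to the reference gene.
    """
    delta_target = ct_target_calibrator - ct_target_sample
    delta_ref = ct_ref_calibrator - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Hypothetical example: the AMP crosses threshold 3 cycles earlier in
# blood-fed guts than in teneral midges, while the reference is unchanged.
ratio = pfaffl_ratio(
    e_target=2.0, ct_target_calibrator=28.0, ct_target_sample=25.0,
    e_ref=2.0, ct_ref_calibrator=20.0, ct_ref_sample=20.0,
)
print(ratio)
```

With both efficiencies equal to 2.0 the expression reduces to the familiar 2^(-ΔΔCt) calculation; measured efficiencies would normally be substituted for each primer pair.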

Conclusions
We demonstrated conservation of humoral immune components in the three major immune pathways of insects (Imd, Toll and JAK/STAT) in the C. sonorensis transcriptome. We have also provided insight into these defense pathways in the midge by examining their patterns of expression on a transcriptome-wide level. We showed that blood feeding alone greatly impacts the expression of many components of these pathways, most importantly effector molecules such as AMPs, PPO and TEPs, which may be directly involved with the midge's vector competence for pathogens. This knowledge allows us to take the next steps in assessing function by utilizing reverse-genetics (e.g., RNAi) approaches to more clearly define the role of the innate immune system in midge permissiveness or refractoriness for pathogens. Such studies will be aimed at revealing novel transmission-blocking and disease intervention strategies.
In this study, we did not explore the other arthropod immune and defense response components including the DUOX and JNK pathways, components of which have been found in our transcriptome but have not yet been completely characterized. These pathways as well as other defense systems such as iron sequestration, melanization and cellular responses will be an important focus of future studies aimed at fully characterizing the immune repertoire of this important vector species.

Additional files
Additional file 1: Primer sequences used for qRT-PCR analyses of antimicrobial peptide gene expression in female C. sonorensis alimentary canal.
Additional file 2: Antimicrobial peptide (AMP) expression analysis using REST-MCS©. Midges were fed blood or sucrose and processed as described in the text for qRTPCR of AMP gene expression, with three biological replicates (shown). A pairwise fixed allocation randomization test was performed using REST-MCS© to analyze AMP gene expression. P-values are for comparison to the calibrator state (teneral, unfed whole female midges) using the reference gene EF1b. Statistically significant P-values are shown in yellow. Red and blue represent upregulation and downregulation of target genes, respectively, and grey indicates that the threshold cycle was not crossed within 40 cycles (thus, no detectable expression).

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
DN conceived the study; DN, MBL, CAS performed the experiments and contributed to data analysis, interpretation and manuscript writing and editing. Specifically, CAS and DN performed the bioinformatic analyses, and MBL performed the alignments and qRTPCR. All authors read and approved the final version of the manuscript.

Figure 4 qRT-PCR analysis of midgut antimicrobial peptide (AMP) expression in female Culicoides sonorensis. Female midges were fed blood or sucrose and midguts (n = 15 per time-point) were dissected at 3, 8, 12, and 24 h after feeding. Relative AMP expression was determined using the methods of Pfaffl [31] for attacin m.3140 (A), attacin m.7821 (B), defensin m.9998 (C), defensin m.9997 (D), and cecropin m.10000 (E), with teneral midges serving as the calibrator condition and incorporating the reference gene EF1b. Error bars represent SEM taken from three biological replicates. Asterisks denote change in expression from baseline (teneral) levels (P < 0.05). n.d., not determined (i.e., threshold cycle not crossed within 40 cycles). The details of each of the three biological replicates, including P-values, are available in Additional file 2.