Isolation and molecular characterization of a major hemolymph serpin from the triatomine, Panstrongylus megistus

Background Chagas disease kills about 2,500 people per year among the 15 million persons infected in Latin America. The disease is caused by the protozoan Trypanosoma cruzi and vectored by triatomine insects, including Panstrongylus megistus, an important vector in Brazil. Medicines for treating Chagas disease have unpleasant side effects and may be ineffective; alternative control techniques are therefore required. Knowledge of the interactions of T. cruzi with the triatomine host needs extending, and new targets and strategies for control need to be identified. Serine and cysteine peptidases play vital roles in protozoan life cycles, including invasion and entry of T. cruzi into host cells. Peptidase inhibitors are, therefore, promising targets for disease control.
Methods SDS-PAGE and chromatography detected and isolated a P. megistus serpin, which was peptide sequenced by mass spectrometry. A full amino acid sequence was obtained from the cDNA and compared with other insect serpins. Reverse transcription PCR analysis measured serpin transcripts in P. megistus tissues with and without T. cruzi infection. Serpin homology modeling used the Swiss Model and Swiss-PDB viewer programmes.
Results The P. megistus serpin (PMSRP1) has a ca. 40 kDa molecular mass with 404 amino acid residues. A reactive site loop contains a highly conserved hinge region but, based on sequence alignment, the normal cleavage site for serine proteases at P1-P1′ was translocated to the putative position P4′-P5′. A small peptide obtained corresponded to the C-terminal 40 amino acid region. The secondary structure of PMSRP1 indicated nine α-helices and three β-sheets, similar to other serpins. PMSRP1 transcripts occurred in all tested tissues but were highest in the fat body and hemocytes. Levels of mRNA encoding PMSRP1 were significantly modulated in the hemocytes and stomach by T. cruzi infection, indicating a role for PMSRP1 in the parasite interactions with P. megistus.
Conclusions For the first time, a constitutively expressed serpin has been characterized from the hemolymph of a triatomine. This opens up new research avenues into the roles of serine peptidases in the T. cruzi/P. megistus association. Initial experiments indicate a role for PMSRP1 in T. cruzi interactions with P. megistus and will lead to further functional studies of this molecule.


Background
In 1909, Carlos Chagas identified a new human disease and its infectious biological agent, Trypanosoma cruzi [1]. He described the parasite's life cycle in a wild mammalian host as well as the insect vector, a hemipteran from the family Reduviidae, subfamily Triatominae [2] that was classified as Panstrongylus megistus. This species is highly adaptable to a variety of ecosystems and has now become one of the most important vectors of Chagas disease in Brazil [3].
Subsequently, descriptions of more than 140 species of triatomines, distributed in 15-19 genera have been reported [4][5][6] but the genera most studied, due to their association with Chagas' disease transmission, are Triatoma, Panstrongylus, Rhodnius and Dipetalogaster [7]. The biochemical and physiological similarities and differences of these vectors require further detailed research in order to more fully understand the factors responsible for the specificity occurring in various trypanosome/triatomine interactions [8].
Triatomines are exclusively hematophagous, hemimetabolic insects, passing through five nymphal instars before emerging as adults [9,10]. One triatomine vector species, Rhodnius prolixus, is easily raised in the laboratory and also usually molts following each blood meal so that these characteristics have contributed to this insect becoming a model for physiological studies [9].
In 2005, the National Human Genome Research Institute (NHGRI) designated R. prolixus as an important organism for genome sequencing [11]. The knowledge of the insect physiology accruing from such genome information could be used to identify targets for inhibiting the vectorial competence of triatomines and consequently to control Chagas disease [12]. Some genes related to triatomine immunity have already been identified [13], including defensin [14][15][16], lysozyme [14,17,18], prolixicin [19] and components of the Rel/Nuclear Factor kappa B family [20]. However, despite these studies, our knowledge of the regulation of physiological pathways in triatomines, such as those controlled by the protease cascades that activate reproduction, development and immunity, is strictly limited. In insect immunity, the prophenoloxidase system and recognition of pathogen-associated molecular patterns (PAMPs) involve protease cascades which are triggered when the host recognition receptors bind to PAMPs, including peptidoglycans, lipophosphoglycans or 1,3-beta-glucans [21][22][23][24][25].
Protease inhibitors generally regulate all protease cascades with serpins being the largest and most widely studied family of such inhibitors [23]. Serpins are found in both prokaryotes and eukaryotes and are involved in many biological processes, including the regulation of innate immune reactions in insects [22,24,25]. Serpins are typically composed of 350-400 amino acid residues and contain an exposed reactive center loop (RCL) which binds to the active protease site [26,27]. Serpins can adopt different conformational states and can either be active and stressed (native form) or inactive and relaxed (latent form). The native state is unstable with the RCL exposed and poised to interact with the target protease. After interaction, there is cleavage of the scissile bond (P1-P1′) and the RCL becomes linked to the protease covalently and, assisted by the breach and shutter regions, inserted into the β-sheet A to stabilize the structure (e.g. [28]). Serpins in their latent states can also be cleaved but remain stable and inactive. When highly concentrated, serpins can acquire inactive polymeric structures [27,29,30].
Many sequences of insect serpins have been deposited in public databases [24], but in triatomines, as far as we are aware, this kind of protease inhibitor has only been described in a truncated sequence of a contig from the cDNA library of the Triatoma infestans sialome [31].
Although there is a lack of information on serpins in triatomine/T. cruzi associations, a role for these molecules in other insect vector/parasite interactions has already been established. Thus, in mosquito/Plasmodium and tsetse fly/African trypanosome interactions, several serine proteases and their inhibitory serpins have been described, and details of their roles in controlling the Toll and prophenoloxidase immune activation pathways are emerging [32,33]. The important role of serpins in the life cycle of T. cruzi is, however, implied by the detection of multiple serine protease genes in this flagellate [34] and by the role of serine proteases in the invasive stage of the parasites in the mammalian phase of the life cycle. Thus, the prolyl oligopeptidase family of serine proteinases (oligopeptidase B and Tc-80) has been shown to be involved in parasite adhesion and entry into host cells [35]. More recently, Oliveira et al. [36,37] described the presence of heparin binding proteins (HBPs) on the surface of all stages of T. cruzi which modulate the attachment of the parasites to glycosaminoglycans on both mammalian and insect cells. These HBPs were shown to be localized at the flagellar membrane and, subsequently, to have serine protease activity [36,37]. Such serine proteases are usually associated with complex cascades that amplify signals and are controlled by serine protease inhibitors [24].
The present paper derives from an initial comparative appraisal of the hemolymph protein profiles of vector insects from the genera Triatoma, Panstrongylus, Rhodnius and Dipetalogaster that led to the discovery of a major serpin in the hemolymph of P. megistus. Here, we describe the purification and characterization of this serpin as well as a putative role for this molecule in the interaction of T. cruzi with its P. megistus host. The identification of such a potentially important host factor could assist manipulation of the vector physiology to block T. cruzi development or even to compromise the ability of the vector insect to resist disease.

Ethics statement
The animals used to maintain the insects at FIOCRUZ were treated according to the Ethical Principles in Animal Experimentation approved by the Ethics Committee in Animal Experimentation (CEUA/FIOCRUZ) under protocol numbers P-54/10-4/LW12/11 and L-0061/08. Both protocols are from CONCEA/MCT (http://www.cobea.org.br/), which is associated with the American Association for Animal Science (AAAS), Federation of European Laboratory Animal Science Associations (FELASA), International Council for Animal Science (ICLAS) and Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC).

Insects and hemolymph collection
The insects used in the experiments were fifth instar nymphs of the following species: Dipetalogaster maximus, Triatoma infestans, P. megistus, Panstrongylus lutzi, R. prolixus, Rhodnius neglectus and Rhodnius brethesi from colonies established and maintained at Laboratório de Doenças Parasitárias, FIOCRUZ, as described by Carvalho-Moreira et al. [38]. The insects were fed on anesthetized chickens until the third instar, and thereafter fourth instar nymphs received citrated rabbit blood, from CECAL-FIOCRUZ-RJ, using an artificial apparatus [39].
Hemolymph was collected from fifth instar nymphs, 5 days after feeding from insects anesthetized on ice and carefully cleaned with 70% ethanol, by excising the metathoracic legs and gently pressing the abdomen. Drops of hemolymph were collected with micropipettes and pooled in Eppendorf tubes, on ice, containing a few crystals of phenylthiourea (Sigma-Aldrich, St. Louis, MO, USA) to prevent melanization [40]. The hemolymph was then centrifuged at 5000 × g for 5 min and the supernatants stored at −20°C until use.

Electrophoresis of hemolymph samples
SDS-PAGE (14% to 16%) was performed on a Mini-Protean II system (Bio-Rad, Hercules, CA, USA) under reducing conditions, using 4% stacking gels [41]. Staining was carried out with colloidal Coomassie Brilliant Blue [42] or with silver nitrate [43]. To compare the profiles of the hemolymph protein of triatomines, 0.1 μl hemolymph samples from each species were used. The molecular mobilities of proteins were determined by interpolation from mobilities of commercial pre-stained standards (Sigma-Aldrich) by computer analysis. Proteins from the hemolymph and samples from chromatography were quantified with a protein test kit (Sigma-Aldrich) using bovine serum albumin (BSA) standards [44]. For preliminary activity assays, enriched samples were also submitted to low pressure molecular exclusion chromatography with Sephacryl 100 in a 1.0 × 100 cm, glass Econo-Column (Bio-Rad), at 4°C. The column was washed and eluted with PBS (0.14 M NaCl in 0.01 M phosphate buffer, pH 7.2, 380 mOsm) and equilibrated with Cytochrome C and vitamin B12 standards. The column was loaded with 0.5 ml of sample, and 0.75 ml fractions collected after 25 ml of discarded void volume. Protein concentrations of fractions were determined and the protein profiles analyzed by SDS-PAGE, as described above.
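The interpolation of molecular mass from electrophoretic mobility can be sketched as a least-squares fit of log10(mass) against relative mobility (Rf). This is only an illustration of the general approach, not the software used in the study; the function names and the standard Rf values below are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_mass_kda(rf, standards):
    """Interpolate a mass (kDa) from Rf using (mass_kda, rf) standards.

    Assumes log10(mass) is approximately linear in Rf over the range
    covered by the standards.
    """
    rfs = [r for _, r in standards]
    log_masses = [math.log10(m) for m, _ in standards]
    slope, intercept = fit_line(rfs, log_masses)
    return 10 ** (slope * rf + intercept)

# Hypothetical standards: masses span 14.4 to 45 kDa, Rf values are invented.
STANDARDS = [(45.0, 0.30), (30.0, 0.48), (20.1, 0.65), (14.4, 0.80)]

if __name__ == "__main__":
    print(round(estimate_mass_kda(0.37, STANDARDS), 1))
```

Restricting the standards to a narrow mass range around the band of interest usually keeps the log-linear assumption closer to valid than fitting across the whole gel.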

Trypsin digestion of selected SDS-PAGE bands
The ca. 40 kDa band (Figure 1A-I) and the low molecular mass (less than 10 kDa) band (Figure 1A-II) detected by SDS-PAGE analysis of the protein purified in the previous step were submitted to tryptic digestion [45]. The eluted peptides were completely dried on a Speed-Vac concentrator (Thermo Scientific, Waltham, MA, USA), resuspended in 12 μl of a 1% (v/v) formic acid solution and kept at −20°C until mass spectrometric analyses, described below.

Peptide sequencing by high resolution mass spectrometry
The tryptic peptide digests were submitted to reversed-phase nanochromatography coupled to nanoelectrospray high resolution mass spectrometry for identification. Peptides were eluted online in a LTQ XL/Orbitrap mass spectrometer (Thermo Scientific) for analysis. MS1 spectra were acquired on the Orbitrap analyzer (300 to 1,700 m/z) at a 60,000 resolution. For each spectrum, the 10 most intense ions were submitted to HCD (higher energy collisional dissociation) followed by MS2 acquisition on the Orbitrap analyzer at 15,000 resolution.
The raw data files generated from the duplicate mass spectrometric analyses were submitted to PEAKS version 6.0 build 20120620 (Bioinformatics Solutions Inc., Canada) for protein identification [46][47][48][49]. Peptide-spectrum matching (PSM) was performed against the non-redundant FASTA database of the National Center for Biotechnology Information (NCBInr, downloaded on September 4th, 2012) using Metazoa as taxonomical restriction and data were filtered for a 1% FDR at the peptide level. The mass spectra that did not yield any PSM according to Peaks DB but had high scoring de novo results were submitted to a similarity-driven search against the full NCBInr database using an in-house tool called PepExplorer. This tool is currently under development in the Laboratory for Proteomics and Protein Engineering (ICC-FIOCRUZ, Brazil). Briefly, it relies on the Smith-Waterman algorithm [50], the Peaks ALC de novo scores and machine learning to compose a final identification list.
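The Smith-Waterman local alignment mentioned above can be illustrated with a minimal scoring routine. This is not the PepExplorer implementation, which is not described in detail here; the function name and the simple match/mismatch/gap scores are assumptions, whereas real tools typically use an amino acid substitution matrix.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the dynamic
    programming matrix, since only the score (not the traceback) is needed.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

if __name__ == "__main__":
    # A de novo peptide compared with a longer hypothetical database sequence.
    print(smith_waterman_score("WATQFNPSLTK", "MKWATQFNPSLTKLL"))
```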
After the complete sequence of the protein, described in the present article, was obtained by molecular biology, it was added to the full Swiss-Prot database (downloaded on November 15th, 2012) and the PEAKS analysis was repeated against this new database, maintaining all the other parameters as previously stated.

Sequencing of P. megistus serpin (PMSRP1) cDNA
For the identification of PMSRP1 encoding cDNA, midgut and fat body of five P. megistus fifth instar nymphs at seven days after feeding were dissected. Total RNA was isolated using the NucleoSpin RNA II Kit (Macherey-Nagel, Düren, Germany), according to the manufacturer's protocol. First strand cDNA was synthesized using the 3′-RACE Kit (Invitrogen, Karlsruhe, Germany), following the manufacturer's instructions. For the subsequent PCR, the degenerate oligonucleotide primer PMSRP-F1 (5′-TGGGCNACNCARTTYAAYCC-3′), based on the amino acid sequence WATQFNP obtained by mass spectrometry, and the reverse abridged universal amplification primer were used. The PCR fragments obtained were cloned into the pGEM-T Easy vector (Promega, Madison, WI, USA). Each clone was sequenced at least three times in both directions by Plataforma Genômica, Sequenciamento de DNA/PDTIS-FIOCRUZ, Rio de Janeiro, Brazil. Based on this first sequence, the oligonucleotides PMSRP-GSP1 (5′-CAGGTAAACCTGTAAT-3′), PMSRP-GSP2 (5′-CTTCATTTGAATATGGAAGCTCT-3′) and PMSRP-GSP3 (5′-GGAAGGTGGTTAAATGGAAATA-3′) were designed for the amplification of the 5′-end using a 5′ RACE Kit Version 2.0 (Invitrogen) according to the manufacturer's instructions. The resulting amplicons were cloned and sequenced, as described above.
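The relationship between the peptide WATQFNP and the degenerate primer PMSRP-F1 can be checked with a short back-translation sketch using IUPAC ambiguity codes. The table and function names are illustrative assumptions, not the authors' design tool; the codes for Leu, Arg and Ser are the usual over-inclusive single-codon approximations.

```python
# One IUPAC-degenerate codon per amino acid (standard genetic code).
# Leu (YTN), Arg (MGN) and Ser (WSN) over-cover their true codon sets.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}

# Number of bases each IUPAC symbol used above can stand for.
SYMBOL_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "W": 2, "S": 2, "H": 3, "N": 4,
}

def back_translate(peptide):
    """Return the degenerate DNA sequence encoding a peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide)

def degeneracy(primer):
    """Count the distinct oligonucleotides represented by a degenerate primer."""
    total = 1
    for base in primer:
        total *= SYMBOL_DEGENERACY[base]
    return total

if __name__ == "__main__":
    full = back_translate("WATQFNP")
    # Dropping the final, fully degenerate base gives the 20-mer primer.
    print(full[:-1], degeneracy(full[:-1]))
```

Trimming the last wobble position, as in PMSRP-F1, leaves a 3′ end without ambiguity and reduces the degeneracy of the primer pool fourfold.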

Sequence analysis of PMSRP1
Sequence identity of the PMSRP1 encoding cDNA and comparison with the GenBank database was assessed using the BLASTX program at the web servers of the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) [51]. Derived serpin amino acid sequences were aligned using ClustalW version 2.1, with minor manual corrections [52]. Putative signal peptide cleavage sites were calculated with SignalP Version 4.0 [53]. The predicted isoelectric point and molecular mass were determined with the Compute pI/MW tool at http://expasy.org/tools. The secondary structure was predicted using the J-pred program (http://www.compbio.dundee.ac.uk/~www-jpred/).

Reverse transcription (RT)-PCR analysis of P. megistus tissues
Total RNA was extracted, using the NucleoSpin RNA II Kit (Macherey-Nagel), from salivary glands (SG), stomach (ST, = crop or anterior midgut), small intestine (SI, = posterior midgut), fat body (FB) and hemocytes (HC) of P. megistus fifth instar nymphs (n = 10) at 7 days after feeding with heat-decomplemented rabbit blood containing 2 × 10⁶ cells/ml T. cruzi strain Dm28c. Control insects were fed on blood without parasites. P. megistus gDNA was extracted from stomach tissue of five insects using the Wizard SV Genomic DNA Purification Kit (Promega). Prior to dissection, insects were immersed in water at 55°C for 15 s to detach hemocytes from other tissues [54]. First-strand cDNA was synthesized from 1-3 μg total RNA using the First-Strand cDNA Synthesis Kit (GE Healthcare, Buckinghamshire, UK) according to the manufacturer's protocol. To verify that no genomic DNA remained, the gene encoding T. brasiliensis defensin 1 (def1), which contains an intron of 107 bp, was initially amplified as a cDNA control [14,16].
PCRs were carried out to detect relative levels of serpin cDNAs using the specific primers PMSRP-RT-F (5′-GAATTGCTGAGAATTTGTATGC-3′) and PMSRP-RT-R (5′-ATGTTGAAGAACTTTAAACATTG-3′) under the following cycling conditions: 94°C for 25 s; 52°C for 25 s; 72°C for 30 s, yielding amplicons of 301 bp. The cycle numbers (25, 30 and 35) were empirically optimized to exclude signal saturation. PCRs were undertaken three times under the same conditions using technical replicates. For an internal control and standardization, the gene encoding β-actin was amplified, as described previously [55,56]. As negative controls, PCR reactions were carried out lacking a template. Amplification products (5 μl) were separated on an ethidium bromide stained 2% agarose gel and documented with an EDAS 290 gel documentation system (Kodak, Rochester, NY, USA). Band intensity was measured with the ImageJ program (version 1.47v). Means and standard deviations of the different samples were calculated. One-way ANOVA and Student's t-tests were carried out to evaluate significant differences in the various tissues and between infected and non-infected insects. All nucleic acid experiments were performed on a Veriti 96-Well Fast Thermal Cycler (Applied Biosystems, Carlsbad, CA, USA). For verification of primer specificity, all PMSRP1 amplicons obtained were purified and sequenced as described above.
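The standardization of band intensities against the actin control, and a signed fold change between infected and control insects, might be computed along the following lines. This is a sketch under assumptions: the function names are hypothetical, the intensity values are invented, and the signed fold-change convention (increase positive, decrease negative) is inferred from the description of the results figure rather than taken from the authors' calculations.

```python
from statistics import mean, stdev

def relative_abundance(target_intensities, actin_intensities):
    """Normalise each target band intensity to its paired actin band."""
    return [t / a for t, a in zip(target_intensities, actin_intensities)]

def signed_fold_change(infected_mean, control_mean):
    """Fold increase as a positive value, fold decrease as a negative value."""
    if infected_mean >= control_mean:
        return infected_mean / control_mean
    return -control_mean / infected_mean

if __name__ == "__main__":
    # Invented densitometry readings for three technical replicates.
    control = relative_abundance([900.0, 950.0, 1000.0], [2000.0, 2000.0, 2000.0])
    infected = relative_abundance([450.0, 500.0, 475.0], [2000.0, 2000.0, 2000.0])
    print(mean(control), stdev(control))
    print(signed_fold_change(mean(infected), mean(control)))
```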

Construction of the PMSRP1 model
Initially, the homology model of serpin was constructed as described by Abreu et al. [57], using the Swiss Model and Swiss-PDB viewer programs available at http://swissmodel.expasy.org/ and http://www.expasy.org/spdbv/, respectively [58,59]. The set of structurally conserved regions (SCRs) was constructed based on the crystal structure of the serpin from Tenebrio molitor (PDB entry code 3OZQ). The T. molitor serpin structure (1.9 Å crystal resolution) did not have a reactive center loop (RCL), which was therefore built based on serpin B3 (PDB = 2ZV6) with a root mean square deviation (RMSD) of 1.34 Å. Blocks of structurally conserved regions were identified and the structure alignment of the serpin sequences was generated. Coordinates for all residues were transferred to the serpin sequence and loops were constructed in a single round. Several cycles of constrained energy minimization regularized the structures and their geometrical parameters. In subsequent runs, the serpin model was minimized and validated [60]. The prediction of the electrostatic potential map (MEP) was also performed in the Swiss PDB viewer program. It was generated in the range from −25.0 (deepest red color) to +30.0 (deepest blue color) kcal/mol and superimposed onto a molecular surface of constant electron density of 0.002 e/au³. Each point of the three-dimensional molecular surface map expresses the electrostatic interaction energy value evaluated with a probe atom of positive unitary charge, providing an indication of the overall molecular size and location of attractive (negative) or repulsive (positive) electrostatic potentials shown in red and blue, respectively.
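The RMSD values quoted for the model are a measure of the average distance between equivalent atoms in two structures. A minimal sketch of the calculation follows, assuming the two coordinate sets are already superposed and listed in matching order; the function name and the toy coordinates are hypothetical, and the modeling software performs the superposition step itself.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized sets of
    (x, y, z) coordinates, in the same units as the input (e.g. angstroms).

    Assumes the structures are already superposed and atoms are paired
    by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    template = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
    print(rmsd(model, template))
```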

Hemolymph protein profile and pre-purification
In the present study, a comparative appraisal of the hemolymph protein profiles of triatomine vector species was initially undertaken in order to identify potential hemolymph factors that could be responsible for variations in triatomine vector competence to transmit T. cruzi and Trypanosoma rangeli [13]. The results showed that the protein profiles from the hemolymph of D. maximus, T. infestans, P. megistus, R. prolixus, R. neglectus and R. brethesi had similar banding patterns in the SDS PAGE gels above 60-70 kDa ( Figure 2). However, only in the P. megistus hemolymph was there a major band with a molecular mass of ca. 40 kDa (Figure 2). The other species also had bands around 40-50 kDa, but none of them represented a major hemolymph protein. Subsequently, a similar band of ca. 40 kDa was also detected in the hemolymph from another Panstrongylus species, P. lutzi (Additional file 1: Figure S1).
The initial purification of the P. megistus ca. 40 kDa hemolymph protein by anion exchange chromatography showed that the fraction eluted with 0.125 M NaCl was the most highly enriched for the ca. 40 kDa target protein, and this fraction was selected for subsequent analysis. Using an optimized calibration curve based on Rf values for molecular mass standards restricted to the 14.4 to 45 kDa range, in 15% reducing SDS-PAGE (Figure 1A), the molecular mass of the band was recalculated as 38.8 kDa (Figure 1B). This mass is smaller than the 43.1 kDa predicted for the full-length protein from the molecular biology data (see below). A possible explanation for this discrepancy is provided by the presence of a less intense band of less than 10 kDa at the bottom of the gel (Figure 1A-II). Subsequent mass spectrometric analysis of this band confirmed that it corresponded to the cleaved C-terminal region of PMSRP1 (see below).

Mass spectrometric analyses
The 40 kDa band was excised from SDS-PAGE gels and digested with trypsin followed by analysis with reversed phase nanochromatography coupled online to high resolution mass spectrometry. The generated data were analyzed with PEAKS 6.0 software using an algorithm that combines de novo sequencing, peptide sequence tag and peptide-spectrum match (PSM) against an NCBInr database restricted to Metazoa. Unfortunately, no protein was identified other than the common keratin contaminants and trypsin autolysis products. The results produced 1,372 mass spectra that did not yield any PSM according to Peaks DB but had high scoring de novo results. These de novo results were therefore submitted to a similarity-driven search against the full NCBInr database using an in-house tool called PepExplorer. This approach identified the major protein present in the SDS-PAGE band as a member of the SERPIN (P. megistus serine proteinase inhibitor, designated PMSRP1) family ( Table 1). The first seven amino acids of one of the peptides (i.e., WATQFNPSLTK) were chosen for designing primers for further molecular biology assessments (Additional file 2: Figure S2). The annotated de novo sequence for the full peptide is shown in Figure 3, which illustrates the initial and terminal codons and the two polyadenylation signals.

Sequencing of PMSRP1 cDNA
After 3′ RACE with a degenerate forward primer, 5′ RACE, and assembly of the sequences, PMSRP1 encoding cDNA (GenBank accession no. JX894893) of 1606 bp was identified using total RNA from the fat body of P. megistus. PMSRP1 comprised a 19 bp 5′ non-coding region and a 317 bp 3′ non-coding region between the stop codon (TAA) and the first polyadenylation signal (AATAAA). An open reading frame of 1212 bp encoded a deduced pre-protein of 404 amino acid residues with a predicted molecular weight of 45.7 kDa. At the PMSRP1 amino-terminus a putative signal peptide of 24 amino acid residues and a cleavage site after Thr were identified (Figure 3). The amino acid sequence NDVKDLQEITDESNK detected by mass spectrometry and the absence of sequences derived from the signal peptide supported the SignalP predicted amino-terminal end of mature PMSRP1 (Figures 3, 4 and 5). The theoretical molecular mass and isoelectric point of the mature protein were 43.1 kDa and 5.73, respectively (Figure 4).
The comparative analysis of the primary sequence alignment of the thirteen serpins from different insects showed that at position 344-381 of PMSRP1 there was a reactive center loop (RCL), a motif characteristic of serpins. The putative RCL based on the alignment includes the highly conserved flexible hinge region (GTNAAAA) at putative position P15-P9 (G[T/S]X[A/G/S]4) [61], the putative cleavage site (putative P1-P1′), and the PF residues of the "shutter region" (Figure 4). In addition, unlike most similar serpin amino acid sequences from other insects, Glu at putative P13 is substituted by an Asn, but like the serpins from the other insect species, the putative P8-P4′ region is highly variable (Figure 4). Finally, at putative positions P1 and P1′, based on sequence alignment, no cleavage site for serine proteases is present, but chymotrypsin (Tyr) and trypsin (Arg) cleavage sites are present at putative P3′ and P4′, respectively (Figure 4). Thus, the putative position P4′-P5′ based on sequence alignment is the true scissile bond, P1-P1′, of PMSRP1 (Figures 4 and 5). Mature PMSRP1 showed a low level of identity when compared with other serpin sequences available in GenBank. The protein showed the greatest similarity, 38.2%, with the sequence of An. gambiae (XP_314158) and slightly lower similarities of 37.8% and 37.4% with the sequences of L. longipalpis (ABV60345) and C. felis (AAN73325), respectively (Figure 4). Despite this low level of identity, the theoretical prediction of the serpin secondary structure pointed to the conservation of nine α-helices and three β-sheets, similar to other serpin family members [61,62] (Figure 4).
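The hinge consensus G[T/S]X[A/G/S]4 can be expressed as a regular expression and searched for in a protein sequence. The sketch below is illustrative only: the function name is an assumption and the test sequence is an invented fragment containing the GTNAAAA hinge reported for PMSRP1, not the deposited PMSRP1 sequence.

```python
import re

# G, then T or S, then any residue, then four residues each A, G or S.
HINGE_PATTERN = re.compile(r"G[TS].[AGS]{4}")

def find_hinge(sequence):
    """Return (1-based start position, matched motif) for each hinge-like hit."""
    return [(m.start() + 1, m.group()) for m in HINGE_PATTERN.finditer(sequence)]

if __name__ == "__main__":
    # Invented fragment; only the GTNAAAA motif is taken from the text.
    toy_fragment = "MKLGTNAAAAVVEL"
    print(find_hinge(toy_fragment))
```

A hit from such a pattern only flags a candidate hinge; assigning P15-P9 still depends on the alignment with characterized serpins.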
After the full-sequence of the serpin present in the hemolymph of P. megistus was determined by deduction of mRNA sequence analyses, it was added to the full Swiss-Prot database and all the data were run using the PEAKS 6.0 workflow. The PMSRP1 was unambiguously identified with a sequence coverage of 87% ( Figure 5). An important finding was that a 40-residue-long C-terminal region did not yield any peptide that matched its sequence, although tryptic sites (Arg and Lys) were present. Mass spectrometric analysis of the small band detected in the SDS-PAGE ( Figure 1A-II) confirmed that it corresponded to the cleaved C-terminal region of PMSRP1 ( Figure 5) undetected in the amino acid sequence obtained from the band with 38.8 kDa (Figure 1A-I). This supports the concept that the scissile bond, P1-P1′, occurs between the putative P4′ and P5′ (Arg-Ile) positions based on sequence alignment (Figures 4 and 5).

3D-Structure of PMSRP1
In this study, we also constructed the PMSRP1 theoretical model using the T. molitor serpin and the serpin B3 crystal structures as templates (Additional file 3: Figure S3). The comparison of the PMSRP1 model with the templates revealed a highly conserved structure at both secondary (three β-sheets and nine α-helices) and three-dimensional levels (RMSD = 1.34 and 0.49 Å, serpin B3 and T. molitor serpin, respectively) (Figures 4 and 6). The conservation of the core of the structure reinforced the hypothesis of some authors that the serpin scaffold is intolerant to the deletion of all but peripheral elements of secondary structure [61,62]. Thus changes in non-conserved residues that allowed the folding of the serpin into an active native state were favored by the selective pressure.
Similar to other serpins, such as α-antitrypsin, breach, shutter, gate and hinge regions were identified in the PMSRP1 model (Figure 6). In addition, the 40-residue-long C-terminal region is involved in several H-bonds that help in organizing a group of α-sheets in the core of the protein (Figure 6). These H-bonds may help in retaining the C-terminal region in the structure in case of cleavage of the RCL, which is exposed in the serpin model. The potential electrostatic map of PMSRP1 revealed a large negative surface with some positive patches included on the reactive center loop (RCL) corresponding to residues Arg340 and Arg343 (Figure 6).

PMSRP1 transcript abundance in different tissues
Reverse transcription PCR relative to β-actin was used to measure PMSRP1 transcript abundance in different tissues of fifth instar nymphs at seven days after feeding with blood containing T. cruzi or a parasite-free meal (Figure 7). Preliminary amplification of the P. megistus defensin gene showed a single band of about 130 bp in all tissues analyzed and the absence of a 240 bp gDNA band (not shown). Therefore, it was assumed that there was no contamination of the mRNA with genomic DNA. In negative controls, lacking cDNA and carried out for each RT-PCR, no amplification products were detected. After 30 and 35 cycles most PCR products were saturated and, therefore, band intensity of the PCR products after 25 cycles was quantified. PMSRP1 encoding mRNA was abundant in all tested tissues, in both T. cruzi infected and control insects, although in significantly different quantities (Figure 7). Graphical representation reflected the amplicon distribution in the gels of control insects (FB > SG/HC > ST > SI). In the stomach and small intestine, PMSRP1 transcript levels were lowest after 25 cycles. In insects given a T. cruzi infected blood meal 7 days previously, the PMSRP1 transcript abundance decreased significantly in the stomach (P < 0.001) and increased in the small intestine (P < 0.01) (Figure 7C). In the salivary glands, the PMSRP1 transcript abundance was slightly higher than in the intestinal tract but did not differ significantly between infected and uninfected insects. In the hemocytes of control insects, the PMSRP1 transcript abundance was comparable with that in the salivary glands, but, following T. cruzi infection, the PMSRP1 abundance increased significantly in these cells (P < 0.01). The highest PMSRP1 levels were detected in the fat body of the insects but showed no significant differences following infection (Figure 7).
Figure 4 Alignment of deduced amino acid sequence of P. megistus serpin (PMSRP1) compared with different insect serpins. Identical amino acid residues between PMSRP1 and other serpins are shaded grey. The conserved reactive centre loop is boxed, the hinge regions and the PF residues of the "shutter region" are underlined. The putative proteolytic cleavage site, predicted by the alignment, is indicated by P1/P1′. Predicted serine protease cleavage sites are marked in bold typeface. The true P1-P1′ scissile bond of PMSRP1 (putative P4′-P5′) is indicated by an arrow. The secondary structure was predicted by using the Jnet prediction method (http://www.compbio.dundee.ac.uk/~www-jpred/submit.html); β-strands are highlighted in green, α-helices in red and the extended region in blue. The GenBank accession numbers for the analysed sequences are: Culex quinquefasciatus (XP_001865071), Aedes aegypti (XP_001658641), Anopheles gambiae (XP_314158), Ctenocephalides felis (AAN73325), Lutzomyia longipalpis (ABV60345), Bombus impatiens (XP_003487908), Bombus terrestris (XP_003399187), Nasonia vitripennis (XP_001606111), Drosophila melanogaster (CAB63097), Acyrthosiphon pisum (BAH71022), Tribolium castaneum (EFA09186) and Bombyx mori (NP_001037021).

Discussion
Although the triatomines acting as vectors of Chagas disease belong to the same subfamily, Triatominae, the members have unique anatomical, genetic, physiological and ecological characteristics [63]. Morphological differences between the species have been described and are used as taxonomic markers [4]. Such anatomical divergences are often associated with physiological variations specific for each species and occur despite the highly conserved chromosome number in triatomines. Karyotype comparisons between species show that there is cytogenetic variability in the sex chromosome number, chromosomal position of rDNA clusters, genome size and heterochromatin organization [5,64]. These genetic variations result in differences between species in the biochemical composition of the saliva [65,66], in the gut microbiota [67], in the feeding behavior, and in the colonization of specific ecotopes [68]. Variation in profiles of the hemolymph proteins between different species of triatomines is another characteristic resulting from these genetic divergences, as shown in the present paper ( Figure 2). Such factors might influence the epidemiology of Chagas disease [69], and could assist in the reconstruction of the evolutionary history of the triatomines [70].
In the present study, although there were some similarities in the hemolymph banding patterns in the SDS-PAGE gels of the different triatomine species, only in the Panstrongylus spp. was there a major band with a molecular mass of ca. 40 kDa, corresponding to the serpin, PMSRP1. This difference cannot be explained without further investigation of both the level of this band at various stages in the life cycles of triatomines and the possible functions of this serpin in P. megistus. If the results reveal increases in P. megistus hemolymph serine protease levels prior to molting and decreases in PMSRP1 during molting, then this could indicate a role for PMSRP1 in the regulation of cuticle degradation as in Choristoneura fumiferana [71]. Previously, the molting fluids of insects have been shown to contain several proteases, including a trypsin-like protease in M. sexta, probably involved in cuticle breakdown [72,73]. Following molting, these enzymes would need inhibition to protect the newly formed cuticle. This latter hypothesis has gained support from research with the serpins of the spruce budworm, C. fumiferana [71]. Since P. megistus is a particularly long-lived insect with a highly sclerotized and melanized exoskeleton, degradation and shedding of the cuticle may present a particular problem, requiring higher levels of both proteases and associated serpin inhibition.
In addition, the vectorial competence between triatomine species is also highly variable with differences in the multiplication and development of T. cruzi [74] and T. rangeli in the insect hosts [75]. Variability of parasite development in these insects is also promoted by the specificity of parasite strains and clones for each vector species [38,76], which may reflect the differential efficacy of the triatomine immune response in each species.
In P. megistus, the major protein band in the hemolymph, identified by SDS-PAGE, was a serpin. A similar band was also present in P. lutzi hemolymph, confirming that the genus Panstrongylus has some homogeneity [77] although considerable variation in salivary gland proteins has been shown even within P. megistus populations [78]. Recently, a salivary gland serine protease has been described in P. megistus [79,80], as well as Kazal-type and lipocalin inhibitors. The Kazal-type inhibitor, however, was expressed in the stomach but not in the small intestine or hemocytes [81]. Nevertheless, work on P. megistus is in its infancy with only 45 ESTs reported recently as available for this species [64] so that the detection of many new factors, such as the serpin in the stomach, fat body and hemocytes of the present paper, is to be expected. The fact that serpins have not been reported from other triatomine species, except for a truncated cDNA cluster in the sialome of Triatoma infestans [31], may reflect the more recent evolution of this vector to the hematophagous habit and the need to adapt to changes in diet [78].
Figure 7 Relative transcript abundance of PMSRP1-encoding mRNA in different tissues of fifth-instar nymphs after 25 cycles in relation to β-actin. A, Seven days after a blood meal containing T. cruzi strain Dm28c (2 × 10^6 cells/ml). B, Seven days after a parasite-free blood meal (control). C, Fold increase (positive axis) and decrease (negative axis) in transcript abundance of PMSRP1 in infected insects (A) compared with the uninfected control (B). Asterisks relate to significant differences (*P < 0.01, **P < 0.001) obtained after statistical analyses using one-way ANOVA and Student's t-test. SG, salivary glands; ST, stomach; SI, small intestine; FB, fat body; HC, hemocytes.
The amino acid sequence alignment of PMSRP1 with serpins from other insects indicates a low level of identity from 32.9 to 38.2% (Figure 4). Such heterogeneity among serpins is not unusual and can even occur within the same insect species as with the six serpins of Manduca sexta which have less than 40% amino acid sequence identity [82].
Nevertheless, the comparison of the sequence of PMSRP1 with those from other insects revealed several important common features such as the conservation in the reactive center loop (RCL) (Figures 4 and 6). Significantly, inhibitory serpins are characteristically recognized by a consensus sequence in their hinge with P17 (E), P16 (E/K/R), P15 (G), P14 (T/S), P12-P9 (A/G/S) [61,83]. In PMSRP1, the hinge sequence is in putative positions P17 (E), P16 (E), P15 (G), P14 (T), and P12-P9 (AAAA) which exactly matches the inhibitory sequence above. Non-inhibitory serpins would deviate from this sequence, be unable to effectively insert the RCL into the β-sheet A and therefore insert the RCL into an adjacent molecule to form non-inhibitory polymers with alternative functions (reviewed in [61,84]).
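The hinge comparison described above can be expressed as a simple position-by-position check. The following Python sketch encodes the consensus exactly as listed in this paragraph and tests the PMSRP1 residues reported for the same positions; the function and variable names are illustrative only, and the check is a simplification that says nothing about actual inhibitory activity.

```python
# Allowed residues at each hinge position, as listed in the text
# (P13 is not part of the stated consensus and is therefore omitted).
HINGE_CONSENSUS = {
    "P17": frozenset("E"),
    "P16": frozenset("EKR"),
    "P15": frozenset("G"),
    "P14": frozenset("TS"),
    "P12": frozenset("AGS"),
    "P11": frozenset("AGS"),
    "P10": frozenset("AGS"),
    "P9": frozenset("AGS"),
}

# PMSRP1 residues at the putative hinge positions, as reported in the text.
PMSRP1_HINGE = {
    "P17": "E", "P16": "E", "P15": "G", "P14": "T",
    "P12": "A", "P11": "A", "P10": "A", "P9": "A",
}


def matches_inhibitory_hinge(hinge: dict) -> bool:
    """True if every consensus position holds one of its allowed residues."""
    return all(
        hinge.get(position) in allowed
        for position, allowed in HINGE_CONSENSUS.items()
    )


print(matches_inhibitory_hinge(PMSRP1_HINGE))
```

Using sets of allowed residues per position keeps the rule readable and makes it easy to test other serpin hinges against the same consensus.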
In addition, compared to most similar serpin amino acid sequences from other insects, Glu at P13 is substituted by Asn (Figure 4). Although it involves a charge change from the negative Glu to the uncharged Asn, according to our electrostatic potential map of the serpin model (Figure 6), there is no significant contribution of this substitution to the serpin charge distribution. This substitution may not be crucial since Glu and Asn are structurally similar, without extensive side-groups, and it has been shown previously in the hinge region that substitutions to uncharged residues have little effect [85].
PMSRP1, and all serpins from the other insect species, also have highly variable P8-P4′ regions (Figure 4) and include the predicted proteolytic cleavage sites P1 and P1′ which are known to contribute to the inhibitor specificity of serpins [86]. However, no cleavage site for serine proteases is present at the putative P1 or P1′ sites of PMSRP1, whereas a chymotrypsin cleavage site occurs at Tyr of putative position P3′ and a trypsin cleavage site at Arg of putative position P4′ (Figure 4). Consequently, the P1-P1′ scissile bond of PMSRP1 is translocated to the putative position P4′-P5′ based on sequence alignment (Figures 4 and 5). The same amino acids, Arg-Ile, form the cleavage site for Manduca sexta serpins 4 and 5 as well as for 25 hemolymph serine proteases in this insect [87]. The serpin from the beetle Sphenophorus levis also shows an unusual cleavage site at Arg-Ile of the putative P2-P1 of the RCL with a fragment released from the C-terminus of 4.3 kDa that was not detected in SDS-PAGE analysis [88].
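To make the positional argument concrete, the short Python sketch below scans the three residues named in the text (Tyr, Arg and Ile at putative positions P3′, P4′ and P5′) for bonds that follow a residue preferred by trypsin or chymotrypsin. The specificity rules are deliberately simplified (trypsin after Lys/Arg, chymotrypsin after Phe/Tyr/Trp) and ignore secondary effects such as neighbouring proline; names are illustrative and the rest of the RCL sequence is not reproduced here.

```python
# Simplified primary specificities (cleavage on the C-terminal side).
TRYPSIN_AFTER = frozenset("KR")
CHYMOTRYPSIN_AFTER = frozenset("FYW")


def candidate_bonds(segment, labels, cleaves_after):
    """List (position label, residue, following residue) for each bond
    whose N-terminal residue is in the protease's preferred set."""
    hits = []
    for i in range(len(segment) - 1):
        if segment[i] in cleaves_after:
            hits.append((labels[i], segment[i], segment[i + 1]))
    return hits


# Residues stated in the text for putative positions P3', P4' and P5'.
segment = "YRI"
labels = ["P3'", "P4'", "P5'"]

print("trypsin:", candidate_bonds(segment, labels, TRYPSIN_AFTER))
print("chymotrypsin:", candidate_bonds(segment, labels, CHYMOTRYPSIN_AFTER))
```

Under these simplified rules the only trypsin-type bond in the segment is Arg-Ile, which corresponds to the putative P4′-P5′ scissile bond discussed above.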
The M. sexta serpins 4 and 5 and the S. levis serpin are cleaved at Arg-Ile and probably involved in regulating prophenoloxidase (proPO) [82,89]. However, the M. sexta serpins do not inhibit the usual prophenoloxidase activating proteases (PAPs) but act upstream on hemolymph protease-6 to control both melanization and antimicrobial peptide expression [87]. Since PMSRP1 is also cleaved at Arg-Ile by trypsin and has the conserved hinge inhibitory consensus sequence, but not the usual proPO cleavage sequence, Asn-Lys-Phe-Gly, PMSRP1 may still be involved in regulating proPO but upstream from the usual PAPs, as with the M. sexta serpins.
An important finding was that a 40-residue-long C-terminal region of PMSRP1 did not yield any peptide that matched its sequence, although tryptic cleavage sites were present. The molecular mass of PMSRP1 was calculated to be 38.8 kDa, a significantly smaller value than the 43.1 kDa predicted for the mature protein by the molecular biology data. A possible explanation for this discrepancy might be that the C-terminal region has a deduced molecular mass of 4.6 kDa. If this molecular mass is subtracted from the predicted molecular mass of the mature protein (43.1 kDa), the resulting new theoretical mass for the protein cut from the denaturing electrophoresis gel would be 38.5 kDa, in accordance with the experimental mass (38.8 kDa) determined by SDS-PAGE. This 40 amino acid residue peptide was cut at Arg-Ile of P1-P1′ (putative P4′-P5′) and removed from the C-terminus (Figures 4 and 5), as was confirmed by the amino acid sequence shown in Figure 5.
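The mass balance in this paragraph can be checked with a few lines of Python. All input values are taken from the text; the variable names are illustrative, and the per-residue average is added only as a rough plausibility check against the commonly used approximation of about 110 Da per amino acid residue.

```python
# Values stated in the text (kDa).
predicted_mature_kda = 43.1      # from the deduced amino acid sequence
c_terminal_fragment_kda = 4.6    # deduced mass of the 40-residue C-terminal region
sds_page_kda = 38.8              # experimental estimate by SDS-PAGE
fragment_residues = 40

# Theoretical mass of the serpin after loss of the C-terminal peptide.
remaining_kda = round(predicted_mature_kda - c_terminal_fragment_kda, 1)

# Gap between the theoretical cleaved mass and the SDS-PAGE estimate.
difference_kda = round(sds_page_kda - remaining_kda, 1)

# Average mass per residue of the released fragment (Da); typical
# proteins average roughly 110 Da per residue.
mean_residue_da = round(c_terminal_fragment_kda * 1000 / fragment_residues, 1)

print(f"cleaved serpin: {remaining_kda} kDa")
print(f"difference from SDS-PAGE: {difference_kda} kDa")
print(f"mean residue mass of fragment: {mean_residue_da} Da")
```

The residual difference of 0.3 kDa is well within the resolution expected of mass estimates from SDS-PAGE.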
It is not unusual for cleavage in the RCL of a serpin to result in the release of a small C-terminal fragment as with PMSRP1. Thus, the plant serpin, WSZ3, with a molecular mass of ca. 42 kDa, can be cleaved in the RCL with various proteinases to release ca. 4 kDa C-terminal fragments and form another fragment of ca. 39 kDa [90]. These results corroborate those found for PMSRP1, since the SDS-PAGE shown in Figure 1A displays a low molecular mass band. As PMSRP1 was concentrated and dialyzed in Centriprep 30 (kDa), prior to SDS-PAGE, any small free peptides should have been discarded during the centrifugation. Therefore, the small polypeptide visualized at the bottom of the gel (Figure 1A-II) probably originated from the native PMSRP1 molecule during sample processing for electrophoresis, which is carried out under denaturing and reducing conditions. These conditions would break non-covalent and covalent bonds (such as disulfide bridges) between molecules.
The gene encoding PMSRP1 is expressed in hemocytes, fat body, salivary glands and digestive tube and, as shown by SDS-PAGE, the respective protein is present constitutively, at least in the hemolymph. Preliminary experiments with PMSRP1 failed to detect any protease inhibitory activity or any effect on hemocyte phagocytosis and prophenoloxidase activation (data not shown). This may be due to inactivation resulting from previous cleavage of the RCL to yield the C-terminal fragment of less than 10 kDa. Serpins in their native (active) forms are not particularly thermodynamically stable and can readily convert to their more stable, latent (inactive) conformation [91]. This conversion may result from interaction with extracellular matrix proteins, hemolymph metalloproteinases, or simply from dialysis, storage or freeze-thawing, so that functional testing may require freshly purified protein at room temperature [92].
Extensive attempts to obtain the active form of PMSRP1 for detailed functional assays, using insects injected with anticoagulants such as EDTA or the immunosuppressant dexamethasone, or using recombinant technology, were unsuccessful (unpublished data). However, measuring mRNA levels in tissues of P. megistus following T. cruzi infection was more successful in initial experiments. The results showed that at 7 days post-infection with T. cruzi, PMSRP1 expression is significantly upregulated in the small intestine and hemocytes and downregulated in the stomach of insects. Previously it has also been shown in other insects that serpins involved in immunity are modulated by infection [89,93].
Interpretation of these results for modulation of PMSRP1 following T. cruzi infection draws on several previous studies on changes in expression of triatomine immune molecules and immune reactions in R. prolixus following T. cruzi or Trypanosoma rangeli parasitization. These studies include results with assays for hemocyte microaggregation, nitric oxide (NO), nitric oxide synthase (NOS) and prophenoloxidase (PpO) activities [54,94-96]. Unfortunately, variations both in insect sampling times post-infection and in the use of biochemical and molecular biology analyses in these studies complicate matters, although generalizations can be made. Thus, all these previous studies show, like the present results with PMSRP1, that both T. cruzi and T. rangeli can manipulate the host immune response to optimize their survival. For example, oral infection with T. cruzi enhances levels of PpO and antibacterial factors but reduces NO in the R. prolixus stomach to kill competing bacteria and aid parasite survival [94]. In the present study, the decrease in PMSRP1 expression in the stomach of P. megistus could potentially mediate such an increase in PpO levels in response to T. cruzi infection by relieving the negative control of proteases involved in triggering this process (e.g. [89]). In contrast, the increase in PMSRP1 expression in the hemocytes recorded in P. megistus is also relevant since oral infection of R. prolixus with a non-invasive strain of T. rangeli inhibited levels of both PpO and microaggregation reactions in the hemocoel [54,95-97]. An increased level of hemocyte PMSRP1 could well be responsible for such a down-regulation of the host immune responses.

Conclusions
In conclusion, a serpin has been identified and characterized, for the first time in a triatomine, from the hemolymph of P. megistus and shown to be constitutively present. This finding is particularly important as it should open up new avenues of research into the roles of serine proteases in the association of T. cruzi with its insect vector host. Initial experiments indicate a role for PMSRP1 in T. cruzi interactions with P. megistus, but further studies are required to detail the functions of this molecule in vector insect/parasite interactions.

Additional files
Additional file 1: Figure S1.
Additional file 2: Figure S2. Fully annotated spectrum for peptide sequence WATQFNPSLTK after de novo sequencing using PEAKS 6.0 software. A precursor mass accuracy of 1.3 ppm and an ALC score of 82% were assigned to this peptide. Data were generated by high resolution acquisitions (Orbitrap analyzer) in MS1 and MS2 modes; fragmentation was performed by HCD (higher energy collisional dissociation).
Additional file 3: Figure S3. Alignment of the primary sequence of T. molitor and P. megistus serpins. This alignment was used to construct the 3D model of PMSRP1 shown in Figure 6. Identical amino acid residues are represented by an asterisk (*), whereas residues with similar chemical properties and a score > 0.5 in the Gonnet PAM250 matrix are marked (:) and those with different chemical properties and a score ≤ 0.5 are marked (.).