Tissue-specific transcriptomes of Anisakis simplex (sensu stricto) and Anisakis pegreffii reveal potential molecular mechanisms involved in pathogenicity

Background Larval stages of the sibling species of parasitic nematodes Anisakis simplex (sensu stricto) (s.s.) (AS) and Anisakis pegreffii (AP) are responsible for a fish-borne zoonosis, known as anisakiasis, that humans aquire via the ingestion of raw or undercooked infected fish or fish-based products. These two species differ in geographical distribution, genetic background and peculiar traits involved in pathogenicity. However, thus far little is known of key molecules potentially involved in host-parasite interactions. Here, high-throughput RNA-Seq and bioinformatics analyses of sequence data were applied to the characterization of the whole sets of transcripts expressed by infective larvae of AS and AP, as well as of their pharyngeal tissues, in a bid to identify transcripts potentially involved in tissue invasion and host-pathogen interplay. Results Approximately 34,000,000 single-end reads were generated from cDNA libraries for each species. Transcripts identified in AS and AP encoded 19,403 and 10,424 putative peptides, respectively, and were classified based on homology searches, protein motifs, gene ontology and biological pathway mapping. Differential gene expression analysis yielded 226 and 339 transcripts upregulated in the pharyngeal regions of AS and AP, respectively, compared with their corresponding whole-larvae datasets. These included proteolytic enzymes, molecules encoding anesthetics, inhibitors of primary hemostasis and virulence factors, anticoagulants and immunomodulatory peptides. Conclusions This work provides the scientific community with a list of key transcripts expressed by AS and AP pharyngeal tissues and corresponding annotation information which represents a ready-to-use resource for future functional studies of biological pathways specifically involved in host-parasite interplay. Electronic supplementary material The online version of this article (10.1186/s13071-017-2585-7) contains supplementary material, which is available to authorized users.

and economic concern; indeed, beside the effects on the marketability of marine products, Anisakis third-stage larvae (L3 s) are the causative agents of a human disease known as anisakiasis. This occurs as a consequence of accidental ingestion of L3 s, and consists of a mild to severe disease classified as gastric anisakiasis (GA), intestinal (IA) and extraintestinal anisakiasis depending on the localization of the larva [6,7]. In addition, infection with Anisakis spp. may cause sensitisation to parasite allergens [8,9] that, following subsequent exposure, can result in a variety of systemic reactions [10]. Moreover, Anisakis spp. have been recently observed in the same localization with gastro-intestinal tumors [11][12][13]. While, thus far most reports of anisakiasis originate from regions of the world where consumption of raw or undercooked fish is common (e.g. Japan) [3], the global prevalence of gastrointestinal and allergic anisakiasis is likely to be severely underestimated, particularly because of the intrinsic limitations of currently available diagnostic tools.
Anisakis simplex (Rudolphi, 1809) (sensu stricto) (s.s.) is responsible for almost all of the reported human cases of anisakiasis in Japan, whereas its sibling species, Anisakis pegreffii Campana-Rouget & Biocca, 1955, is responsible for most cases of anisakiasis in southern Europe [1]. Anisakis simplex (s.s.) displays higher tolerance for artificial gastric juice and acid [14,15] and penetrates both agar and fish muscle [15,16], as well as digestive tract tissues of Wistar rats [17] more efficiently than A. pegreffii. Interestingly, recent comparative analyses of the transcriptomes and proteomes of A. simplex (s.s.) and A. pegreffii larvae revealed both qualitative and quantitative differences of the potential allergens responsible for the onset of allergic anisakiasis [18,19], thus calling on more investigations into potential adverse effects elicited by these two species.
Indeed, in spite of growing concerns for public health due to anisakiasis, the molecular mechanisms responsible for the pathogenicity of Anisakis spp. remain largely unknown. Pharyngeal excretory glands of the Anisakis larval stages have long been hypothesized to be implicated in such mechanisms through the release of proteolytic enzymes [20][21][22], and data from other parasitic nematodes of the same superfamily (Ascaridoidea) suggest that peptidases could play key roles in biological pathways linked to fundamental host-pathogen interactions [23]. However, thus far and to the best of our knowledge, no data are available on the molecules transcribed by the pharynx of Anisakis spp. The identification of these molecules and the characterization of their expression profiles compared to other larval tissues may provide clues as to their role(s) in biological pathways associated with the pathogenicity of these parasites. Therefore, in the present study, an indepth analysis of differential gene expression between the whole larva and the pharyngeal tissues of both A. simplex (s.s.) and A. pegreffii was carried out, transcriptomes were obtained and molecules putatively responsible for the pathogenicity of these two species were identified.

Parasite material
Three fish species were collected and analyzed between 2015 and 2016: the Atlantic mackerel Scomber japonicus from the major FAO 27 fishing area (North East Atlantic region), the European anchovy Engraulis encrasicolus and the European pilchard Sardina pilchardus from the FAO 37 (Mediterranean). Viscera and fillets of each fish were visually inspected under a stereomicroscope for the detection of Anisakis larvae. Parasites were collected, washed and stored in sterile PBS for subsequent dissection of the pharyngeal tissues (defined as the apical portion starting with the linear pharynx and continuing with the globular ventricular oesophageal appendix (see Fig. 1) henceforth designated as 'PX'). Tissues were dissected under a stereomicroscope and the remainder of the larva ('WL') was stored separately for further DNA and RNA extractions (see below). Samples corresponding to PX and WL of each larva were homogenized and processed using the TRIsure reagent (Bioline, London, UK), according to the manufacturer's instructions, for simultaneous isolation of DNA and RNA, thus overcoming potential biases due to partial removal of larval tissues or organs.
Genomic DNA was used as a template for molecular identification of parasite species, based on PCR-RFLP analysis of the nuclear ribosomal marker internal transcribed spacer (ITS) according to published diagnostic keys [24].

RNA extraction and quality check
Total RNA was extracted using the TRIsure™ reagent (Bioline) and contaminating genomic DNA was removed using the DNAse in the Turbo DNA-free kit (Life Technologies, Waltham, USA). Three independent biological replicates were analysed, each one consisting of total RNA extracted from three WL and ten PX of each A. simplex (s.s.) (' AS') and A. pegreffii (' AP'), respectively. The quality of the extracted RNA was verified visually on agarose gel, while resulting quantities were measured using the Take3 Module of Synergy HT Multi-Detection Microplate Reader (Biotek, Winooski, USA) and a Bioanalyzer 2100 (Agilent Technologies, Waldbronn, Germany).
cDNA library construction and RNA-sequencing RNA strand-specific libraries were created using the New England Biolabs Kit (NEBNext® Ultra™, Ipswich, USA), according to manufacturer's instructions (Illumina, San Diego, California). In brief, polyadenylated (PolyA+) RNA was purified using the NEB (NEBNext® Poly(A) mRNA Magnetic Isolation Module) from 3 μg of total RNA from each WL and PX sample, fragmented and reverse transcribed into cDNA. Following fragmentation of the mRNA, first and dUTPbased second strand synthesis were carried out, followed by end-repair, A-tailing, ligation of the indexed Illumina adapters and digestion of the dUTP-strand. Ligated products of 150-400 bp were excised from agarose gels and PCR amplified. Products were cleaned using a MinElute PCR purification kit (Qiagen, Hilden, Germany) and the quality of each cDNA library was verified on an Agilent 2100 Bioanalyzer. For RNA-Seq, pooled libraries were single-end sequenced on a HiSeq2500 platform (Illumina) using HiSeq Flow Cell v4 and 125 bp read length.

Bioinformatics analyses of sequence data
Raw reads were trimmed and filtered for Illumina adaptor sequences, sequences with suboptimal read quality (i.e. PHRED score ≥ 25.0) and sequences shorter than 125 bp using Trimmomatic v.0.36 [25].
For AS, the high-quality single-end reads representing both WL and PX were mapped to the available A. simplex reference genome (Assembly GCA_000951095.1) with BWA-MEM version 0.7.12-r1039 [26] with default parameters. Resulting alignments were extracted and information linked to genome regions to which reads were mapped (e.g. intron-exon boundaries) were obtained from the GTF annotation file available from the WormBase repository database at http://parasite.wormbase.org/Anisa-kis_simplex_prjeb496/. Then, the number of raw reads mapping to each A. simplex gene was calculated using the Rsubread package in R (https://www.r-project.org/). Transcripts differentially expressed in WL and PX were identified by DESeq2 package in R (|logFC| > 2). In order to control for errors associated with multiple pairwise comparisons, a false-discovery rate correction (FDR < 0.05) was applied to the data set [27].
For AP, clean reads obtained from high-throughput sequencing of WL and PX were pooled and assembled de novo using the Trinity software [28]. Subsequently, reads from each WL and PX were individually aligned to the assembled transcriptome using Bowtie, and the relative expression of each transcript was estimated via the RSEM package. The trimmed mean of M-values normalization method (TMM) was used for cross sample normalization, and the normalized expression of each transcript was presented as transcripts per million (TPM). Assembled transcripts with < 10 aligned reads were eliminated from subsequent analyses (http://deweylab.biostat.wisc.edu/ rsem/). Differentially expressed transcripts between WL and PX were identified by DESeq2 package in R (|logFC| > 2; FDR < 0.05).
Putative ortholog transcripts shared between AS and AP were identified using a reciprocal best hit analysis on the longest ORF, and by retaining only the sequences aligned with a continuous region covering at least 30% of the query sequence [29]. To better define the secretory nature of predicted proteins in these tissue-specific repertoires, we refined the search of predicted signal peptides in both oesophageal and whole body-enriched catalogues. Among enriched subsets, only contigs encoding for peptides starting with Met were retrieved and the probability to carry signal peptides was evaluated using the on-line prediction server SignalP [30].
Transcripts encoding for proteins belonging to five families, i.e. astacins, ShK toxin, EB cystein-rich modules, Kunitz and CRISPs (CAP superfamily), were selected amongst molecules upregulated in the PX of each AS and AP and subjected to further analyses using the BLAST search tool, Clustal Omega, Artemis and domain organization platform available from the EMBL-EBI Pfam website. These molecules were selected based on their documented roles in key mechanisms of host-parasite interactions [23].
For AP, transcript annotation was performed by comparing sequences to data available in the nucleotide sequence collection (Nt) of NCBI and in the non-redundant (Nr) (www.ncbi.nlm.nih.gov) and SwissProt (http://expasy.org/) databases using BLASTn and BLASTx algorithms, respectively. The software Trinotate (Transcriptome Functional Annotation and Analysis) (https://trinotate.github.io/) was used to extract annotation information from the eggNOG, KEGG and Gene Ontology databases using default settings. The coding regions of transcripts were predicted using TransDecoder [28].

Samples and species identification
A total of 85 anisakid larvae were collected, i.e. 31 from Atlantic mackerels, 32 from European anchovies and 22 from European pilchards. Of these, 31 larvae collected from the Atlantic mackerel were identified as AS, and 34 larvae collected from European anchovies and European pilchards were identified as AP following restriction analysis with HinfI [24]. Twenty larvae collected from European pilchards were identified as Hysterothylacium aduncum and were thus excluded from the study. Three WL and PX samples from each AS and AP were subjected to high-throughput RNA-Seq.
The Anisakis simplex (s.s.) and A. pegreffii transcriptomes A total of 32,859,191 and 37,606,372,125 bp reads were generated for each WL and PX of AS, while 23,397,005 and 36,382,470 reads were generated from WL and PX of AP, respectively (Table 1). Following pre-processing for quality, a total number of 29,011,662 (WL-AS) and 32,853,551 (PX-AS), 21,128,142 (WL-AP) and 32,711,755 (PX-AP) reads were retained for further analyses. Raw reads generated in the present study have been submitted in the Sequence Read Archive (SRA) database of NCBI (http://www. ncbi.nlm.nih.gov/sra) under accession number PRJNA374530.
Given the availability of a draft genome sequence for A. simplex [31] (Wormbase BioProject PRJEB496; http:// parasite.wormbase.org/Anisakis_simplex_prjeb496/Info/Ind ex/), raw reads generated for AS were mapped to the existing genome assembly sequences (Wormbase BioProject PRJEB496); approximately 90% of reads for AS-WL and AS-PX could be mapped to the available genome sequences, with 86% of these mapping only once. The 61,865,213 pooled reads obtained from both WL and PX of AS, mapped to a total of 19,403 genes (out of 20,971) in the AS genome. Conversely, due to the unavailability of a draft genome sequence for AP, raw reads generated from WL and PX samples for this species were assembled de novo (see Methods). The pooled 53,839,897 reads for AP-WL and AP-PX yielded, following quality-filtering, 21,195 contigs (Table 1). Shared transcripts (putative orthologs) between AS and AP were identified by pairwise BLASTP reciprocal best hit analysis (E-value cut-off: 1e-06; minimum hit coverage 30%), performed on the longest ORF identified for each transcript, resulting in a list of 4635 putative orthologs (Additional file 1: Table S1).

Transcript annotation and differential gene expression analysis
Annotation information linked to AS transcript-encoding genes were derived from data available from the WormBase  (Table 1). Of these transcripts,~35% could be annotated via BLAST searches against the nr, SwissProt, KEGG and GO databases (Table 1). Detailed annotation information for individual AP transcripts identified in this study is available in Additional file 2: Table S2.
For each AS and AP, complete lists of differentially expressed genes (DEGs) in WL and PX containing annotations informations are available in Additional file 3: Table S3 and Additional file 4: Table S4.
The search of signal peptides in both pharyngeal and whole larva enriched catalogues was carried out with the aim to better define the secretory nature of predicted proteins. In both AS and AP, the pharynx and the whole larva subsets ( Table 2) were therefore sorted to retrieve only peptides starting with Met and the probability to carry signal peptides was evaluated using the SignalP Server (Table 2). Putative secreted proteins were detected in transcripts upregulated in the PX of each AS and AP (20 and 36% of the total number of upregulated transcripts, respectively) pointing out a secretory aptitude of pharyngeal repertoires. Totals of 16 and 22% of transcripts upregulated in WL of each AS and AP were predicted to encode a signal peptide.
Pfam enrichment analyses were also performed in both AS and AP by comparing the pharyngeal upregulated subset (FDR < 0.05) with the whole transcript sequence datasets (Table 3).
For AS, a total number of 335 differentially expressed genes were identified (Fig. 2a), of which 109 were upregulated in WL and 226 in PX (Additional file 2: Table S2, Additional file 4: Table S4). Seventy-one and 107 of these genes, respectively, were annotated with Pfam information (Additional file 2: Table S2, Additional file 4: Table S4). The remaining genes could not be annotated based on the information available in current public databases. Transcripts upregulated in WL encoded mostly enzymes with proteolytic activity, e.g. peptidases M1 and S10 (Additional file 2: Table S2). In the PX subset, 34.5% contigs encoded for proteins with putative roles in worm metabolism and physiology, including sugar transport, nervous system and motility. This reflects the biological function of this anatomical region, which includes, for instance, the nervous ring and the digestive oesophagous (see Fig. 1). A further 16% (36/226) of transcripts upregulated in the PX of this species encoded for (i) peptidases; (ii) members of the CAP superfamily, including proteins similar to known allergens from the venom of wasps, honeybees (such as histidine phosphatase superfamily members and cysteine-rich secretory proteins) and snakes (such as those carrying a pancreatic trypsin inhibitor Kunitz domain); and (iii) molecules with putative immune-modulatory function (e.g. leucine-rich repeat; immunoglobulin subtype, immunoglobulin-like domain, immunity-related GTPases-like) ( Table 3). In particular, putative peptidases included (i) aspartic peptidases M1 (n = 1), (ii) astacin peptidase M12A (n = 5), (iii) peptidase M13 (n = 2), (iv) serine carboxypeptidase S10 (n = 1), as well as hemopexin-like metallopeptidases, carboxylesterases and ShKT Stichodactyla helianthus toxin (n = 3). Other predicted proteins upregulated in the PX subset with similarity to known allergens included molecules encoding for cysteine-rich domains, such as Kunitz, lustrins, trypsin inhibitors and 'Clostridium epsilon toxin'; the latter was only detected only in the PX subset of AS (Table 3 and Additional file 1: Table S1).
A total of 502 DEGs were identified in AP, of which 163 were upregulated in WL and 339 in PX ( Fig. 2b and Additional file 3: Table S3, Additional file 4: Table S4). Of these, 80 and 88 could be annotated with Pfam terms, while 83 and 100, respectively, matched amino acid sequences available in the NCBI nr database (Additional file 3: Table S3, Additional file 4: Table S4).
GO terms enrichment analysis was performed to compare groups of transcripts upregulated in the WL and PX of each AS and AP (cf. Additional file 5: Figure S1 and Additional file 6: Figure S2, respectively).
Putative orthologous transcripts encoding for proteins upregulated in the PX of both AS and AP, of interest based on their putative biological function in the tissues analysed, were characterized in further detail. Two transcripts in AP and one in AS encoded for putative ShK toxin peptides, with different domain organization as revealed by amino   acid sequences alignments (Fig. 3). The first ShK peptide identified in AP (AP1) contained three adjacent ShK domains followed by a cysteine-rich portion at the Nterminus, in contrast to AS predicted ortholog that contain only two ShK domains. The second putative ShK peptide contained two adjacent domains in both AS and AP (Fig. 3).
Interestingly, the C-terminus of ShKgene1 partially matched known allergens of A. simplex (s.l.), as revealed by BLAST analysis (Anis 7 GenBank accession number ABL77410, query coverage 93% and identity 25%; Anis 12 GenBank accession number AGC60031, query coverage 46% and identity 33%; Anis 14 GenBank accession number BAT62430, query coverage 71% and identity 34%). A transcript encoding for a similar ShK C-term could not be identified in AS using the Artemis tool, likely as a result of the current fragmentation of available genomic scaffolds. ShK toxin domains were also found associated to astacin peptides (Fig. 4). In particular, five putative orthologous peptides encoding for astacin metalloendopeptidase were identified in AS and AP, with a different number of ShK modules (one or two, Fig. 5). Such modular structures were observed in other organisms: while 38% of sequences with only astacin domain retrieved in the Pfam database belonged to parasitic nematodes, around 80 and 83% of sequences with astacin domain associated to one or two ShK modules, respectively, were represented by parasitic nematodes.
Among transcripts encoding for cysteine-rich proteins, orthologous members of the CRISP-SCP family (CAP-superfamily) were detected in both AS and AP (Additional file 7: Figure S3). In particular, one transcript significantly upregulated in the PX of both species encoded for typical conserved residues of SCP-like proteins. Conversely, no differential expression was detected for the remaining transcripts encoding for the other three CRISP orthologues (alignments of peptides are available in Additional file 7: Figure S3).
Transcripts encoding putative proteins containing an EB module, often associated with Kunitz domains, were upregulated in the PX of both AS and AP: two orthologues were identified, with a further paralogue in AS (Additional file 8: Figure S4). In particular, five EB modules were identified in AP1 (Anisakis EB module 1), four in AS_EB1A and three in AS_EB1B (Additional file 8: Figure S4a) whereas, for Anisakis EB module 2, four adjacent EB modules were observed in both AP and AS (Additional file 8: Figure S4b).
Finally, transcripts encoding putative peptides carrying Kunitz domains were enriched in AS PX only. Two putative peptides, both containing conserved a cysteinerich region known as lustrin, showed homology to "major allergen" Anis1 found in Anisakis, Ascaris and Toxocara (Additional file 9: Figure S5). One of the two peptides encompasses two Kunitz domains followed by two lustrin domains (Additional file 9: Figure S5a), while the second contained seven repeates of lustrin-Kunitz domains (Additional file 9: Figure S5b).

Discussion
Parasitic nematodes infect more than one billion people globally, causing substantial suffering and massive economic losses [32]. Anisakidosis is an emerging fish-borne zoonosis of major public health concern. Nevertheless, the impact of the disease is largely underestimated and our current understanding of the mechanisms of infection and clinical disease in humans is still scant.
While knowledge of the complex mechanisms underlying host-pathogen interplay in several parasitic nematodes is expanding [33][34][35], studies of the interactions between Anisakis and its accidental hosts are in their infancy [18,19]. Previous evidence suggested that secretions from the pharyngeal subventral and esophageal dorsal glands of the parasites may play roles in the intraluminal and extracorporeal digestion of tissues of natural (intermediate or paratenic) hosts [21] and in the invasion of the gastrointestinal mucosa of human accidental hosts [21].
In an effort to investigate the nature and the potential role(s) of molecules expressed by the pharyngeal tissues of Anisakis spp. in pathways linked to pathogenicity, we analyzed the whole larval and pharyngeal transcriptomes of the two pathogenic sibling species AS and AP. The analysis of transcriptomes provided a number of transcripts similar to those inferred from previous studies of Anisakis spp. and of other non-model species of parasitic nematodes [19,35,36]. In addition, several shared transcripts were identified in these two species, albeit the total number of orthologous genes may be underestimated due to the unavailability of a draft genome sequence for A. pegreffii. The number of full-length peptides predicted from assembled transcript sequences for AP was significantly lower than the corresponding number for AS, thus indicating that a significant number of AP transcripts could not be assembled with confidence. Nevertheless, comparative analyses of levels of gene transcription in the WL and PX of these two species led to the identification of groups of molecules with putative roles in mechanisms of parasite pathogenicity. Amongst these molecules, the cysteine-rich secretory proteins (CRISPs), belonging to the CAP superfamily (cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins), were signifincatly upregulated in the PX of both AS and AP. Members of this family are secreted proteins with extracellular endocrine or paracrine functions, identified in several eukaryotes, including arthropods, flatworms and roundworms (reviewed by Cantacessi et al. [37]). In nematodes, members of this family are also known as Ancylostoma-secreted proteins or activationassociated secreted proteins (ASPs) (reviewed by Cantacessi et al. [37]). Increased transcription of ASPs is observed during the transition of the ensheathed, free living L3 of the dog hookworm, Ancylostoma caninum, to its exsheathed parasitic form [38]. While the exact function(s) of nematode ASPs are yet to be uncovered, this observation suggests that these molecules may be involved in biological pathways associated with penetration of host tissues and/or interactions with the host immune system (cf. [35,36]).
A group of proteins with roles in host tissue penetration and digestion encoded by transcripts were differentially expressed in the PX of both AS and AP is represented by the peptidases. In particular, transcripts encoding for metalloproteinases (i.e. aminopeptidases, astacins and neprilysins) were particularly abundant in AS. Aminopeptidases catalyze the removal of single amino acids from the amino N-terminus of small peptides, thus playing a role in their final digestion [39]. In addition, members of this family hydrolyse the epoxide leukotriene-A4 to form an inflammatory mediator [40]. The Anisakis orthologue might be involved in eliciting a host immune and/or inflammatory response. Astacins are implicated in the activation of growth factors, degradation of polypeptides, and processing of extracellular proteins, also sharing common features with serralysins, matrix metalloendopeptidases, and snake venom proteases [40]. Astacins might also play a role as anticoagulants; indeed, the hematophagus mollusk Colubraria secretes Astacin/ShKT-containing proteins that may inhibit ion channels, interfere with hemostasis and facilitate the spreading of the toxins in the body of the prey by degrading extracellular matrix molecules and inactivating several endogenous vasoactive peptides [41]. In the free-living nematode Caenorhabditis elegans, an astacin enzyme is expressed in the hypodermal tissue and is required for normal collagen secretion [42]. Astacins are also found in the parasitic filarioid Brugia malayi and in the trichostrongylid Haemonchus contortus [43] where they play a role in collagen processing and cuticle formation. Proteases belonging to the M13 family include, amongst other proteins, the neprilysins, molecules found in a wide range of vertebrates (including mammals), and invertebrates (e.g. Drosophila and C. elegans [44,45]). In mammals, these proteins are involved in cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of neuropeptides in the central nervous system, while their biological functions in nematodes are yet unclear. However, in experimentally infected mice, M13 peptidases secreted by Trichinella spiralis have been demonstrated to participate in the early onset of intestinal inflammation [46]. Similar mechanisms may occur over the penetration of intestinal tissues by Anisakis larvae. Amongst other peptidase-encoding transcripts upregulated in the PX of both AS and AP and with putative roles in host tissue digestion, serine carboxypeptidases have been previously detected in Angiostrongylus cantonensis [47] and Strongyloides ratti [48] where, similarly to aminopeptidases, they also serve as digestive enzymes.
Molecules encoding for potential anaesthetics were also detected amongst transcripts differentially expressed in the PX of both AS and AP, including predicted peptides containing a ShK toxin domain. In venomous organisms, the ShKT is a signature of potent ion channel blockers [49], whereas the presence of this protein motif in oxidoreductases and prolyl hydroxylases of plants, and in astacin-like metalloproteases and trypsin-like serines proteases of helminths confers these enzymes potential roles in potassium channel-modulatory activities. However, in the zoonotic dog roundworm Toxocara canis, secreted mucins implicated in immune evasion mechanisms are characterized by the presence of evolutionarily distinct modules, i.e. the mucin and ShKT domains [50], which raises questions on the possible roles on Anisakis ShKT domain-encoding molecules in interactions between the parasite and the host immune system. Moreover, the association of astacin and ShK toxin domain appears to be particularly common in parasitic nematodes.
Besides transcripts of interest detected in both AS and AP, a relatively small number of molecules were exclusively upregulated in the PX of AS. These molecules included peptides encoding Kunitz domains, with trypsin inhibitor activity, and aspartic proteases. Proteins with Kunitz domains are degrading enzymes that act as protease inhibitors, with a functionally conserved role in cuticle formation in a diverse range of nematodes [43]. In blood-sucking organisms such as ticks [51,52] and molluscs [41], proteins containing Kunitz domains are possibly involved in the inhibition of thrombin and coagulation factors. Functions in mechanisms aimed to prevent attack by host proteolytic enzymes and coagulation factors in the host blood stream are suggested for the zoonotic tapeworm Echinococcus granulosus [53], and for the blood flukes Schistosoma japonicum and S. mansoni [54,55]. Excretory/secretory products from A. simplex (s.l.) larvae contain molecules with anticoagulant activities that may play a role in the pathogenesis of anisakiasis in accidental hosts [56]. In Anisakis, proteins with Kunitz domains with trypsin-like proteinase properties have been previously identified in studies examining the proteolytic enzymatic activity of this parasite [20,57,58]; their optimal activity was recorded at pH 7.5 (approximately the pH of the terminal ileum) and a host body temperature of 36-37°C, thus supporting the hypothesis that these enzymes are activated during invasion of the intestine in the homeothermic hosts. In addition, one of the major known Anisakis allergens, i.e. Anis1, contains a Kunitz-domain [59]. A transcript upregulated in AS PX showed high homology to this antigen (Additional file 9: Figure S5a, b). Indeed, it is possible to hypothesize that novel molecules other than the already known peptides with antigenic and pathogenic properties may elicit the inflammatory and immune response observed in humans. Aspartic proteases (pepsin-like) were also included amongst the upregulated AS molecules. The best known source of aspartic proteases is the stomach of mammals [60], and pepsin was identified as a positive regulator of cathepsin-B involved in the penetration of rat gut by Angiostrongylus cantonensis [60,61]. Results from our study suggest that transcripts encoding selected Kunitzdomain containing proteins and aspartic proteases were exclusively expressed by AS. However, given that the set of AP transcripts were assembled de novo in the absence of a genome reference sequence, and the significantly lower number of predicted peptides, their presence in the genome and transcriptome of AP cannot be fully excluded.