In silico analysis of the fucosylation-associated genome of the human blood fluke Schistosoma mansoni: cloning and characterization of the enzymes involved in GDP-L-fucose synthesis and Golgi import

Background
Carbohydrate structures of surface-expressed and secreted/excreted glycoconjugates of the human blood fluke Schistosoma mansoni are key determinants that mediate host-parasite interactions in both snail and mammalian hosts. Fucose is a major constituent of these immunologically important glycans, and recent studies have sought to characterize fucosylation-associated enzymes, including the Golgi-localized fucosyltransferases that catalyze the transfer of L-fucose from a GDP-L-fucose donor to an oligosaccharide acceptor. Importantly, GDP-L-fucose is the only nucleotide-sugar donor used by fucosyltransferases and its availability represents a bottleneck in fucosyl-glycotope expression.

Methods
A homology-based genome-wide bioinformatics approach was used to identify and molecularly characterize the enzymes that contribute to GDP-L-fucose synthesis and Golgi import in S. mansoni. Putative functions were further investigated through molecular phylogenetic and immunocytochemical analyses.

Results
We identified homologs of GDP-D-mannose-4,6-dehydratase (GMD) and GDP-4-keto-6-deoxy-D-mannose-3,5-epimerase-4-reductase (GMER), which constitute a de novo pathway for GDP-L-fucose synthesis, in addition to a GDP-L-fucose transporter (GFT) that putatively imports cytosolic GDP-L-fucose into the Golgi. In silico primary sequence analyses identified characteristic Rossmann loop and short-chain dehydrogenase/reductase motifs in GMD and GMER as well as 10 transmembrane domains in GFT. All genes are alternatively spliced, generating variants of unknown function. Observed quantitative differences in steady-state transcript levels between miracidia and primary sporocysts may contribute to differential glycotope expression in early larval development. Additionally, analyses of protein expression suggest the occurrence of cytosolic GMD and GMER in the ciliated epidermal plates and tegument of miracidia and primary sporocysts, respectively, which is consistent with previous localization of highly fucosylated glycotopes.

Conclusions
This study is the first to identify and characterize three key genes that are putatively involved in the synthesis and Golgi import of GDP-L-fucose in S. mansoni and provides fundamental information regarding their genomic organization, genetic variation, molecular phylogenetics, and developmental expression in intramolluscan larval stages.


Background
The deoxyhexose sugar L-fucose is a major constituent of an array of immunologically important carbohydrates that are presented on surface-expressed and secreted/excreted glycoconjugates of the human blood fluke Schistosoma mansoni (reviewed by [1]). Although the schistosome glycome is perhaps the most extensively characterized among invertebrates, relatively little is known about the enzymatic machinery responsible for its expression. Recent studies by Fitzpatrick et al. [2] and Peterson et al. [3] inventoried the schistosome α3- and α6-fucosyltransferases (FucTs), which transfer L-fucose from a GDP-L-fucose nucleotide-sugar donor to an oligosaccharide acceptor to create α3 and α6 linkages, respectively. These studies also demonstrated stage- and gender-specific variations in FucT gene transcription, which may contribute to differential fucosyl-glycotope expression that has been reported among stages of S. mansoni [4][5][6][7].
While the population composition and cellular organization of the expressed glycosyltransferases are key determinants affecting carbohydrate structural diversity, other factors are also important, including nucleotide-sugar donor availability, Golgi membrane dynamics, intralumenal pH, and competition for donor/acceptor substrates [8]. In S. mansoni, this means that GDP-L-fucose synthesis and Golgi import, which dictate fucose donor availability in the Golgi, likely contribute to differential fucosyl-glycotope expression. However, to date, no studies have examined these aspects of fucosylation in schistosomes.
In general, GDP-L-fucose synthesis is localized in the cytosol and can occur by two possible metabolic pathways, the de novo and salvage pathways (reviewed by [9]), which constitute approximately 90% and 10%, respectively, of total GDP-L-fucose synthesis in mammalian cells [10]. In de novo synthesis, GDP-D-mannose is converted to GDP-L-fucose in three steps by GDP-D-mannose-4,6-dehydratase (GMD, EC 4.2.1.47) and the bifunctional enzyme GDP-4-keto-6-deoxy-D-mannose-3,5-epimerase-4-reductase (GMER, EC 1.1.1.271; also called GDP-L-fucose synthase). Alternatively, the salvage pathway generates GDP-L-fucose from free cytosolic L-fucose in two steps, which are generally catalyzed by L-fucokinase (Fuk) and L-fucose-1-phosphate guanylyltransferase (FPGT; also called GDP-L-fucose pyrophosphorylase). Both pathways are summarized in Figure 1. GMD and GMER are well conserved across prokaryotic and eukaryotic taxa in terms of both structure and function [11], but the salvage pathway exhibits some variation. While homologs of Fuk and FPGT have been described in several mammalian species [12][13][14][15], the salvage pathway in Bacteroides and Arabidopsis comprises a single bifunctional enzyme (Fkp in Bacteroides; FKGP in Arabidopsis) that exhibits both Fuk and FPGT activities [16,17]. Elements of a salvage pathway do not exist in Drosophila [18] and only a Fuk homolog has been identified in C. elegans [11]. How GDP-L-fucose is synthesized in S. mansoni is unknown.
In eukaryotes, fucosylation occurs primarily in the Golgi. Consequently, following GDP-L-fucose synthesis in the cytosol, the activated fucose is imported into the Golgi lumen where it can be utilized by Golgi-localized FucTs. This translocation is driven by a GDP-L-fucose transporter (GFT), which couples GDP-L-fucose entry with equimolar exit (i.e., antiportation) of GMP, a downstream byproduct of fucosylation (reviewed by [19]).
Previous studies indicate that GDP-L-fucose synthesis and transport are essential processes in the production of fucosylated glycans. For example, increased expression of GMD, GMER and GFT was linked to higher levels of fucosylation in human hepatocellular carcinoma [20,21] and elevated expression of sialyl Lewis X during inflammation and tumorigenesis [22]. Additionally, Omasa et al. [23] observed decreased fucosylation of recombinant human antithrombin III following RNAi-mediated knockdown of GFT in transfected Chinese hamster ovary cells. The essential role of GFT in proper fucosylation is further evidenced in humans by the rare autosomal recessive syndrome leukocyte adhesion deficiency type II (LADII), which is characterized by severe psychomotor and growth retardation, facial malformation, and persistent and recurrent infections with marked neutrophilia [24]. Red blood cells of LADII patients feature a non-fucosylated variant of the H antigen (called the "Bombay" phenotype), and leukocytes lack the fucosylated Lewis-type blood groups that are requisite for extravasation during immune challenge [25]. Importantly, LADII results from a deficiency in GDP-L-fucose transport, which is attributable to mutations in the GFT gene [26][27][28][29][30]. These observations suggest the possibility that GDP-L-fucose synthesis and Golgi import play key roles in the regulated expression of fucosylated glycotopes in S. mansoni as well.
In the present study, we used a homology-based genome-wide bioinformatics approach to identify and characterize putative GDP-L-fucose synthesis- and transport-associated genes in S. mansoni. This study provides fundamental information about the genomic organization, splicing and molecular phylogenetics of these fucosylation-associated genes as well as important insights regarding their putative roles in glycotope expression in snail-associated larvae, particularly miracidia and primary sporocysts.

Isolation and cultivation of S. mansoni larvae
Ethics statement: Research protocols involving mice, including routine maintenance and care, have been reviewed and approved by the Institutional Animal Care and Use Committee (IACUC) at the University of Wisconsin-Madison under assurance number A3368-01. Generation of antibodies against recombinant proteins was performed by GeneTel Laboratories LLC (Madison, WI, USA) in accordance with protocols reviewed and approved by the Office of Laboratory Animal Welfare (OLAW) at the National Institutes of Health under assurance number A4489-01.
Adult and larval S. mansoni (NMRI strain) were collected and cultivated as described by Yoshino and Laursen [31]. Briefly, adults were harvested from infected mice by perforation of the hepatic portal veins, and viable eggs were isolated from liver tissue by homogenization and washed in sterile 0.9% NaCl. Eggs were hatched in artificial pond water [32], and the free-swimming miracidia were either used immediately or transformed to primary sporocysts by cultivation at 26°C in Chernin's Balanced Salt Solution (CBSS; [33]) containing glucose and trehalose (1 g/L each) as well as penicillin and streptomycin (CBSS+). Transformation of most miracidia was complete within 24 h of culture origination. In this study, primary sporocysts were maintained in CBSS+ for up to 10 days, with refreshment of the culture medium at 2 and 7 days.

GDP-L-fucose synthesis and transport gene identification
The amino acid sequences of previously characterized GDP-L-fucose synthesis- and transport-associated genes, including GMDs, GMERs, GFTs, Fuks, FPGTs, Fkp and FKGP, of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Bacteroides fragilis were downloaded from Reference Sequence (RefSeq) and GenBank online databases at the National Center for Biotechnology Information (NCBI; accession numbers in Tables 1 and 2) and used as queries in a genome-wide tBLASTn [34] screen of genomic scaffolds and predicted genes to identify homologs in the Schistosoma mansoni Database (SchistoDB; [35]).

Primer design
The oligonucleotide primers used in this study were designed using Vector NTI Advance 11.0 software (Invitrogen, Eugene, OR, USA) and the IDT SciTools suite [87] based on available SchistoDB-derived genomic sequence information as well as data obtained by this study, and custom DNA oligonucleotides were purchased from Integrated DNA Technologies (IDT, Coralville, IA, USA). A complete list of primer sequences used in this study is provided in Additional file 1: Table S1A-E.
Reverse transcriptase-PCR and rapid amplification of cDNA ends for GMD, GMER, and GFT transcript sequencing
Kits and reagents for molecular assays were used according to the manufacturers' recommendations unless otherwise indicated. Primers used for reverse transcription (RT)-PCR and rapid amplification of cDNA ends (RACE) are provided (see Additional file 1: Table S1A-C). RT-PCR and RACE protocols were performed as detailed in [3] and are summarized as follows: Miracidia, 2-day in vitro-cultivated primary sporocysts and mixed-sex adults (i.e., pooled male and female worms) were washed with artificial pond water (miracidia), CBSS (sporocysts) or mammalian phosphate-buffered saline

Figure 1 Schematic diagram of GDP-L-fucose synthesis. GDP-L-fucose synthesis occurs by two cytosolic pathways, namely the de novo and salvage pathways. In de novo synthesis (A), GMD with coenzyme NADP+ removes one H2O-equivalent from GDP-D-mannose to form GDP-4-keto-6-deoxy-D-mannose. Then, GMER catalyzes epimerizations at C3 and C5 followed by an NADPH-dependent reduction of C4 to yield GDP-L-fucose. In the salvage pathway (B), Fuk transfers a single phosphate from ATP to free cytosolic L-fucose, yielding L-fucose-1-phosphate and the byproduct ADP. Next, FPGT transfers GMP from GTP to L-fucose-1-phosphate, producing GDP-L-fucose and pyrophosphate. Evidence presented here strongly supports the exclusive use of the de novo synthetic pathway in S. mansoni. GMD, GDP-D-mannose-4,6-dehydratase; GMER, GDP-4-keto-6-deoxy-D-mannose-3,5-epimerase-4-reductase; Fuk, L-fucokinase; FPGT, L-fucose-1-phosphate guanylyltransferase.

Phylogenetic analysis of nucleotide-sugar transporters
Representative amino acid sequences of functionally characterized nucleotide-sugar transporters were compiled from RefSeq and GenBank databases with our data from S. mansoni (Table 2). Sequences were aligned using default settings in MUSCLE v 3.6 [88], with subsequent manual correction in Mesquite [89]. A guide tree was developed for Bayesian phylogenetic inference using neighbor-joining methods in FastTree v 2.0.1 [90] with a Jukes-Cantor + CAT model. Analyses were then performed using mixed amino acid models within MrBayes v 3.1.2 [91] with two parallel runs of four Markov chain Monte Carlo (MCMC) chains, each for five million generations, with subsampling every 100th generation. To ensure the tree search was not trapped at local optima, two independent replicates were conducted [92]. Stationarity of molecular evolutionary parameters was assessed at effective sample sizes >400 in Tracer v1.5 [93]. Additionally, convergence of the MCMC chains was evaluated using the online program AWTY [94]. Trees prior to stationarity were burned-in, and remaining trees were used to assess posterior probabilities for nodal support.

a "tree prefix" refers to nomenclature applied in phylogenetic analyses of NSTs (Figure 6, in Additional file 3: Figure S2). b Official gene names/identifiers are provided. Genes in boldface type were used as query sequences to search for GDP-L-fucose transporter homologs in the SchistoDB [35]. c NST activity has been demonstrated for these substrates.
Real-time quantitative PCR analysis of GMD, GMER, and GFT mRNA expression in miracidia and primary sporocysts of S. mansoni
Real-time quantitative (q)PCR protocols used in this study were performed according to the recommendations by Applied Biosystems [95], including strict criteria for qPCR primer design, validation and optimization. Relative transcript abundance in miracidia and primary sporocysts was examined using the comparative CT (ΔΔCT) method. ATP synthase f (herein termed "ATPsf"; SAGE tag 195 corresponding to Smp_140480 in the SchistoDB) and the GroES chaperonin (SAGE tag 132 corresponding to Smp_097380) were selected as endogenous calibrators based on SAGE data [96], which indicate stable expression between miracidia and primary sporocysts. The compatibility of calibrator and gene of interest (GOI) qPCR primers under normal reaction conditions was assessed by plotting ΔCT at 10-fold dilutions of cDNA input and determining the slope of the resultant semi-log regression line; primer efficiencies were deemed compatible if the absolute value of the slope was less than 0.1. Validated calibrator and GOI primer sequences are listed in Additional file 1: Table S1D. Miracidia and in vitro-cultivated primary sporocysts were washed with artificial pond water and CBSS, respectively, followed by extraction of total RNA and immediate preparation of first-strand cDNA as above. It should be noted that RNA integrity was not routinely assessed prior to cDNA synthesis (as per MIQE guidelines [97]) due to limited raw RNA yields; however, integrity in select samples was visually inspected via electrophoretic fractionation. Also, raw and DNA-free RNA concentrations were estimated using a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), and only samples exhibiting A260:A280 and A260:A230 ratios >1.8 were processed for inclusion in qPCR analyses.
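The primer compatibility check described above can be illustrated with a minimal sketch. It fits an ordinary least-squares line to ΔCT against log10 of cDNA input and applies the |slope| < 0.1 criterion. The function names and the CT values below are hypothetical and are not data from this study.

```python
import math

def delta_ct_slope(dilutions, ct_goi, ct_cal):
    """Least-squares slope of dCT (GOI minus calibrator) versus log10(cDNA input)."""
    xs = [math.log10(d) for d in dilutions]
    ys = [g - c for g, c in zip(ct_goi, ct_cal)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def primers_compatible(dilutions, ct_goi, ct_cal, threshold=0.1):
    """Apply the criterion from the text: |slope| must be below the threshold."""
    return abs(delta_ct_slope(dilutions, ct_goi, ct_cal)) < threshold

# Hypothetical 10-fold dilution series (relative cDNA input) and CT values.
dilutions = [1.0, 0.1, 0.01, 0.001]
ct_goi = [22.0, 25.4, 28.7, 32.1]
ct_cal = [18.0, 21.3, 24.7, 28.0]

slope = delta_ct_slope(dilutions, ct_goi, ct_cal)
compatible = primers_compatible(dilutions, ct_goi, ct_cal)
```

A slope near zero indicates that the GOI and calibrator amplify with similar efficiency across the dilution range, which is the condition under which the comparative CT method is valid.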
Real-time qPCR reactions (50 μL/rxn) were performed in triplicate using an ABI 7300 Real-Time PCR System (Applied Biosystems), with reaction mixtures comprising 1× SYBR Green PCR Master Mix (Applied Biosystems), 20 ng RNA input-equivalents of parasite cDNA and gene-specific primers (100 nM each forward and reverse for GMD, GMER and GFT; 200 nM each for GroES and ATPsf). Cycling parameters included an initial denaturation at 95°C for 10 min followed by 40 cycles of 95°C for 15 sec and 60°C for 1 min. Amplification fidelity was confirmed by post-cycling thermal dissociation and agarose gel fractionation of qPCR products. The geometric mean of ATPsf and GroES CT values was used to normalize GOI CT values such that $\Delta C_T = C_{T,\mathrm{GOI}} - C_{T,\mathrm{GeoMean(ATPsf,\,GroES)}}$, and ΔCT values were compared across three independent biological replicates using iterative heteroscedastic two-sample t- and Wilcoxon rank sum tests, with significance set at p≤0.05 and p=0.10, respectively. It should be noted that the nonparametric Wilcoxon rank sum test lacks statistical power when sample size is low (e.g., n=3) and a p-value of 0.10 is acceptable in the current analyses.
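As an illustration of the normalization above, the following sketch computes ΔCT against the geometric mean of the calibrator CT values and converts a ΔΔCT into a fold change. The fold-change step assumes roughly 100% amplification efficiency, which is the usual assumption of the comparative CT method rather than something stated here. The last function shows why p=0.10 is the relevant threshold at n=3: with three replicates per group and no ties, it is the smallest two-sided p-value an exact rank sum test can return. All names and numbers are illustrative.

```python
import math

def delta_ct(ct_goi, ct_calibrators):
    """dCT = CT(GOI) minus the geometric mean of the calibrator CT values."""
    log_sum = sum(math.log(c) for c in ct_calibrators)
    geo_mean = math.exp(log_sum / len(ct_calibrators))
    return ct_goi - geo_mean

def fold_change(dct_sample, dct_reference):
    """Relative abundance 2^-(ddCT); assumes ~100% primer efficiency."""
    return 2.0 ** -(dct_sample - dct_reference)

def min_two_sided_rank_sum_p(n1, n2):
    """Smallest attainable two-sided p of the exact rank sum test (no ties)."""
    return 2.0 / math.comb(n1 + n2, n1)

# Hypothetical CT values: GOI = 24.0, calibrators = 16.0 and 25.0.
example_dct = delta_ct(24.0, [16.0, 25.0])
# Hypothetical stage comparison: dCT 3.0 in one stage versus 4.0 in the other.
example_fold = fold_change(3.0, 4.0)
floor_p = min_two_sided_rank_sum_p(3, 3)
```

With three replicates per stage there are only 20 possible rank arrangements, so even complete separation of the two groups gives p = 2/20.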

Antibody purification using blotted recombinant GMD and GMER proteins
To reduce nonspecific binding and cross-reactivity of GMD and GMER chicken IgY antibodies, 200 μg purified GMD/GMER protein was fractionated in 12.5% polyacrylamide gel and electroblotted for 1.5 h at 100 mA onto 0.2 μm nitrocellulose (Bio-Rad Laboratories) using a TE 77 Semi-Dry Transfer Unit (Hoefer, San Francisco, CA, USA). Following transfer, the membrane-immobilized proteins were visualized with Ponceau S stain (Sigma-Aldrich), and bands were excised by razorblade. After destaining, the protein-bearing membrane strips were blocked overnight at 4°C with 5% nonfat dry milk in tris-buffered saline (TBS: 20 mM Tris, 150 mM NaCl, pH 7.5), rinsed two times with TBS containing 0.05% Tween® 20 (Thermo Fisher Scientific) (TBST), incubated overnight at 4°C with 10 mL crude pre-immune or gene-specific chicken IgY and washed three times with TBST. Finally, bound antibodies were eluted twice by incubation for 10 min in 5 mL 0.1 M Glycine-HCl (pH 2.7), with eluates being immediately neutralized with 400 μL 2 M tris (pH 8.0) followed by dialysis in PBS overnight at 4°C using a 7K MWCO Pierce Slide-A-Lyzer® Dialysis Cassette (Thermo Fisher Scientific). This antibody isolation procedure was repeated twice more using the same antigen-bound strips. Dialyzed eluates were combined and concentrated ~250-fold with a 9K MWCO Pierce Protein Concentrator (Thermo Fisher Scientific), and stored for later use at 4°C in 50% glycerol.
Preparation of cytosolic, membrane/organelle, nuclear and cytoskeletal protein fractions from larval S. mansoni
Subcellular fractionation of miracidia, primary sporocysts and mixed-sex adult worms was performed using a modification of the ProteoExtract® Subcellular Proteome Extraction Kit (EMD Chemicals) protocol, which was originally optimized for use with mammalian cell/tissue samples. Parasites were gently washed four times with artificial pond water (miracidia), CBSS (sporocysts) or mammalian PBS (adults), followed by two washes with Calbiochem® Wash Buffer (kit component). After the final wash, the parasites were pelleted by centrifugation for 1 min at 300 g and 4°C, resuspended in 1.5 mL Extraction Buffer I containing 1× protease inhibitor cocktail (PIC) (kit components), and gently agitated for 10 min at 4°C on a LABQUAKE® Rotatory shaker (Barnstead/Thermolyne, Dubuque, IA, USA). The parasite residua were pelleted by centrifugation for 10 min at 1100 g and 4°C, and the supernatant (cytosolic fraction, F1) was transferred to a clean tube on ice. Residua were then resuspended in 1.5 mL Extraction Buffer II containing 1× PIC (kit components) and incubated 30 min at 4°C on the rotary shaker. Following centrifugation for 10 min at 6500 g and 4°C, the supernatant (membrane/organelle fraction, F2) was placed on ice. Parasite residua were resuspended again in 0.75 mL Extraction Buffer III containing 1× PIC and 562.5 U Benzonase® (kit components), and suspensions were incubated on the rotary shaker for 10 min at 4°C. The insoluble material was pelleted by centrifugation for 10 min at 8200 g and 4°C, and the supernatant (nuclear fraction, F3) was set aside on ice. Finally, the residua were resuspended in 0.75 mL Extraction Buffer IV containing 1× PIC (kit components) and incubated for 30 min at room temperature on the rotary shaker. Insoluble cell debris was pelleted for the last time by centrifugation at 8200 g and room temperature and the final fraction (cytoskeletal fraction, F4) was set on ice. All fractions were then dialyzed in PBS overnight at 4°C using 6-8K MWCO D-Tube™ Dialyzers (EMD Chemicals) and concentrated ~15-fold with a Microcon® YM-10 Centrifugal Filter Device (Millipore).

Results and discussion
Composition, genomic organization, and splicing of schistosome GDP-L-fucose synthesis and transport genes
An exhaustive homology-based search of the Schistosoma mansoni Database (SchistoDB; [35]) using a diversity of previously characterized GDP-L-fucose synthesis- and transport-associated enzymes (see Tables 1-2) identified three homologs in the schistosome genome, herein termed GMD, GMER and GFT (genes and corresponding SchistoDB annotations listed in Table 3). GMD and GMER putatively constitute a complete de novo pathway for GDP-L-fucose synthesis. No homologs of salvage pathway-associated genes (Fuk, FPGT, Fkp, FKGP) were identified, suggesting that GDP-L-fucose synthesis in S. mansoni occurs only by de novo conversion of GDP-D-mannose. Unlike Caenorhabditis and Arabidopsis, which encode multiple paralogs of GMD and GMER [11,49], only one homolog of each gene occurs in S. mansoni. In addition to known Golgi-associated GFTs, search queries included the ER-resident transporter Efr, which imports GDP-L-fucose donor substrates for consumption by ER-associated protein O-FucTs in Drosophila. These searches failed to identify a homologous ER-type GFT in S. mansoni despite the previous finding that schistosomes express two putative ER-resident protein O-fucosyltransferases [3]. Notably, Ishikawa et al. [72] observed that Drosophila Golgi- and ER-resident GFTs (Gfr and Efr, respectively) function redundantly in the O-fucosylation of Notch receptor, suggesting the existence of two pathways for supplying GDP-L-fucose to ER-resident protein O-FucTs. Therefore, a second, ER-type GFT may not be necessary for O-fucosylation in S. mansoni.
To confirm mRNA expression of GMD, GMER and GFT in S. mansoni and obtain full-length CDSs, transcript sequences were RT-PCR and RACE-amplified from miracidial, primary sporocyst and adult worm cDNAs. Complete nucleotide sequences were submitted to GenBank at NCBI (accession numbers in Tables 1-2). While GMD and GMER sequence data generally validate the corresponding SchistoDB predictions, the data indicate that annotation Smp_155830 erroneously combines a portion of the GFT CDS with an upstream gag-pol polyprotein-coding gene, which comprises ~65% of the predicted GFT CDS. Mapping sequence data onto the corresponding SchistoDB-derived genomic scaffolds demonstrated that schistosome GMD, GMER and GFT are all multiexonic, with CDSs spanning 10, 6 and 8 exons, respectively (Table 3, Figure 2A).
Alternative splicing was observed for all schistosome GDP-L-fucose synthesis- and transport-associated genes (Figure 2B). Because many of these observations were based on data obtained by RT-PCR and RACE, which target specific sections of each transcript rather than complete CDSs, the relationships among alternative splice events (i.e., whether splice events occur co-dependently in the formation of particular isoforms) are largely unknown. Most modes of alternative splicing were observed, including exon skipping (GMD, GMER and GFT), intron retention (GMD, GMER and GFT), mutual exclusion (e.g., exons 1 and 2 of GFT) and use of alternative splice donor sites (GMD and GMER). An in silico analysis to determine the consequences of alternative splicing revealed that many of these events altered protein coding by introducing a premature termination codon (PTC), forcing a downstream frameshift, or effecting an in-frame deletion or addition. However, additional studies are required to determine the true biochemical effects of these variations.
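The kind of in silico consequence analysis described above can be sketched as follows. This is only an illustrative classifier, not the procedure used in this study: it removes a segment from a coding sequence (as in exon skipping), reports whether the reading frame is preserved, and checks whether a stop codon now appears before the final codon. The function names and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_index(cds):
    """Return the codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def classify_deletion(cds, start, end):
    """Classify the removal of cds[start:end] from a CDS (0-based, end-exclusive).

    Returns (effect, has_ptc), where effect is "frameshift" or
    "in-frame deletion" and has_ptc is True when a stop codon occurs
    before the last complete codon of the variant.
    """
    variant = cds[:start] + cds[end:]
    effect = "in-frame deletion" if (end - start) % 3 == 0 else "frameshift"
    stop = first_stop_index(variant)
    last_codon = len(variant) // 3 - 1
    has_ptc = stop is not None and stop < last_codon
    return effect, has_ptc

# Toy CDS (hypothetical): ATG GCA CTG ACC GCT TAA
toy_cds = "ATGGCACTGACCGCTTAA"
in_frame = classify_deletion(toy_cds, 3, 6)   # removes one whole codon
shifted = classify_deletion(toy_cds, 3, 4)    # removes a single nucleotide
```

In the toy example, deleting one nucleotide shifts the frame so that TGA is read as the third codon, which is the PTC scenario discussed in the text.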
In eukaryotes, alternative splicing is often an important source of phenotypic complexity, which is driven by splice-mediated expansion of the proteome, posttranscriptional gene regulation (e.g., introduction of a PTC that leads to nonsense-mediated decay) and alteration of cis-regulatory elements that control mRNA translation efficiency, stability and localization (reviewed by [98]). Additionally, in many biological systems, alternative splicing is an important mechanism of modulating physiological activity during development, differentiation and stress responses, and such developmentally regulated alternative splicing has been well documented in S. mansoni (e.g., [99][100][101]). While a comprehensive investigation of splice variation in the context of parasite development was beyond the scope of the present study, the data feature multiple examples of variant splice events that potentially modulate GMD, GMER and GFT expression. For instance, the observed splice-mediated introduction of PTCs and frameshifts could target the affected GMD, GMER and GFT transcripts for nonsense-mediated decay, and developmental regulation of these processes could yield stage- and/or tissue-specific GDP-L-fucose synthesis and transport activities. Moreover, this could affect FucT activity in the Golgi and ultimately determine the developmental expression of fucosylated glycotopes.

In silico characterization of schistosome GMD, GMER, and GFT
To provide support for their putative roles in GDP-L-fucose synthesis and transport, the predicted amino acid sequences of schistosome GMD, GMER and GFT were compared against previously characterized homologs of other organisms, and proteins were examined for the presence of key primary sequence elements. GMDs and GMERs of other organisms are cytosolic soluble enzymes of the short-chain dehydrogenase/reductase (SDR) gene family and feature a Rossmann dinucleotide-binding domain (reviewed by [9]; also see references in Table 1). Amino acid alignment of schistosome GMD and GMER to functionally [...] (Figures 3 and 4). In pairwise comparisons, schistosome GMD shares ~53-61% of its primary sequence with homologs (61.2% identical to Bacteroides Gmd), while schistosome GMER is ~25-62% identical to its homologs (61.7% identical to human FX).

Table 3 Genomic organization of GDP-D-mannose-4,6-dehydratase (GMD), GDP-4-keto-6-deoxy-D-mannose-3,5-epimerase-4-reductase (GMER) and a GDP-L-fucose transporter (GFT) in Schistosoma mansoni
Both GMD and GMER alignments demonstrated the presence of a well-conserved glycine-rich phosphate-binding loop (GxxGxxG; alignment positions 57-63 in GMD; positions 23-29 in GMER), which is key to water-mediated hydrogen bonding between the Rossmann fold of redox-associated enzymes and the pyrophosphate of dinucleotide enzyme cofactors (e.g., NAD+/NADP+) [102], and both enzymes feature the catalytically important SDR-associated [S/T]-Y-K triad (alignment positions 187, 211 and 215 in GMD; positions 126, 155 and 159 in GMER) [103][104][105]. Additionally, schistosome GMER features conserved cysteine and histidine residues (C-128, H-198) that are thought to be involved in proton exchange between GMER and its epimerization reaction intermediates [105]. Finally, analyses using the Simple Modular Architecture Research Tool (SMART; [106]) and Phobius transmembrane topology and signal peptide prediction server [107] demonstrated that schistosome GMD and GMER lack either a transmembrane domain [...]

[...] Table 1). Alignment position is indicated above each block, and sequence length is reported to the right of each line. Positions exhibiting greater than 80% conservation are highlighted in gray, and identities are identified in black. A well-conserved glycine-rich phosphate-binding loop (GxxGxxG), which is key to water-mediated hydrogen bonding between the Rossmann folds of redox-associated enzymes and the pyrophosphates of dinucleotide enzyme cofactors (e.g., NAD+/NADP+) [102], is underlined. Also, the catalytically important [S/T]-Y-K triad common to members of the SDR family of enzymes is indicated by asterisks [103,104]. Vector NTI Advance 11.0 software alignment settings: BLOSUM45 matrix, gap opening penalty = 12, gap extension penalty = 0.1, gap separation penalty range = 0, no residue-specific or hydrophobic residue gaps.

Previous studies have demonstrated that GFTs are generally Golgi-resident multispan transmembrane proteins with 10 TMDs [27,28,30,70]. Moreover, these genes feature a high degree of conservation across invertebrate and vertebrate taxa. Protein alignments of schistosome GFT with functionally characterized orthologs from Caenorhabditis, Drosophila, Mus and humans (see Table 2) revealed 25.2% overall identity, with pairwise comparisons indicating that schistosome GFT shares ~37-41% identity with orthologous GFTs (Figure 5). A unique feature of the schistosome protein is its conspicuously long C-terminal tail; however, the significance of this extension remains unknown. An analysis of membrane topology using the Phobius server suggested the presence of 10 tightly spaced TMDs, with both N- and C-terminal tails oriented into the cytoplasm (Additional file 2: Figure S1). For comparison, the positions of the 10 TMDs in known GFTs were also determined using the Phobius server, and alignment with schistosome GFT demonstrated that the spacing of TMDs is roughly conserved across taxa (Figure 5).
Altogether, these data support a role for schistosome GFT in GDP-L-fucose transport.
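Pairwise identity values such as those quoted above depend on how the denominator is defined. The sketch below computes percent identity over alignment columns in which neither sequence has a gap. That convention is an assumption made for illustration, since the convention applied by the alignment software used in this study is not stated here, and the two aligned fragments are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

# Hypothetical aligned fragments: 6 ungapped columns, 5 of them identical.
identity = percent_identity("MKT-LLG", "MRTALLG")
```

Counting gapped columns in the denominator, or dividing by the length of the shorter sequence, would give lower values for the same alignment, so identity figures from different tools are not always directly comparable.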
Alternative splice isoforms of GFT that exclude exons 7 and 8 (and the intervening intron) encode a truncated protein featuring 7 TMDs. Importantly, nucleotide-sugar transporters (NSTs), including the GFTs, are part of a diverse drug/metabolite transporter superfamily composed of multispan transmembrane proteins (typically with 4-10 TMDs) that function in drug export, nutrient/metabolite efflux and compartmental metabolite exchange [108,109]. While the observed in-frame deletion of three TMDs likely abolishes GDP-L-fucose transport activity (given it lacks definitive primary sequence characteristics), the truncated GFT could retain its function as an NST (but with altered substrate specificity) or adopt a new class of metabolite transport function altogether. Future studies should assess the biochemical significance of this truncation.

[...] Table 1). Alignment position is indicated above each block, and sequence length is reported to the right of each line. Positions of identity are indicated in black, and gray-highlighted positions are greater than 80% conserved among the sampled sequences. A well-conserved glycine-rich phosphate-binding loop (GxxGxxG), which mediates the binding of dinucleotide cofactors (e.g., NAD+/NADP+) by redox-associated enzymes [102], is underlined. Additionally, the catalytically important [S/T]-Y-K triad of the short-chain dehydrogenase/reductase-type enzymes is indicated by asterisks (*), and residues thought to be involved in proton exchange between GMER and its epimerization reaction intermediates are marked with carets (^) [105]. Vector NTI Advance 11.0 software alignment settings: BLOSUM45 matrix, gap opening penalty = 12, gap extension penalty = 0.1, gap separation penalty range = 0, no residue-specific or hydrophobic residue gaps.
The above topological analyses employed several transmembrane prediction tools (e.g., TMHMM 2.0 and TMpred [110,111]), but Phobius was the only one that predicted all 10 TMDs in most genes. Only TMD 9 of the human GFT was undetected using this method. Lubke et al. [27] reported similar difficulty in demonstrating this same TMD, which they attributed to its unusually high hydrophilicity. In general, in silico predictions of NST membrane topology are inherently difficult because current algorithms do not account for the relative thinness of the Golgi membrane (~20% thinner) and thus fail to recognize the concomitantly short TMDs of Golgi-resident transmembrane proteins such as NSTs [112]. In fact, at a typical length of 17-22 aa, the TMDs of Golgi proteins are on average five aa shorter than those of plasma membrane-associated proteins [113][114][115].
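Among the primary sequence elements examined in this section, the glycine-rich GxxGxxG loop of GMD and GMER is simple enough to locate with a pattern search. The sketch below is a generic illustration using Python's standard `re` module and is not the tool used in this study. The lookahead makes overlapping matches visible, and the example sequence is a made-up fragment, not a schistosome protein.

```python
import re

# Lookahead so that overlapping GxxGxxG occurrences are all reported.
GLY_LOOP = re.compile(r"(?=(G..G..G))")

def find_glycine_loops(protein):
    """Return 1-based (start, end) positions of every GxxGxxG match."""
    return [(m.start() + 1, m.start() + 7) for m in GLY_LOOP.finditer(protein)]

# Hypothetical fragment with a single glycine-rich loop at residues 8-14.
loops = find_glycine_loops("MKRALITGITGQDGSY")
```

A match of this pattern alone does not establish a dinucleotide-binding fold; as in the text, it is one line of evidence to be read together with the alignment and the [S/T]-Y-K triad.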

Phylogenetic analysis of nucleotide-sugar transporters
Primary sequence identity alone cannot reliably predict substrate specificity among NST genes [54,112]. NSTs can share as much as 50-60% of their primary sequences and exhibit different substrate specificities, while proteins that are only 20% identical can transport the same nucleotide-sugar substrates [63]. However, previous studies have demonstrated that phylogenetic analyses can separate NSTs into functional groups [108,116]. To refine the predicted substrate specificity of schistosome GFT and better understand the structure-function relationship between GFTs and other NSTs, we conducted a phylogenetic analysis of schistosome GFT and a functionally diverse sampling of previously characterized NSTs. The topology of the resultant phylogeny is consistent with observations by Martinez-Duncker et al. [108]: the current repertoire of NSTs can be divided into three main families/groups (NST families 1-3), which form separate monophyletic clades (Figure 6; see Additional file 3: Figure S2 for a rooted tree demonstrating the three NST families). Consistent with the notion that closely related NSTs can be functionally divergent, all three families include members with aberrant substrate specificities. While structure-function relationships in families 1 and 2 remain somewhat unclear, NST family 3 can be broken down into four daughter clades (subfamilies J-M in [108]) that track substrate specificity. Subfamily J includes NSTs exhibiting multispecific UDP-sugar transport activities, while subfamilies K, L and M feature NSTs having relatively narrow substrate specificities (GDP-D-mannose, UDP-D-galactose or GDP-L-fucose, respectively). Of the 18 previously characterized family 3 NSTs examined here, only LPG2 of Leishmania donovani features uncharacteristic activity for its clade, transporting GDP-L-fucose and GDP-D-arabinose in addition to GDP-D-mannose. Schistosome GFT forms a monophyletic clade with known Golgi-resident GFTs, supporting a predicted role in GDP-L-fucose transport. Notably, Drosophila Efr, which delivers GDP-L-fucose to the ER, clusters with NST family 2. This is consistent with other NST family 2 transporters that function in the ER and not the Golgi. Indeed, Martinez-Duncker et al. [108] reported that 54% of NST family 2 members feature a C-terminal di-lysine (KKxx) ER-retention/retrieval signal, and one such signal (KKVE) is present in Drosophila Efr. In contrast, similar ER-retention/retrieval signals do not exist in schistosome GFT or any of the family 3 NSTs examined here.
Figure 5 Amino acid alignment of GDP-L-fucose transporters. The predicted amino acid sequence of schistosome GFT (Sm) is compared to GFTs of Caenorhabditis elegans (Ce), Drosophila melanogaster (Dm), Mus musculus (Mm) and humans (Hs) (accession numbers in Table 2). Alignment position is indicated above each block, and sequence length is reported to the right of each line. Positions of identity and positions exhibiting at least 80% conservation are highlighted in black and gray, respectively. The positions of ten well-conserved TMDs (underlined below sequences and alignment blocks) were determined using the Phobius transmembrane topology and signal peptide prediction server [107]. Vector NTI Advance 11.0 software alignment settings: BLOSUM45 matrix, gap opening penalty = 12, gap extension penalty = 0.1, gap separation penalty range = 0, no residue-specific or hydrophobic residue gaps.
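Screening a protein for the C-terminal di-lysine (KKxx) ER-retention/retrieval signal discussed in this section amounts to inspecting its last four residues. The following is a minimal sketch of such a check, not the procedure used in the study; the function name and the placeholder sequences are hypothetical, and only the KKVE tetrapeptide is taken from the text.

```python
def has_dilysine_signal(protein_seq):
    """True if the last four residues match KKxx (x = any residue)."""
    tail = protein_seq.strip().upper()[-4:]
    return len(tail) == 4 and tail.startswith("KK")


# Hypothetical C-terminal fragments; only the final four residues matter.
# "KKVE" is the tetrapeptide reported in the text for Drosophila Efr; the
# residues preceding it are placeholders, not real sequence.
print(has_dilysine_signal("AAAAKKVE"))  # ends in KKVE
print(has_dilysine_signal("AAAAKVEK"))  # lysines not at positions -4 and -3
```

Because the motif is position-specific, the check is anchored to the C-terminus rather than searching the whole sequence for "KK".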

GMD, GMER and GFT mRNA expression in miracidia and primary sporocysts of S. mansoni
Given recent data demonstrating the abundant expression of fucosylated glycotopes in snail-associated schistosome larvae [7,117] and their predicted immunomodulatory roles in snail hosts, GMD, GMER, and GFT steady-state transcript levels were assayed by qPCR in miracidia and 2- and 10-day in vitro-cultivated primary sporocysts. The results indicate that all three genes are differentially expressed during the miracidium-to-primary sporocyst transformation and subsequent cultivation (Figure 7). In conjunction with larval transformation, GMER and GFT transcript levels declined 48% and 31%, respectively, after two days in culture, while GMD expression remained unchanged. During subsequent in vitro cultivation of primary sporocysts (up to 10 days), GMD transcript abundance climbed ~4-fold while expression of GMER and GFT stayed the same. These results are somewhat confounding since GMD and GMER constitute a single biosynthetic pathway.
Figure 6 Phylogenetic tree of nucleotide-sugar transporters. The amino acid sequences of previously characterized NSTs and putative GFT of Schistosoma mansoni (RefSeq/GenBank accession numbers in Table 2) were included in a phylogeny annotated with substrate specificity data. Posterior probabilities are indicated at each node, and genetic divergence (substitutions per site) is represented by the scale bar. Family 3 NSTs [108], which include transporters of GDP-L-fucose, are labeled on the right. To demonstrate the topology of NST families 1 and 2, these data are also presented as a rooted tree (Additional file 3: Figure S2).
For comparison, the S. mansoni Serial Analysis of Gene Expression (SAGE) Database [96] was examined for relevant SAGE tags, and tags 7188 and 10882 corresponding to GMD and GMER, respectively, were identified. Consistent with the present study, the SAGE data indicate that GMD transcript abundance increases ~3-fold from miracidia to 6-day primary sporocysts while the GMER-specific tag 10882 was not detected in either larval stage. Interestingly, both genes exhibited peak expression in 20-day primary sporocysts, suggesting that GDP-L-fucose synthesis potentially increases in older larvae. GFT transcript expression (as indicated by tag 4514) followed a similar profile, with relatively low transcript levels in miracidia and 6-day primary sporocysts and peak expression after 20 days in culture. In the present study, if sporocyst cultivation times had been longer, the expression of all three genes might have peaked similarly in older larvae (i.e., >10 days in culture).
The GMDs in bacteria participate in several overlapping synthetic pathways, with reaction intermediates being converted to GDP-L-fucose, GDP-D-rhamnose or GDP-D-talose by GMER, GDP-6-deoxy-d-lyxo-hexos-4-ulose-4-reductase (RMD) and GDP-6-deoxy-D-talose synthetase (GTS), respectively (reviewed in [118]). Additionally, GMDs of Paramecium bursaria Chlorella virus 1 and some bacteria (e.g., Pseudomonas aeruginosa) are bifunctional, having the added ability to catalyze the same stereospecific reduction as RMD. A similar, still unknown dual functionality or involvement in other biochemical pathways in schistosomes could explain why the observed GMD and GMER expression profiles vary independently; however, participation of GMD in GDP-D-rhamnose or GDP-D-talose biosynthesis in particular is unlikely because rhamnose and talose, as well as homologs of RMD and GTS, are not observed in S. mansoni.
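The transcript comparisons in this section are expressed relative to miracidia (set at 1) and, according to the Figure 7 legend, were tested with a heteroscedastic two-sample t-test. The sketch below shows how such a relative value maps onto a percent change and how a Welch (unequal-variance) t statistic is computed. The replicate values are hypothetical and are not data from this study; the function names are likewise assumptions, and the p-value lookup is omitted.

```python
from math import sqrt
from statistics import mean, variance


def percent_change(relative_abundance):
    """Percent change of an abundance expressed relative to a reference of 1."""
    return (relative_abundance - 1.0) * 100.0


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical relative abundances from three biological replicates
# (illustrative numbers only).
miracidia = [1.00, 1.10, 0.90]
sporocysts_2d = [0.50, 0.55, 0.51]

t_stat, dof = welch_t(miracidia, sporocysts_2d)
print(percent_change(mean(sporocysts_2d)))
print(t_stat, dof)
```

A relative abundance of 0.52 corresponds to a 48% decline, and a value of 4 corresponds to a 4-fold (300%) increase, which is how the percentages and fold changes quoted above relate to the same scale.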

Recombinant GMD and GMER protein expression, purification, and antibody production
To facilitate analyses of protein expression in larval S. mansoni, GMD and GMER were heterologously expressed and purified, and the recombinant proteins were used to raise GMD- and GMER-specific chicken IgY antibodies (Additional file 4: Figure S3A-B). To assess specificity, antibodies were tested against blotted crude parasite extracts and pure GMD and GMER recombinant antigens. Initially, immunoblots revealed unacceptable levels of crossreactivity (especially between anti-GMD IgY and recombinant GMER; Additional file 4: Figure S3C), so antibodies were further purified by membrane adsorption against the purified antigens. Subsequent immunoblots demonstrated that antigen specificities of both IgY preparations were greatly improved, showing essentially monospecific reactivities (Additional file 4: Figure S3D). Membrane-isolated antibodies were used in downstream immunoblot and microscopic analyses.

Characterization of GMD and GMER protein expression in miracidia and primary sporocysts of S. mansoni
Multiple attempts were made to demonstrate the presence of GMD and GMER in crude adult and larval extracts using western blotting, but only faint bands were produced (Peterson, unpublished data). To enhance detection of GMD and GMER in western blot analyses and concurrently demonstrate their cytosolic localization, 2-day primary sporocysts were serially extracted using a ProteoExtract® Subcellular Proteome Extraction Kit, yielding enriched cytosolic, membrane, nuclear, and cytoskeletal protein fractions. While application of the ProteoExtract® kit for subcellular fractionation of whole schistosome larvae has yet to be experimentally validated regarding the fidelity of differential extraction, Coomassie-stained gels clearly demonstrated compositional differences in the resultant protein fractions, and fractionation successfully facilitated detection of GMD and GMER in subsequent immunoblots (Figure 8A). Consistent with their expected roles in cytosolic GDP-L-fucose synthesis, immunoblots revealed the presence of GMD and GMER only in the presumptive cytosolic fraction (bands at 38 and 35 kDa, respectively).
Figure 7 GDP-L-fucose synthesis- and transport-associated gene transcription in larvae of Schistosoma mansoni. Real-time qPCR was used to examine GMD, GMER and GFT transcription in miracidia (Mir) and 2- and 10-day in vitro-cultivated primary sporocysts (2dS and 10dS, respectively). Transcript abundances in primary sporocysts were compared to miracidia (arbitrarily set at 1), and data were analyzed across three biological replicates using heteroscedastic two-sample t- and Wilcoxon rank sum tests, with significance set at p≤0.05 (indicated by *) and p=0.10 (indicated by †), respectively.
In a comparison of cytosolic extracts from miracidia and 2- and 10-day primary sporocysts, GMD and GMER proteins appear to be stably expressed during larval transformation and subsequent in vitro cultivation (Figure 8B). This result seemingly contradicts qPCR and SAGE data described above, which indicate stage-specific differences in GMD and GMER transcript levels among snail-associated larvae. One possible explanation for the apparent discrepancy between transcript and protein abundances is the inability of qPCR and SAGE approaches to adequately differentiate between "functional" GMD/GMER-coding transcripts and variants that are pretranslationally targeted for nonsense-mediated decay or are translated to truncated proteins not detected by the above methods. For example, while GMD gene transcription appears to increase ~4-fold in 10-day in vitro-cultivated primary sporocysts, the absolute abundance of "functional" GMD-coding transcripts might remain unchanged, thus resulting in no detectable alteration in protein expression. Additionally, protein turnover rates may be sufficiently low to permit persistence and stable detection regardless of declining transcript abundance (e.g., GMER). Lastly, it should be noted that colorimetric precipitation-mediated detection of immunoreactive proteins is perhaps inadequate for the demonstration of relatively minor differences in protein abundance, and application of more quantitative detection methods (e.g., fluorescence) might have revealed low-level stage-specific variations in GMD and GMER expression that mirror the observed changes in gene transcription.
[...] Figure S3) were used in western blot analyses of 2-day in vitro-cultivated primary sporocyst subcellular protein extracts (A). Four enriched fractions were examined: (F1) cytosol, (F2) membrane/membrane organelle, (F3) nucleus, and (F4) cytoskeleton. Blots indicate the presence of GMD and GMER only in the cytosolic fraction. Next, cytosolic fractions were used to compare GMD and GMER protein expression among mixed-sex adults, miracidia and 2- and 10-day in vitro-cultivated primary sporocysts (B). In both experiments, total protein was visualized in-gel by Coomassie staining.
Immunoblots also examined GMD and GMER expression in mixed-sex adult worms (Figure 8B). Cytosolic extracts were seemingly devoid of immunoreactive GMD, suggesting differential expression between adults and larvae. Additionally, adult extracts featured two anti-GMER IgY-reactive bands, one corresponding to GMER and a second at ~42 kDa. The added band potentially represents the translated product of an adult-specific alternative splice isoform; however, none of the observed variants can account for the increased protein size. Alternatively, the band may be an artifact of antibody crossreactivity. That adult worms apparently lack GMD while expressing one or more GMER isoforms is confounding, given that the two enzymes act in the same biosynthetic pathway. One possible explanation is that GMER or an alternative protein isoform has an unknown role in a separate pathway, which drives its expression independent of GMD.
Finally, the membrane-purified antibodies were employed in confocal laser scanning microscopy to demonstrate the tissue localization of GMD and GMER proteins in miracidia and 2- and 10-day primary sporocysts (Figure 9). Both proteins were observed predominantly in the ciliated epidermal plates and tegument of miracidia and sporocysts, respectively, while antibodies exhibited at least minor reactivities in internal somatic tissues. Similar patterns of expression in schistosome larvae were observed for several prominent fucosylated glycotopes, including Fucα1-3GalNAcβ1-4GlcNAc (F-LDN) and Fucα1-3GalNAcβ1-4(Fucα1-3)GlcNAc (F-LDN-F) [7]. Importantly, colocalization of schistosome GMD and GMER implies the presence of a complete de novo pathway for GDP-L-fucose synthesis, and further supports their roles in fucosylation.
Figure 9 Localization of de novo GDP-L-fucose synthesis in miracidia and primary sporocysts of Schistosoma mansoni. Confocal laser scanning microscopy was used to assess the localization of schistosome GMD and GMER in miracidia and 2- and 10-day in vitro-cultivated primary sporocysts. Fixed and permeabilized larvae were immunostained with membrane-purified chicken IgY raised against schistosome GMD and GMER (rows 2 and 4, respectively), and antibody reactivities were assessed relative to control larvae incubated with membrane-adsorbed pre-immune eluates (rows 1 and 3). Panels include paired micrographs depicting GMD/GMER expression (green) alone and merged with counterstained actin (e.g., muscles, flame cells; red) and DNA (e.g., nuclei; blue). Approximate scale is represented in the lower right corner (bar = 50 μm).

Conclusions
The present study used a genome-wide, homology-based bioinformatics approach to identify GDP-L-fucose synthesis- and transport-associated genes in the human blood fluke Schistosoma mansoni. The above data indicate that GDP-L-fucose in S. mansoni is generated in the cytosol by a de novo synthetic pathway comprising GMD and GMER enzymes, after which the resulting activated fucose is imported into the Golgi by the multispan transmembrane protein GFT. Importantly, these enzymes represent a bottleneck in the fucosylation process since GDP-L-fucose is the sole nucleotide-sugar donor utilized by Golgi- and ER-resident FucTs. This research has provided a necessary foundation for future investigations that further explore the role of GDP-L-fucose synthesis and transport in schistosome development and immunobiology. Additionally, the genes identified in this study are potential targets for the development of novel anti-schistosomal chemotherapeutics.

Additional files
Additional file 1: Table S1. This PDF document contains Additional file 1: Table S1, which lists the oligonucleotide primers used for RT-PCR, RACE, qPCR and protein expression in this study.
Additional file 2: Figure S1. This TIF document contains Additional file 2: Figure S1, which features the results of an in silico analysis of GFT membrane topology. Transmembrane domains were identified in the schistosome GFT protein using the Phobius transmembrane topology and signal peptide prediction server [107]. The Phobius output suggested 10 TMDs, a number that is consistent with GDP-L-fucose transporters of other organisms [27,28,30,70] (also see Figure 5) (A). A model based on this output was constructed, portraying the arrangement of the 10 TMDs (numbers indicating the amino acid boundaries of each TMD) as well as the most likely orientation for schistosome GFT within the Golgi membrane (B).
Additional file 3: Figure S2. This TIF document contains Additional file 3: Figure S2, which features a rooted phylogenetic tree of nucleotide-sugar transporters (see Figure 6 for detailed unrooted tree). The amino acid sequences of NSTs with previously characterized substrate specificities were obtained from RefSeq and GenBank databases at NCBI (accession numbers in Table 2). A tree was constructed using Bayesian methods implemented in MrBayes v3.12 with mixed amino acid evolutionary models. Monophyletic clades representing NST families 1-3 [108] are indicated, and genetic divergence (substitutions per site) is represented by the scale. The tree is rooted on NST family 2.
Additional file 4: Figure S3. This TIF document contains Additional file 4: Figure S3, which describes heterologous expression and isolation of recombinant schistosome GMD and GMER proteins and downstream affinity purification of GMD- and GMER-specific polyclonal chicken IgY. GST-GMD and -GMER fusion constructs were created in pGEX-6P-1 vector, and the encoded proteins were expressed in E. coli. Fusion protein expression in induced (Ind) and uninduced (Un) cultures was compared by SDS-PAGE fractionation and Coomassie staining of soluble cellular extracts (A). Fusion protein-containing extracts were passed through a GST-affinity column, and bound GMD and GMER were eluted by PreScission™ Protease-mediated cleavage of the GST fusions. Eluates were then analyzed by SDS-PAGE fractionation and Coomassie staining (B). Polyclonal chicken IgY antibodies were raised against recombinant GMD and GMER proteins, and the resultant antibodies were tested by immunoblotting the pure recombinant antigens (C). Due to crossreactivity among the antibodies and antigens (especially between anti-GMD IgY and recombinant GMER), antibodies were affinity-purified by membrane adsorption using bound GMD and GMER antigen. Following elution, antibody preparations were again tested against blots of pure antigen, demonstrating greatly reduced crossreactivity (D).