Beta carbonic anhydrases: novel targets for pesticides and anti-parasitic agents in agriculture and livestock husbandry

Background: The genomes of many insect and parasite species contain beta carbonic anhydrase (β-CA) protein coding sequences. The lack of β-CA proteins in mammals makes them interesting target proteins for inhibition in the treatment of some infectious diseases and the control of pests. Many insects and parasites are important agricultural pests and cause enormous economic damage worldwide. Meanwhile, pollution of the environment by old pesticides, emergence of strains resistant to them, and their off-target effects are major challenges for agriculture and society.
Methods: In this study, we analyzed a multiple sequence alignment of 31 β-CAs from insects, some parasites, and selected plant species relevant to agriculture and livestock husbandry. Using bioinformatics tools, a phylogenetic tree was generated and the subcellular localizations and antigenic sites of each protein were predicted. Structural models for β-CAs of Ancylostoma caninum, Ascaris suum, Trichinella spiralis, and Entamoeba histolytica were built using Pisum sativum and Mycobacterium tuberculosis β-CAs as templates.
Results: Six β-CAs of insects and parasites and six β-CAs of plants are predicted to be mitochondrial and chloroplastic, respectively, and thus may be involved in important metabolic functions. All 31 sequences showed the presence of the highly conserved β-CA active site sequence motifs, CXDXR and HXXC (C: cysteine, D: aspartic acid, R: arginine, H: histidine, X: any residue). We found that these two motifs are predicted to be more antigenic than other regions. Homology models suggested that these motifs are mostly buried and thus not well accessible for recognition by antibodies.
Conclusions: The predicted mitochondrial localization of several β-CAs and the hidden antigenic epitopes within the protein molecule suggest that they may not be considered major targets for vaccines. Instead, they are promising candidate enzymes for small-molecule inhibitors, which can easily penetrate the cell membrane. Based on current knowledge, we conclude that β-CAs are potential targets for development of small-molecule pesticides or anti-parasitic agents with minimal side effects on vertebrates.

fish, beneficial insects, and non-target plants [3]. The extensive use of pesticides, such as dichlorodiphenyltrichloroethane (DDT), in recent decades has led to their recurrent detection in many surface and ground waters [4]. As a result of these negative consequences, natural products have become popular among consumers [5].
By the 1960s, pesticide resistance had already evolved in some key greenhouse pests, prompting the development of alternative methods of management. The pressure to reduce insecticide usage was reinforced by the adoption of bumblebees for pollination within greenhouses [6]. Biological control plays a central role in the production of many greenhouse crops. The term "biopesticide" encompasses a broad array of microbial pesticides, including biochemicals derived from micro-organisms and other natural sources, and those resulting from the incorporation of DNA into various agricultural commodities [7]. Bacteria, fungi, viruses, entomopathogenic nematodes (EPNs), and herbal essential oils are often used as biopesticides [8]. Novel approaches to pest control involve targeting specific insect and parasite enzymes, using either chemical or biological compounds. Acetylcholinesterase (AChE) of the malaria mosquito (Anopheles gambiae) has been reported as a target site for pesticides [9]. Three pesticides, atrazine, DDT, and chlorpyrifos, have been shown to affect the general esterase (GE), glutathione S-transferase (GST), cytochrome P450 monooxygenase (P450), and acetylcholinesterase (AChE) activities of Chironomus tentans (an aquatic midge) [4]. Proteinases serving as insect digestive enzymes are defined targets in pest control [10]. Enzyme inhibitors such as piperonyl butoxide (PB), a mixed-function oxidase (MFO) inhibitor; triphenyl phosphate (TPP), a carboxylesterase (CarE) inhibitor; and diethyl maleate (DEM), a glutathione S-transferase (GST) inhibitor, have been used to inhibit insect enzymes [11]. Inhibition of Plasmodium falciparum carbonic anhydrase (CA) with aromatic heterocyclic sulfonamides was investigated in 2011 [12]. In another study, a thiabendazole sulfonamide showed potent inhibitory activity against both mammalian and nematode α-CAs [13].
Five independently evolved classes of CAs (α, β, γ, δ, and ζ) have been identified, of which one or more are found in nearly every cell type, underscoring the general importance of this ubiquitous enzyme in nature [14]. The CAs are involved in several important biological processes, such as respiration and transportation of CO2 and bicarbonate between metabolizing tissues, pH and CO2 homeostasis, electrolyte secretion in different organs, bone resorption, calcification, tumorigenicity, and some biosynthetic reactions including gluconeogenesis, lipogenesis, and ureagenesis [15]. Since 1990, many demonstrated and putative β-CAs have been discovered not only in photosynthetic organisms, but also in eubacteria, yeast, archaeal species [16] and 18 metazoan species [17]. Recently, we reported 52 β-CAs in metazoan and protozoan species [18]. At least one study has shown the effects of β-CA inhibitors as anti-infective agents on different bacterial and fungal pathogens [19], yet this approach has not been tested in vivo in metazoans or protozoans. In this article, we introduce β-CAs as novel potential target enzymes to control agricultural and veterinary insects and parasites which cause enormous economic losses worldwide.

Methods
Identification of putative β-CA enzymes and multiple sequence alignment (MSA)
In total, 23 parasite and 8 plant β-CA sequences relevant to agriculture and livestock husbandry, or serving as model organisms, and one bacterial sequence (Desulfosporosinus meridiei) were retrieved from UniProt (http://www.uniprot.org/) and NCBI (http://www.ncbi.nlm.nih.gov/). The full list of agriculture and livestock husbandry pests and plants containing β-CA addressed in this research is shown in Table 1. We focused on 98 amino acid residues around the catalytic active site of all tested β-CAs, starting 7 amino acid residues before the first highly conserved sequence (CXDXR). The Clustal Omega algorithm [20] within the Jalview program (version 2.8.0b1) (http://www.jalview.org/) was used to create a multiple sequence alignment (MSA) [21].
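The window selection described above (98 residues, starting 7 residues before the first CXDXR motif) can be illustrated with a minimal Python sketch. The function name `active_site_window`, the regular expression, and the toy sequence are assumptions made for illustration. They are not part of the original workflow, which relied on Jalview and Clustal Omega.

```python
import re

# CXDXR: Cys, any residue, Asp, any residue, Arg (first conserved beta-CA motif)
CXDXR = re.compile(r"C.D.R")

def active_site_window(seq, upstream=7, length=98):
    """Return the window that starts `upstream` residues before the first
    CXDXR motif, or None if the motif is absent (hypothetical helper)."""
    match = CXDXR.search(seq)
    if match is None:
        return None
    start = max(0, match.start() - upstream)
    return seq[start:start + length]

# Toy sequence (not a real beta-CA): 10 filler residues, the motif, more filler.
toy = "MSTLKAGNPE" + "CSDSR" + "A" * 120
window = active_site_window(toy)
print(len(window), window[:12])
```

In this toy case the motif begins at index 7 of the returned window, because the window opens 7 residues upstream of it. A sequence without the motif yields `None` rather than an arbitrary slice.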

Phylogenetic analysis
All sequences were individually analyzed for completeness and quality. The β-CA sequence for Solenopsis invicta (UniProt ID: E9IP13) was determined to have a spurious exon when the genomic sequence was analyzed by the Exonerate program using the other β-CA proteins as query sequences, and subsequently 17 amino acids were removed [49]. Similarly, the full genome of Acyrthosiphon pisum was analyzed. Of the three Acyrthosiphon pisum β-CA sequences identified in UniProt, two were incomplete (UniProt IDs: C4WVD8 and J9JZY3) and found to be fragments of the same complete protein predicted in our analysis (Acyrthosiphon pisum BCA-2). Finally, the full genome of Ichthyophthirius multifiliis was scanned for β-CA proteins using the same method, and two new putative β-CA proteins were identified (Ichthyophthirius multifiliis BCA-3 and BCA-4).
A protein sequence alignment was created using Clustal Omega [20], and the corresponding nucleotide sequences were then codon-aligned on that basis by the Pal2Nal program [50]. Using the Desulfosporosinus meridiei bacterial sequence as an outgroup, a phylogenetic analysis was computed using MrBayes v3.2 [51] with the GTR model of codon substitution and all other parameters set to default. In total, 200,000 generations were computed with a final standard deviation of split frequencies of 3.33 × 10⁻⁴. The final phylogenetic tree was visualized in FigTree (http://tree.bio.ed.ac.uk/software/figtree/).

Table 1 (partial; the preceding rows are missing from this text)
... | Fish and fish farming [32]
Lepeophtheirus salmonis | Salmon louse | Parasite living on wild salmon and fish farming | Fish and fish farming [25]
Necator americanus | New World hookworm | Necatoriasis in dog, cat, and human (zoonosis) | Human and animal health [33]
Solenopsis invicta | Red imported fire ant (RIFA) | Mound-building activity, damage to plant roots leading to loss of crops, and interference with mechanical cultivation | Wooden instrument industries and consumers, and gardening [34]
Tribolium castaneum | Red flour beetle | Pest of stored grain products, carcinogenic by secretion of quinones, causative agent of occupational IgE-mediated allergy and some other diseases | Wheat, flour, cereal and nut based food industries [35][36][37][38]
Trichinella spiralis | Pork worm | Trichinosis in rat, pig, bear and human (zoonosis) | Pig breeding [39]
Trichoplax adhaerens | Adherent hairy plate | Adherence to the walls of marine aquariums | Aquarium and ornamental fishing industry [40]
Arabidopsis thaliana | Mouse-ear cress | - | A popular model organism in plant biology and genetics [41]
Pisum sativum | Pea | - | Pea is most commonly the small spherical seed or the seed-pod [42]
Gossypium hirsutum | Upland cotton | - | Upland cotton is the most widely planted species of cotton [43]
Nicotiana tabacum | Tobacco | - | Its leaves are commercially processed into tobacco [44]
Vitis vinifera | Grape vine | - | Commercial significance for wine and table grape production [45]
Solanum tuberosum | Potato | - | The world's fourth-largest food crop, following maize, wheat and rice [46]
Populus trichocarpa | Black cottonwood or California poplar | - | A model organism in plant biology [47]
Capsella rubella | A genus from Mustard family | - | A member of Mustard family [48]
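The codon-alignment step in the phylogenetic analysis (threading nucleotide sequences onto the protein alignment) can be sketched as follows. This is a simplified illustration of the general idea, not the Pal2Nal implementation: it assumes the coding sequence is in frame, starts at the first aligned residue, and contains no stop codon or frameshift. The function name `codon_align` is hypothetical.

```python
def codon_align(aligned_protein, cds):
    """Thread codons from an ungapped coding sequence onto one gapped row of
    a protein alignment: each residue takes the next 3 nucleotides and each
    gap '-' becomes '---'. No translation check is performed (assumption)."""
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is too short for the protein row")
    out = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)

# Toy example: protein row "MC-DR" and its coding sequence ATG TGT GAT CGT.
row = codon_align("MC-DR", "ATGTGTGATCGT")
print(row)
```

Keeping codons intact in this way preserves the reading frame across the alignment, which a codon-aware phylogenetic analysis requires.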

Prediction of subcellular localization
Subcellular localization of each identified invertebrate β-CA was predicted using the TargetP webserver (http://www.cbs.dtu.dk/services/TargetP/). TargetP is built from two layers of neural networks: the first layer contains one dedicated network for each type of targeting sequence (chloroplast transit peptide, mitochondrial targeting peptide, or secretory signal peptide), and the second layer is an integrating network that outputs the actual prediction (cTP = chloroplast, mTP = mitochondrial, SP = secretory pathway, or other). It discriminates between cTPs, mTPs, and SPs with sensitivities and specificities higher than those obtained with other available subcellular localization predictors [52].

Prediction of antigenic sites in β-CA
The protein sequences of 23 parasite and 8 plant β-CAs were analyzed with the European Molecular Biology Open Software Suite (EMBOSS) program Antigenic (http://emboss.bioinformatics.nl/cgi-bin/emboss/antigenic). EMBOSS Antigenic predicts potentially antigenic regions of a protein sequence, using the method of Kolaskar and Tongaonkar [53]. Application of this method to a large number of proteins has shown that its accuracy is better than that of most of the known methods [54][55][56].
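The general idea behind propensity-based antigenic site prediction can be sketched as a sliding-window average over per-residue scores. The Python code below is a simplified illustration, not the EMBOSS Antigenic implementation. The window size of 7, the fixed threshold of 1.0, and the function name are assumptions. The propensity values in `toy_scale` are hypothetical placeholders, not the published Kolaskar and Tongaonkar scale.

```python
def antigenic_regions(seq, scale, window=7, threshold=1.0, min_len=6):
    """Flag residues whose windowed mean propensity exceeds `threshold` and
    return runs of at least `min_len` flagged residues as
    (start, end, subsequence) tuples, 0-based and end-exclusive."""
    half = window // 2
    flagged = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        # Residues missing from the scale get a neutral score of 1.0.
        mean = sum(scale.get(aa, 1.0) for aa in seq[lo:hi]) / (hi - lo)
        flagged.append(mean > threshold)
    regions, start = [], None
    for i, flag in enumerate(flagged + [False]):  # sentinel closes last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                regions.append((start, i, seq[start:i]))
            start = None
    return regions

# HYPOTHETICAL propensities for illustration only (not the published table).
toy_scale = {"C": 1.4, "V": 1.4, "L": 1.25, "G": 0.85}
toy_seq = "GGGGGGGG" + "CVLCVLCV" + "GGGGGGGG"
regions = antigenic_regions(toy_seq, toy_scale)
print(regions)
```

With a scale in which cysteine scores highly, stretches rich in cysteine tend to be flagged. The `min_len=6` default mirrors the minimum antigenic region length of 6 residues reported in this study.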

Homology modelling
Homology models of four selected β-CAs, FC551456 (Ancylostoma caninum), F1LE18 (Ascaris suum), E5SH53 (Trichinella spiralis), and C4LXK3 (Entamoeba histolytica), were prepared by first selecting the most suitable template structure. For this purpose, a BLAST search of the PDB database (http://www.rcsb.org/pdb/home/home.do) was performed using each of the four sequences. Three of the four searches identified PDB structure 1EKJ (β-CA from Pisum sativum) as the most similar sequence, while PDB ID 2A5V (β-CA from Mycobacterium tuberculosis) was the most similar to C4LXK3 (Entamoeba histolytica). Clustal Omega was used to prepare a multiple sequence alignment of those six sequences. The alignment showed nine completely conserved residues, among them the known highly conserved CXDXR and HXXC motifs (data not shown). Homology modelling was performed on the basis of a multiple sequence alignment containing FC551456 (Ancylostoma caninum), F1LE18 (Ascaris suum), E5SH53 (Trichinella spiralis), and PDB 1EKJ, using the Modeller program (version 9.13) [57] with PDB model 1EKJ (β-CA from Pisum sativum) as a template. A homology model for C4LXK3 (Entamoeba histolytica) was prepared using PDB 2A5V for pairwise alignment and as a template structure. The resulting models were structurally aligned using the BODIL program [58]. A figure illustrating the homology models was prepared using the VMD program (version 1.9.1) [59] and edited in Adobe Photoshop (version 13.0.1).
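Counting completely conserved residues in a multiple sequence alignment, as reported above for the template and target sequences, amounts to finding columns in which every row carries the same non-gap residue. A minimal Python sketch follows. The function name and the toy alignment are hypothetical and do not reproduce the actual six-sequence alignment.

```python
def conserved_columns(alignment):
    """Return (index, residue) for each column in which all rows share the
    same non-gap residue. Rows must have equal length."""
    lengths = {len(row) for row in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for i, column in enumerate(zip(*alignment)):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            hits.append((i, column[0]))
    return hits

# Toy alignment (not real beta-CA sequences) containing CXDXR and HXXC.
toy_alignment = [
    "MCSDSRVAHTGC",
    "-CADARL-HYDC",
    "ACMDSRIPHEGC",
]
conserved = conserved_columns(toy_alignment)
print(conserved)
```

In the toy alignment the conserved columns are exactly the fixed positions of the two motifs (C, D, R and H, C), while the X positions vary.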
The structural availability of the epitope in the PDB model 1EKJ (β-CA from Pisum sativum) and the homology model based on the β-CA sequence from Ancylostoma caninum was studied by preparing the molecular surface with VMD, using a probe radius of 1.4 Å. The potential epitope residues were excluded from the surface presentation and were shown as Van der Waals (VdW) spheres.

Results

Multiple sequence alignment (MSA)
The MSA of 23 parasite and 8 plant β-CA sequences revealed the presence of the highly characteristic conserved sequence motifs CXDXR and HXXC (C: cysteine, D: aspartic acid, R: arginine, H: histidine, X: any residue) in all sequences. These results verify the presence of the β-CA enzyme in several insects and parasites which are pathogenic to various species of plants and animals and are thus considered relevant to agriculture and livestock husbandry (Figure 1).

Phylogenetic analysis
The results of the phylogenetic analysis of DNA sequences encoding 23 parasite and 8 plant β-CAs are shown in Figure 2. The resulting tree shows four distinct clades, three of which represent distinct potential β-CA targets. From the top, the first clade contains β-CAs of invertebrate pests, the second contains plant model organisms, the third consists entirely of the four β-CAs of Ichthyophthirius multifiliis, and the final clade contains three species of amoeba. The Entamoeba spp. sequences occupy a midpoint between the bacterial outgroup and the other sequences.

Prediction of subcellular localization
The results of subcellular localization prediction of β-CAs in selected parasite and plant species are shown in Table 2. The predictions were based on the analysis of full-length β-CA protein sequences. The Name column gives both the UniProt ID and the scientific name of the species. The results reveal that 6 of 23 β-CAs from parasites (XP_004537221.1, B0WKV7, U6PDI1, E5SH53, B3S5Y1, and predicted BCA2 in A. pisum) were predicted to have a mitochondrial localization signal; 6 of 8 β-CAs of plants (P17067, Q8LSC8, P27141, D7TWP2, I2FJZ8, and B9GHR1) were predicted to have a chloroplastic localization.

Prediction of antigenic sites in β-CA
Based on the accepted 3-85 residue range of epitope length in an antigen [60] and the default parameters of the EMBOSS Antigenic program, the minimum length of an antigenic region in this set of β-CAs is 6 amino acid residues. The predictions of antigenic sites in the 31 β-CA proteins are shown in Table 3; the highest score corresponds to the most antigenic site.

Homology modelling
Homology models of four selected β-CAs verified the predicted localization of conserved residues in the active site. Two loop regions showed high variability in sequence length, which is apparent in Figure 3C, D and indicated by "*" and "**". In addition, homology modelling suggested an insertion located within the longest α-helix in the homology models based on 1EKJ (Figure 3C, indicated by "***").
To study the molecular availability of the predicted main antigenic epitope, the surface exposure of the epitope in PDB model 1EKJ (β-CA from Pisum sativum) and in the homology model based on the β-CA sequence from Ancylostoma caninum was studied by visualizing the molecular surface (Figure 4). The analysis revealed that the majority of the epitope was buried within the structure. The residues considered to be mainly buried in the structure are shown in green, while solvent-exposed residues are shown in red. Two residues in PDB model 1EKJ (β-CA from Pisum sativum) appear considerably smaller than their counterparts in the Ancylostoma caninum-based homology model, and those residues can be considered only partially exposed (Figure 4, indicated in yellow in the alignment). Taken together, these results indicate that the predicted epitope sequence is mainly buried in β-CA structures.

Discussion
Several insect, parasite, and plant genomes contain genes which encode β-CA enzymes. Some of these parasites and insects are either causative agents or vectors of important veterinary, fish farming, and zoonotic diseases (Table 1). For this analysis we selected 31 β-CAs, 23 from parasites and 8 from plants. These sequences were retrieved from protein databases or predicted from their genomes. There was also an orchard invasive dipteran fruit fly (Ceratitis capitata) and three pests of wood industries: Camponotus floridanus, Dendroctonus ponderosae, and Solenopsis invicta.
Our MSA of β-CAs in plants, parasites, and insects showed that they all contain the first (CXDXR) and second (HXXC) highly conserved sequences of β-CA. The presence of β-CA proteins in various insects and parasites and their absence in mammals suggests that these enzymes could be potential targets for the development of novel pesticides or anti-parasitic drugs with minimal side effects on vertebrates. A key requirement for such novel β-CA inhibitors is high isoform specificity. The distinction among β-CA proteins elucidated in the phylogenetic tree indicates that inhibitors can be created which would target β-CAs specific to different groups of species, leaving those in other species, such as plants, unaffected. Unfortunately, design of highly specific inhibitors will require proper structural data based on protein crystallography. Thus far, β-CA crystal structures from only a few different species are available in the PDB database (http://www.rcsb.org/pdb/home/home.do), including some algae, bacteria, archaea, yeast, and the plant Pisum sativum [61].
Table 3 footnote: The italic and bolded residues represent the first (CXDXR) and second (HXXC) highly conserved sequences in the catalytic active sites of the enzyme whenever present in the predicted epitope. *HitCount means the total number of antigenic residues in the whole sequence of one protein or antigen.
Our prediction results on the subcellular localization of β-CAs showed that 6 of 23 β-CAs from parasites (XP_004537221.1, B0WKV7, U6PDI1, E5SH53, B3S5Y1, and predicted BCA2 in A. pisum) are probably mitochondrial enzymes. It is well known that several pesticides have unwanted side effects because of their off-target impacts on mitochondria [62]. Blocking of β-CAs in insect and parasitic cells can affect mitochondrial metabolic cycles and possibly eradicate the pathogens. Figure 5 presents 14 categories of known α- and/or β-CA inhibitors, which are able to inhibit catalytic activity of these enzyme families [63,64]. As a result, inhibition of CA activity would slow down some cellular biochemical pathways in parasites and insects, such as gluconeogenesis, nucleotide biosynthesis, fatty acid synthesis, gastrointestinal function, neuronal signaling, respiration, and reproduction. In plants and algae, it is known that β-CAs are required for CO2 sequestration within the chloroplast, and therefore CA inhibition would affect the rate of photosynthesis [65]. Importantly, β-CA inhibition in fungi and Drosophila melanogaster revealed completely different inhibition profiles [17], suggesting that β-CAs of parasites and insects can be inhibited with higher affinity than plant CAs by applying the right inhibitors and concentrations.

Figure 4
Determination of the availability of the predicted epitope. The molecular surface of the homology model of β-CA from Ancylostoma caninum is shown as solid grey and the target epitope sequence was excluded from the surface presentation. The epitope residues exposed to solvent are shown as red VdW spheres and numbered, while buried residues are shown with green spheres. An alignment containing PDB 1EKJ and the corresponding sequence from Ancylostoma caninum predicted β-CA is shown. The numbering of the residues in the alignment is according to the Ancylostoma caninum sequence. The yellow residues in the alignment indicate partially buried structure.

Another important goal is to find inhibitors that are specific for β-CAs and do not affect α-CAs at all. This would first require detailed structural data on selected parasite and insect CAs. The resolved structures would then allow high throughput screening of chemical compounds, identification of the most promising inhibitor molecules, and testing of potential compounds in vitro and in vivo.
Vaccination would offer another option to develop antiparasitic treatments based on β-CAs. In our study we used computational antigen prediction tools, which have been developed to reduce the laboratory work required to identify important antigenic epitopes in pathogenic proteins [66]. The Protegen database (http://www.violinet.org/protegen/) has been used to identify a number of predicted antigens from bacteria, viruses, parasites and fungi, which are involved in immune responses against various infectious and non-infectious diseases [67]. Antigenic site prediction of β-CA of parasites and plants revealed that the first and second highly conserved sequences (CXDXR and HXXC) represent the most plausible antigenic sites of β-CAs. Because these epitopes are located in the region of the active site and are mainly buried (Figure 4), they show very limited promise as vaccine targets. Furthermore, most β-CAs are intracellular proteins which are not readily accessible for immunological recognition. Taking all of these results together, small molecule inhibitors should still be considered the first option when β-CAs are investigated as therapeutic target proteins.

Figure 5
Effects of 14 CA inhibitors on α- and β-CAs of parasites and insects. Some compounds inhibit members of both α- and β-CA enzyme families. The brown box shows physiological processes where bicarbonate plays a role as a biochemical substrate. The ultimate goal of future research should be the creation of inhibitors specific to both enzyme families and to each isozyme. Ideally, the specific inhibitors would cause tissue- and organ-specific effects in parasites and vectors with minimal off-target effects on other species. Number 1 shows the catalytic pathway of α- and β-CA and number 2 shows the inhibitory effects of α- and β-CA inhibitors.

Conclusions
The present work is the first study to discuss the potential role of β-CAs as target proteins for pesticides and anti-parasitic agents in agriculture and livestock husbandry. Our results could have a significant impact on the development of novel pesticides, which would directly benefit both food and forest industries. This is important because pests impose significant costs on agricultural, horticultural, and livestock husbandry production through production losses [68]. Since β-CA sequences are not present in the genomes of vertebrates, the possible off-target effects in humans and other vertebrates should be minimal if high isozyme specificity is achieved. Discovery and validation of a new generation of β-CA inhibitors as pesticides and anti-parasitic agents would be a novel research field for the chemical and pharmaceutical industries, with the aim of improving safe nutrition and general health in societies.