Pipeline for the identification and classification of ion channels in parasitic flatworms
- Bahiyah Nor†1,
- Neil D. Young†1,
- Pasi K. Korhonen1,
- Ross S. Hall1,
- Patrick Tan2, 3,
- Andrew Lonie4 and
- Robin B. Gasser1
© Nor et al. 2016
Received: 9 February 2016
Accepted: 5 March 2016
Published: 16 March 2016
Ion channels are well characterised in model organisms, principally because of the availability of functional genomic tools and datasets for these species. This contrasts with the situation, for example, for parasites of humans and animals, whose genomic and biological uniqueness means that many genes and their products cannot be annotated. As ion channels are recognised as important drug targets in mammals, the accurate identification and classification of parasite channels could provide major prospects for defining unique targets for designing novel and specific anti-parasite therapies. Here, we established a reliable bioinformatic pipeline for the identification and classification of ion channels encoded in the genome of the cancer-causing liver fluke Opisthorchis viverrini, and extended its application to related flatworms affecting humans.
We built an ion channel identification and classification pipeline (called MuSICC), employing an optimised support vector machine (SVM) model and using the Kyoto Encyclopaedia of Genes and Genomes (KEGG) classification system. Ion channel proteins were first identified and grouped according to amino acid sequence similarity to classified ion channels and the presence and number of ion channel-like conserved and transmembrane domains. Predicted ion channels were then classified to subfamily using an SVM model, trained using ion channel features.
Following an evaluation of this pipeline (MuSICC), which demonstrated a classification sensitivity of 95.2 % and accuracy of 70.5 % for known ion channels, we applied it to effectively identify and classify ion channels in selected parasitic flatworms.
MuSICC provides a practical and effective tool for the identification and classification of ion channels of parasitic flatworms, and should be applicable to a broad range of organisms that are evolutionarily distant from taxa whose ion channels are functionally characterised.
Ion channels are pore-forming transmembrane protein complexes, whose functions include generating electrical signals (action potentials) by regulating the flow of ions across the membranes of cells, gating ion flow across epithelial and secretory cells, and governing cell volume. These channels are categorised physiologically based on their gating mechanisms (voltage-gated or ligand-gated) and the types of ions that they transport (e.g., Ca²⁺, Cl⁻, K⁺ and Na⁺) [1, 2]. Given that they have essential and specific roles in a wide range of different cells and that the disruption or mutation of their functions often causes serious disease, ion channels are recognised as valuable targets for drugs for many non-infectious disorders of humans and animals [4, 5].
Ion channel repertoires of some (“model”) organisms, such as Homo sapiens (human) and Caenorhabditis elegans (free-living roundworm), are relatively well defined, because of the availability of extensive genomic, proteomic, functional and other datasets as well as ion channel functional information for these species (e.g., [6–10]). This is not the case for most other organisms, whose biology, biochemistry and physiology are largely unknown and are divergent from those of well-characterised organisms, such as humans and C. elegans. This is particularly the case for eukaryotic pathogens, such as flatworm parasites (phylum Platyhelminthes), which are evolutionarily distinct from “model” species and cause devastating diseases of major proportion in humans and animals around the world.
Table 1 Salient information on parasites chosen for the present study
- Opisthorchiidae (liver fluke)
- Schistosomatidae (blood fluke)
- Urogenital schistosomiasis; squamous cell carcinoma of the bladder
- Cystic echinococcosis or hydatidosis
- Alveolar echinococcosis or hydatidosis
The availability of large genomic datasets and the development of new bioinformatic approaches now make it feasible to classify ion channels using amino acid sequence and/or protein structural similarities. Generic bioinformatic tools, such as BLAST, HMMER and InterProScan, are commonly used for gene annotation [18–20]. Besides these generic tools, some studies [21–23] have delivered algorithms specifically to classify ion channels, and most of them employ machine-learning algorithms trained using ion channel protein sequence data from specialised protein databases, such as IUPHAR (International Union of Basic and Clinical Pharmacology), LIC (ligand-gated ion channel) and VKCDB (voltage-gated potassium channel) [24–26]. Most functionally annotated and curated ion channels in the UniProtKB/SwissProt database are from deuterostomes (e.g., vertebrates) and ecdysozoans (e.g., C. elegans and Drosophila melanogaster). The integration of these data and use of advanced bioinformatics should significantly enhance our ability to explore (identify and classify) ion channels in eukaryotes that are evolutionarily distant from taxa whose ion channels are functionally characterised. To this end, the aim of this study was to establish a bioinformatic pipeline for the reliable identification and classification of ion channels in parasitic flatworms affecting millions of people and animals worldwide (Table 1). Our main focus here was on the cancer-causing (carcinogenic) liver fluke O. viverrini, and we extended the application of the pipeline to related flukes as well as socioeconomically important tapeworm parasites [28, 29] (Table 1).
Three datasets were prepared: (1) The training dataset was established using all classified ion channel and aquaporin sequence data from the KEGG database [30, 31] as well as molluscan ion channel sequences in the UniProtKB/Swiss-Prot database, namely sodium channel protein 1 brain (Q05973; SCN1_HETBL), glutamate receptor (P26591; GLRK_LYMST), gamma-aminobutyric acid receptor subunit beta (P26714; GBRB_LYMST) and FMRFamide-activated amiloride-sensitive sodium channel (Q25011; FANA_HELAS). All UniProt/Swiss-Prot sequences were annotated using the KEGG orthology ion channel K-term of the KEGG entry with the highest sequence similarity, inferred using BLASTp. All human and C. elegans sequences and any sequences with ambiguous amino acid residues (“X”, “B” or “Z”), or that were annotated as “hypothetical” or “putative”, were removed from the training dataset. The training dataset was divided into 48 ion channel subfamily classes and one aquaporin class. Sequence similarity bias was removed from each subfamily class by selecting representative protein sequences of particular groups with >80 % sequence similarity using the CD-HIT program. (2) The test dataset was established using all predicted proteins available for proteomes of human and C. elegans in the KEGG database [30, 31]. (3) The parasite dataset represented amino acid sequences translated from genes of O. viverrini and related flatworms Cl. sinensis (liver flukes), S. haematobium, S. japonicum, S. mansoni (blood flukes), E. granulosus, E. multilocularis and T. solium (tapeworms) (Table 1).
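The residue and annotation filters applied to the training dataset can be sketched as follows. This is a minimal illustration only: the function name `keep_for_training` and the example records are hypothetical, and the removal of human and C. elegans sequences and the CD-HIT clustering step are not covered.

```python
# Ambiguous residue codes and annotation keywords named in the text.
AMBIGUOUS = set("XBZ")
EXCLUDED_WORDS = ("hypothetical", "putative")


def keep_for_training(description: str, sequence: str) -> bool:
    """Return True if a record passes the ambiguity and annotation filters."""
    if AMBIGUOUS & set(sequence.upper()):
        return False  # contains X, B or Z
    lowered = description.lower()
    return not any(word in lowered for word in EXCLUDED_WORDS)


# Hypothetical (description, sequence) records for illustration.
records = [
    ("sodium channel protein", "MKTLLVAG"),
    ("putative potassium channel", "MKTLLVAG"),
    ("glutamate receptor", "MKXLLVAG"),
]
kept = [desc for desc, seq in records if keep_for_training(desc, seq)]
print(kept)
```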
Prediction of ion channel proteins
For the test and parasite datasets, ion channels were predicted based on amino acid sequence similarity searches (Fig. 1). To remove any ‘false-positives’ from these datasets, we initially screened each sequence against the KEGG database using BLASTp (E-value of <10⁻¹⁵), retaining proteins with a best match to an annotated ion channel. For the test dataset, a sequence similarity match to a human or C. elegans sequence in the KEGG database was ignored. Then, the remaining test and parasite dataset proteins were compared (BLASTp, E-value <10⁻⁴⁵) against the training dataset, with sequences similar to training dataset proteins retained as putative ion channel proteins.
For all sequences in each dataset, we identified conserved domains using InterProScan v.5.7.48 and the Pfam database. We curated the Pfam conserved domain accession numbers for individual sequences in the training dataset, to create conserved (C-) domain profiles for individual ion channel subfamilies. These profiles were then used to characterise and group sequences in the test and parasite ion channel datasets, based on the presence or absence of C-domains. Then, we predicted transmembrane (TM-) domains in individual sequences using TMHMM v.2.0 and curated the number of TM-domains predicted from each sequence in the training dataset for each ion channel subfamily. The range of predicted TM-domains for sequences classified in each subfamily was then used to characterise and group sequences in test and parasite ion channel datasets. Finally, we divided putative test and parasite ion channels into four distinct groups according to: sequence similarity to known ion channels, and presence of C-domain and TM-domain(s) (Group 1); similarity, and presence of C-domain, but no TM-domain(s) (Group 2); similarity, and presence of TM-domain(s), but no C-domain (Group 3); similarity, but no C- or TM-domains (Group 4).
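The two-step similarity screen can be illustrated with a short sketch that works on pre-parsed BLASTp hits. It assumes that hits are available as (query, subject, E-value) tuples and that the set of KEGG entries annotated as ion channels is known. The names `best_hits` and `screen` and all identifiers in the example are hypothetical, and the rule of ignoring human or C. elegans matches for the test dataset is not implemented.

```python
from typing import Dict, Iterable, List, Set, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, E-value)


def best_hits(hits: Iterable[Hit], max_evalue: float) -> Dict[str, str]:
    """Map each query to its lowest-E-value subject below max_evalue."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        if evalue >= max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def screen(kegg_hits: Iterable[Hit], training_hits: Iterable[Hit],
           ion_channel_ids: Set[str]) -> List[str]:
    """Keep queries that pass both similarity screens."""
    # Step 1: best KEGG match (E-value < 1e-15) must be an annotated ion channel.
    step1 = {q for q, s in best_hits(kegg_hits, 1e-15).items()
             if s in ion_channel_ids}
    # Step 2: a match to the training dataset with E-value < 1e-45.
    step2 = set(best_hits(training_hits, 1e-45))
    return sorted(step1 & step2)


# Hypothetical hits for three query proteins.
kegg_hits = [("p1", "K_chan", 1e-60), ("p2", "kinase", 1e-80),
             ("p2", "K_chan", 1e-50), ("p3", "K_chan", 1e-20)]
training_hits = [("p1", "train1", 1e-70), ("p2", "train1", 1e-50),
                 ("p3", "train2", 1e-30)]
putative = screen(kegg_hits, training_hits, {"K_chan"})
print(putative)
```

In this example, p2 is rejected because its best KEGG match is not an ion channel, and p3 is rejected because its match to the training dataset does not reach the stricter threshold.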
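The four-group scheme reduces to a small decision rule once domain predictions are available. The sketch below assumes that sequence similarity to known ion channels has already been established and that C-domain and TM-domain presence are supplied as booleans; the function name `assign_group` is hypothetical.

```python
def assign_group(has_c_domain: bool, has_tm_domain: bool) -> int:
    """Return Group 1-4 for a sequence already similar to known ion channels."""
    if has_c_domain and has_tm_domain:
        return 1  # C-domain and TM-domain(s)
    if has_c_domain:
        return 2  # C-domain, no TM-domain(s)
    if has_tm_domain:
        return 3  # TM-domain(s), no C-domain
    return 4      # neither C- nor TM-domains


print(assign_group(True, False))
```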
Construction and testing of support vector machines (SVMs)
Each protein sequence was represented as a string of residues a_1 a_2 … a_n, where n is the length of the sequence and a_i represents the amino acid residue at position i. In total, each sequence was represented as a vector of 475 features, comprising the amino acid composition (20 features), Chou’s pseudo-amino acid composition (λ = 55) and dipeptide frequency (400 features).
The SVMs were constructed using the LIBSVM extension in R v.3.2.0 via the e1071 package. For comparative purposes, five models were constructed using a radial basis kernel, each with different sets of features and kernel parameters that were tuned with five-fold cross-validation. The first model, named ‘Amino’, was built using 20 amino acid frequencies as features; the second model, called ‘Chemistry’, was built using 55 features based on the hydrophobicity, hydrophilicity and side chain-mass. The third model, ‘Chou’, was built using Chou’s pseudo-amino acid composition by combining the 20 amino acid and 55 chemical information features. The fourth model, named ‘Dipeptide’, was built using 400 dipeptide composition features. The last model, ‘Classifier’, was built using all 475 features.
The classification models were validated using five-fold cross-validation, and assessed against the classifications of known ion channel and aquaporin sequences encoded in the human and C. elegans genomes. Receiver operating characteristic (ROC) analysis was conducted to evaluate the performance of each model. For comparative purposes, we also assessed the test dataset using other probabilistic classification methods, including random forest, classification via logistic regression and prior classifier, conducted using established methods [46–48]. Using the best-performing classification models, confusion matrices were constructed to further evaluate each model and compare their performance based on the final table of confusion. For the final model, the average classification probability values for individual subfamilies in the test dataset were computed; these probability values were utilised to classify the ion channels predicted from the parasite dataset.
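Two of the three feature blocks, the amino acid composition and the dipeptide frequency, can be computed directly from a sequence. The sketch below is illustrative only: it omits the 55 pseudo-amino acid features, assumes that dipeptide counts are normalised by the number of overlapping residue pairs (n − 1), and uses hypothetical function names and a made-up example sequence.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq: str) -> list:
    """Fraction of each of the 20 standard residues (20 features)."""
    n = len(seq)
    return [seq.count(residue) / n for residue in AMINO_ACIDS]


def dipeptide_frequency(seq: str) -> list:
    """Fraction of each overlapping residue pair (400 features)."""
    total = len(seq) - 1  # number of overlapping pairs; assumes len(seq) >= 2
    counts = {dipeptide: 0 for dipeptide in DIPEPTIDES}
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return [counts[dipeptide] / total for dipeptide in DIPEPTIDES]


# 20 + 400 features; the 55 pseudo-amino acid features are not computed here.
features = amino_acid_composition("MKKLA") + dipeptide_frequency("MKKLA")
print(len(features))
```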
Protein categories were classified based on SVM probability values: Category A proteins had probability values greater than or equal to the subfamily probability threshold. Category B proteins had probability values between 50 % of the subfamily probability threshold and the subfamily probability threshold. Category C proteins had probability values less than 50 % of the subfamily probability threshold. A confidence ranking was given to our ion channel classifications. High confidence classifications included channels in Category A (Groups 1 to 4) and Category B (Groups 1 and 2), which were annotated by SVM subfamily classification. Medium confidence classifications included channels in Category B (Groups 3 and 4), which were annotated by SVM subfamily-classifications and designated with the suffix, “-like” (e.g. GABA-like ion channel). Low confidence classifications included all proteins in Category C (Groups 1 to 4), which represented ion channel-like proteins but could not be confidently assigned to a particular family or subfamily.
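The category and confidence rules can be written as two small functions. This sketch assumes that the lower boundary of Category B (exactly 50 % of the subfamily threshold) is inclusive, which the text does not specify; the function names are hypothetical.

```python
def category(probability: float, threshold: float) -> str:
    """Assign Category A, B or C from an SVM probability and subfamily threshold."""
    if probability >= threshold:
        return "A"
    if probability >= 0.5 * threshold:  # assumed inclusive lower bound
        return "B"
    return "C"


def confidence(cat: str, group: int) -> str:
    """Combine category (A-C) and group (1-4) into a confidence ranking."""
    if cat == "A" or (cat == "B" and group in (1, 2)):
        return "high"
    if cat == "B":  # Groups 3 and 4: annotated with the suffix "-like"
        return "medium"
    return "low"


print(confidence(category(0.4, 0.6), 3))
```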
Training and test datasets
The training dataset consisted of 26,050 classified ion channel and aquaporin sequences (Additional file 1: Table S1). After removing protein sequences from human and C. elegans as well as ambiguous sequences and sequence similarity bias from the dataset, 6299 classified ion channel and aquaporin sequences remained for model construction and training (Additional file 1: Tables S1 and S2). The test dataset consisted of the combined human and C. elegans proteins, including 389 sequences annotated with ion channel and aquaporin K-terms in the KEGG database (Additional file 1: Table S1).
Identification of ion channels
From the test dataset, 657 ion channel-like proteins with sequence similarity (BLASTp, E-value <10⁻¹⁵) to known ion channels in the KEGG database were identified (Additional file 1: Table S3); they included 390 and 267 from humans and C. elegans, respectively, of which 299 human (100 %) and 93 C. elegans (100 %) ion channels were retained. Using a stringent sequence similarity search (BLASTp, E-value <10⁻⁴⁵) against sequences in the training dataset, 344 human and 185 C. elegans sequences were retained (Additional file 1: Table S3), including 299 human (100 %) and 93 C. elegans (100 %) ion channels.
A total of 194 unique Pfam C-domains were detected in 6161 sequences (~97.8 %) of the training dataset, with 88 unique C-domains detected in >75 % of the sequences of 45 ion channel subfamilies (Additional file 2: Figure S1), such as the neurotransmitter-gated ion channel ligand-binding domain (PF02931) in >88 % of the Cys-loop subfamilies. TM-domains were detected in 5774 (~91.7 %) sequences in the training set, with the number of such domains varying from 1 to 22 per protein (Additional file 2: Figure S2), being within the expected range for individual ion channel subfamilies. TMs were not detected in 525 sequences (Additional file 2: Figure S2). Based on sequence similarity, and the presence/absence of conserved and TM-domains, the sequences from the test dataset were divided into Group 1 (n = 443; including 335 known ion channels), Group 2 (57; 44 known ion channels), Group 3 (15; 5 known ion channels) and Group 4 (14; 5 known ion channels). Sequences within individual groups were then subjected to ion channel classification (Additional file 1: Table S4).
Ion channel classifiers
The performance of each of the five SVM models to classify ion channels was assessed using the training dataset. For this purpose, any known non-ion channel sequences were removed. Based on the five-fold cross-validation, training and test accuracies (Additional file 1: Table S5), we concluded that the ‘Dipeptide’ (94.6 % test accuracy) and ‘Classifier’ (95.9 % test accuracy) models out-performed the other three models (Additional file 1: Table S5). Confusion matrices for the ‘Classifier’ and ‘Dipeptide’ models were constructed to further evaluate the models, and to compare their performances based on the final table of confusion (Additional file 1: Table S6); the ‘Classifier’ model recorded the best overall scores (Additional file 1: Table S6).
The performance of the ‘Classifier’ model was evaluated using the complete test dataset (including protein sequences that were not ion-channels) and recorded a sensitivity of 95.2 %, an accuracy of 70.5 % and a specificity of 0 %; this result was expected, as an SVM model had not been trained for protein sequences other than ion channels (i.e. “non-ion channel” sequences). This finding shows the importance of identifying ion channels prior to classifying them.
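A short calculation shows why specificity falls to zero when a classifier has no "non-ion channel" class: every negative sequence is necessarily assigned to some ion channel subfamily, so there are no true negatives. In the sketch below, the 392 known ion channels (299 human plus 93 C. elegans) and the 137 false positives are taken from the text, whereas the count of 373 correctly classified channels is an assumption chosen to be consistent with the reported percentages, not a figure stated in the source.

```python
def metrics(tp: int, fn: int, fp: int, tn: int):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# tp = 373 is an assumed, back-calculated value; fn = 392 - 373.
# tn = 0 because no sequence can be assigned to a non-ion channel class.
sens, spec, acc = metrics(tp=373, fn=19, fp=137, tn=0)
print(round(100 * sens, 1), round(100 * spec, 1), round(100 * acc, 1))
```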
The performance of the SVM classifier and of the other probabilistic classification methods (random forest, classification via logistic regression and prior classifier) was then compared using the test dataset, employing the sorted probability values to construct ROC curves for each classifier (Additional file 2: Figure S3). The area under the curve (AUC) was 0.911 for the SVM ‘Classifier’, 0.9105 for random forest classification, 0.8211 for the logistic regression classifier and 0.6701 for the prior classifier. The SVM and random forest classifiers performed similarly but, because of the high dimensionality of the data, classification via SVM was preferred.
Overall, there was a correlation between the probability values for the test dataset and the correctness of their classification (Additional file 2: Figure S4A). In general, classifications with probability values of ≥0.54 tended to be correct, whereas those with lower probability values tended to be incorrect. When probability values were compared among ion channel subfamilies, the average probability values for each subfamily ranged from ~0.15 to 0.91 (Additional file 2: Figure S4B). Based on these findings, we elected to infer confidence in future classifications made using the SVM classifier by employing the average probability values for individual subfamilies (Additional file 2: Figure S4B), instead of using a single threshold probability value for all ion channel classifications (Additional file 2: Figure S4A). Using the test dataset, we observed higher probability values for proteins identified as Group 1 and Group 2 ion channels (Additional file 2: Figure S5). The majority of ion channels in Groups 3 and 4 had classifier probability values of <0.5 (Additional file 2: Figure S5).
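For readers unfamiliar with the AUC, it can be computed without plotting the ROC curve, as the probability that a randomly chosen correct classification receives a higher score than a randomly chosen incorrect one. The sketch below uses this rank-based formulation; treating each classification as correct (1) or incorrect (0) is an assumption about how the curves were labelled, and the scores are made up for illustration.

```python
def auc(scores, labels) -> float:
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical probability values; label 1 = correct, 0 = incorrect.
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(example_auc)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation of correct from incorrect classifications.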
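The subfamily-specific thresholds described here are simple per-class means. A minimal sketch follows, assuming that test-set predictions are available as (subfamily, probability) pairs; the function name `subfamily_thresholds` and the example labels and values are hypothetical.

```python
from collections import defaultdict


def subfamily_thresholds(predictions) -> dict:
    """Average classification probability per subfamily."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for subfamily, probability in predictions:
        sums[subfamily] += probability
        counts[subfamily] += 1
    return {subfamily: sums[subfamily] / counts[subfamily] for subfamily in sums}


# Hypothetical test-set predictions.
thresholds = subfamily_thresholds(
    [("Cys-loop", 0.9), ("Cys-loop", 0.7), ("TRP", 0.2)]
)
print(thresholds)
```

Using one threshold per subfamily, rather than a single global cut-off, avoids penalising subfamilies whose members are classified correctly but with systematically low probability values.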
Ion channels of Opisthorchis viverrini and other flatworms
Table: Comparison of the numbers of ion channel sequences within each family among humans, C. elegans and representative parasitic flatworms* that were classified with high and medium confidence
Ion channel family:
- Glutamate-gated cation channels
- Epithelial and related channels
- Ryanodine and IP3 receptors
- Voltage-gated ion channels
- Related to voltage-gated ion channels
- Unclassified ion channel-like proteins
Here, we constructed a practical bioinformatic pipeline, designated MuSICC, to both identify and classify known ion channel families/subfamilies by combining three existing tools and an SVM classifier trained using classified ion channel amino acid compositions, Chou’s pseudo-amino acid compositions and dipeptide frequencies. Although previous tools were developed to identify select ion channel groups [21, 23, 50, 51], none of them both identifies and classifies (all) ion channels into families and subfamilies. Here, we focused on developing a pipeline that would identify and classify such ion channels from eukaryotic organisms that are genetically and biologically very distinct from “model” organisms (such as C. elegans, Drosophila and humans, whose ion channels are well-characterised). The phylogenetic positions of parasitic flatworms in the eukaryotic evolutionary tree made them ideal candidates for this study. Moreover, evidence that some flatworms are developing resistance against some of the recommended chemotherapies [13, 14] necessitates the search for new anthelmintics, and ion channels represent promising targets for such drugs [4, 5].
In this study, we first constructed and evaluated the pipeline to identify and classify channels in O. viverrini, a highly significant carcinogenic parasite affecting >8 million people worldwide. Following this evaluation, we then applied this pipeline to datasets for seven other socioeconomically important flatworms (Table 1), and undertook a detailed, comparative analysis. The key to accurate identification and classification was the prediction process. As the SVM models were not trained using non-ion channel sequences (i.e. there is no non-ion channel class), these models are not able to distinguish between ion channel and non-ion channel sequences. Therefore, it is important that the prediction of ion channel sequences (data screening) is accurate. We defined three prediction criteria: (1) significant sequence similarity to known ion channels, (2) presence of ion channel C-domains, and (3) an appropriate number of TM-domains compared with known ion channels.
The sequence similarity (BLASTp) screening steps proved to be effective in filtering out the majority of non-ion channel sequences. In the test dataset, 137 sequences (25.9 %) were incorrectly identified as ion channels. We determined that 32 of the 137 ‘false-positives’ did not encode ion channels but were very similar to the ion channel training sequences, whereas 105 sequences were not annotated using the KEGG database. We compared the annotations of these 105 sequences with those in the UniProtKB and RefSeq databases; 88 of the sequences were putative ion channels/proteins, and 17 were unknown/uncharacterised proteins. Therefore, we are confident that future predictions, based on the thresholds set here, will yield a low number of false-positive results, if any at all.
Although conducting two BLASTp searches may be computationally intensive and somewhat time-consuming, the same result was not achievable by conducting BLASTp only once against either the KEGG database or the training sequences. Proteins that were not ion channels and shared high sequence similarity (BLASTp, E-value <10⁻⁴⁵) with ion channels were first identified and excluded by initially screening against the complete KEGG database and selecting proteins with a match to an ion channel. An additional search of our curated training dataset ensured that false-positive results were minimised, and known ion channels were retained. As the accurate prediction of ion channels is the key to the performance of the present pipeline, we considered the computation time to be less of a priority at this stage.
The application of three existing bioinformatics tools posed some limitations on the present pipeline. First, the pipeline is dependent on the KEGG database and the KEGG Orthology (KO) grouping method. KO grouping provided a hierarchical annotation based on K-terms, which eased the process of predicting ion channel sequences following the first BLASTp step. However, the implementation of the KO grouping method for predicting sequences was restricted to the annotated ion channel genes in the KEGG database. BLASTp analysis against protein databases without an established annotation system would make automation impossible, because manual annotation of ion channel sequences is not feasible as the number of sequences increases. An alternative to the KO annotation is the UniProt Gene Ontology Annotation (UniProt-GOA) database. Second, the bioinformatic pipeline is dependent on the performance of the prediction tools applied: BLASTp, InterProScan and TMHMM 2.0. Based on the present findings, the tools applied here allow the reliable prediction of ion channel sequences. However, the quality of sequences to be identified and classified needs to be high; the use of poor-quality sequences will result in misclassifications.
Two factors were considered crucial in relation to accepting or rejecting the classification made by the SVM classifier. The first was the probability value, computed by the classifier to determine the probability that an unknown sequence belonged to the classified ion channel subfamily, which enabled probability thresholds to be defined for individual subfamilies (Additional file 2: Figure S4). The second factor was the grouping made based on the prediction criteria. There was a close association between grouping and the SVM classifier probability value (Additional file 2: Figure S5A); sequences classified with a probability of >0.8 were usually assigned to Groups 1 and 2, i.e. the sequences that had significant similarity to known ion channels and contained conserved domains of ion channels. Therefore, sequence grouping also provided confidence in the classification of ion channels.
Table: Ion channels of parasitic flatworms described in the published literature and whether they were identified and classified correctly using our bioinformatic pipeline (MuSICC)
- Shaker-related K+ channel
- Nicotinic acetylcholine receptors
- Ca2+ channel beta subunits
- Novel glutamate-gated chloride channel subunits
- Acetylcholine-gated chloride channels
- ATP-sensitive potassium channel
The number of calcium ion channels classified for O. viverrini was higher than for C. elegans and humans. The number of sequences encoding such channels in O. viverrini represents ~19.5 % of the 41 sequences classified with confidence to encode ion channels. This is more than the proportion of calcium ion channels in H. sapiens (~13.7 %), and there was also considerable diversity compared with C. elegans and humans. Although some channels (~40.8 %) are conserved among the three species, others are shared by only two of these organisms. Notably, the acid-sensing ion channels (ASIC) and ATP-gated cation channels (P2X) present in both O. viverrini and humans were absent from C. elegans.
The subsequent classification of ion channels from the seven other species of flatworms (trematodes and cestodes) further reinforced the genetic diversity between these parasites and the two well-characterised “model” organisms. The average probability values, which were lower than the thresholds computed by the SVM classifier, indicated that ion channels of these parasites are distinct from all presently known ion channels, despite being similar to them and containing the C-domains. Furthermore, more than half of the sequences were annotated as “unclassified ion channel-like proteins” based on the low probability values and the absence of ion channel C-domains. Importantly, the bioinformatics pipeline established here is able to identify and classify ion channels (with 95 % accuracy), irrespective of sequence diversity. Nonetheless, it may be possible, in the future, to enhance the performance of the pipeline using structural similarity predictions and by training the SVM classifier using protein sequences other than ion channels, so that it can distinguish ion channels from other proteins. However, this will require additional work, as the training dataset would need to include a substantial number of curated non-ion channel sequences from distinct groups of proteins from many different species of eukaryotes.
The present study delivers a practical and effective bioinformatic pipeline (MuSICC) for both the identification and classification of ion channels in parasitic flatworms of socioeconomic importance. MuSICC should be useful for the selection of high-priority candidates for functional genomic studies and for drug target discovery in parasitic flatworms. In addition, it might guide future investigations of the roles of ion channels in cellular processes and host-parasite interactions. Although applied to parasitic flatworms, the MuSICC pipeline should be applicable to classifying ion channels in a wide range of organisms.
This project was also supported by a Victorian Life Sciences Computation Initiative (grant number VR0007) on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government (R.B.G. and A.L.). Funding from the Australian Research Council, the National Health and Medical Research Council (NHMRC) of Australia, Yourgene Bioscience and Melbourne Water Corporation is gratefully acknowledged (R.B.G. et al.). N.D.Y. holds an NHMRC Career Development Fellowship. P.K.K. is the recipient of a scholarship (STRAPA) from the University of Melbourne.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Hille B. Ion channels of excitable membranes. USA: Sinauer Associates; 2001.
- Ackerman MJ, Clapham DE. Ion channels - basic science and clinical disease. N Engl J Med. 1997;336:1575–86.
- Jentsch TJ, Hübner CA, Fuhrmann JC. Ion channels: Function unravelled by dysfunction. Nat Cell Biol. 2004;6:1039–47.
- Jiang Z, Zhou Y. Using bioinformatics for drug target identification from the genome. Am J Pharmacogenomics. 2005;5:387–96.
- Overington JP, Al-Lazikani B, Hopkins AL. How many drug targets are there? Nat Rev Drug Discov. 2006;5:993–96.
- Coetzee WA, Amarillo Y, Chiu J, Chow A, Lau D, McCormack T, Moreno H, Nadal MS, Ozaita A, Pountney D, et al. Molecular diversity of K+ channels. Ann N Y Acad Sci. 1999;868:233–55.
- Conn PJ, Pin JP. Pharmacology and functions of metabotropic glutamate receptors. Annu Rev Pharmacol Toxicol. 1997;37:205–37.
- Macdonald RL, Olsen RW. GABAA receptor channels. Annu Rev Neurosci. 1994;17:569–602.
- North RA. Molecular physiology of P2X receptors. Physiol Rev. 2002;82:1013–67.
- Strange K. From genes to integrative physiology: Ion channel and transporter biology in Caenorhabditis elegans. Physiol Rev. 2003;83:377–415.
- Riutort M, Álvarez-Presas M, Lazaro E, Sola E, Paps J. Evolutionary history of the tricladida and the platyhelminthes: an up-to-date phylogenetic and systematic account. Int J Dev Biol. 2012;56:5–17.
- Welburn SC, Beange I, Ducrotoy MJ, Okello AL. The neglected zoonoses - the case for integrated control and advocacy. Clin Microbiol Infect. 2015;21:433–43.
- Brennan GP, Fairweather I, Trudgett A, Hoey E, McCoy, McConville M, Meaney M, Robinson M, McFerran N, Ryan L, et al. Understanding triclabendazole resistance. Exp Mol Pathol. 2007;82:104–09.
- Brockwell YM, Elliott TP, Anderson GR, Stanton R, Spithill TW, Sangster NC. Confirmation of Fasciola hepatica resistant to triclabendazole in naturally infected Australian beef and dairy cattle. Int J Parasitol Drugs Drug Resist. 2014;4:48–54.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
- Eddy SR. Profile hidden Markov models. Bioinformatics. 1998;14:755–63.
- Zdobnov EM, Apweiler R. InterProScan - an integration platform for the signature-recognition methods in InterPro. Bioinformatics. 2001;17:847–48.
- Lee N, Chen J, Sun L, Wu SJ, Gray KR, Rich A, Huang MX, Lin JH, Feder JN, Janovitz EB, et al. Expression and characterization of human transient receptor potential melastatin 3 (hTRPM3). J Biol Chem. 2003;278:20890–97.
- MacDonald K, Buxton S, Kimber MJ, Day TA, Robertson AP, Ribeiro P. Functional characterization of a novel family of acetylcholine-gated chloride channels in Schistosoma mansoni. PLoS Pathog. 2014;10:e1004181.
- Scott JG, Warren WC, Beukeboom LW, Bopp D, Clark AG, Giers SD, Hediger M, Jones AK, Kasai S, Leichter CA, et al. Genome of the house fly, Musca domestica L., a global vector of diseases with adaptation to a septic environment. Genome Biol. 2014;15:466.
- Lin H, Ding H. Predicting ion channels and their types by the dipeptide mode of pseudo amino acid composition. J Theor Biol. 2011;269:64–9.
- Liu W-X, Deng E-Z, Chen W, Lin H. Identifying the subfamilies of voltage-gated potassium channel using feature selection technique. Int J Mol Sci. 2014;15:12940–51.
- Saha S, Zack J, Singh B, Raghava GPS. VGIchan: Prediction and classification of voltage-gated ion channels. Geno Prot Bioinfo. 2006;4:253–58.
- Donizelli M, Djite MA, Le Novere N. LGICdb: a manually curated sequence database after the genomes. Nucleic Acids Res. 2006;34:D267–D69.
- Gallin WJ, Boutet PA. VKCDB: voltage-gated K+ channel database updated and upgraded. Nucleic Acids Res. 2010;39:D362–D66.
- Kenakin T. New concepts in pharmacological efficacy at 7TM receptors: IUPHAR Review 2. Br J Pharmacol. 2013;168:554–75.
- Sripa B, Bethony JM, Sithithaworn P, Kaewkes S, Mairiang E, Loukas A, Mulvenna J, Laha T, Hotez PJ, Brindley PJ. Opisthorchiasis and Opisthorchis-associated cholangiocarcinoma in Thailand and Laos. Acta Trop. 2011;120S:S158–S68.
- Eckert J, Schantz PM, Gasser RB, Torgerson PR, Bessonov AS, Movsessian SO AT, Grimm F, Nikogossian MA. Chapter 4: Geographic distribution and prevalence. In: WHO / OIE Manual on Echinococcosis in Humans and Animals: a Public Health Problem of Global Concern. 2001.
- Garcia HH, Moro PL, Schantz PM. Zoonotic helminth infections of humans: echinococcosis, cysticercosis and fascioliasis. Curr Opin Infect Dis. 2007;20:489–94.
- Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30.
- Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014;42:D199–205.
- The UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 2014;43:D204–D12.View ArticlePubMed CentralGoogle Scholar
- Li W, Godzik A. CD-HIT: A fast program for clustering and comparing large sets of proteins or nucleotide sequences. Bioinformatics. 2006;22:1658–59.View ArticlePubMedGoogle Scholar
- Young ND, Nagarajan N, Lin SJ, Korhonen PK, Jex AR, Hall RS, Safavi-Hemami H, Kaewking W, Bertrand D, Gao S, et al. The Opisthorchis viverrini genome provides insights into life in the bile duct. Nat Commun. 2014;5:4378.PubMedPubMed CentralGoogle Scholar
- Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30:1236–40.View ArticlePubMedPubMed CentralGoogle Scholar
- Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, et al. The Pfam protein families database. Nucleic Acids Res. 2014;42:D222–D30.View ArticlePubMedPubMed CentralGoogle Scholar
- Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J Mol Biol. 2001;305:567–80.View ArticlePubMedGoogle Scholar
- Chou KC. Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins. 2001;43:246–55.View ArticlePubMedGoogle Scholar
- Tanford C. Contribution of hydrophobic interactions to the stability of the globular conformation of proteins. J Am Chem Soc. 1962;84:4240–47.View ArticleGoogle Scholar
- Hopp TP, Woods KR. Prediction of protein antigenic determinants from amino acid sequences. Proc Natl Acad Sci U S A. 1981;78:3824–28.View ArticlePubMedPubMed CentralGoogle Scholar
- Shen HB, Chou KC. PseAAC: a flexible web-server for generating various kinds of protein pseudo amino acid composition. Anal Biochem. 2008;373:386–88.View ArticlePubMedGoogle Scholar
- Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol. 2011;2:1–27.View ArticleGoogle Scholar
- R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. [http://www.R-project.org/]
- e1071: Misc Functions of the Department of Statistics (e1071), TU Wien. R package version 1.6-4. [http://CRAN.R-project.org/package=e1071]
- Pollack I, Decker LR. Confidence ratings, message reception, and the receiver operating characteristic. J Acoust Soc Am. 1958;30:286–92.View ArticleGoogle Scholar
- Anderson JA. In: Krishnaiah PR, Kanal LN, editors. Logistic discrimination. In: Handbook of statistics, vol. 2. Amsterdam: North Holland; 1982. p. 169–91.Google Scholar
- Breiman L. Random forests. Mach Learn. 2001;45:5–32.View ArticleGoogle Scholar
- Jain AK, Duin RPW, Mao J. Statistical pattern recognition: A review. IEEE Trans Pattern Anal Mach Intell. 2000;22:4–37.View ArticleGoogle Scholar
- Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97.Google Scholar
- Lin H, Chen W. Briefing in application of machine learning methods in Ion channel prediction. Sci World J. 2015;2015:945927.Google Scholar
- Lin H, Li QZ. Using pseudo amino acid composition to predict protein structural class: Approached by incorporating 400 dipeptide components. J Comput Chem. 2007;28:1463–66.View ArticlePubMedGoogle Scholar
- Giribet G. Assembling the lophotrochozoan (=spiralian) tree of life. Phil Trans R Soc B. 2008;363:1513–22.View ArticlePubMedPubMed CentralGoogle Scholar
- The UniProt Consortium. Activities at the Universal Protein Resource. Nucleic Acids Res. 2014;42:D191–D98.View ArticlePubMed CentralGoogle Scholar
- Tatsutova T, Ciufo S, Fedorov B, O’Neill K, Tolstoy I. RefSeq microbial genomes database: new representation and annotation strategy. Nucleic Acids Res. 2014;42:D553–D59.View ArticleGoogle Scholar
- Huntley RP, Sawford T, Mutowo-Muellenet P, Shypitsyna A, Bonilla C, Martin MJ, O’Donovan C. The GOA database: Gene ontology annotation updates for 2015. Nucleic Acids Res. 2015;43:D1057–D10-63.View ArticlePubMedPubMed CentralGoogle Scholar
- Ikeda T. Effects of blockers of Ca2+ channels and other ion channels on in vitro excystment on Paragonimus ohirai metacercariae induced by sodium cholate. Parasitol Res. 2004;94:329–31.View ArticlePubMedGoogle Scholar
- Greenberg RM. Ca2+ signalling, voltage-gated Ca2+ channels and praziquantel in flatworm neuromusculature. Parasitology. 2005;131:S97–S108.View ArticlePubMedGoogle Scholar
- Mendonca-Silva DL, Novozhilova E, Corbett PJR, Silva CLM, Noel F, Totten MIJ, Maule AG, Day TA. Role of calcium influx through voltage-operated calcium channels and of calcium mobilization in the physiology of Schistosoma mansoni muscle contractions. Parasitology. 2006;133:67–74.View ArticlePubMedGoogle Scholar
- Greenberg RM. Are Ca2+ channel targets of praziquantel action? Int J Parasitol. 2005;35:1–9.View ArticlePubMedGoogle Scholar
- Bentley GN, Jones AK, Agnew A. ShAR2beta, a divergent nicotinic acetylcholine receptor subunit from the blood fluke Schistosoma. Parasitology. 2007;134:833–40.View ArticlePubMedGoogle Scholar
- Dufour V, Beech RN, Wever C, Dent JA, Geary T. Molecular cloning and characterization of novel glutamate-gated chloride channel subunits from Schistosoma mansoni. PLoS Pathog. 2013;9:e1003586.View ArticlePubMedPubMed CentralGoogle Scholar
- Huang Y, Chen W, Wang X, Liu H, Chen Y, Guo L, Luo F, Sun J, Mao Q, Liang P, et al. The carcinogenic liver fluke, Clonorchis sinensis: new assembly, reannotation and analysis of the genome and characterization of tissue transcriptomes. PLoS One. 2013;8:e54732.View ArticlePubMedPubMed CentralGoogle Scholar
- Lun ZR, Gasser RB, Lai DH, Li AX, Zhu XQ, Yu XB, Fang YY. Clonorchiasis: a key foodborne zoonosis in China. Lancet Infect Dis. 2005;5:31–41.View ArticlePubMedGoogle Scholar
- Rollinson D. A wake up call for urinary schistosomiasis: reconciling research effort with public health importance. Parasitology. 2009;136:1593–610.View ArticlePubMedGoogle Scholar
- Young ND, Jex AR, Li B, Liu S, Yang L, Xiong Z, Li Y, Cantacessi C, Hall RS, Xu X, et al. Whole-genome sequence of Schistosoma haematobium. Nat Genet. 2012;44:221–5.View ArticlePubMedGoogle Scholar
- Liu F, Zhou Y, Wang ZQ, Lu G, Zheng H, Brindley PJ, McManus DP, Blair D, Zhang Q, Zhong Y et al. The Schistosoma japonicum genome reveals features of host-parasite interplay. Nature. 2009;460:345–51.View ArticlePubMed CentralGoogle Scholar
- McManus DP, Gray DJ, Li Y, Feng Z, Williams GM, Stewart D, Rey-Ladino J, Ross AG. Schistosomiasis in the People’s Republic of China: the era of the Three Gorges Dam. Clin Microbiol Rev. 2010;23:442–66.View ArticlePubMedPubMed CentralGoogle Scholar
- Berriman M, Hass BJ, LoVerde PT, Wilson RA, Dillon GP, Cerqueira GC, Mashiyama ST, Al-Lazikani B, Andrade LF, Ashton PD, et al. The genome of the blood fluke Schistosoma mansoni. Nature. 2009;460:352–58.View ArticlePubMedPubMed CentralGoogle Scholar
- Colley DG, Bustinduy AL, Secor WE, King CH. Human schistosomiasis. Lancet. 2014;383:2253–64.View ArticlePubMedPubMed CentralGoogle Scholar
- Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, de Silva N, Velarde GS, Anderson TJC, Clark RC, et al. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 2012;6:e1455.View ArticlePubMedPubMed CentralGoogle Scholar
- Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Robes RJ, Fragoso G, Sciutto E, et al. The genomes of four tapeworm species reveal adaptations to parasitism. Nature. 2013;496:57–63.View ArticlePubMedPubMed CentralGoogle Scholar
- Kim E, Day TA, Bennett JL, Pax RA. Cloning and functional expression of a Shaker-related voltage-gated potassium channel gene from Schistosoma mansoni (Trematoda: Digenea). Parasitology. 1995;110(Pt 2):171–80.View ArticlePubMedGoogle Scholar
- Agboh KC, Webb TE, Evans RJ, Ennion SJ. Functional characterization of a P2X Receptor from Schistosoma mansoni. J Biol Chem. 2004;279:41650–57.View ArticlePubMedGoogle Scholar
- Salvador-Recatala V, Greenberg RM. The N terminus of a schistosome beta subunit regulates inactivation and current density of a Ca2 channel. J Biol Chem. 2010;285:35878–88.View ArticlePubMedPubMed CentralGoogle Scholar
- Salvador-Recatala V, Schneider T, Greenberg RM. Atypical properties of a conventional calcium channel β subunit from the platyhelminth Schistosoma mansoni. BMC Physiol. 2008;8:6.View ArticlePubMedPubMed CentralGoogle Scholar
- Hwang SY, Han HJ, Kim SH, Park SG, Seog DH, Kim N, Han J, Chung JY, Kho WG. Cloning of a pore-forming subunit of ATP-sensitive potassium channel from Clonorchis sinensis. Korean J Parasitol. 2003;41:199–33.View ArticleGoogle Scholar
- Geadkaew A, von Bülow J, Beitz E, Tesana S, Grams SV, Grams R. Bi-functionality of Opisthorchis viverrini aquaporins. Biochimie. 2015;108:149–59.View ArticlePubMedGoogle Scholar
- Thanasuwan S, Piratae S, Brindley PJ, Loukas A, Kaewkes S, Laha T. Suppression of aquaporin, a mediator of water channel control in the carcinogenic liver fluke, Opisthorchis viverrini. Parasit Vectors. 2015;7:224.View ArticleGoogle Scholar