Detecting genotyping errors at Schistosoma japonicum microsatellites with pedigree information
Parasites & Vectors volume 8, Article number: 452 (2015)
Schistosomiasis japonica remains a major public health problem in China. Integrating molecular analyses, such as population genetic analyses, of the parasite into the on-going surveillance programs is helpful in exploring the factors causing the persistence and/or spread of Schistosoma japonicum. However, genotyping errors can seriously affect the results of such studies, unless accounted for in the analyses.
We assessed the genotyping errors (missing alleles or false alleles) of seven S. japonicum microsatellites, using a pedigree data approach for schistosome miracidia, which were stored on Whatman FTA cards.
Among 107 schistosome miracidia successfully genotyped, yielding a total of 715 locus calls, 31 genotyping errors were observed, with 25.2 % of the miracidia having at least one error. The error rate per locus differed among loci, ranging from 0 to 9.8 %, with a mean of 4.3 % over loci. With the parentage analysis software Cervus, the assignment power of these seven markers was estimated to be 89.5 % for one parent and 99.9 % for a parent pair. One locus was inferred to have a high frequency of null alleles and a second to have a high mistyping rate.
To the authors’ knowledge, this is the first time that S. japonicum pedigrees have been used in an assessment of genotyping errors of microsatellite markers. The observed locus-specific error rate will benefit downstream epidemiological or ecological analyses of S. japonicum with the markers.
Whilst there have been great successes in the control of schistosomiasis japonica in China over the last six decades, the disease remains a major public health problem, with an estimated 0.29 million people infected and over 245 million people living in endemic areas. Moreover, the disease has been resurging in areas where it was previously well controlled or where its transmission had been interrupted [2, 3]. It is therefore important to explore the factors influencing the persistence and/or spread of Schistosoma japonicum. Molecular approaches, for example population genetic analyses, can be integrated into the on-going parasitological or serological surveillance programs [4–6] to enhance our knowledge of the transmission of this disease, including addressing questions such as who is infecting whom.
Population genetic analyses of parasites can elucidate transmission patterns by characterizing gene flow and population structure among spatial or temporal parasite populations [7–9]. However, because adult worms reside in the veins of mammalian hosts and cannot be sampled directly from live hosts, studying the population structure of schistosomes in the field is logistically challenging. The alternative practice is therefore to collect and genotype schistosome larvae: either miracidia hatched from eggs in host faeces, or cercariae shed from an intermediate host snail. This is facilitated by the Whatman FTA card-based approach, which allows field-collected larvae to be stored at room temperature for up to 4 years and to be reused successfully for molecular analyses up to 11 times. This in turn supports the growing body of research on the molecular epidemiology of S. japonicum [11–14].
Genotyping errors (i.e. observed alleles or genotypes that differ from the true alleles or genotypes) can bias the allele and genotype frequencies reported for a population. Even a small per-locus genotyping error rate can result in a relatively large probability that a multilocus genotype contains at least one error [15, 16]. Hence, there are increasing calls to report genotyping error rates and to incorporate them into downstream population genetic analyses [15, 17, 18]. Several approaches have been proposed for quantifying genotyping errors, among which detection with pedigree data has been the most robust. We therefore assessed the genotyping errors at seven commonly used S. japonicum microsatellites [20–22] (Table 1) for schistosome miracidia stored on Whatman FTA cards. Our S. japonicum pedigree was established from laboratory crosses of parasites, with the adult worm pair and their offspring (miracidia) collected from each parasite family. Each parasite was genotyped in an individual multiplex PCR reaction. Genotyping errors in offspring at the seven microsatellite loci were detected based on Mendelian inheritance of alleles, and further estimated with the widely used program Cervus V3.0.7. The results will benefit future molecular analyses of this organism by aiding accurate interpretation of results obtained with these common markers.
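The point about multilocus genotypes can be made concrete with a short worked equation. It assumes, for illustration only, that errors occur independently across loci and at the same per-locus rate e; the 2 % rate used below is an illustrative input rather than an estimate from this study.

```latex
P(\text{at least one error among } L \text{ loci}) = 1 - (1 - e)^{L},
\qquad
e = 0.02,\ L = 7:\quad 1 - 0.98^{7} \approx 0.13
```

Under these assumptions, a per-locus error rate of 2 % would already leave roughly one in eight seven-locus genotypes carrying at least one error.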
Schistosoma japonicum was originally obtained from infected snails from Shitai county, Anhui, China, in April 2013 and then maintained in mice in the laboratory. Miracidia were hatched from eggs collected from the livers of mice. Snails with no previous schistosome infection were each exposed to a single miracidium and later checked for schistosome infection with a shedding experiment. For details of the procedure, see [24, 25]. As the parasite undergoes only asexual reproduction within snail hosts, each of these laboratory-infected snails harbored only clonal cercariae of the same sex and genotype. Twenty mice were individually exposed to approximately 50 cercariae from one snail each (i.e. one mouse per snail), and the resultant adult worms were sexed morphologically to infer the sex of the cercariae in each snail. The sex of the cercariae was successfully identified for 11 single miracidium-exposed snails, with six snails harboring female cercariae and five harboring male cercariae.
The cercariae from these 11 snails were used for parasite cross experiments. Sixteen mice were each exposed to cercariae of two genotypes from two snails, i.e. five male cercariae of the same genotype from one snail and five female cercariae of the same genotype from another snail (one mouse per unique genotype cross). Because of the limited number of cercariae shed from the 11 infected snails, and to minimize animal usage, only 16 genetically unique worm pairs were established in 16 mice, rather than the theoretically possible 30 unique worm pairs from five male and six female parasite clones (Fig. 1). Adult worm pairs were obtained six weeks post-exposure via portal perfusion and liver examination of the mouse. The adult worm pairs were stored in 99 % ethanol and frozen for future analyses. The liver was minced and eggs (i.e. the offspring of the genetically unique adult worm pairs) were collected for hatching of miracidia. The larvae, which are small at 150 μm in size, were then collected individually using a pipette in 3–5 μl of water and stored on a Whatman FTA Classic Card for subsequent DNA analysis. A total of 16 different parasite families with known adult worm pairs and their offspring were obtained for molecular analyses. The experiments, including the molecular analyses that follow, are summarized in Fig. 1.
Ethical approval
The research was approved by the Soochow University Ethics Committee and the care and use of experimental animals complied with institutional standards.
DNA extraction and microsatellite amplification
Genomic DNA from adult worms was extracted using an EZgene™ Mollusc gDNA Kit (Biomiga, Inc., San Diego, USA) according to the manufacturer’s protocol. DNA extraction from miracidia was performed as described elsewhere. Seven previously published microsatellite loci were investigated, and the forward primer of each pair was labeled with 6-FAM, HEX, TAMRA or ROX (Table 2). PCR reactions were carried out in 15 μl volumes containing either 1 μl of adult worm DNA or one Whatman FTA card disc carrying a single miracidium, using the QIAGEN Multiplex PCR Kit (cat. no. 206152, Germany). Thermal cycling was carried out in an Arktik thermocycler (Thermo Scientific) with the following PCR profile: 95 °C for 5 min; 40 cycles of 30 s at 95 °C, 90 s at the annealing temperature (five cycles at each temperature from 57 °C down to 51 °C in 2 °C steps, then 20 cycles at 50 °C) and 30 s at 72 °C; and a final extension at 68 °C for 10 min. Multiplexed PCR products were genotyped on an ABI 3100 automated sequencer (Applied Biosystems) at Sangon Biotech (Shanghai, China). Each adult worm was amplified and genotyped twice to improve the accuracy of true genotype scoring.
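The touchdown annealing scheme described above can be written out step by step to confirm that it adds up to the stated 40 cycles. The following is a minimal sketch; the function name and the data layout are illustrative choices and not part of the published protocol.

```python
from typing import List, Tuple


def touchdown_annealing_schedule() -> List[Tuple[float, int]]:
    """Return (annealing temperature in °C, number of cycles) steps."""
    # Touchdown phase: five cycles at each of 57, 55, 53 and 51 °C.
    steps = [(float(temp), 5) for temp in range(57, 50, -2)]
    # Plateau phase: 20 cycles at 50 °C.
    steps.append((50.0, 20))
    return steps


schedule = touchdown_annealing_schedule()
total_cycles = sum(cycles for _, cycles in schedule)

for temp, cycles in schedule:
    print(f"{cycles:2d} cycles with annealing at {temp:.0f} °C")
print("total cycles:", total_cycles)
```

Four touchdown steps of five cycles each plus the 20-cycle plateau give the 40 cycles stated in the PCR profile.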
We combined automated allele calling with visual inspection of each sample. GeneMarker HID Version 2.6.1 (SoftGenetics LLC) was used to score alleles automatically. The scoring parameters and locus bin ranges were set using S. japonicum samples from several geographical regions so that they matched the characteristics of the loci used. Automated binning of allele data (i.e. converting the raw decimal fragment sizes into integers) provided consistency across multiple plates of PCR products, whereas visual inspection caught errors arising from incorrect size calls, which the software flagged with low scores. See Additional file 1 for an example.
Identification and quantification of genotyping errors
There are three common types of genotyping error: 1) a null allele, a non-amplifying allele due to a mutation in the primer target sequence; 2) allelic dropout, the stochastic non-amplification of an allele at a heterozygous locus; and 3) a false allele, an allele-like PCR-generated artifact. For practical handling, and following previous work, we classified errors into two types, missing alleles (generally caused by allelic dropout or a null allele) and false alleles, and identified them through Mendelian-inheritance checking. A missing allele is an allele that is not observed in a miracidium but is expected to have been inherited from a parent adult worm. A false allele is an allele that is called in a miracidium but does not exist in either parent worm; it could result from a mutation between the two generations or from a false peak. If there was no amplification at all at a given locus, no allele call could be made, so we did not count this as a genotyping error: such a ‘non amplification’ event would not bias downstream molecular analyses. As recommended previously, we calculated two indexes, the error rate per locus and the error rate per multilocus genotype. The error rate per locus is the ratio of the number of single-locus genotypes with at least one allelic mismatch to the number of single-locus genotypes examined, calculated for each locus and over loci. The error rate per multilocus genotype is the ratio of the number of multilocus genotypes with at least one allelic mismatch to the number of multilocus genotypes examined (i.e. the miracidia error rate).
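The Mendelian-inheritance check and the two error indexes defined above can be sketched in a few lines of code. This is a simplified illustration rather than the procedure actually used in the study (which relied on visual inspection and Cervus): the function names, the "unresolved" category and the example family with its allele sizes are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A single-locus genotype: two allele sizes (bp), or None if nothing amplified.
Genotype = Optional[Tuple[int, int]]

ERROR_CALLS = {"missing allele", "false allele", "unresolved"}


def check_locus(offspring: Genotype, mother: Genotype, father: Genotype) -> str:
    """Classify one offspring genotype against its two known parents."""
    if offspring is None or mother is None or father is None:
        return "not scored"  # non-amplification is not counted as an error
    a, b = offspring
    # Mendelian rule: one allele comes from the mother, the other from the father.
    if (a in mother and b in father) or (b in mother and a in father):
        return "ok"
    parental = set(mother) | set(father)
    if a not in parental or b not in parental:
        return "false allele"  # an allele seen in neither parent
    if a == b:
        return "missing allele"  # apparent homozygote lacking one parent's allele
    return "unresolved"  # simplification: any other incompatible pattern


def error_rates(results: List[Dict[str, str]]) -> Tuple[Dict[str, float], float]:
    """Return (error rate per locus, error rate per multilocus genotype)."""
    scored: Dict[str, int] = {}
    errors: Dict[str, int] = {}
    genotypes_with_error = 0
    for per_offspring in results:
        has_error = False
        for locus, call in per_offspring.items():
            if call == "not scored":
                continue
            scored[locus] = scored.get(locus, 0) + 1
            if call in ERROR_CALLS:
                errors[locus] = errors.get(locus, 0) + 1
                has_error = True
        genotypes_with_error += has_error
    per_locus = {locus: errors.get(locus, 0) / n for locus, n in scored.items()}
    return per_locus, genotypes_with_error / len(results)


# Hypothetical family: (mother, father) genotypes at two made-up loci.
parents = {"locusA": ((100, 104), (102, 102)), "locusB": ((200, 200), (200, 204))}
offspring = [
    {"locusA": (100, 102), "locusB": (200, 204)},
    {"locusA": (104, 104), "locusB": (200, 200)},
    {"locusA": (100, 106), "locusB": None},
    {"locusA": (102, 104), "locusB": (204, 204)},
]
results = [
    {locus: check_locus(geno, *parents[locus]) for locus, geno in child.items()}
    for child in offspring
]
per_locus, per_genotype = error_rates(results)
print(results)
print(per_locus, per_genotype)
```

In this toy family, the second and fourth offspring each show a missing allele and the third shows a false allele, so three of the four multilocus genotypes contain at least one error. The locus that failed to amplify is excluded from the denominator, mirroring the treatment of non-amplification described above.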
Cervus V3.0.7 [23, 33], a program for parentage analysis, was used to calculate the following indexes for each locus and, where applicable, over all loci: number of alleles, observed heterozygosity (Ho), expected heterozygosity (He), polymorphic information content (PIC), average non-exclusion probability of the first parent (Excl1), average non-exclusion probability of the parent pair (Excl2), and null allele frequency. Mismatches were also counted in known mother-offspring and known father-offspring pairs for each locus. The mistyping rate was calculated as the ratio of the number of mismatches to the number of alleles compared, scaled by the average probability of detecting a mismatch.
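For readers unfamiliar with these diversity indexes, the sketch below computes Ho, He and PIC for a single locus from a list of diploid genotypes. It uses the standard textbook definitions (Nei's sample-size-corrected He and the usual PIC formula); that Cervus computes them in exactly this way is an assumption here, and the genotypes are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]  # two allele sizes in bp


def allele_frequencies(genotypes: List[Genotype]) -> Dict[int, float]:
    """Relative frequency of each allele across all gene copies."""
    counts = Counter(allele for geno in genotypes for allele in geno)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes: List[Genotype]) -> float:
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes: List[Genotype]) -> float:
    """Unbiased expected heterozygosity: 2n/(2n-1) * (1 - sum of p_i^2)."""
    n = len(genotypes)
    freqs = allele_frequencies(genotypes).values()
    return (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs))


def polymorphic_information_content(genotypes: List[Genotype]) -> float:
    """PIC = 1 - sum p_i^2 - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(allele_frequencies(genotypes).values())
    homozygosity = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


# Made-up genotypes at one locus (not data from this study).
example = [(100, 102), (100, 100), (102, 104), (100, 104)]
ho = observed_heterozygosity(example)
he = expected_heterozygosity(example)
pic = polymorphic_information_content(example)
print(f"Ho = {ho:.3f}, He = {he:.3f}, PIC = {pic:.3f}")
```

PIC is never larger than the uncorrected expected heterozygosity because it additionally discounts matings in which both parents are the same heterozygote and are therefore less informative.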
As seen in Table 2, an average of 6.7 miracidia from each genetically unique adult worm pair (i.e. a total of 107 offspring miracidia from 16 mice) were successfully genotyped. The number of alleles per locus ranged from 3 to 8, with an average of 5.43 in the 11 parental schistosomes (five male and six female) and 5.71 in the 107 miracidia. The observed heterozygosity was between 0.273 and 0.727 in the parental worms and between 0.402 and 0.867 in the miracidia. The non-amplification rate in miracidia varied among loci, ranging from 0 for Sjp4 and Sjp18 to 11.2 % (12/107) for TS2. All genotype data are provided in Additional file 2.
As shown in Table 3, a total of 31 genotyping errors (29 missing alleles and two false alleles) were detected in the 107 miracidia through visual inspection of incompatibilities between worm pairs and their offspring. The error rate per locus differed among loci and was highest for Sjp22, at 9.8 %. The mean error rate over loci was 4.3 %. Among the 107 miracidia multilocus genotypes checked, 27 had at least one genotyping error, giving an error rate of 25.2 % per multilocus genotype.
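The reported figures can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the final line additionally assumes, purely for comparison, that errors are independent across loci and occur at the mean rate at every locus, which is not a claim made by the study.

```python
# Figures reported in the text.
n_miracidia = 107
n_loci = 7
n_locus_calls = 715
n_errors = 31
n_miracidia_with_error = 27
mean_error_per_locus = 0.043

# Locus calls that are implied to have failed to amplify.
failed_calls = n_miracidia * n_loci - n_locus_calls

# Errors per successful locus call and per multilocus genotype.
error_per_call = n_errors / n_locus_calls
error_per_genotype = n_miracidia_with_error / n_miracidia

# Assumption (not from the paper): independent errors, equal rate at every locus.
expected_per_genotype = 1 - (1 - mean_error_per_locus) ** n_loci

print(f"locus calls without amplification: {failed_calls}")
print(f"errors per locus call: {error_per_call:.1%}")
print(f"observed error rate per multilocus genotype: {error_per_genotype:.1%}")
print(f"expected under independence: {expected_per_genotype:.1%}")
```

The 31 errors over 715 locus calls are consistent with the 4.3 % mean error rate, and the observed 25.2 % per multilocus genotype is close to the roughly 26.5 % expected under the independence assumption. That 31 errors fall in only 27 miracidia shows that some individuals carried more than one error.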
Parentage analyses with the software Cervus
As required by Cervus, we combined the 11 worms and 107 miracidia to estimate the genetic diversity and the probabilities of parent exclusion in parentage analyses. As seen in Table 4, the unbiased expected heterozygosity (He) ranged from 0.493 to 0.830 among loci, with a mean of 0.652 over loci. The mean PIC over loci was 0.603. When the seven loci were combined, the exclusion power (i.e. one minus the average non-exclusion probability) was 89.5 % for the first parent (Excl1) and 99.9 % for the parent pair (Excl2). The null allele frequency was less than 0.05 for all loci with the exception of TS2.
As seen in Table 5, a total of 16 mismatches were detected in known mother-offspring pairs and 15 in known father-offspring pairs. No mistyping was identified for either Sjp4 or Sjp18. The highest mistyping rate, up to 11.4 % in mother-offspring pairs and 17.1 % in father-offspring pairs, was estimated for the locus Sjp22. The second highest was observed for the locus TS2.
We evaluated the genotyping errors in individual multiplex reactions involving seven S. japonicum microsatellite markers for miracidia samples stored on Whatman FTA cards. Our results can inform future studies using these microsatellites by providing a more thorough understanding of the potential error rates of multiplex amplifications, aiding accurate interpretation of their results. Among the microsatellites used, two had a 100 % amplification success rate and no recorded errors, indicating that they are highly reliable and reproducible markers that can be highly recommended for future use. Two other loci, one with the highest mistyping rate and one with possible null alleles, were identified and should be used with caution under these PCR conditions in the future. To the authors’ knowledge, this is the first time that S. japonicum pedigrees have been used in an assessment of genotyping errors with microsatellite markers.
The selected set of seven schistosome microsatellites revealed high genetic variation in the 11 parasites (five male and six female) used and in their offspring miracidia. In the absence of genotyping errors, their considerably high polymorphism and combined assignment power would be very useful in parentage analyses, which in turn can be used to track transmission of the parasite, and in other population genetic analyses. In the current study, for practical handling and because of the sample storage approach, we classified the observed genotyping errors into two types: missing and false alleles. Using pedigree analyses, we observed that up to one-fourth of miracidia could have at least one error, which is much lower than the proportion (44 %) reported for S. mansoni. One possible reason is that more loci (nine) were used in the S. mansoni study.
An error rate of 2 % in microsatellite studies is usual and acceptable [15, 36]; in this study, however, we observed a considerably higher mean error rate over loci (4.3 %). The error rate varied among loci. No errors were observed at the loci Sjp4 and Sjp18. Sjp18 in particular also displayed high polymorphism in the samples used, indicating that this microsatellite marker is highly informative for this kind of analysis. At the locus Sjp22, however, one-third of the total errors were detected and one false allele was observed; special precautions should therefore be taken in downstream analyses of molecular data generated from this locus.
Cervus is a likelihood-based parentage-assignment program [23, 33] and has been the most widely employed program for inferring parent-offspring relationships. With known pedigree data (for example, known parent-offspring pairs), the software can also be used to quantify mistyping and then estimate the error rate. In this study, Cervus showed two loci (Sjp22 and TS2) to have high error rates and suggested potential null alleles at the locus TS2. A protocol implementing quality assurance procedures has been proposed to minimize genotyping errors, but such errors cannot be completely eliminated. Knowledge of the errors, particularly the locus-specific error rate, can therefore add power and accuracy to downstream analyses, especially as a majority of software packages, for example Colony2 and MasterBayes, have been developed to allow the incorporation of errors into their algorithms. Our data are also important for future optimization of multiplex reactions, as they indicate which loci are highly reliable (Sjp4 and Sjp18) and which microsatellites should be either improved or replaced (Sjp22 and TS2).
Overall, false alleles were rare, indicating that allele calls made for these loci are likely to be accurate. However, some alleles may be lost. Any bias imposed by genotyping error is therefore more likely to affect estimates of genetic diversity than analyses of population structure. Missing alleles may increase the proportion of homozygous calls relative to heterozygous calls and reduce the number of rare and private alleles. However, if the same multiplex reactions are used to compare populations, a similar loss of diversity would be expected across the samples and should not affect the overall conclusions. As this study compared miracidia directly with each other and with their parents, the chance that the differences we observed and attributed to amplification and genotyping errors were instead due to mutations arising from a single round of sexual reproduction is minimal, particularly given the relatively low mutation rate of schistosomes. We are therefore confident that our results represent accurate measures of genotyping errors for each locus.
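The effect of missing alleles on heterozygosity can be illustrated with a one-line relationship. It assumes, hypothetically, that each true heterozygote is scored as a homozygote with probability d and that false alleles are negligible; d and the values below are illustrative and were not estimated in this study.

```latex
H_{o}^{\mathrm{obs}} = (1 - d)\, H_{o}^{\mathrm{true}},
\qquad
H_{o}^{\mathrm{true}} = 0.60,\ d = 0.05
\;\Rightarrow\;
H_{o}^{\mathrm{obs}} = 0.57
```

Under these assumptions the downward bias in observed heterozygosity is proportional to d, so populations typed with the same multiplex reactions, and hence a similar d, would be affected to a similar degree.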
The genotyping errors of S. japonicum at seven loci were characterized with pedigree data. Two error-prone loci were identified and warrant closer attention. Null alleles at one locus were detected with the program Cervus. The observed locus-specific error rates are useful for any further epidemiological, ecological or evolutionary research on S. japonicum involving the above microsatellite markers.
Zhen H, Zhang LJ, Zhu R, Xu J, Li SZ, Guo JG, et al. Schistosomiasis situation in China in 2011. Chin J Schisto Control. 2012;24(6):621–6.
Liang S, Yang C, Zhong B, Qiu D. Re-emerging schistosomiasis in hilly and mountainous areas of Sichuan, China. Bull World Health Organ. 2006;84(2):139–44.
Utzinger J, Zhou XN, Chen MG, Bergquist R. Conquering schistosomiasis in China: the long march. Acta Trop. 2005;96(2–3):69–96.
Liang S, Yang CH, Zhong B, Guo JG, Li HZ, Carlton EJ, et al. Surveillance systems for neglected tropical diseases: global lessons from China’s evolving schistosomiasis reporting systems, 1949–2014. Emerg Themes Epidemiol. 2014;11:19.
Yang K, Li W, Sun LP, Huang YX, Zhang JF, Wu F, et al. Spatio-temporal analysis to identify determinants of Oncomelania hupensis infection with Schistosoma japonicum in Jiangsu province, China. Parasit Vectors. 2013;6:138.
Liang YS, Wang W, Li HJ, Shen XH, Xu YL, Dai JR. The South-to-North Water Diversion Project: effect of the water diversion pattern on transmission of Oncomelania hupensis, the intermediate host of Schistosoma japonicum in China. Parasit Vectors. 2012;5:52.
Criscione CD, Poulin R, Blouin MS. Molecular ecology of parasites: elucidating ecological and microevolutionary processes. Mol Ecol. 2005;14(8):2247–57.
de Meeus T, McCoy KD, Prugnolle F, Chevillon C, Durand P, Hurtrez-Bousses S, et al. Population genetics and molecular epidemiology or how to “debusquer la bete”. Infect Genet Evol. 2007;7(2):308–32.
Gower CM, Gouvras AN, Lamberton PH, Deol A, Shrivastava J, Mutombo PN, et al. Population genetic structure of Schistosoma mansoni and Schistosoma haematobium from across six sub-Saharan African countries: Implications for epidemiology, evolution and control. Acta Trop. 2013;128(2):261–74.
Gower CM, Shrivastava J, Lamberton PH, Rollinson D, Webster BL, Emery A, et al. Development and application of an ethically and epidemiologically advantageous assay for the multi-locus microsatellite analysis of Schistosoma mansoni. Parasitology. 2007;134(Pt 4):523–36.
Xiao N, Remais JV, Brindley PJ, Qiu DC, Carlton EJ, Li RZ, et al. Approaches to genotyping individual miracidia of Schistosoma japonicum. Parasitol Res. 2013;112(12):3991–9.
Shrivastava J, Gower CM, Balolong Jr E, Wang TP, Qian BZ, Webster JP. Population genetics of multi-host parasites--the case for molecular epidemiological studies of Schistosoma japonicum using larval stages from naturally infected hosts. Parasitology. 2005;131(Pt 5):617–26.
Lu DB, Wang TP, Rudge JW, Donnelly CA, Fang GR, Webster JP. Genetic diversity of Schistosoma japonicum miracidia from individual rodent hosts. Int J Parasitol. 2011;41(13–14):1371–6.
Rudge JW, Carabin H, Balolong E, Tallo V, Shrivastava J, Lu DB, et al. Population genetics of Schistosoma japonicum within the Philippines suggest high levels of transmission between humans and dogs. PLoS Negl Trop Dis. 2008;2(11):e340.
Hoffman JI, Amos W. Microsatellite genotyping errors: detection approaches, common sources and consequences for paternal exclusion. Mol Ecol. 2005;14(2):599–612.
Bonin A, Bellemain E, Bronken Eidesen P, Pompanon F, Brochmann C, Taberlet P. How to track and assess genotyping errors in population genetics studies. Mol Ecol. 2004;13(11):3261–73.
Dewoody J, Nason JD, Hipkins VD. Mitigating scoring errors in microsatellite data from wild populations. Mol Ecol Notes. 2006;6:951–7.
Crawford LA, Koscinski D, Keyghobadi N. A call for more transparent reporting of error rates: the quality of AFLP data in ecological and evolutionary research. Mol Ecol. 2012;21(24):5911–7.
Johnson PC, Haydon DT. Software for quantifying and simulating microsatellite genotyping error. Bioinform Biol Insights. 2009;1:71–5.
Shrivastava J, Barker GC, Johansen MV, Zhou XN, Aligui GD. Isolation and characterization of polymorphic DNA microsatellite markers from Schistosoma japonicum. Mol Ecol Notes. 2003;3:406–8.
Xiao N, Remais J, Brindley PJ, Qiu D, Spear R, Lei Y, et al. Polymorphic microsatellites in the human bloodfluke, Schistosoma japonicum, identified using a genomic resource. Parasit Vectors. 2011;4:13.
Yin M, Hu W, Mo X, Wang S, Brindley PJ, McManus DP, et al. Multiple near-identical genotypes of Schistosoma japonicum can occur in snails and have implications for population-genetic analyses. Int J Parasitol. 2008;38(14):1681–91.
Marshall TC, Slate J, Kruuk LE, Pemberton JM. Statistical confidence for likelihood-based paternity inference in natural populations. Mol Ecol. 1998;7(5):639–55.
Su J, Zhou F, Lu DB. A circular analysis of chronobiology of Schistosoma japonicum cercarial emergence from hilly areas of Anhui, China. Exp Parasitol. 2013;135(2):421–5.
Wang CZ, Lu DB, Guo CX, Li Y, Gao YM, Bian CR, et al. Compatibility of Schistosoma japonicum from the hilly region and Oncomelania hupensis hupensis from the marshland region within Anhui, China. Parasitol Res. 2014;113:4477–84.
Beltran S, Galinier R, Allienne JF, Boissier J. Cheap, rapid and efficient DNA extraction method to perform multilocus microsatellite genotyping on all Schistosoma mansoni stages. Mem Inst Oswaldo Cruz. 2008;103(5):501–3.
Callen DF, Thompson AD, Shen Y, Phillips HA, Richards RI, Mulley JC, et al. Incidence and origin of “null” alleles in the (AC)n microsatellite markers. Am J Hum Genet. 1993;52(5):922–7.
Gagneux P, Boesch C, Woodruff DS. Microsatellite scoring errors associated with noninvasive genotyping based on nuclear DNA amplified from shed hair. Mol Ecol. 1997;6(9):861–8.
Taberlet P, Griffin S, Goossens B, Questiau S, Manceau V, Escaravage N, et al. Reliable genotyping of samples with very low DNA quantities using PCR. Nuc Acids Res. 1996;24(16):3189–94.
Steinauer ML, Agola LE, Mwangi IN, Mkoji GM, Loker ES. Molecular epidemiology of Schistosoma mansoni: a robust, high-throughput method to assess multiple microsatellite markers from individual miracidia. Infect Genet Evol. 2008;8(1):68–73.
Broquet T, Petit E. Quantifying genotyping errors in noninvasive population genetics. Mol Ecol. 2004;13(11):3601–8.
Pompanon F, Bonin A, Bellemain E, Taberlet P. Genotyping errors: causes, consequences and solutions. Nat Rev Genet. 2005;6(11):847–59. doi:10.1038/nrg1707.
Kalinowski ST, Taper ML, Marshall TC. Revising how the computer program CERVUS accommodates genotyping error increases success in paternity assignment. Mol Ecol. 2007;16(5):1099–106.
Lu DB, Rudge JW, Wang TP, Donnelly CA, Fang GR, Webster JP. Transmission of Schistosoma japonicum in marshland and hilly regions of China: parasite population genetic and sibship structure. PLoS Negl Trop Dis. 2010;4(8):e781.
Van den Broeck F, Geldof S, Polman K, Volckaert FA, Huyse T. Optimal sample storage and extraction procotols for reliable multilocus genotyping of the human parasite Schistosoma mansoni. Infect Genet Evol. 2011;11(6):1413–8.
Castro J, Pino A, Hermida M, Bouza C, Riaza A, Ferreiro I, et al. A microsatellite marker tool for parentage analysis in Senegal sole (Solea senegalensis): Genotyping errors, null alleles and conformance to theoretical assumptions. Aquaculture. 2006;261:1194–203.
Walling CA, Pemberton JM, Hadfield JD, Kruuk LE. Comparing parentage inference software: reanalysis of a red deer pedigree. Mol Ecol. 2010;19(9):1914–28.
Wang J, Santure AW. Parentage and sibship inference from multilocus genotype data under polygamy. Genetics. 2009;181(4):1579–94.
Hadfield JD, Richardson DS, Burke T. Towards unbiased parentage assignment: combining genetic, behavioural and spatial data in a Bayesian framework. Mol Ecol. 2006;15(12):3715–30.
Valentim CL, LoVerde PT, Anderson TJ, Criscione CD. Efficient genotyping of Schistosoma mansoni miracidia following whole genome amplification. Mol Biochem Parasitol. 2009;166(1):81–4.
This work was funded by the National Sciences Foundation of China (No. 81273141). We thank Wen-Qiao Huang for assisting with the mouse infections in the laboratory. We are very grateful to Dr Chun Hai (Isaac) Fung from Georgia Southern University for his great help with the English of the manuscript.
The authors declare that they have no competing interests.
DL conceived of and designed the study. YG, DL and HD carried out the molecular genetic work and were assisted by PHL in analyzing the results. DL, YG and PHL drafted the manuscript. All authors read and approved the final manuscript.
Gao, YM., Lu, DB., Ding, H. et al. Detecting genotyping errors at Schistosoma japonicum microsatellites with pedigree information. Parasites Vectors 8, 452 (2015). https://doi.org/10.1186/s13071-015-1074-0
- Schistosoma japonicum
- Genotyping errors