Pooling as a strategy for the timely diagnosis of soil-transmitted helminths in stool: value and reproducibility



The strategy of pooling stool specimens has been extensively used in the field of parasitology to facilitate the screening of large numbers of samples whilst minimizing the prohibitive cost of single-sample analysis. The aim of this study was to develop a standardized, reproducible pooling protocol for stool samples, validated between two different laboratories, without jeopardizing the sensitivity of the quantitative polymerase chain reaction (qPCR) assays employed for the detection of soil-transmitted helminths (STHs). The study comprised two distinct experimental phases. First, the sensitivity and specificity of the established protocol were assessed by real-time PCR for each of the STHs. Secondly, agreement and reproducibility of the protocol between the two laboratories were tested. The need for multiple stool sampling to avoid false-negative results was also assessed. Finally, a cost exercise was conducted which included labour cost in low- and high-wage settings, consumable cost, prevalence of a single STH species, and a simple distribution pattern of the positive samples in pools to estimate the time and money savings offered by the strategy.


The sensitivity of the pooling method was variable among the STH species but consistent between the two laboratories. Estimates of specificity indicate that a ‘pooling approach’ yields a low frequency of false-positive results. There were no significant differences regarding the execution of the protocol and the subsequent STH detection between the two laboratories, which suggests that in most cases the protocol is reproducible by adequately trained staff. Finally, given the high degree of agreement, there appears to be little or no need for multiple sampling of either individuals or pools.


Our results suggest that the pooling protocol developed herein is a robust and efficient strategy for the detection of STHs in ‘pools-of-five’. Preparing the pools is notably complex, because helminth DNA must be evenly distributed throughout. Therefore, in a given setting, the cost of labour, among other logistical and epidemiological factors, is a more decisive consideration when choosing a pooling strategy than any loss of sensitivity and/or specificity of the molecular assay or the method.


Pooling of faeces [1,2,3,4,5], urine [6, 7], serum [8] or disease vectors [9] has been used as a cost-effective strategy to screen for infection present in the given substrate/matrix. Such an approach has been shown to provide accurate results, while reducing time and labour requirements. Additionally, but perhaps more so in the veterinary world than in any clinical mass drug administration (MDA) programme, ‘pooling’ as a strategy may allow for a rapid estimation of drug efficacy or infection prevalence in the herd based on microscopy results and subsequent faecal egg counts (FECs) [10,11,12,13].

As previous goals to reduce the intestinal worm burden and morbidity in school-aged children have been extended and enriched with new programmes to achieve universal coverage of at-risk populations by 2030, new monitoring methods need to be implemented. Novel, precise and robust diagnostic tools that measure prevalence reduction and detect interruption of transmission are key to enable de-implementation of MDA programmes [14, 15]. Soil-transmitted helminths transmitted via the faecal-oral route (Ascaris lumbricoides, Trichuris trichiura, Necator americanus, Ancylostoma duodenale, An. ceylanicum and Strongyloides stercoralis) and/or via skin penetration (N. americanus, An. duodenale, An. ceylanicum and S. stercoralis) are amongst the neglected tropical pathogens drawing increased attention as targets for transmission interruption and possible elimination. Even though preventable, they affect almost a third of the world’s population [16]. However, surveillance of ongoing MDA programmes that aim to reduce the worm burden requires thousands or tens of thousands of samples to be screened and analysed for STH prevalence, especially in low-prevalence areas where large sample sizes are required to accurately detect changes in infection. Previous attempts to evaluate pooling as a means of scaling soil-transmitted helminth diagnosis have yielded poor results. Such studies have relied upon microscopy as the diagnostic strategy [13, 17, 18], which lacks the sensitivity of molecular tools, such as quantitative polymerase chain reaction (qPCR); caveats and disadvantages of this approach have been thoroughly described previously [19, 20].

Such tools would ideally retain their sensitivity when samples from multiple individuals are combined, whilst minimizing the reagent cost involved. More recent studies report additional cost granularity, including operational and logistical costs, concluding that a ‘pooling approach’ might not be as worthwhile as hoped [5]. These studies, however, have neither taken into account predicted optimal pool sizes nor incorporated an adequately sensitive diagnostic tool; thus, such conclusions are yet to be confirmed. Modelling studies followed by experimental validations have suggested an optimal pooled sample range where pooling tends to be more cost-effective, whilst maintaining robustness and precision with minimal variation [12], but the decision whether to proceed with pooling or not will likely be based on a number of additional factors. Cost (determined by reagents, labour required, logistical and operational considerations), time (sample transportation and pool preparation) and the need for a sufficiently sensitive diagnostic tool are not the only determinants which must be considered when deciding for or against pooling. The sample size of the study (n) and existing STH prevalence may also influence decision making [21].

Quantitative PCR has emerged as an effective molecular diagnostic tool to fill the need of heightened sensitivity compared to microscopy when infection levels drop considerably. Some of the advantages of qPCR include the theoretical ability to detect single numbers of eggs present in the faeces due to its analytical sensitivity, to distinguish between species [22, 23] and to achieve accurate results rapidly. Given these factors, qPCR may be the most likely currently available method to enable STH detection in pools in low-prevalence areas, especially when prevalence is close to the breakpoint of transmission [24]. For this reason, the use of PCR as part of a viable pooling strategy should be evaluated [25].

In settings with low-intensity infections, the majority of samples screened are expected to be negative [26]. The sensitivity of a given method might increase or decrease when pooling is used: it increases when multiple ‘weak’ infections are combined in a single pool, so that collectively the target of interest becomes detectable by qPCR, and decreases when a single infected sample is ‘buried’ among uninfected ones and diluted to the point of being undetectable by qPCR [11].

A need for ‘spin-outs’ (subsequent tests that follow pool testing to identify STH infection at the individual level) may increase the cost of the ‘pooling approach’ substantially if required too often, negating any advantage of the approach. In addition, testing larger pools of samples extends the handling and processing period and so raises the risk of contamination, which leads to false-positive results and drives the cost higher, especially when re-extractions are needed to confirm individual infections [27]. Conversely, when the sensitivity of an STH assay decreases, a very ‘weak’ infection might be missed in a pool of negatives. This could reduce the cost, since the pool as a whole would be identified as negative and no added labour (or cost) for ‘spin-outs’ would be needed. As mentioned, pool sizes above the range of 5 to 8 samples increase the cost and time needed to prepare the pools and require additional equipment.

Building on preliminary unpublished data gathered by members of our group, and taking into account the pool sizing predictive models, we examined the use of pools of 5 as a tool for screening samples with low STH infection levels, aiming not to compromise either the sensitivity or the specificity of the qPCR. Additionally, the reproducibility of the protocol and agreement in two different laboratory settings were interrogated, and the necessity for multiple replicates obtained from each pool or individual sample was also evaluated. A basic cost exercise was performed through direct comparison of processing samples individually or as parts of pools. Also, without any prior knowledge regarding the distribution of the positive samples in a screened population, two scenarios were included in the cost analysis for each prevalence level: a ‘best-case’ and a ‘worst-case’ scenario. Acknowledging that this analysis does not represent a mathematical cost model, we accounted simply for prevalence in a given sample population, labour time based on wages in different income settings and consumable costs based on standard list prices. Our results show that choosing whether to ‘pool or not to pool’ can only be determined effectively after considerable scrutiny of each of the component processes, which may be more problematic or prohibitive than loss of granular sensitivity of the diagnostic method used to detect the target of choice. Each process component should be taken into consideration before deciding in favour of pooling strategies.


Study design (phases I and II)

During phase I (‘seeding’ experiment) a series (n = 20) of infection-naïve stool samples purchased commercially (BioIVT; Westbury, NY, USA) were spiked with known numbers of N. americanus eggs mimicking low levels of infection as classified by the World Health Organisation (WHO) guidelines [28] and were mixed with four additional infection-naïve samples of equal volume to create pools of 5.

During phase II (field-samples experiment) of the study, aliquots from a series of field samples with known STH infection status, collected as part of an unrelated study, were mixed with four additional field samples (of equal volume) that had been tested and verified to be negative for all the five STH species of interest (see ‘strategic pooling’) to also create pools of five.

DNA extractions performed during phase I, and part of phase II, were conducted in different laboratories by different technicians to explore reproducibility of the developed protocol. Individual component samples were extracted alongside their pools throughout the process, and all extractions of both individual samples and pools were performed in duplicate (i.e. 1A, 1B, P1A and P1B). DNA from each pool was also extracted twice (PA1&2 and PB1&2). The sensitivity and specificity of the established protocol were evaluated by real-time PCR for each particular target helminth, and by all STH assays for the samples previously identified as negatives. Reproducibility of the protocol’s performance and agreement of results between the two different laboratories were also analysed.

Phase I: ‘seeding’ experiment—Smith College (SC)

For use during ‘seeding’ experiments, performed at Smith College (SC; Northampton, MA, USA), a suspension of hookworm eggs, utilized to spike the infection-naïve stool, was prepared as previously described [29]. In brief, hamster stool pellets with known infection levels expressed as eggs per gram (epg) were diluted in nuclease-free water such that 178 µl contained 50 eggs for a final infection load of 100 epg (50 eggs in 500 mg of stool) (Fig. 1). The level of hookworm infection chosen was based on preliminary experiments in which medium and high hookworm infection loads (based on WHO guidelines [28]) were employed, but showed an abundance of the target and early amplification detected by qPCR [30]; a primary concern of pooling is loss of sensitivity through dilution in low-infection settings, so we chose a moderately low final concentration of 100 epg to detect potential dilution effects.
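The seeding arithmetic can be checked with a short sketch. It assumes equal-volume components, as in the protocol, and further assumes perfect homogenization with linear dilution, which is an idealization rather than a measured property; the function names are illustrative only.

```python
def eggs_per_gram(eggs: float, stool_mg: float) -> float:
    """Convert an egg count in a stool aliquot to eggs per gram (epg)."""
    return eggs / (stool_mg / 1000.0)


def pooled_epg(component_epgs: list[float]) -> float:
    """Expected epg of an equal-volume pool: the mean of its components.

    Assumes perfect homogenization and linear dilution (an idealization).
    """
    return sum(component_epgs) / len(component_epgs)


# One spiked sample (50 eggs in 500 mg) mixed with four naive samples
seeded = eggs_per_gram(eggs=50, stool_mg=500)
pool = pooled_epg([seeded, 0.0, 0.0, 0.0, 0.0])
print(f"seeded sample: {seeded:.0f} epg, pool of five: {pool:.0f} epg")
```

Under these assumptions the spiked sample's egg load falls five-fold once pooled, which is the dilution effect the seeding experiment was designed to probe.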

Fig. 1

Schematic representation of the ‘seeding’ experiment (as proof of concept). Known egg counts of N. americanus eggs (in suspension) were utilized in order to spike individual, fixed volumes of naïve stool prior to mixing with four additional naïve stool aliquots of the same volume to form the pools of five

Phase II: field-samples experiment—SC and Natural History Museum (NHM)

At SC, a 34-sample panel was created for use in a proof-of-concept study. Thirty of these samples were positive for a single helminth (A. lumbricoides, T. trichiura, An. ceylanicum, S. stercoralis) and the remaining four were identified as negative. The volume of each sample (1.5 ml; 500 mg of stool suspended in 1 ml of ethanol) was split, homogenised and mixed with four infection-naïve stool aliquots of equal volume (Fig. 2). Another panel of 150 samples of human stool extracts, variously infected with the same species of STH (at least 500 mg of stool), was prepared at SC and was shipped to the Natural History Museum (NHM; London, UK). All samples utilized during phase II of this study were collected in Bangladesh as part of the WASH Benefits Bangladesh trial [31]. All samples were previously screened at SC via real-time PCR and the results for each individual sample were available. Amongst these samples, 130 were identified as negative for all species (N. americanus, T. trichiura, A. lumbricoides, An. duodenale, An. ceylanicum and S. stercoralis). The rest of the samples (n = 20) were identified as positive for at least one STH, with low/moderate intensity infections reported based on Kato-Katz/individual PCR data. For the generation of each positive pool, one sample identified as positive for at least one species of STH was mixed with four samples identified as negative. For the generation of negative pools, equal volumes of five negative samples were mixed (Fig. 2).

Fig. 2

Schematic representation of the field-samples experiment. Previously screened faecal samples positive for one or more soil-transmitted helminths (STHs) were combined with four additional samples (of equal volume) identified as negative for all STHs to create pools of five (individual samples identified as negatives were also included in the study, as contamination controls). The DNA from every individual sample was extracted twice, each pool was formed twice and the DNA from each pool was also extracted twice. All the samples underwent qPCR for the target STH

Pool formation and DNA extraction

The total volume of each sample (1.5 ml stool in suspension) was divided into two aliquots and was homogenized using a high-speed bead beater (Fast Prep 5G, MP Biomedicals; Santa Ana, CA, USA) with Lysing Matrix E tubes (containing silica, glass bead and ceramic particles). The homogeneous suspensions were recombined into a single tube after the first lysis. Two ~300 µl aliquots of the suspension were transferred into two new Lysing Matrix E tubes for individual extractions (A and B) and two additional 300 µl aliquots were transferred to separate tubes designated for use in the constitution of pools (PA and PB). The same procedure was followed for all five samples that would form a single pool. After a pool was formed, the volume was split again, and a second homogenization following the same procedure occurred (second lysis). Following the second lysis step, two aliquots (300 µl each) from the pool (PA1&2 and PB1&2) were also subjected to DNA extraction. For all pools and individual samples, the same DNA extraction protocol was followed. All extractions began with an additional bead-beating step (the second homogenization step for individual samples and the third homogenization step for pooled samples). Extractions were then completed using the MP Bio Fast DNA SPIN kit for Soil (MP Biomedicals; Santa Ana, CA, USA) as previously described [29] (Figs. 1, 2). Following extraction, all samples were stored at −20 °C until analysed via real-time PCR.
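As a bookkeeping aid, the labelling scheme described above can be enumerated in a few lines. This is a sketch of our reading of the text (two individual extractions per sample, two formations of each pool, two extractions per formation); the function name and the label format are assumptions, not part of the published protocol.

```python
def extraction_labels(sample_ids):
    """Return (individual, pool) extraction labels for one pool of samples.

    Individual samples are extracted as aliquots A and B; the pool is
    formed twice (PA, PB) and each formation is extracted twice (1, 2).
    """
    individual = [f"{s}{rep}" for s in sample_ids for rep in ("A", "B")]
    pool = [f"P{form}{ext}" for form in ("A", "B") for ext in (1, 2)]
    return individual, pool


individual, pool = extraction_labels([1, 2, 3, 4, 5])
print(individual)  # 1A, 1B, ..., 5A, 5B
print(pool)        # PA1, PA2, PB1, PB2
print(len(individual) + len(pool), "extractions per pool of five")
```

Counted this way, the validation design involves 14 extractions for each set of five samples, which is why the cost exercise considers an optimized version that does not process individual samples alongside the pools.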

Real-time PCR analysis

The cycling conditions, information on sequences from primers and probes and master mix used have all been previously described [22, 23, 29].

Data and statistical analysis

To assess the diagnostic performance of the 5-sample pools, we calculated sensitivity, specificity, negative predictive value (NPV) and positive predictive value (PPV) in Excel v. 2016. Accuracy of the pooling method was also calculated using the formula: (true positives + true negatives)/number of pools. Confidence intervals (CI) for sensitivity, specificity, PPV and NPV were calculated using the Clopper–Pearson exact binomial method [32]. For these calculations qPCR results for the individual aliquots were considered as the ‘gold-standard’. Results for NHM and SC were calculated and presented separately and stratified by helminth species. Chi-square tests were conducted to determine whether there was statistical evidence of a difference in the sensitivity and specificity estimates between the two laboratories. To better understand how pooling impacted the (delayed) detection of the target compared to the individuals, Pearson’s correlation coefficient was used to quantitate the relationship between the qPCR outcome of the individual sample and that of the pooled one.
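The calculations above were done in Excel; the following standard-library Python sketch shows the same quantities for readers who want to reproduce them. The counts in the example are hypothetical, and the Clopper–Pearson bounds are found here by bisection on the binomial distribution, which is one of several equivalent ways to compute the exact interval.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float) -> float:
    """Find p in [0, 1] with f(p) == target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for k successes in n."""
    # Lower bound solves P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound solves P(X <= k | p) = alpha / 2
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Pool-level performance against individual qPCR as the reference.

    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, not taken from the study
metrics = diagnostic_metrics(tp=4, fp=0, tn=30, fn=0)
ci_low, ci_high = clopper_pearson(4, 4)  # interval for sensitivity 4/4
print(metrics["sensitivity"], round(ci_low, 2), round(ci_high, 2))
```

The example illustrates why a perfect point estimate from a handful of pools still carries a wide interval: with only four positives, the exact lower bound sits near 0.40.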

To investigate whether multiple extractions are required for each individual aliquot and/or 5-sample pool, Cohen’s kappa statistic [33] was calculated. This determines the degree of agreement in qPCR results (positive/negative) between A/B aliquots and between the 5-sample pool duplicates (PA1 and PA2, PB1 and PB2). Finally, to demonstrate directly the agreement between the results obtained at NHM for the individual extracts and those originally obtained at SC as part of the independent study (WASH Benefits Bangladesh trial, see above), Cohen’s kappa statistic was also calculated.
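For reference, Cohen’s kappa for two sets of positive/negative calls can be sketched as follows. The replicate calls in the example are invented for illustration, and returning NaN when chance agreement equals 1 is a choice made here, since kappa is undefined in that case.

```python
from math import nan


def cohens_kappa(calls_a, calls_b) -> float:
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    a, b = list(calls_a), list(calls_b)
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of identical calls
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each replicate's marginal frequencies
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in set(a) | set(b))
    if expected == 1.0:
        return nan  # both replicates gave one identical call throughout
    return (observed - expected) / (1 - expected)


# Hypothetical qPCR calls (1 = positive, 0 = negative) for ten pools
replicate_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
replicate_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(round(cohens_kappa(replicate_1, replicate_2), 2))
```

Here the replicates agree on 8 of 10 pools, yet kappa is only moderate, because most of that agreement would be expected by chance when negatives dominate.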

Cost exercise computation

Costs based on 1000 samples requiring processing (individually or as part of 5-sample pools) were calculated; the sample size was small enough for easy analysis and large enough to represent a case where pooling might be justified. For consistency and accurate reporting, the present protocol included all the extractions in duplicate and the formation and subsequent extraction of the same pool twice; these components were also part of the cost model and comparison. This cost exercise included labour and consumable costs (for plasticware and reagents per sample per assay run, based on list prices), tailored to a theoretically optimized version of the developed protocol (i.e. a protocol that would not process individual samples along with the pools simultaneously), as mentioned earlier.

Two separate case-scenarios were plotted for this exercise. In the simple case scenario, all the individual samples are negative (thus, so are the pools), and there is no need for ‘spin outs’; hence, only costs for labour and consumables (based on list prices online) are included. As part of a more complicated scenario, two different prevalence rates—with a single STH present for simplicity—were factored in; 2% which reflects the defined transmission breakpoint, and 15% as an indicator of prevalence when control programmes are needed and when pooling could be considered above individual sampling. In a ‘best-case’ complicated scenario, all positive samples would cluster together (e.g. 5 positive samples in a 5-sample pool). Whereas, in a ‘worst-case’ complicated scenario only one positive sample would be part of a 5-sample pool (e.g. mixed with four ‘negatives’).


Pooling was evaluated in terms of consistency, robustness, reproducibility and cost-effectiveness with comparisons made against individual sample results and between replicate pools.

Sensitivity of the 5-sample pooling technique differed between helminth species for both the samples tested at NHM and SC. T. trichiura had the lowest sensitivity for both NHM (0.65, 95% CI: 0.50–0.79) and SC (0.80, 95% CI: 0.64–0.91). All other helminth species from SC had absolute sensitivity (1.00, 95% CI: 0.40–1.00) whilst for NHM the highest sensitivity was obtained for An. ceylanicum (0.82, 95% CI: 0.60–0.95). For T. trichiura and S. stercoralis there was no evidence of a difference in sensitivity between NHM and SC (P = 0.13 and P = 0.22, respectively), whilst for An. ceylanicum there was weak evidence of a difference (P = 0.07) and for A. lumbricoides there was very strong evidence of a difference in sensitivity between the two laboratories (P < 0.001) (Table 1).

Table 1 Sensitivity, specificity, accuracy, positive predictive value and negative predictive value of qPCR on pooled samples as compared to individual sample-based qPCR, for each one of the soil-transmitted helminth-specific qPCR assays, for both laboratories: Natural History Museum (NHM) and Smith College (SC)

Estimates of specificity were consistently higher than those for sensitivity, suggesting the pooling approach has a low rate of false positives. Both N. americanus and A. lumbricoides had perfect specificity from NHM (1.00, 95% CI: 0.90–1.00 and 1.00, 95% CI: 0.92–1.00, respectively), whilst the same was true for An. ceylanicum, A. lumbricoides and T. trichiura at SC. All other estimates from both laboratories were above 0.90 except for S. stercoralis at SC (0.81, 95% CI: 0.64–0.93). There was no evidence of a difference in specificity estimates between NHM and SC for A. lumbricoides (P = 1.00), T. trichiura (P = 0.76) or An. ceylanicum (P = 0.64), but there was strong evidence of a difference for S. stercoralis (P = 0.03) (Table 1).

PPV estimates were generally high across all samples, with each species’ estimate at least 0.90. The only exception was S. stercoralis, with a PPV estimate of 0.78 (95% CI: 0.40–0.97) at NHM and 0.40 (95% CI: 0.12–0.74) at SC. NPV estimates showed much greater variability, especially from the NHM testing. Here, estimates ranged from 0.72 (95% CI: 0.58–0.83) for T. trichiura to 0.94 (95% CI: 0.84–0.99) for S. stercoralis (Table 1).

Pearson’s correlation coefficient (r) values between the individual aliquot qPCR results and the pooled qPCR results were generally consistent for the NHM and SC samples for each species, with strong, positive correlations obtained from the A. lumbricoides samples (NHM: r = 0.75, P < 0.001; SC: r = 0.86, P < 0.001) and the An. ceylanicum samples (NHM: r = 0.93, P < 0.001; SC: r = 0.92, P < 0.001). The one exception was S. stercoralis, for which a strong positive correlation was identified for the NHM samples (r = 0.97, P < 0.001) but a very weak, statistically non-significant, negative correlation was identified for the SC samples (r = −0.07, P = 0.93) (Table 2).

Table 2 Pearson’s correlation values between individual and pooled qPCR-results (P-value) at both Smith College (SC) and Natural History Museum (NHM)

For the NHM samples, agreement in qPCR findings between both the 5-sample pool replicates and the A/B individual aliquots was moderate to high for all species, with Cohen’s kappa ranging from 0.66 to 1.00. Similarly, with the SC samples, A. lumbricoides and An. ceylanicum showed perfect agreement for both aliquots and 5-sample pools, whilst a strong agreement was found for T. trichiura 5-sample pool results. However, only weak evidence of agreement occurring more often than would be expected by chance was identified for the 5-sample pools for S. stercoralis (k = 0.44, P = 0.07) (Table 3).

Table 3 Degree of agreement in qPCR findings for all species of soil-transmitted helminths between A and B aliquots (for individual samples) and 1 and 2 samples (for pools) from Natural History Museum (NHM) and Smith College (SC) as calculated through Cohenʼs kappa statistic

Lastly, for all species, Cohen’s kappa found a very strong degree of agreement in qPCR findings (translated as positivity for that particular target) between the isolates originally obtained at SC and the pools subsequently created at NHM (k ≥ 0.77, P < 0.001) except for N. americanus, where a slightly weaker degree of agreement was identified (k = 0.51, P = 0.02) (Table 4). The raw numbers used for the analyses (number of true/false positives/negatives per set of pools) are provided in Additional file 1: Table S1.

Table 4 Degree of agreement in qPCR findings for all helminths tested between Smith College (SC) isolates and Natural History Museum (NHM) pools as calculated through Cohen’s kappa statistic

Cost exercise

In all graphs shown (Figs. 3 and 4) no absolute numbers are reported as this cost exercise would differ significantly based on income (wage), currency and technician competency which would affect labour time invested. Instead, we report relative proportions of total cost.

Fig. 3

Cost analysis comparing individual vs pooled samples in both low- and high-wage settings where all samples are known to be negative for all the soil-transmitted helminth species of interest. Dashed white line separates consumable (extraction and qPCR reagents) from labour costs

Fig. 4

Cost analysis on pooling in both low- and high-wage settings in two different scenarios and for two levels of prevalence (2% and 15%) for a single soil-transmitted helminth species. Scenarios represent ‘best’ and ‘worst’ cases of positive sample distributions across 5-sample pools; see main text. Dashed white line separates consumable (extraction, qPCR and ‘spin-out’ reagents) from labour costs

Simplest scenario: all samples are negative for the STH to be screened

In the simplest case where all the individual samples are negative (and thus, so are the pools), there is no need for ‘spin outs’; hence, only costs for labour and consumables (based on list prices online) are included (Fig. 3). In both low-income and high-income settings, the split of total cost between labour and consumables differs only slightly between samples processed in pools and the same samples processed individually (low-income setting: labour 9% and consumables 91% versus labour 7% and consumables 93%; high-income setting: labour 41% and consumables 59% versus labour 45% and consumables 55%, respectively). So, when all the samples are negative, or expected to be, there are no significant cost savings when a pooling strategy is implemented compared to processing all the samples individually.

More complicated scenarios: impact of prevalence and its distribution to the pools

In this cost exercise, two scenarios including STH prevalence rates were considered: 2% and 15% prevalence of a particular STH. Taking the example of 1000 samples (200 five-sample pools) and a prevalence of 2% or 15%, this would result in 20 and 150 positive samples, respectively. In the ‘best-case’ scenario (Fig. 4), 4 and 30 positive pools, respectively, would have to be revisited for extraction and processing. However, for the same number of samples and under the same prevalence rates, the ‘worst-case’ scenario would require 20 and 150 pools to be revisited, for 2% and 15% prevalence, respectively.
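The pool counts in this paragraph follow from simple arithmetic, sketched below. The `total_tests` helper goes one step beyond the text: it assumes that every positive pool triggers individual re-testing of all five component samples and ignores duplicate extractions, so it is an illustration under that assumption and not the cost model behind Figs. 3 and 4.

```python
from math import ceil


def pools_to_revisit(n_samples: int, prevalence: float, pool_size: int = 5) -> dict:
    """Count positive pools under the 'best' and 'worst' distributions."""
    positives = round(n_samples * prevalence)
    n_pools = ceil(n_samples / pool_size)
    return {
        "positives": positives,
        "pools": n_pools,
        # Best case: positives cluster together, filling whole pools
        "best": ceil(positives / pool_size),
        # Worst case: each positive sits in a different pool
        "worst": min(positives, n_pools),
    }


def total_tests(n_pools: int, positive_pools: int, pool_size: int = 5) -> int:
    """Pool tests plus 'spin-outs', assuming each positive pool is followed
    by individual tests of all its component samples (an assumption)."""
    return n_pools + positive_pools * pool_size


for prevalence in (0.02, 0.15):
    counts = pools_to_revisit(1000, prevalence)
    best = total_tests(counts["pools"], counts["best"])
    worst = total_tests(counts["pools"], counts["worst"])
    print(f"{prevalence:.0%}: revisit {counts['best']} (best) or "
          f"{counts['worst']} (worst) pools; {best} or {worst} tests vs 1000")
```

Under this assumption the worst case at 15% prevalence needs 950 tests, close to the 1000 required for individual processing, which illustrates how quickly the advantage of pooling erodes as prevalence rises.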

In Fig. 4, for the positive pools alone, the additional cost for labour and consumables needed for the ‘spin-outs’ was also estimated and incorporated into the graphs. In the ‘worst-case’ scenario, as the prevalence increases the labour cost also increases in both low- and high-income settings. In the ‘best-case’ scenario, for the same parameters (low to high prevalence), the consumable cost is slightly higher only in the low-income settings, whereas in the high-income settings it is the labour cost that rises as the prevalence increases.


The strategy of pooling has been considered an attractive way of screening multiple samples simultaneously for a particular target/pathogen, both in research and veterinary settings, potentially lowering the cost of labour or consumables needed [4, 10,11,12, 18, 27]. At the SC laboratory, some preliminary work on screening ‘pools of 10’ was conducted, and even though no dramatic loss of sensitivity was observed, the practicality of the process was deemed more challenging due to lack of sufficient equipment. For this reason, and upon initial cost assessment of consumable and reagent costs involved in ‘pooling’, we focused on assessing a strategy of using 5-sample pools.

The main question of this study was whether pooling is an appropriate strategy for the qualitative detection of STHs in a post-treatment population, where most individuals are expected to be identified as ‘negative’ (based on the diagnostic test chosen). In a setting where most samples are negative, most pooled samples will also be negative, potentially reducing labour and consumable costs and lowering the likelihood of having to re-examine individual samples when pools are found to be positive. Moreover, we aimed to show that pooling does not dramatically reduce the chances of target detection by PCR (given that the target is further diluted as part of the pool). These questions are widely relevant for both veterinary [10] and clinical trials and epidemiological studies where large numbers of infected stool samples must be processed to assess infection presence and intensity [15, 26]. Our study focused on a qualitative assessment of the infection levels (presence/absence). The correlation of eggs found in a stool sample to worm burden and subsequently to intensity of infection is of paramount importance in epidemiological studies. A recent review by Papaiakovou et al. [34] addresses the concerns around quantitation of qPCR outputs and their confident correlation to egg numbers and, therefore, to intensity of infection. We believe that qPCR has yet to achieve its potential for quantitative purposes given the limitations of the PCR target selected, cell numbers present in eggs, and extraction efficiency. Additionally, the dilution of target through pooling will further hinder such quantitation. Thus, we decided to assess presence/absence of the target in both individuals and pools.

Our main objectives were to evaluate the successful formation of the pool, the potential for single sampling of the pool (to avoid reagent and labour cost inflation due to multiple sampling) and the subsequent detection of the diluted target with precision and accuracy. To our knowledge, this is the first time such queries have been interrogated to assist in strategic planning.

Method development

Given prior research on the need to blend stool samples sufficiently [35], and the importance of STH egg disruption using a high-speed bead-based homogenizer [36,37,38], we recognised that the method used to form pools would be critical, and that accurate detection of the targets, once evenly distributed and diluted in the pool, would be challenging.

The development of a ‘pooling’ protocol that overcomes known limitations and meets all of the aforementioned expectations was relatively straightforward for the ‘seeding experiment’, where only N. americanus eggs were used and tested. However, mixing or stirring the faecal pool with a sterile loop or low-power vortexer was insufficient for the field-samples experiment, where the stool samples used were positive for additional STH species. The different consistencies of the stool samples involved, along with the low infection load in each of the samples, showed that more thorough mixing was required. Furthermore, both additional buffer and a bead-beating step proved critical to facilitate the homogeneous blending of the helminth eggs (or DNA).

Precision and reproducibility

A working protocol that showed overall statistically significant and acceptable agreement between individuals and pools (through kappa values) was developed. The protocol presented no apparent technical errors for any of the helminths tested. However, due to its complexity and hands-on time, the need to test protocol reproducibility between different technicians and laboratory settings also emerged. Consequently, our study aimed to show that the protocol is reproducible by any adequately trained and competent technician. Hence, the same pooling workflow (Fig. 2) was compared at two different laboratories (SC and NHM).

Utilizing the pooling strategy as described herein, a generally low rate of false negatives is expected. Also, specificity does not seem to be an issue overall but of interest remains the lower PPV for the S. stercoralis which is discussed in a separate section below.

Last but not least, the list of samples chosen to be pooled had originally been extracted and tested at SC (using the same extraction protocol and the same qPCR assays). Aliquots from the same stool samples were selected to be extracted independently (individually, and as part of pools) at NHM. Almost absolute agreement was shown between individual samples originally and independently tested with qPCR at SC with the results (individual and pool) obtained from NHM.

Single replicates versus duplicates

The kappa estimates, comparing both individual aliquots and pooled aliquots, showed a high degree of agreement, which suggests that conducting the test twice may be unnecessary. For all species, agreement between pool replicates 1 and 2 was moderate to high at both laboratories, supporting the view that there is little need for multiple sampling. When processing large numbers of samples, rapid and simple detection of infection by single sampling is important because of the costs involved (reagents and labour). With our protocol, given sufficient mixing and homogenization, there appears to be no need for multiple sampling (A and B in individuals, 1 and 2 in pools), since the infection/target seems to be evenly distributed following the workflow presented here.

For direct comparison of the individual samples forming a pool with the 5-sample pools themselves, the individual samples constituting a pool were tested in duplicate, each pool was formed twice and the DNA from each pool was also extracted twice. Our study demonstrates that thorough homogenization is critical for even distribution of the target present in stool samples. With such homogenization, there is no need to extract DNA from the same sample or pool twice, and even in its most demanding format the protocol can be learned, implemented and reproducibly performed by suitably skilled technicians, as suggested by the kappa values. Given the overall high degree of agreement, we also conclude that a single pool per five samples would be sufficient.
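To make the agreement statistic concrete, the following is a minimal sketch of Cohen's kappa for two sets of binary qPCR calls, for example the first and second replicate of the same pools. The function name `cohen_kappa` and the replicate data are hypothetical and purely illustrative; they are not the study's results, and the study's own statistical implementation may differ.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of binary calls (True = positive)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of identical calls
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each replicate's positive rate
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both replicates gave the same constant call
    return (observed - expected) / (1 - expected)


# Hypothetical calls for ten pools, each extracted twice (not study data)
replicate_1 = [True, True, False, False, True, False, True, False, False, True]
replicate_2 = [True, True, False, False, True, False, False, False, False, True]
print(round(cohen_kappa(replicate_1, replicate_2), 3))
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is preferred over simple concordance when positives are rare.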


Even though the specificity for S. stercoralis was not significantly different at SC compared to NHM, the PPV was slightly lower (individual samples identified as negatives when screened by PCR were deemed positive for S. stercoralis as part of the pools). However, this can be attributed to the lower prevalence of S. stercoralis in the SC samples (10%) as compared to other parasites (at around 40–50%). As a worked example demonstrating the impact of prevalence on PPV, if the sensitivity and specificity for S. stercoralis calculated at SC remained constant (1.00 and 0.625, respectively) but prevalence was increased to 30%, the “new” PPV would be calculated as 0.79, i.e. more consistent with findings from NHM.
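The dependence of PPV on prevalence follows from Bayes' rule. The sketch below assumes the textbook definition, in which sensitivity, specificity and prevalence all refer to the same testing unit. The input values are illustrative placeholders rather than the study's estimates, and the authors' own calculation may have been carried out differently (for example at the pool level), so this sketch is not intended to reproduce the 0.79 quoted above.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Illustrative values only: same assay characteristics, rising prevalence
for prevalence in (0.05, 0.10, 0.30, 0.50):
    print(prevalence, round(ppv(0.95, 0.90, prevalence), 3))
```

With sensitivity and specificity held fixed, PPV rises with prevalence, which is the qualitative point made in the paragraph above.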

Moreover, the presence of larvae instead of eggs, and the additional beating steps in the pool (versus individual samples), may have contributed to the infection being ‘missed’ in certain individual samples. It is suspected that further homogenization of larvae facilitated target detection in the pool, but not in the aliquot from the individual. Another possible explanation is that ‘weak’ infections, undetectable in the individuals because of the limit of detection of the qPCR assay, collectively surpassed the detection threshold as part of the pool. All the individual samples had been previously screened independently, as mentioned earlier. Since all the samples previously reported as negatives were indeed negative when tested in the laboratory, we are ruling out contamination that could lead to ‘false positive’ results. These samples were ‘true positive’ for S. stercoralis, hence we believe the respective pools were not ‘false positive’. However, a higher prevalence of S. stercoralis in a given dataset would be needed in order to draw any further conclusions.

In the case of N. americanus and A. lumbricoides, since there was almost perfect agreement between individuals and respective pools, the slightly weaker agreement between original extracts and aliquots run at NHM may indicate a lack of adequate homogenization in the original sample.

Cost and time savings with pooling

The authors acknowledge that a viable and cost-effective protocol must not be too complicated or too laborious to set up. Additionally, any protocol established as a time-saving strategy cannot be less cost-effective than processing the same number of samples individually. For this reason, a broad indicative cost analysis was carried out by our team. We calculated costs based on 1000 samples requiring processing: small enough for easy analysis, large enough to represent a case where pooling might be justified. For consistency and accurate reporting, the current protocol included all the extractions in duplicate and the formation and subsequent extraction of the same pool twice; these components were also part of the cost model and comparison.

For every pool positive for a single parasite, there is the need to ‘re-visit’ the individual samples that originally formed the pool, repeat the extraction step for each component sample and test each extract for the parasite of interest. For every additional parasite detected in the pooled sample, the additional cost comes from consumables and the time to perform qPCR. Pooling in the presence of positives therefore adds to the overall cost of this alternative strategy relative to single-sample processing. Nevertheless, there remains room for further optimization of the current workflow (larger capacity homogenizers, purification and liquid handling systems). With a streamlined protocol capable of eliminating ‘redundant’ steps (three versus two rounds of homogenization for the pool), further simplification may be possible, providing additional time and cost savings even when low percentages of STH prevalence are expected. Also, in cases where microscopy data are available for individual samples, a ‘strategic pooling’ approach could be to use the samples identified as negatives for forming the pools and to process the rest individually.

We acknowledge that our cost estimates based on list prices might not accurately reflect potential cost savings from bulk or similar discounted purchasing, but the relative costs are likely indicative of broader trends. In our cost exercise, we included a simple case, where all samples are expected to be negative, and a more complicated case with the infection present in a population. In the latter, we included only a ‘worst-’ and a ‘best-case’ scenario, along with only two levels of prevalence (2% and 15%) for a single STH species, based on low- and high-income countries. We understand that a realistic situation of the prevalence and distribution of any helminth present will lie somewhere in between. A more comprehensive mathematical cost model would include coefficients such as prevalence rates for one or more STH species, the cost of ‘spin-outs’ from ‘false positives’ or the long-term ‘penalty’ of false negatives, and tailored wages, to name a few.
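The effect of how positives are distributed across pools can be sketched by counting DNA extractions only. This is a simplified illustration, not the authors' cost model. It assumes a perfectly sensitive and specific test, a single extraction per pool, a sample count divisible by the pool size, and re-extraction of every member of each positive pool. The ‘best case’ is taken here to mean positives clustered into as few pools as possible and the ‘worst case’ to mean at most one positive per pool; both readings, and the function names, are assumptions made for this example.

```python
import math


def extractions_two_stage(n_samples, pool_size, n_positive_pools):
    """Pool extractions plus individual re-extractions for every positive pool."""
    n_pools = math.ceil(n_samples / pool_size)
    return n_pools + pool_size * n_positive_pools


def scenario_bounds(n_samples, pool_size, prevalence):
    """Return (best_case, worst_case) extraction counts for one STH species."""
    n_pools = math.ceil(n_samples / pool_size)
    n_positive = round(n_samples * prevalence)
    # Best case (assumed): positives packed into as few pools as possible
    best_pools = math.ceil(n_positive / pool_size)
    # Worst case (assumed): each positive sits in a different pool
    worst_pools = min(n_positive, n_pools)
    return (
        extractions_two_stage(n_samples, pool_size, best_pools),
        extractions_two_stage(n_samples, pool_size, worst_pools),
    )


for prevalence in (0.02, 0.15):
    print(prevalence, scenario_bounds(1000, 5, prevalence))
```

Under these assumptions, 1000 samples in pools of five would need between 220 and 300 extractions at 2% prevalence and between 350 and 950 at 15%, compared with 1000 for individual processing. The count ignores duplicate extractions, the extra homogenization labour for pools, qPCR reagents and wages, all of which the cost exercise described above takes into account.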

To pool or not to pool

The main driver for developing and testing a pooling protocol has always been the potential savings in labour and consumables, but the additional dilution of the target, and the subsequent loss of sensitivity of the diagnostic method employed, has been of major concern. Recent research has challenged and augmented those concerns; pooling might not be the cost-effective technique once hoped for.

Logistical and operational costs [18], special equipment or additional consumables needed (this study), the necessity of reproducibility (this study) and single-sample granularity in the infection present (revealing the ‘positive’ individuals that contribute to a ‘positive’ pool; this study), or generally prevalence in a given population [21], labour cost and study size are amongst the pivotal factors that will determine whether a pooling protocol will actually be beneficial and worthwhile.


We describe a successful pooling strategy that limits false-negative results, demonstrates reproducibility and minimizes the need for multiple replicates, as long as the individual stools forming the pool are sufficiently mixed. Such a methodology is yet to be simplified and tailored to the needs of any interventions. Even though pooling is more likely a better fit for low STH prevalence or surveillance areas and clusters where interruption of transmission is approached (< 2%), the findings and approach of this study will facilitate future protocol developments and optimizations. Our hope is that this study will assist in decision-making on single versus pooling implementation when considering end-to-end processes, budgeting and time considerations in diagnosing STH in faecal samples.

Availability of data and materials

The datasets used and/or analysed during the present study are available from the corresponding author upon reasonable request.



Abbreviations

STH: soil-transmitted helminths
qPCR: quantitative polymerase chain reaction
SC: Smith College
NHM: Natural History Museum


1. Wahlquist SP, Williams RM, Bishop H, Addiss DG, Stewart JM, Finton RJ, et al. Use of pooled formalin-preserved fecal specimens to detect Giardia lamblia. J Clin Microbiol. 1991;29:1725–6.

2. Singer RS, Cooke CL, Maddox CW, Isaacson RE, Wallace RL. Use of pooled samples for the detection of Salmonella in feces by polymerase chain reaction. J Vet Diagn Investig. 2006;18:319–25.

3. Mitchell S, Pagano M. Pooled testing for effective estimation of the prevalence of Schistosoma mansoni. Am J Trop Med Hyg. 2012;87:850–61.

4. Mekonnen Z, Meka S, Ayana M, Bogers J, Vercruysse J, Levecke B. Comparison of individual and pooled stool samples for the assessment of soil-transmitted helminth infection intensity and drug efficacy. PLoS Negl Trop Dis. 2013;7:e2189.

5. Vlaminck J, Cools P, Albonico M, Ame S, Ayana M, Bethony J, et al. Comprehensive evaluation of stool-based diagnostic methods and benzimidazole resistance markers to assess drug efficacy and detect the emergence of anthelmintic resistance: a Starworms study protocol. PLoS Negl Trop Dis. 2018;12:e0006912.

6. Shipitsyna E, Shalepo K, Savicheva A, Unemo M, Domeika M. Pooling samples: the key to sensitive, specific and cost-effective genetic diagnosis of Chlamydia trachomatis in low-resource countries. Acta Derm Venereol. 2007;87:140–3.

7. Lo NC, Coulibaly JT, Bendavid E, N’Goran EK, Utzinger J, Keiser J, et al. Evaluation of a urine pooling strategy for the rapid and cost-efficient prevalence classification of schistosomiasis. PLoS Negl Trop Dis. 2016;10:e0004894.

8. Verstraeten T, Farah B, Duchateau L, Matu R. Pooling sera to reduce the cost of HIV surveillance: a feasibility study in a rural Kenyan district. Trop Med Int Health. 1998;3:747–50.

9. Zaky WI, Tomaino FR, Pilotte N, Laney SJ, Williams SA. Backpack PCR: a point-of-collection diagnostic platform for the rapid detection of Brugia parasites in mosquitoes. PLoS Negl Trop Dis. 2018;12:e0006962.

10. Eysker M, Bakker J, van den Berg M, van Doorn DCK, Ploeger HW. The use of age-clustered pooled faecal samples for monitoring worm control in horses. Vet Parasitol. 2008;151:249–55.

11. Pedersen KS, Johansen M, Jorsal SE, Nielsen JP, Bækbo P, Angen Ø. Pooling of porcine fecal samples for quantification of Lawsonia intracellularis by real-time polymerase chain reaction. J Vet Diagn Investig. 2014;26:342–5.

12. Clasen J, Mellerup A, Olsen JE, Angen Ø, Folkesson A, Halasa T, et al. Determining the optimal number of individual samples to pool for quantification of average herd levels of antimicrobial resistance genes in Danish pig herds using high-throughput qPCR. Vet Microbiol. 2016;189:46–51.

13. Kenyon F, Rinaldi L, McBean D, Pepe P, Bosco A, Melville L, et al. Pooling sheep faecal samples for the assessment of anthelmintic drug efficacy using McMaster and Mini-FLOTAC in gastrointestinal strongyle and Nematodirus infection. Vet Parasitol. 2016;225:53–60.

14. Anderson R, Farrell S, Turner H, Walson J, Donnelly CA, Truscott J. Assessing the interruption of the transmission of human helminths with mass drug administration alone: optimizing the design of cluster randomized trials. Parasites Vectors. 2017;10:93.

15. Ásbjörnsdóttir KH, Ajjampur SSR, Anderson RM, Bailey R, Gardiner I, Halliday KE, et al. Assessing the feasibility of interrupting the transmission of soil-transmitted helminths through mass drug administration: the DeWorm3 cluster randomized trial protocol. PLoS Negl Trop Dis. 2018;12:e0006166.

16. Pullan RL, Smith JL, Jasrasaria R, Brooker SJ. Global numbers of infection and disease burden of soil transmitted helminth infections in 2010. Parasites Vectors. 2014;7:37.

17. Kure A, Mekonnen Z, Dana D, Bajiro M, Ayana M, Vercruysse J, et al. Comparison of individual and pooled stool samples for the assessment of intensity of Schistosoma mansoni and soil-transmitted helminth infections using the Kato-Katz technique. Parasites Vectors. 2015;8:489.

18. Leta GT, French M, Dorny P, Vercruysse J, Levecke B. Comparison of individual and pooled diagnostic examination strategies during the national mapping of soil-transmitted helminths and Schistosoma mansoni in Ethiopia. PLoS Negl Trop Dis. 2018;12:e0006723.

19. Medley GF, Turner HC, Baggaley RF, Holland C, Hollingsworth TD. The role of more sensitive helminth diagnostics in mass drug administration campaigns: elimination and health impacts. Adv Parasitol. 2016;94:343–92.

20. Khurana S, Sethi S. Laboratory diagnosis of soil transmitted helminthiasis. Trop Parasitol. 2017;7:86–91.

21. Truscott JE, Dunn JC, Papaiakovou M, Schaer F, Werkman M, Littlewood DT, et al. Calculating the prevalence of soil-transmitted helminth infection through pooling of stool samples: choosing and optimizing the pooling strategy. PLoS Negl Trop Dis. 2019;13:e0007196.

22. Pilotte N, Papaiakovou M, Grant JR, Bierwert LA, Llewellyn S, McCarthy JS, et al. Improved PCR-based detection of soil-transmitted helminth infections using a next-generation sequencing approach to assay design. PLoS Negl Trop Dis. 2016;10:e0004578.

23. Papaiakovou M, Pilotte N, Grant JR, Traub RJ, Llewellyn S, McCarthy JS, et al. A novel, species-specific, real-time PCR assay for the detection of the emerging zoonotic parasite Ancylostoma ceylanicum in human stool. PLoS Negl Trop Dis. 2017;11:e0005734.

24. Lim MD, Brooker SJ, Belizario VY, Gay-Andrieu F, Gilleard J, Levecke B, et al. Diagnostic tools for soil-transmitted helminths control and elimination programs: a pathway for diagnostic product development. PLoS Negl Trop Dis. 2018;12:e0006213.

25. Becker SL, Liwanag HJ, Snyder JS, Akogun O Jr, Belizario V Jr, Freeman MC, et al. Toward the 2020 goal of soil-transmitted helminthiasis control and elimination. PLoS Negl Trop Dis. 2018;12:e0006606.

26. Tabi ES, Eyong EM, Akum EA, Löve J, Cumber SN. Soil-transmitted helminth infection in the Tiko Health District, South West Region of Cameroon: a post-intervention survey on prevalence and intensity of infection among primary school children. Pan Afr Med J. 2018;30:74.

27. Muñoz-Zanzi C, Thurmond M, Hietala S, Johnson W. Factors affecting sensitivity and specificity of pooled-sample testing for diagnosis of low prevalence infections. Prev Vet Med. 2006;74:309–22.

28. WHO Expert Committee on the Control of Schistosomiasis (2001: Geneva, Switzerland) & World Health Organization. Prevention and control of schistosomiasis and soil-transmitted helminthiasis: report of a WHO expert committee. Geneva: World Health Organization; 2002. Accessed 23 Apr 2019.

29. Papaiakovou M, Pilotte N, Baumer B, Grant J, Asbjornsdottir K, Schaer F, et al. A comparative analysis of preservation techniques for the optimal molecular detection of hookworm DNA in a human fecal specimen. PLoS Negl Trop Dis. 2018;12:e0006130.

30. Papaiakovou M, Pilotte N, Hu Y, Aroian RV, Walson JL, Williams SA. Pool the stool: pooling stool samples as a strategy for increasing the efficiency and effectiveness of real-time PCR for soil-transmitted helminths (STH). Am Soc Trop Med Hyg. 2017;97:615.

31. Ercumen A, Benjamin-Chung J, Arnold BF, Lin A, Hubbard AE, Stewart C, et al. Effects of water, sanitation, handwashing and nutritional interventions on soil-transmitted helminth infections in young children: a cluster-randomized controlled trial in rural Bangladesh. bioRxiv. 2019;512509.

32. Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934;26:404–13.

33. Watson PF, Petrie A. Method agreement analysis: a review of correct methodology. Theriogenology. 2010;73:1167–79.

34. Papaiakovou M, Gasser RB, Littlewood DTJ. Quantitative PCR-based diagnosis of soil-transmitted helminth infections: faecal or fickle? Trends Parasitol. 2019;35:491–500.

35. Krauth SJ, Coulibaly JT, Knopp S, Traoré M, NʼGoran EK, Utzinger J. An in-depth analysis of a piece of shit: distribution of Schistosoma mansoni and hookworm eggs in human stool. PLoS Negl Trop Dis. 2012;6:e1969.

36. Andersen UV, Haakansson IT, Roust T, Rhod M, Baptiste KE, Nielsen MK. Developmental stage of strongyle eggs affects the outcome variations of real-time PCR analysis. Vet Parasitol. 2013;191:191–6.

37. Demeler J, Ramünke S, Wolken S, Ianiello D, Rinaldi L, Gahutu JB, et al. Discrimination of gastrointestinal nematode eggs from crude fecal egg preparations by inhibitor-resistant conventional and real-time PCR. PLoS ONE. 2013;8:e61285.

38. Espírito-Santo MCC, Alvarado-Mora MV, Pinto PLS, Carrilho FJ, Pinho JRR, Gryschek RCB. Two sequential PCR amplifications for detection of Schistosoma mansoni in stool samples with low parasite load. Rev Inst Med Trop Sao Paulo. 2012;54:245–8.



The authors would like to thank Raffi Aroian (UMass Medical School, Worcester, USA) for providing stool infected with Necator americanus, Dr Jack Colford, Dr Jade Benjamin-Chung, Dr Benjamin Arnold, and Dr Ayse Ercumen (University of California, Berkeley, USA) for providing field samples collected in Bangladesh. Also, we are grateful to Samara Loewenstein (SC) for her help with the preliminary laboratory work employing the ‘pooling’ protocol, Jacqueline Maasch (SC) for preparing stool aliquots, Sue Haynes (SC) for logistical support and Raju Misra for facilitating access to end-to-end set-up of the molecular diagnostics at the NHM.


This project was funded by DeWorm3 (Bill and Melinda Gates Foundation; OPP1129535) and NHM Departmental Investments funds.

Author information




MP, JW, NP, JCD, JLW, SAW and DTJL contributed to study concept design. MP led the majority of laboratory work with assistance from DC. JW led the statistical analyses. FS and IG contributed to the cost analysis. MP led on the drafting of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Marina Papaiakovou.

Ethics declarations

Ethics approval and consent to participate

Sample collection and testing were approved by the Ethical Review Committee at the International Center for Diarrhoeal Disease Research, Bangladesh (PR-14105), as well as by the institutional review board at Stanford University (27864) and the Committee for the Protection of Human Subjects at the University of California, Berkeley (2014-08-6658).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.


Supplementary information

Additional file 1: Table S1.

Raw numbers of true positives/negatives for both ‛seedingʼ experiment (Smith College) using N. americanus egg counts and field-sample testing (Smith College and Natural History Museum) for both individual samples and pools.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


Cite this article

Papaiakovou, M., Wright, J., Pilotte, N. et al. Pooling as a strategy for the timely diagnosis of soil-transmitted helminths in stool: value and reproducibility. Parasites Vectors 12, 443 (2019).



  • Breakpoint of transmission
  • Cost analysis of pooling
  • Pooling
  • Soil-transmitted helminths
  • Stool samples
  • qPCR-based diagnostics