Density-dependent effects on the weight of female Ascaris lumbricoides infections of humans and its impact on patterns of egg production

Background: Ascaris lumbricoides exhibits density-dependent egg production, a process which has a marked impact on both the transmission dynamics and the stability of the parasite population. Evidence suggests that the egg production of female Ascaris is also associated with the size of the worm. If worm size is mediated by density-dependent processes then the size of female worms may have a causal impact upon patterns of Ascaris egg production.
Results: We analyse data collected from a cohort of human hosts, and demonstrate that the per host mean weight (a proxy for size) of female Ascaris is dependent on the number of infecting females (worm burden) following a pattern of initial facilitation followed by limitation. Applying a negative binomial (NB) generalized linear model (GLM) and a zero-inflated negative binomial (ZINB) model we confirm that the per host female mean weight is significantly associated with per host egg production. Despite these associations, the mean weight of female Ascaris has little causal impact on patterns of density-dependent egg output. The ZINB model is able to account for the disproportionately large number of zero egg counts within the data and is shown to be a consistently better fit than the NB model. The probability of observing a zero egg count is demonstrated as being negatively associated with both female worm burden and female mean weight.
Conclusion: The mean weight of female Ascaris is statistically significantly associated with egg output, and follows a consistent pattern of facilitation preceding limitation with increasing female worm burden. Despite these relationships, incorporation of female Ascaris mean weight into models of egg output has little effect on patterns of density dependence. The ZINB model is a superior fit to the data than the NB model and provides additional information regarding the mechanisms that result in a zero egg count. The ZINB model is shown to be a useful tool for the analysis of individual-based egg output data.


Background
Density-dependent population processes can occur at each stage of a parasite's lifecycle [1]. For the gastrointestinal (GI) nematodes these include establishment within the host, development and maturation time, adult survival, and female fecundity [2,3]. Density dependence has important implications for both the stability [2] and transmission dynamics [1,3,4] of helminth populations. Incorporation of these processes into mathematical models as accurately as possible is vital for furthering our understanding of important dynamical behaviour, such as the rate of re-infection following chemotherapeutic intervention and the spread of anthelmintic resistance [3][4][5].
In Ascaris lumbricoides infections of humans, density-dependent egg production has been reported, with the per capita egg output decreasing with increasing number of worms per host (worm burden) [1,6]. Both the severity of density dependence and the level of egg production exhibit marked geographic variability [7]. This variability has implications for the use of egg counts to estimate the intensity of infection [7], and the applicability of transmission models across geographical locations for decision support in view of recent efforts to integrate the control of neglected tropical diseases.
Density-dependent reductions in worm size may be an important factor in Ascaris egg output. A positive correlation between worm size and egg production is commonly described in GI nematodes of ruminants (e.g. [8][9][10][11][12][13]) as well as in Ascaris infections of humans [14,15]. A constraint in worm size at high worm burdens may have a causal impact on patterns of density-dependent egg production. Reductions in worm size at high burdens have been described in both natural [12,16] and experimental [17] systems of directly-transmitted helminths in nonhuman mammals. There is conflicting evidence on the relationship between size and worm burden in Ascaris infections of humans. A number of studies have reported no evidence for density-dependent constraints [18][19][20][21] whereas the opposite has been described elsewhere [22].
Worm size and egg production may also interact with the host's immune response. In lambs, acquired immune responses to Teladorsagia (= Ostertagia) circumcincta infections are known to control the size of worms and reduce their egg output [8,23-25]. Similar correlations between the host immune response, worm size and egg output have been described in human hookworm infections [26]. Experimental infections of rats with Strongyloides ratti have shown that worms are larger and sometimes more fecund in immune-suppressed rats and smaller and sometimes less fecund in immunized animals when compared to controls [27,28]. Furthermore, density-dependent fecundity effects in this nematode species are known to depend on the host immune response [29,30].
Despite a number of factors potentially influencing the egg production of female Ascaris, this important demographic and fitness parameter is ubiquitously described in terms of a single variable; the worm burden. This is largely due to the inherent difficulties in applying suitable statistical models to parasitological data [31,32]. Statistical analyses tend to be complicated by the high degree of variability in the egg output from a single host [33][34][35][36], highly overdispersed distributions of worms and egg output across a population of hosts [1,37], and the sensitivity and quantitative reliability of the diagnostic technique [35,36,38]. The estimated concentration of eggs may also be biased by host factors such as the volume of faeces produced (e.g., estimates in children tend to be inflated relative to those in adults) [35]. Typically, density-dependent reductions in egg output are presented in terms of female worm fecundity, a composite parameter describing the per capita egg production per unit time (eggs per gram of faeces divided by the number of (female) worms per host). Detection of density dependence has frequently been performed by fitting a functional form to the relationship between fecundity and worm burden. This method contravenes assumptions of statistical independence and may introduce bias via inaccuracies in the estimation of worm burden [39].
A number of studies have characterised density-dependent patterns of Ascaris egg output by fitting statistical models to grouped mean egg output data (e.g. [7,19,34]). The advantage of this method is that, assuming a large enough sample size per group, the distribution of means can be assumed to be normal, invoking the central limit theorem. However, this complicates the investigation of other variables which may also be important determinants of egg production. In addition, density-dependent Ascaris egg output has been exclusively described using data collected from populations at temporal equilibrium; consequently, whether this phenomenon is static or temporally dynamic is not known.
The analyses in this study are split into two parts. In the first part we define and fit a statistical model to evaluate the evidence for different forms of density dependence in the per host mean weight (a proxy for size) of female Ascaris. In the second, we explore the relationship between the mean weight of female worms and per host egg output using a multivariate modelling approach (controlling for both female worm burden and host age). We explore the suitability of two types of statistical model for modelling these individual-based egg output data: a negative binomial (NB) generalized linear model (GLM) [40] and a zero-inflated negative binomial (ZINB) model [41,42]. The latter is useful in modelling data with a high proportion of zero counts [42][43][44] and has been applied to GI nematode egg count data in two previous studies [45,46]. Throughout this paper we define the net egg output as the estimated concentration of eggs per gram of faeces per host (regardless of whether or not they are fertilised). Thus we distinguish between egg production (fecundity) and fertility, whereby the latter measures the number of fertilised and embryonated eggs that a female worm produces (i.e. live offspring).

Study area and data collection
Data were collected from a poor urban suburb of Dhaka, Bangladesh between 1988 and 1989 by Hall and colleagues [7,47,48]. Briefly, households were visited and all their occupants invited to take part in the study with the aim of recruiting as many individuals as possible. All participants were asked to provide a faecal sample from which the number of Ascaris lumbricoides eggs was counted using a quantitative ether sedimentation technique [36] and the concentration of eggs per gram of faeces (EPG) estimated. Pyrantel pamoate was administered to each subject and their stools were collected for a period of 48 hours post-treatment. The worms recovered (A. lumbricoides) from the faeces of each individual were sexed, counted and weighed. Egg counts, treatment and worm counts were repeated on two further occasions at six-monthly intervals. Pyrantel pamoate paralyses adult Ascaris allowing them to be expelled intact from the gut [49] with a cure rate of approximately 88% [50]. Hence, these data provide a reliable and accurate measure of the number and weight of worms per host. The population of worms recovered after the first round of chemotherapy is termed the baseline population, after the 2nd round of chemotherapy, the 1st re-infection population and after the 3rd and final round, the 2nd re-infection population. The pre-treatment egg counts are similarly referred to.

Sample size
To evaluate the evidence for different forms of density dependence affecting the per host mean weight of female Ascaris, analyses were performed on the data collected from all individuals who were found to be infected with at least one female worm. To explore the relationship between the per host mean weight of female Ascaris and the per host egg output, data were analysed from those individuals who were found to be infected with at least one female worm and from whom an estimate of egg output had been made. Table 1 summarises the data (available upon request to authors) used in these analyses. Definitions and descriptions of all parameters and variables used throughout this paper are given in Table 2.

Per host female mean weight and worm burden
To explore the relationship between the mean weight of female Ascaris in each host infra-population and the female worm burden we define the following statistical model. Let n be the number of female worms in a single host. Given a host harbours n female worms, the weights of the individual female worms are assumed to be inde[...] can initially increase with increasing n followed by an asymptotic decline, describing a pattern of initial facilitation (positive density dependence) followed by limitation (negative density dependence); for examples and further discussion of using forms of equation (2) to describe density dependence in other host-parasite systems see [51,52]. Each of Models A-C is nested within the following one, allowing their respective fits to be compared using the likelihood-ratio statistic (LRS) [53]. Under the null hypothesis the LRS follows a chi-square distribution with degrees of freedom (d.f.) equal to the difference in the number of parameters being estimated [54]. Akaike's information criterion (AIC) [55] was also calculated as an additional measure of goodness-of-fit.
Host age, a, was incorporated into the model as a two-level factor (a = 0 for children ≤ 12 years, a = 1 for teenagers and adults > 12 years) to allow the parameters pertaining to density dependence, α_a, to vary between age groups. The variance parameters, σ, were considered independent of host age. Assuming the observed per host mean female weight data to follow a normal distribution of the form given in equation (1), a log-likelihood function was derived and maximised using the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) [56][57][58][59] optimisation algorithm to obtain maximum likelihood estimates (MLEs) of the unknown parameters (α_a and σ). The BFGS algorithm was implemented using the optim function in the R statistical program (v.2.8.0) [60,61]. The best-fit form of equation (2) was determined separately in each of the baseline, 1st and 2nd re-infection populations.
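The nested-model comparison described above can be illustrated with a minimal sketch. It is not the authors' R code: the log-likelihood values and parameter counts are hypothetical placeholders, and the chi-square tail probability is computed with a small helper written here for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-x/2) * sum_j (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: start from df = 1 and add the half-integer gamma terms.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (j + 0.5)
    return total


def likelihood_ratio_test(loglik_null, loglik_alt, extra_params):
    """LRS = 2 * (logL_alt - logL_null), referred to chi-square(extra_params)."""
    lrs = 2.0 * (loglik_alt - loglik_null)
    return lrs, chi2_sf(lrs, extra_params)


def aic(loglik, n_params):
    """Akaike's information criterion: 2 * (number of parameters) - 2 * logL."""
    return 2.0 * n_params - 2.0 * loglik


# Hypothetical maximised log-likelihoods for two nested models
# (illustrative numbers only, not results from the study).
loglik_limitation = -512.4     # simpler model, assumed 3 parameters
loglik_facilitation = -508.1   # richer model, assumed 4 parameters

lrs, p_value = likelihood_ratio_test(loglik_limitation, loglik_facilitation, 1)
print(f"LRS = {lrs:.2f}, p = {p_value:.4f}")
print("AIC simpler:", aic(loglik_limitation, 3))
print("AIC richer: ", aic(loglik_facilitation, 4))
```

A small p-value or a lower AIC favours the richer model; the two criteria need not agree when the improvement in log-likelihood is marginal.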

Per host female mean weight and net egg output
Model derivation
For A. lumbricoides, the relationship between the total net egg output per host (denoted by the random variable Λ) and the per host female worm burden, n, has been empirically well described by a power function [7],

$$E(\Lambda) = \lambda_1 n^{c}. \tag{3}$$

Here E represents the expected value, λ_1 the number of eggs per gram of faeces produced by a sole infecting female, and c an inverse measure of the severity of negative density dependence (for 0 < c ≤ 1), with c = 1 indicating proportionality or density independence. This model is conveniently linearised by taking natural logarithms,

$$\ln E(\Lambda) = \ln \lambda_1 + c \ln n. \tag{4}$$

A null model was defined by extending this relationship to include host age, a (where a is once again a two-level factor), by adding a multiplicative factor, e^β [in equation (3)], for teenagers and adults > 12 years. Host age is a potentially important confounding factor associated with both the number of worms per host (the age-intensity profile of Ascaris infection is typically convex, for examples see [6,48]), and the concentration of egg counts (egg counts tend to be negatively related with the volume of faeces produced, resulting in overestimation in children compared to adults [35]). Thus, the null model (denoted Model 1) describing the relationship between net egg output and female worm burden adjusting for host age was defined as,

$$\ln E(\Lambda) = \ln \lambda_1 + c \ln n + \beta a. \tag{5}$$

In order to extend Model 1 [equation (5)] to reflect the potential dependence of per host net egg output on per host female mean weight the following preliminary analyses were performed to determine appropriate functional forms to describe the relationship. Per host egg output data were stratified by per host mean weight, taking the arithmetic mean per stratum, and regressing these values against polynomial functions (up to 3rd order) of the mean of the per host mean weight of each stratum. Stratum means were centred around their overall arithmetic mean value in order to minimise multicollinearity [62]. A problem of collinearity for non-centred polynomial terms was indicated by high values (consistently greater than 10 [63]) of their variance inflation factors (VIFs) [64] and high standard errors of their estimated coefficients. (We centre the per host female mean weight in all subsequently described polynomial regression models to ensure robust parameter estimation. We do not continue to explicitly indicate this to maintain the clarity of the mathematical notation.) Models were fitted using standard GLM procedures assuming the mean egg output per stratum to be normally distributed with constant variance [40] and implemented using the glm function in R [60,61]. Models were compared using the LRS and AIC. In the baseline, 1st and 2nd re-infection populations, the 3rd order (cubic), 2nd order (quadratic) and 3rd order functions were, respectively, indicated by both test statistics as being the best fits (Figure 1, Table 3).
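The effect of centring on collinearity between polynomial terms can be shown with a short sketch. The stratum mean weights below are hypothetical, and the example is simplified to a linear plus quadratic term, for which the VIF of either term reduces to 1 / (1 - r^2), with r the correlation between the two terms. The study itself considered up to cubic terms.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_terms(r):
    """VIF of either predictor when a model has exactly two predictors."""
    return 1.0 / (1.0 - r * r)


# Hypothetical stratum mean weights in grams (illustrative only).
weights = [1.0 + 0.5 * i for i in range(9)]
mean_w = statistics.fmean(weights)
centred = [w - mean_w for w in weights]

r_raw = pearson(weights, [w ** 2 for w in weights])
r_centred = pearson(centred, [w ** 2 for w in centred])

vif_raw = vif_two_terms(r_raw)
vif_centred = vif_two_terms(r_centred)
print(f"raw:     r = {r_raw:.3f}, VIF = {vif_raw:.1f}")
print(f"centred: r = {r_centred:.3f}, VIF = {vif_centred:.1f}")
```

With these evenly spaced values the raw linear and quadratic terms are almost perfectly correlated, whereas after centring the correlation vanishes because the centred values are symmetric about zero. Real data are rarely symmetric, so centring usually reduces rather than eliminates the correlation.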
Using the 3rd order relationship, Model 1 [equation (5)] was extended to define a full model describing the relationship between net egg output, female worm burden, and worm weight adjusting for host age [equation (6), denoted Model 4]. Models 2 and 3 are defined as special cases of Model 4, modelling female Ascaris mean weight as, respectively, 1st order and 2nd order polynomial functions (Table 4).
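As an aid to reading, one plausible written form of the full model is sketched below. The structure (Model 1 plus a cubic polynomial in the centred per host female mean weight) follows from the text. The coefficient symbols γ_1, γ_2, γ_3 and the notation w̃ for the centred mean weight are assumptions introduced here for illustration and may differ from the notation of equation (6) and Table 4.

```latex
% Sketch of Model 4 under assumed notation:
%   \tilde{w} = centred per host female mean weight (assumed symbol)
%   \gamma_1, \gamma_2, \gamma_3 = polynomial coefficients (assumed symbols)
\ln E(\Lambda) = \ln \lambda_1 + c \ln n + \beta a
               + \gamma_1 \tilde{w} + \gamma_2 \tilde{w}^{2} + \gamma_3 \tilde{w}^{3}
% Model 3 would correspond to \gamma_3 = 0,
% Model 2 to \gamma_2 = \gamma_3 = 0,
% and Model 1 to \gamma_1 = \gamma_2 = \gamma_3 = 0.
```

Written this way, Models 1 to 3 are nested within Model 4, which is what allows them to be compared with the likelihood-ratio statistic.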
In these models the parameters pertaining to the mean weight of female Ascaris per host do not have a direct biological interpretation, although from the unadjusted models (i.e. unadjusted for the effects of host age and female worm burden) in each population (Figure 1), it is clear that the mean egg output tends to initially rise with increasing female mean weight followed by a decline. This functional relationship may be thought of as empirically modelling the antagonistic effects of growth and ageing on worm egg production.

Statistical modelling approach
In order to fit these linear models to the data it is necessary to assume an appropriate probability distribution of the per host net egg output, Λ. Count data are typically modelled assuming either a Poisson or negative binomial distribution (NBD) [40,42]. Given the high level of overdispersion in these data, the NBD is more appropriate than the Poisson [42]. However, these data also comprise a high proportion of zero counts which may not be adequately captured by the NBD (zero inflation, Figure 2). For A. lumbricoides, a zero egg count represents either an infra-population containing no sexually mature females or a false negative [38], since even unfertilised females can produce (unfertilised) eggs. (In this study, fertilised eggs were not distinguished from those unfertilised.) The distribution of Ascaris eggs in faecal samples from infected individuals has been shown to be highly aggregated [65], making false negatives more likely. Furthermore, the probability of a false negative may be dependent on properties of the worm infra-population and the infected host.

Table 3 Comparison of models describing the per host net egg output of Ascaris as polynomial functions of the per host mean weight of female worms. Analyses were performed on grouped mean data (see text) using standard GLM procedures. † denotes the best-fit model in each population.

Figure 1 The relationship between the per host net egg output and mean weight of female Ascaris lumbricoides. The relationship between the per host net egg output and the (centred, see main text) mean weight of female Ascaris in the baseline (A), 1st (B) and 2nd (C) re-infection populations. Triangles represent grouped mean egg outputs stratified by female Ascaris mean weight. Solid lines and circles represent the fitted values of the best-fit polynomial functions (as determined by the likelihood-ratio statistic (LRS), see Table 3).
Data that are zero inflated relative to the NBD may be better described by a two-component mixture model which defines the response variable as a mixture of a Bernoulli and NBD (zero-inflated negative binomial, ZINB) [41][42][43][44]66]. Such a distribution allows zero counts to arise from two distinct mechanisms: a Bernoulli (binary) process generating either a positive or zero count and a count process (including the possibility of a zero count) [42]. Covariates of each process may or may not be the same [44] affording flexibility to construct models with the potential to explain a much higher degree of variability than assuming a single distribution. In these analyses, we fit the linear models derived in the previous section using both a negative binomial and mixture model approach and compare their respective fits.
Figure 2 The distribution of per host egg output.

Negative binomial (NB) model
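For the NBD the probability of observing an egg count λ is,

$$P(\Lambda = \lambda) = \frac{\Gamma(\lambda + k)}{\Gamma(k)\,\lambda!}\left(\frac{k}{\mu_\Lambda + k}\right)^{k}\left(\frac{\mu_\Lambda}{\mu_\Lambda + k}\right)^{\lambda}, \tag{7}$$

where μ_Λ is the expected egg output, k is an inverse measure of the degree of overdispersion [61] and Γ is the gamma function [Γ(x) = (x - 1)! for positive integer x]. For a known value of k, equation (7) is of the form of the exponential family of probability distributions [40,67].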
The natural logarithm (the link function) of μ Λ is linearly related to the covariates described in the previous section [equations (5) and (6)] and so, for a given value of k, the parameters can be estimated within the GLM framework [40]. Here, since k is unknown, we employ a frequently used extension of the GLM methodology which allows maximum likelihood estimates of both k and the unknown linear parameters to be obtained [61,68]. The glm.nb function from the MASS package in R [61] was used to implement this technique and fit Models 1-4 (Table 4) to the data in each population. Models were compared using the LRS and AIC to determine the best-fit. We refer to these models as negative binomial or NB models.
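The negative binomial probability with mean μ_Λ and overdispersion parameter k, combined with the log link of Model 1, can be sketched as follows. This is an illustration in Python rather than the glm.nb fit used in the study, and the parameter values (lambda1, c, beta, k) are hypothetical.

```python
import math


def nb_log_pmf(count, mu, k):
    """Log-probability of `count` under a negative binomial with mean mu
    and inverse overdispersion parameter k (variance = mu + mu^2 / k)."""
    return (math.lgamma(count + k) - math.lgamma(k) - math.lgamma(count + 1)
            + k * math.log(k / (mu + k))
            + count * math.log(mu / (mu + k)))


def expected_egg_output(n, a, lambda1, c, beta):
    """Model 1 on the natural scale: E(egg output) = lambda1 * n^c * exp(beta * a)."""
    return lambda1 * n ** c * math.exp(beta * a)


def nb_loglik(counts, burdens, ages, lambda1, c, beta, k):
    """Log-likelihood of observed egg counts under the NB version of Model 1."""
    return sum(
        nb_log_pmf(y, expected_egg_output(n, a, lambda1, c, beta), k)
        for y, n, a in zip(counts, burdens, ages)
    )


# Hypothetical hosts: egg counts, female worm burdens, age group (0 child, 1 adult).
counts = [0, 1200, 5400, 300]
burdens = [1, 2, 6, 1]
ages = [0, 0, 1, 1]
print(nb_loglik(counts, burdens, ages, lambda1=1000.0, c=0.8, beta=-0.3, k=0.5))

# Probability of a zero count from the NB alone for a given mean.
mu = expected_egg_output(n=4, a=1, lambda1=1000.0, c=0.8, beta=-0.3)
print(math.exp(nb_log_pmf(0, mu, 0.5)))
```

In an actual fit, the log-likelihood would be maximised over lambda1, c, beta and k simultaneously, which is what the extension of the GLM methodology described above achieves.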

Zero-inflated negative binomial (ZINB) model
For the zero-inflated negative binomial distribution the probability of observing an egg count λ is,

$$P(\Lambda = \lambda) = \begin{cases} p + (1 - p)\left(\dfrac{k}{\mu_\Lambda + k}\right)^{k}, & \lambda = 0, \\[2ex] (1 - p)\,\dfrac{\Gamma(\lambda + k)}{\Gamma(k)\,\lambda!}\left(\dfrac{k}{\mu_\Lambda + k}\right)^{k}\left(\dfrac{\mu_\Lambda}{\mu_\Lambda + k}\right)^{\lambda}, & \lambda > 0. \end{cases} \tag{8}$$

Here p is the probability of observing a zero count originating from the Bernoulli process and [k/(μ_Λ + k)]^k is the probability of observing a zero count from the NBD. Just as μ_Λ is linearly related to covariates via the logarithmic link function, the logit function (ln[p/(1 - p)]) can be used to linearise the relationship between p and potential covariates [40]. Univariate exploration of the data indicated a negative linear relationship between logit(p) and the natural logarithm of stratified groups of the per host mean weight of female Ascaris and the per host female worm burden (Figures 3 and 4). Equation (9) is a model which includes host age (a), the natural logarithm of the female worm burden [ln(n)], and the natural logarithm of the mean weight of female Ascaris [ln(w)] as covariates of the probability of observing a zero count,

$$\ln\left(\frac{p}{1 - p}\right) = \delta_0 + \delta_1 a + \delta_2 \ln(n) + \delta_3 \ln(w), \tag{9}$$

where δ_0 is the intercept. This model was fitted to the data in each population using standard GLM procedures implemented in R using the glm function [40,61].
These preliminary multivariate analyses confirmed ln(n) and ln(w) to be statistically significantly and negatively related to the probability of observing a zero count in each population (Table 5). In the 2nd re-infection population age was found to be positively associated with p (i.e. the probability of observing a zero count is greater in teenagers and adults than in children) (Table 5). The relationship given in equation (9) was used to define the Bernoulli component of the mixture model and extend Models 1-4 into zero-inflated models (denoted by the letter I, Models 1I-4I, Table 4). These models were fitted to the data in each population using maximum likelihood implemented using the zeroinfl function from the pscl package (v.1.02) [68] in R (for further information on fitting zero-inflated mixture models see [41,43,69]). The fitted models were compared to one another using the LRS to determine the best-fit and also to their corresponding NB model (Models 1-4) using AIC. We refer to zero-inflated models as ZINB.
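The two-component structure of the ZINB model can be sketched in a few lines. This is an illustration, not the zeroinfl implementation: the coefficient values d0 to d3 are hypothetical, with d0 an assumed intercept and d1, d2, d3 standing for the coefficients of host age, ln(n) and ln(w) in the logit model of the zero-inflation probability.

```python
import math


def nb_pmf(count, mu, k):
    """Negative binomial probability with mean mu and overdispersion parameter k."""
    log_p = (math.lgamma(count + k) - math.lgamma(k) - math.lgamma(count + 1)
             + k * math.log(k / (mu + k)) + count * math.log(mu / (mu + k)))
    return math.exp(log_p)


def zero_inflation_prob(a, n, w, d0, d1, d2, d3):
    """Inverse logit of d0 + d1*a + d2*ln(n) + d3*ln(w)."""
    eta = d0 + d1 * a + d2 * math.log(n) + d3 * math.log(w)
    return 1.0 / (1.0 + math.exp(-eta))


def zinb_pmf(count, mu, k, p):
    """Mixture: a Bernoulli zero with probability p, otherwise an NB count."""
    base = (1.0 - p) * nb_pmf(count, mu, k)
    return p + base if count == 0 else base


def zinb_loglik(counts, mus, ps, k):
    """Log-likelihood of counts given per-host NB means and zero-inflation probabilities."""
    return sum(math.log(zinb_pmf(y, m, k, p)) for y, m, p in zip(counts, mus, ps))


# Hypothetical coefficients: negative d2 and d3 make a zero count less likely
# as female worm burden and mean weight increase.
coef = dict(d0=1.0, d1=0.2, d2=-0.8, d3=-0.9)
p_light = zero_inflation_prob(a=0, n=1, w=2.0, **coef)
p_heavy = zero_inflation_prob(a=0, n=10, w=2.0, **coef)
print(f"P(Bernoulli zero): burden 1 = {p_light:.3f}, burden 10 = {p_heavy:.3f}")

# Split the total zero probability into its two sources for one host.
mu, k = 2000.0, 0.5
print("from Bernoulli:", p_heavy, "from NB:", (1 - p_heavy) * nb_pmf(0, mu, k))
```

Separating the two sources of zeros in this way is what allows the fitted model to attribute observed zero counts to either the Bernoulli or the count component.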

Per host female mean weight and worm burden
Comparisons of nested forms of equation (2) indicated a pattern of facilitation followed by limitation in all populations (Table 6, Figure 5). The LRS and corresponding p-values are unambiguous in the baseline and 1st re-infection populations (p-value < 0.0001), whereas the facilitation preceding limitation pattern was only marginally preferred over limitation alone in the 2nd re-infection population (p-value = 0.046). AIC supported facilitation preceding limitation in all populations. The pattern of density dependence was similar in both age groups, whereas teenagers and adults tended to harbour slightly heavier worms (Figure 5).

Per host female mean weight and net egg output
The LRS and AIC indicated that incorporating the mean weight of female Ascaris as a covariate improved the fit of both the NB and ZINB models in all populations (Table 7). The best-fit functional form of the relationship between the per host mean weight of female Ascaris and the per host egg output (order of polynomial) varied across populations and with the assumed probability model (Table 8).
The ZINB model provided a consistently better fit to the data than its non-zero-inflated counterpart (Table 7, comparing AIC values) and was able to account for the high proportion of zero counts within the data (Table 9). The vast majority of these zero counts were described in the Bernoulli component of the ZINB model (Table 9); in all populations the per host female worm burden and the per host mean weight of female Ascaris were negatively associated with the probability of a zero egg count (Table 8, p-value < 0.0001 for both covariates). In the 2nd re-infection population there was also evidence that host age was positively associated with the probability of a zero count (p-value = 0.0092 in the best-fit model, Model 2I, Table 8).
The parameter values estimated from the ZINB and NB models are broadly similar (Table 8). The estimated value of the overdispersion parameter k tends to be higher in the ZINB models (indicating reduced overdispersion) since many of the zeros are accounted for in the Bernoulli component of the model (Table 9). It is noteworthy that the estimated values of parameter c (the inverse measure of density dependence) tend to be lower in the ZINB models (indicative of more severe density dependence, Table 9).

Discussion
The major objectives of this study were twofold: firstly, to determine whether there is any evidence for density-dependent processes affecting the per host mean weight of female Ascaris lumbricoides; secondly, to determine whether per host female mean weight is associated with per host egg output and what, if any, causal impact this has on density-dependent egg production. We have shown that the per host mean weight of female Ascaris follows a pattern of initial facilitation followed by limitation with worm burden both at endemic equilibrium (baseline population) and after 6 months of re-infection (re-infection populations). An association between the per host mean weight of female Ascaris and the per host egg output is demonstrated in the three analysed populations. The functional form of this relationship is different across populations and dependent on the assumed probability model used to estimate the unknown parameters. However, comparing the zero-inflated negative binomial (ZINB) models, which provide a better description of the observed data, we see that at baseline egg output initially rises with increasing per host female mean weight before falling at very high weights, whilst in the re-infection populations, egg output rises monotonically with increasing weight. Despite these findings, per host female mean weight has little discernible causal impact on the well-characterised patterns of density-dependent egg production in A. lumbricoides [6,7,70].

Figure. Relationship between the proportion of zero egg counts and female worm burden.
Figure. Relationship between the proportion of zero egg counts and the mean weight of female Ascaris lumbricoides.

The convex pattern of facilitation preceding limitation has been documented in one previous study of the GI nematode Heterakis gallinarum infecting the ring-necked pheasant (Phasianus colchicus) [16]. Constraints on female weight may be caused by intra-specific (exploitation) competition for either nutrients or space, or by host-mediated effects such as a non-protective immune response. Limitation of size due to competition for nutrients is unlikely since the total energy requirements of even a heavy Ascaris infection are small relative to those of a human host [71], although it may be possible in the severely [...] [72] and reinfections following chemotherapy [73], suggesting that individuals with heavy worm burdens mount a weaker rather than a stronger immune response. If worm burden relates to the immune response in this manner and the response affects the size of female worms, then the per host mean weight of females would increase with per host worm burden in a facilitative pattern. Immune responses are known to limit worm size in experimental GI infections of rats with S. ratti [27,74] and sheep with T. circumcincta [24,25]. The data presented in this study are not sufficient to distinguish between the various potential causative mechanisms behind the observed density dependence; however, we speculate that the facilitation is immune mediated whereas the limitation is the result of competition for space.
Two previous studies have described a positive relationship between the size of female A. lumbricoides and egg production. Sinniah and Subramaniam [14] dissected the uteruses of females expelled from 50 schoolchildren and showed a moderately positive linear relationship. Seo and Chai [15] took a different approach, relating egg output with female length from hosts harbouring a single female or a male and female pair. Their results point to a more parabolic shape to the relationship, with egg output declining in very large (and presumably old) worms. This is in accordance with the results of the present study in the baseline population. Allometric relationships between body size and egg output are a characteristic feature of parasitic nematode infections [75,76], so it is not surprising that similar mechanisms operate in Ascaris infections of humans. More interesting, however, is how this association influences the host-parasite interaction and ensuing population dynamics: do host responses limit the size of worms? Are some hosts more efficient than others at doing so? Such processes and heterogeneities are known to occur in model non-human nematode systems [8,27,28].
The degree of density-dependent egg output (described by parameter c) remains approximately equal in the null and best-fit models in each of the three populations (regardless of the probability model). This consistency shows that the severity of density dependence is not greatly altered by the effects of female weight. Thus, egg production is limited directly by increasing female worm burden and is not simply an artefact resulting from the density dependence of female mean weight, i.e. the association between egg output and female weight does not cause density-dependent fecundity. It is noteworthy that no statistically significant density-dependent fecundity was detected in the 1st re-infection population (c = 1.04, 95% C.I. 0.97-1.11, Model 2I, Table 8). The severity of density-dependent Ascaris fecundity is known to be weak in Bangladesh relative to other geographical locations [7], and so its detection is likely to be prone to type II statistical errors.
The relationships between the per host net egg output and the female mean weight varied between the baseline and re-infection populations, with a significant decrease for heavier worms present only at baseline (as indicated by the cubic polynomial providing the best-fit functional relationship, Model 4I). This is congruent with the biological interpretation of this functional form representing a decline in egg production in heavier (inferred older) worms. This would be expected to be unimportant in the populations after six months of re-infection since the life expectancy of Ascaris is estimated to range between 1 and 2 years [1,77].
An important result from this work is the evidence that the per host net egg output tends to be higher in children than adults in the baseline population (β = -0.28, p-value < 0.0001, Table 8). There is also marginal evidence for this effect in the 2nd re-infection population (β = -0.19, p-value = 0.015, Table 8). Egg concentration can be negatively associated with the volume of faeces produced, resulting in overestimation of egg output in children compared to adults [35]. However, given the unambiguous result in the baseline population it is surprising that the effect is absent and not more statistically significant in the 1st and 2nd re-infection populations respectively. An alternative explanation is that the decreased egg output in adults is due to an acquired immune response. However, to reconcile this with the results from the re-infection populations, the [...]

Table 5 Parameter values were estimated by fitting equation (9) to the egg count data encoded in a binary fashion (positive or zero) using standard GLM procedures. Parameters refer to the following covariates: δ_1: host age, δ_2: the natural logarithm of female worm burden and δ_3: the natural logarithm of female mean weight.
We have shown that a mixture of the negative binomial and Bernoulli distributions (ZINB model) provides a superior description of the distribution of egg output data than a negative binomial (NB model) distribution alone. Similar zero-inflated models have been used frequently in the ecological literature where datasets with many zeros are commonplace (for a review see [81]). In parasitological research, we are aware of only two previous studies that have used zero-inflated models to describe egg output data [45,46]. An added advantage of using a zero-inflated model is the insight which can be gained into the source of zeros (egg counts). Here we show that the probability of a zero count is negatively associated with both the per host female worm burden and the per host female mean weight. These associations suggest that the zero egg counts are false negatives within the data, i.e. failing to detect eggs in truly egg-contaminated faeces. We hypothesise that the greater the total (net) egg production the lower the probability that a sample is taken from a non-contaminated part of the collected faeces. Thus, since total egg production is positively associated both with female worm burden and female mean weight, the probability of sampling from a non-contaminated part of the faeces decreases with increasing female worm burdens and mean weight. This effect will be exacerbated by the highly over-dispersed distribution of A. lumbricoides eggs in faecal samples [65].
Figure 5 The best-fit relationship between the per host mean weight of female Ascaris lumbricoides and female worm burden. The best-fit functional relationships (as determined by the LRS, Table 6) between the per host mean weight of female Ascaris and the female worm burden in the baseline (A), 1st (B), and 2nd (C) re-infection populations. The solid red line is the best-fit to children (age ≤ 12 years) and the broken blue line to teenagers and adults (age > 12 years). The best-fit function is given by equation (2) and represents a pattern of initial facilitation followed by limitation. Circular and square data points are grouped means for children and teenagers and adults respectively. Error bars represent the standard error of the mean.
The goodness-of-fit of all NB and ZINB models fitted to the data assessed by the LRS and AIC. † denotes the best-fit of each model type in each population. Model equations are given in Table 4.
Estimated parameter values from the best-fit NB and ZINB models in each population. Parameters significantly different from 0: * p-value < 0.05, ** p-value < 0.01, *** p-value < 0.001. Model equations are given in Table 4.
The results presented in Table 9 suggest that a very small fraction of the zeros in the data were generated from the negative binomial count process. If we accept the explanation that the vast majority of zeros are false negatives, it is tempting to remove zero counts a priori in order to simplify analyses aimed at detecting epidemiologically significant covariates (i.e. covariates that directly impact upon the release of transmission stages). In taking such an approach one must again choose an appropriate distribution with which to model the now zero-truncated data. Two suitable contenders are the log-normal and zero-truncated negative binomial distributions. The advantage of the former is that, via a logarithmic transformation, ordinary least squares estimation procedures can be used. For the latter, numerical maximisation of the appropriate log-likelihood function is required [42,44]. Figure 6 compares the results of fitting equation (5) (in which the mean per host egg output is modelled as being dependent on female worm burden and host age only) to the zero-truncated baseline data using the two approaches. Clearly the log-normal assumption provides an inadequate description of the data due largely to the poor approximation of the variance-to-mean relationship, a key aspect in accurate parameter estimation [40,82] (for details of the variance-to-mean relationship for the log-normal and zero-truncated negative binomial distributions see additional file 1 and [83]). Therefore, although removing zeros from the data may be a reasonable approach, more complex and non-standard statistical models are still required for adequate parameter estimation [42].
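The contrast between the two candidate distributions for zero-truncated counts, discussed above, can be sketched with their standard textbook moments. The sketch assumes a mean/dispersion (`mu`, `k`) parameterisation of the negative binomial and a log-normal with log-scale standard deviation `sigma`; the function names and numerical values are illustrative and are not taken from additional file 1 or from the fitted models.

```python
import math


def nb_logpmf(y, mu, k):
    """Log-probability of count y under a negative binomial with mean mu and dispersion k."""
    return (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
            + k * math.log(k / (k + mu))
            + y * math.log(mu / (k + mu)))


def ztnb_logpmf(y, mu, k):
    """Zero-truncated NB log-probability: NB probability renormalised over y >= 1.
    Summing this over the positive counts gives the log-likelihood to be maximised."""
    if y < 1:
        raise ValueError("zero-truncated counts must be >= 1")
    p0 = (k / (k + mu)) ** k          # NB probability of a zero count
    return nb_logpmf(y, mu, k) - math.log1p(-p0)


def ztnb_mean_var(mu, k):
    """Mean and variance of the zero-truncated NB in terms of the untruncated mu and k."""
    p0 = (k / (k + mu)) ** k
    mean_t = mu / (1.0 - p0)
    second_moment = (mu + mu ** 2 / k + mu ** 2) / (1.0 - p0)
    return mean_t, second_moment - mean_t ** 2


def lognormal_var(mean, sigma):
    """Variance of a log-normal with the given mean and log-scale standard deviation.
    For fixed sigma the variance is strictly proportional to the squared mean."""
    return (math.exp(sigma ** 2) - 1.0) * mean ** 2


# Illustrative values only: compare the variance implied by each model at the same mean.
mean_t, var_ztnb = ztnb_mean_var(mu=5.0, k=0.5)
var_lognormal = lognormal_var(mean_t, sigma=1.0)
```

Under these assumptions the log-normal forces the variance to scale with the square of the mean, whereas the zero-truncated negative binomial retains a linear term in the mean, which is one way to see why the two models can imply different variance-to-mean relationships for the same data.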

Conclusion
In this study we have demonstrated that the mean weight of female A. lumbricoides infecting a cohort of human hosts follows a pattern of facilitation preceding limitation with increasing worm burden. We verify that weight is associated with net egg output but demonstrate that this has little causal impact on patterns of density-dependent egg production. We show that a zero-inflated negative binomial (ZINB) probability distribution is superior to a negative binomial distribution in modelling individual egg output data.

Additional file 1
Probability distribution, variance and expected value for the zero-truncated negative binomial and log-normal distributions. A file containing the probability distribution, variance and expected value for the zero-truncated negative binomial and log-normal distributions.
Click here for file [http://www.biomedcentral.com/content/supplementary/1756-3305-2-11-S1.pdf]
Figure 6 Comparison of the fit of a log-normal and zero-truncated negative binomial model. A: The estimated variance-to-mean relationship from the zero-truncated negative binomial model (black thick line) and the log-normal model (black thin line). B: The fitted zero-truncated (thick lines) and log-normal (thin lines) models to data from children (red solid line) and teenagers and adults (blue broken line) in the baseline population. In both figures red circles represent grouped mean data from children and blue squares from teenagers and adults (as defined in Figure 5). Details of the variance-to-mean relationship for the log-normal and zero-truncated negative binomial models are given in additional file 1.