Study area
This study was conducted in a rural section of Pampa del Indio municipality (25°55′S, 56°58′W), Chaco Province, Argentina, which encompassed 7 communities and 587 houses as of 2015 [31]. This section (here denominated Area III) is a historic settlement area of the Qom people [37]. The last insecticide spraying campaign targeting house infestation with T. infestans in Pampa del Indio municipality took place in 1997–1998.
The study area was subjected to a vector control and disease research programme initiated in 2008 with a follow-up period of 7 years as of 2015. In October 2008, 31.9% of the occupied houses were infested with T. infestans, mainly within human sleeping quarters, and virtually all (93.4%) were sprayed with insecticides [27]. During the 2008–2015 vector surveillance phase we conducted annual triatomine surveys and selectively sprayed with insecticide the few foci detected. This strategy reduced house infestation to < 1% during 2008–2012, and no infested house was found in 2015 [31].
Local houses usually included a domicile (i.e. an independent structure used as human sleeping quarters, also denominated “domestic premises”), a patio and other structures within the peridomestic area (kitchens, storerooms, latrines, corrals, chicken coops and chicken nests) (Figure S1 in [27]). Although housing quality remained precarious over the seven-year follow-up, the proportion of domiciles with mud walls and tarred-cardboard roof (as opposed to a tin roof) significantly decreased [31]. A household was defined as all the people who occupy a housing unit including related and nonrelated family members [38].
Study design and household survey
This study complied with STROBE recommendations for observational studies [39], and the ethical principles included in the Declaration of Helsinki (Ethical Committee “Dr Carlos A. Barclay”, Protocol ref. TW-01-004).
All houses were registered and their location georeferenced with a GPS receiver (Garmin Legend; Garmin Ltd., Schaffhausen, Switzerland) in October 2008. The head of each household was informed of the purpose and protocol of the study, and gave oral consent. An environmental and socio-demographic survey was conducted as described elsewhere [27]. We collected information on the name of the head of each household, the number of residents by age class, the number of domestic animals of each type (dog, cats, poultry, goats, pigs, cows and equines) and their resting places, type and frequency of use of domestic insecticides, and the date of the last insecticide spraying conducted by vector control personnel or any other third party using manual compression sprayers. The ethnic group of the household was assigned on the basis of whether they spoke Qom language, participated in traditional Qom organizations, and took into account the tenants’ physical features and cultural practices. Multiethnic households (< 5%) (i.e. formed by at least one person self-identified as Qom and at least one person self-identified as creole) [40], were classified as Qom given their self-identification and cultural practices. The domiciles’ construction materials and other characteristics were registered, including refuge availability for triatomines, time since construction, and the area of the domicile. Refuge availability was determined visually by a skilled member of the research team and scored in one of five levels ranging from absence to very abundant refuges [28]; only the three top categories were actually observed in domiciles.
The recorded data were used to compute household-level surrogate indices for wealth, educational level and overcrowding as described elsewhere [27]. The goat-equivalent index represents a small stock unit that quantifies the household number of livestock (cows, pigs, goats) and poultry owned in terms of goat biomass. Household educational level was defined as the mean number of schooling years attained by household members aged 15 years-old (y.o.) or more. The overcrowding index was defined as the number of human occupants per sleeping quarter; the presence of 3 or more occupants per room was taken as critical overcrowding.
Each household’s location, demographic information and status was updated at each survey during the seven-year follow-up. The socio-demographic and environmental questionnaire was extended during the 2012–2015 surveys to include detailed information of each dweller and the use of personal protective practices (i.e. domestic insecticides and bednets). Although these protective practices were possibly used by householders to reduce the nuisance caused by blood-feeding insects and other domestic pests, they can exert an effect on reducing the exposure to triatomine vectors. We registered the name of each household resident, their relationship to the head of the household, age, gender, parents’ names, education and employment information, and whether they received some type of welfare support. Households were classified as encompassing one person only, one nuclear family (i.e. household consisting of at least one parent and their children), extended families (i.e. one nuclear family plus non-nuclear relatives, including more than one nuclear family), and other (non-family households and households consisting of second-degree relatives only).
The two censuses conducted in 2012 and 2015 allowed us to verify whether individual residents registered in 2012 were still residing in the same house in 2015 or had moved during the intervening period. We also registered any death, birth, and addition (and origin) of any new resident. This information was used to determine individual mobility during the 2012–2015 period: residents were classified as in-migrants or out-migrants (to or from outside the study area, respectively, including individuals coming from or leaving to a different section within Pampa del Indio municipality), and local movers (those who moved to a different house within Area III, i.e. local mobility). When the entire household out-migrated over this period, we asked their neighbors about their destination. Mobility at the household level (i.e. the mobility pattern of the household as a whole, as opposed to the mobility pattern of each member) was derived from individual mobility data and classified as: movers (i.e. households that changed its exact residential location within Area III), non-movers (i.e. households that remained at the same residential location), and migrant households (i.e. households that had in- or out-migrated from Area III) [31].
In 2015 we also collected information on access to health services and sanitary conditions: drinking water supply, sanitation services, fuel used for cooking, whether they used the local hospital, the local primary healthcare post or both, ambulance access, and whether a community healthcare agent visited the household. We determined the Euclidian distance (in km) between each house and different healthcare facilities using QGIS and the georeferenced locations. We also gathered information on assets owned by each household: television, radio, cell phone, freezer, fridge, bicycle, motorcycle and/or automobile.
Demographic rates
The population growth rate (annual percentage change) was estimated for the 2008–2012 period (4.1 years) and for the 2012–2015 period (2.3 years) as follows:
$$\frac{{\Delta {\text{ Population during the period}}}}{\text{Mid-year population}} \times 100$$
The mid-year total population was estimated as the average between the 2012 and 2015 populations, multiplied by the duration of the period [41].
We calculated the general fertility rate (GFR), and the crude birth and crude mortality rates of the population residing in the study area over the 2012–2015 period. Births included children born after December 2012 (not registered in the 2012 census) whose parents resided at the study area at the date of birth and were registered in the census performed in April 2015. Deaths included only people that were registered in the 2012 census and died before April 2015. The population of women of childbearing age in Argentina encompasses those between 15 and 49 y.o. [42].
The GFR (person-years, PY) was estimated as:
$$\frac{{{\text{Number of births in 2012}}{-}2015}}{\text{Mid-year total population of women of childbearing age}} \times 1000;$$
and the crude birth and crude death rates were estimated as:
$$\frac{{{\text{Number of births (deaths) in 2012}}{-}2015}}{\text{Mid-year total population}} \times 1000;$$
We also estimated the net migration rate for the 2012–2015 period as:
$$\frac{{{\text{Migrant population during 2012}}{-}2015}}{\text{Mid-year population}} \times 1000$$
The migrant population was considered as the sum of in-migrants and out-migrants into and from the study area [41].
The local demographic indicators were compared to provincial (Chaco Province) and national vital statistics derived from the latest national census undertaken in Argentina [42].
Socio-economic, health access and sanitation indices
We constructed two socio-economic indices measuring social vulnerability and assets, and a health access and sanitation index using multiple correspondence analysis (MCA) to summarize their multidimensionality. The social vulnerability index was constructed for the 2008 and 2015 surveys. The 2008 social vulnerability index included characteristics of the domiciles (refuge availability, presence of cardboard roofs and/or mud walls, time since house construction and domestic area), and household socio-economic and demographic characteristics (overcrowding, goat-equivalent index and educational level). The 2015 social vulnerability index additionally included the presence of dirt floors, the household number of welfare support payments received at the time of the survey, and the household number of salaried employees. The asset index was estimated for 2015 only and included the assets most commonly owned by local residents as detailed above.
The health access and sanitation index included relevant variables measured at household level in 2015: drinking-water supply (piped drinking water, borehole, tanker truck or dug well), sanitation facilities (pour-flush latrines, pit latrines or no sanitation facilities), distance to the nearest primary healthcare post and to the local hospital (located in Pampa del Indio town), and other variables related to health access as described above.
Host availability index
Using the same approach described above for the socio-economic and sanitary indices, we constructed a host availability index in domiciles as of 2008 based on a preliminary analysis showing that the household abundance of domestic animal hosts was positively correlated with larger household size. This index summarized the number of potential domiciliary hosts of T. infestans (adult and child residents, total number of dogs, cats and chickens nesting indoors), and in the case of dogs and cats, whether they rested within or in the proximity of the domicile. The host availability index was introduced to account for a potential confounding effect when analyzing the effects of social vulnerability on vector indices.
Vector indices as transmission surrogates
All triatomines collected at baseline were identified taxonomically and the individual infection status with T. cruzi was determined by microscope examination of feces [27] or by molecular diagnosis using kDNA-PCR [43], achieving a coverage of 60% of all infested houses.
The occurrence of domiciliary infestation with T. infestans was determined by the finding of at least one live triatomine (excluding eggs) through any of the vector collection methods used (i.e. timed-manual searches, during insecticide spraying operations, and householders’ bug collections). The relative abundance of domiciliary T. infestans was calculated only for infested houses as the number of live bugs collected by timed-manual searches per 15 min-person per site, as described [27]. The same procedures were used to determine the occurrence of at least one T. cruzi-infected T. infestans in the domicile and its relative abundance.
Data analysis
Coverage of vector, socio-demographic and environmental surveys reached 95.6% (n = 390) of all occupied households enumerated in October 2008, 94.6% (n = 421) in November 2012 and 93.7% (n = 449) in April 2015. For analysis, we excluded houses that were closed and those in which householders refused to provide information. For each variable we checked whether the missing values were missing completely at random by building a dummy binary variable (missing and non-missing values) and analyzing the significance of the Spearman correlation coefficient with any another independent variable in the data set, as described elsewhere [27]. Most of the variables with missing values were missing completely at random, except for educational level and overcrowding in 2008, in which the missing data corresponded to households that had moved or out-migrated by 2012 (the year when these data were collected). Assuming similar conditions prevailed over 2012 and 2008, these variables were back-corrected to 2008 whenever possible [27].
Normality and homoscedasticity of continuous variables were tested by the Shapiro–Wilks test (normality), the Cook–Weisberg test (homoscedasticity) and other graphical methods (QQ plot and residuals vs fitted values scatterplot). For all proportions, 95% confidence intervals (95% CI) were estimated using the Agresti & Coull method if sample sizes were greater than 50, and the Wilson method for smaller sample sizes [44]. For medians, we report the interquartile range (IQR) [45]. Medians were preferred over means when continuous variables deviated significantly from a normal distribution. For bivariate analysis of categorical variables, we used Chi-square and Fisher’s exact tests depending on sample size and other assumptions. In the case of bivariate analysis comparing categorical and continuous variables, we used non-parametric tests (i.e. Mann–Whitney and Kruskal–Wallis) when the continuous variables did not fit a normal distribution. Correlations between continuous variables were evaluated by Spearman’s rank correlation coefficients.
The MCA used to construct the summary indices is a multivariate analysis that reduces the dimensionality of the covariance matrix in linear combinations of the original variables [46]. The first dimension captures most of the variance (inertia), and the score for each household (value of the dimension) can be used as a quantitative index [34]. For a better interpretation, the indices were considered as −Dimension 1. The different dimensions can also be assessed graphically using biplots, which allow a better understanding of how the variables are interrelated and their relative contribution to the score [47]. Because MCA requires all the variables to be categorical, numeric variables were categorized according to their quartile distribution. We used multiple linear regressions to assess variations in household-based indices by ethnic group and mobility status (i.e. non-movers, movers and migrants) adjusted by the community in which they were located.
We used generalized linear models (GLM) [48] to analyze the effect of the household’s ethnicity, mobility pattern and the community it was located (i.e. independent variables) on each of the indices constructed by MCA as dependent variables (socio-economic vulnerability, host availability and health access and sanitation indices). We also used GLM models to assess the household-level effects of these socio-demographic indices (i.e. independent variables) on the risk of vector-borne transmission of T. cruzi, adjusting for ethnicity and considering possible interactions between independent variables. The response variables were the occurrence and relative abundance of T. infestans, and the occurrence and relative abundance of T. cruzi-infected T. infestans. In the case of binary response variables (i.e. occurrence), we used logistic regression models with logit as the link function and the relative risk expressed as odds ratios (OR). When the response variable was vector abundance, we used negative binomial models with log as the link function and the relative risk expressed as incidence rate ratios (IRR). Negative binomial regression was preferred to Poisson regression given the overdispersed distributions [49]. All analysis were implemented in Stata v.14.2 [50] and R v.3.2.3 (lme4 and car packages) [51].
Spatial analysis
Global point pattern analysis (univariate and bivariate) were performed using the weighted K-function implemented in Programita [52]. Random labeling was selected to test the null hypothesis of random occurrence of events among the fixed spatial distribution of all houses. We used quantitative (abundance of infected vectors and the household social vulnerability and host availability scores) and qualitative labels (presence/absence of infected vectors) for each house (point). Monte Carlo simulations (n = 999) were performed and the 95% ‘confidence envelope’ was calculated with the 2.5% upper and lower simulations. Additionally, local spatial analysis on the abundance of (infected) vectors were performed using the G* statistic implemented in PPA [53]. The selected cell size was 200 m (assuming that each house had at least three neighbors at the minimum distance of analysis), and the maximum distance was set at 6 km (i.e. half of the dimension of the area). We created heatmaps (i.e. density maps) to visualize the spatial aggregation of the demographic and socio-economic indices using a kernel density estimation algorithm within a radius of 200 m as implemented in QGIS 2.18.11.