Spatial mapping and prediction of Plasmodium falciparum infection risk among school-aged children in Côte d’Ivoire

Background: In Côte d’Ivoire, malaria remains a major public health issue, and thus a priority to be tackled. The aim of this study was to identify spatially explicit indicators of Plasmodium falciparum infection among school-aged children and to undertake a model-based spatial prediction of P. falciparum infection risk using environmental predictors.
Methods: A cross-sectional survey was conducted, including parasitological examinations and interviews with more than 5,000 children from 93 schools across Côte d’Ivoire. A finger-prick blood sample was obtained from each child to determine Plasmodium species-specific infection and parasitaemia using Giemsa-stained thick and thin blood films. Household socioeconomic status was assessed through asset ownership and household characteristics. Children were interviewed for preventive measures against malaria. Environmental data were gathered from satellite images and digitized maps. A Bayesian geostatistical stochastic search variable selection procedure was employed to identify factors related to P. falciparum infection risk. Bayesian geostatistical logistic regression models were used to map the spatial distribution of P. falciparum infection and to predict the infection prevalence at non-sampled locations via Bayesian kriging.
Results: Complete data sets were available from 5,322 children aged 5–16 years across Côte d’Ivoire. P. falciparum was the predominant species (94.5 %). The Bayesian geostatistical variable selection procedure identified land cover and socioeconomic status as important predictors for infection risk with P. falciparum. Model-based prediction identified high P. falciparum infection risk in the north, central-east, south-east, west and south-west of Côte d’Ivoire. Low-risk areas were found in the south-eastern area close to Abidjan and the south-central and west-central part of the country.
Conclusions: The P. falciparum infection risk and related uncertainty estimates for school-aged children in Côte d’Ivoire represent the most up-to-date malaria risk maps. These tools can be used for spatial targeting of malaria control interventions.
Electronic supplementary material: The online version of this article (doi:10.1186/s13071-016-1775-z) contains supplementary material, which is available to authorized users.


Background
Malaria is a vector-borne disease that is widespread in sub-Saharan Africa. In 2015, an estimated 188 million malaria cases and 395,000 deaths occurred in Africa [1]. According to the Global Burden of Disease study, malaria is responsible for 31.7 million years lived with disability (YLDs) [2]. Malaria also drains the social and economic development of affected countries [3][4][5]. Over the past 15 years, malaria control interventions have averted 663 million clinical cases across Africa [6].
For the implementation of best-practice control strategies and intervention planning, in-depth knowledge of spatial characteristics and factors that influence malaria is needed. Predictive risk mapping has proven to be an important tool for malaria control [7,8]. In particular, remote sensing technologies coupled with geographic information systems (GIS) make it possible to link high-resolution environmental data to infection risk and to produce smooth, model-based predictive risk maps over a surface of interest [9][10][11][12]. Moreover, this approach allows a deeper understanding of the epidemiology and ecology of the disease [13][14][15][16][17]. Malaria prevalence data are likely to be spatially correlated, and Bayesian geostatistical models can capture this correlation by accounting for an unobserved underlying spatial structure [18,19]. These models are highly parameterized, and hence, parameter estimation relies on complex algorithms such as Markov chain Monte Carlo (MCMC) sampling methods. Bayesian geostatistical methodology has been widely used for malaria risk mapping at local [20,21], national [22], regional [23][24][25][26] and global scales [27,28].
In Côte d'Ivoire, a country highly endemic for malaria, the national malaria prevention and control policy is based fundamentally on the use of long-lasting insecticidal nets (LLINs), intermittent preventive treatment with sulfadoxine-pyrimethamine (IPT-SP) and environmental sanitation. The control strategy also includes prompt diagnosis and treatment with artemisinin-based combination therapy (ACT) [29]. No specific interventions targeting school-aged children are currently being promoted, but a recent national school-based survey in Côte d'Ivoire on parasitic diseases revealed an overall Plasmodium falciparum prevalence in excess of 60 % [30,31].
The purpose of this study was to spatially analyse P. falciparum prevalence data obtained from the first national school-based survey on parasitic infections in Côte d'Ivoire [30][31][32]. A Bayesian geostatistical approach was employed, and environmental and sociodemographic factors, disease prevention indicators and distance from school to the nearest health facility were considered to assess their potential effects on P. falciparum infection risk. Environmental factors that govern the spatial distribution of P. falciparum were used to produce a model-based, high spatial resolution P. falciparum risk map for Côte d'Ivoire.

Study area
Côte d'Ivoire has a warm and humid climate. The average temperature ranges between 25 °C and 32 °C. There are three main seasons: warm and dry from November to March, hot and dry from March to May, and hot and wet from June to October. A dense tropical moist forest covers the south-western part of the country; in the middle part of Côte d'Ivoire, a Guinean forest-savannah mosaic belt extends from east to west, and a Sudanian savannah covers the northern part.

Study design
A country-wide cross-sectional survey using the lattice plus close pairs sampling approach was designed, as described elsewhere [30][31][32][33]. In short, a grid indicating latitude and longitude at a unit of 0.5° was overlaid on a map of Côte d'Ivoire. A total of 94 schools were selected among all primary schools that comprised at least 60 children attending grades 3-5. Sixty children were sampled per school. This sample size exceeds the minimum sample size of 50 recommended by the World Health Organization (WHO) for collection of baseline information on helminth prevalence and intensity in school-aged populations within large-scale surveys [34]. The survey was carried out during the dry season from November 2011 to February 2012. When visiting the schools, geographic coordinates were recorded using a hand-held global positioning system (GPS) receiver (Garmin Sery GPS MAP 62; Olathe, USA).

Parasitological, demographic, prevention, treatment and socioeconomic data
To determine P. falciparum infection status, two drops of blood from finger-prick samples were collected from enrolled children, and thick and thin blood films were prepared on microscope slides. The slides were air-dried and transferred to nearby laboratories, where they were stained with Giemsa and examined under a microscope by experienced laboratory technicians for Plasmodium species identification and parasitaemia. The number of parasitized blood cells was counted, assuming a standard white blood cell count of 8,000 per μl of blood. Ten percent of the slides were randomly selected for quality control.
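As a hedged illustration of the counting convention described above, the following sketch converts a slide count into parasites per μl of blood using the assumed standard of 8,000 white blood cells per μl. The number of white blood cells counted per slide is not stated in the text, so the value of 200 in the usage example is hypothetical, as are the function and parameter names.

```python
ASSUMED_WBC_PER_UL = 8000  # standard white blood cell count per μl assumed in the text


def parasite_density(parasites_counted: int, wbc_counted: int,
                     wbc_per_ul: int = ASSUMED_WBC_PER_UL) -> float:
    """Estimate parasites per μl of blood from a thick blood film count.

    parasites_counted: parasites seen while counting `wbc_counted` white blood cells.
    wbc_counted: number of white blood cells counted on the slide (hypothetical input;
                 the survey protocol value is not given in the text).
    """
    if parasites_counted < 0:
        raise ValueError("parasites_counted must be non-negative")
    if wbc_counted <= 0:
        raise ValueError("wbc_counted must be positive")
    return parasites_counted / wbc_counted * wbc_per_ul


# Hypothetical slide: 50 parasites seen while counting 200 white blood cells.
print(parasite_density(50, 200))  # 2000.0
```

The only quantity taken from the text is the standard count of 8,000 white blood cells per μl; everything else is an illustrative assumption.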
A pre-tested questionnaire was administered to all children participating in the survey. The questionnaire included information on household asset ownership (e.g. bicycle, fridge, radio, etc.), clinical symptoms (e.g. abdominal pain, headache, vomiting, etc.) and recent history of diseases (e.g. malaria, skin disease, schistosomiasis, etc.) [31]. Children were also asked whether they had a bed net at home, whether they slept under a bed net, whether they used other preventive measures against malaria, such as fumigating coils, insecticide spray and burning leaves and whether they took malaria treatment during the two weeks preceding the survey.
Data were double-entered and cross-checked in EpiInfo version 6 (Centers for Disease Control and Prevention; Atlanta, USA).

Environmental data
Environmental data were obtained from satellite imagery for the year 2011. The sources and the properties of these data are summarised in Table 1. Yearly averages were used for night and day land surface temperature (LST), normalized difference vegetation index (NDVI) and rainfall. The rainfall coefficient of variation was calculated by dividing the standard deviation (SD) by the mean. Land cover was grouped into three categories: (i) urban; (ii) forest/savannah; and (iii) croplands. Altitude was obtained at 1 km spatial resolution, and distance to freshwater bodies was extracted from digitized maps (Health Mapper database; Geneva, Switzerland).
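The coefficient of variation described above can be sketched as follows. This is a minimal illustration only: the temporal unit of the rainfall series and the choice between population and sample SD are not stated in the text, so the population SD and the four values in the usage example are assumptions.

```python
import statistics


def coefficient_of_variation(values):
    """Return SD divided by mean for a series of rainfall estimates.

    Uses the population SD (statistics.pstdev); the text does not say
    whether the population or sample SD was used, so this is an assumption.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean


# Hypothetical rainfall series (mm) for a single pixel, not survey data.
rainfall_mm = [10.0, 20.0, 30.0, 40.0]
print(round(coefficient_of_variation(rainfall_mm), 4))
```

A value of 0 indicates constant rainfall over the period, while larger values indicate stronger variability relative to the mean.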

Statistical analysis
Children were grouped into two age categories: (i) 5-10 years and (ii) 11-16 years. Distance from school to the nearest health facility was obtained from the "Programme National de Santé Scolaire et Universitaire" (PNSSU) and was summarised into three categories, i.e. (i) < 1 km, (ii) 1-5 km, and (iii) > 5 km. The first category included schools located in villages or towns with a health facility. For assessment of socioeconomic status, an asset-based approach was used to stratify children into five socioeconomic groups [31].
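The grouping rules above can be written down as a short sketch. The function names are hypothetical, and the handling of distances of exactly 1 km and 5 km (assigned here to the 1-5 km category) is an assumption, since the text does not specify how boundary values were treated.

```python
def age_group(age_years: int) -> str:
    """Assign a child to one of the two age categories used in the analysis."""
    if 5 <= age_years <= 10:
        return "5-10 years"
    if 11 <= age_years <= 16:
        return "11-16 years"
    raise ValueError("age outside the 5-16 year range of the study population")


def distance_category(distance_km: float) -> str:
    """Categorise distance from school to the nearest health facility.

    Boundary values (exactly 1 km or 5 km) fall in the middle category;
    this is an assumption, not something stated in the text.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if distance_km < 1:
        return "< 1 km"
    if distance_km <= 5:
        return "1-5 km"
    return "> 5 km"


print(age_group(9), "|", distance_category(0.0), "|", distance_category(7.5))
```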
Bayesian geostatistical stochastic search variable selection (SSVS) was performed to explore all possible models within a geostatistical framework and select the most important predictors for P. falciparum infection [35,36]. In addition, a parameter expanded normal mixture of inverse gamma (peNMIG) prior parameterization was used to address the oversampling of the categorical variables around zero that can arise with more traditional parameterizations [37]. Details related to the variable selection procedure are provided in Additional file 1.
A first variable selection, which included demographic (age and sex) and environmental variables, was performed to build a predictive model of P. falciparum infection risk across Côte d'Ivoire (variable selection 1). In a second step, individual-level variables on prevention, treatment, distance from school to the nearest health facility and socioeconomic status, which were not available at 1×1 km spatial resolution, were additionally explored in order to assess their effect on P. falciparum infection risk (variable selection 2). Possible nonlinear relationships were taken into account by considering categorical forms of the predictors. Both selection procedures were built so that the model could choose the best functional form of each predictor, i.e. categorical or linear. With the exception of age and distance from school to the nearest health facility, which were categorised, every variable that could be expressed either as categorical or linear was introduced in both functional forms. The variables and functional forms included in the final models were those with a posterior probability of inclusion above 50 % (median probability model).
Geostatistical logistic regression models were fitted within a Bayesian framework of inference to analyse the risk of infection with P. falciparum, using MCMC simulation algorithms to estimate model parameters. Let $Y_{ij}$ be the P. falciparum infection status of child $j$ ($j = 1, \ldots, J$) in school $i$ ($i = 1, \ldots, I$). We assumed that $Y_{ij}$ arises from a Bernoulli distribution with probability $p_{ij}$, i.e. $Y_{ij} \sim \mathrm{Be}(p_{ij})$. We modelled covariates $X_{ij}$ and school-specific spatial random effects $\varphi_i$ on the logit scale, i.e. $\mathrm{logit}(p_{ij}) = X_{ij}\beta + \varphi_i$, where $\beta$ represents the vector of regression coefficients, including a constant. Spatial random effects were assumed to follow a multivariate normal prior distribution, $\varphi \sim \mathrm{MVN}(0, \Sigma)$.
The variance-covariance matrix $\Sigma$ introduced spatial dependency through an isotropic exponential correlation function of distance between locations as follows: $\Sigma_{kl} = \sigma^2 \exp(-\rho d_{kl})$, where $d_{kl}$ is the Euclidean distance between a pair of schools $k$ and $l$, the variance $\sigma^2$ measures the spatial geographic variability and $\rho$ is a parameter that controls the rate of correlation decay. The range, defined as the minimum distance at which spatial correlation between locations is below 5 %, is calculated as $3/\rho$. To complete model specification, prior distributions were assigned to model parameters. For the regression coefficients, non-informative normal prior distributions were chosen, i.e. $\beta \sim N(0, 0.01)$, where $\beta = (\beta_1, \ldots, \beta_K)^T$. For the variance and the correlation decay, inverse gamma and gamma distributions were assumed, respectively, i.e. $\sigma^2 \sim \mathrm{IG}(2.01, 1.01)$ and $\rho \sim \mathrm{G}(0.01, 0.01)$. Model prediction was done using Bayesian kriging [38]. We assessed model convergence by visual examination of history and density plots. A sample of the last 500 iterations was stored for prediction on a grid of 352,911 pixels with a spatial resolution of 1 km. To validate our model, a geostatistical logistic regression model was fitted on a sub-dataset of 73 randomly selected schools (around 80 % of the number of schools in the original dataset). We then predicted the risk at the remaining schools and compared our predictions with the observed prevalence data. Model predictive ability was assessed by calculating the mean absolute error (MAE), which is the mean of the absolute differences between the median of the predicted P. falciparum infection risk and the observed prevalence. Model uncertainty was assessed by the sum of the standard deviations (SDs) of the predictive distributions.
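To make the covariance structure and the validation metric above more concrete, the following sketch builds the exponential covariance matrix with entries σ² exp(−ρ d_kl), computes the range as 3/ρ, and evaluates the MAE between predicted medians and observed prevalences. It is a minimal illustration, not the OpenBUGS implementation used in the study: the function names, the coordinates (assumed to be projected, in km), the parameter values and the prevalences are all hypothetical.

```python
import math


def exponential_covariance(coords, sigma2, rho):
    """Return the matrix with entries sigma2 * exp(-rho * d_kl).

    coords: list of (x, y) school locations, assumed to be in km.
    d_kl is the Euclidean distance between schools k and l.
    """
    n = len(coords)
    return [[sigma2 * math.exp(-rho * math.dist(coords[k], coords[l]))
             for l in range(n)]
            for k in range(n)]


def spatial_range(rho):
    """Distance at which the correlation drops below 5 %, i.e. 3 / rho."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    return 3.0 / rho


def mean_absolute_error(predicted_medians, observed):
    """Mean absolute difference between predicted medians and observed prevalences."""
    if len(predicted_medians) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted_medians, observed)) / len(observed)


# Hypothetical values, chosen only to show that the correlation at the range is ~5 %.
schools = [(0.0, 0.0), (0.0, 6.0)]
cov = exponential_covariance(schools, sigma2=1.0, rho=0.5)
print(spatial_range(0.5))       # 6.0
print(cov[0][1] < 0.05)         # True, since exp(-3) is about 0.0498
print(mean_absolute_error([0.6, 0.8], [0.7, 0.7]))
```

The factor 3 follows from exp(−3) ≈ 0.0498, which is why 3/ρ marks the distance beyond which the correlation falls below 5 %.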

Implementation details
Geostatistical variable selection and model fit were implemented in OpenBUGS v. 3

Results
Complete data records were available from 5,322 children aged 5-16 years in 93 schools. Of note, one of the sampled schools refused to participate. The prevalences of P. malariae and P. ovale were very low (3.7 % and 0.3 %, respectively). P. falciparum was the predominant species with an overall observed prevalence of 69.2 %. All subsequent analyses focussed on P. falciparum only. The spatial distribution of the observed P. falciparum infection prevalence is shown on map A in Fig. 1.
We assessed potential correlation between predictors in a preliminary analysis, but none were considered highly correlated, since the absolute value of Pearson's correlation coefficient never exceeded 0.9 (|r| < 0.9). Hence, all of them were considered for selection as potential predictors of malaria risk. Median probability models with their posterior probability, as well as the posterior inclusion probability of each predictor for both variable selection procedures, are shown in Table 2. For variable selection 1, where only demographic and environmental predictors were explored, a posterior inclusion probability above 50 % was obtained for land cover. However, when we additionally included prevention, treatment and socioeconomic data and distance from school to the nearest health facility in the variable selection procedure (variable selection 2), only socioeconomic status was retained as an important predictor.
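The two screening rules described above (a Pearson correlation cut-off of 0.9 and a posterior inclusion probability above 50 %) can be sketched as follows. The predictor names and numbers are invented for illustration and are not the values reported in Table 2; the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated(x, y, cutoff=0.9):
    """Flag a predictor pair whose absolute correlation exceeds the cut-off."""
    return abs(pearson_r(x, y)) > cutoff


def median_probability_model(inclusion_probs, threshold=0.5):
    """Keep predictors whose posterior inclusion probability exceeds the threshold."""
    return sorted(name for name, p in inclusion_probs.items() if p > threshold)


# Invented inclusion probabilities for three hypothetical predictors.
probs = {"predictor_a": 0.82, "predictor_b": 0.31, "predictor_c": 0.07}
print(median_probability_model(probs))          # ['predictor_a']
print(highly_correlated([1, 2, 3, 4], [2, 1, 4, 3]))
```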
Estimates of model parameters and model validation results for the Bayesian geostatistical logistic regression model with land cover as predictor are summarised in Table 3. Croplands and forest/savannah were positively associated with P. falciparum infection compared to urban settings (croplands: odds ratio (OR) 1.95, 95 % Bayesian credible interval (BCI) 1.23-3.03; forest/savannah: OR 2.30, 95 % BCI 1.43-3.81). The spatial range was 285 km (95 % BCI: 139-477 km), indicating important residual spatial correlation. Model validation showed that the model predicts a random sample of 20 % of the data with an MAE of 0.11 and a sum of SDs of the posterior predictive distribution of 1.81.
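Odds ratios and credible intervals such as those above are commonly obtained by exponentiating posterior draws of a logit-scale coefficient and taking the median and the 2.5 % and 97.5 % quantiles. The sketch below assumes this convention; the text does not spell out how the summaries in Table 3 were computed, and the draws here are simulated with arbitrary parameters rather than taken from the study's MCMC output.

```python
import math
import random
import statistics


def odds_ratio_summary(log_odds_draws):
    """Return (median OR, lower 2.5 % bound, upper 97.5 % bound).

    log_odds_draws: posterior draws of a regression coefficient on the logit scale.
    """
    odds_ratios = [math.exp(b) for b in log_odds_draws]
    # n=40 yields 39 cut points at 2.5 %, 5 %, ..., 97.5 %.
    cuts = statistics.quantiles(odds_ratios, n=40)
    return statistics.median(odds_ratios), cuts[0], cuts[-1]


# Simulated draws (arbitrary mean and SD), standing in for real MCMC output.
random.seed(1)
draws = [random.gauss(0.7, 0.2) for _ in range(500)]
median_or, lower, upper = odds_ratio_summary(draws)
print(f"OR {median_or:.2f} (95 % BCI {lower:.2f}-{upper:.2f})")
```

Because exponentiation is monotone, the quantiles of the odds ratio equal the exponentiated quantiles of the coefficient; an interval that excludes 1 corresponds to a coefficient interval that excludes 0.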
Estimates of model parameters for the Bayesian geostatistical logistic regression with socioeconomic status are also presented in Table 3. Socioeconomic status showed a significant association with the risk of P. falciparum infection; the wealthier the household, the lower the risk of P. falciparum infection. A spatial range of 259 km (95 % BCI: 123-457 km) was estimated, indicating substantial residual spatial correlation.
Map B in Fig. 1 illustrates the smooth map of the estimated P. falciparum infection prevalence among school-aged children in Côte d'Ivoire. The lowest prevalences (around 45 %) were estimated for small urban aggregated areas in the south-east of the country close to Abidjan, and in the central-southern and central-western parts. Prevalences above 70 % were found in the north, central-east, south-east, west and south-west of Côte d'Ivoire. Map C in Fig. 1 shows the standard error of the predicted P. falciparum infection prevalence. High prediction errors were mostly found from the central-west to the central-east of the country.

Discussion
The purpose of this study was to (i) identify sociodemographic, environmental and disease prevention indicators associated with P. falciparum infection prevalence and (ii) produce a smooth risk map of P. falciparum infection among school-aged children for Côte d'Ivoire. To our knowledge, this spatial analysis is the first at the national level based on P. falciparum data collected within a few weeks during the dry season in late 2011/early 2012. The results obtained from this spatially explicit analysis are useful for current and future malaria control efforts in Côte d'Ivoire.
Among the environmental covariates, only land cover was selected by the geostatistical variable selection procedure. Precipitation, temperature and distance to freshwater bodies (factors that have previously been associated with malaria risk in Côte d'Ivoire [21,22]) were not identified as important predictors in the current investigation. Possible explanations arise from the use of different data sources; while the national survey reported here was conducted in the dry season, the aforementioned country-level study [22] used historical data obtained over a period of 20 years and during different periods of the year. In addition, particular geographic patterns, such as those of the mountainous area in western Côte d'Ivoire [21] where specific climatic and environmental conditions prevail, and scale differences across studies may further explain the contrasting results. Distance to water bodies was not selected by our modelling framework as a potential risk factor for malaria. This observation might be explained by the fact that the distance from school to the nearest water body is not precise enough to capture this effect, or that the water body data source is not sufficiently detailed. Unfortunately, we could not afford to collect information on the distance from each participant's home to the nearest water body. Further effort is needed to identify additional water body information in Africa.
The spatial model with land cover as covariate indicated that school-aged children living in areas with forest/savannah and croplands are at higher risk of P. falciparum infection compared to those from urban areas. On the one hand, this result suggests that the endemicity of malaria in Côte d'Ivoire is linked to two … In contrast, other studies using land cover showed that forest areas were associated with a decrease in malaria incidence [39] or risk [40]. It is important to underline that the south of Côte d'Ivoire is mainly characterised by tropical rainforest in which people are living, and this pattern might differ in other studies. On the other hand, people living in urban areas generally have better access to treatment and prevention measures than those in rural areas, which is reflected by our results. Interestingly, the second geostatistical variable selection, which included environmental variables, prevention and distance from school to the nearest health facility, suggested that only socioeconomic status, as assessed by the wealth index, explained P. falciparum infection risk. Wealthier households were associated with a lower risk of P. falciparum infection, which is in line with results from other studies [3,41,42]. This result is consistent with the first variable selection procedure, where only environmental factors were considered. Indeed, urban areas are characterised by an overall higher socioeconomic status compared to forest/savannah and cropland areas. Of note, Côte d'Ivoire is still mainly rural, although urbanisation progresses rapidly [20,21,31,43]. As shown elsewhere, P. falciparum infection and parasitaemia are positively associated with low socioeconomic status.
In the present study employing recent epidemiological data, very high prevalences of P. falciparum (> 70 %) were found in the entire north and the south-west (Taï forest region) of Côte d'Ivoire. This is in contrast with lower prevalences obtained from a previous spatial analysis using historical data focussing on children aged < 16 years [22]. Recent environmental transformations, such as rice farming in the north [44] and progression of deforestation in the Taï forest in the south-west, led to increased population densities [45,46], which might explain the change in P. falciparum prevalence rates in these areas. However, differences in survey designs and large heterogeneities of historical data must be considered to deepen the understanding of potential changes in P. falciparum prevalence and parasitaemia in space and time [47]. The design of the present study allowed us to have more uniformly distributed data across the country than before, and hence, prediction uncertainty was minimized.

Conclusion
This study provided a comprehensive analysis of the spatial distribution of P. falciparum infection among school-aged children across Côte d'Ivoire. Given the high burden of P. falciparum infections in the school-aged population, there is a need for intervention strategies that also target this age group. Since the end of 2014, the national malaria control programme in Côte d'Ivoire has been making LLINs available to the entire population, including school-aged children, which is an important step towards tackling the malaria burden in this specific age group. The smooth P. falciparum prediction map produced here, in conjunction with uncertainty estimates, represents a useful tool for scaling up current and future malaria control interventions. Future predictive risk profiling should include other factors such as population density and more detailed information on intervention coverage in order to better understand the impact of ongoing malaria interventions.

Availability of data and materials
The datasets used for analysis of the study are available from the corresponding author on reasonable request.

Authors' contributions
CAH, RBY, EH, PBN, KDS, GS and GR implemented the study. BGK, SBA and AF assisted in the preparation for execution of the study. EKN, JU, PV and GR conceived and designed the study. KDS, GS and GR supervised the study. CAH, FC and PV analysed the data. CAH, FC, RBY, EH, PBN and GR contributed to the data management. CAH, FC, JU, PV and GR drafted the manuscript. All authors read, revised and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Consent for publication
Not applicable.

Ethics approval and consent to participate
The study protocol was approved by the ethics committees of Basel (EKBB; reference no. 30/11) and Côte d'Ivoire (reference no. 09-2011/MSHP/CNER-P). Additionally, permission to carry out this study was obtained from the Ministry of National Education. Written informed consent was obtained from parents or legal guardians of participating children, while children assented orally. Participation was voluntary, and hence, withdrawal was possible at any time without further obligation. Parasitological and questionnaire data were coded and treated confidentially. All febrile children (tympanic temperature ≥ 38 °C) with a positive result for Plasmodium infection were treated with ACT, according to recommendations put forth by WHO and national policies [48].