This study evaluated secondary data on the House Index (HI) and the Breteau Index (BI), both gathered by the surveillance services. The units of analysis were the 51 neighbourhoods, grouped into 17 strata, of the city of Campina Grande, Brazil, from 2014 to 2017. This period was chosen in view of the hypothesis that Zika virus was introduced into Brazil during the 2014 FIFA World Cup [17]. The team from Tahiti (French Polynesia) had played in the Pernambuco Arena in June 2013, and a viral phylogenetic study showed that the Brazilian strain was of Asian origin, sharing a common ancestor with the strain circulating in French Polynesia [18]. Zika virus may have been introduced in Pernambuco, which could explain the larger size of the epidemic in this state and in neighbouring areas such as Campina Grande [19]. Zika virus was associated with a high prevalence of congenital Zika syndrome, which led to a public health emergency in the country [20].
Campina Grande (7°13′14.92″S, 35°55′1.32″W) is considered one of the main industrial centres of north-eastern Brazil as well as the main technological hub in South America. In 2019, its estimated population was 409,731 inhabitants, making it the second most populous city of Paraíba, with a population density of 648.31 inhabitants per km² [21]. About 40% of the population lives on a very low income of about USD 118 per month (half of the minimum wage) [21].
Based on the Köppen-Geiger climate classification system, Campina Grande has a moderate tropical climate, with a dry season from September to January and a wet season from May to August. Maximum summer and winter temperatures are 30 and 18 °C, respectively. Minimum summer and winter temperatures are 20 and 15 °C, respectively. The annual relative humidity ranges from 75% to 82% [22].
The city has a total area of 593 km², divided into 51 neighbourhoods, with 5% designated as rural and 95% as urban (Fig. 1). Much of the city’s growth was unplanned, and its neighbourhoods correspond to old farms that were sold and urbanized. The poorer neighbourhoods are located in the peripheral areas or near rivers or railroads. From 2014 to 2017, the cityʼs neighbourhoods were grouped into 17 strata according to socioeconomic characteristics and/or physical boundaries, such as large avenues, highways, railways and wide bodies of water (rivers, lakes and dams). All procedures used to define the strata are described in guidelines published by the Brazilian Ministry of Health, which also provides software for sampling and data collection [10].
Statistical methods
For the purposes of this study, the local authorities of Campina Grande provided the results of the LIRAa surveys (HI and BI) carried out between January 2014 and December 2017. The dependent variables were 51 records of HI and BI, collected three to five times per year. The main descriptive statistics for these dependent variables are presented in the results: maximum, minimum, median, interquartile range, mean and standard deviation. For spatial visualization of the data, the quartile maps, Moran map and Local Indicators of Spatial Association (LISA) map are presented for 2014 and 2017. The other years are presented in Additional files 1–10: Figures S1–S10. All analyses were performed using the R software [23].
Matrix W
The neighbourhood relationships between locations can be expressed by a spatial contiguity (weight) matrix W of order n × n, where n is the number of locations (neighbourhoods). The entry in the ith row and jth column, denoted as Wij, corresponds to the pair (i, j) of locations. The element Wij takes a nonzero value when the areas (observations) i and j are considered neighbours, and zero otherwise.
$$W = \left[ \begin{array}{cccc} W_{11} & W_{12} & \cdots & W_{1n} \\ W_{21} & W_{22} & \cdots & W_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ W_{n1} & W_{n2} & \cdots & W_{nn} \end{array} \right]$$
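The analyses in this study were run in R [23]; the Python below is only an illustrative sketch of how such a matrix could be assembled. It assumes binary contiguity (weight 1 for neighbours, 0 otherwise) followed by optional row standardization, which is a common convention but is not stated in the text. The function name `contiguity_matrix` and the four-area toy layout are hypothetical.

```python
def contiguity_matrix(neighbours, row_standardize=True):
    """Build an n x n spatial weight matrix from a neighbour list.

    neighbours[i] holds the indices of the areas adjacent to area i.
    Assumption: binary contiguity weights, optionally row-standardized.
    """
    n = len(neighbours)
    W = [[0.0] * n for _ in range(n)]
    for i, adjacent in enumerate(neighbours):
        for j in adjacent:
            if i != j:  # an area is not treated as its own neighbour
                W[i][j] = 1.0
    if row_standardize:
        for i in range(n):
            total = sum(W[i])
            if total > 0:  # isolated areas keep an all-zero row
                W[i] = [w / total for w in W[i]]
    return W


# Hypothetical example: four areas arranged in a line, 0 - 1 - 2 - 3.
toy_neighbours = [[1], [0, 2], [1, 3], [2]]
W = contiguity_matrix(toy_neighbours)
```

With row standardization, each nonzero row sums to 1, so the weighted sum over an area's neighbours becomes a simple average of their values.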
Spatial autocorrelation
Spatial autocorrelation is the correlation between observations of a single variable that is attributable solely to their proximity in space. Measures and tests of spatial autocorrelation (association) can be differentiated by the range or scale of analysis, distinguishing global from local measures [24]. A global measure includes all elements of the matrix W in the calculation, producing a single spatial autocorrelation value for the whole study area. In contrast, local measures are focused, i.e. they evaluate the autocorrelation associated with one particular area or a few area units rather than all of them [24].
Both measures indicate the degree of spatial association of the dataset. The Moranʼs I index calculates the spatial autocorrelation as a covariance, from the product of the deviations from the mean [24]. This index indicates the magnitude of the spatial association present in the data set with n locations. The Moranʼs I index is calculated by the following expression:
$$I = \frac{{\sum\nolimits_{i = 1}^{n} {\sum\nolimits_{j = 1}^{n} {w_{ij} \left( {y_{i} - \bar{y}} \right)\left( {y_{j} - \bar{y}} \right)} } }}{{\sum\nolimits_{i = 1}^{n} {\left( {y_{i} - \bar{y}} \right)^{2} } }}$$
Moranʼs I index ranges from −1 to 1, where −1 means perfect dispersion, 0 represents random behaviour, and 1 means perfect association. Assuming that the observations y_i are realizations of normally distributed random variables, I is approximately normally distributed, with the following mean and variance:
$$E(I) = - \frac{1}{(n - 1)}$$
$$Var(I) = \frac{{n^{2} (n - 1)W_{1} - n(n - 1)W_{2} - 2W_{0}^{2} }}{{(n + 1)(n - 1)^{2} W_{0}^{2} }}$$
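As a minimal sketch of the global index and its expected value, the Python below computes I for a toy example. It uses the general form with the factor n/W0, where W0 is taken here to be the sum of all weights; for a row-standardized matrix W0 = n, so this factor equals 1 and the expression for I given above is recovered. The function names and the four-area layout are hypothetical, and the variance is not implemented here.

```python
def morans_i(y, W):
    """Global Moran's I for values y and weight matrix W (list of lists)."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]                 # deviations from the mean
    w0 = sum(sum(row) for row in W)           # sum of all weights (assumed W0)
    cross = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    ss = sum(d * d for d in z)                # sum of squared deviations
    return (n / w0) * cross / ss


def expected_i(n):
    """Expected value of I under no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Hypothetical row-standardized weights for four areas in a line, 0 - 1 - 2 - 3.
W_line = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

i_gradient = morans_i([1, 2, 3, 4], W_line)     # smooth gradient: I = 0.4
i_alternating = morans_i([1, -1, 1, -1], W_line)  # alternating values: I = -1.0
```

The smooth gradient gives a positive index, whereas the alternating pattern reaches the lower bound, which matches the interpretation of the range given above.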
Although these global measures are very useful for indicating overall clustering in the data, they need to be complemented by local statistics. The local Moranʼs index for each area Ai is given by:
$$I_{i} = \frac{{(y_{i} - \bar{y})\sum\nolimits_{j = 1}^{n} {w_{ij} \left( {y_{j} - \bar{y}} \right)} }}{{\sum\nolimits_{j = 1}^{n} {\frac{{\left( {y_{j} - \bar{y}} \right)^{2} }}{n}} }}$$
The statistic can be interpreted as follows: positive values of Ii indicate spatial clusters of similar values (high or low) of the variable under study, whereas negative values indicate that the value in an area is dissimilar to the values in its neighbours.
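The local index can be sketched in the same way. The Python below takes Ii as the deviation of area i from the mean, multiplied by the weighted sum of the deviations of its neighbours and divided by the mean squared deviation. The function name, the toy weights and the toy values are hypothetical, and row-standardized weights are assumed.

```python
def local_morans_i(y, W):
    """Local Moran's I_i for each area, given values y and weights W."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]                  # deviations from the mean
    m2 = sum(d * d for d in z) / n             # mean squared deviation
    return [
        z[i] * sum(W[i][j] * z[j] for j in range(n)) / m2
        for i in range(n)
    ]


# Hypothetical row-standardized weights for four areas in a line, 0 - 1 - 2 - 3.
W_line = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

local_values = local_morans_i([1, 2, 3, 4], W_line)
mean_local = sum(local_values) / len(local_values)
```

In this toy case all four local values are positive, and their average equals the global index computed with the same row-standardized weights, which illustrates how the local statistics decompose the global one.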
Moran scatter plot
The Moran scatter plot illustrates the relationship between the value of the chosen attribute at each location and the average value of the same attribute at neighbouring locations. The diagram is divided into four quadrants (Q1, Q2, Q3 and Q4), interpreted as follows: (i) Q1: the first quadrant (upper right) shows areas with high values of the variable under analysis surrounded by neighbouring areas that also have above-average values. This quadrant is classified as high-high (AA, + +); (ii) Q2: the second quadrant (lower left) shows areas with low values of the variable under analysis surrounded by neighbouring areas that also have below-average values. This quadrant is classified as low-low (BB, − −); (iii) Q3: the third quadrant (lower right) shows areas with high values of the variable under analysis surrounded by neighbouring areas with below-average values. This quadrant is classified as high-low (AB, + −); and (iv) Q4: the fourth quadrant (upper left) shows areas with low values of the variable under analysis surrounded by neighbouring areas with above-average values. This quadrant is classified as low-high (BA, − +).
The areas located in quadrants Q1 and Q2 show positive autocorrelation, i.e. the neighbouring areas have similar values. In contrast, the areas located in quadrants Q3 and Q4 show negative autocorrelation, i.e. the neighbouring areas have dissimilar values.
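The quadrant assignment can be sketched as follows, using the deviation of each area from the mean (horizontal axis) and the weighted average deviation of its neighbours (vertical axis). The function name and the toy data are hypothetical, and assigning values that lie exactly on an axis to the non-negative side is an assumption made only for this illustration.

```python
def moran_quadrants(y, W):
    """Assign each area to a Moran scatter plot quadrant (Q1 to Q4)."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]                                   # own deviation
    lag = [sum(W[i][j] * z[j] for j in range(n)) for i in range(n)]  # neighbours
    labels = []
    for zi, li in zip(z, lag):
        # Assumption: a value of exactly zero counts as "high".
        if zi >= 0 and li >= 0:
            labels.append("Q1 high-high")
        elif zi < 0 and li < 0:
            labels.append("Q2 low-low")
        elif zi >= 0:
            labels.append("Q3 high-low")
        else:
            labels.append("Q4 low-high")
    return labels


# Hypothetical row-standardized weights for four areas in a line, 0 - 1 - 2 - 3.
W_line = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

gradient_labels = moran_quadrants([1, 2, 3, 4], W_line)
alternating_labels = moran_quadrants([4, 1, 4, 1], W_line)
```

The gradient places the two low areas in Q2 and the two high areas in Q1 (positive autocorrelation), while the alternating pattern places every area in Q3 or Q4 (negative autocorrelation).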
Box map, LISA map and Moran map
The Box map is an extension of the Moran scatter plot in which the polygons of the areas in each quadrant of the plot are drawn in a specific colour. The LISA map indicates the regions whose local correlation differs significantly from that of the others, classified into the following groups: non-significant, and significant at the 5% (P < 0.05), 1% (P < 0.01) and 0.1% (P < 0.001) levels. The Moran map, like the LISA map, is based on significance, but it shows only the areas with significant values, classified into four groups according to the quadrant of the Moran scatter plot to which they belong.
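The two classifications can be sketched as simple lookups once a P-value is available for each area. In the Python below the P-values are hypothetical inputs (how they are obtained, e.g. by a permutation test, is not specified here), and the function names and the 5% cut-off used for the Moran map are assumptions for illustration.

```python
def lisa_significance_class(p_value):
    """Significance group of an area on the LISA map."""
    if p_value < 0.001:
        return "P < 0.001"
    if p_value < 0.01:
        return "P < 0.01"
    if p_value < 0.05:
        return "P < 0.05"
    return "non-significant"


def moran_map_class(quadrant, p_value, alpha=0.05):
    """Quadrant label on the Moran map, kept only for significant areas."""
    return quadrant if p_value < alpha else "non-significant"


# Hypothetical quadrant labels and P-values for four areas.
quadrants = ["high-high", "low-low", "high-low", "low-high"]
p_values = [0.0004, 0.03, 0.20, 0.008]

lisa_map = [lisa_significance_class(p) for p in p_values]
moran_map = [moran_map_class(q, p) for q, p in zip(quadrants, p_values)]
```

Here the third area is non-significant on both maps, so its quadrant would still be coloured on the Box map but not on the Moran map.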