Simple framework for real-time forecast in a data-limited situation: the Zika virus (ZIKV) outbreaks in Brazil from 2015 to 2016 as an example



In 2015–2016, Zika virus (ZIKV) caused serious epidemics in Brazil. The key epidemiological parameters and spatial heterogeneity of ZIKV epidemics in different states in Brazil remain unclear. Early prediction of the final epidemic (or outbreak) size for ZIKV outbreaks is crucial for public health decision-making and mitigation planning. We investigated the spatial heterogeneity in the epidemiological features of ZIKV across eight different Brazilian states by using simple non-linear growth models.


We fitted three different models to the weekly reported ZIKV cases in eight different states and obtained an R² larger than 0.995. The estimated average values of basic reproduction numbers from different states varied from 2.07 to 3.41, with a mean of 2.77. The estimated turning points of the epidemics also varied across different states. Nevertheless, the estimation of the turning points is stable and can be carried out in real time. The forecast of the final epidemic size (attack rate) is reasonably accurate shortly after the turning point. The knowledge of the epidemic turning point is crucial for accurate real-time projection of the outbreak.


Our simple models fitted the epidemic reasonably well and thus revealed the spatial heterogeneity in the epidemiological features across Brazilian states. The knowledge of the epidemic turning point is crucial for real-time projection of the outbreak size. Our real-time estimation framework is able to yield a reliable prediction of the final epidemic size.


Zika virus (ZIKV) was first identified in the Zika Forest of Uganda in 1947 [1]. Later it was found to spread in the human populations in Nigeria [2, 3]. ZIKV is an arbovirus in the family Flaviviridae and is transmitted through the bites of mosquito vectors (usually Aedes aegypti mosquitoes) [4,5,6,7]. By 2007, ZIKV had spread from Africa to Yap Island in Micronesia, where it infected an estimated 75% of the local population [8]. In 2013, ZIKV reached French Polynesia and caused an infection attack rate of 49% [9, 10]. By 2015 it had invaded Brazil [10,11,12] and then quickly spread to many regions in South America [4, 13, 14]. Since 2015, other ZIKV transmission routes have also been found (materno-fetal, sexual transmission and via blood transfusion) [15,16,17], but these paths are uncommon and inefficient [12]. Up to the end of 2018, ZIKV infections had been reported in 86 countries (or regions), mainly in Oceania and the Americas [17]. In recent years, available scientific evidence and analysis strongly suggest that ZIKV could cause Guillain-Barré syndrome (GBS) [17,18,19,20,21]. ZIKV infection in pregnant women is also reported to be associated with, among other medical complications, microcephaly in their infants and sometimes even fetal deaths [17, 22,23,24]. Due to the lack of effective vaccines or medication, the World Health Organization (WHO) declared ZIKV a public health emergency of international concern in February 2016 [17].

In Brazil, samples of eight patients (with rash) tested at the Bahia State laboratory were positive for ZIKV by RT-PCR in epidemiological week (EW) 17 of 2015 [11, 25]. In EW 19 of 2015, Brazilian authorities reported positive results for ZIKV by RT-PCR in samples taken from the States of Rio Grande and Bahia. This was the first report of locally-acquired ZIKV infection in Brazil [25]. The first wave of ZIKV hit northeastern Brazil in the first quarter of 2015 and started fading out from September. It was severely underreported because mandatory ZIKV case notification only started in February 2016 [26]. The second wave of ZIKV swept Brazil between October 2015 and July 2016 [4, 27], followed by an increasing number of infants with microcephaly across the whole country [28,29,30], as well as GBS cases [13]. This second wave in Brazil ended around July 2016, and even earlier for some of the states [25].

Modelling is widely used to study primary epidemic features and estimate the epidemiological parameters in infectious disease outbreaks [10, 12, 14, 15, 31,32,33,34,35,36,37]. During an outbreak, the crucial epidemiological parameters include the reproduction number [32, 34, 38,39,40], final epidemic size [41, 42] and the turning time point [43,44,45,46,47]. The three parameters reflect the levels of infectivity, severity and the inflection time point of an epidemic, respectively. Knowledge of these epidemiological parameters summarizes the temporal pattern of an epidemic and is helpful to understand the features of an outbreak. The real-time prediction of the epidemic final size is a procedure in which the estimates are valuable if achieved early. Moreover, the real-time estimation of the potential severity of an ongoing epidemic could be crucial for disease control and prevention policy-making [46, 48,49,50,51,52].

In this study, we were inspired by previous work [44, 46, 53] and adopted simple non-linear phenomenological models to study the epidemics in a data-limited situation. When a relatively new disease hits an under (or less) developed population, much public health related information is unknown during the outbreak, and only reported case time series are available. However, a quick estimate of the key epidemiological parameters and forecast on the trend is crucial for mitigation planning. We propose a framework for such a situation and use the ZIKV case time series in eight Brazilian states as an example. We study the power of simple models for real-time estimation in a data-limited situation. We forecast the final epidemic size in real-time. We reveal the spatial heterogeneity of epidemiological parameter estimates of the ZIKV epidemics across Brazil which should be useful in mitigation planning (or resource allocation).



We obtained weekly reported ZIKV cases (both confirmed and suspected or suspected only) in eight Brazilian states between January 2015 and July 2016 from published literature [4]. These states include Acre, Bahia, Pernambuco, Espirito Santo, Parana, Rio Grande, Goiania City and Mato Grosso. According to the case definition by the WHO [54], a confirmed case must be first defined as a suspected case, and thus we follow previous work [10, 44] to use either the sum of confirmed and suspected cases (if both are available) or the suspected cases for analysis. Although Brazil started nationwide mandatory ZIKV case notification in February 2016 [4, 26], many states with large-scale outbreaks started local notification (reporting) on or after October 2015. The second large epidemic wave of ZIKV infections started in October 2015 [4, 27]. In this work, we use the ZIKV epidemic data on or after October 2015 for all eight states in Brazil for modelling.

We obtained the population data at the end of 2015 from the Brazilian Institute of Geography and Statistics [55].

Mathematical models

We aimed to investigate the temporal patterns and transmission potential of ZIKV in eight Brazilian states over roughly the same period of time in 2015–2016. We adopted three different non-linear growth models to pinpoint the wave of ZIKV infections in each state. The three models are the three-parameter logistic growth model [56], the Gompertz growth model [57] and the Richards model [58], which are widely used to study S-shaped cumulative growth processes, e.g. epidemic curves [43, 44, 46, 47, 53, 59].

In this study, we denote the real (or theoretical) cumulative number of ZIKV cases at time (or day) t by C(t), so C(t) also represents the instantaneous epidemic size at time t. The three-parameter logistic growth model reads

$$ C\left( t \right) = \frac{K}{1 + e^{ - \gamma \left( t - \tau \right)}}. \tag{1} $$

The Gompertz growth model reads

$$ C\left( t \right) = Ke^{ - e^{ - \gamma \left( t - \tau \right)}}. \tag{2} $$

The Richards model reads

$$ C\left( t \right) = \frac{K}{\left[ 1 + \alpha e^{ - \alpha \gamma \left( t - \tau \right)} \right]^{1/\alpha }}. \tag{3} $$

In Eqns (1–3), K is the maximum cumulative case number or the final epidemic size over the single wave of an outbreak, γ is the intrinsic per capita growth rate of the infected population, and τ is the unique inflection time point. For the Richards model in Eqn (3), the term α is the exponent of deviation of the cumulative S-shaped ZIKV epidemic curve for C(t). In particular, when α = 1, the Richards model in Eqn (3) becomes the logistic model in Eqn (1). Unlike the logistic model in Eqn (1), the Richards model is no longer symmetrical about the point of inflection (τ) when α ≠ 1. More similarities and differences among the three growth models are summarized in Additional file 1: Text S1. Unlike the standard “susceptible-infectious-recovered” (SIR) compartmental models commonly used to study the transmission of diseases [12, 14, 36], these growth models consider the cumulative cases with saturation in the growth rate as the signs of progress of epidemics. The extrinsic growth rate does not steadily decline but rather increases to a maximum (i.e. a saturated level) before steadily declining to zero.
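The three growth curves can be written down directly. The following is a minimal Python sketch of Eqns (1), (2) and (3); the function and argument names are illustrative choices and do not come from the source.

```python
import math


def logistic(t, K, gamma, tau):
    """Three-parameter logistic curve, Eqn (1)."""
    return K / (1.0 + math.exp(-gamma * (t - tau)))


def gompertz(t, K, gamma, tau):
    """Gompertz curve, Eqn (2)."""
    return K * math.exp(-math.exp(-gamma * (t - tau)))


def richards(t, K, gamma, tau, alpha):
    """Richards curve, Eqn (3); alpha = 1 recovers the logistic curve."""
    return K / (1.0 + alpha * math.exp(-alpha * gamma * (t - tau))) ** (1.0 / alpha)
```

All three functions return the cumulative count C(t) and approach K as t grows, so K can be read as the final size in each case.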

The turning point or inflection point, τ, is defined as the time point at which the rate of case accumulation changes from increasing to decreasing (or vice versa). Hence, τ is the moment at which the daily (or weekly) incidence trajectory begins to decline, which means the extrinsic growth rate reaches its maximum. The turning point marks the transition of the epidemic from the acceleration phase to the deceleration phase.
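As a brief check of this definition (a sketch derived here from Eqns (1), (2) and (3), not reproduced from the source), each model can be written as a growth-rate equation, and setting the second derivative of C(t) to zero gives the cumulative count at the turning point:

$$ \text{logistic:}\quad \frac{{\text{d}}C}{{\text{d}}t} = \gamma C\left( 1 - \frac{C}{K} \right), \qquad C\left( \tau \right) = \frac{K}{2}, $$

$$ \text{Gompertz:}\quad \frac{{\text{d}}C}{{\text{d}}t} = \gamma C\ln \frac{K}{C}, \qquad C\left( \tau \right) = \frac{K}{e}, $$

$$ \text{Richards:}\quad \frac{{\text{d}}C}{{\text{d}}t} = \gamma C\left[ 1 - \left( \frac{C}{K} \right)^{\alpha } \right], \qquad C\left( \tau \right) = \frac{K}{\left( 1 + \alpha \right)^{1/\alpha }}. $$

In each case the incidence dC/dt peaks at t = τ. The cumulative count at that moment is a fixed fraction of K: one half for the logistic model, about 37% for the Gompertz model, and a fraction that depends on α for the Richards model.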

The model parameters K, γ and τ are of epidemiological importance. These parameters can be estimated by fitting the growth models to the epidemic data of the ZIKV outbreak. We adopted the standard non-linear least squares (NLS) approach for model fitting and parameter estimation. Thus, the real cumulative case number at time t, C(t), is assumed to follow a normal distribution with a mean of the reported cumulative number of cases and an unknown but constant variance [43, 44, 46, 47]. A P-value of < 0.05 is regarded as statistically significant, and the 95% confidence intervals (CI) for all unknown parameters are estimated.

The epidemic data in each state are fitted by all three growth models in Eqns (1–3) (see Fig. 1 for an illustration). We adopted the R² to measure the goodness-of-fit of each model. Since the models have different numbers of unknown parameters, the Akaike information criterion (AIC) was used to evaluate model performance in terms of the trade-off between the goodness-of-fit and the model complexity. For each state, the model with the lowest AIC value was chosen for further evaluation of its potential for real-time estimation.
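The two fit statistics can be computed from the residuals of a fitted curve. The sketch below is illustrative only: the source does not state which form of the AIC was used, so the common least-squares form n·ln(RSS/n) + 2k (defined up to an additive constant) is assumed here, and the function names are hypothetical.

```python
import math


def residual_sum_of_squares(observed, fitted):
    """Sum of squared differences between reported and fitted cumulative counts."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


def r_squared(observed, fitted):
    """Coefficient of determination, 1 - RSS / TSS."""
    mean_obs = sum(observed) / len(observed)
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual_sum_of_squares(observed, fitted) / tss


def aic_least_squares(observed, fitted, n_params):
    """Assumed AIC form for Gaussian least squares: n * ln(RSS / n) + 2 * k.

    Undefined for a perfect fit (RSS = 0), where the logarithm diverges.
    """
    n = len(observed)
    return n * math.log(residual_sum_of_squares(observed, fitted) / n) + 2 * n_params
```

Under this form, the logistic and Gompertz models would be scored with k = 3 and the Richards model with k = 4, so the extra parameter α has to reduce the residual sum of squares enough to offset a penalty of 2.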

Fig. 1 The illustration diagram of the modelling framework. The (solid and dashed) orange lines are the theoretical growth curves from the growth models in Eqns (1–3). The blue dots are the reported cumulative (cum.) number of cases. The blue shading area represents the time period when the disease notification is ongoing, which is also the time period for the model fitting

Reproduction number

The reproduction number, R, is the average number of secondary infectious cases produced by one infectious case during a disease outbreak [40, 45, 60]. When a population is totally (i.e. 100%) susceptible, R becomes the basic reproduction number, R0 [39, 61]. When a disease reaches a place (or region) for the first time, the estimated R can therefore be treated as R0. Following previous studies [38, 40, 60, 62], the reproduction number (R) is given in Eqn (4).

$$ R = \frac{1}{M\left( - \gamma \right)} = \frac{1}{\int_{0}^{\infty } e^{ - \gamma \kappa } h\left( \kappa \right)\,{\text{d}}\kappa }. \tag{4} $$

Here, γ is the intrinsic per capita growth rate from the growth model in Eqns (1–3), and κ is the serial interval of the ZIKV infection. The serial interval (i.e. generation interval) is the average time interval from the onset of one individual to the onset of another individual infected by him/her [44], or the time between successive cases in a chain of transmission [4, 38, 40, 62, 63]. The function h(∙) represents the probability distribution of the serial interval, κ. Hence, the function M(∙) is the Laplace transform of h(∙). Specifically, M(∙) is known to statisticians as the moment generating function (MGF) of a probability distribution.

According to previous work [4], we set h(κ) to be a gamma distribution with a mean of 20.0 days and a standard deviation (SD) of 7.4 days; the SDs of the mean and SD of κ are 2.3 and 1.3 days, respectively. Therefore, R can be estimated with the values of γ from models (1–3).
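For a gamma-distributed serial interval, Eqn (4) has a closed form, because the MGF of a gamma distribution with shape k and scale θ is (1 − θs)^(−k), which gives R = (1 + θγ)^k. The following Python sketch uses the mean and SD quoted above as defaults. The function names are illustrative, and γ is assumed to be expressed per day to match the serial interval (a rate fitted on a weekly time axis would first need to be divided by 7).

```python
import math


def gamma_shape_scale(mean, sd):
    """Convert the mean and SD of a gamma distribution to (shape, scale)."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def reproduction_number(growth_rate, mean_si=20.0, sd_si=7.4):
    """R = 1 / M(-gamma) for a gamma-distributed serial interval.

    growth_rate: intrinsic growth rate gamma, per day (assumed unit).
    mean_si, sd_si: serial-interval mean and SD in days.
    """
    shape, scale = gamma_shape_scale(mean_si, sd_si)
    return (1.0 + scale * growth_rate) ** shape
```

A growth rate of zero gives R = 1, and R increases with γ. The uncertainty in the serial interval itself (the SDs of 2.3 and 1.3 days mentioned above) is not propagated in this sketch.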

Projection of the epidemic and real-time estimation

In each state, we chose the model attaining the lowest AIC and simulated it into the future to estimate the final size (K) of the outbreak. To evaluate the real-time forecast power, we repeated the fitting procedure starting from the full epidemic wave to a sequence of waves with the end week discarded. We denoted the start, the end and the turning point of the outbreak by 0, T and τ, such that 0 < τ < T. In the fitting part, we used all the data from time 0 to T to fit the models and estimate parameters. For the real-time estimation, we used the data from time 0 to T1, where 0 < T1 < T, so that the model was fitted with an incomplete dataset. Starting with T1 = T, we decreased T1 gradually from time T backwards until the model fitting diverged. We compared the real-time estimated final size (K) and the K estimate based on the complete dataset, and stopped the real-time estimation when the yielded confidence interval was too wide (e.g. including zero) to be useful at the end of epidemic period, i.e. T. We checked the parameter estimates and compared them with the results from the full-data modelling to evaluate the sensitivity of the analysis as a measure of the real-time estimating potential. The (real-time) estimates with the lowest three T1s were compared to the estimation based on the full dataset.

The knowledge of the turning point (τ) is crucial for real-time projection [44,45,46], but this information is usually not precisely available. Alternatively, for Zika disease, one may gain knowledge of τ from the mosquito vectors’ activities. Since the mosquito abundance in Brazil decreases around May each year [64], we attempted to project the final size (K) with τ fixed at three dates prior to May, namely the first days of February, March and April 2016. Similarly, for the real-time projection, we used the data from time 0 to T1, where 0 < T1 < T, to train the model. The growth model was fitted with the dataset from 0 to T1, and used to project K in real-time. We compared the real-time projection of K with the estimates based on full data to measure the forecast power (performance) of the model.
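The two procedures above (refitting on data truncated at T1, with τ either estimated or fixed in advance) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses the logistic model only, replaces a general NLS routine with a grid search over (γ, τ) in which the least-squares K is solved in closed form, stops at a user-chosen minimum number of points rather than at a confidence-interval criterion, and runs on synthetic noise-free data with made-up parameter values.

```python
import math


def logistic_fraction(t, gamma, tau):
    """C(t) / K for the logistic model in Eqn (1)."""
    return 1.0 / (1.0 + math.exp(-gamma * (t - tau)))


def fit_logistic(times, cases, gammas, taus):
    """Least-squares fit by grid search; returns (K, gamma, tau).

    For each (gamma, tau) pair the best K has a closed form,
    because C(t) is linear in K.
    """
    best = None
    for gamma in gammas:
        for tau in taus:
            f = [logistic_fraction(t, gamma, tau) for t in times]
            K = sum(c * fi for c, fi in zip(cases, f)) / sum(fi * fi for fi in f)
            rss = sum((c - K * fi) ** 2 for c, fi in zip(cases, f))
            if best is None or rss < best[0]:
                best = (rss, K, gamma, tau)
    return best[1], best[2], best[3]


def real_time_estimates(times, cases, gammas, taus, min_points):
    """Refit on data truncated at each end time T1, from the full series backwards."""
    results = []
    for n in range(len(times), min_points - 1, -1):
        K, gamma, tau = fit_logistic(times[:n], cases[:n], gammas, taus)
        results.append((times[n - 1], K, gamma, tau))
    return results


# Synthetic, noise-free weekly series (illustrative values, not the Brazilian data).
TRUE_K, TRUE_GAMMA, TRUE_TAU = 10000.0, 0.5, 12.0
weeks = list(range(25))
cum_cases = [TRUE_K * logistic_fraction(t, TRUE_GAMMA, TRUE_TAU) for t in weeks]

gamma_grid = [0.05 * i for i in range(2, 21)]  # 0.10 to 1.00 per week
tau_grid = [0.5 * i for i in range(10, 41)]    # week 5 to week 20

# Turning point estimated from the data at every truncation.
free_tau = real_time_estimates(weeks, cum_cases, gamma_grid, tau_grid, min_points=14)
# Turning point fixed in advance (here deliberately set two weeks too early).
fixed_tau = real_time_estimates(weeks, cum_cases, gamma_grid, [10.0], min_points=14)
```

Each entry of `free_tau` and `fixed_tau` is a tuple (T1, K, γ, τ). Plotting K against T1 for both lists would give curves analogous in spirit to the real-time projections described above; with real, noisy counts a proper NLS routine with confidence intervals would be needed to reproduce the stopping rule.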


We fitted the three different models in Eqns (1–3) to the time series of reported ZIKV case numbers from eight states in Brazil from October 2015 to May 2016. Figure 2 shows that the selected models provide a good fit to the observations. All selected models achieved an R² larger than 0.995. For each state, we selected the fit with the lowest AIC value as the most suitable model (Table 1, Fig. 2). In particular, the Richards model is selected for Acre, Bahia, and Pernambuco; the Gompertz model is selected for Mato Grosso and Rio Grande; and the logistic model is selected for Espirito Santo, Goiania City and Parana. The estimates of the reproduction number, R, vary from 1.54 to 3.07 for the eight different states (Table 1). We estimate R = 1.54 (95% CI: 1.43–1.65) in Rio Grande, and R = 3.07 (95% CI: 2.92–3.24) in Goiania City. The estimated dates of the turning points also vary from January to April of 2016, with 4 out of 8 states in March 2016. For the same state, the final size estimates from different models are roughly consistent, with the 95% CIs largely overlapping. The estimated final (epidemic) sizes, K, are also summarized in Table 1. We estimate the largest final outbreak size of 55,472 (95% CI: 54,683–56,260) in Bahia for the outbreak since October 2015, after the one epidemic wave in early 2015 [4, 10].

Fig. 2 The fitting results of the ZIKV epidemics and the estimates of the reproduction number, R. The dots are the number of reported weekly ZIKV incidences, and the red curves are the fitted epidemic curves by the model with the lowest AIC (highlighted in red). The cyan diamond at the top-left corner of each panel is the reproduction number estimation, and the bar is the 95% CI

Table 1 Summary table of the model fitting and estimation results. The model results summarized here are estimated by using the full epidemic dataset during the whole epidemic period. The models with the lowest AICs (for the same states) are considered as main results, which match the results in Fig. 2. The numbers in parentheses are the 95% CIs

To evaluate the potential for real-time estimation, we shortened the fitting period starting from the end time of the epidemic reporting period, and further checked the sensitivity of the estimates of K and τ. The epidemic reporting period is the period between the dates on which the local authority starts and ends the reporting of ZIKV cases, which is different from the real epidemic period. The real epidemic period starts earlier than the actual reporting starting date. For each state, the model with the lowest value of AIC is selected here, and all AIC values are summarized in Table 1 and Fig. 2. Table 2 summarizes the real-time estimation from the selected models. Figure 3 shows the relationship between the estimates of the epidemic size (or final size) and the end time of model fitting. We summarized the estimates using the incomplete dataset and using the complete dataset in Table 2, where the final epidemic size estimates by using the complete dataset match the red dots in Fig. 3. The early estimates of the turning points (τ) and reproduction numbers (R) are almost the same as the final results. The real-time estimates of epidemic size, K, converge to the estimates by using the full dataset (the red dots in Fig. 3) when the end time of the fitting period (T1) is later than the turning point (τ), i.e. T1 > τ. The estimated epidemic sizes using the incomplete dataset are roughly consistent with the final estimates. Note that for a few states (e.g. Rio Grande), the estimated epidemic size is higher than the reported cumulative counts; this is because the outbreak continued after the end of the disease notification (reporting) period. The epidemic size, K, is the final outbreak size until the end of the epidemic. Moreover, for all states we find that the epidemic sizes estimated 6–35 days after the turning points do not differ from their final estimates, in the sense that the 95% CIs largely overlap (Table 2). This finding indicates that the final outbreak size (K) can be estimated around the epidemic peaking stage by the projections from the simple growth models. By fixing τ to be the first days of February, March and April of 2016, the K projection converges as more data are included in the model training (Fig. 4). When the assumed turning point becomes closer to the real turning point, the projection of K gains accuracy and converges faster, even during the early stage of the epidemics (i.e. before the occurrence of the real turning point).

Table 2 Summary table of the real-time estimation results from the selected models. The model with the lowest AIC (for the same states) is selected for analysis. The model results using the full epidemic dataset during the whole epidemic period match the models with the lowest AICs in Table 1. The numbers in parentheses are the 95% CIs

Fig. 3 The estimation of final size (K) with variable turning points from the selected growth model. In each panel, the horizontal axis is the end time of fitting, and the vertical axis is the final size, K, or the reported number of cumulative (cum.) counts, C(t), of ZIKV incidences. The vertical dashed blue line indicates the start time of the epidemic, which is also the start time of fitting. The vertical dashed black line indicates the end time of the epidemic, which is also the largest end time of fitting. The vertical purple line is the estimated turning point, τ, by using the full dataset, which matches the models with the lowest AICs in Tables 1 and 2. The cyan curve is the fitted cumulative epidemic curve, and the triangular dots are the reported number of cumulative ZIKV incidences. The red line is the estimated final size against the end time of fitting. The red dot at the end is the final size estimation by using the full dataset, which matches the models with the lowest AICs in Tables 1 and 2. The red shading area represents the 95% CI

Fig. 4 The estimation of final size (K) with fixed turning points. In each panel, the horizontal axis is the time since the start of the epidemic, which is also the end time (T1) of the dataset to train the growth model. The vertical axis is the projected final size, K. The vertical gray line is the estimated turning point, τ, by using the full dataset, which matches the models with the lowest AICs in Tables 1 and 2. The horizontal gray line is the estimated final size, K, by using the full dataset, which matches the models with the lowest AICs in Tables 1 and 2. The red curve is the real-time projection of K with τ fixed to be February 1, 2016 (vertical red dashed line). The blue curve is the real-time projection of K with τ fixed to be March 1, 2016 (vertical blue dashed line). The green curve is the real-time projection of K with τ fixed to be April 1, 2016 (vertical green dashed line). The shading area represents the 95% CI


We used simple non-linear growth models to study the temporal patterns of ZIKV epidemics in eight Brazilian states. We showed that three simple growth models can be adapted to model the ZIKV outbreak, with an R² larger than 0.995 for all selected models. The estimated dates of the turning points varied from January to April of 2016 for the eight states. The difference in the turning points indicates spatial heterogeneity in the timing of the outbreak. We found that four out of the eight states (i.e. 50%) had a turning point in March 2016, which matches the epidemic peaking time of the whole of Brazil in 2016 [4, 25]. It is interesting to note that the earliest turning point was estimated as January 10 (95% CI: January 06–January 14), 2016 in Mato Grosso state, around which time the epidemic started to be reported in the neighboring state of Parana in epidemiological week (EW) 2 of 2016. We suspect that the reporting of this cluster of cases could have been triggered by the turning point, which is in line with the findings in [44]. The timing of the turning points, i.e. the duration from the epidemic reporting start to the turning point (the “turning point” column in Table 1), was also found to be remarkably different among states. This is different from the previous work for the six archipelagos in French Polynesia [44], which could be due to the large differences in the ZIKV epidemic reporting periods of the states in Brazil. The local conditions, e.g. demographic factors, public health policies, seasonality including meteorological factors and other factors affecting mosquito activities, probably varied in different states. Hence, the growth structure of the epidemic curve could be affected, and thus the turning points are likely to appear heterogeneously across states. To further evaluate the timeliness of the local ZIKV notification, we checked whether the turning point appeared in the latter half of the whole reporting period. This can be quantified simply by calculating the ratio of the “turning point” over the “duration” in Table 1 and comparing it with 0.5. Most of the states had a turning point in the latter half of the outbreak reporting period, except for the state of Espirito Santo. In Espirito Santo, the turning point (τ) fell significantly within the first half of the epidemic period (T), i.e. τ < T/2, which differs from the other states and makes it an outlier.

The disease infectivity is measured by the reproduction number, R, during an outbreak (Table 1). The estimated R-values are significantly less than 2 (with 95% CIs lower than 2) in the four states of Bahia, Mato Grosso, Pernambuco and Rio Grande. The R-values are significantly larger than 2 in the four states of Acre, Espirito Santo, Goiania City and Parana. In the states of Bahia, Mato Grosso and Pernambuco, ZIKV cases had been confirmed since early 2015 [4]. Thus, the lower R-values are likely to be due to the depletion of the susceptible population during the earlier outbreaks. In the state of Rio Grande, one possible explanation for the lowest R = 1.54 (95% CI: 1.43–1.65) is the relatively lower air temperature compared with most other places in Brazil. The average temperature starts to drop below 20 °C from March every year, during which the mosquito vector abundance is almost zero [65]. For the four states of Acre, Espirito Santo, Goiania City and Parana, ZIKV was not reported before October 2015 [4], and thus the R-values can also be treated as estimates of the basic reproduction number, R0. Hence, we speculate that the R0 of ZIKV ranges from 2.07 to 3.41, taking directly the range of the 95% CIs of the R-values in the states of Acre, Espirito Santo, Goiania City and Parana. The average value of R0 was 2.77. This average value and range of R0 are consistent with previous ZIKV studies for Brazil [4, 12, 14].

To evaluate the power of the real-time estimation, the selected models were repeatedly fitted with the fitting period shortened from the end time of the reporting period, and thus we could further check the sensitivity of the estimates of K and τ (Table 2). We report a converging real-time estimation of the final epidemic size starting on or after the turning date. The estimation of the turning points is stable and consistent with the final estimation. Moreover, for all states, the epidemic sizes estimated 6–35 days after the turning points are virtually no different from their final estimates (i.e. estimates of K by using the full dataset). These findings reveal the real-time estimating potential of the simple growth models proposed in this study. The final epidemic size (K) can be predicted at or after the peaking time of the epidemic. The early prediction of the final outbreak size (K) was found to depend on the timing of the turning point (τ), as shown in Fig. 3. Although projecting the temporal trends of an outbreak from an early-stage incomplete dataset could sometimes be misleading, we note that the predictions are reasonably accurate when the fitting dataset covers the turning point, which is in line with the findings in [46, 66]. With data coming in from an ongoing outbreak, the performance of the models used in this work will continuously improve, so real-time estimates of key epidemiological parameters may be available before the epidemic fully ends.

Since a prediction should be made before the occurrence of an event, we note that forecasting the turning point is difficult to achieve with the simple models, which was also reported in previous Zika and dengue modelling literature [44, 45]. Nevertheless, we highlight the importance of a successful turning point forecast in the prediction of other epidemiological parameters. Our findings suggest that the projection of the epidemic final size (K) converges once the data extend slightly beyond the turning point. In other words, once the turning point is known, the real-time estimation can be largely improved and converges quickly. To estimate the turning point (τ), we may rely not only on surveillance case data but also take into account practical knowledge and factors that affect the disease transmission. For instance, the Zika fever in this work is a disease whose transmission depends largely on the activity of mosquitoes, which has strong seasonality. Local mosquito abundance drops to a sufficiently low level from May each year [64], which could largely reduce the ZIKV spread. Hence, τ is probably before May 2016. By fixing τ on the first day of February, March and April 2016, the K projection converges as more data are included in the training of the model (Fig. 4). When the assumed turning point approaches the real turning point, the projection of K approaches the estimate based on the full data and converges faster, even during the early stage of the epidemics, i.e. before the arrival of the real turning point.

Besides the three models adopted in this study, there are other well-known non-linear growth models that have not been adopted. These unselected models include the four-parameter logistic, five-parameter logistic, Weibull and Sigmoid Emax models. One of the facts of the S-shaped epidemic curve is that the growth starts from level zero. The Weibull and Sigmoid Emax models are more likely to yield inferior fitting performance with a zero lower asymptote (or bound). Besides, the Sigmoid Emax model does not contain an intrinsic growth term. Thus, these two models are less popular in studying epidemic curves than the three models in Eqns (1–3). The four-parameter logistic model is equivalent to the three-parameter version in Eqn (1) when the lower asymptote becomes zero. Although the five-parameter logistic model adds an asymmetry factor (to control the asymmetry) to the four-parameter version [67], it still has the non-zero lower asymptote problem [68]. In addition, the five-parameter logistic model could also be over-sensitive for the early-stage prediction [46]. These shortcomings make it less practical than the Richards model in studying the epidemic curve. Therefore, we only adopt the three growth models in Eqns (1–3).

This work has some limitations. The analyses are highly reliant on the quality of the epidemic data, reporting delay and the change of reporting criteria. Since the local ZIKV surveillance is more reliable after the end of 2015 [26], we modelled the single-wave outbreaks on or after October 2015 and avoided including the dataset from early 2015. Due to cross-reactivity with other flaviviruses (e.g. dengue virus, yellow fever virus, West Nile virus, etc.) [69], the serological diagnosis of ZIKV infection is less effective than the RT-PCR diagnosis. However, the time window for positive RT-PCR viremia is relatively short, roughly three to seven days; thus a suspected ZIKV case should not be regarded as a negative case, and requires IgM tests for further confirmation [69]. Therefore, to avoid excluding the positive ZIKV cases in the suspected group, we considered the sum of suspected and confirmed cases as the incidence count for analysis. If the reporting delays, dates of onset, or the reporting rate are known, a more realistic and comprehensive analysis can be performed that includes more accurate epidemic data and information. In this ideal situation, although our simple non-linear models would be less attractive, they could still be used as the baseline framework for more advanced analysis, and to estimate the turning points.


In this study, we analyzed the temporal patterns of ZIKV epidemics in Brazil by using simple non-linear growth models. The average value of R0 was estimated to be 2.77 and varied from 2.07 to 3.41 in different states. We found spatial heterogeneity in the epidemiological features among the eight states. We propose a real-time estimation framework and demonstrate that it is able to yield a reliable real-time prediction of the final epidemic size. With precise knowledge of the turning point, the real-time projection of the final size is likely to be more accurate, even during the early stage of epidemics. Our modelling framework may be extended to study epidemics of other infectious diseases, and is easy to implement for practical purposes.

Availability of data and materials

All data used for analysis are freely available in the supplementary materials of Ferguson et al. [4].



Abbreviations

ZIKV: Zika virus; CI: confidence interval; EW: epidemiological week; MGF: moment generating function; SD: standard deviation; AIC: Akaike information criterion; NLS: non-linear least squares; GBS: Guillain-Barré syndrome


  1. Dick G, Kitchen S, Haddow A. Zika virus (I). Isolations and serological specificity. Trans R Soc Trop Med Hyg. 1952;46:509–20.

  2. Moore D, Causey O, Carey D, Reddy S, Cooke A, Akinkugbe F, et al. Arthropod-borne viral infections of man in Nigeria, 1964–1970. Ann Trop Med Parasitol. 1975;69:49–64.

  3. Wikan N, Smith DR. Zika virus: history of a newly emerging arbovirus. Lancet Infect Dis. 2016;16:e119–26.

  4. Ferguson NM, Cucunubá ZM, Dorigatti I, Nedjati-Gilani GL, Donnelly CA, Basáñez M-G, et al. Countering the Zika epidemic in Latin America. Science. 2016;353:353–4.

  5. Mlakar J, Korva M, Tul N, Popović M, Poljšak-Prijatelj M, Mraz J, et al. Zika virus associated with microcephaly. N Engl J Med. 2016;374:951–8.

  6. Monaghan AJ, Morin CW, Steinhoff DF, Wilhelmi O, Hayden M, Quattrochi DA, et al. On the seasonal occurrence and abundance of the Zika virus vector mosquito Aedes aegypti in the contiguous United States. PLoS Curr. 2016.

  7. Petersen LR, Jamieson DJ, Powers AM, Honein MA. Zika virus. N Engl J Med. 2016;374:1552–63.

  8. Duffy MR, Chen T-H, Hancock WT, Powers AM, Kool JL, Lanciotti RS, et al. Zika virus outbreak on Yap Island, federated states of Micronesia. N Engl J Med. 2009;360:2536–43.

  9. Aubry M, Teissier A, Huart M, Merceron S, Vanhomwegen J, Roche C, et al. Zika virus seroprevalence, French Polynesia, 2014–2015. Emerg Infect Dis. 2017;23:669.

  10. He D, Gao D, Lou Y, Zhao S, Ruan S. A comparison study of Zika virus outbreaks in French Polynesia, Colombia and the State of Bahia in Brazil. Sci Rep. 2017;7:273.

  11. Campos GS, Bandeira AC, Sardi SI. Zika virus outbreak, Bahia, Brazil. Emerg Infect Dis. 2015;21:1885.

  12. Gao D, Lou Y, He D, Porco TC, Kuang Y, Chowell G, et al. Prevention and control of Zika as a mosquito-borne and sexually transmitted disease: a mathematical modeling analysis. Sci Rep. 2016;6:28070.

  13. Ikejezie J, Shapiro CN, Kim J, Chiu M, Almiron M, Ugarte C, et al. Zika virus transmission—region of the Americas, May 15, 2015–December 15, 2016. MMWR Morb Mortal Wkly Rep. 2017;66:329.

  14. Zhang Q, Sun K, Chinazzi M, Pastore y Piontti A, Dean NE, Rojas DP, et al. Spread of Zika virus in the Americas. Proc Natl Acad Sci USA. 2017;114:E4334–43.

  15. Towers S, Brauer F, Castillo-Chavez C, Falconar AK, Mubayi A, Romero-Vivas CM. Estimate of the reproduction number of the 2015 Zika virus outbreak in Barranquilla, Colombia, and estimation of the relative role of sexual transmission. Epidemics. 2016;17:50–5.

  16. Atkinson B, Hearn P, Afrough B, Lumley S, Carter D, Aarons EJ, et al. Detection of Zika virus in semen. Emerg Infect Dis. 2016;22:940.

  17. WHO. Zika virus. Geneva: World Health Organization; 2019. Accessed 1 Apr 2019.

  18. Cao-Lormeau V-M, Blake A, Mons S, Lastère S, Roche C, Vanhomwegen J, et al. Guillain-Barré syndrome outbreak associated with Zika virus infection in French Polynesia: a case–control study. Lancet. 2016;387:1531–9.

  19. de Paula Freitas B, de Oliveira Dias JR, Prazeres J, Sacramento GA, Ko AI, Maia M, et al. Ocular findings in infants with microcephaly associated with presumed Zika virus congenital infection in Salvador, Brazil. JAMA Ophthalmol. 2016;134:529–35.

  20. Plourde AR, Bloch EM. A literature review of Zika virus. Emerg Infect Dis. 2016;22:1185.

  21. dos Santos T, Rodriguez A, Almiron M, Sanhueza A, Ramon P, de Oliveira WK, et al. Zika virus and the Guillain-Barré syndrome—case series from seven countries. N Engl J Med. 2016;375:1598–601.

  22. Cauchemez S, Besnard M, Bompard P, Dub T, Guillemette-Artur P, Eyrolle-Guignot D, et al. Association between Zika virus and microcephaly in French Polynesia, 2013–15: a retrospective study. Lancet. 2016;387:2125–32.

  23. Brasil P, Pereira JP Jr, Moreira ME, Ribeiro Nogueira RM, Damasceno L, Wakimoto M, et al. Zika virus infection in pregnant women in Rio de Janeiro. N Engl J Med. 2016;375:2321–34.

  24. de Oliveira WK, Carmo EH, Henriques CM, Coelho G, Vazquez E, Cortez-Escalante J, et al. Zika virus infection and associated neurologic disorders in Brazil. N Engl J Med. 2017;376:1591–3.

  25. Pan American Health Organization (PAHO), World Health Organization (WHO). Zika—Epidemiological report Brazil; 2017. 2019.

  26. The Reuters, The News press entitled “Exclusive: Brazil says Zika virus outbreak worse than believed”, 2016. 2019.

  27. Lourenço J, de Lima MM, Faria NR, Walker A, Kraemer MU, Villabona-Arenas CJ, et al. Epidemiological and ecological determinants of Zika virus transmission in an urban setting. Elife. 2017;6:e29820.

  28. de Oliveira WK. Increase in reported prevalence of microcephaly in infants born to women living in areas with confirmed Zika virus transmission during the first trimester of pregnancy—Brazil, 2015. MMWR Morb Mortal Wkly Rep. 2016;65:242–7.

  29. van der Linden V. Description of 13 infants born during October 2015–January 2016 with congenital Zika virus infection without microcephaly at birth—Brazil. MMWR Morb Mortal Wkly Rep. 2016;65:1343–8.

  30. de Oliveira WK, de França GVA, Carmo EH, Duncan BB, de Souza Kuchenbecker R, Schmidt MI. Infection-related microcephaly after the 2015 and 2016 Zika virus outbreaks in Brazil: a surveillance-based analysis. Lancet. 2017;390:861–70.

  31. Hollingsworth TD, Pulliam JR, Funk S, Truscott JE, Isham V, Lloyd AL. Seven challenges for modelling indirect transmission: vector-borne diseases, macroparasites and neglected tropical diseases. Epidemics. 2015;10:16–20.

  32. Earn DJ, Brauer F, van den Driessche P, Wu J. Mathematical epidemiology. Berlin: Springer; 2008.

  33. Brauer F, Castillo-Chavez C. Mathematical models in population biology and epidemiology, vol. 40. Berlin: Springer; 2012.

  34. Keeling MJ, Rohani P. Modeling infectious diseases in humans and animals. Princeton: Princeton University Press; 2011.

  35. Riley S, Fraser C, Donnelly CA, Ghani AC, Abu-Raddad LJ, Hedley AJ, et al. Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions. Science. 2003;300:1961–6.

  36. Zhao S, Stone L, Gao D, He D. Modelling the large-scale yellow fever outbreak in Luanda, Angola, and the impact of vaccination. PLoS Negl Trop Dis. 2018;12:e0006158.

  37. Lin Q, Chiu AP, Zhao S, He D. Modeling the spread of Middle East respiratory syndrome coronavirus in Saudi Arabia. Stat Methods Med Res. 2018;27:1968–78.

  38. Fraser C. Estimating individual and household reproduction numbers in an emerging epidemic. PLoS ONE. 2007;2:e758.

  39. Van den Driessche P, Watmough J. Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math Biosci. 2002;180:29–48.

  40. Wallinga J, Lipsitch M. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc Biol Sci. 2006;274:599–604.

  41. Arino J, Brauer F, Van Den Driessche P, Watmough J, Wu J. A final size relation for epidemic models. Math Biosci Eng. 2007;4:159.

  42. Ma J, Earn DJ. Generality of the final size formula for an epidemic of a newly invading infectious disease. Bull Math Biol. 2006;68:679–702.

  43. Hsieh Y-H. Richards model: a simple procedure for real-time prediction of outbreak severity. In: Ma Z, Zhou Y, Wu J, editors. Modeling and dynamics of infectious diseases. Singapore: World Scientific; 2009. p. 216–36.

  44. Hsieh Y-H. Temporal patterns and geographic heterogeneity of Zika virus (ZIKV) outbreaks in French Polynesia and Central America. PeerJ. 2017;5:e3015.

  45. Hsieh Y-H, Ma S. Intervention measures, turning point, and reproduction number for dengue, Singapore, 2005. Am J Trop Med Hyg. 2009;80:66–71.

  46. Sebrango-Rodríguez CR, Martínez-Bello DA, Sánchez-Valdés L, Thilakarathne PJ, Del Fava E, Van Der Stuyft P, et al. Real-time parameter estimation of Zika outbreaks using model averaging. Epidemiol Infect. 2017;145:2313–23.

  47. Zhou G, Yan G. Severe acute respiratory syndrome epidemic in Asia. Emerg Infect Dis. 2003;9:1608–10.

  48. Funk S, Camacho A, Kucharski AJ, Eggo RM, Edmunds WJ. Real-time forecasting of infectious disease dynamics with a stochastic semi-mechanistic model. Epidemics. 2018;22:56–61.

  49. Tizzoni M, Bajardi P, Poletto C, Ramasco JJ, Balcan D, Gonçalves B, et al. Real-time numerical forecast of global epidemic spreading: case study of 2009 A/H1N1pdm. BMC Med. 2012;10:165.

  50. Yang W, Cowling BJ, Lau EH, Shaman J. Forecasting influenza epidemics in Hong Kong. PLoS Comput Biol. 2015;11:e1004383.

  51. Hsieh Y-H, Cheng Y-S. Real-time forecast of multiphase outbreak. Emerg Infect Dis. 2006;12:122.

  52. Nishiura H, Chowell G, Safan M, Castillo-Chavez C. Pros and cons of estimating the reproduction number from early epidemic growth rate of influenza A (H1N1) 2009. Theor Biol Med Model. 2010;7:1.

  53. Liao JJ, Liu R. Re-parameterization of five-parameter logistic function. J Chemom. 2009;23:248–53.

  54. WHO. The interim case definition of Zika virus disease. Geneva: World Health Organization. 2019; 2019.

  55. Brazilian Institute of Geography and Statistics. The Resident population figures sent to the Brazilian Court of Audit from 2001 to 2015; 2019. Accessed 1 Apr 2019.

  56. Verhulst P. La loi d’accroissement de la population. Nouv Mem Acad Roy Soc Belle-lettr Bruxelles. 1845;18:1–38.

  57. Gompertz B. XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. FRS &c. Philos Trans R Soc Lond. 1825;115:513–83.

  58. Richards F. A flexible growth function for empirical use. J Exp Bot. 1959;10:290–301.

  59. Tsoularis A, Wallace J. Analysis of logistic growth models. Math Biosci. 2002;179:21–55.

  60. Ma J, Dushoff J, Bolker BM, Earn DJ. Estimating initial epidemic growth rates. Bull Math Biol. 2014;76:245–60.

  61. Fraser C, Donnelly CA, Cauchemez S, Hanage WP, Van Kerkhove MD, Hollingsworth TD, et al. Pandemic potential of a strain of influenza A (H1N1): early findings. Science. 2009;324:1557–61.

  62. Wallinga J, Teunis P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. Am J Epidemiol. 2004;160:509–16.

  63. Fine PE. The interval between successive cases of an infectious disease. Am J Epidemiol. 2003;158:1039–47.

  64. Paes de Andrade P, Aragão FJL, Colli W, Dellagostin OA, Finardi-Filho F, Hirata MH, et al. Use of transgenic Aedes aegypti in Brazil: risk perception and assessment. Bull World Health Organ. 2016;94:766–71.

  65. Beck-Johnson LM, Nelson WA, Paaijmans KP, Read AF, Thomas MB, Bjørnstad ON. The effect of temperature on Anopheles mosquito population dynamics and the potential for malaria transmission. PLoS ONE. 2013;8:e79276.

  66. Hsieh Y-H, Lee J-Y, Chang H-L. SARS epidemiology modeling. Emerg Infect Dis. 2004;10:1165.

  67. Gottschalk PG, Dunn JR. The five-parameter logistic: a characterization and comparison with the four-parameter logistic. Anal Biochem. 2005;343:54–65.

  68. Rozema E. Epidemic models for SARS and measles. College Math J. 2007;38:246–59.

  69. WHO. Laboratory testing for Zika virus infection: interim guidance. Geneva: World Health Organization; 2019. Accessed 1 Apr 2019.


Not applicable.


The work described in this paper was partly supported by a grant from The Hong Kong Polytechnic University (Project no.: 1-ZE8J). The funding agencies had no role in the design and conduct of the study; collection, management, analysis, or interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Author information



SZ conceived and carried out the study, and drafted the first manuscript. SZ and DH discussed the results. SZ, SSM, HF, DH and JQ revised the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Shi Zhao or Daihai He.

Ethics declarations

Ethics approval and consent to participate

Since no personal data were collected, ethical approval or individual consent was not required.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1: Text S1.

The difference between the three growth models.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Zhao, S., Musa, S.S., Fu, H. et al. Simple framework for real-time forecast in a data-limited situation: the Zika virus (ZIKV) outbreaks in Brazil from 2015 to 2016 as an example. Parasites Vectors 12, 344 (2019).
