New genotype invasion of dengue virus serotype 1 drove massive outbreak in Guangzhou, China

Background: Dengue fever is a mosquito-borne infectious disease that has caused major health problems. Variations in dengue virus (DENV) genes are important features of epidemic outbreaks. However, the associations of DENV genes with epidemic potential have not been extensively examined. Here, we assessed new genotype invasion of DENV-1 isolated from Guangzhou in China to evaluate associations with epidemic outbreaks.
Methodology/principal findings: We used DENV-1 strains isolated from sera of dengue cases from 2002 to 2016 in Guangzhou for complete genome sequencing. A neighbor-joining phylogenetic tree was constructed to elucidate the genotype characteristics and determine if new genotype invasion was correlated with major outbreaks. In our study, a new genotype invasion event was observed during each significant outbreak period in 2002–2003, 2006–2007, and 2013–2014. Genotype II was the main epidemic genotype in 2003 and before. Invasion of genotype I in 2006 caused an unusual outbreak with 765 cases (relative risk [RR] = 16.24, 95% confidence interval [CI] 12.41–21.25). At the middle and late stages of the 2013 outbreak, genotype III was introduced to Guangzhou as a new genotype invasion responsible for 37,340 cases with RR 541.73 (95% CI 417.78–702.45), after which genotypes I and III began co-circulating. Base mutations occurred after new genotype invasion, and the gene sequence of NS3 protein had the lowest average similarity ratio (99.82%), followed by the gene sequence of E protein (99.86%), as compared to the 2013 strain.
Conclusions/significance: Genotype replacement and co-circulation of multiple DENV-1 genotypes were observed. New genotype invasion was highly correlated with local unusual outbreaks. In addition to DENV-1 genotype I in the unprecedented outbreak in 2014, new genotype invasion by DENV-1 genotype III occurred in Guangzhou.

to the accelerating expansion of DF in affected geographic regions worldwide [4,5]. However, the general roll-out of the only currently licensed vaccine, CYD-TDV (Dengvaxia; Sanofi Pasteur, Lyon, France) is limited because pre-screening for dengue virus serostatus is required, as safety issues associated with increased hospitalization risk may occur for individuals who have never been infected with dengue before [6][7][8], and no specific interventions to treat the disease have been established.
Dengue virus (DENV), a single-stranded positive-sense RNA virus with an 11-kb genome, contains an open reading frame encoding three structural proteins (C, prM/M, and E) and seven nonstructural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5). The virus can be further classified into four serotypes according to their distinctive antigenicity, i.e., DENV-1-4, and there are diverse genotypes within each serotype. The sequence of each DENV genotype is not always fixed, and frequent variations, recombinations, and lineage turnover or replacement may occur because of selection pressure [9].
In recent decades, DF has become a major threat to public health in Southern China [10]. Frequent and large-scale outbreaks have posed a huge disease burden; in particular, serious outbreaks dominated by DENV-1 have occurred in the last 20 years. Guangzhou has become the most heavily affected area in China since the 1990s; the number of reported cases during 2005-2017 accounted for 60.86% of all cases nationally, reaching 81% in 2014 alone [11][12][13]. Meanwhile, the 2014 large-scale DF outbreak in Guangzhou also resulted in exportation of cases to other countries, and was thought to have triggered the Tokyo outbreak in the same year [14]. Outbreaks with the highest incidence rates were all caused by DENV-1, leading to extensive studies of associated factors, such as climate, mosquito density, control measures, and virus serotypes [15][16][17][18]. Despite this, numerous scientific issues related to large-scale outbreaks remain unresolved, particularly the cause of the massive outbreak in Guangzhou in 2013 and 2014.
Genetic variations in DENV are important factors contributing to the severity of epidemics. However, the associations of such genetic variations with epidemic scale have not been thoroughly examined [2,19]. Virus variations and resulting invasion by new genotypes may be key factors driving large-scale outbreaks and the development of severe symptoms and death when other confounding factors, such as changes in mosquito vectors, climate, imported cases, tourism, and trade, are not altered [20]. Previous studies have mostly focused on E gene and prM/M gene fragments to analyze genetic variations in DENV, and few studies have reported analysis of complete genome sequences. Analysis based on the E gene indicated that DENV genes exhibit diverse lineages and geographical distributions. Different serotypes comprise various subgroups called genotypes, which can differ in virus virulence and transmission rate [21]. DENV-1 has five different genotypes (I-V), among which genotypes I, IV, and V are still prevalent, whereas genotypes II and III appear to have become dormant. Phylogenetic analyses have shown that different virus isolates with different lineage features nonetheless clustered within the same genotype [22]. The practicality and development of high-throughput whole-genome sequencing and deep sequencing have enabled application of new analytical methods. A recent complete genome sequence analysis revealed that DENV-1 could be classified into three genotypes (I, II, and III), in contrast to the results of previous genotyping based on the E gene [23]. Lee et al. also reported that DENV-1 could be classified into three genotypes by complete genome sequencing [24] in an investigation of a historically large-scale outbreak in Singapore during 2013 and 2014. Their findings showed that the outbreak was related to the introduction of a new genotype III; however, the roles of different genotypes in driving outbreaks have not been sufficiently evaluated.
There are three scenarios under which new genotype invasion may occur in a specific area. First, genetic variation worldwide produces new genotypes. Second, genotypes that are prevalent in local areas are transmitted to areas in which there are no such genotypes. Third, genotypes that have been silent for years suddenly emerge. Dengue in China is still considered an imported disease that can trigger local transmission [25]. Guangzhou has emerged as an important hotspot, with the number of reported cases exceeding half those recorded nationwide. Currently, Guangzhou is still regarded as a nonendemic area for dengue, a view supported by numerous studies of mathematical and statistical modeling, virus evolution, and epidemiology [15,24,25]. In mainland China, particularly Guangzhou, many studies have evaluated serotyping and genotyping based on the E gene. However, sequence analysis based on the complete genome has not been performed in depth, and efforts to elucidate the impact of introduction of new genotypes on outbreaks have been insufficient.
Because DENV-1 is a representative serotype causing dengue outbreaks in Guangzhou, in this study, we analyzed new genotype invasion and the capacity of different genotypes to drive outbreaks by phylogenetic analysis based on DENV-1 complete genome sequences. We also explored the hypothesis of new genotype invasion as the main driver in major outbreak years. Our findings are expected to provide insights into viral evolutionary dynamics and the potential causes of massive outbreaks, which will help improve prevention and control measures for dengue.

Data sources and case investigations
The 2001-2005 case data were obtained from the archives of the Guangzhou Center for Disease Control and Prevention (GZCDC), including case questionnaires, epidemiological survey reports, phase analysis reports, and summaries. The 2006-2016 case information was extracted from the National Notifiable Infectious Diseases Reporting Information System of China. Once medical institutes reported suspected cases through the system, the local district CDC staff conducted face-to-face case investigations to collect data on demographics, disease information, clinical manifestations, and travel history (domestic and international). Serum samples were also collected at this time.

Specimen collection
DENV strains were isolated from serum specimens of reported DF cases and preserved by the GZCDC and Sun Yat-sen University (SYSU). Blood samples (3-5 mL) from the acute phase (within 6 days after the date of onset) were collected from consenting patients by a nurse at the visiting medical institute or field CDC staff and then separated to obtain serum. Sera were stored at -80 °C until processing.

Serotyping and virus isolation
We followed serotyping and virus isolation assay protocols recommended by the World Health Organization [26]. Briefly, total viral RNA was extracted from all serum samples using a QIAamp viral RNA mini kit (Qiagen, Germany). Next, a TaqMan probe-based real-time polymerase chain reaction (PCR) protocol was employed to determine the serotype. Positive sera were diluted 10-fold with the sample treatment solution at 4 °C for 2 h before being inoculated onto an Aedes albopictus mosquito (C6/36) cell line for virus isolation. Cultures were passaged no more than three times. The experiments were conducted in laboratories at GZCDC and by our collaborator SYSU in Guangzhou.

DENV complete genome sequencing
We selected representative virus strains and variants for complete genome sequencing. Specific primers for amplification and sequencing are listed in Additional file 1 [27], and primer synthesis was performed by a qualified third-party biotech company. The PCR product was purified using a QIAquick gel extraction kit (Qiagen) according to the manufacturer's instructions and then sent to a qualified third-party biotech company for sequencing using an ABI 3730xl DNA analyzer platform (Applied Biosystems, CA, USA). Nucleotide sequences were initially assembled with Lasergene software (version 8.0; DNASTAR Inc., Madison, WI, USA), and continuous sequences were aligned using BioEdit software (version 7.0.5). In addition, selected strains were transferred to a third-party biotech company (BGI, China) for complete genome sequencing by next-generation sequencing.

Phylogenetic analysis
DENV-1 reference strains were downloaded from GenBank, and those with a length greater than 10,000 bp and associated metadata containing the year and location the isolate was sampled were included for phylogenetic analysis. Included samples were from different countries and regions worldwide, and the isolation year varied from 1944 to 2016. During complete genome sequence analysis, sequences identified from the same country in the same year with an evolutionary distance of zero were excluded through multiple sequence alignment.
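The inclusion rule described above can be sketched as a simple filter. This is a minimal illustration and not the pipeline actually used in the study; the `Record` fields and the example entries (`REF_A` to `REF_E`) are hypothetical placeholders, not real GenBank records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    accession: str          # hypothetical identifier, not a real GenBank entry
    length: int             # sequence length in bp
    year: Optional[int]     # sampling year, None if missing
    country: Optional[str]  # sampling location, None if missing


def eligible(rec: Record, min_length: int = 10_000) -> bool:
    """Keep sequences longer than min_length that carry year and location metadata."""
    return rec.length > min_length and rec.year is not None and bool(rec.country)


# Hypothetical example records
records = [
    Record("REF_A", 10735, 2013, "China"),      # kept
    Record("REF_B", 1485, 2014, "China"),       # too short (e.g. E gene only)
    Record("REF_C", 10700, None, "India"),      # missing year
    Record("REF_D", 10690, 2009, None),         # missing location
    Record("REF_E", 10000, 2005, "Singapore"),  # not greater than 10,000 bp
]

kept = [r.accession for r in records if eligible(r)]
print(kept)
```

Removal of same-country, same-year sequences with zero evolutionary distance would be a further step applied after alignment and is not shown here.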
All complete genome sequences were subjected to multiple sequence alignment using MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/). We then performed sequence similarity analysis with MEGA 7.0 software (https://www.megasoftware.net/) and constructed neighbor-joining phylogenetic trees using Kimura's two-parameter model. The robustness of nodes was assessed with 1000 bootstrap replicates. All reference strains were labeled with name, year of isolation, location of isolation, and GenBank accession number. For strains other than Chinese strains, only one reference strain was randomly selected for phylogenetic analysis when multiple strains were isolated from the same year and the same country and when their evolutionary distance was zero after multiple sequence alignment by the K80 model.
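For readers unfamiliar with Kimura's two-parameter (K80) model, the pairwise distance it defines can be sketched as follows. With P the proportion of transitional differences and Q the proportion of transversional differences between two aligned sequences, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function below is an illustrative sketch only (the study used MEGA); the choice to skip sites with gaps or ambiguous bases is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Sites where either sequence has a gap or a non-ACGT character are skipped
    (an assumption made for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # Undefined (math domain error) for extremely divergent sequence pairs.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: P = 0.1, Q = 0
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A distance of zero under this model corresponds to two sequences with no differences at any compared site, which is the criterion used above to collapse redundant reference strains.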

Association of new genotype invasion with epidemic scale in major outbreak years
In this study, an outbreak year was defined as a year in which the annual number of local cases exceeded 500. Therefore, the three outbreak periods (2002-2003, 2006-2007, and 2013-2014) were considered distinct outbreak periods in the analysis. Here, we investigated the association of new genotype invasion with incidence rate by including new genotype as the study factor and the incidence rate in each outbreak year as the dependent variable, while using the average incidence rate during study years (median incidence rate) as the control. The incidence rates were compared using relative risks (RRs) and 95% confidence intervals (95% CIs) to determine whether the effects of genotype were statistically significant.
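The RR comparison described above can be illustrated with a short calculation. The text does not state how the confidence intervals were obtained; the sketch below assumes the common log-transformation (Katz) method, and the case counts and population sizes are hypothetical, not data from this study.

```python
import math


def relative_risk(cases_study, pop_study, cases_control, pop_control, z=1.96):
    """Relative risk with a Wald-type CI on the log scale (assumed method).

    Returns (rr, lower, upper). z = 1.96 gives an approximate 95% interval.
    """
    risk_study = cases_study / pop_study
    risk_control = cases_control / pop_control
    rr = risk_study / risk_control
    # Standard error of ln(RR) for two independent cumulative incidences
    se_log_rr = math.sqrt(
        1 / cases_study - 1 / pop_study + 1 / cases_control - 1 / pop_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 50 cases per 10,000 people vs. 10 cases per 10,000 people
rr, lower, upper = relative_risk(50, 10_000, 10, 10_000)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 indicates a statistically significant difference in incidence between the study and control periods at the chosen level.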

Comparison of the capacity of different genotypes to drive DF outbreaks during the same epidemic period
Because 2013 and 2014 were the most important years for DF outbreaks in Guangzhou, we evaluated the ability of genotype III to drive more severe infections in comparable communities during these two periods. Comparable communities that shared genotype III or genotype I outbreaks were selected based on similar sizes of permanent resident populations and local environment types. We then calculated RRs and 95% CIs to determine the capacity of different genotypes to drive the outbreak during the same epidemic period.

Association of different genotypes with outbreaks during 2002-2016
In order to explore differences in the capacities of various genotypes to drive outbreaks in different years, we employed univariate and multivariate linear regression models. Community incidence caused by different genotypes was used as the dependent variable, genotype was used as the independent variable, and population density was used as the adjusting variable. The models were tested by analysis of variance with regression coefficients confirmed by t tests, and factors with P values of less than 0.05 were retained in the final model.
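The model structure described above (incidence regressed on genotype, adjusted for population density) can be sketched with dummy coding and ordinary least squares. This is an illustration only: the study used R, the reference level (genotype I) follows the Results section, the data below are invented and noise-free so that the fitted coefficients are easy to verify, and significance testing is not reproduced.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical communities: (genotype, population density, incidence)
data = [
    ("I", 1.0, 2.5), ("I", 3.0, 3.5),
    ("II", 2.0, 3.0), ("II", 4.0, 4.0),
    ("III", 1.0, 7.5), ("III", 5.0, 9.5),
]

# Design matrix: intercept, genotype II dummy, genotype III dummy, density
X = [[1.0, float(g == "II"), float(g == "III"), d] for g, d, _ in data]
y = [inc for _, _, inc in data]

intercept, beta_II, beta_III, beta_density = ols(X, y)
print(intercept, beta_II, beta_III, beta_density)
```

With genotype I as the reference level, `beta_II` and `beta_III` estimate the difference in community incidence relative to genotype I after adjusting for population density.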

Statistical analysis and graphing
We used R (version 3.2.2; the R Foundation for Statistical Computing, Vienna, Austria) to process demographic, epidemiological, and genomic data using dplyr and ape packages. The geographic source and other general information for DENV strains were evaluated using descriptive analyses.

Results
In total, 1679 DENV-1 complete genome sequences were included in the phylogenetic analysis, including 97 strains from China and 1582 strains from other countries and regions, such as Southeast Asia, East Asia, South America, Central America, Africa, the Middle East, and Europe (Additional file 2: Table S2).
As shown in Table 1, there were 65 DENV-1 strains identified from Guangzhou since 1991, accounting for 67.01% (65/97) of all Chinese strains included in the analysis, among which 48 (73.85%, 48/65) strains were sequenced from 204 serum specimens by our research team and collaborators.

DENV-1 genome genotyping
Phylogenetic analysis using complete genome sequences showed that DENV-1 was generally classified into three genotypes, i.e., genotypes I, II, and III (Additional file 3: Figure S1). All three genotypes have been observed in DENV-1 outbreaks in China, and genotype III was believed to be new in China at the time of its detection. Specifically, DENV-1 genotype III was first identified during the large outbreak in 2013-2014 in Guangzhou, demonstrating highest similarity with strains from India (JQ922548/India/2005, JQ917404/India/2009) and Singapore (KM403584/Singapore/2013), and no prior outbreaks of genotype III had been recorded in China. DENV-1 genotype III was first introduced into Guangzhou in the form of a new genotype invasion in 2013. In that year, genotypes I and III co-circulated, and 1249 local cases were reported. Subsequently, in 2014, a historically unprecedented epidemic of local cases was observed in Guangzhou, with a total of 37,340 local cases reported. In April 2015, genotype III was isolated again in Guangzhou. Nevertheless, there were no further DENV-1 cases after June 2015 until 2016. Overall, DENV-2 was mainly responsible for the DF outbreak in Guangzhou in 2015.
Invasion of genotype I in 2006 caused an unusual outbreak with 765 cases (RR = 16.24, 95% CI 12.41-21.25). In general, the epidemic capacity of new genotype invasion in Guangzhou varied, with genotype III being the strongest, and the risk was 14.32 and 33.36 times higher than those of genotypes I and II, respectively (Table 2). In 2014, there were 14 communities identified with outbreaks of genotype III, including Baiyun, Beijing, Dadong, Guangta, Jinsha, Licheng, Liurong, Meihuacun, Nancun, Shadong, Shiweitang, Tangjing, Tongde, and Wushan. In contrast, there were only two communities with outbreaks of genotype I, i.e., Dasha and Shayuan, which were both included as the control group in subsequent analyses.
Because the size of the permanent resident population and the type of local environment were similar to those of control communities, we selected Tangjing and Liurong as the study groups and compared the capacities of genotypes I and III to drive outbreaks. Our results demonstrated that genotype III showed more driving force than genotype I in dengue outbreaks as a new genotype in 2014, with an RR value of 1.61 (95% CI 1.47-1.76; Table 3).

Comparison of the capacities of different genotypes for driving outbreaks during the same epidemic period
In 2013, only Shiweitang was found to be a site of a genotype III outbreak, whereas four communities, including Zhuguang, Zhongnan, Kuangquan, and Jianggao, had genotype I outbreaks. Similarly, when searching for a suitable control community based on the size of the permanent resident population and the type of local environment, we selected Zhuguang to compare the capacity of genotype I for driving outbreaks with that of genotype III in Shiweitang. The results showed that the risk of genotype III as a new genotype for driving outbreaks in 2013 was 2.49 times higher than that of genotype I (95% CI 1.89-3.28; Table 3).

Association of different genotypes with outbreaks over the years
We used a linear regression model to analyze the relationships between the epidemic capacities of different genotypes where the community incidence caused by different genotypes was selected as the dependent variable, the factorial genotype was included as the independent variable with genotype I being the reference, and population density was entered as the adjusting variable. Both univariate and multivariate analyses demonstrated that genotype III showed a positive correlation and the greatest regression coefficient in magnitude with statistical significance (Table 4). Additionally, there was no statistically significant association between genotypes II and I.

Sequence analysis of the genotype III coding region and each protein gene
In this study, genotype III first appeared in October 2013. We compared the coding sequences of the 2013 strain (KX225487) with those of the 2014-2016 strains and found that the average similarity ratio was 99.88%, indicating that base mutations occurred after the genotype invasion. Specifically, base mutations occurred in all three structural proteins and seven nonstructural proteins. The gene sequence encoding NS3 protein had the lowest average similarity ratio (99.82%), followed by the gene sequence encoding E protein (99.86%), suggesting that NS3 and E gene sequences experienced faster mutation after the DENV-1 genotype III invaded Guangzhou (Table 5, Additional file 4: Figure S2).
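The average similarity ratio reported above can be illustrated as the mean percent identity between a reference coding sequence and each later strain. The exact definition used by the analysis software is not given in the text, so treating similarity as the percentage of identical sites over gap-free aligned positions is an assumption; the sequences below are toy examples.

```python
def percent_identity(ref: str, other: str) -> float:
    """Percentage of identical sites between two aligned sequences, ignoring gaps."""
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(ref.upper(), other.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * matches / compared


def average_similarity(ref: str, others: list) -> float:
    """Mean percent identity of each sequence in `others` against `ref`."""
    return sum(percent_identity(ref, s) for s in others) / len(others)


# Toy example: one identical sequence and one with a single substitution
reference = "ATGCATGCAT"
later_strains = ["ATGCATGCAT", "ATGCATGCAA"]
print(average_similarity(reference, later_strains))
```

Applied per gene region, the same calculation would yield one average per protein-coding segment, which is how the NS3 and E values above can be compared.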

Discussion
In this study, genotype replacement and co-circulation of multiple genotypes were observed in DENV-1 outbreaks in Guangzhou, China. We found that DENV-1 genotype II was responsible for the 2002 outbreaks. However, no large-scale genotype II outbreaks were detected from viral isolate samples in subsequent years. Additionally, in 2006, genotype I was first identified as a new genotype invasion and was then found to co-circulate with genotype II in 2007. Similar findings were observed until 2013, when the invasion of new genotype III occurred, replacing genotype II to co-circulate with genotype I. Using complete genome sequence analysis and comparative analysis of the epidemic capacity of different genotypes, we found that there were new DENV-1 genotype invasion events in all major outbreak years in Guangzhou, and the appearance of new genotypes was highly correlated with the scale of the outbreak.
In terms of the unprecedented large-scale dengue outbreak in Guangzhou in 2014, we observed invasion of a new genotype (genotype III) of DENV-1 in addition to genotype I, which had dominated the epidemics in previous years. This high-intensity outbreak was mainly driven by the new genotype III. Phylogenetic analysis also showed that DENV-1 genotype III, isolated in 2013
New genotype invasion is an important feature of unusual outbreaks in major dengue-endemic regions worldwide. Various serotypes, genotypes, and their lineage clades are different in terms of viral virulence and epidemic capacity [2,22], with outbreaks characterized by new genotype invasions being the most remarkable. Introduction of new serotypes or genotypes can often change the dominant circulating viral strains in a region. For example, the epidemics in Malaysia in 1993-1995 could be traced back to the invasion of DENV-3 genotype II in Thailand in 1962-1987. The Malaysian DENV-3 isolate used to be genotype I prior to the invasion and was soon replaced by a high-intensity outbreak of DENV-1 and DENV-2 co-circulation during 1995-1998. Changes in different serotypes repeatedly caused intense outbreaks after 2000, with DENV-1 responsible for the most serious epidemics [28][29][30]. In India, DENV-2 genotype V was gradually replaced by genotype IV from 1967 to 1996, which was accompanied by severe epidemics [31]. By 2003-2004, a phylogenetic analysis revealed that outbreaks were highly related to the invasion of the new DENV-3 genotype III, which eventually took over the previous DENV-2 genotype IV and became the dominant serotype and genotype [32].
Dengue-affected geographic areas are constantly expanding owing to the emergence of new genotypes. For example, DENV-3 genotype III, first identified in the Indian subcontinent, spread to Africa in the 1980s and was further disseminated to Latin America in the 1990s. Notably, the virulence of DENV-3 genotype III changed and tended to be enhanced during geographic dissemination, as demonstrated by statistically significant distribution of mild and severe cases in phylogenetic analysis [33]. In Venezuela, E gene sequence analysis of DENV-3 isolates in the 2000-2001 outbreak showed their likely origin to be a genotype III strain that had invaded from Nicaragua and Panama. This genotype continued to spread in Central America and Mexico and eventually replaced genotype V, which had been epidemic in Venezuela from the 1960s to the 1970s [34]. In Central and South America, genotype invasions occurred more frequently. DENV-3 genotype III spread twice from the Caribbean to Brazil and was introduced to Paraguay at least three times [35], causing serious dengue outbreaks in both countries and surrounding areas. Phylogenetically, Ecuadorian DENV strains were also associated with isolates with Latin American origin [36]. The outbreak of DENV-2 in Puerto Rico originated from the invasion of the new Asian genotype IIIb, and since then, the clade has been co-circulating in the country with another lineage from the Western Hemisphere [37]. With regard to corresponding variations in virulence, previous studies based on the E gene suggested that positive selection occurred at several amino acid positions of the E gene, and such point mutations resulted in not only enhanced transmission but also increased viral virulence.
DENV-1 has been the most important serotype causing serious outbreaks in China, Southeast Asia, and the South Pacific in recent years. DENV-1 consists of five genotypes (I-V) according to previous phylogenetic analyses based on the E gene, and there are clades of varied sequence features within each genotype [22]. The strain of DENV-1 causing the outbreak in the South Pacific during 1988-1989 was only distantly phylogenetically related to the dominant strains in the region and was much more closely related to the American strain, suggesting that the outbreak was caused by the invasion of a new genotype rather than a sudden outbreak of a previous epidemic strain [38]. In 2001, outbreaks of three different genotypes of DENV-1 (I-III) occurred almost at the same time in Myanmar [40]. In China, the 2004 DENV-1 outbreak in Zhejiang was related to an imported case of a patient who had traveled to Thailand [41]. In general, dengue outbreaks in China tend to be exclusively caused by imported cases. Singapore experienced its largest outbreak in history from 2013 to 2014. DENV-1 replaced DENV-2 as the main serotype in circulation, resulting in a total of 40,508 cases, including 22,170 cases in 2013 and 18,338 cases in 2014 with incidence rates as high as 410.6/100,000 and 335.0/100,000, respectively. The outbreak was ultimately confirmed to be caused by the invasion of a DENV-1 genotype III variant [42]. Further analysis of the genetic variation of the new genotype during the epidemic course revealed that three different variants of the genotype were generated during the local epidemic course in Singapore. These variants exhibited different temporal and spatial distribution patterns with regard to driving the outbreak [24]. In the same year, the largest outbreak of DF was also observed in Taiwan, with a total of 15,732 cases reported, including 136 cases of dengue hemorrhagic fever and 20 deaths, primarily caused by the new genotype of DENV-1 [43].
Thus, DENV-1 genotype III was a key factor of large-scale outbreaks in Southeast Asia, and the successive unprecedented large-scale outbreaks in Singapore and Taiwan during 2013-2014 were both closely related to the invasion of DENV-1 genotype III.
Variations in genotypes and their clades have been shown to cause severe dengue epidemics and cases [43]. In Myanmar, 15,361 cases of dengue hemorrhagic fever/ dengue shock syndrome and 192 deaths were reported in 2001, and 95% of the cases were caused by DENV-1 [44]. Further phylogenetic studies have shown that the two lineages of the DENV-1 genotype I were previously unknown to the region and were probably caused by new variations generated from stochastic epidemic events [45]. In 2015, DENV-4 genotype I clade C caused severe cases in southern India, and sequence analysis demonstrated that there were mutations in amino acid sites involved in viral replication and epitope presentation [46].
Like other large ports of entry and exit in China, such as Shanghai, Guangzhou experiences frequent importation of all four serotypes, with multiple genotypes, from surrounding countries [47]. Although many DENV epidemics have been associated with the alteration of predominant serotypes, the present study demonstrated that the new DENV-1 genotype invasion was typically the cause of dengue outbreaks in Guangzhou, particularly in 2006 and 2013-2014, whereas DENV-1 genotype II was the main epidemic genotype in 2003 and before. However, after invasion of the DENV-1 [48]. During the middle and late stages of the epidemic in 2013, DENV-1 genotype III was introduced to Guangzhou as a new genotype invasion. As a result, 1249 local cases were reported, of which 78 cases developed into severe disease, representing the largest outbreak since 2002. DENV-1 genotype III continued to cause outbreaks in 2014 and eventually led to a record-breaking outbreak, with a total of 37,340 local cases reported. This number was over 2.4 times the sum of all cases reported from 1978 to 2013, and 14,000 cases of hospitalization, 308 severe cases, and five deaths were observed. Complete genome sequence analysis of DENV-1 showed that there were two genotypes (I and III) co-circulating in 2014. No significant variations in genotype I were observed. Therefore, this large-scale outbreak was highly associated with genotype III, and the capacity of genotype III for driving outbreaks was stronger than the capacities of genotypes I and II. DENV-1 genotype III strains appeared only in April 2015 and re-emerged in 2016. Moreover, studies have shown that secondary infections played a negligible role in severe cases during the 2014 outbreak [49], suggesting that new genotypes could increase the risk of developing severe disease.
Because asymptomatic patients and patients with mild disease usually do not seek medical treatment, and patients may visit the hospital at later stages of the disease, specimens in this study were mainly from symptomatic patients, and no asymptomatic individuals (and few patients with mild disease) underwent virus isolation. Therefore, the studied strains may not have represented the entire infected population. In addition, owing to financial constraints, the number of isolated strains and self-sequenced complete genome data in our study were still limited.
Our current findings demonstrated that not only serotype replacement but also new genotype invasion within a serotype can trigger outbreaks of DF. However, future prospective epidemiological and phylogenetic studies are required to further clarify the genetic variations of new genotypes with different genotypes and clades co-circulating in affected areas as well as the epidemic capacities and scales of the resulting outbreaks. Moreover, additional epidemiological studies of severe cases are needed to comprehensively evaluate new genotype invasion and its capacity for driving local epidemics and causing severe disease. Such studies could provide valuable scientific support for prevention and control efforts as well as early detection in dengue-affected areas.