The absence of the drhm gene is not a marker for human-pathogenicity in European Anaplasma phagocytophilum strains

Background: Anaplasma phagocytophilum is a Gram-negative obligate intracellular bacterium that replicates in neutrophil granulocytes. It is transmitted by ticks of the Ixodes ricinus complex and causes febrile illness in humans and animals. The geographical distribution of A. phagocytophilum spans the Americas, Europe, Africa and Asia. However, human disease occurs predominantly in North America and is infrequently reported from Europe and Asia. In North American strains, the absence of the drhm gene has been proposed as a marker for pathogenicity in humans, whereas no information on the presence or absence of the drhm gene was available for A. phagocytophilum strains circulating in Europe. Therefore, we tested 511 European and 21 North American strains for the presence of drhm and compared the results to two other typing methods: multilocus sequence typing (MLST) and ankA-based typing.

Results: Altogether, 99% (478/484) of the analyzable European and 19% (4/21) of the North American samples from different hosts were drhm-positive. Regarding the strains from human granulocytic anaplasmosis cases, 100% (35/35) of European origin were drhm-positive and 100% (14/14) of North American origin were drhm-negative. Human strains from North America and Europe were both part of MLST cluster 1. North American strains from humans belonged to ankA gene clusters 11 and 12, whereas European strains from humans were found in ankA gene cluster 1. However, the North American ankA gene clusters 11 and 12 were highly similar to the European cluster 1 at the nucleotide level, with 97.4% and 95.2% identity, respectively.

Conclusions: The absence of the drhm gene in A. phagocytophilum does not seem to be associated with pathogenicity for humans per se, because all 35 European strains of human origin were drhm-positive. The epidemiological differences between North America and Europe concerning the incidence of human A. phagocytophilum infection are not explained by strain divergence based on MLST and ankA gene-based typing.

Background
Anaplasma phagocytophilum is a Gram-negative obligate intracellular bacterium that replicates in neutrophil granulocytes [1]. It causes febrile illness in humans and animals and is transmitted by ticks of the Ixodes ricinus complex [2,3]. The main vectors of A. phagocytophilum are I. ricinus in much of Europe, I. persulcatus in north-eastern Europe and East Asia and I. scapularis and I. pacificus in North America [2]. Anaplasma phagocytophilum has a wide geographical distribution that spans the Americas, Europe, Africa and Asia [2]. However, human disease predominantly occurs in North America, with 4008 anaplasmosis cases reported in the USA in 2018 (https://wonder.cdc.gov/nndss/nndss_annual_tables_menu.asp?mmwr_year=2018). In contrast, human granulocytic anaplasmosis is infrequently reported from Europe [4] and Asia [5][6][7][8][9]. Of note, most patients from China initially described as infected with A. phagocytophilum in fact suffered from severe fever with thrombocytopenia syndrome (SFTS), a bunyavirus infection [5,10,11,12].
Anaplasma phagocytophilum is not only geographically widely distributed, but also has a broad host range. Clinically apparent disease is mainly observed in humans [3], dogs [13], horses [14], cats [15] as well as in sheep and cattle [16]. Symptomatic granulocytic anaplasmosis in domestic ruminants has been observed in Europe [17], Africa [18,19], the Near [20] and Far East [21], whereas it has not been reported from North America so far [22].
Transovarial transmission of A. phagocytophilum is inefficient, at least in Ixodes ticks [23,24]. The bacterium is therefore thought to depend on reservoir hosts to complete its life cycle. The white-footed mouse (Peromyscus leucopus) is probably the main reservoir for human infection in the USA [25,26], whereas the situation is less clear for Europe, where several species including wild ruminants, small mammals and wild boar have been considered in the past [22].
A variety of single and multilocus sequence typing schemes have been used (i) to elucidate the epidemiological differences mentioned above; (ii) to find markers for human pathogenicity; (iii) to study host adaptation of distinct A. phagocytophilum strains; and (iv) to determine reservoir hosts for human and animal infection [22]. The absence of the drhm gene has been proposed as a marker for pathogenicity in humans and dogs when seven whole genome sequences from five North American and two European strains from different hosts were compared [27]. However, this could not be verified on a larger series of 117 samples from the USA because 25% (4/16) of dog strains were drhm positive [28]. No information on the presence or absence of the drhm gene has been available so far for A. phagocytophilum circulating in Europe. Therefore, we here tested 511 European and 21 North American A. phagocytophilum strains for the presence of drhm and compared the results to two other typing methods that we found in the past to have high discriminatory power: multilocus sequence typing (MLST) and ankA-based typing [29,30].

Samples
In total, 686 A. phagocytophilum strains were included. Of them, 98 were from this study and originated from 3 humans, 35 domestic animals (18 horses, 14 dogs, 2 cats and 1 cow), 57 wild animals (19 red deer, 9 roe deer, 9 sika deer, 6 wild boars, 5 mouflons, 4 fallow deer, 2 ibexes, 2 red foxes and 1 bird) and 3 ticks (2 I. ricinus and 1 I. frontalis). The two nymphs and one adult female were engorged and removed from blackbirds (Turdus merula). A total of 577 strains were reported previously. All human strains originated from patients with human granulocytic anaplasmosis. Reference, host species, country of origin, year of sampling and disease state of the host are shown in Additional file 1: Table S1.
Two μl of DNA were used as a template in a 50 μl reaction mixture containing 50 mM KCl, 20 mM Tris-HCl (pH 8.4), 2 mM MgCl2, 0.2 mM deoxynucleoside triphosphates, 0.4 μM of each primer and 0.2 μl (1 U) of Taq DNA polymerase (Invitrogen, Karlsruhe, Germany). The PCRs were performed in a 2720 GeneAmp thermal cycler (Applied Biosystems, Darmstadt, Germany) under the following conditions: an initial denaturation at 94 °C for 3 min; followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 54 °C for 30 s and extension at 72 °C for 30 s; and a final extension step at 72 °C for 10 min. Individual drhm and APH_0919/APH_0922 amplicons were bidirectionally sequenced to confirm PCR specificity.

MLST and ankA gene
Seven housekeeping genes (pheS, glyA, fumC, mdh, sucA, dnaN and atpA) were amplified for MLST and sequenced bidirectionally as reported previously [29]. Clonal complexes (CC) were defined by sharing identical alleles at five of the seven loci with at least one other member of the group. The ankA gene was partially amplified and bidirectionally sequenced as described [29]. Full length ankA sequences were obtained in tick_CM20 and tick_CS2 (cluster 6) as detailed in Additional file 2: Text S1, and in horse_S1523_07 (cluster 7) as described previously [31].
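The clonal complex rule stated above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used for the study: it assumes that "five of the seven loci" means at least five shared alleles, it groups STs by single linkage, and the ST labels and allele numbers in `example_profiles` are hypothetical.

```python
from itertools import combinations

LOCI = ("pheS", "glyA", "fumC", "mdh", "sucA", "dnaN", "atpA")


def shared_alleles(profile_a, profile_b):
    """Count loci at which two allelic profiles carry the same allele number."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def clonal_complexes(profiles, min_shared=5):
    """Group STs into clonal complexes by single linkage.

    profiles: dict mapping an ST label to a tuple of seven allele numbers,
    ordered as in LOCI. Assumption: "five of the seven loci" is read as
    "at least five". STs that link to no other ST are left out.
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Follow parent links up to the representative of the group.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for st_a, st_b in combinations(profiles, 2):
        if shared_alleles(profiles[st_a], profiles[st_b]) >= min_shared:
            parent[find(st_a)] = find(st_b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical allelic profiles, for illustration only.
example_profiles = {
    "ST_A": (1, 1, 1, 1, 1, 1, 1),
    "ST_B": (1, 1, 1, 1, 1, 2, 2),  # shares 5 loci with ST_A
    "ST_C": (1, 1, 1, 2, 2, 2, 2),  # shares 5 loci with ST_B, 3 with ST_A
    "ST_D": (9, 9, 9, 9, 9, 9, 9),  # shares none: stays outside any CC
}
print(clonal_complexes(example_profiles))
```

Note that ST_C joins the complex through ST_B although it shares only three loci with ST_A, which is what "with at least one other member of the group" implies.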

Phylogenetic analysis
Sequences were codon-aligned by ClustalW applying the PAM (Dayhoff) matrix. Trees were constructed using the neighbor-joining (NJ) method with the Jukes-Cantor model and the complete deletion option in the program MEGA X version 10.0.5 [32]. Bootstrap analysis was conducted with 1000 replicates. Net average distances between nucleotide sequences of MLST and ankA gene clusters were computed using the Jukes-Cantor matrix and applying the complete deletion option. Net average distances between protein sequences of ankA gene clusters were calculated using the PAM (Dayhoff) matrix and applying the complete deletion option.
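For readers unfamiliar with these quantities, the following Python sketch shows how a Jukes-Cantor distance and a net average distance between two groups of aligned nucleotide sequences can be computed. It is not MEGA's code: the function names are hypothetical, the net distance is taken as the mean between-group distance minus the mean of the two within-group means, and complete deletion is approximated by dropping every alignment column that contains a symbol other than A, C, G or T in any sequence.

```python
import math
from itertools import combinations, product


def complete_deletion(sequences):
    """Drop every alignment column in which any sequence has a non-ACGT symbol."""
    keep = [i for i in range(len(sequences[0]))
            if all(seq[i] in "ACGT" for seq in sequences)]
    return ["".join(seq[i] for i in keep) for seq in sequences]


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites.

    Undefined for p >= 0.75 (math.log raises ValueError in that case).
    """
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    return -0.75 * math.log(1 - 4 * p / 3)


def _mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def net_average_distance(group_a, group_b):
    """Mean between-group distance minus the mean of the two within-group means."""
    between = _mean(jukes_cantor(a, b) for a, b in product(group_a, group_b))
    within_a = _mean(jukes_cantor(a, b) for a, b in combinations(group_a, 2))
    within_b = _mean(jukes_cantor(a, b) for a, b in combinations(group_b, 2))
    return between - (within_a + within_b) / 2


# Toy alignment, for illustration only: column 11 is removed by complete deletion.
aligned = complete_deletion([
    "ACGTACGTAC-A",
    "ACGTACGTAC-A",
    "ACGTACGTATNA",
    "ACGTACGTATGA",
])
toy_net = net_average_distance(aligned[:2], aligned[2:])
print(len(aligned[0]), toy_net)
```

A group with a single sequence has no within-group pairs, so its within-group mean is taken as zero here.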

Comparison of typing methods
To test for the concordance between different typing methods, adjusted Wallace coefficients [33] were calculated using the online tool available at: http://www.comparingpartitions.info/index.php?link=Tool.
For example, the Wallace coefficient host → MLST cluster is the probability that two strains are found in the same MLST cluster, if they are from the same host.
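The definition above can be made concrete with a short Python sketch. It is not the code behind the online tool: the function names are hypothetical, and the adjustment is written in the commonly used form (W - Wi) / (1 - Wi), where Wi is the value of W expected if the two partitions were independent, that is, the fraction of all strain pairs that share a group in the second partition. The host and cluster labels are invented for illustration.

```python
from collections import Counter


def pairs(n):
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) // 2


def wallace(part_a, part_b):
    """W(A -> B): among strain pairs sharing a group in A, the fraction also sharing a group in B."""
    same_a = sum(pairs(n) for n in Counter(part_a).values())
    same_both = sum(pairs(n) for n in Counter(zip(part_a, part_b)).values())
    return same_both / same_a


def adjusted_wallace(part_a, part_b):
    """Wallace coefficient corrected for chance agreement (assumed form, see lead-in)."""
    expected = sum(pairs(n) for n in Counter(part_b).values()) / pairs(len(part_b))
    return (wallace(part_a, part_b) - expected) / (1 - expected)


# Invented labels for seven strains: host of origin and MLST cluster.
hosts = ["human", "human", "human", "vole", "vole", "deer", "deer"]
clusters = [1, 1, 1, 3, 3, 1, 2]

print(wallace(hosts, clusters))           # host -> MLST cluster
print(wallace(clusters, hosts))           # MLST cluster -> host
print(adjusted_wallace(hosts, clusters))  # chance-corrected host -> MLST cluster
```

The coefficient is directional: in this toy data set, knowing the host predicts the cluster better than knowing the cluster predicts the host, which is why each pair of partitions yields two values.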

Presence or absence of the drhm gene
The presence or absence of the drhm gene was determined in 532 A. phagocytophilum strains, 511 from Europe and 21 from North America (Additional file 1: Table S1). The information was extracted from GenBank in 13 cases (3 from Europe, 10 from the USA). For the remaining 154 of the 686 strains in total, the DNA had been used up or no information on drhm was available in GenBank.
The absence of a gene is difficult to prove for methodological reasons. Therefore, the flanking genes APH_0919/APH_0922 were amplified in 519 samples in order to show that the genomic region containing the drhm gene was present (Additional file 1: Table S1). This information was extracted from GenBank in 13 cases. Of the A. phagocytophilum strains investigated, 95% (505/532) contained the APH_0919 and/or APH_0922 gene (Table 1). The 27 APH_0919/APH_0922-negative samples were exclusively of European origin and came from 22 voles, 3 shrews and 2 I. ricinus ticks removed from blackbirds (T. merula) (Table 2). All APH_0919/APH_0922-negative samples were also negative for drhm (Table 1).
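The role of the flanking-gene control can be illustrated with a small Python sketch. The decision rule and the names `drhm_status` and `summarize` are assumptions made for illustration: a sample counts as drhm-negative only if the flanking region amplified, and as not analyzable if both PCRs failed. The European counts are taken from the text (478 drhm-positive, 27 negative in both PCRs, 511 in total); the six flanking-positive, drhm-negative samples follow by subtraction.

```python
from collections import Counter


def drhm_status(drhm_pcr, flank_pcr):
    """Interpret one sample from its drhm and APH_0919/APH_0922 PCR results (assumed rule)."""
    if drhm_pcr:
        return "positive"
    if flank_pcr:
        return "negative"        # region present, gene absent
    return "not analyzable"      # region itself not detected


def summarize(samples):
    """Return counts and rounded percentages of drhm-positive samples."""
    counts = Counter(drhm_status(drhm, flank) for drhm, flank in samples)
    total = sum(counts.values())
    analyzable = total - counts["not analyzable"]
    positive = counts["positive"]
    return {
        "positive": positive,
        "total": total,
        "analyzable": analyzable,
        "pct_of_all": round(100 * positive / total),
        "pct_of_analyzable": round(100 * positive / analyzable),
    }


# (drhm PCR result, flanking-gene PCR result) per European sample.
european = [(True, True)] * 478 + [(False, True)] * 6 + [(False, False)] * 27
print(summarize(european))
```

Under this rule the same 478 positive samples correspond to 94% of all 511 European samples but to 99% of the 484 analyzable ones, which is how both percentages arise in this paper.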

MLST
In general, different sequences of a given locus (pheS, glyA, fumC, mdh, sucA, dnaN and atpA) were each assigned a unique but arbitrary allele number, and each unique combination of alleles was assigned a sequence type (ST). Full profiles were obtained for 653 A. phagocytophilum strains. Of these, 93 were from this study, 491 were reported previously by our group and in 69 cases the information was extracted from GenBank or PubMLST (Additional file 1: Table S1). Housekeeping gene sequences with double peaks in the chromatograms were regarded as non-typeable. Therefore, an ST could not be ascribed in 139 A. phagocytophilum strains, giving a typeability of 79% (514/653). Clonal complexes (CC) were defined by sharing identical alleles at five of the seven loci with at least one other member of the group. CC 18 and CC 19 are newly described and contained two roe deer samples each. Allele numbers, ST, CC and MLST cluster for each A. phagocytophilum strain are shown in Additional file 1: Table S1.
A total of 520 A. phagocytophilum strains without ambiguous nucleotides were included in the phylogenetic analysis. The sequences segregated into 8 clusters (Fig. 1, Additional file 3: Fig. S1). Clusters 1 to 3 [29] and 4 to 6 [34,35] have been described before. Cluster 1 contained strains from humans, domestic animals (dogs, horses and cats), farm animals (cattle, sheep and goats), wild animals (red deer, sika deer, fallow deer, European bison, mouflon, chamois, ibex, wild boar and red foxes), small mammals (hedgehogs, jumping meadow mouse and chipmunk) and I. ricinus ticks. Cluster 2 harbored mainly samples from roe deer and I. ricinus ticks, but also sporadically sequences from domestic ruminants. Cluster 3 was restricted to strains from voles and shrews from Europe. Cluster 4 was small and consisted of samples from 2 roe deer and 1 red deer. Cluster 5 and cluster 6 contained strains from I. persulcatus, I. pavlovskyi and their hybrids from the Asian part of Russia as described previously [34,35]. Samples from Asian voles were found exclusively in cluster 6. Cluster 7 harbored 2 strains from I. ricinus ticks removed from blackbirds (Turdus merula). Cluster 8 contained 3 isolates from I. scapularis ticks from the USA that have been classified as non-human pathogenic A. phagocytophilum Ap-variant 1 [36]. However, the other North American samples from 11 humans, 1 dog, 1 horse, 1 chipmunk and 1 jumping meadow mouse were part of cluster 1 (Fig. 1, Additional file 3: Fig. S1).
The largest cluster was cluster 1. It contained samples from both Europe and North America. In contrast, every other cluster was restricted to samples from a single continent: Europe, North America or Asia (Fig. 1, Additional file 3: Fig. S1). Net average distances between the MLST clusters are shown in Table 3. The highest identity of 98.6% was observed between clusters 1 and 8 and the lowest identity of 88.3% between clusters 6 and 8.

ankA
Partial ankA sequences were available for 637 A. phagocytophilum strains. Of these, 94 were from this study, 491 were reported previously by us and in 52 cases the information was extracted from GenBank (Additional file 1: Table S1). A total of 17 samples from cattle, red deer, roe deer, sika deer and I. ricinus were regarded as non-typeable, because they contained ankA variants belonging to different clusters (Additional file 1: Table S1). In 15 cases two ankA gene variants and in two cases three ankA gene variants were present. The typeability regarding the ankA gene cluster was 97% (620/637). The ankA gene cluster for each A. phagocytophilum strain is shown in Additional file 1: Table S1.
A total of 623 ankA sequences without ambiguous nucleotides were included in the phylogenetic analysis. The sequences segregated into 12 clusters (Fig. 2, Additional file 4: Fig. S2). Clusters 1 to 5 [29] as well as 8 and 10 [34] have been described before. Cluster 1 contained strains from humans, domestic animals (dogs, horses and cats), farm animals (cattle, sheep and goats), wild animals (red deer, sika deer, fallow deer, European bison, chamois, wild boar and red foxes), small mammals (hedgehogs) and I. ricinus ticks. Cluster 2 harbored mainly samples from roe deer and I. ricinus ticks, but also sporadically sequences from domestic ruminants. Cluster 3 comprised mostly strains from roe deer, but also single sequences from red deer and sika deer. Cluster 4 contained samples from one horse, farm animals (cattle, sheep and goats), wild animals (red deer, sika deer, fallow deer, roe deer, European bison, mouflon, chamois and ibex) and I. ricinus ticks. Cluster 5 was restricted to strains from voles and shrews from Europe. Cluster 6 harbored one sample from a blackbird (T. merula) and two strains from I. ricinus ticks removed from blackbirds. Cluster 7 contained one strain from a horse and two strains from I. ricinus ticks. Cluster 8 comprised samples from I. persulcatus and I. pavlovskyi ticks from the Asian part of Russia as described previously [34]. Cluster 9 was restricted to one sequence [37] from an Ixodes sp. tick removed from a woodchat shrike (Lanius senator senator). Cluster 10 contained only samples from voles as well as from I. persulcatus, I. pavlovskyi and their hybrids from the Asian part of Russia as described previously [34]. Clusters 11 and 12 harbored solely strains from North America. Cluster 11 comprised samples from humans, one dog, one horse and small mammals (jumping meadow mouse and chipmunk). Cluster 12 contained strains from humans and I. scapularis ticks. One of the tick strains (USG3) has been described as human pathogenic A. phagocytophilum Ap-ha strain [38], whereas the other three have been classified as non-human pathogenic A. phagocytophilum Ap-variant 1 [36].
[Figure legend] The number in parentheses indicates the frequency with which the respective ST was found. Key: red circles, sequences from humans, dogs, horses and cats; dark blue diamonds, sequences from domestic ruminants (cattle, sheep, goats and water buffalo); light blue diamonds, sequences from wild ruminants (roe deer, red deer, sika deer, fallow deer, European bison, mouflon, chamois and ibex); green triangles, sequences from small mammals (hedgehogs, voles, shrews, chipmunk and jumping meadow mouse); yellow squares, sequences from wild boars; purple triangles, sequences from red foxes; white triangles, sequences from ticks.
Samples from Europe were found in clusters 1-7 and 9, those from Asia in clusters 8 and 10 and those from North America in clusters 11 and 12. Net average distances between the ankA gene clusters are shown in Table 4. At the nucleotide level, the highest identity of 97.4% was observed between clusters 1 and 11 and the lowest identity of 47.8% between cluster 1 and cluster 10.
Full length ankA sequences of the A. phagocytophilum strains horse_S1523_07 (present study) and tick_W271 [31] which both belonged to cluster 7 were compared to all other complete ankA sequences available so far [30,31,39]; the highest identities were observed to cluster 1 (83.9%) and to cluster 4 (86.3%) sequences. However, the identity of cluster 7 sequences was 99.4% to cluster 1 sequences when nucleotides 1-1639 were considered and 98.2% to cluster 4 sequences when nucleotides 1604-3720 were taken into account. This means that cluster 7 ankA sequences probably arose by recombination of cluster 1 and cluster 4 sequences.
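The segment-wise comparison behind this inference can be sketched in Python as follows. The function `percent_identity` is a hypothetical helper, not the software used for the analysis; it assumes that the sequences are already aligned to a common coordinate system and that positions are 1-based and inclusive, as in the nucleotide ranges quoted above. The toy sequences are invented.

```python
def percent_identity(seq_a, seq_b, start=1, end=None):
    """Percent identical positions between two aligned sequences.

    start and end are 1-based, inclusive alignment positions.
    Columns with a gap ("-") in either sequence are ignored.
    """
    end = len(seq_a) if end is None else end
    columns = [(a, b)
               for a, b in zip(seq_a[start - 1:end], seq_b[start - 1:end])
               if a != "-" and b != "-"]
    return 100 * sum(a == b for a, b in columns) / len(columns)


# Toy example: the query is built from the first half of one parent
# and the second half of the other.
parent_1 = "ACGTACGTACGTACGTACGT"
parent_2 = "TGCATGCATGCATGCATGCA"
query = parent_1[:10] + parent_2[10:]

print(percent_identity(query, parent_1))          # whole length
print(percent_identity(query, parent_1, 1, 10))   # 5' segment
print(percent_identity(query, parent_2, 11, 20))  # 3' segment
```

A moderate identity over the full length combined with a near-perfect identity to a different parent in each segment is the pattern that suggests recombination; a dedicated recombination test would be needed to support it formally.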

Concordance between typing methods
To test for the concordance between different typing methods, adjusted Wallace coefficients [33] were calculated. Information on drhm presence was available for 392 A. phagocytophilum strains that were typeable by MLST and ankA. The concordance between drhm status and continent was 71% (Table 5), which reflects the fact that 94% (478/511) of the European samples, but only 19% (4/21) of those from North America, were drhm-positive. The concordance between drhm presence and the other partitions (host, ST, CC, MLST cluster, ankA cluster and country) was low. However, the concordance between ST, CC and ankA gene cluster on the one hand and drhm status on the other was high (> 75%), because CC 5, CC 12 and ankA gene clusters 11 and 12 were restricted to North American strains (Table 5). The concordance between host and MLST cluster was 88%, which indicates a host association of certain A. phagocytophilum variants (Table 5). The concordance between ankA cluster and MLST cluster was 96%.
Information on drhm presence was unavailable for all Asian strains. Therefore, adjusted Wallace coefficients were also calculated for 467 A. phagocytophilum strains that were typeable by MLST and ankA, leaving out the partition drhm presence. In this data set, the concordance between MLST cluster and continent was 68% and between ankA gene cluster and continent 100% (Table 6).

Presence or absence of the drhm gene
The absence of the drhm gene in A. phagocytophilum strains has been proposed as a marker for pathogenicity in humans and dogs [27]. Eight human strains from North America investigated so far have been drhm-negative [27]. Here, we included further human samples from the USA that did not possess drhm either. However, all 35 human strains from Europe were drhm-positive. Thus, the absence of the drhm gene in A. phagocytophilum does not seem to be associated with pathogenicity for humans per se. On the other hand, drhm negativity could indicate that those strains are of higher virulence, because human granulocytic anaplasmosis is infrequently reported from Europe compared to the USA [4].
In contrast to human disease, canine [13] and equine [14] granulocytic anaplasmosis occur in both North America and Europe. In a previous study, 25% (4/16) of dog strains and 53% (11/21) of horse strains from the USA were positive for drhm [28]. Here, 96% (67/70) of canine and 100% (44/44) of equine samples from Europe possessed the drhm gene. Thus, the drhm status does not seem to be associated with pathogenicity or virulence in dogs and horses. The same is probably true in humans, because A. phagocytophilum strains from humans, dogs and horses have been reported to be homologous [29,30,40] and dogs and horses have been reported to be susceptible to infection with human isolates [41][42][43].
The concordance between drhm status and host and vice versa was low. A similar finding has been reported previously [28]. Thus, presence or absence of drhm is probably not associated with certain hosts. However, a tendency has been observed for A. phagocytophilum strains from the Northeast of the USA to be drhm-negative, in contrast to samples from the Southeast, Midwest and West [28]. In the present study, the concordance between drhm status and country of origin and vice versa was low. Therefore, presence or absence of drhm is not geographically informative in Europe. However, 94% (478/511) of the European and 19% (4/21) of the North American A. phagocytophilum strains were drhm-positive, yielding a concordance between drhm status and continent of 71%. Thus, drhm positivity seems to be associated with European origin, although the concordance was not very strong (< 75%).
[Figure legend] The number in parentheses indicates the frequency with which the respective sequence was found. Key: red circles, sequences from humans, dogs, horses and cats; dark blue diamonds, sequences from domestic ruminants (cattle, sheep, goats and water buffalo); light blue diamonds, sequences from wild ruminants (roe deer, red deer, sika deer, fallow deer, European bison, mouflon, chamois and ibex); green triangles, sequences from small mammals (hedgehogs, voles, shrews, chipmunk and jumping meadow mouse); yellow squares, sequences from wild boars; purple triangles, sequences from red foxes; pink square, sequence from a bird; white triangles, sequences from ticks.
Twenty-seven European A. phagocytophilum strains were APH_0919/APH_0922- and drhm-negative. They originated from voles, shrews and I. ricinus ticks removed from blackbirds (T. merula) and belonged to MLST clusters 3 and 7. Strains from these clusters were only distantly related to MLST clusters 1, 2, 4 and 8, for which information on APH_0919/APH_0922 and drhm was available. Thus, the most likely reasons for negativity in APH_0919/APH_0922 and drhm are primer mismatches or a different genomic organization.

MLST and ankA-based typing
An ST could not be ascribed in 139 A. phagocytophilum strains because of double peaks in the chromatograms. This phenomenon has been observed before, most prominently in wild [29] and domestic ruminants [44,45], probably reflecting their co- or superinfection with different A. phagocytophilum variants. Among others, wild boar and small mammals have been considered in the past as reservoir hosts for human infection in Europe [22]. Here, concatenated housekeeping and ankA gene sequences from human strains from Europe clustered most closely with those from hedgehogs and wild boar, indicating that these animals might harbor human-pathogenic A. phagocytophilum variants. In contrast, samples from voles and shrews from Europe and Asia were only distantly related. Asian A. phagocytophilum strains from humans were not available for analysis. Thus, it is unclear whether Asian voles might harbor human-pathogenic variants. At least in Europe, voles and shrews are unlikely to serve as reservoir hosts for human infection.
In the USA, two major 16S rRNA gene variants of A. phagocytophilum have been described: the A. phagocytophilum Ap-ha and the A. phagocytophilum Ap-variant 1 strain. Both were defined by a two-base pair difference in the 16S rRNA gene [46] and it has been claimed that A. phagocytophilum Ap-ha is pathogenic for humans whereas A. phagocytophilum Ap-variant 1 is not [22]. However, single locus 16S rRNA gene-based typing has been shown not to define A. phagocytophilum genotypes reliably [29][30][31][47][48][49]. Here, the I. scapularis strains CRT35 (ST 217), CRT38 (ST 216) and CRT53-1 (ST 218) that have been classified as non-human pathogenic A. phagocytophilum Ap-variant 1 [36] were found in MLST cluster 8, whereas the human strains from the USA were part of cluster 1. However, in ankA-based typing, seven North American strains from humans were part of cluster 12 together with the three A. phagocytophilum Ap-variant 1 isolates from I. scapularis. In our opinion, this finding questions the classification of A. phagocytophilum Ap-variant 1 as non-human pathogenic using the 16S rRNA gene as a marker.
The epidemiological differences between North America and Europe regarding the incidence of human infection are not explained when MLST and ankA-based typing are considered, because human strains from North America and Europe were both part of MLST cluster 1. Further, the North American ankA gene clusters 11 and 12 were highly similar to the European cluster 1 at the nucleotide level, with 97.4% and 95.2% identity, respectively.
In contrast to the drhm status, the concordance between host and MLST cluster was 88% indicating a host association of certain A. phagocytophilum variants. Bird-related MLST cluster 7 and avian ankA clusters 6 and 9 are newly described here. Bird-specific A. phagocytophilum strains have been reported before because groEL ecotype IV was restricted to samples from a blackbird and from five ticks feeding on blackbirds [50]. Recently, a bird-associated groEL cluster 7 has been characterized as well [51].
Strains from I. persulcatus and I. pavlovskyi ticks and their hybrids from the Asian part of Russia were restricted to MLST clusters 5 and 6 and ankA clusters 8 and 10. Ixodes persulcatus from its European distribution area was not investigated. It is therefore unclear whether the clustering reflects tick species or geography.
The ankA sequences from cluster 7 found in a horse and two I. ricinus ticks were probably recombinants of cluster 1 and 4 sequences. It has been shown before that the ankA gene might undergo recombination [39]. Here, the infection of a horse with an A. phagocytophilum strain of ankA cluster 4 was observed for the first time. However, all 44 equine samples of European origin were part of cluster 1. A double or triple infection with A. phagocytophilum variants belonging to different ankA clusters was detected in 17 cases in cattle, deer and a tick. In roe deer, this phenomenon has been observed before [52]. Thus, multiple infections, which are a prerequisite for recombination, do occur. In general, the genetic diversity was higher in Europe than in North America and Asia, because European samples belonged to five MLST and eight ankA clusters, whereas North American and Asian strains were each part of two MLST and two ankA clusters. However, a considerable sampling bias must be taken into account, as 585, 72 and 29 strains were of European, Asian and North American origin, respectively.
The concordance between MLST cluster and continent was 68%, and between ankA gene cluster and continent 100%. Thus, both typing methods were geographically informative. A broader host spectrum especially from North America and Asia should be typed by MLST and ankA-gene-based typing to further elucidate host association and geographical distribution of distinct A. phagocytophilum strains.

Conclusions
The absence of the drhm gene in A. phagocytophilum does not seem to be associated with pathogenicity for humans per se, because all 35 European strains from human granulocytic anaplasmosis cases were drhm-positive. The epidemiological differences between North America and Europe concerning the incidence of human A. phagocytophilum infection are not explained by strain divergence based on MLST and ankA gene-based typing. The concordance between host and MLST cluster was 88%, which indicates a host association of certain A. phagocytophilum strains. Human strains from Europe clustered most closely with those from hedgehogs and wild boars, indicating that these animals might serve as reservoir hosts for human infection.