Tick mitochondrial genomes: structural characteristics and phylogenetic implications

Ticks are obligate blood-sucking arachnid ectoparasites from the order Acarina, and many are notorious as vectors of a wide variety of zoonotic pathogens. However, the systematics of ticks in several genera is still controversial. The mitochondrial genome (mt-genome) has been widely used in arthropod phylogeny, molecular evolution and population genetics. With the development of sequencing technologies, an increasing number of tick mt-genomes have been sequenced and annotated. To date, 63 complete tick mt-genomes are available in the NCBI database, and these genomes have become an increasingly important genetic resource and source of molecular markers in phylogenetic studies of ticks in recent years. The present review summarizes all available complete mt-genomes of ticks in the NCBI database and analyses their characteristics, including structure, base composition and gene arrangement. Furthermore, a phylogenetic tree was constructed using mitochondrial protein-coding genes (PCGs) and ribosomal RNA (rRNA) genes from ticks. The results will provide important clues for deciphering new tick mt-genomes and establish a foundation for subsequent taxonomic research.


Background
Ticks are obligate blood-sucking arachnid ectoparasites that can feed on a wide range of vertebrates, including mammals, birds and reptiles [1,2]. Ticks are well-known zoonotic pathogen vectors, and tick-borne diseases (TBDs) are increasingly threatening animal and human health, thereby causing great economic damage [3,4]. Many important tick-borne pathogens have been characterized from ticks in recent years, including Anaplasma bovis, Babesia ovata, Rickettsia japonica, Chlamydiaceae bacteria and severe fever with thrombocytopenia syndrome virus (SFTSV), which have attracted increasing attention in the field of public health [5][6][7][8][9]. Recently, a newly segmented virus with a febrile illness similar in its clinical manifestation to tick-borne encephalitis virus (TBEV) was discovered, which was designated as Alongshan virus (ALSV) and confirmed in 86 patients from several provinces in China [10]. Globally, the annual financial losses due to ticks and TBDs are in the billions of dollars [3,11]. A total of 896 tick species have been described worldwide in three families: Ixodidae (hard ticks, 702 species), Argasidae (soft ticks, 193 species) and Nuttalliellidae (1 species) [12][13][14]. Hard ticks possess a sclerotized scutum in all life stages except eggs, have an apically located gnathostoma, usually feed for several days and ingest a large amount of blood [15,16]. Soft ticks have no sclerotized scutum and mouthparts located anteroventrally. The ticks usually feed and expand the body within minutes to hours [17]. Nuttalliella namaqua is the unique species in the family Nuttalliellidae, and it displays many characteristics associated with hard and soft ticks and can engorge as rapidly as soft ticks [18]. The differences in life history, behaviour, and morphological characteristics are useful for the discrimination of soft ticks and hard ticks, but there are still numerous difficulties among the interspecies taxonomic characterization and geographical origin of ticks, especially for soft ticks [19]. Therefore, the increasing number of characterized mt-genomes has shown considerable potential in tick phylogeny, molecular evolution and population genetics.

Open Access
Parasites & Vectors The mt-genome is characterized by low molecular weight, high copy quantity and genetic conservation. The mt-genome has been widely used in molecular evolution, phylogeny and genealogy in recent years [20][21][22]. Similar to other arthropods, the tick mt-genome has a circular, double-stranded DNA structure with a length of 14-16 kb and a total of 37 genes, including 13 protein-coding genes, 22 transfer RNA genes (tRNAs) and 2 rRNA genes [20,23]. With the development of next-generation sequencing (NGS) technology, increasing numbers of complete mt-genomes have been sequenced and annotated from various tick species [24]. The complete mtgenome sequences are necessary for advances in areas that are crucial for TBDs study and control [24]. To date, 63 complete tick mt-genomes are available in the NCBI database, and these genomes have become an increasingly important genetic resource and source of molecular markers in phylogenetic studies of ticks in recent years [19,25]. Hence, in the present study, we used the MITOS online software (http://mitos .bioin f.uni-leipz ig.de/index .py/) to annotate the complete mt-genomes of ticks and compare their characteristics, including structure, base composition and gene arrangement. Furthermore, a phylogenetic tree was constructed using PCGs and rRNA genes from ticks. The results will provide important clues for deciphering new tick mt-genomes and provide insights for subsequent taxonomic research.

Basic features of tick mt-genomes
The length of the mt-genomes of ticks average 14,633 bp, with the longest reaching 15,227 bp (Ixodes tasmani) and the smallest measuring only 14,307 bp (Argas boueti) ( Table 2). Generally, the length of the mt-genomes from hard ticks is slightly longer than that of soft ticks (14,796 and 14,429 bp, respectively). The length differences of the mt-genomes between ticks may be influenced by gene rearrangement and the length of the non-coding regions (NCRs) [46,47]. MITOS online analysis showed no gene deletion or duplication in tick mt-genomes, which contain 13 PCGs, 2 rRNA genes and 22 tRNA genes. Among the 13 PCGs, 9 PCGs (nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad6, cytb) are located in the majority strand (J strand) and 4 PCGs (nad5, nad4, nad4L, nad1) are located in the minority strand (N strand).
Metazoan mt-genomes usually have a higher adeninethymine (AT) base content [22,32,42]. Analysis of base usage in tick mt-genomes showed that the AT content ranged from 80.45% (Amblyomma elaphense) to 65.23% (Ornithodoros savignyi) with an average content of 75.51% ( Table 2). The difference in base usage within the family is generally small [48,49], but the largest difference in AT content between soft and hard ticks reached 15.22%. This phenomenon may be attributed to the lower AT content in Ornithodoros species, which is 71.65% on average and is considerably lower than the average AT content of ticks. It is possible that the difference in AT content is related to the size of the NCRs, the repeat sequences and the complexity of the gene structure [50][51][52]. Additionally, the different living environments and survival strategies of soft and hard ticks influence base usage [53].
The base skew of tick mt-genomes is unique. In general, AT-skew is positive and guanine-cytosine (GC) skew is negative in the metazoan mt-genomes [54,55], whereas the AT-skew of soft and hard ticks is different. In soft ticks, the AT-skew is positive. In hard ticks, the positive AT-skew is only observed in I. hexagonus and Ixodes uriae, whereas in other hard ticks, the AT skew is negative. In both soft and hard ticks, the average AT-skew is 0.0504 and − 0.0187, respectively, and the average GCskew is − 0.3532 and − 0.1701, respectively; notably the difference in AT-skew is smaller than that in GC-skew ( Table 2).

Protein-coding genes and codon usage
The PCGs in mt-genomes encode several subunits: NADH dehydrogenase subunit, cytochrome c oxidase  [35] subunit, ATPase subunit and cytochrome b, which are mainly involved in the oxidative phosphorylation of cells [56]. The average length of mitochondrial PCGs in soft and hard ticks is 10,866 and 10,819 bp, respectively ( Table 2). The AT content in PCGs of the soft ticks (71.81%) and hard ticks (77.36%) is also lower than that in the complete mt-genome level. The lowest AT content in PCGs is in Rhipicephalus geigyi (63.59%) and the highest is in Ornithodoros savignyi (80.47%). The base skew in PCGs of ticks is negative, and the skewness characteristics are similar in both soft and hard ticks. No obvious differences have been observed in different genera of ticks, and the level of AT-skew is higher than that of the GC-skew. The mitochondrial PCGs are involved in oxidative phosphorylation and energy production; therefore, the structure is relatively conserved, and the difference in base usage is lower than that of the whole genome. In addition, the higher AT content of tick mt-genomes may be influenced by gene sequences, with there being only a 0.11-1.64% gap between the AT content of PCGs and the whole mt-genome (Table 2). Similarly to insects, ticks usually adopt the "ATN"type codon as the initial codon in PCGs [31][32][33][34]57]. Other codons, including some special initiation codons, can be edited to conventional start codons during transcription [58][59][60], which may help reduce the gene spacer region and overlapping region and not affect the normal translation of proteins [61]. The termination codons of ticks are mainly TAA and TAG [31,34] and sometimes use "T" or "TA", which may be converted into a complete termination codon by polyadenylation after translation [62,63].

Transfer RNA and ribosomal RNA genes
The mitochondrial tRNA gene length in ticks ranges from 50 to 90 bp, and most tRNA genes have a complete cloverleaf structure, including four principal structures: amino acid acceptor (AA) arm; TΨC (T) arm; anticodon (AC) arm; and dihydrouridine (DHU) arm [64]. No DHU arm structure exists in trnS1 of the tick mt-genomes; a similar phenomenon is also observed in insects [20,65,66]. The distance from the anti-codon to the CCA terminus is hence maintained through the inverted L structure, which helps complete the gene function [67]. Additionally, base mismatches frequently occur in the secondary structure of the tick tRNA genes [68,69]. The mismatch types are mainly G-U, U-G and U-U, which are similar to those of other insects [62,70]. These mismatches may be related to the evolutionary mutations and may not affect the function of tRNA genes due to being corrected later [71].
The mitochondrial rRNA genes display a complex functional structure with a relatively slow evolution rate; these have long been used as population genetics markers [72]. The tick mt-genomes contain two single copy 12S and 16S rRNA genes. In recent years, the mitochondrial 12S and 16S rRNA genes have been extensively used as genetic targets in phylogenetic research of ticks [27,36,73]. Due to gene rearrangement, the position of the rRNA genes shifts in ticks, whereas the gene order and the location in the N strand remain unchanged. Previous reports have shown that the average genetic distance of different tick taxa was still very slight even after tens of million years of evolution. Slow nucleotide variation in rRNA genes may be caused by strict structural and functional limitations [27]. Therefore, to this end, using  combined PCGs and rRNA genes to reconstruct the phylogenetic relationships and resolve the controversial genealogy of soft ticks may be one of the best methods [19].

Gene rearrangement
The mt-genomes exhibit higher rearrangement potential, but in general, the gene arrangement most likely occurs at a higher taxonomic level, which can provide insights for systematic classification at higher taxa [74,75]. There are three types of changes in tRNA gene position: shuffling (local rearrangements), translocation (cross-gene displacement) and inversion (change in the encoding or transcriptional direction) [76]. The rearrangements in the tick mt-genomes are mainly divided into two patterns (Fig. 1). The arrangement of the soft ticks and N. namaqua show more similarity with that in the genus Drosophila [77,78], which represents the ancestral arrangement in insects. In detail, shuffle (minor rearrangement of the gene) is observed only in the trnL2 gene [48], which is moved from cox1-cox2 to nad1-trnL1 with the coding strand changed from the J strand to the N strand, whereas other genes remain unchanged. In hard ticks, a major gene rearrangement is observed in a large gene region (trnF-nad5-trnH-nad4-nad4L-trnT-trnP-cytb-trnS2), which is moved from trnE-nad1 to trnQ-trnM. The major gene rearrangement involves the translocation of three tRNA genes (trnL1, trnL2 and trnC) and the inversion of the trnC gene. The patterns in gene rearrangement might be associated with the rate of molecular evolution, and the different rearrangements between soft and hard ticks may have occurred from a very early period [74,79].

Non-coding regions
In insects, the transcription termination of the mitochondrial NCRs is realized by combining transcription termination factors [80]. In ticks, the mt-genome features a compact structure, which usually contains two conserved site-specific NCRs and several genus-specific conserved NCRs [19,27,28,34,39]. The larger NCR is located between rrnS-trnI and is approximately 200-400 bp long ( Table 3). The length of NCR in soft and hard  ticks averages 274 and 261 bp, respectively. The longest NCR is observed in species of the genus Ixodes with an average length of 336 bp. The shortest NCR is only 82 bp in Rhipicentor nuttalli, and the notably short NCR may be attributed to assembly errors. The other conservative NCRs are located between rrnL and trnV, and the length of this region varies greatly. The shortest is only 155 bp in Amblyomma triguttatum, and the longest reaches 565 bp in Argas lagenoplastis. The difference in the average length between the soft and hard ticks is only 1 bp (251 and 252 bp, respectively). The length difference of this type of NCR in ticks is often significant within a genus, except for the genus Haemaphysalis, which shares a similar length of 150 bp. In addition to the abovementioned two NCRs, there is another NCR located between trnL1 and trnC in hard ticks. It is possible that the two related genes (trnL1 and trnC) may be involved in gene rearrangement, and hence the NCRs may act as a fragment insertion and play specific roles during gene transcription [81,82]. Additionally, some ticks also exhibit other NCRs, such as Dermacentor nitens and A. triguttatum, which display five NCRs. These NCRs may play important roles in protecting gene function during gene rearrangement, and there are currently four hypotheses to explain the formation of these particular NCRs [27,33,41,74]. It is noteworthy that a common marker sequence is found in the NCRs of the tick mt-genomes, which are formed by degeneration during evolution and named the "Tick-box" [39]. This conserved sequence is located at the boundary of two gene rearrangement regions in the tick mt-genomes, which may be affected by the arrangement of mitochondrial genes in ticks [27,36]. However, this sequence is not discarded during long-term evolution and likely functions as a transcriptional maturation or termination signal. Annotation of these sequences can help identify hidden molecular functions, which is useful for genetic analysis of higher taxa [39].

Mt-genome phylogeny
The mt-genomes play an important role in the molecular systematics and origin of ticks. In the present study, 13 PCGs and 2 rRNA genes from the MITOS analysis results of all available tick complete mt-genomes were used to construct a phylogenetic tree through the maximum likelihood method (ML) [83]. MEGA v.6.0 for Windows (https ://www.megas oftwa re.net/) was first used for alignment and splicing, and then the IQ-Tree online server (http://iqtre e.cibiv .univi e.ac.at/) was used for establishment of the phylogenetic tree with 1000 bootstrap replications [84,85]. The phylogenetic tree was constructed using the nucleotide sequences (12,150 bp) of 63 tick species. Limulus polyphemus (NC003057) was used as the outgroup and the percentage of the bootstrap support is given at each node.
In soft ticks, some species in Argas and Ornithodoros have previously been phylogenetically analyzed using 10 mitochondrial genes [27]  zumpti. These were incorporated into the present phylogenetic analysis using 13 PCGs and 2 rRNA genes. Results yielded ambiguous species delimitation and phylogenetic relationships of these two genera (Fig. 2), which are complicated with the existing of monophyly, paraphyly, or polyphyly phenomena. Possibly, the concatenation of present genes with other informative genes help a better phylogenetic resolution. The tick Ar. boueti was clustered within the subfamily Ornithodorinae with a minimum bootstrap of 51%. This clustering may influence the location of other genera, including Antricola, Nothoaspis and Carios. Additionally, the tick Carios faini was clustered first with Antricola mexicanus and Nothoaspis amazoniensis, as well as with C. capensis. Subsequently, the incongruence was apparent between phylogenetic configurations and morphological characterizations, which requires further evidential confirmation.
In hard ticks, Rhipicentor nuttalli was clustered with species within the genus Rhipicephalus, which provided corroborative evidence for their close relationship. Although most clades among the hard ticks in different genera showed moderate support and the clustering of the tick lineages were similar to previous studies [25], some particular species including Amblyomma elaphense, Am. spnenodonti and Hylomma asiaticum require total evidence support. The only tick in the family Nuttalliellidae, Nuttalliella namaqua, is the sister group of the family Ixodidae, which is similar to the previous mt-genome phylogenetic analysis [27].
ML analysis of mitochondrial genes is widely used in the molecular systematics of ticks [19,29,34]. Although there were some changes in our results, the phylogenetic branching results were similar to those obtained based on ten PCGs [27]. This finding suggests that the combination of more mitochondrial genes may provide more robust evidence for tick taxonomy. Different mitochondrial genes or sites usually have different evolutionary rates, which may affect the topological structure and lower the support rate of the phylogenetic tree, thereby affecting the reliability of phylogenetic results [86,87]. When the data matrix is partitioned according to both genes and coding sites, the phylogenetic calculation will be difficult to converge, which prevents phylogenetic analysis using a large number of mitochondrial genes simultaneously [88]. Thus, most studies usually adopt different PCGs or gene loci with proper partition, and the calculation can be optimized by modifying gene loci and selecting appropriate phylogenetic tree methods [89,90]. Previous research based on morphological and nuclear rRNA data supported the cladistic results of Klompen et al. [19,91]. The results obtained by combining multiple mitochondrial PCGs are partly different from those obtained using nuclear rRNA alone. Although some genera clades may change with the increasing number of mt-genomes, most genera remain clustered in the same clades [31][32][33][34] (Fig. 2). Molecular evidence based on the mt-genomes largely does not disagree with the recognized phylogenetic status of many tick species [12]. The description of new species and the characterization of new genetic markers will serve to systematically classify ticks [92].

Perspectives and future directions
Ticks and mites of the subphylum Chelicerata account for 53% of parasitic arthropods, which cause substantial losses in agriculture and human health [93]. In recent years, the mt-genomes have shown significant advantages and have been widely used in taxonomic and phylogenetic research [19,36,94]. However, challenges still exist in systematic investigations on the tick mt-genomes. The number of available mt-genomes remains limited, as only 63 complete tick mt-genomes are presently available in the NCBI database; the complete mt-genomes of approximately 93% of tick species remain unexplored. The absence of complete tick mt-genomes, especially for some soft ticks with geographical and taxonomic bias will undoubtedly hinder the reliability of the cladistics (phylogenetic) of the species within subclass Acari, order Ixodida. The different evolution rates of mitochondrial genes may lead to variation in gene length of many species, and different sequences. It should be mentioned that the annotation methods would be also able to affect the sequence assembly [94,95]. Furthermore, the mitochondrion is essential for energy metabolism and temperature regulation in metazoans [96]. Previous studies have shown that the mitochondrial genes have significantly different transcriptional activities during the freezing or anoxia adaptation and organism development [97][98][99][100]. The differential expression of specific functional genes may attribute to adaptive evolution [101]. Finally, no genes are encoded by the NCRs; therefore, NCRs receive less selection pressure during the process of evolution and are prone to base mutations [102]. NCRs can regulate gene expression and have many multiple tandem repeats and complex structures; hence, NCRs are more difficult to sequence [18,102]. The tick mt-genomes are characterized by two typical conserved NCRs, but there are significant differences in the length, number, and location among the different species.
Due to the above challenges, several important directions for future research on the tick mt-genomes were prospected. First, more complete mt-genome sequences, combing with morphological characteristics and nucleus sequences, are required to integrately illuminate the phylogenetic relationships within Ixodida. Secondly, through Fig. 2 The phylogenetic tree shows the evolutionary relationships among tick species based on the complete mt-genome (13 PCGs and 2 rRNA). The tree was constructed using ML analysis of the 13 PCGs and 2 rRNA nucleotide sequences (12,150 bp) of 63 tick species. Limulus polyphemus (NC003057) is the outgroup. In the phylogenetic tree, the scale-bar represents the number of expected changes per site. Percentage of the bootstrap support is given at each node. The gray, red and green areas indicate species of Nuttalliellidae, Argasidae and Ixodidae, respectively. GenBank accession numbers are listed in Table 1 extensive practices, mt-genome annotation methods are constantly improving [94]. However, annotation of a genome is still challenging, as different annotation methods may result in annotation bias or errors [102]. Hence, it is important to use unified annotation methods to help reduce or eliminate incorrect sequencing errors, and more attention should be given to NCRs. Thirdly, the functions and physiological relevance of the tick mitochondrial genes, including mitochondrial transcription, proteomics analysis of mitochondrial proteins, and epigenetic regulation in mitochondria under environmental or physiological stress, warrant further investigation. Finally, it is of considerable practical and theoretical interest to determine whether insecticides and acaricides can act on tick mitochondrial PCGs, which have been previously proved in mites [103,104]. This knowledge may provide new molecular biology information to further understand the genetic diversity of ticks, and shed light on novel strategies to control TBDs damage.

Conclusions
This study summarizes the basic features, including genomic structure, base difference and gene arrangement, of the tick mt-genomes available in the NCBI database. Research on tick mt-genomes has lagged behind that conducted in insects. Fortunately, an increasing number of mt-genomes have been published in recent years, and these have become important molecular markers for the phylogeny of ticks. Our study constructed a phylogenetic tree by maximum likelihood using 13 PCGs and 2 rRNA genes, and the results further supported the phylogenetic status of many tick species. Undoubtedly, the application of polygenic joint analysis and appropriate software will be widely applied in solving the phylogenetic and genetic evolution of diverse taxa of ticks, which will be of profound significance for the rapid identification of tick species.