Variation of mitochondrial minichromosome composition in Hoplopleura lice (Phthiraptera: Hoplopleuridae) from rats

Background The family Hoplopleuridae contains at least 183 species of blood-sucking lice, which widely parasitize both mice and rats. Fragmented mitochondrial (mt) genomes have been reported in two rat lice (Hoplopleura kitti and H. akanezumi) from this family, but some minichromosomes remain unidentified in their mt genomes. Methods We sequenced the mt genome of the rat louse Hoplopleura sp. with an Illumina platform and compared its mt genome organization with those of H. kitti and H. akanezumi. Results The fragmented mt genome of the rat louse Hoplopleura sp. contains 37 genes on 12 circular mt minichromosomes. Each mt minichromosome is 1.8–2.7 kb long and contains 1–5 genes and one large non-coding region. The gene content and arrangement of mt minichromosomes of Hoplopleura sp. (n = 3) and H. kitti (n = 3) are different from those in H. akanezumi (n = 3). Phylogenetic analyses based on the deduced amino acid sequences of the eight protein-coding genes showed that Hoplopleura sp. was more closely related to H. akanezumi than to H. kitti, and that these three species formed a monophyletic group. Conclusions Comparison among the three rat lice revealed variation in the composition of mt minichromosomes within the genus Hoplopleura. Hoplopleura sp. is the first species from the family Hoplopleuridae for which a complete fragmented mt genome has been sequenced. The new data provide useful genetic markers for studying the population genetics, molecular systematics and phylogenetics of blood-sucking lice.


Background
Blood-sucking lice are known vectors that transmit various disease agents and cause significant vector-borne diseases in humans and in domestic and wild mammals [1]. The family Hoplopleuridae contains at least 183 described species of blood-sucking lice currently classified into eight genera [2]. Of the eight genera, Hoplopleura Enderlein, 1904 is the most species-rich (165 described species) and is found on rodents [3]. Hoplopleura spp. are common ectoparasites of both mice and rats, causing pruritus, alopecia, dermal irritation and even anemia.
To understand the composition of mt minichromosomes in species of the same genus, Hoplopleura, we sequenced the complete mt genome of the rat louse Hoplopleura sp., compared its mt genome organization with those of two other Hoplopleura species, and reconstructed its phylogenetic relationships within the suborder Anoplura using protein sequences derived from coding genes.

Sample collection and DNA extraction
Adult specimens of Hoplopleura sp. were collected from Edward's long-tailed rats (Leopoldamys edwardsi) in Chongqing, China. The species identity of the examined wild rats was determined by PCR-based sequencing of the mitochondrial (mt) cox1 gene using an established method [16]. The rat lice were washed five times in physiological saline solution, identified preliminarily to the genus level (as Hoplopleura sp.) based on morphological features [2], and stored in 70% (v/v) ethanol at −20 °C. Whole genomic DNA, including nuclear and mt DNA, was extracted from 50 individual rat lice (25 females and 25 males) using the DNeasy Tissue Kit (Promega, Madison, USA) according to the manufacturer's recommendations. The identity of these specimens was further confirmed by polymerase chain reaction (PCR) amplification and subsequent sequencing of the mt cox1 and rrnS genes using the primer pairs L6625 (5′-CCG GAT CCT TYT GRT TYT TYG GNC AYC C-3′) and H7005 (5′-CCG GAT CCA CNA CRT ART ANG TRT CRT G-3′) for cox1, and 12SA (5′-TAC TAT GTT ACG ACT TAT-3′) and 12SB (5′-AAA CTA GGA TTA GAT ACC C-3′) for rrnS.
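The cox1 primers contain IUPAC ambiguity codes (Y, R, N), so each is in effect a pool of oligonucleotides. The following minimal sketch counts how many distinct sequences each primer represents. The primer strings are copied from the text; the function and variable names are illustrative assumptions and are not part of the original protocol.

```python
from math import prod

# Standard IUPAC nucleotide codes and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as printed in the text (5' to 3'); spaces are removed below.
PRIMERS = {
    "L6625": "CCG GAT CCT TYT GRT TYT TYG GNC AYC C",
    "H7005": "CCG GAT CCA CNA CRT ART ANG TRT CRT G",
    "12SA": "TAC TAT GTT ACG ACT TAT",
    "12SB": "AAA CTA GGA TTA GAT ACC C",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return prod(len(IUPAC[base]) for base in bases)


for name, seq in PRIMERS.items():
    length = len(seq.replace(" ", ""))
    print(f"{name}: {length} nt, degeneracy {degeneracy(seq)}")
```

Under these standard codes, L6625 and H7005 expand to 128 and 256 sequences, respectively, while the two rrnS primers are non-degenerate.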

Sequencing and assembling
The purity of the extracted whole genomic DNA was assessed by agarose-gel electrophoresis [17]. The DNA concentration was determined using a Quantus Fluorometer (Invitrogen, Carlsbad, USA). A paired-end genomic DNA library (350 bp inserts) was constructed for high-throughput sequencing with MiSeq PE300 (Illumina, San Diego, CA, USA), and the collected raw reads were exported in FASTQ format. The raw reads were filtered by removing adaptor reads, redundant reads and 'N'-rich reads. In total, 2 Gb of clean data (256 bp paired-end reads) were produced for this rat louse. Contigs were assembled de novo from the Illumina sequence reads using Geneious 11.1.5 [18], based on the relatively conserved cox1 and rrnS sequences. The assembly parameters were a minimum overlap identity of 99% and a minimum overlap of 150 bp. The two ends of each contig overlapped, indicating a circular organization of the minichromosome. We observed in previous studies that each mt minichromosome has a distinct coding region but a well-conserved non-coding region [10][11][12][13]. The non-coding region sequences conserved between the cox1 and rrnS minichromosomes were identified and used as references to align the Illumina sequence dataset with BLAST. We then assembled the remaining minichromosomes individually in full length using the same method as described above for the cox1 and rrnS minichromosomes.
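The circularity criterion described above (the two ends of a contig overlap) can be illustrated with a short sketch. This is not the Geneious algorithm; it is a simple ungapped end-to-start comparison, assuming the same thresholds as the assembly (at least 150 bp overlap at 99% identity or more). The function names and the synthetic test contig are hypothetical.

```python
import random


def terminal_overlap(contig: str, min_overlap: int = 150,
                     min_identity: float = 0.99) -> int:
    """Length of the longest ungapped overlap between the end and the start
    of a contig that meets both thresholds, or 0 if there is none."""
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        head, tail = contig[:k], contig[-k:]
        matches = sum(a == b for a, b in zip(head, tail))
        if matches / k >= min_identity:
            return k
    return 0


def circularize(contig: str, **thresholds):
    """Trim the redundant terminal copy; return None if no overlap is found."""
    k = terminal_overlap(contig, **thresholds)
    return contig[:-k] if k else None


# Synthetic example: a random 2,000 bp circle read out linearly with the
# first 200 bp repeated at the end, as an assembler might report it.
rng = random.Random(0)
circle = "".join(rng.choices("ACGT", k=2000))
linear_contig = circle + circle[:200]

overlap = terminal_overlap(linear_contig)
trimmed = circularize(linear_contig)
print(overlap, len(trimmed))
```

A real assembly would also need to tolerate indels in the overlap, which this ungapped comparison does not.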

Annotation
Sequences were aligned against the available mt minichromosome sequences of the rat louse H. kitti [7] using MAFFT 7.122 [19] to identify gene boundaries. Protein-coding genes and rRNA genes were identified with BLAST searches of the NCBI database. The amino acid sequence of each protein-coding gene was inferred using MEGA 6.0 [20]. tRNA genes were identified using ARWEN [21] and tRNAscan-SE [22], with manual adjustment.

Verification of mt minichromosomes
The size of each mt minichromosome of Hoplopleura sp. was verified by PCR using specific primers (Table 1). The forward and reverse primers in each pair were adjacent to each other, separated by a small gap (10-50 bp). PCR with these primers amplified each circular minichromosome in full length (Fig. 1). To obtain full-length sequences of the non-coding regions of the minichromosomes, the positive amplicons were also sequenced with high-throughput sequencing as described above.

Phylogenetic analysis
The phylogenetic relationships among representatives of the blood-sucking lice of the suborder Anoplura were assessed based on concatenated amino acid sequences (Table 2), using one elephant louse, H. elephantis (GenBank: KF933032-41), as the outgroup [10]. Eight amino acid sequences (all protein-coding genes except nad1, nad2, nad3, nad4 and nad5, which were unidentified in some blood-sucking lice) were aligned individually using MAFFT 7.122 and then concatenated to form a single dataset; ambiguously aligned regions were excluded using Gblocks 0.91b with default parameters [23]. MtArt + I + G + F was selected as the most appropriate evolutionary model by ProtTest 2.4 based on the Akaike information criterion (AIC) [24]. Phylogenetic analyses were conducted with maximum likelihood (ML) using PhyML 3.0 with a BioNJ starting tree, and the tree topology search was set to the subtree pruning and regrafting (SPR) method [25]. Bootstrap support was calculated using 100 replicates. Phylograms were drawn using FigTree v.1.31.
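The gene selection and concatenation step can be sketched as follows. The list of 13 protein-coding genes is the standard animal mt set and the five excluded genes are those named in the text; the toy alignments, the taxon names and the gap-padding of missing taxa are illustrative assumptions, not the authors' pipeline.

```python
# The 13 protein-coding genes typical of animal mt genomes.
ALL_PCGS = ["atp6", "atp8", "cox1", "cox2", "cox3", "cytb",
            "nad1", "nad2", "nad3", "nad4", "nad4L", "nad5", "nad6"]
# Genes excluded in the text because they were unidentified in some lice.
EXCLUDED = {"nad1", "nad2", "nad3", "nad4", "nad5"}
USED_GENES = [gene for gene in ALL_PCGS if gene not in EXCLUDED]


def concatenate(alignments, gene_order):
    """Join per-gene alignments (gene -> taxon -> aligned sequence) into one
    supermatrix; a taxon missing from a gene is padded with gaps."""
    taxa = sorted({taxon for gene in gene_order for taxon in alignments[gene]})
    supermatrix = {taxon: "" for taxon in taxa}
    for gene in gene_order:
        aligned = alignments[gene]
        widths = {len(seq) for seq in aligned.values()}
        if len(widths) != 1:
            raise ValueError(f"{gene}: sequences differ in aligned length")
        width = widths.pop()
        for taxon in taxa:
            supermatrix[taxon] += aligned.get(taxon, "-" * width)
    return supermatrix


# Toy alignments with made-up sequences, two genes only.
toy = {
    "cox1": {"taxon_A": "MKL-", "taxon_B": "MKLV"},
    "cytb": {"taxon_A": "FFG", "taxon_B": "FYG", "taxon_C": "FFG"},
}
matrix = concatenate(toy, ["cox1", "cytb"])
print(USED_GENES)
print(matrix)
```

In this sketch the concatenation happens before any trimming, matching the order given in the text, where Gblocks is applied to exclude ambiguously aligned regions.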

Results and discussion
Identity of the rat louse Hoplopleura sp.
Hoplopleura sp. shows close morphological and morphometric similarity to H. kitti recovered from the same host (L. edwardsi). The mt cox1 and rrnS genes of Hoplopleura sp. shared 76% and 77.6% identity with previously published sequences of H. kitti (GenBank: KJ648943) from Berylmys bowersi and H. akanezumi (GenBank: KJ648928) from Apodemus chevrieri in China, respectively.
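Percent identity values such as those above are computed from a pairwise alignment. The sketch below shows one common definition (matches divided by aligned, gap-free columns). The text does not state which definition or software was used, so the gap handling here is an assumption, and the example sequences are made up.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length,
    counting only columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (hypothetical, not from the deposited sequences).
print(percent_identity("AAAA", "AAAT"))
print(percent_identity("ACGT-A", "ACGTTA"))
```

Other conventions divide by the full alignment length or by the shorter sequence, which gives lower values when gaps are present.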

General features of the mt genome of the rat louse Hoplopleura sp.
We sequenced the Hoplopleura sp. genome and produced 3 Gb of Illumina short-read sequence data, comprising a total of 6,526,349 × 2 raw reads from adults of Hoplopleura sp. After quality filtering, 3,937,826 × 2 clean reads (2 Gb) were retained for assembly of the mt genome. We assembled these sequence reads into contigs and identified the 37 mt genes typical of bilaterian animals (Fig. 2; Table 3). These genes are on 12 minichromosomes; each minichromosome is 1.8-2.7 kb in size and consists of a coding region and a non-coding region (NCR) in a circular organization (Table 3). The coding regions have 1-5 genes each and vary in size from 675 to 1760 bp (Table 3). All genes are transcribed in the same direction except for the nad1 gene. The nucleotide sequences of the mt minichromosomes of Hoplopleura sp. were deposited in the GenBank database under the accession numbers MT792483-MT792494.
We sequenced the full-length non-coding regions of all 12 mt minichromosomes of Hoplopleura sp., which range from 935 bp (H-nad5-F minichromosome) to 1305 bp (C-nad6-W-L2 minichromosome) (Table 3). The longest non-coding region of Hoplopleura sp. is shorter than the longest non-coding regions known from other sucking lice, such as pig lice (2370 bp) [6] and horse lice (3276 bp) [13]. As in the human lice [12], rat lice [7] and pig lice [6], each coding region of Hoplopleura sp. is flanked by a conserved AT-rich non-coding motif (88 bp, 71.6% AT) upstream and a GC-rich motif (39 bp, 79.5% GC) downstream, suggesting that these motifs are functionally significant in the mt genomes of blood-sucking lice.
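As a quick consistency check, the clean-read count can be converted to a total yield using the 256 bp read length stated in the Methods. This is simple arithmetic on the reported figures; the variable names are illustrative.

```python
READ_LENGTH_BP = 256          # clean read length given in the Methods
CLEAN_READ_PAIRS = 3_937_826  # clean read pairs reported above

# Two reads per pair, each READ_LENGTH_BP long.
clean_bases = CLEAN_READ_PAIRS * 2 * READ_LENGTH_BP
print(f"{clean_bases:,} bp = {clean_bases / 1e9:.2f} Gb")
```

The result, about 2.02 Gb, agrees with the 2 Gb of clean data reported.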
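The base-composition figures for the two flanking motifs can be related to base counts with a small sketch. The helper names and the toy sequence are assumptions; the motif lengths and percentages are those given in the text, and the counts are inferred from them by rounding rather than taken from the sequences themselves.

```python
def base_fraction(seq: str, bases: str) -> float:
    """Fraction of positions in seq occupied by any of the given bases."""
    seq = seq.upper()
    return sum(seq.count(base) for base in bases) / len(seq)


def implied_count(length: int, percent: float) -> int:
    """Whole number of positions implied by a reported length and percentage."""
    return round(length * percent / 100)


# Reported motifs: 88 bp at 71.6% AT and 39 bp at 79.5% GC.
at_positions = implied_count(88, 71.6)
gc_positions = implied_count(39, 79.5)
print(at_positions, round(100 * at_positions / 88, 1))
print(gc_positions, round(100 * gc_positions / 39, 1))

# Toy sequence (hypothetical) showing how the fraction itself is computed.
print(base_fraction("ATATGC", "AT"))
```

Under this reading, the reported values correspond to 63 A or T positions out of 88 and 31 G or C positions out of 39.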

Annotation
The boundaries between protein-coding genes of the mt genome of Hoplopleura sp. were determined by aligning its sequences with those of H. kitti and H. akanezumi and identifying translation initiation and termination codons [7]. The Hoplopleura sp. mt genome encodes 13 protein-coding genes. These genes use three termination codons (TAA, TAG and the incomplete codon T). TAG is the most frequently used (five genes: cox1, nad2, nad3, nad4L and cytb), followed by TAA (four genes: cox2, atp6, atp8 and nad4). The cox3, nad1, nad5 and nad6 genes use T as an incomplete termination codon. Incomplete termination codons (TA and T) of protein-coding genes are commonly found in other mt genomes of blood-sucking lice, including H. suis [6], H. apri [6], H. asini [13], H. kitti [7], P. asiatica [8], P. spinulosa [8], P. schaeffi [9], M. praelongiceps [11] and P. pubis [12]. In the mt genome of Hoplopleura sp., the sizes of the rrnL and rrnS genes were 1125 bp and 675 bp, respectively. The 22 tRNA genes ranged from 59 to 71 bp in size. The secondary structure predictions in Hoplopleura sp. (not shown) were similar to those of H. kitti and H. akanezumi [7].

Table 1 PCR primers used to amplify and sequence the mitochondrial genome of the rat louse, Hoplopleura sp.

Variation in mt minichromosome composition among three rat lice
The complete mt genome of Hoplopleura sp. is fragmented into 12 circular minichromosomes. The incomplete mt genomes of H. kitti and H. akanezumi each have 11 identified circular minichromosomes [7]. Eleven minichromosomes of Hoplopleura sp. have the same gene content and gene arrangement as their counterparts in H. kitti. Eight of these minichromosomes, shared by Hoplopleura sp. and H. kitti, have the same gene content and gene arrangement as their counterparts in H. akanezumi [7]. However, two other minichromosomes of Hoplopleura sp. are not present in the same form in H. akanezumi [7]. In Hoplopleura sp., one of these minichromosomes has four genes, D-Y-cox2-T (Fig. 2), whereas in H. akanezumi the corresponding minichromosome has only three genes, D-Y-cox2. Similarly, another minichromosome of Hoplopleura sp. has five genes, R-nad4L-P-cox3-A (Fig. 2), whereas in H. akanezumi the corresponding minichromosome has six genes, R-nad4L-P-cox3-A-T (Fig. 3). Interestingly, a chimeric minichromosome has been found in H. akanezumi that contains parts of the two rRNA genes, prrnL and prrnS, which are only 5% (51 bp) and 24% (172 bp) of the full-length rrnL and rrnS, respectively [7]. This chimeric minichromosome has not been identified in H. kitti or Hoplopleura sp.
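The difference between the two species described above amounts to a single tRNA gene (T) sitting on a different minichromosome. The sketch below makes that comparison explicit by pairing each Hoplopleura sp. minichromosome with the H. akanezumi minichromosome that shares the most genes. The gene orders are those given in the text; the pairing rule and function names are illustrative assumptions.

```python
# Gene orders of the two differing minichromosomes, as given in the text.
HOPLOPLEURA_SP = ["D-Y-cox2-T", "R-nad4L-P-cox3-A"]
H_AKANEZUMI = ["D-Y-cox2", "R-nad4L-P-cox3-A-T"]


def genes(label: str) -> list:
    """Split a minichromosome label such as 'D-Y-cox2-T' into gene names."""
    return label.split("-")


def best_match(label: str, candidates: list) -> str:
    """Candidate minichromosome sharing the most genes with the given one."""
    shared = lambda other: len(set(genes(label)) & set(genes(other)))
    return max(candidates, key=shared)


def compare(set_a: list, set_b: list) -> list:
    """For each minichromosome in set_a, report its best match in set_b and
    the genes found only on one side of the pair."""
    report = []
    for label in set_a:
        other = best_match(label, set_b)
        only_a = sorted(set(genes(label)) - set(genes(other)))
        only_b = sorted(set(genes(other)) - set(genes(label)))
        report.append((label, other, only_a, only_b))
    return report


differences = compare(HOPLOPLEURA_SP, H_AKANEZUMI)
for label, other, only_a, only_b in differences:
    print(f"{label} vs {other}: only in Hoplopleura sp. {only_a}, "
          f"only in H. akanezumi {only_b}")
```

Both pairs differ by T alone, once in each direction, which is what a relocation of that tRNA gene between the two minichromosomes would produce.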

Comparative mt genomic analyses of Hoplopleura sp. with H. kitti and H. akanezumi
A comparison of the nucleotide and amino acid sequences of each protein-coding gene (except for nad1, nad3 and nad5) of the three Hoplopleura species is given in Table 4. Pairwise comparisons of the nucleotide and amino acid sequences revealed identities of 50.6-77.2% and 37.5-90.2% among them, respectively. The greatest nucleotide variation was in the atp8 gene (49.4%), whereas the lowest difference (22.8%) was detected in the cox1 gene (Table 4). The difference across both concatenated nucleotide and amino acid sequences of the ten protein-coding genes was 37.5% and 36.8% between Hoplopleura sp. and H. kitti, 36.7% and 34.7%

Phylogenetic relationships
In the present study, phylogenetic analysis of the concatenated amino acid sequence datasets for eight mt protein-coding genes (Fig. 4) Figure S1). Johnson et al. [26] established a robust and stable higher-level systematics for the order Phthiraptera based on analyses of 1107 single-copy orthologous genes from the sequenced genomes of 46 species of lice. Their results indicated, with strong bootstrap support, that the genera Hoplopleura and Haematopinus were more closely related to each other than either was to the genus Pediculus [26]. However, mt genomic phylogenetic relationships deviate from phylogenies derived from the nuclear genome. Shao et al. [11] performed a phylogenetic analysis with mt genomes, which indicated, with strong bootstrap support, that the genera Haematopinus and Pediculus were more closely related to each other than either was to the genus Hoplopleura. Our result also showed that the genera Haematopinus and Pediculus were more closely related to each other than either was to the genus Hoplopleura, but with weak bootstrap support (bootstrap value = 55) (Fig. 3). Although the number of sucking louse mt genome sequences is increasing, many lineages of sucking lice remain underrepresented or unrepresented. Insufficient taxon sampling of mt genomes for the suborder Anoplura might be the cause of the discordance between the mt and nuclear phylogenies.
Many studies have indicated that the mt genome sequence is a valuable genetic marker for phylogenetic studies at various taxonomic levels of different organisms [27,28], including lice [14,15]. The fragmentation of the mt genome may have arisen independently in multiple louse clades. Therefore, the mt genome sequences of the rat louse Hoplopleura sp. could help to reassess the systematic relationships of lice within the suborder Anoplura using mt genomic datasets. No species from the other genera (Ancistroplax, Ferrisella, Haematopinoides, Paradoxophthirus, Pterophthirus, Schizophthirus and Typhlomyophthirus) within the family Hoplopleuridae was included in our analyses. Expanded taxon sampling is therefore necessary for future phylogenetic studies of the family Hoplopleuridae using mt genomic datasets.

Conclusions
Comparison among the three rat lice revealed variation in the composition of mt minichromosomes among species of the genus Hoplopleura. Hoplopleura sp. is the first species from the family Hoplopleuridae for which a complete fragmented mt genome has been sequenced. The new data provide useful genetic markers for studying the population genetics, molecular systematics and phylogenetics of blood-sucking lice.