Table 1 Statistics of RNA-seq based sequencing, assembling and functional annotation for An. sinensis

From: De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

Sequencing results

| Statistic | Value |
| --- | --- |
| Number of total raw reads | 60,866,926 |
| Number of total clean reads | 51,606,364 |
| Number of total clean nucleotides (nt) | 4,644,572,760 |
| Q20 percentage of total clean reads | 95.92% |
| GC percentage of total clean nucleotides | 51.26% |
| N percentage of total clean nucleotides | 0.00% |
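The sequencing figures can be cross-checked with simple arithmetic. The sketch below divides the clean nucleotide count by the clean read count and computes the share of raw reads retained after cleaning. The variable names are illustrative, and the read length it derives is an inference from the two counts; the table itself does not state a read length.

```python
# Values copied from the sequencing rows of Table 1.
raw_reads = 60_866_926
clean_reads = 51_606_364
clean_nucleotides = 4_644_572_760

# Average clean read length; a zero remainder means the division is exact,
# which would be consistent with (but does not prove) a uniform read length.
read_length, remainder = divmod(clean_nucleotides, clean_reads)

# Share of raw reads that survived cleaning, as a percentage.
clean_fraction_pct = 100 * clean_reads / raw_reads

print(f"nt per clean read: {read_length} (remainder {remainder})")
print(f"clean reads retained: {clean_fraction_pct:.1f}%")
```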

Assembling results

| Statistic | Value |
| --- | --- |
| Number of unigenes | 38,504 (5,372 in distinct clusters; 33,132 singletons) |
| Total length (nt) of total unigenes | 21,977,286 |
| Mean length (nt) of total unigenes | 571 |
| N50 (nt) of total unigenes | 711 |
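For readers unfamiliar with N50: it is the length L such that sequences of length L or longer account for at least half of the total assembled length. The following is a minimal sketch of that common definition, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical. The last lines recompute the mean unigene length from the totals reported in the table.

```python
def n50(lengths):
    """Return the N50 of a non-empty collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first length at which the running sum reaches half
        # of the total length is the N50.
        if 2 * running >= total:
            return length


# Hypothetical toy data, not taken from the paper.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))

# Mean length recomputed from the table's totals.
total_length_nt = 21_977_286
unigene_count = 38_504
mean_length = round(total_length_nt / unigene_count)
print(mean_length)
```

Because N50 weights long sequences more heavily than the arithmetic mean does, an N50 (711 nt) above the mean (571 nt) is the usual pattern for transcriptome assemblies with many short singletons.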

Annotation (E-value ≤ 1e-5)

| Statistic | Value |
| --- | --- |
| Unigenes with Nr database | 25,456 (66% of 38,504 unigenes) |
| Unigenes with Nt database | 20,554 (53%) |
| Unigenes with Swiss-Prot database | 17,651 (46%) |
| Unigenes with KEGG database | 16,622 (43%), 257 pathways |
| Unigenes with COG database | 7,204 (19%), 25 functional categories |
| Unigenes with GO database | 16,588 (43%), 62 subcategories grouped into 3 main categories |
| GO: Biological process | 27 subcategories |
| GO: Cellular component | 17 subcategories |
| GO: Molecular function | 18 subcategories |
| Total unigenes annotated | 26,650 (69% of 38,504 unigenes) |
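The annotation percentages are each unigene count divided by the 38,504 total, rounded to a whole percent. The sketch below recomputes them and checks that the three GO main categories sum to the reported 62 subcategories. The dictionary keys and the helper `percent` are illustrative names, and taking the denominator as 38,504 for every row is an assumption based on the Nr and total rows, which state it explicitly.

```python
TOTAL_UNIGENES = 38_504

# Annotated unigene counts copied from Table 1.
annotated = {
    "Nr": 25_456,
    "Nt": 20_554,
    "Swiss-Prot": 17_651,
    "KEGG": 16_622,
    "COG": 7_204,
    "GO": 16_588,
    "Any database": 26_650,
}


def percent(count, total=TOTAL_UNIGENES):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


percentages = {db: percent(n) for db, n in annotated.items()}
for db, pct in percentages.items():
    print(f"{db}: {pct}%")

# GO subcategories per main category, as listed in the table.
go_subcategories = {
    "Biological process": 27,
    "Cellular component": 17,
    "Molecular function": 18,
}
go_total = sum(go_subcategories.values())
print(go_total)
```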