Table 2 The S. glanis genome assembly and annotation statistics
| Statistic | Value |
| --- | --- |
| Genome assemblyᵃ | |
| Contig statistics | |
| Number of contigs | 105,816 |
| Total contig size (bp) | 712,999,588 |
| Contig N50 size (bp) | 13,869 |
| Largest contig (bp) | 140,841 |
| Scaffold statistics | |
| Number of scaffolds | 25,703 |
| Total scaffold size (bp) | 793,358,859 |
| Scaffold N50 size (bp) | 3,169,562 |
| Largest scaffold (bp) | 13,715,129 |
| GC content (%) | 39.2 |
| Unknown base (%) | 10.1 |
| BUSCO genome completeness | |
| Complete | 3,859 (84.2%) |
| Complete and single copy | 3,717 (81.1%) |
| Complete and duplicated | 142 (3.1%) |
| Fragmented | 312 (6.8%) |
| Missing | 413 (9.0%) |
| Annotation | |
| Number of protein-coding genes | 21,316 |
| with partial EST support | 10,260 |
| with > 90% EST support | 4,989 |
| with full length EST support | 3,795 |
| with > 100 RNAseq reads aligned | 17,330 |
| with > 10 RNAseq reads aligned | 19,855 |
| Number of functionally-annotated proteins | 20,532 |
| Mean protein length (interquartile range, aa) | 501 (218-617) |
| Longest protein (aa) | 27,306 (titin-like) |
| Average number of exons per gene (mean length, interquartile range) | 9 (212, 89-194 bp) |
| Average number of introns per gene (length, interquartile range) | 8 (1,208, 133-1,274 bp) |
| BUSCO completeness of the predicted gene models | |
| Complete | 3,427 (74.8%) |
| Complete and single copy | 3,248 (70.9%) |
| Complete and duplicated | 179 (3.9%) |
| Fragmented | 403 (8.8%) |
| Missing | 754 (16.4%) |
ᵃ Minimum scaffold length: 1 kb.
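
As an illustration of how the summary figures in Table 2 relate to one another, the following minimal Python sketch shows one common definition of N50 and re-derives two of the reported percentages from the counts in the table. It is not the pipeline used to produce the table. The function names (`n50`, `gap_percent`, `busco_percent`) are hypothetical, and the sketch assumes that the unknown-base fraction equals the difference between the total scaffold size and the total contig size, and that the BUSCO set size is the sum of the single-copy, duplicated, fragmented and missing counts.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths: the length L such that
    sequences of length >= L together cover at least half of the total size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one sequence length")


def gap_percent(contig_bp, scaffold_bp):
    """Percentage of scaffold bases not covered by contigs (assumed to be
    the unknown bases)."""
    return 100.0 * (scaffold_bp - contig_bp) / scaffold_bp


def busco_percent(count, total):
    """Percentage of the BUSCO set, rounded to one decimal as in the table."""
    return round(100.0 * count / total, 1)


# Figures copied from Table 2.
TOTAL_CONTIG_BP = 712_999_588
TOTAL_SCAFFOLD_BP = 793_358_859
BUSCO_GENOME = {"single": 3717, "duplicated": 142,
                "fragmented": 312, "missing": 413}
BUSCO_TOTAL = sum(BUSCO_GENOME.values())  # assumed size of the BUSCO set

if __name__ == "__main__":
    # Toy lengths only; the real contig lengths are not given in the table.
    print(n50([2, 3, 4, 5, 6]))                                    # 5
    print(round(gap_percent(TOTAL_CONTIG_BP, TOTAL_SCAFFOLD_BP), 1))  # 10.1
    complete = BUSCO_GENOME["single"] + BUSCO_GENOME["duplicated"]
    print(busco_percent(complete, BUSCO_TOTAL))                    # 84.2
```

Under these assumptions the derived values agree with the reported unknown-base percentage (10.1%) and the complete BUSCO percentage for the genome (84.2%).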