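Table 4. Summary statistics of gene prediction programs

| Program^a | # ORFs^b | Total bp | % of Genome | Average Length (bp) | % Expressed (FPKM^c > 0) |
|---|---|---|---|---|---|
| Prodigal | 1,767 | 1,512,795 | 83 | 856 | 56 |
| GeneMark | 2,174 | 1,565,122 | 86 | 720 | 52 |
| Glimmer^d | 2,250 | 1,544,913 | 85 | 687 | 44 |
| NCBI^e | 1,760 | 1,518,292 | 83 | 839 | 64 |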
  • ORFs, open reading frames; FPKM, fragments per kilobase of transcript per million mapped reads; NCBI, National Center for Biotechnology Information.

  • a Four separate gene prediction programs were used to predict genes from the same genomic sequence.

  • b ORFs are sequences of nucleotides that could potentially code for proteins.

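  • c FPKM is a measure of expression from RNA sequencing, normalized for transcript length and sequencing depth; an ORF with FPKM > 0 has at least one mapped fragment.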

  • d An optimized Glimmer3 was used in the RAST annotation pipeline; these are the results from the RAST output.

  • e NCBI gene predictions are from the NCBI Prokaryotic Genome Annotation Pipeline, which generates them with GeneMarkS+.
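
To make the "% Expressed (FPKM > 0)" column concrete, the following is a minimal sketch of how FPKM, and the share of ORFs with FPKM > 0, could be computed from per-ORF fragment counts. The function names, the example ORF identifiers, and the counts are hypothetical and are not taken from the study. The sketch also assumes that the total number of mapped fragments is the sum over the listed ORFs, which may differ from how the original analysis normalized.

```python
from typing import Dict


def fpkm(fragments: int, length_bp: int, total_mapped: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    # fragments / (length_bp / 1e3) / (total_mapped / 1e6)
    return fragments * 1e9 / (length_bp * total_mapped)


def percent_expressed(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> float:
    """Percentage of ORFs whose FPKM is greater than zero."""
    # Assumption: library size is the sum of fragments over these ORFs.
    total_mapped = sum(counts.values())
    if total_mapped == 0 or not counts:
        return 0.0
    expressed = sum(
        1 for orf, n in counts.items()
        if fpkm(n, lengths_bp[orf], total_mapped) > 0
    )
    return 100.0 * expressed / len(counts)


# Hypothetical fragment counts and ORF lengths, for illustration only.
example_counts = {"orf1": 120, "orf2": 0, "orf3": 30, "orf4": 0}
example_lengths = {"orf1": 856, "orf2": 720, "orf3": 687, "orf4": 839}

print(percent_expressed(example_counts, example_lengths))
```

Because FPKM is proportional to the fragment count, the FPKM > 0 criterion is equivalent to requiring at least one mapped fragment; the normalization affects the magnitude of expression, not whether an ORF counts as expressed.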