Solovyev V, Salamov A. (2011) Automatic Annotation of Microbial Genomes and Metagenomic Sequences. In Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies (Ed. R.W. Li), Nova Science Publishers, p.61-78.
General scheme of bacterial genome annotation (automatic pipeline: Fgenesb_annotator)
FGENESB is the fastest (the E. coli genome
is annotated in ~14 sec) and most accurate ab initio bacterial
operon and gene prediction program available; for more details, see
help. It uses genome-specific parameters learned by a training
script, which requires only the DNA sequence of the genome of interest
as input. The script automatically creates a file with gene prediction
parameters for the analyzed genome. Creating such a file for the
E. coli genome from its sequence took only a few minutes. If you
need parameters for a new bacterium, please contact Softberry;
we can include them in the web list.
In the current FGENESB version, a complex operon prediction model based on distances between genes is implemented. It accurately recognizes 70% of single transcription units and defines about 50% of operons exactly (~92% at least partially). Operon identification accuracy is further increased by using predicted promoters and terminators and by analyzing the conservation of gene neighborhoods across many bacterial genomes.
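To illustrate the general idea of distance-based operon grouping, here is a minimal Python sketch. It is not FGENESB's model: the `Gene` type, the `group_into_operons` function, the 100 bp `max_gap` threshold and the gene coordinates are all assumptions made for this example, and the promoter, terminator and gene-neighborhood evidence mentioned above is left out.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def group_into_operons(genes: List[Gene], max_gap: int = 100) -> List[List[Gene]]:
    """Group adjacent same-strand genes separated by at most max_gap bases.

    max_gap is a hypothetical threshold chosen for illustration only.
    """
    operons: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            prev = operons[-1][-1]
            gap = gene.start - prev.end - 1  # intergenic distance in bases
            if gene.strand == prev.strand and gap <= max_gap:
                operons[-1].append(gene)     # extend the current operon
                continue
        operons.append([gene])               # start a new transcription unit
    return operons


# Hypothetical gene coordinates, not taken from any real genome.
genes = [
    Gene("geneA", 100, 900, "+"),
    Gene("geneB", 950, 1800, "+"),
    Gene("geneC", 2500, 3200, "+"),
    Gene("geneD", 3250, 4000, "-"),
]
operons = group_into_operons(genes)
operon_names = [[g.name for g in op] for op in operons]
print(operon_names)
```

In this sketch geneA and geneB are grouped because they share a strand and lie 49 bases apart, while geneC and geneD remain single transcription units: geneC is too far from geneB, and geneD is on the opposite strand.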
We developed a new FGENESB-Annotator script that
finds similar proteins in public databases and annotates predicted
genes. This script can also identify low-scoring genes if they have
a known homologous protein. The script annotates CDS, promoters,
tRNA and rRNA. For more details, see FgenesB-Annotator manual.
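The rescue of low-scoring genes by homology can be sketched as a simple selection rule. The Python example below is only an illustration under stated assumptions: the `Prediction` type, the `select_genes` function, both cutoff values and the sample data are hypothetical and do not reflect the actual thresholds or logic of the Annotator script.

```python
from typing import Dict, List, NamedTuple


class Prediction(NamedTuple):
    gene_id: str
    score: float  # ab initio gene score (arbitrary units in this sketch)


def select_genes(
    predictions: List[Prediction],
    best_hit_evalue: Dict[str, float],
    score_cutoff: float = 10.0,     # hypothetical cutoff
    evalue_cutoff: float = 1e-10,   # hypothetical cutoff
) -> List[str]:
    """Keep high-scoring genes, plus low-scoring genes with a strong protein hit."""
    kept: List[str] = []
    for p in predictions:
        if p.score >= score_cutoff:
            kept.append(p.gene_id)   # accepted on its ab initio score
        elif best_hit_evalue.get(p.gene_id, 1.0) <= evalue_cutoff:
            kept.append(p.gene_id)   # rescued by a known homolog
    return kept


# Hypothetical predictions and best database-hit E-values.
predictions = [
    Prediction("g1", 25.0),
    Prediction("g2", 4.0),
    Prediction("g3", 3.5),
]
best_hit_evalue = {"g2": 1e-40, "g3": 0.5}
kept = select_genes(predictions, best_hit_evalue)
print(kept)
```

Here g1 passes on score alone, g2 is kept despite its low score because it has a strong database hit, and g3 is dropped because neither criterion is met.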
The annotation can be produced in GenBank format (see example later) and exported to bacterial genome browsers such as Artemis (Sanger Center) or Softberry GenomeSequence Explorer.
The FGENESB-Annotator script can also automatically annotate sets of sequences, such as scaffolds generated by sequencing a bacterial community. To separate archaeal sequences from bacterial sequences, which require different gene-finding parameters, use the ABSplit program.
Together, the FGENESB gene-finding program and the Train and
Annotator scripts constitute the FGENESB pipeline, the most comprehensive
tool for prokaryotic genome annotation. A description of the pipeline
is given here.