Gene Finding in Viral Genomes

The program's usage in scientific publications

The FGENESV algorithm is based on pattern recognition of different types of signals and on Markov chain models of coding regions. The optimal combination of these features is then found by dynamic programming, and a set of gene models is constructed along the given sequence.
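The two ingredients named above can be illustrated with a minimal sketch. It is not the FGENESV implementation: the function names `markov_log_odds` and `best_gene_set` are hypothetical, the Markov chain is assumed to be first-order (real coding models are usually higher-order and frame-aware), and candidates are assumed not to overlap, which real viral genes often do. The dynamic programming step shown here is plain weighted interval scheduling over scored candidates.

```python
import math
from bisect import bisect_right

def markov_log_odds(seq, coding, background):
    """Log-odds of a sequence under a coding vs. a background model.

    Both models are assumed to be first-order transition tables of the
    form model[prev_base][next_base] -> probability.
    """
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        score += math.log(coding[prev][cur] / background[prev][cur])
    return score

def best_gene_set(candidates):
    """Pick the non-overlapping candidates with the highest total score.

    candidates: list of (start, end, score) with end exclusive.
    Returns (total_score, chosen_candidates_in_genome_order).
    """
    cands = sorted(candidates, key=lambda c: c[1])
    ends = [c[1] for c in cands]
    best = [0.0] * (len(cands) + 1)   # best[i]: optimum over first i candidates
    take = [False] * len(cands)
    prev_idx = [0] * len(cands)
    for i, (start, _end, score) in enumerate(cands):
        # Number of earlier candidates that end at or before this start.
        p = bisect_right(ends, start, 0, i)
        prev_idx[i] = p
        if best[p] + score > best[i]:
            best[i + 1] = best[p] + score
            take[i] = True
        else:
            best[i + 1] = best[i]
    # Trace back through the table to recover the chosen candidates.
    chosen = []
    i = len(cands)
    while i > 0:
        if take[i - 1]:
            chosen.append(cands[i - 1])
            i = prev_idx[i - 1]
        else:
            i -= 1
    return best[-1], chosen[::-1]
```

In this sketch each candidate's score could come from `markov_log_odds` plus any signal scores, and `best_gene_set` then chooses the combination with the highest total.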

FGENESV is the fastest ab initio viral gene prediction program available.

We have developed a new FGENESV-Annotator script that finds similar proteins in public databases and annotates the predicted genes. The script can also identify low-scoring genes if they have a known homologous protein.

As an example of using FGENESV, the annotation of the SARS coronavirus TOR2 genome is presented:

Annotation of the complete genome of the SARS-associated coronavirus by the FgenesV-Annotator script.

There are two variants of the viral gene prediction program. FGENESV0, which is suited for small (<10 kb) genomes, uses generic parameters of coding regions. FGENESV learns genome-specific parameters using the viral genome sequence as input.
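To give a feel for what learning parameters from the input genome can mean, here is a minimal sketch that estimates first-order Markov transition probabilities from a sequence. How FGENESV actually trains its models is not described here; the function name `train_transitions`, the first-order model, and the pseudocount smoothing are all assumptions made for illustration.

```python
from collections import Counter

def train_transitions(seq, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(next_base | prev_base) from one sequence.

    A pseudocount is added to every transition so that transitions
    never observed in a short genome still get a non-zero probability.
    """
    seq = seq.upper()
    counts = Counter(zip(seq, seq[1:]))
    model = {}
    for a in alphabet:
        total = sum(counts[(a, b)] for b in alphabet) + pseudocount * len(alphabet)
        model[a] = {b: (counts[(a, b)] + pseudocount) / total for b in alphabet}
    return model
```

The shorter the genome, the fewer transitions are observed and the noisier such estimates become, which is one plausible reason to fall back on generic parameters for very small genomes.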

FGENESV predicts all intronless viral genes. To find the small group of genes that contain introns (normally alternative structures of intronless variants), standard eukaryotic gene-finding programs, such as FGENESH, can be used in addition to FGENESV.

As additional parameters, you can choose the Linear or Circular form of your virus and select an alternative genetic code (the Standard code is the default): the Bacterial and Plant Plastid Code (transl_table=11), or the Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code (transl_table=4).
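The effect of these two options can be seen in a small open reading frame scan. This is only a sketch under stated assumptions, not how FGENESV works: the function `find_orfs` and its parameters are hypothetical, only the forward strand and ATG starts are considered (tables 11 and 4 also allow alternative start codons), and every in-frame ATG is reported separately. The stop codon sets follow the NCBI translation tables, where TGA encodes tryptophan in table 4 instead of a stop.

```python
# Stop codons per NCBI translation table (only the tables mentioned above).
STOP_CODONS = {
    1: {"TAA", "TAG", "TGA"},   # Standard code
    11: {"TAA", "TAG", "TGA"},  # Bacterial and Plant Plastid code
    4: {"TAA", "TAG"},          # TGA is read as Trp in table 4
}

def find_orfs(seq, transl_table=1, circular=False, min_codons=2):
    """Return forward-strand ORFs as (start, end) pairs, end exclusive.

    For a circular genome, end may exceed len(seq) when the ORF runs
    past the last base and continues at the start of the sequence.
    """
    seq = seq.upper()
    n = len(seq)
    stops = STOP_CODONS[transl_table]
    # Append a second copy so that ORFs can span the origin of a circle.
    text = seq + seq if circular else seq
    orfs = []
    for start in range(n):
        if text[start:start + 3] != "ATG":
            continue
        # Never read more than one full genome length from the start codon.
        limit = min(len(text), start + n)
        for pos in range(start + 3, limit - 2, 3):
            if text[pos:pos + 3] in stops:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs
```

For example, `find_orfs("ATGAAATGACCCTAA", transl_table=1)` stops at the in-frame TGA, while `transl_table=4` reads through it to the final TAA. With `circular=True`, an ORF that begins near the end of the sequence can terminate at a stop codon near its beginning.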

fgenesV0 - Generic parameters Markov chain-based viral gene prediction [Help] [Example]

fgenesV - Trained Pattern/Markov chain-based viral gene prediction [Help] [Example]