FPROM - Human promoter prediction

Method description:

The algorithm predicts potential transcription start positions using a linear discriminant function that combines characteristics describing functional motifs with the oligonucleotide composition of these sites. FPROM uses a file of selected factor binding sites taken from a currently supported functional site database.
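
As a rough illustration of the idea, a linear discriminant function scores a candidate position as a weighted sum of feature values and reports the position when the score reaches a threshold. The sketch below is not FPROM's implementation: the feature names, weights, and bias are hypothetical placeholders chosen only to show the shape of the calculation.

```python
# Minimal sketch of linear discriminant scoring for a candidate start site.
# All feature names, weights, and the bias are hypothetical; they are NOT
# FPROM's real features or coefficients.

HYPOTHETICAL_WEIGHTS = {
    "tata_box_score": 1.5,
    "binding_site_density": 0.8,
    "hexamer_composition": 1.2,
}
HYPOTHETICAL_BIAS = -2.0


def ldf_score(features, weights=HYPOTHETICAL_WEIGHTS, bias=HYPOTHETICAL_BIAS):
    """Return bias + sum(weight * feature value) over the weighted features."""
    return bias + sum(weights[name] * features[name] for name in weights)


def is_predicted_promoter(features, threshold):
    """A position is reported when its score reaches the threshold."""
    return ldf_score(features) >= threshold


# Made-up feature values for one candidate position.
candidate = {
    "tata_box_score": 2.0,
    "binding_site_density": 1.0,
    "hexamer_composition": 0.5,
}
score = ldf_score(candidate)
```

Raising the threshold passed to `is_predicted_promoter` trades sensitivity for fewer false positives, which is the trade-off the accuracy tables on this page quantify.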

At a true promoter recognition level of approximately 50-55%, FPROM gives about one false positive prediction per 4000 bp.

Another promoter recognition program, TSSG, uses the promoter.dat file with selected factor binding sites (TFD; Ghosh, 1993).

Prediction accuracy for each promoter type

Promoter Type A: non-TATA promoter

Sensitivity Specificity Threshold* Length**
1.000000 0.198215 -9.496 1.32975
0.990000 0.646996 -6.025 3.02029
0.950000 0.917724 -2.414 12.9585
0.900000 0.968909 +0.0467 34.2921
0.800000 0.992493 +3.329 142.028
0.700000 0.997591 +5.342 442.657
0.600000 0.998801 +6.508 889.255
0.500000 0.999409 +7.621 1805.3
0.400000 0.999705 +8.596 3610.59
0.300000 0.999858 +9.598 7491.98
0.200000 0.999911 +10.66 11987.2
0.100000 0.999968 +12.14 33297.7

Promoter Type B: TATA promoter

Sensitivity Specificity Threshold* Length**
1.000000 0.773441 -6.766 71.1151
0.990000 0.965914 -2.318 472.68
0.950000 0.996183 +1.117 4220.83
0.900000 0.998333 +2.528 9667.06
0.800000 0.999570 +4.613 37459.9
0.700000 0.999785 +6.41 74919.8
0.600000 0.999839 +7.963 99893
0.500000 0.999946 +9.586 299679
0.400000 0.999946 +11.21 299679
0.300000 0.999946 +12.5 299679
0.200000 1.000000 +14.14 1e+06
0.100000 1.000000 +16.54 1e+06

*Threshold value used by the program for a given level of sensitivity.
**Average sequence length that contains 1 false-positive promoter.
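
To make the tables concrete, the sketch below picks a threshold for a desired sensitivity and estimates how many false positives to expect in a sequence of a given length. The rows are copied from the TATA promoter table (sensitivities 1.0 down to 0.5). The function names are illustrative, the Length column is assumed to be in base pairs, and false positives are assumed to occur at a uniform rate along the sequence.

```python
# Rows copied from the TATA promoter table: (sensitivity, threshold, length).
# "length" is the average sequence length containing one false positive;
# it is ASSUMED here to be measured in base pairs.
TATA_ROWS = [
    (1.00, -6.766, 71.1151),
    (0.99, -2.318, 472.68),
    (0.95, 1.117, 4220.83),
    (0.90, 2.528, 9667.06),
    (0.80, 4.613, 37459.9),
    (0.70, 6.41, 74919.8),
    (0.60, 7.963, 99893.0),
    (0.50, 9.586, 299679.0),
]


def pick_row(rows, min_sensitivity):
    """Return the strictest tabulated row that still meets min_sensitivity."""
    eligible = [row for row in rows if row[0] >= min_sensitivity]
    if not eligible:
        raise ValueError("no tabulated row reaches that sensitivity")
    # Among eligible rows, the highest threshold gives the fewest false positives.
    return max(eligible, key=lambda row: row[1])


def expected_false_positives(sequence_length, length_per_false_positive):
    """Expected false positives, assuming a uniform rate along the sequence."""
    return sequence_length / length_per_false_positive


sensitivity, threshold, fp_length = pick_row(TATA_ROWS, 0.90)
estimate = expected_false_positives(3019, fp_length)
```

With a required sensitivity of 0.90, this selects the threshold +2.528, and for a 3019 bp sequence the estimate is roughly 0.31 false-positive TATA promoter predictions under the stated assumptions.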

References:
1. Solovyev V.V., Salamov A.A. (1997)
The Gene-Finder computer tools for analysis of human and model organisms genome sequences.
In Proceedings of the Fifth International Conference on Intelligent Systems for Molecular Biology (eds. Rawling C., Clark D., Altman R., Hunter L., Lengauer T., Wodak S.), Halkidiki, Greece, AAAI Press, 294-302.

2. Solovyev V.V. (2001)
Statistical approaches in Eukaryotic gene prediction.
In Handbook of Statistical genetics (eds. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127.

3. Solovyev VV, Shahmuradov IA. (2003)
PromH: Promoters identification using orthologous genomic sequences.
Nucleic Acids Res. 31(13):3540-3545.

FPROM output:

  Sequence    1 of    1,  Name: Example seq
  Length of sequence:      3019
       1 promoter/enhancer(s) are predicted
  Promoter Pos:  2738 LDF:  +2.800 TATA box at  2706  +3.810 GATTAAAG Enchancer at:  2913 Score:  +11.762
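
For downstream processing, a prediction line like the one above can be turned into structured values. The sketch below is one possible parser, written against the single example line shown here; the field layout is inferred from that example only, so real FPROM output may need adjusted patterns. The function name and dictionary keys are illustrative.

```python
import re

NUMBER = r"[+-]?\d+(?:\.\d+)?"

# Pattern inferred from the one example line on this page (an assumption).
PROMOTER_RE = re.compile(
    r"Promoter Pos:\s*(?P<pos>\d+)\s+LDF:\s*(?P<ldf>" + NUMBER + r")"
    r"(?:\s+TATA box at\s+(?P<tata_pos>\d+)\s+(?P<tata_score>" + NUMBER + r")"
    r"\s+(?P<tata_seq>[ACGT]+))?"
)
# The example output spells the label "Enchancer"; accept "Enhancer" as well.
ENHANCER_RE = re.compile(
    r"En(?:ch|h)ancer at:\s*(?P<pos>\d+)\s+Score:\s*(?P<score>" + NUMBER + r")"
)


def parse_fprom_line(line):
    """Parse one prediction line into a dict, or return None if no match."""
    match = PROMOTER_RE.search(line)
    if match is None:
        return None
    result = {"position": int(match["pos"]), "ldf": float(match["ldf"])}
    if match["tata_pos"] is not None:
        result["tata_position"] = int(match["tata_pos"])
        result["tata_score"] = float(match["tata_score"])
        result["tata_sequence"] = match["tata_seq"]
    enhancer = ENHANCER_RE.search(line)
    if enhancer is not None:
        result["enhancer_position"] = int(enhancer["pos"])
        result["enhancer_score"] = float(enhancer["score"])
    return result


example = (
    "  Promoter Pos:  2738 LDF:  +2.800 TATA box at  2706  +3.810 GATTAAAG "
    "Enchancer at:  2913 Score:  +11.762"
)
parsed = parse_fprom_line(example)
```

The TATA box fields are optional in the pattern so that a line without a reported TATA box would still yield a position and LDF score.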