The algorithm predicts potential transcription start positions with a linear discriminant function that combines characteristics describing functional motifs and the oligonucleotide composition of these sites. FPROM uses a file of selected factor binding sites taken from a currently supported functional site database.
At a true promoter recognition level of approximately 50-55%, FPROM gives about one false positive prediction per 4000 bp.
Another promoter recognition program, TSSG, uses the promoter.dat file of selected factor binding sites (TFD; Ghosh, 1993).
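As a rough illustration of how a linear discriminant function turns site characteristics into a single score, here is a minimal Python sketch. The feature names, weights, and bias are hypothetical placeholders chosen for the example; they are not the features or coefficients used by FPROM or TSSG.

```python
from typing import Mapping

# Hypothetical weights for illustration only; NOT the program's trained coefficients.
WEIGHTS = {
    "tata_box_score": 1.0,        # assumed motif-based characteristic
    "hexamer_composition": 0.5,   # assumed oligonucleotide-composition characteristic
    "tf_site_density": 0.25,      # assumed factor-binding-site characteristic
}


def ldf_score(features: Mapping[str, float],
              weights: Mapping[str, float] = WEIGHTS,
              bias: float = 0.0) -> float:
    """Linear discriminant: a constant plus the weighted sum of the features."""
    return bias + sum(weights[name] * features[name] for name in weights)


def is_predicted_tss(features: Mapping[str, float], threshold: float) -> bool:
    """Report a candidate position when its score reaches the chosen threshold."""
    return ldf_score(features) >= threshold


# Made-up feature values for one candidate position.
example = {"tata_box_score": 3.0, "hexamer_composition": 2.0, "tf_site_density": 4.0}
print(ldf_score(example), is_predicted_tss(example, threshold=4.0))
```

Raising the threshold trades sensitivity for fewer false positives, which is the trade-off tabulated below.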
Sensitivity | Specificity | Threshold* | Length** |
1.000000 | 0.198215 | -9.496 | 1.32975 |
0.990000 | 0.646996 | -6.025 | 3.02029 |
0.950000 | 0.917724 | -2.414 | 12.9585 |
0.900000 | 0.968909 | +0.0467 | 34.2921 |
0.800000 | 0.992493 | +3.329 | 142.028 |
0.700000 | 0.997591 | +5.342 | 442.657 |
0.600000 | 0.998801 | +6.508 | 889.255 |
0.500000 | 0.999409 | +7.621 | 1805.3 |
0.400000 | 0.999705 | +8.596 | 3610.59 |
0.300000 | 0.999858 | +9.598 | 7491.98 |
0.200000 | 0.999911 | +10.66 | 11987.2 |
0.100000 | 0.999968 | +12.14 | 33297.7 |
Sensitivity | Specificity | Threshold* | Length** |
1.000000 | 0.773441 | -6.766 | 71.1151 |
0.990000 | 0.965914 | -2.318 | 472.68 |
0.950000 | 0.996183 | +1.117 | 4220.83 |
0.900000 | 0.998333 | +2.528 | 9667.06 |
0.800000 | 0.999570 | +4.613 | 37459.9 |
0.700000 | 0.999785 | +6.41 | 74919.8 |
0.600000 | 0.999839 | +7.963 | 99893 |
0.500000 | 0.999946 | +9.586 | 299679 |
0.400000 | 0.999946 | +11.21 | 299679 |
0.300000 | 0.999946 | +12.5 | 299679 |
0.200000 | 1.000000 | +14.14 | 1e+06 |
0.100000 | 1.000000 | +16.54 | 1e+06 |
*Threshold value used by the program for a given level of sensitivity.
**Average length that contains 1 false-positive promoter.
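The tables can be used to pick a threshold for a target sensitivity and to estimate how many false positives to expect in a sequence. The Python sketch below does this with the rows of the first table. It assumes that the Length column is in the same units as the sequence length (presumably bp, which the tables do not state), and the function names are illustrative rather than part of the program.

```python
# Rows copied from the first table: (sensitivity, specificity, threshold, length).
TABLE = [
    (1.0, 0.198215, -9.496, 1.32975),
    (0.99, 0.646996, -6.025, 3.02029),
    (0.95, 0.917724, -2.414, 12.9585),
    (0.9, 0.968909, 0.0467, 34.2921),
    (0.8, 0.992493, 3.329, 142.028),
    (0.7, 0.997591, 5.342, 442.657),
    (0.6, 0.998801, 6.508, 889.255),
    (0.5, 0.999409, 7.621, 1805.3),
    (0.4, 0.999705, 8.596, 3610.59),
    (0.3, 0.999858, 9.598, 7491.98),
    (0.2, 0.999911, 10.66, 11987.2),
    (0.1, 0.999968, 12.14, 33297.7),
]


def row_for_sensitivity(table, min_sensitivity):
    """Return the strictest row (highest threshold) that still meets the sensitivity."""
    candidates = [row for row in table if row[0] >= min_sensitivity]
    if not candidates:
        raise ValueError("no table row reaches the requested sensitivity")
    return max(candidates, key=lambda row: row[2])


def expected_false_positives(sequence_length, length_per_false_positive):
    """Expected count, assuming one false positive per tabulated Length on average."""
    return sequence_length / length_per_false_positive


row = row_for_sensitivity(TABLE, 0.5)
print("threshold:", row[2])
print("expected false positives in 3019 units:", expected_false_positives(3019, row[3]))
```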
References:
1. Solovyev V.V., Salamov A.A. (1997)
The Gene-Finder computer tools for analysis of human and model organisms genome sequences.
In Proceedings of the Fifth International Conference on Intelligent Systems for Molecular Biology (eds. Rawling C., Clark D.,
Altman R., Hunter L., Lengauer T., Wodak S.), Halkidiki, Greece, AAAI Press, 294-302.
2. Solovyev V.V. (2001)
Statistical approaches in Eukaryotic gene prediction.
In Handbook of Statistical genetics (eds. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127.
3. Solovyev V.V., Shahmuradov I.A. (2003)
PromH: Promoters identification using orthologous genomic sequences.
Nucleic Acids Res. 31(13):3540-3545.
Example of output:
Sequence 1 of 1, Name: Example seq
Length of sequence: 3019
1 promoter/enhancer(s) are predicted
Promoter Pos: 2738 LDF: +2.800 TATA box at 2706 +3.810 GATTAAAG
Enchancer at: 2913 Score: +11.762
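For downstream processing, the report above can be parsed with a few regular expressions. The following Python sketch is based only on the single sample shown here, so the field layout is an assumption. In particular, the number after the TATA box position is treated as the TATA box score, and the pattern accepts both the "Enchancer" spelling from the sample and "Enhancer". The function and key names are illustrative.

```python
import re

NUM = r"[+-]?\d+(?:\.\d+)?"

# Promoter line; the TATA box part is optional in this sketch.
PROMOTER_RE = re.compile(
    rf"Promoter Pos:\s*(\d+)\s+LDF:\s*({NUM})"
    rf"(?:\s+TATA box at\s+(\d+)\s+({NUM})\s+([ACGT]+))?"
)
ENHANCER_RE = re.compile(rf"Enc?hancer at:\s*(\d+)\s+Score:\s*({NUM})")


def parse_report(text):
    """Extract promoter and enhancer predictions from report text."""
    promoters = []
    for match in PROMOTER_RE.finditer(text):
        pos, ldf, tata_pos, tata_score, tata_seq = match.groups()
        tata = None
        if tata_pos is not None:
            tata = {"pos": int(tata_pos), "score": float(tata_score), "seq": tata_seq}
        promoters.append({"pos": int(pos), "ldf": float(ldf), "tata": tata})
    enhancers = [{"pos": int(p), "score": float(s)}
                 for p, s in ENHANCER_RE.findall(text)]
    return {"promoters": promoters, "enhancers": enhancers}


SAMPLE = (
    "Sequence 1 of 1, Name: Example seq Length of sequence: 3019 "
    "1 promoter/enhancer(s) are predicted "
    "Promoter Pos: 2738 LDF: +2.800 TATA box at 2706 +3.810 GATTAAAG "
    "Enchancer at: 2913 Score: +11.762"
)
print(parse_report(SAMPLE))
```

Because the patterns match on whitespace rather than line breaks, the same function handles the report whether it arrives as one line or several.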