The program was developed by Salamov A. and Solovyev V. It locates potential splice sites using 5 weight matrices for donor sites and, for acceptor sites, a model that combines dinucleotide composition with a weight matrix. It also predicts potential GC donor sites and non-standard splice sites such as AT-AC.
The program does not EXCLUDE splice sites that lie close to sites predicted with higher scores, nor sites on different chains. Users can post-process the predictions based on the reported scores. It is designed for analyzing ALTERNATIVE splice variants and NON-CANONICAL splice sites. The program reports many more overpredicted sites than the SPL program.
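Because nearby lower-scoring sites are not removed, one possible post-processing step is a greedy filter that keeps the best-scoring site in each neighbourhood on a given chain. The following is a minimal sketch and not part of the program: the `Site` tuple, the `suppress_neighbors` function and the 10-base `window` are hypothetical choices made for illustration. The four sample sites are taken from the acceptor list in the example output.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    position: int  # coordinate of the predicted site
    score: int     # reported score
    chain: str     # "+" or "-"
    signal: str    # dinucleotide, e.g. "GT" or "AG"


def suppress_neighbors(sites: List[Site], window: int = 10) -> List[Site]:
    """Drop a site if a higher-scoring kept site on the same chain
    lies within `window` bases of it (hypothetical cut-off)."""
    kept: List[Site] = []
    # Visit sites from highest to lowest score so the best one always wins.
    for site in sorted(sites, key=lambda s: -s.score):
        clash = any(
            k.chain == site.chain and abs(k.position - site.position) <= window
            for k in kept
        )
        if not clash:
            kept.append(site)
    return sorted(kept, key=lambda s: s.position)


acceptors = [
    Site(188, 13, "-", "AG"),
    Site(191, 91, "-", "AG"),
    Site(1626, 49, "+", "AG"),
    Site(1637, 18, "-", "AG"),
]
kept = suppress_neighbors(acceptors, window=10)
print([s.position for s in kept])  # [191, 1626, 1637]
```

Site 188 is dropped because the stronger site 191 is 3 bases away on the same chain. Site 1637 survives because its only close neighbour, 1626, is on the other chain. For alternative-splicing analysis a smaller window, or no filtering at all, may be more appropriate.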
For a description of the method, see:
Solovyev V.V. (2001)
Statistical approaches in eukaryotic gene prediction. In: Handbook of Statistical Genetics (eds. Balding D. et al.),
John Wiley & Sons, Ltd., pp. 83-127.
Threshold values range from 1 to 100.
For example, a value of 30 means that the threshold is set
at the level that detects the 30% highest-scoring sites
in the database of all known splice sites.
A score of 20 means that the site scores better than the bottom 20% of known sites ordered by score.
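The score scale described above behaves like a percentile rank against a reference set of known sites. The sketch below illustrates that idea only. It is not the program's actual code, and `percentile_score` and the toy `known` list are hypothetical stand-ins for the real database of known-site scores, which is not available here.

```python
from bisect import bisect_left
from typing import Sequence


def percentile_score(raw: float, known_sorted: Sequence[float]) -> int:
    """Percentage of known-site raw scores strictly below `raw`.

    `known_sorted` must be sorted in ascending order.
    """
    below = bisect_left(known_sorted, raw)  # count of values < raw
    return round(100 * below / len(known_sorted))


# Toy reference set: ten known sites with raw scores 1..10 (illustrative only).
known = list(range(1, 11))
print(percentile_score(3, known))  # 20
```

In this toy set a raw score of 3 beats two of the ten known sites, so it is reported as 20, matching the reading "better than the bottom 20% of known sites".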
----------------------------------------------------------------------
Example of output:

splm Wed Apr 11 23:16:32 EDT 2001
Prediction of splice sites on Human sequences
Length of sequence 2040

Number of Donor sites: 10 Threshold: 90
   1    130   68  -  GT
   2    463   14  +  GT
   3    642   26  +  GT
   4    710   12  +  GT
   5    845   30  +  GT
   6    962   55  -  GT
   7   1024   48  +  GT
   8   1255   22  +  GT
   9   1363   42  +  GT
  10   2029   70  +  GT

Number of Acceptor sites: 29 Threshold: 90
   1     23   43  -  AG
   2    131   13  -  AG
   3    188   13  -  AG
   4    191   91  -  AG
   5    314   44  -  AG
   6    359   14  -  AG
   7    380   29  -  AG
   8    446   74  -  AG
   9    499   14  -  AG
  10    704   15  -  AG
  11    805   19  -  AG
  12    839   39  -  AG
  13    900   14  -  AG
  14    925    9  -  AC
  15    940   26  -  AG
  16   1065   93  +  AG
  17   1401   36  +  AG
  18   1488   80  +  AG
  19   1542   41  +  AG
  20   1593   62  +  AG
  21   1626   49  +  AG
  22   1637   18  -  AG
  23   1674   32  +  AG
  24   1708   41  +  AG
  25   1786   11  +  AG
  26   1825   15  +  AG
  27   1859   84  +  AG
  28   2003   13  +  AG
  29   2020   23  -  AG
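A report like the one above can be read into a script for score-based filtering. The following is a rough sketch, assuming each record consists of five fields read as running number, position, score, chain and dinucleotide. That column interpretation is inferred from the example, not documented. The function name `parse_splm` and the abridged `sample` text are hypothetical.

```python
import re
from typing import Dict, List, Tuple

Record = Tuple[int, int, str, str]  # (position, score, chain, dinucleotide)

HEADER = r"Number of (Donor|Acceptor) sites:\s*(\d+)\s+Threshold:\s*(\d+)"
RECORD = r"\b(\d+)\s+(\d+)\s+(\d+)\s+([+-])\s+([A-Z]{2})\b"
TOKEN = re.compile(f"{HEADER}|{RECORD}")


def parse_splm(text: str) -> Dict[str, List[Record]]:
    """Collect site records under the most recent Donor/Acceptor header."""
    sites: Dict[str, List[Record]] = {"Donor": [], "Acceptor": []}
    current = None
    for m in TOKEN.finditer(text):
        if m.group(1):               # a section header matched
            current = m.group(1)
        elif current is not None:    # a site record inside a section
            pos, score, chain, signal = m.group(5, 6, 7, 8)
            sites[current].append((int(pos), int(score), chain, signal))
    return sites


# Abridged sample built from rows of the example output (counts adjusted).
sample = """Number of Donor sites: 2 Threshold: 90
 1 130 68 - GT
 2 463 14 + GT
Number of Acceptor sites: 2 Threshold: 90
 1 191 91 - AG
 2 1065 93 + AG
"""
parsed = parse_splm(sample)
# Keep only high-scoring acceptor sites on the "+" chain.
strong = [r for r in parsed["Acceptor"] if r[1] >= 90 and r[2] == "+"]
print(strong)  # [(1065, 93, '+', 'AG')]
```

The pattern does not rely on line breaks, so it should also cope with output whose rows have been run together on one line.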