
HSPL Program Description
HSPL - Prediction of splice sites in Human DNA sequences

Version 2.

Method description:

HSPL uses a combined linear discriminant recognition function built from two kinds of information: significant triplet frequencies in the various functional parts of splice site regions, and octanucleotide preferences in protein-coding and intron regions. On the test set, the prediction scheme reaches 97% accuracy for donor site recognition (correlation coefficient C = 0.62) and 96% for acceptor splice sites (C = 0.48). The method is a good alternative to the neural network approach (Brunak et al., J. Mol. Biol., 1991), which has C = 0.61 at 95% accuracy of donor site prediction and C < 0.40 at 95% accuracy of acceptor site prediction. The false positive rate for splice site prediction is relatively high: about one false positive per true site at 97% accuracy of true site prediction. Splice site positions can be located more precisely with the exon recognition programs (HEXON, FEXH) and the gene structure prediction program (FGENESH) available on the server.
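
As a rough illustration of how a linear discriminant function separates sites from non-sites, the sketch below scores a candidate position as a weighted sum of feature values and compares the result with a threshold. The feature values and weights are hypothetical placeholders, not the parameters used by HSPL, and the function names are introduced here only for illustration.

```python
from typing import Sequence


def discriminant_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of feature values (a linear discriminant score)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def is_predicted_site(score: float, threshold: float) -> bool:
    """Report a candidate as a site when its score reaches the threshold."""
    return score >= threshold


# Hypothetical feature values for one candidate donor position, e.g. triplet
# preferences in two regions around the site and an octanucleotide preference.
candidate_features = [0.9, 0.8, 0.7]
# Hypothetical weights; a real discriminant would be fitted on training data.
candidate_weights = [0.5, 0.3, 0.2]

score = discriminant_score(candidate_features, candidate_weights)
print(round(score, 2), is_predicted_site(score, threshold=0.76))
```

Raising the threshold reduces false positives at the cost of missing more true sites, which is the trade-off behind the accuracy and false positive figures quoted above.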

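HSPL output:
First line - name of your sequence
Second line - length of your sequence
Following lines - for donor sites and then for acceptor sites: the number of predicted sites and the score threshold, followed by one line per site giving its serial number, position and score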

For example:

HUMALPHA 4556 bp ds-DNA PRI 15-SEP-1 
length of sequence - 4556 
Number of Donor sites: 11 Threshold: 0.76 
1 329 0.76 
2 517 0.87 
3 728 0.88 
4 955 0.98 
5 1322 0.81 
6 1954 0.85 
Number of Acceptor sites: 18 Threshold: 0.65 
1 244 0.65 
2 379 0.67 
3 610 0.89 
4 615 0.68 
5 838 0.83 
6 1146 0.75 
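
A report in this layout can be read with a short script. The following is a minimal sketch that assumes the output always follows the layout of the example above; the function name `parse_hspl_output`, the `Site` tuple and the dictionary keys are hypothetical and are not part of HSPL.

```python
import re
from typing import NamedTuple


class Site(NamedTuple):
    number: int
    position: int
    score: float


# Matches section headers such as "Number of Donor sites: 11 Threshold: 0.76".
SECTION = re.compile(r"Number of (Donor|Acceptor) sites:\s*(\d+)\s+Threshold:\s*([\d.]+)")


def parse_hspl_output(text: str) -> dict:
    """Parse an HSPL text report into a dictionary (layout assumed from the example)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    report = {"name": lines[0] if lines else None, "length": None,
              "declared": {}, "threshold": {}, "donor": [], "acceptor": []}
    section = None
    for line in lines[1:]:
        header = SECTION.match(line)
        if header:
            section = header.group(1).lower()
            report["declared"][section] = int(header.group(2))
            report["threshold"][section] = float(header.group(3))
        elif line.lower().startswith("length of sequence"):
            report["length"] = int(line.rsplit("-", 1)[1])
        elif section is not None:
            fields = line.split()
            if len(fields) == 3:  # serial number, position, score
                report[section].append(
                    Site(int(fields[0]), int(fields[1]), float(fields[2])))
    return report


# Shortened excerpt of the example report above.
SAMPLE = """\
HUMALPHA 4556 bp ds-DNA PRI 15-SEP-1
length of sequence - 4556
Number of Donor sites: 11 Threshold: 0.76
1 329 0.76
2 517 0.87
Number of Acceptor sites: 18 Threshold: 0.65
1 244 0.65
2 379 0.67
"""

report = parse_hspl_output(SAMPLE)
print(report["length"], report["donor"][0].position, report["acceptor"][1].score)
```

The declared site count from each header is kept separately from the parsed rows, so a shortened listing can be detected by comparing `report["declared"]` with the number of rows actually read.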


