Search for consensus patterns with statistical estimation.
Nsite can be used to analyze regulatory regions and the composition of their functional motifs.
The method is based on statistical estimation of the expected number of occurrences of a nucleotide consensus pattern in a given sequence [1-2,4]. It uses an Nsite-formatted data file, which can include any set of consensus sequences of functional motifs. In the current version, this file consists of the TRANSFAC sequences (release 3.4, 1998, academic release), composite elements, and a set of additional functional motifs.
If a pattern is found whose expected number is significantly less than 1, a match is unlikely to have occurred by chance, so it can be supposed that the analyzed sequence possesses the pattern's function.
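The idea of an expected number can be illustrated with a minimal sketch. It assumes the simplest possible background model (independent positions, fixed nucleotide frequencies, exact matches on one strand only); the statistic Nsite actually computes, which also handles mismatches and confidence intervals, is described in the cited references and may differ. The function names `match_probability` and `expected_count` are hypothetical and not part of Nsite.

```python
# IUPAC single-letter codes mapped to the nucleotides they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT", "N": "GATC",
}


def match_probability(consensus, freqs):
    """Probability that a random window matches the consensus exactly.

    Assumption: positions are independent and drawn from `freqs`.
    """
    p = 1.0
    for symbol in consensus.upper():
        # A degenerate symbol matches any of its allowed nucleotides.
        p *= sum(freqs[base] for base in IUPAC[symbol])
    return p


def expected_count(consensus, seq_length, freqs):
    """Expected number of exact matches on one strand of the sequence."""
    positions = max(seq_length - len(consensus) + 1, 0)
    return positions * match_probability(consensus, freqs)


# Illustrative values: a 2319 bp query with a skewed base composition.
freqs = {"A": 0.33, "G": 0.19, "T": 0.30, "C": 0.18}
g_string = "G" * 16  # a run of 16 G's
print(expected_count(g_string, 2319, freqs))
```

Under this model a long, non-degenerate pattern such as a run of 16 G's has an expected count far below 1, so finding it in the sequence is unlikely to be a chance event.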
The Nsite output lists each pattern found, its position in the sequence, and the accession number, ID, description of the motif and binding factor name from the original database, if they exist.
Table 1. Summary of single-letter code recommendations
| Symbol | Meaning | Origin of designation |
|---|---|---|
| R | G or A | puRine |
| Y | T or C | pYrimidine |
| M | A or C | aMino |
| K | G or T | Keto |
| S | G or C | Strong interaction (3 H bonds) |
| W | A or T | Weak interaction (2 H bonds) |
| H | A or C or T | not-G, H follows G in the alphabet |
| B | G or T or C | not-A, B follows A |
| V | G or C or A | not-T (not-U), V follows U |
| D | G or A or T | not-C, D follows C |
| N | G or A or T or C | aNy |
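As an illustration of how the single-letter codes in Table 1 can be applied, the following sketch scans a sequence for windows that match a degenerate consensus with a limited number of mismatches. It is a simplified illustration, not the Nsite implementation; the names `find_matches` and `reverse_complement` are hypothetical.

```python
# Single-letter codes from Table 1, mapped to the nucleotides they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT", "N": "GATC",
}

# Complement of each code (e.g. R = G/A pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYMKSWHBVDN", "TGCAYRKMSWDVBHN")


def reverse_complement(seq):
    """Return the reverse complement, e.g. for searching the "-" strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_matches(consensus, sequence, max_mismatches=0):
    """Return (1-based start, matched window, mismatches) for each hit."""
    consensus = consensus.upper()
    sequence = sequence.upper()
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # A position mismatches when the base is not allowed by the code.
        mismatches = sum(
            base not in IUPAC[symbol]
            for symbol, base in zip(consensus, window)
        )
        if mismatches <= max_mismatches:
            hits.append((start + 1, window, mismatches))
    return hits


# TATAWA matches TATAAA or TATATA, because W stands for A or T.
print(find_matches("TATAWA", "GGTATAAAGG"))
```

To search the opposite strand under the same assumptions, the consensus can be passed through `reverse_complement` before calling `find_matches`.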
Note: the Nsite group of programs uses a supplied database of regulatory motifs.
You can use the attached databases:
the Plant REGSITE database (the current RegSite release contains 2286 motifs and more descriptive fields than PLACE, including several fields on expression and some others)
or the external TFD database. TFD is not part of the Softberry software and should be referenced as:
Ghosh D. (2000) Object-oriented transcription factors database (ooTFD). Nucleic Acids Res. 28(1):308-310.
Program NSITE (Softberry Inc.) | Version 2.2004
Search for motifs of 1500 Regulatory Elements (REs) | SET of REs: REGSITE DB (Transcription Regulatory Sites from human and animals) [ Last Update: March 10, 2006]
____________________________________________________________
Search PARAMETRS:
Expected Mean Number : 0.0000000
Statistical Siginicance Level : 0.0000000
Level of homology between known RE and motif: 80%
Variation of Distance between RE Blocks : 20%
NOTE: RE - Regulatory Element/Consensus | AC - Accession No of RE in a given DB
OS - Organism/Species | BF - Binding Factor or One of them
Mism. - Mismatches | Mean. Exp. Number - Mean Expected Number | Up.Conf.Int. - Upper Confidence Interval
============================================================
QUERY: >test_nsite.seq
Length of Query Sequence: 2319 bp | Nucleotide Frequencies: A - 0.33 G - 0.19 T - 0.30 C - 0.18
............................................................
RE: 620. AC: RSA00620//OS: chicken /GENE: BGP/RE: G-string /BF: erythrocyte-specific protein
Motifs on "-" Strand: Mean Exp. Number 0.00000 Up.Conf.Int. 1 Found 5
2216 cGGGGGGGGGGGGGGG 2201 (Mism.= 1)
2215 GGGGGGGGGGGGGGGG 2200 (Mism.= 0)
2214 GGGGGGGGGGGGGGGG 2199 (Mism.= 0)
2213 GGGGGGGGGGGGGGGG 2198 (Mism.= 0)
2212 GGGGGGGGGGGGGGGt 2197 (Mism.= 1)
............................................................
Totally 5 motifs of 1 different REs have been found
------------------------------------------------------------
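For downstream processing, the motif lines of such a report can be extracted with a regular expression. The sketch below assumes that every hit follows the layout seen in the example above (start position, matched sequence, end position, mismatch count); other Nsite versions may format hits differently. The helper name `parse_hits` is hypothetical. In the example, lowercase letters appear to mark mismatched positions, which is an inference from the sample rather than documented behaviour.

```python
import re

# One hit: start position, matched sequence, end position, mismatch count.
HIT_RE = re.compile(r"(\d+)\s+([ACGTNacgtn]+)\s+(\d+)\s+\(Mism\.=\s*(\d+)\)")


def parse_hits(report):
    """Extract motif hits from report text as a list of dictionaries."""
    hits = []
    for match in HIT_RE.finditer(report):
        start, motif, end, mismatches = match.groups()
        hits.append({
            "start": int(start),
            "motif": motif,
            "end": int(end),
            "mismatches": int(mismatches),
        })
    return hits


# Three hit lines copied from the example report.
sample = """\
2216 cGGGGGGGGGGGGGGG 2201 (Mism.= 1)
2215 GGGGGGGGGGGGGGGG 2200 (Mism.= 0)
2212 GGGGGGGGGGGGGGGt 2197 (Mism.= 1)
"""

for hit in parse_hits(sample):
    print(hit)
```

Because the pattern is applied with `finditer`, it works whether the hits are on separate lines or run together on one line.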
1. Shahmuradov K.A., Kolchanov N.A., Solovyev V.V., Ratner V.A. (1986) Enhancer-like structures in middle repetitive sequences of the eukaryotic genomes. Genetics (Russ), 22, 357-368.
2. Solovyev V.V., Kolchanov N.A. (1994) Search for functional sites using consensus. In Computer analysis of Genetic macromolecules. (eds. Kolchanov N.A., Lim H.A.), World Scientific, p. 16-21.
3. Heinemeyer T., Chen X., Karas H., Kel A.E., Kel O.V., Liebich I., Meinhardt T., Reuter I., Schacherer F., Wingender E. (1999) Expanding the TRANSFAC database towards an expert system of regulatory molecular
4. Solovyev V.V. (2002) Structure, Properties and Computer Identification of Eukaryotic genes. In Bioinformatics from Genomes to Drugs. V.1. Basic Technologies. (ed. Lengauer T.), p. 59-111.