TSSPlant

AUTHORS: I.A.Shahmuradov [1,2], R.Umarov [1], V.V.Solovyev [3]
[1] Computational Bioscience Research Center, King Abdullah University of Science and Technology (KSA)
[2] Institute of Botany, Azerbaycan National Academy of Sciences (Azerbaijan)
[3] Softberry Inc. (USA)

LAST UPDATE: 06 June 2016
VERSION: 1.2016
ACCESS: http://molquest.kaust.edu.sa
                 http://softberry.com

Introduction

TSSPlant searches plant sequences for RNA Polymerase II promoters (transcription start sites, TSSs) of both the TATA and TATA-less classes.

Input query file: one or more sequences in FASTA format. Allowed length of each sequence: 251 - 100,000 bp.
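
The length limits above can be checked before a run. The following is a minimal sketch in Python, not part of TSSPlant; the function names fasta_lengths and out_of_range are hypothetical, and the first word of each FASTA header is assumed to be a unique sequence name.

```python
MIN_LEN, MAX_LEN = 251, 100_000  # allowed sequence length in bp, per this README


def fasta_lengths(text):
    """Return {sequence name: length in bp} for FASTA-formatted text."""
    lengths = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the first word of the header is a unique name.
            words = line[1:].split()
            name = words[0] if words else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def out_of_range(text):
    """Return names of sequences whose length is outside 251 - 100,000 bp."""
    return [name for name, size in fasta_lengths(text).items()
            if not MIN_LEN <= size <= MAX_LEN]
```

Sequences reported by out_of_range would need to be trimmed, extended or removed before the file is passed to TSSPlant.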

Output file: plain text format.

PREREQUISITES:

TSSPlant runs on Linux and Unix (including MacOS).
- The GNU Fortran compiler (gfortran) is required.

HOW TO RUN:

REQUIRED
You MUST set the environment variable TSSPlant_DATA to the location of the data directory. The data is included with the program.
For example (csh/tcsh):

setenv TSSPlant_DATA /path/to/installation/Data_TSSPlant

or (bash/sh):

export TSSPlant_DATA="/path/to/installation/Data_TSSPlant"


 TSSPlant  -i  - Input sequence(s) file
      [-o  - Output file for search results]
      [-p  - Print (y or Y) or do not print (n or N) the query sequence(s)]
      [-x  - Left and right boundaries of the region in which a single TSS is selected; >=20, <=300; default: 300]
      [-h  - Search on the sense cHain (1) or on both cHains (2)]
      [-n  - Query genomic Nucleotide frequencies, separated by commas]
      [-a  - Neural network threshold for TATA      promoters; abs(parA)<=2.0]
      [-b  - Neural network threshold for TATA-less promoters; abs(parB)<=2.0]
      [-c  - Search criterion; if parC =
               t ... if both TATA and TATA-less TSSs are predicted at a distance of -x bp or less,
                     only the TATA TSS is selected
               s ... search for a promoter of a single class (TATA or TATA-less), with the highest score
               b ... search for TATA and TATA-less promoters separately]

Default value for -p: n (do not print query sequences)
Default output file (-o): TSSPlant.out
Default value for -x: 300
Default query genomic nucleotide frequencies (-n), A/C/G/T: 0.2692,0.203,0.2212,0.3066
Default value for -a: this option is ignored
Default value for -b: this option is ignored
Default value for -h: 1
Default value for -c: t
Option -i is mandatory: if it is not given, TSSPlant stops.
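
When TSSPlant is called from a script, the option constraints listed above can be checked before the program is started. The following Python sketch is an illustration only: the helper build_tssplant_command is hypothetical, it covers only some of the options, and it assumes that the executable is named TSSPlant and is found on the PATH. It builds the argument list and the environment but does not run anything.

```python
import os


def build_tssplant_command(input_file, data_dir, output_file=None,
                           window=None, strands=None, criterion=None):
    """Return (argv, env) for a TSSPlant run; nothing is executed here."""
    # Ranges and allowed values are taken from the option list in this README.
    if window is not None and not 20 <= window <= 300:
        raise ValueError("-x must be between 20 and 300")
    if strands is not None and strands not in (1, 2):
        raise ValueError("-h must be 1 or 2")
    if criterion is not None and criterion not in ("t", "s", "b"):
        raise ValueError("-c must be t, s or b")

    # -i is the only mandatory option.
    argv = ["TSSPlant", "-i", input_file]
    if output_file is not None:
        argv += ["-o", output_file]
    if window is not None:
        argv += ["-x", str(window)]
    if strands is not None:
        argv += ["-h", str(strands)]
    if criterion is not None:
        argv += ["-c", criterion]

    # TSSPlant_DATA must point to the data directory shipped with the program.
    env = dict(os.environ, TSSPlant_DATA=data_dir)
    return argv, env
```

The returned pair can be passed to subprocess.run(argv, env=env, check=True). Options that are left out fall back to the defaults listed above.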

TSSPlant output

Every output file begins with a description of the program and its purpose,
the input file name and the search parameters (lines 1 - 9).
Then, the following information is given:
(1) Name and length of the query sequence;
(2) Annotated gene/mRNA/RNA/CDS start position;
(3) Nucleotide sequence of the query, if -p is set to y;
(4) Total number, position(s) and score(s) of the predicted TSS(s), as well as position(s) and score(s) of TATA boxes (in the case of TATA promoters).
The output file ends with the total statistics of the search results.

Example output file for the command:

         TSSPlant -i example.fasta -o example -c t -h 2   


Program TSSPlant
Search for RNA II promoters (TSSs)
Input file with query sequence(s): example.fasta
Thresholds, for TATA      promoters:   1.52
                TATA-less promoters:  -0.04
Out of TSSs of different (TATA and TSTS-less) classes located at distance  300bp or less, TSS of TATA class is selected
Search on Both Strands


Query: >Example_query
Length of Query sequence:   1100
+ chain  TSS position:    835   TSS score =   1.9896     TATA-box position:    802   TATA-box score =   3.0271
+ chain  TSS position:    534   TSS score =   1.9886     TATA-box position:    503   TATA-box score =   3.7634
+ chain  TSS position:    221   TSS score =   1.9885     TATA-box position:    187   TATA-box score =   7.9623
- chain  TSS position:    296   TSS score =   1.9872     TATA-box position:    331   TATA-box score =   7.9623
- chain  TSS position:    715   TSS score =   1.9826     TATA-box position:    749   TATA-box score =   6.5597
     5 promoter(s) predicted
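
The prediction lines in such a file can be turned into records for further processing. The following Python sketch is based only on the line layout of the example above; the function name parse_tss_lines is hypothetical. The format of lines for TATA-less promoters is not shown in this README, so the sketch assumes that such lines simply lack the TATA-box fields.

```python
import re

# Pattern for lines such as:
# "+ chain  TSS position:    835   TSS score =   1.9896     TATA-box position: ..."
# Assumption: the TATA-box part is absent for TATA-less promoters.
TSS_LINE = re.compile(
    r"^\s*([+-])\s+chain\s+TSS position:\s*(\d+)\s+TSS score =\s*(-?\d+\.\d+)"
    r"(?:\s+TATA-box position:\s*(\d+)\s+TATA-box score =\s*(-?\d+\.\d+))?"
)


def parse_tss_lines(report):
    """Return one dict per predicted TSS found in the report text."""
    records = []
    for line in report.splitlines():
        match = TSS_LINE.match(line)
        if match is None:
            continue  # header, query name, summary and blank lines are skipped
        strand, pos, score, tata_pos, tata_score = match.groups()
        records.append({
            "strand": strand,
            "tss_position": int(pos),
            "tss_score": float(score),
            "tata_position": int(tata_pos) if tata_pos else None,
            "tata_score": float(tata_score) if tata_score else None,
        })
    return records
```

Applied to the example output above, the function would return five records, one for each predicted promoter.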