AUTHORS: I.A.Shahmuradov [1,2], R.Umarov [1], V.V.Solovyev [3]
[1] Computational Bioscience Research Center, King Abdullah University of Science and Technology (KSA)
[2] Institute of Botany, Azerbaycan National Academy of Sciences (Azerbaijan)
[3] Softberry Inc. (USA)
LAST UPDATE: 06 June 2016
VERSION: 1.2016
ACCESS: http://molquest.kaust.edu.sa
http://softberry.com
TSSPlant searches plant sequences for RNA Polymerase II promoters, both TATA and TATA-less, and predicts their transcription start sites (TSSs).
Input query file: one or more sequences in FASTA format. Allowed length of each sequence: 251 - 100,000 bp.
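The length limits above can be checked before a run. The following is a minimal Python sketch, not part of TSSPlant: the function names are hypothetical, and it assumes each FASTA record has a unique header line.

```python
MIN_LEN = 251       # lower bound stated above
MAX_LEN = 100_000   # upper bound stated above


def read_fasta_lengths(lines):
    """Return {header: sequence length} for FASTA-formatted lines.

    Assumes headers are unique; a repeated header would restart its count.
    """
    lengths = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def out_of_range(lengths):
    """Return the names of sequences outside the allowed length range."""
    return [n for n, size in lengths.items()
            if not MIN_LEN <= size <= MAX_LEN]


if __name__ == "__main__":
    demo = [">ok", "A" * 300, ">too_short", "ACGT" * 10]
    print(out_of_range(read_fasta_lengths(demo)))  # ['too_short']
```

For a real file, the same functions can be applied to an open file handle, for example read_fasta_lengths(open("example.fasta")).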
The output file is plain text.
TSSPlant runs on Linux and Unix (macOS).
- GNU Fortran ("gfortran") is required.
REQUIRED
You MUST set the environment variable TSSPlant_DATA. The data directory is included with the program.
For example:
setenv TSSPlant_DATA /path/to/installation/Data_TSSPlant          (csh/tcsh)
or
export TSSPlant_DATA="/path/to/installation/Data_TSSPlant"        (sh/bash)
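A wrapper script can confirm that the variable is set before launching the program. This is a minimal Python sketch under stated assumptions: the function name is hypothetical, and the check only verifies that the value names an existing directory, not that the directory holds valid TSSPlant data.

```python
import os


def tssplant_data_dir(env=None):
    """Return the TSSPlant_DATA directory, or raise if it is unusable.

    `env` defaults to the process environment; passing a dict makes the
    function easy to test.
    """
    env = os.environ if env is None else env
    path = env.get("TSSPlant_DATA")
    if not path:
        raise RuntimeError("TSSPlant_DATA is not set")
    if not os.path.isdir(path):
        raise RuntimeError(
            f"TSSPlant_DATA does not point to a directory: {path}")
    return path


if __name__ == "__main__":
    try:
        print("Using data directory:", tssplant_data_dir())
    except RuntimeError as err:
        print("Configuration problem:", err)
```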
TSSPlant -i - Input sequence(s) file
        [-o - Output file for search results]
        [-p - print (y or Y) or not print (n or N) Query sequence(s)]
        [-x - left and right boundaries of region to select a single TSS; >=20, <=300; default: 300]
        [-h - Search on the sense cHain (1) OR both cHains (2)]
        [-n - Query genomic Nucleotide frequencies separated by comma]
        [-a - Neural Network Threshold for TATA promoters; abs(parA)<=2.0]
        [-b - Neural Network Threshold for TATA-less promoters; abs(parB)<=2.0]
        [-c - Search Criterion; if parC =
              t ... if both TATA and TATA-less TSS are predicted at distance bp or less, only TATA TSS is selected
              s ... search for a single class (TATA or TATA-less) promoter (with highest score)
              b ... search for both TATA and TATA-less promoters separately]
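The documented value ranges can be checked before the program is called. The sketch below is a hypothetical Python helper, not part of TSSPlant: it builds an argument list from the options listed above and rejects values outside the stated limits. It assumes that each flag is followed by its value as a separate argument (as in "-i example.fasta"), that -x takes an integer, and that the executable is reachable as "TSSPlant".

```python
def build_args(infile, outfile=None, print_query=None, window=None,
               strands=None, freqs=None, thr_tata=None, thr_tataless=None,
               criterion=None):
    """Build a TSSPlant argument list, validating the documented ranges."""
    args = ["TSSPlant", "-i", infile]
    if outfile is not None:
        args += ["-o", outfile]
    if print_query is not None:
        if print_query not in ("y", "Y", "n", "N"):
            raise ValueError("-p must be y, Y, n or N")
        args += ["-p", print_query]
    if window is not None:
        if not 20 <= window <= 300:
            raise ValueError("-x must be >=20 and <=300")
        args += ["-x", str(window)]
    if strands is not None:
        if strands not in (1, 2):
            raise ValueError("-h must be 1 (sense chain) or 2 (both chains)")
        args += ["-h", str(strands)]
    if freqs is not None:
        if len(freqs) != 4:
            raise ValueError("-n needs four frequencies (A, C, G, T)")
        args += ["-n", ",".join(str(f) for f in freqs)]
    for flag, value in (("-a", thr_tata), ("-b", thr_tataless)):
        if value is not None:
            if abs(value) > 2.0:
                raise ValueError(f"{flag} must satisfy abs(value) <= 2.0")
            args += [flag, str(value)]
    if criterion is not None:
        if criterion not in ("t", "s", "b"):
            raise ValueError("-c must be t, s or b")
        args += ["-c", criterion]
    return args


if __name__ == "__main__":
    print(" ".join(build_args("example.fasta", outfile="example",
                              strands=2, criterion="t")))
    # TSSPlant -i example.fasta -o example -h 2 -c t
```

The returned list can be passed to subprocess.run(args, check=True) once TSSPlant_DATA is set.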
Default value for
Default output file: TSSPlant.out
Default value for
Default Query genomic nucleotide frequencies, A/C/G/T: 0.2692,0.203,0.2212,0.3066
Default value for
Default value for
Default value for
Default value for
Option
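When the default nucleotide frequencies do not suit the query genome, a comma-separated string for -n can be derived from a representative sequence. The following Python sketch is an illustration only: the function names and the four-decimal formatting are assumptions, ambiguous bases such as N are ignored, and the A/C/G/T order follows the default frequencies listed above.

```python
from collections import Counter


def nucleotide_frequencies(seq):
    """Return [fA, fC, fG, fT], counting only unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return [counts[base] / total for base in "ACGT"]


def format_for_n(freqs, digits=4):
    """Join frequencies with commas, as the -n option expects."""
    return ",".join(f"{f:.{digits}f}" for f in freqs)


if __name__ == "__main__":
    print(format_for_n(nucleotide_frequencies("AACGTTTT")))
    # 0.2500,0.1250,0.1250,0.5000
```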
Every output file begins with a description of the program's purpose,
the input file name and the search parameters (lines 1 - 9).
Then, the following information is given:
(1) Name and Length of Query sequence;
(2) Annotated Gene/mRNA/RNA/CDS Start position;
(3) Nucleotide sequence of Query, if
(4) Total number, position(s) and score(s) of the predicted TSS(s), as well as position(s) and score(s) of TATA boxes (in case of TATA promoters).
At the end of the output file, the total statistics of the search results are presented.
An example output file for command:
TSSPlant -i example.fasta -o example -c t -h 2

Program TSSPlant
Search for RNA II promoters (TSSs)
Input file with query sequence(s): example.fasta
Thresholds, for TATA promoters: 1.52 TATA-less promoters: -0.04
Out of TSSs of different (TATA and TSTS-less) classes located at distance 300bp or less, TSS of TATA class is selected
Search on Both Strands
Query: >Example_query
Length of Query sequence: 1100
+ chain TSS position: 835 TSS score = 1.9896 TATA-box position: 802 TATA-box score = 3.0271
+ chain TSS position: 534 TSS score = 1.9886 TATA-box position: 503 TATA-box score = 3.7634
+ chain TSS position: 221 TSS score = 1.9885 TATA-box position: 187 TATA-box score = 7.9623
- chain TSS position: 296 TSS score = 1.9872 TATA-box position: 331 TATA-box score = 7.9623
- chain TSS position: 715 TSS score = 1.9826 TATA-box position: 749 TATA-box score = 6.5597
5 promoter(s) predicted
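The prediction records in the example output can be collected with a regular expression for downstream analysis. This is a minimal Python sketch based only on the example above, not an official parser: the function name is hypothetical, and treating the TATA-box fields as optional (for TATA-less predictions) is an assumption, since the example shows TATA promoters only.

```python
import re

# One prediction record: strand, TSS position and score, then optional
# TATA-box position and score (assumed absent for TATA-less promoters).
TSS_RE = re.compile(
    r"([+-]) chain\s+TSS position:\s*(\d+)\s+"
    r"TSS score\s*=\s*(-?\d+(?:\.\d+)?)"
    r"(?:\s+TATA-box position:\s*(\d+)\s+"
    r"TATA-box score\s*=\s*(-?\d+(?:\.\d+)?))?"
)


def parse_predictions(text):
    """Return a list of dicts, one per predicted TSS found in `text`."""
    predictions = []
    for match in TSS_RE.finditer(text):
        strand, pos, score, tata_pos, tata_score = match.groups()
        predictions.append({
            "strand": strand,
            "tss_pos": int(pos),
            "tss_score": float(score),
            "tata_pos": int(tata_pos) if tata_pos else None,
            "tata_score": float(tata_score) if tata_score else None,
        })
    return predictions


if __name__ == "__main__":
    sample = (
        "+ chain TSS position: 835 TSS score = 1.9896 "
        "TATA-box position: 802 TATA-box score = 3.0271\n"
        "- chain TSS position: 296 TSS score = 1.9872 "
        "TATA-box position: 331 TATA-box score = 7.9623\n"
    )
    for record in parse_predictions(sample):
        print(record["strand"], record["tss_pos"], record["tss_score"])
```

Because the pattern matches on whitespace rather than line breaks, it works whether each record sits on its own line or several records share one line.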