___________________________________________________________________
TSSPlant README
___________________________________________________________________

AUTHORS:	I.A.Shahmuradov [1,2], R.Umarov [1], V.V.Solovyev [3]
                [1] Computational Bioscience Research Center, 
		    King Abdullah University of Science and Technology (KSA) 
		[2] Institute of Botany, 
		    Azerbaycan National Academy of Sciences (Azerbaijan)
		[3] Softberry Inc. (USA)

LAST UPDATE:    06 June 2016
VERSION:        1.2016
ACCESS:         http://molquest.kaust.edu.sa
                http://softberry.com

___________________________________________________________________
Introduction
___________________________________________________________________

TSSPlant searches plant DNA sequences for RNA Polymerase II promoters
        (transcription start sites, TSSs) of both the TATA and TATA-less classes.

Input query file: one or more sequences in FASTA format.
     Allowed length for each sequence: 251 - 100,000 bp.
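
The length limits above can be checked before TSSPlant is run. The following
is a minimal sketch in Python and is not part of the TSSPlant distribution;
the helper names read_fasta and check_lengths are hypothetical, and the
limits are simply the range stated above.

```python
import io

MIN_LEN = 251       # lower length limit stated in this README
MAX_LEN = 100_000   # upper length limit stated in this README


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_lengths(handle):
    """Return (header, length) for every sequence outside the limits."""
    return [
        (header, len(seq))
        for header, seq in read_fasta(handle)
        if not MIN_LEN <= len(seq) <= MAX_LEN
    ]


# Demo on in-memory data: one sequence is too short, one is acceptable.
demo = io.StringIO(">short\nACGT\n>ok\n" + "ACGT" * 100 + "\n")
print(check_lengths(demo))
```

For a real query, open the FASTA file and pass the file object to
check_lengths; an empty list means every sequence is within the limits.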

The output file is in plain text format.

___________________________________________________________________
PREREQUISITES:
___________________________________________________________________
TSSPlant runs on Linux and other Unix-like systems (including macOS).
    - The GNU Fortran compiler "gfortran" is required.
    
___________________________________________________________________
HOW TO RUN:
___________________________________________________________________

*** REQUIRED ***
   You MUST set the environment variable TSSPlant_DATA to the location of
   the data directory (Data_TSSPlant), which is included with the program.

   For example (csh/tcsh):
         setenv TSSPlant_DATA /path/to/installation/Data_TSSPlant
   or (sh/bash):
         export TSSPlant_DATA="/path/to/installation/Data_TSSPlant"
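
Whether the variable is visible to a process can be verified with a short
script. This is only a sketch in Python: the function name data_dir_problem
is hypothetical, and it assumes that TSSPlant_DATA should name an existing
directory, as the example paths above suggest.

```python
import os


def data_dir_problem(env=None):
    """Return a message describing what is wrong with TSSPlant_DATA, or None."""
    env = os.environ if env is None else env
    path = env.get("TSSPlant_DATA")
    if not path:
        return "TSSPlant_DATA is not set"
    # Assumption: the variable must point to an existing directory.
    if not os.path.isdir(path):
        return f"TSSPlant_DATA points to a missing directory: {path}"
    return None


print(data_dir_problem() or "TSSPlant_DATA looks usable")
```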
    
___________________________________________________________________

 TSSPlant  -i <parI> - Input sequence(s) file
      [-o <parO> - Output file for search results]
      [-p <parP> - print (y or Y) or not print (n or N) Query sequence(s)]
      [-x <parX> - left and right boundaries of region to select a single TSS; >=20,<=300; default: 300]
      [-h <parH> - Search on the sense cHain (1) OR both cHains (2)]     
      [-n <parN> - Query genomic Nucleotide frequencies (A,C,G,T) separated by commas]
      [-a <parA> - Neural Network Threshold for TATA      promoters;  abs(parA)<=2.0  ]
      [-b <parB> - Neural Network Threshold for TATA-less promoters;  abs(parB)<=2.0  ]
      [-c <parC> - Search Criterion; if parC =
             t ... if both TATA and TATA-less TSSs are predicted at a distance of <parX> bp or less,
                   only the TATA TSS is selected
             s ... search for a promoter of a single class (TATA or TATA-less), the one with the highest score
             b ... search for both TATA and TATA-less promoters separately]
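
The rule stated for criterion "t" can be illustrated with a few lines of
Python. This is only a sketch of the rule as worded above, not TSSPlant's
internal implementation: the function name apply_criterion_t and the tuple
layout are hypothetical, and all positions are assumed to lie on one chain.

```python
def apply_criterion_t(predictions, par_x=300):
    """Drop TATA-less TSSs lying within par_x bp of any TATA TSS.

    predictions: list of (position, promoter_class) tuples, where
    promoter_class is "TATA" or "TATA-less" (hypothetical layout).
    """
    tata_positions = [pos for pos, cls in predictions if cls == "TATA"]
    kept = []
    for pos, cls in predictions:
        near_tata = any(abs(pos - t) <= par_x for t in tata_positions)
        if cls == "TATA-less" and near_tata:
            continue  # the TATA TSS is preferred within the par_x window
        kept.append((pos, cls))
    return kept


# The TATA-less TSS at 400 is 179 bp from the TATA TSS at 221 and is dropped;
# the one at 900 is farther than 300 bp away and is kept.
demo = [(221, "TATA"), (400, "TATA-less"), (900, "TATA-less")]
print(apply_criterion_t(demo, par_x=300))
```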



    Default value for <parP>: n (do not print query sequences)
    Default output file: TSSPlant.out 
    Default value for <parX>: 300
    Default Query genomic nucleotide frequencies, A/C/G/T: 0.2692,0.203,0.2212,0.3066  
    Default value for <parA>: none (if -a is not given, the program's built-in threshold is used)
    Default value for <parB>: none (if -b is not given, the program's built-in threshold is used)
    Default value for <parH>: 1
    Default value for <parC>: t
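
When the default nucleotide frequencies do not suit the query genome, a value
for -n can be derived from a representative sequence. The following Python
sketch is an illustration only: the function names are hypothetical, symbols
other than A, C, G and T are ignored by assumption, and the A,C,G,T order and
comma-separated format follow the default shown above.

```python
from collections import Counter


def nucleotide_frequencies(sequence):
    """Return A, C, G, T frequencies, ignoring any other symbols (e.g. N)."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return [counts[base] / total for base in "ACGT"]


def format_for_n_option(freqs, digits=4):
    """Join frequencies with commas, in the A,C,G,T order of the defaults."""
    return ",".join(f"{value:.{digits}f}" for value in freqs)


# 2 A, 1 C, 1 G and 4 T out of 8 counted bases; the two N symbols are skipped.
print(format_for_n_option(nucleotide_frequencies("AACGTTTTNN")))
```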

    Option -i <parI> is mandatory: if it is not given, TSSPlant stops.
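
The option ranges listed above can be checked before the program is started.
The following Python sketch builds an argument list and rejects values outside
the documented ranges. The function name build_command and its keyword names
are hypothetical; the sketch also assumes that <parX> is an integer and that
the executable is called TSSPlant and is on the PATH.

```python
def build_command(infile, outfile=None, print_query=None, par_x=None,
                  chains=None, freqs=None, thr_tata=None, thr_tataless=None,
                  criterion=None):
    """Return an argument list for TSSPlant after checking documented ranges."""
    if not infile:
        raise ValueError("-i is mandatory")
    cmd = ["TSSPlant", "-i", infile]
    if outfile is not None:
        cmd += ["-o", outfile]
    if print_query is not None:
        if print_query not in ("y", "Y", "n", "N"):
            raise ValueError("-p must be y, Y, n or N")
        cmd += ["-p", print_query]
    if par_x is not None:
        if not 20 <= par_x <= 300:
            raise ValueError("-x must be between 20 and 300")
        cmd += ["-x", str(par_x)]
    if chains is not None:
        if chains not in (1, 2):
            raise ValueError("-h must be 1 or 2")
        cmd += ["-h", str(chains)]
    if freqs is not None:
        if len(freqs) != 4:
            raise ValueError("-n needs four frequencies (A, C, G, T)")
        cmd += ["-n", ",".join(str(f) for f in freqs)]
    for flag, value in (("-a", thr_tata), ("-b", thr_tataless)):
        if value is not None:
            if abs(value) > 2.0:
                raise ValueError(f"{flag} must satisfy abs(value) <= 2.0")
            cmd += [flag, str(value)]
    if criterion is not None:
        if criterion not in ("t", "s", "b"):
            raise ValueError("-c must be t, s or b")
        cmd += ["-c", criterion]
    return cmd


# Example: search both chains with criterion t.
print(" ".join(build_command("example.fasta", outfile="example",
                             chains=2, criterion="t")))
```

The returned list can be passed to subprocess.run(cmd, check=True) once
TSSPlant_DATA has been set.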

___________________________________________________________________

TSSPlant output:

Every output file begins with the program name and purpose, the input
      file name and the search parameters (lines 1 - 9).
      Then, the following information is given for each query:
          (1) Name and length of the query sequence;
          (2) Annotated gene/mRNA/RNA/CDS start position;
          (3) Nucleotide sequence of the query, if <parP>="y";
          (4) Total number, position(s) and score(s) of the predicted TSS(s),
              as well as position(s) and score(s) of TATA boxes
              (in the case of TATA promoters).
      At the end of the output file, the total statistics of the search results are presented.

___________________________________________________________________

An example output file for the command:

         TSSPlant -i example.fasta -o example -c t -h 2   


Program TSSPlant
Search for RNA II promoters (TSSs)
Input file with query sequence(s): example.fasta
Thresholds, for TATA      promoters:   1.52
                TATA-less promoters:  -0.04
Out of TSSs of different (TATA and TSTS-less) classes located at distance  300bp or less, TSS of TATA class is selected
Search on Both Strands


Query: >Example_query
Length of Query sequence:   1100
+ chain  TSS position:    835   TSS score =   1.9896     TATA-box position:    802   TATA-box score =   3.0271
+ chain  TSS position:    534   TSS score =   1.9886     TATA-box position:    503   TATA-box score =   3.7634
+ chain  TSS position:    221   TSS score =   1.9885     TATA-box position:    187   TATA-box score =   7.9623
- chain  TSS position:    296   TSS score =   1.9872     TATA-box position:    331   TATA-box score =   7.9623
- chain  TSS position:    715   TSS score =   1.9826     TATA-box position:    749   TATA-box score =   6.5597
     5 promoter(s) predicted
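
Prediction lines such as those above can be turned into records for further
processing. The following Python sketch is based only on the line layout shown
in this example; the function name parse_predictions and the record keys are
hypothetical. It assumes that lines for TATA-less promoters simply lack the
TATA-box fields, which this README does not show.

```python
import re

# Pattern derived from the example output; the TATA-box part is optional
# (assumption: TATA-less predictions omit it).
LINE_RE = re.compile(
    r"^(?P<strand>[+-]) chain\s+TSS position:\s*(?P<tss_pos>\d+)"
    r"\s+TSS score =\s*(?P<tss_score>-?\d+(?:\.\d+)?)"
    r"(?:\s+TATA-box position:\s*(?P<tata_pos>\d+)"
    r"\s+TATA-box score =\s*(?P<tata_score>-?\d+(?:\.\d+)?))?"
)


def parse_predictions(text):
    """Return one dict per predicted TSS line found in TSSPlant output text."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # header, query and summary lines are skipped
        rec = {
            "strand": match["strand"],
            "tss_pos": int(match["tss_pos"]),
            "tss_score": float(match["tss_score"]),
            "tata_pos": None,
            "tata_score": None,
        }
        if match["tata_pos"] is not None:
            rec["tata_pos"] = int(match["tata_pos"])
            rec["tata_score"] = float(match["tata_score"])
        records.append(rec)
    return records


sample = """\
+ chain  TSS position:    835   TSS score =   1.9896     TATA-box position:    802   TATA-box score =   3.0271
- chain  TSS position:    296   TSS score =   1.9872     TATA-box position:    331   TATA-box score =   7.9623
     5 promoter(s) predicted
"""
for rec in parse_predictions(sample):
    print(rec)
```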
