
PlantProm: Plant Promoter Database

For a short description of RegSite, the database of plant regulatory sites, click here.

PlantProm DB

A Database of Plant Promoter Sequences

(Release 2009.02)

PlantProm DB was initially developed as an annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start site(s) (TSS) from various plant species. The first release of the DB, 2002.01, was developed by the Department of Computer Science at Royal Holloway, University of London, in collaboration with Softberry Inc. (USA). It contained 305 entries from monocot, dicot and other plants.

The new release of PlantProm DB contains 578 unrelated entries, including 151, 396 and 31 promoters with experimentally verified TSS from monocot, dicot and other plants, respectively. In other promoter sets, TSSs identified by mapping full-length cDNAs/5'-ESTs or by CAGE and SAGE approaches remain to be confirmed by direct experimental evidence. By contrast, this DB and The Eukaryotic Promoter Database (134 unrelated plant promoters; see: http://www.epd.isb-sib.ch/) present published promoter sequences with TSS(s) determined by direct experimental approaches, and therefore serve as the most accurate sources for the development of computational promoter prediction tools (for example, see: TSSP-TCM, TSSP, FPROM, CONPRO). The following criteria were followed in collecting experimentally verified plant gene promoters.

There is experimental evidence of the TSS position(s) of the gene, published in the literature. For genes with multiple TSSs, the one nearest to the CDS start position is taken if no additional information on the predominance of one of them is available (positions of the other TSSs are given in the name line of the sequence in FASTA format).
The known promoter sequence upstream of the chosen TSS is 200 bp or longer. All stored promoter sequences have the same length, 251 bp, where position 201 corresponds to the TSS; i.e. the collected sequences occupy the region [-200:+51], with the TSS at position +1, and thus represent the proximal promoters mentioned above.
Each entry corresponds to a gene mapped on genomic sequences.
Various alleles of a gene are represented in the database by a single entry.
Genes with more than one non-allelic copy in the genome as well as paralogous genes are taken as different entries.
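
The coordinate convention described above (251 bp, TSS at position 201, region [-200:+51] with no position 0) can be illustrated with a minimal Python sketch. The function and constant names are hypothetical and are not part of PlantProm DB; only the numbers come from the text.

```python
PROMOTER_LENGTH = 251  # stored sequence length stated in the text
TSS_INDEX = 201        # 1-based position of the TSS in each stored sequence


def index_to_relative(index: int) -> int:
    """Convert a 1-based sequence index (1..251) to a TSS-relative coordinate.

    There is no position 0: index 200 maps to -1 and index 201 to +1.
    """
    if not 1 <= index <= PROMOTER_LENGTH:
        raise ValueError(f"index {index} is outside 1..{PROMOTER_LENGTH}")
    offset = index - TSS_INDEX
    return offset + 1 if offset >= 0 else offset


def relative_to_index(rel: int) -> int:
    """Convert a TSS-relative coordinate (-200..-1, +1..+51) to a 1-based index."""
    if rel == 0 or not -200 <= rel <= 51:
        raise ValueError(f"relative position {rel} is outside [-200:+51]")
    return rel + TSS_INDEX - 1 if rel > 0 else rel + TSS_INDEX
```

For example, `index_to_relative(1)` gives -200 and `relative_to_index(51)` gives 251, the two ends of the stored region.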


Moreover, 3503 and 4220 promoters with TSS predicted by mapping full-length cDNAs on genomic sequences from Arabidopsis and rice, respectively, were added to the new release of the DB.
In total, 8301 plant promoter entries are available in the current release of PlantProm DB.

PlantProm DB provides the following information.

1. DNA sequences of 576 experimentally verified (annotated) and 7723 mapped promoter regions [-200:+51], with the annotated or mapped TSS at the fixed position 201, from various plant species, in FASTA format, including:
1.1. 150 annotated promoters of monocots
1.2. 403 annotated promoters of dicots
1.3. 23 annotated promoters from other plants
1.4. 3503 mapped promoters from Arabidopsis
1.5. 4220 mapped promoters from rice
1.6. 345 annotated TATA promoters, consisting of 84 monocot, 256 dicot and 5 other plant species sequences, respectively (with the location of TATA-box core motifs given in capital letters).
1.7. 873 and 374 mapped TATA promoters from Arabidopsis and rice, respectively (with location of TATA-box core-motifs given in capital letters).
1.8. 231 annotated TATA-less promoters, consisting of 66 monocot, 147 dicot and 18 other plant species sequences, respectively.
1.9. 2669 and 3846 mapped TATA-less promoters from Arabidopsis and rice, respectively.
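
Since the TATA promoter files mark TATA-box core motifs in capital letters, their positions relative to the TSS can be recovered directly from the sequence string. The sketch below assumes that the rest of the sequence is in lower case and that the sequence has already been read from the FASTA file as a single 251-character string; the function names are hypothetical.

```python
import re

TSS_INDEX = 201  # 1-based position of the TSS in each stored sequence


def to_relative(index: int) -> int:
    """Map a 1-based index to a TSS-relative coordinate (no position 0)."""
    offset = index - TSS_INDEX
    return offset + 1 if offset >= 0 else offset


def uppercase_motifs(sequence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) for every run of capital letters.

    start and end are inclusive TSS-relative coordinates.
    Assumption: only the TATA-box core motif is written in capitals.
    """
    hits = []
    for match in re.finditer(r"[A-Z]+", sequence):
        start_index = match.start() + 1  # convert 0-based to 1-based
        end_index = match.end()          # 0-based exclusive == 1-based inclusive
        hits.append((to_relative(start_index), to_relative(end_index), match.group()))
    return hits
```

For a made-up sequence with `TATAAA` at indices 166 to 171, `uppercase_motifs` reports the motif at positions -35 to -30.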

2. Taxonomic and promoter type classification of promoters, including:
2.1. Summary of Species and Promoter Classification,
2.2. Individual Characteristics of Genes/Promoters and Original Data Sources

3. Nucleotide Frequency Matrices for canonical promoter elements (TATA-box, CCAAT-box, and TSS-motif or Initiator element, Inr), including:
3.1. TATA-matrices for various promoter collections,
3.2. CCAAT-matrices for various promoter collections,
3.3. TSS-motif-matrices for various promoter collections.
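
As a rough illustration of what a nucleotide frequency matrix contains, the following sketch computes per-column base frequencies from a set of aligned, equal-length motif occurrences. This is a generic calculation under that assumption, not necessarily the procedure used to build the PlantProm matrices; the function name and the example sites are made up.

```python
from collections import Counter


def frequency_matrix(sites: list[str]) -> dict[str, list[float]]:
    """Return per-position frequencies of A, C, G and T for aligned sites.

    All sites must have the same length; characters other than A, C, G, T
    are ignored when computing the column totals.
    """
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    matrix: dict[str, list[float]] = {base: [] for base in "ACGT"}
    for column in zip(*sites):
        counts = Counter(base.upper() for base in column)
        total = sum(counts[base] for base in "ACGT")
        for base in "ACGT":
            matrix[base].append(counts[base] / total if total else 0.0)
    return matrix
```

With the four toy sites `TATA`, `TATA`, `TAAA` and `CATA`, the first column has frequency 0.75 for T and 0.25 for C.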

4. Nucleotide composition of promoter regions before [-200:-1] and after [+1:+51] the TSS in various promoter collections.
5. Location of CCAAT-boxes in some of the promoter collections mentioned above, including:
5.1. 227 annotated promoters of both (TATA and TATA-less) types from various plant species,
5.2. 1483 mapped promoters from Arabidopsis,
5.3. 1187 mapped promoters from rice.
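
The split into regions before and after the TSS used for the nucleotide composition (item 4) follows from the fixed layout of the stored sequences. A minimal sketch, assuming a 251-character sequence with the TSS at position 201; the function name is hypothetical:

```python
def region_composition(sequence: str) -> tuple[dict[str, float], dict[str, float]]:
    """Return base frequencies for [-200:-1] and [+1:+51] of one promoter."""
    if len(sequence) != 251:
        raise ValueError("expected a 251 bp promoter sequence")

    upstream = sequence[:200].upper()    # indices 1..200, i.e. [-200:-1]
    downstream = sequence[200:].upper()  # indices 201..251, i.e. [+1:+51]

    def composition(region: str) -> dict[str, float]:
        # Frequencies are taken over the full region length.
        return {base: region.count(base) / len(region) for base in "ACGT"}

    return composition(upstream), composition(downstream)
```

Averaging these per-sequence values over a collection would give collection-level figures comparable in kind to those listed in the database.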

6. Statistically Significant Motifs of 1577 known Plant Transcription Regulatory Elements and their Consensus Sequences found in Promoter Sequences:
6.1. 576 experimentally verified promoters
6.2. 3503 mapped promoters from Arabidopsis
6.3. 4220 mapped promoters from rice

7. Short description of the computation of nucleotide frequency matrices for various promoter elements

8. HELP for downloading data

References:

[1] Ilham A. Shahmuradov, Alex J. Gammerman, John M. Hancock, Peter M. Bramley and Victor V. Solovyev (2003) PlantProm: a database of plant promoter sequences. Nucleic Acids Res., 31, 114-117.

[2] Amina U. Abdulazimova, Nurmamed Sh. Mustafayev, Yagut Yu. Akbarova, Victor V. Solovyev, Jalal A. Aliyev and Ilham A. Shahmuradov (2009) PlantProm DB, Release 2009.02: plant promoter sequences. Nucleic Acids Res. (submitted).

Acknowledgements: PlantProm Database is partially funded by the Pakistan HEC Startup Grant entitled "Setting up of Bioinformatics Research at the Department of Biosciences, COMSATS Institute of Information Technology", and is designed and maintained at COMSATS Institute of Information Technology (Islamabad, Pakistan) in collaboration with Softberry Inc. (USA), www.softberry.com (mirror site).
Send questions/comments to: Ilham Shahmuradov ilham@comsats.edu.pk and/or Victor Solovyev victor@softberry.com