PlantProm DB was initially developed as an annotated, non-redundant collection of
proximal promoter sequences for RNA polymerase II with experimentally determined
transcription start sites (TSSs) from various plant species. The first release
of the DB, 2002.01, was developed by the Department of Computer Science at Royal Holloway,
University of London, in collaboration with Softberry Inc. (USA).
It contained 305 entries from monocot, dicot and other plants.
The new release of PlantProm DB contains 578 unrelated entries, including 151, 396
and 31 promoters with experimentally verified TSSs from monocot, dicot and other
plants, respectively. In other promoter sets, TSSs identified by full-length
cDNA/5'-EST mapping, CAGE and SAGE approaches remain to be confirmed by direct
experimental evidence. By contrast, this DB and the Eukaryotic Promoter Database
(134 unrelated plant promoters; see: http://www.epd.isb-sib.ch/ ) present published
promoter sequences with TSS(s) determined by direct experimental approaches, and
therefore serve as the most accurate sources for developing computational promoter
prediction tools (for example, see: TSSP-TCM, TSSP, FPROM, CONPRO).
The following criteria were followed in collecting experimentally verified plant gene promoters.
• There is experimental evidence of the TSS position(s) of the gene, published in
the literature. For genes with multiple TSSs, the one nearest to the CDS start position
is taken if no additional information on the predominance of one of them is available
(positions of the other TSSs are given in the name line of the sequence, written in
FASTA format).
• The known promoter sequence upstream of the chosen TSS is 200 bp or longer;
all stored promoter sequences have the same length, 251 bp, where position 201
corresponds to the TSS, i.e. the collected sequences occupy the region [-200:+51],
with the TSS at position +1, and thus represent the proximal promoters mentioned
above.
• An entry corresponds to a gene mapped on the genomic sequence.
• Various alleles of a gene are represented in the database by a single entry.
• Genes with more than one non-allelic copy in the genome, as well as paralogous genes,
are taken as different entries.
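The TSS choice and the fixed [-200:+51] window described in the criteria above can be sketched in Python. This is only an illustration, not the procedure actually used to build the database: the function names `select_tss` and `extract_promoter` are hypothetical, coordinates are assumed to be 1-based, and only plus-strand genes are handled (a minus-strand gene would additionally need reverse complementing).

```python
def select_tss(tss_positions, cds_start):
    """Pick the TSS nearest to the CDS start.

    Assumption: used only when no TSS is known to predominate.
    """
    if not tss_positions:
        raise ValueError("at least one TSS position is required")
    return min(tss_positions, key=lambda pos: abs(cds_start - pos))


def extract_promoter(genome, tss, upstream=200, downstream=50):
    """Return the [-200:+51] window around a 1-based, plus-strand TSS.

    There is no position 0: -200..-1 is 200 bp and +1..+51 is 51 bp,
    so the window is 251 bp long with the TSS at position 201.
    Returns None when the known sequence is too short on either side.
    """
    start = tss - upstream - 1   # 0-based index of position -200
    end = tss + downstream       # exclusive 0-based end, one past +51
    if start < 0 or end > len(genome):
        return None
    return genome[start:end]


# Toy example: 200 bp upstream, a "G" at the TSS, then 60 bp downstream.
genome = "A" * 200 + "G" + "C" * 60
tss = select_tss([150, 201, 120], cds_start=230)
promoter = extract_promoter(genome, tss)
```

In this toy example the TSS at genomic position 201 is the closest to the CDS start, and the extracted window places it at index 200 of the string, i.e. position 201 of the stored sequence.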
In addition, 3503 and 4220 promoters with TSSs predicted by mapping full-length cDNAs
onto genomic sequences from Arabidopsis and rice, respectively, were added to the new
release of the DB. In total, 8301 plant promoter entries are available in the current
release of PlantProm DB.
PlantProm DB provides the following information.
1. DNA sequences of 576 experimentally verified (annotated) and 7723 mapped promoter
regions [-200:+51], with the annotated or mapped TSS at the fixed position 201, from
various plant species, in FASTA format.
2. Taxonomic and promoter type classification of promoters.
3. Nucleotide frequency matrices for canonical promoter elements (TATA-box, CCAAT-box,
and TSS-motif or Initiator element, Inr).
4. Nucleotide composition of promoter regions before [-200:-1] and after [+1:+51] the TSS
in various promoter collections.
5. Location of CCAAT-boxes in some of the promoter collections mentioned above.
6. Statistically significant motifs of 1577 known plant transcription regulatory elements
and their consensuses found in promoter sequences.
7. Short description of the computation of nucleotide frequency matrices for various
promoter elements.
8. HELP for downloading data.
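To make the nucleotide frequency matrices and the before/after-TSS composition listed above more concrete, here is a minimal Python sketch. It is not the computation documented by the database; it assumes pre-aligned motif instances of equal length and 251 bp promoter strings with the TSS at index 200, and the names `frequency_matrix`, `composition` and `split_composition` are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def frequency_matrix(sites):
    """Per-position relative frequencies of A, C, G, T in aligned sites."""
    if not sites:
        raise ValueError("no sites given")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):  # iterate over alignment columns
        counts = Counter(base.upper() for base in column)
        matrix.append({base: counts[base] / len(sites) for base in BASES})
    return matrix


def composition(sequence):
    """Fraction of each base in a sequence; non-ACGT symbols are ignored."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in BASES)
    if total == 0:
        return {base: 0.0 for base in BASES}
    return {base: counts[base] / total for base in BASES}


def split_composition(promoter, tss_index=200):
    """Composition of [-200:-1] and of [+1:+51] for one stored promoter."""
    return composition(promoter[:tss_index]), composition(promoter[tss_index:])


# Toy TATA-box-like instances (made up for illustration only).
sites = ["TATAAA", "TATATA", "TATAAA", "TTTAAA"]
matrix = frequency_matrix(sites)
upstream, downstream = split_composition("A" * 200 + "G" * 51)
```

Each row of `matrix` is one motif position; averaging `split_composition` over a whole collection would give collection-level composition of the regions before and after the TSS.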
References:
[1] Ilham A. Shahmuradov, Alex J. Gammerman, John M. Hancock, Peter M. Bramley and
Victor V. Solovyev (2003) PlantProm: a database of plant promoter sequences. Nucleic
Acids Res., 31, 114-117.
[2] Amina U. Abdulazimova, Nurmamed Sh. Mustafayev, Yagut Yu. Akbarova, Victor V.
Solovyev, Jalal A. Aliyev and Ilham A. Shahmuradov (2009) PlantProm DB, Release
2009.02 : plant promoter sequences. Nucleic Acids Res. (submitted).
Acknowledgements: PlantProm Database is partially funded by the Pakistan HEC
Startup Grant entitled "Setting up of Bioinformatics Research at the Department
of Biosciences, COMSATS Institute of Information Technology", and is designed and
maintained at the COMSATS Institute of Information Technology (Islamabad, Pakistan),
in collaboration with Softberry Inc. (USA), www.softberry.com (mirror site).
Send questions/comments to: Ilham Shahmuradov
ilham@comsats.edu.pk and/or Victor Solovyev
victor@softberry.com