ProtComp - Version 9:
Program for Identification of sub-cellular localization of Eukaryotic proteins:
Plants

ProtComp combines several methods of protein localization prediction: neural network-based prediction; direct comparison with an updated database of homologous proteins of known localization; comparison of pentamer distributions calculated for the query and database sequences; and prediction of certain functional peptide sequences, such as signal peptides, transit peptides of mitochondria and chloroplasts, and transmembrane segments. As a result, the program handles only complete sequences correctly, that is, sequences that still contain their signal sequences and other functional peptides, if any.

Output sample:

Seq name: C4J1B2 Location:Plasma membrane  DE  PUTATIVE UNCHARACTERIZED PROTEIN;, Length=568
Significant similarity in Potential Location DB - Plasma membrane 
Database sequence: AC=B7ZX38 Location:Plasma membrane  DE  PUTATIVE UNCHARACTERIZE
Score=100, Sequence length=609, Alignment length=568
Predicted by Neural Nets - Plasma membrane with score    2.6
******** Potential GPI-anchor in position 561 is found
Integral Prediction of protein location: Plasma membrane with score    9.8
Location weights:     LocDB / PotLocDB / Neural Nets / Pentamers / Integral
 Nuclear                0.0 /      1.6 /        0.00 /      0.04 /     0.00
 Plasma membrane        2.9 /      3.4 /        2.58 /      5.44 /     9.82
 Extracellular          0.0 /      0.0 /        0.00 /      0.03 /     0.00
 Cytoplasmic            0.0 /      0.0 /        0.01 /      0.00 /     0.12
 Mitochondrial          0.0 /      0.0 /        0.00 /      0.00 /     0.00
 Endoplasm. retic.      0.0 /      0.0 /        0.14 /      0.00 /     0.00
 Peroxisomal            0.0 /      0.0 /        0.22 /      0.00 /     0.00
 Golgi                  0.0 /      0.0 /        0.00 /      0.15 /     0.00
 Chloroplast            0.0 /      0.0 /        0.05 /      0.00 /     0.06
 Vacuolar               2.1 /      0.0 /        0.00 /      0.00 /     0.00 

LocDB: scores based on the query protein's homology to proteins of known localization.
PotLocDB: scores based on homology to proteins whose locations are not experimentally known but are assumed from strong theoretical evidence.
Neural Nets: scores assigned by neural networks.
Pentamers: scores based on comparisons of pentamer distributions calculated for the query and database sequences.
Integral: final scores that combine all of the scores above.
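For readers who want to post-process a report, the "Location weights" table can be read into a dictionary keyed by location and column name. The following is only a minimal sketch that assumes the text layout shown in the output sample above; the function name parse_location_weights and the shortened sample report are illustrative and are not part of ProtComp.

```python
# Column names as they appear in the "Location weights" header line.
COLUMNS = ("LocDB", "PotLocDB", "Neural Nets", "Pentamers", "Integral")


def parse_location_weights(report: str) -> dict:
    """Return {location: {column: score}} for the 'Location weights' table.

    Assumes the layout of the sample output: a header line starting with
    'Location weights:' followed by rows whose fields are separated by '/'.
    """
    table = {}
    in_table = False
    for line in report.splitlines():
        if line.startswith("Location weights:"):
            in_table = True
            continue
        if not in_table:
            continue
        fields = [field.strip() for field in line.split("/")]
        if len(fields) != len(COLUMNS):
            break  # a blank or differently shaped line ends the table
        # The first field holds the location name followed by the LocDB score.
        name, first_score = fields[0].rsplit(None, 1)
        scores = [float(first_score)] + [float(f) for f in fields[1:]]
        table[name.strip()] = dict(zip(COLUMNS, scores))
    return table


# Shortened copy of the sample report (three of the ten rows).
sample_report = """\
Integral Prediction of protein location: Plasma membrane with score    9.8
Location weights:     LocDB / PotLocDB / Neural Nets / Pentamers / Integral
 Nuclear                0.0 /      1.6 /        0.00 /      0.04 /     0.00
 Plasma membrane        2.9 /      3.4 /        2.58 /      5.44 /     9.82
 Endoplasm. retic.      0.0 /      0.0 /        0.14 /      0.00 /     0.00

LocDB: scores based on homologies.
"""

weights = parse_location_weights(sample_report)
print(weights["Plasma membrane"]["Integral"])
```

Splitting on "/" rather than on whitespace keeps multi-word location names such as "Plasma membrane" and "Endoplasm. retic." intact.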

The scores are renormalized so that their sum over all localizations equals 3 (Neural Nets), 5 (PotLocDB, Pentamers) or 10 (LocDB, Integral).
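To illustrate what a fixed column total means, the sketch below rescales a set of weights proportionally so that they add up to a chosen total. This is an assumption made for illustration only: the page does not say how ProtComp performs its renormalization, and the function name renormalize and the input numbers are made up.

```python
def renormalize(weights: dict, total: float) -> dict:
    """Scale the weights proportionally so that they sum to `total`.

    Proportional scaling is an assumption for illustration; ProtComp's
    actual renormalization procedure is not described on this page.
    """
    current_sum = sum(weights.values())
    if current_sum == 0:
        # Nothing to scale: keep every location at zero.
        return dict.fromkeys(weights, 0.0)
    return {name: value * total / current_sum for name, value in weights.items()}


# Made-up raw weights, rescaled to the total used for the Integral column.
raw = {"Plasma membrane": 6.0, "Vacuolar": 2.0}
scaled = renormalize(raw, 10.0)
print(scaled)
```

Proportional scaling preserves the ratio between any two locations, so the ranking of compartments is unchanged.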

An SBLAST comparison of the query with databases of multilocated proteins is used to recognize proteins with multiple localizations. If significant similarity is found, an additional message is output (e.g. "The protein is possibly multilocated: Nucleus_and_Endoplasmic_Reticulum_and_Membrane due to SBLAST search in MultiLocDB").
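A report can be scanned for this message with a regular expression. The sketch below assumes the message always has the wording of the example quoted above, with compartments joined by "_and_"; the function name multilocated_compartments is hypothetical.

```python
import re

# Pattern modeled on the single example message quoted above (an assumption).
MULTILOC_RE = re.compile(
    r"The protein is possibly multilocated:\s*(\S+)\s+"
    r"due to SBLAST search in MultiLocDB"
)


def multilocated_compartments(report: str) -> list:
    """Return the compartments named in a multilocation message, if any."""
    match = MULTILOC_RE.search(report)
    if match is None:
        return []
    # Compartments are joined by "_and_"; underscores inside a name
    # stand for spaces.
    return [part.replace("_", " ") for part in match.group(1).split("_and_")]


message = (
    "The protein is possibly multilocated: "
    "Nucleus_and_Endoplasmic_Reticulum_and_Membrane "
    "due to SBLAST search in MultiLocDB"
)
compartments = multilocated_compartments(message)
print(compartments)
```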

When interpreting the output, keep in mind that:

1. ProtComp scores per se, being weights of complex functions, do not represent probabilities of the protein's location in a particular compartment.
2. Significant homology with a protein of known location is a very strong indicator of the query protein's location.
3. For Neural Nets (NNets) scores, relative values across compartments matter more than absolute values: if the second-best score is much lower than the best one, the prediction is more reliable, regardless of the absolute values.
4. If both NNets and other predictions point to the same compartment, the prediction is very reliable.
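Points 3 and 4 can be checked mechanically once the scores are available as numbers. The sketch below computes the gap between the best and second-best NNets scores and tests whether another method names the same compartment. The function name top_two_margin is hypothetical, the values are copied from the output sample above, and no threshold for "much lower" is implied, since the page does not give one.

```python
def top_two_margin(scores: dict) -> tuple:
    """Return (best location, best score minus second-best score)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_name, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_name, best_score - second_score


# Neural Nets and Pentamers columns from the output sample
# (locations scoring zero are omitted for brevity).
nnets = {
    "Plasma membrane": 2.58,
    "Peroxisomal": 0.22,
    "Endoplasm. retic.": 0.14,
    "Chloroplast": 0.05,
    "Cytoplasmic": 0.01,
}
pentamers = {
    "Plasma membrane": 5.44,
    "Golgi": 0.15,
    "Nuclear": 0.04,
    "Extracellular": 0.03,
}

best_nnets, margin = top_two_margin(nnets)
best_pentamers, _ = top_two_margin(pentamers)
methods_agree = best_nnets == best_pentamers

print(best_nnets, round(margin, 2), methods_agree)
```

For the sample protein, the NNets margin is large relative to the column total of 3, and both methods name the plasma membrane, which matches the reported integral prediction.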