ProtCompB combines several methods of protein localization prediction: neural network-based prediction; direct comparison with databases of homologous proteins of known localization; comparison of pentamer distributions calculated for query and database sequences; and prediction of certain functional peptide sequences, such as signal peptides and transmembrane segments. Consequently, the program handles only complete sequences correctly, that is, sequences that still contain their signal sequences, anchors, and other functional peptides, if any.
For proteins of Gram-positive bacteria, three locations are predicted: Cytoplasmic, Membrane and Extracellular (Secreted).
For proteins of Gram-negative bacteria, five locations are predicted: Cytoplasmic, Membrane (Outer and Inner), Periplasmic and Extracellular (Secreted).
If the bacteria type is not defined, locations for Gram-negative bacteria are predicted.
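The choice of location set described above can be sketched as a small lookup. This is only an illustration in Python, not ProtCompB's own code: the function name `predicted_locations` and the string keys "gram-positive" and "gram-negative" are hypothetical.

```python
# Hypothetical helper illustrating which location set applies.
# It is not part of ProtCompB; names and keys are assumptions.
GRAM_POSITIVE = ["Cytoplasmic", "Membrane", "Extracellular (Secreted)"]
GRAM_NEGATIVE = [
    "Cytoplasmic",
    "Outer Membrane",
    "Inner Membrane",
    "Periplasmic",
    "Extracellular (Secreted)",
]


def predicted_locations(bacteria_type=None):
    """Return the locations predicted for the given bacteria type.

    bacteria_type: "gram-positive", "gram-negative", or None (undefined).
    Anything other than "gram-positive" falls back to the Gram-negative
    set, mirroring the documented default for an undefined type.
    """
    if bacteria_type == "gram-positive":
        return list(GRAM_POSITIVE)
    return list(GRAM_NEGATIVE)


print(predicted_locations("gram-positive"))
print(predicted_locations())  # type not defined: Gram-negative set
```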
ProtComp Version 9.0. Identifying sub-cellular location Bacterial (Gram negative)
Seq name: test sequence, Length=233
Significant similarity in Location DB - Location:Secreted
Database sequence: AC=Q59639 Location:Secreted DE ALGINATE LYASE PRECURSOR (EC 126.96.36.199) (POLY(BETA-D
Score=12350, Sequence length=233, Alignment length=233
Predicted by Neural Nets - Secreted with score 2.9
Integral Prediction of protein location: Secreted with score 9.9
Location weights:  LocDB / PotLocDB / NNets / Pentamers / Integral
Cytoplasmic         0.00 /     0.00 /  0.00 /      0.19 /     0.00
Membrane            0.00 /     0.00 /  0.13 /      0.09 /     0.09
Secreted           10.00 /     0.00 /  2.85 /      4.69 /     9.91
Periplasmic         0.00 /     0.00 /  0.02 /      0.03 /     0.00
LocDB: scores based on the query protein's homology with proteins of known localization.
PotLocDB: scores based on homology with proteins whose locations are not experimentally known but are assumed from strong theoretical evidence.
NNets: scores assigned by neural networks.
Pentamers: scores based on comparisons of pentamer distributions calculated for QUERY and DB sequences.
Integral: final scores that combine all of the above scores using a final neural net.
The scores are renormalized so that the sum of scores over all localizations equals 3 (NNets), 5 (PotLocDB, Pentamers) or 10 (LocDB, Integral).
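The renormalization can be checked against the sample output by summing each column. The following Python sketch copies the numbers from the sample; the names `COLUMNS`, `WEIGHTS`, `DOCUMENTED_TOTAL` and `column_sums` are illustrative and not part of ProtComp.

```python
# Scores copied from the sample output; each list follows the
# "Location weights" column order given in COLUMNS.
COLUMNS = ["LocDB", "PotLocDB", "NNets", "Pentamers", "Integral"]
WEIGHTS = {
    "Cytoplasmic": [0.00, 0.00, 0.00, 0.19, 0.00],
    "Membrane":    [0.00, 0.00, 0.13, 0.09, 0.09],
    "Secreted":    [10.00, 0.00, 2.85, 4.69, 9.91],
    "Periplasmic": [0.00, 0.00, 0.02, 0.03, 0.00],
}
# Totals stated in the text for each score type.
DOCUMENTED_TOTAL = {
    "LocDB": 10, "PotLocDB": 5, "NNets": 3, "Pentamers": 5, "Integral": 10,
}


def column_sums(weights):
    """Sum each score column over all localizations, rounded to 2 decimals."""
    return {
        name: round(sum(row[i] for row in weights.values()), 2)
        for i, name in enumerate(COLUMNS)
    }


sums = column_sums(WEIGHTS)
for name in COLUMNS:
    print(f"{name}: sum = {sums[name]:.2f}, documented total = {DOCUMENTED_TOTAL[name]}")
```

In this sample the LocDB, NNets, Pentamers and Integral columns add up to their documented totals. The PotLocDB column is all zeros, so it sums to 0 rather than 5; presumably a column with no database hits is left as zeros, although the text does not state this.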
When interpreting the output, keep in mind that:
1. ProtComp scores per se, being weights of complex functions, do not represent probabilities of a protein's location in a particular compartment.
2. Significant homology with a protein of known location is a very strong indicator of the query protein's location.
3. For NNets scores, relative values across compartments are more important than absolute values: if the second-best score is much lower than the best one, the prediction is more reliable, regardless of the absolute values.
4. If both NNets and the other predictions point to the same compartment, the prediction is very reliable.
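Points 3 and 4 can be illustrated with a short Python sketch that uses the scores from the sample output. The helper names `best_and_margin` and `predictions_agree` are hypothetical. The text defines no numeric cut-off for "much lower", so the sketch only reports the gap and applies no threshold.

```python
# Scores copied from the sample output (NNets, Pentamers and LocDB columns).
nnets = {"Cytoplasmic": 0.00, "Membrane": 0.13, "Secreted": 2.85, "Periplasmic": 0.02}
pentamers = {"Cytoplasmic": 0.19, "Membrane": 0.09, "Secreted": 4.69, "Periplasmic": 0.03}
locdb = {"Cytoplasmic": 0.00, "Membrane": 0.00, "Secreted": 10.00, "Periplasmic": 0.00}


def best_and_margin(scores):
    """Return (best location, best score, gap between best and second best)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_loc, best), (_, second) = ranked[0], ranked[1]
    return best_loc, best, best - second


def predictions_agree(*columns):
    """True if every column with a nonzero score has the same top location.

    All-zero columns (no evidence) are skipped rather than counted.
    """
    tops = {max(col, key=col.get) for col in columns if max(col.values()) > 0}
    return len(tops) == 1


location, score, margin = best_and_margin(nnets)
print(f"NNets best: {location} ({score:.2f}), gap to second best: {margin:.2f}")
print("Methods agree:", predictions_agree(nnets, pentamers, locdb))
```

For the sample sequence, the NNets score for Secreted is far above the second-best score, and NNets, Pentamers and LocDB all point to the same compartment, which is the situation points 3 and 4 describe as reliable.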